From raziebe@gmail.com Fri Jul 1 02:01:45 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 02:01:50 -0700 (PDT)
Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.204]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6191iH9029279 for ; Fri, 1 Jul 2005 02:01:45 -0700
Received: by wproxy.gmail.com with SMTP id i20so259692wra for ; Fri, 01 Jul 2005 02:00:12 -0700 (PDT)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=CFP/B2ras/rASvHjwg3bq1xvfsoRqcy3iIH53ToMxO/MNkCqWKBN1taqlp/ytn2ZBAunkqBzzqEkkL8l64vNCVonzGS4beI4hn3/V7Io4BLRClgBS8vqjMLuIJv4ek/3NeUTV96G/wmnZj4qNm3M0P3A2Vr/qSvibdNZfSwAjAc=
Received: by 10.54.101.2 with SMTP id y2mr1295995wrb; Fri, 01 Jul 2005 02:00:12 -0700 (PDT)
Received: by 10.54.122.5 with HTTP; Fri, 1 Jul 2005 02:00:11 -0700 (PDT)
Message-ID: <5d96567b05070102001fc7b677@mail.gmail.com>
Date: Fri, 1 Jul 2005 11:00:11 +0200
From: "Raz Ben-Jehuda(caro)"
Reply-To: "Raz Ben-Jehuda(caro)"
To: netdev@oss.sgi.com
Subject: general bonding question
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j6191iH9029279
X-archive-position: 2584
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: raziebe@gmail.com
Precedence: bulk
X-list: netdev

To any of you that deal with the NIC bonding driver:

Is there a special reason why the bond interface does not use the slave device features, such as scatter-gather or hardware checksumming? I am thinking of fixing the driver, but I thought it would be better to ask for advice first.
-- Raz Long Live the Penguin From eric-madwifi-devel@lammerts.org Fri Jul 1 08:36:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 08:36:47 -0700 (PDT) Received: from mail.ultrawaves.com ([64.135.31.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61FacH9027189 for ; Fri, 1 Jul 2005 08:36:39 -0700 Received: from [10.20.30.16] (sweetums.ultrawaves [10.20.30.16]) by mail.ultrawaves.com (Postfix) with ESMTP id 589BD45C015; Fri, 1 Jul 2005 11:35:00 -0400 (EDT) Message-ID: <42C562A4.9070501@lammerts.org> Date: Fri, 01 Jul 2005 11:35:00 -0400 From: Eric Lammerts User-Agent: Mozilla Thunderbird 1.0 (X11/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: madwifi-devel@lists.sourceforge.net Cc: tommy.christensen@tpack.net, herbert@gondor.apana.org.au, davem@davemloft.net, netdev@oss.sgi.com Subject: problems with 2.6.12 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2585 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: eric-madwifi-devel@lammerts.org Precedence: bulk X-list: netdev Hello all, I was having problems with 2.6.12 + madwifi in master mode. No packets were going out. With tcpdump I see DHCP requests coming in, with strace I see the dhcp daemon sending replies out, but they don't show up in tcpdump. It's caused by this change: http://oss.sgi.com/projects/netdev/archive/2005-05/msg00109.html which btw also causes problems for other people: http://marc.theaimsgroup.com/?l=linux-kernel&m=111853727810345&w=2 Madwifi doesn't call netif_carrier_on() in master mode, so Linux drops all packets. When I remove the dev_deactivate() line, it works fine again. Should we fix madwifi or the kernel? 
Eric From tommy.christensen@tpack.net Fri Jul 1 09:34:31 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 09:34:35 -0700 (PDT) Received: from mail.tpack.net (ip18.tpack.net [213.173.228.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j61GYUH9015022 for ; Fri, 1 Jul 2005 09:34:31 -0700 Received: (qmail 5899 invoked from network); 1 Jul 2005 16:32:59 -0000 Received: from unknown (HELO ?172.17.159.11?) (192.168.111.1) by 0 with SMTP; 1 Jul 2005 16:32:59 -0000 Message-ID: <42C57058.70806@tpack.net> Date: Fri, 01 Jul 2005 18:33:28 +0200 From: Tommy Christensen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Eric Lammerts CC: madwifi-devel@lists.sourceforge.net, herbert@gondor.apana.org.au, davem@davemloft.net, netdev@oss.sgi.com Subject: Re: problems with 2.6.12 References: <42C562A4.9070501@lammerts.org> In-Reply-To: <42C562A4.9070501@lammerts.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2586 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tommy.christensen@tpack.net Precedence: bulk X-list: netdev Eric Lammerts wrote: > Hello all, > I was having problems with 2.6.12 + madwifi in master mode. No packets > were going out. With tcpdump I see DHCP requests coming in, with strace > I see the dhcp daemon sending replies out, but they don't show up in > tcpdump. > > It's caused by this change: > http://oss.sgi.com/projects/netdev/archive/2005-05/msg00109.html > > which btw also causes problems for other people: > http://marc.theaimsgroup.com/?l=linux-kernel&m=111853727810345&w=2 Auch. And vlan interfaces are having trouble as well. > Madwifi doesn't call netif_carrier_on() in master mode, so Linux drops > all packets. When I remove the dev_deactivate() line, it works fine again. 
Netdevices are "born" with carrier on, so if your code doesn't call netif_carrier_off() or set dev->state directly, I don't see how you can end up in this state. Could you investigate this?

> Should we fix madwifi or the kernel?

The code is there for a reason, so hopefully we can work this out.

-Tommy

From tgraf@suug.ch Fri Jul 1 10:42:37 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 10:42:45 -0700 (PDT)
Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61HgXH9020965 for ; Fri, 1 Jul 2005 10:42:36 -0700
Received: by postel.suug.ch (Postfix, from userid 10001) id DA5F81C0F3; Fri, 1 Jul 2005 19:41:17 +0200 (CEST)
Date: Fri, 1 Jul 2005 19:41:17 +0200
From: Thomas Graf
To: Patrick McHardy
Cc: Patrick Jenkins , linux-kernel@vger.kernel.org, Maillist netdev
Subject: Re: [PATCH] multipath routing algorithm, better patch
Message-ID: <20050701174117.GW16076@postel.suug.ch>
References: <42C4919A.5000009@trash.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <42C4919A.5000009@trash.net>
X-archive-position: 2587
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

* Patrick McHardy <42C4919A.5000009@trash.net> 2005-07-01 02:43
> Multiple algorithms can be compiled in at once, so this patch is wrong.
> mp_alg is supplied by userspace:
>
>     if (rta->rta_mp_alg) {
>         mp_alg = *rta->rta_mp_alg;
>
>         if (mp_alg < IP_MP_ALG_NONE ||
>             mp_alg > IP_MP_ALG_MAX)
>             goto err_inval;
>     }
>
> If it isn't set correctly its an iproute problem. Did you actually
> experience any problems?

Well, my patch for iproute2 to enable multipath algorithm selection is currently being merged to Stephen together with the ematch bits.
We had to work out a dependency on GNU flex first (the berkley version uses the same executable names) so the inclusion was delayed a bit. From kaber@trash.net Fri Jul 1 12:35:52 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 12:35:55 -0700 (PDT) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61JZpH9030771 for ; Fri, 1 Jul 2005 12:35:52 -0700 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.51) id 1DoRHH-0002t5-2M; Fri, 01 Jul 2005 21:34:19 +0200 Message-ID: <42C59ABA.1070305@trash.net> Date: Fri, 01 Jul 2005 21:34:18 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050514 Debian/1.7.8-1 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: Patrick Jenkins , linux-kernel@vger.kernel.org, Maillist netdev Subject: Re: [PATCH] multipath routing algorithm, better patch References: <42C4919A.5000009@trash.net> <20050701174117.GW16076@postel.suug.ch> In-Reply-To: <20050701174117.GW16076@postel.suug.ch> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2588 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > * Patrick McHardy <42C4919A.5000009@trash.net> 2005-07-01 02:43 > >>If it isn't set correctly its an iproute problem. Did you actually >>experience any problems? > > Well, my patch for iproute2 to enable multipath algorithm selection > is currently being merged to Stephen together with the ematch bits. > We had to work out a dependency on GNU flex first (the berkley > version uses the same executable names) so the inclusion was > delayed a bit. So its no problem but simply missing support. BTW, do you know if Stephen's new CVS repository is exported somewhere? 
Regards
Patrick

From radheka.godse@intel.com Fri Jul 1 13:24:50 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:24:58 -0700 (PDT)
Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KOnH9001472 for ; Fri, 1 Jul 2005 13:24:50 -0700
Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61KN3xw003914; Fri, 1 Jul 2005 20:23:04 GMT
Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KN3Pb018286; Fri, 1 Jul 2005 20:23:03 GMT
Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KN3SL031915; Fri, 1 Jul 2005 13:23:03 -0700
Date: Fri, 1 Jul 2005 13:22:05 -0700 (PDT)
From: Radheka Godse
X-X-Sender: radheka@localhost.localdomain
To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net
cc: netdev@oss.sgi.com
Subject: [PATCH 2.6.13-rc1 0/17] bonding: Sysfs Support
Message-ID:
ReplyTo: "Radheka Godse"
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Scanned-By: MIMEDefang 2.44
X-archive-position: 2589
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: radheka.godse@intel.com
Precedence: bulk
X-list: netdev

This patch set updates the sysfs bonding patch sent previously in April. It is up to date with all recent bonding and kernel changes and applies cleanly to linux-2.6.13-rc1. It adds a sysfs entry for the new "Xmit Hash Policy" bonding module parameter, incorporates feedback and bug fixes for all known issues, and has been tested on linux-2.6.13-rc1.

The interface is pretty simple.
Here are some notes on how it could be used:

The file /sys/class/net/bonding_masters contains the names of all the active bonds. To add or remove bonds, write a whitespace-delimited list of interface names to this file. For example:

    echo "bond1 bond2 bond3" > /sys/class/net/bonding_masters

will create three bonds with the given names. If any other bonds exist, they will be deleted.

    echo "bond0 bond2 bond3" > /sys/class/net/bonding_masters

would then create bond0 and remove bond1.

For each bond, there is a directory /sys/class/net//bonding. In this directory are a number of files which control the bond. The names of these files match the existing module parameters and do the same things.

The slaves file contains a whitespace-delimited list of interface names, which are slaves to the bond. This file behaves much the same as the "bonding_masters" file; just write a list of your desired interfaces to this file. Likewise, the arp_targets file contains a whitespace-delimited list of IP addresses and should be written to in a similar fashion. The other files contain single values (numeric and/or mnemonic).

Some caveats:
- slaves can only be assigned when the interface is up
- mode can only be changed when the interface is down
- Xmit hash policy can be changed only when the interface is down

Warnings and status messages will be logged and can be viewed with dmesg.

Example:

    modprobe bonding
    echo "bond0 bond1" > /sys/class/net/bonding_masters
    echo "6" > /sys/class/net/bond0/bonding/mode
    echo "1000" > /sys/class/net/bond0/bonding/miimon
    ifconfig bond0 192.168.0.1
    echo "eth0 eth1" > /sys/class/net/bond0/bonding/slaves
    # bond0 is now ready to use
    echo "1" > /sys/class/net/bond1/bonding/mode
    # ... and so on for bond1

These patches were generated against 2.6.12 with Jay's upstream patches. They apply cleanly to kernel 2.6.13-rc1.
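
Illustration (not part of the patch set): the bonding_masters semantics described above, where the written list replaces the current set of bonds, can be modelled in plain user-space C. This is only a sketch under stated assumptions. The names bond_set, set_contains, set_add, parse_names and reconcile_masters are hypothetical and do not appear in the bonding driver, and the real sysfs handler may well be structured differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_BONDS 16
#define NAME_LEN  16	/* matches IFNAMSIZ on Linux; chosen for illustration */

/* Hypothetical container: a small set of interface names. */
struct bond_set {
	int  count;
	char names[MAX_BONDS][NAME_LEN];
};

/* Return 1 if name is already in the set, 0 otherwise. */
static int set_contains(const struct bond_set *s, const char *name)
{
	int i;

	for (i = 0; i < s->count; i++)
		if (strcmp(s->names[i], name) == 0)
			return 1;
	return 0;
}

/* Append name to the set; duplicates and overflow are silently ignored. */
static void set_add(struct bond_set *s, const char *name)
{
	if (set_contains(s, name) || s->count >= MAX_BONDS)
		return;
	snprintf(s->names[s->count], NAME_LEN, "%s", name);
	s->count++;
}

/* Split a whitespace-delimited list of names into a set. */
static void parse_names(const char *buf, struct bond_set *out)
{
	char copy[256];
	char *tok;

	assert(buf != NULL && out != NULL);
	out->count = 0;
	snprintf(copy, sizeof(copy), "%s", buf);
	for (tok = strtok(copy, " \t\n"); tok; tok = strtok(NULL, " \t\n"))
		set_add(out, tok);
}

/*
 * Model one write to bonding_masters: names that are written but not
 * present go to to_create, names that are present but not written go
 * to to_remove, and current ends up equal to the written list.
 */
static void reconcile_masters(struct bond_set *current, const char *written,
			      struct bond_set *to_create,
			      struct bond_set *to_remove)
{
	struct bond_set wanted;
	int i;

	parse_names(written, &wanted);
	to_create->count = 0;
	to_remove->count = 0;

	for (i = 0; i < wanted.count; i++)
		if (!set_contains(current, wanted.names[i]))
			set_add(to_create, wanted.names[i]);
	for (i = 0; i < current->count; i++)
		if (!set_contains(&wanted, current->names[i]))
			set_add(to_remove, current->names[i]);

	*current = wanted;
}
```

Starting from the set "bond1 bond2 bond3" and passing "bond0 bond2 bond3" to reconcile_masters() leaves bond0 in to_create and bond1 in to_remove, which matches the two echo examples above. The slaves file is described as behaving the same way, so the same reconciliation would apply to it.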
- Radheka Godse - Mitch Williams From radheka.godse@intel.com Fri Jul 1 13:31:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:31:48 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KVjH9002272 for ; Fri, 1 Jul 2005 13:31:46 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61KU8tp005569; Fri, 1 Jul 2005 20:30:08 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KU87G020793; Fri, 1 Jul 2005 20:30:08 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KU8SL032541; Fri, 1 Jul 2005 13:30:08 -0700 Date: Fri, 1 Jul 2005 13:29:20 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 1/17] bonding: make some functions not static Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2590 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch prepares for adding sysfs functionality to bonding by making some functions in bond_main non-static and adding protos into the header. 
diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -255,6 +255,17 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr); int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev); +void bond_deinit(struct net_device *bond_dev); +int bond_release(struct net_device *bond_dev, struct net_device *slave_dev); +int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev); +void bond_mii_monitor(struct net_device *bond_dev); +void bond_loadbalance_arp_mon(struct net_device *bond_dev); +void bond_activebackup_arp_mon(struct net_device *bond_dev); +void bond_set_mode_ops(struct bonding *bond, int mode); +int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl); +const char *bond_mode_name(int mode); +void bond_select_active_slave(struct bonding *bond); +void bond_change_active_slave(struct bonding *bond, struct slave *new_active); #endif /* _LINUX_BONDING_H */ diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -1453,7 +1453,7 @@ * * Warning: Caller must hold curr_slave_lock for writing. */ -static void bond_change_active_slave(struct bonding *bond, struct slave *new_active) +void bond_change_active_slave(struct bonding *bond, struct slave *new_active) { struct slave *old_active = bond->curr_active_slave; @@ -1527,7 +1527,7 @@ * * Warning: Caller must hold curr_slave_lock for writing. 
*/ -static void bond_select_active_slave(struct bonding *bond) +void bond_select_active_slave(struct bonding *bond) { struct slave *best_slave; @@ -1595,7 +1595,7 @@ /*---------------------------------- IOCTL ----------------------------------*/ -static int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev) +int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev) { dprintk("bond_dev=%p\n", bond_dev); dprintk("slave_dev=%p\n", slave_dev); @@ -2030,7 +2030,7 @@ * for Bonded connections: * The first up interface should be left on and all others downed. */ -static int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) +int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; @@ -2479,7 +2479,7 @@ /*-------------------------------- Monitoring -------------------------------*/ /* this function is called regularly to monitor each slave's link. */ -static void bond_mii_monitor(struct net_device *bond_dev) +void bond_mii_monitor(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; @@ -2904,7 +2904,7 @@ * arp is transmitted to generate traffic. see activebackup_arp_monitor for * arp monitoring in active backup mode. */ -static void bond_loadbalance_arp_mon(struct net_device *bond_dev) +void bond_loadbalance_arp_mon(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; @@ -3042,7 +3042,7 @@ * may have received. 
* see loadbalance_arp_monitor for arp monitoring in load balancing mode */ -static void bond_activebackup_arp_mon(struct net_device *bond_dev) +void bond_activebackup_arp_mon(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave; @@ -4484,7 +4484,7 @@ /* * set bond mode specific net device operations */ -static inline void bond_set_mode_ops(struct bonding *bond, int mode) +void bond_set_mode_ops(struct bonding *bond, int mode) { struct net_device *bond_dev = bond->dev; @@ -4603,7 +4603,7 @@ /* De-initialize device specific data. * Caller must hold rtnl_lock. */ -static inline void bond_deinit(struct net_device *bond_dev) +void bond_deinit(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; @@ -4639,7 +4639,7 @@ * Convert string input module parms. Accept either the * number of the mode or its string name. */ -static inline int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl) +int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl) { int i; From radheka.godse@intel.com Fri Jul 1 13:40:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:40:38 -0700 (PDT) Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KeYH9003213 for ; Fri, 1 Jul 2005 13:40:34 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Kcwxw011205; Fri, 1 Jul 2005 20:38:58 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KcwPb027451; Fri, 1 Jul 2005 20:38:58 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KcwSL001300; Fri, 1 Jul 2005 13:38:58 -0700 Date: Fri, 1 
Jul 2005 13:38:11 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 2/17] bonding: split bond creation into new function Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2592 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch moves the work of creating a new bond into a separate function, instead of being inline in bonding_init. This function is non-static and proto is added to the header, for use by the sysfs interface. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -255,6 +255,7 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr); int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev); +int bond_create(char *name, struct bond_params *params, struct bonding **newbond); void bond_deinit(struct net_device *bond_dev); int bond_release(struct net_device *bond_dev, struct net_device *slave_dev); int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev); diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -555,6 +555,7 @@ static 
char *xmit_hash_policy = NULL; static int arp_interval = BOND_LINK_ARP_INTERV; static char *arp_ip_target[BOND_MAX_ARP_TARGETS] = { NULL, }; +struct bond_params bonding_defaults; module_param(max_bonds, int, 0); MODULE_PARM_DESC(max_bonds, "Max number of bonded devices"); @@ -4530,12 +4531,10 @@ * Does not allocate but creates a /proc entry. * Allowed to fail. */ -static int __init bond_init(struct net_device *bond_dev, struct bond_params *params) +static int bond_init(struct net_device *bond_dev, struct bond_params *params) { struct bonding *bond = bond_dev->priv; - dprintk("Begin bond_init for %s\n", bond_dev->name); - /* initialize rwlocks */ rwlock_init(&bond->lock); rwlock_init(&bond->curr_slave_lock); @@ -4919,61 +5025,81 @@ return 0; } +/* Create a new bond based on the specified name and bonding parameters. + * Caller must NOT hold rtnl_lock; we need to release it here before we + * set up our sysfs entries. + */ +int bond_create(char *name, struct bond_params *params, struct bonding **newbond) +{ + struct net_device *bond_dev; + int res; + + rtnl_lock(); + bond_dev = alloc_netdev(sizeof(struct bonding), name, ether_setup); + if (!bond_dev) { + printk(KERN_ERR DRV_NAME + ": %s: eek! can't alloc netdev!\n", + name); + res = -ENOMEM; + goto out_rtnl; + } + + /* bond_init() must be called after dev_alloc_name() (for the + * /proc files), but before register_netdevice(), because we + * need to set function pointers. 
+ */ + + res = bond_init(bond_dev, params); + if (res < 0) { + goto out_netdev; + } + + SET_MODULE_OWNER(bond_dev); + + res = register_netdevice(bond_dev); + if (res < 0) { + goto out_bond; + } + if (newbond) + *newbond = bond_dev->priv; + + rtnl_unlock(); /* allows sysfs registration of net device */ + res = bond_create_sysfs_entry(bond_dev->priv); + goto done; +out_bond: + bond_deinit(bond_dev); +out_netdev: + free_netdev(bond_dev); +out_rtnl: + rtnl_unlock(); +done: + return res; +} + static int __init bonding_init(void) { - struct bond_params params; int i; int res; + char new_bond_name[8]; /* Enough room for 999 bonds at init. */ printk(KERN_INFO "%s", version); - res = bond_check_params(&params); + res = bond_check_params(&bonding_defaults); if (res) { - return res; + goto out; } - rtnl_lock(); - #ifdef CONFIG_PROC_FS bond_create_proc_dir(); #endif for (i = 0; i < max_bonds; i++) { - struct net_device *bond_dev; - - bond_dev = alloc_netdev(sizeof(struct bonding), "", ether_setup); - if (!bond_dev) { - res = -ENOMEM; - goto out_err; - } - - res = dev_alloc_name(bond_dev, "bond%d"); - if (res < 0) { - free_netdev(bond_dev); - goto out_err; - } - - /* bond_init() must be called after dev_alloc_name() (for the - * /proc files), but before register_netdevice(), because we - * need to set function pointers.
- */ - res = bond_init(bond_dev, &params); - if (res < 0) { - free_netdev(bond_dev); - goto out_err; - } - - SET_MODULE_OWNER(bond_dev); - - res = register_netdevice(bond_dev); - if (res < 0) { - bond_deinit(bond_dev); - free_netdev(bond_dev); - goto out_err; - } + sprintf(new_bond_name, "bond%d",i); + res = bond_create(new_bond_name,&bonding_defaults, NULL); + if (res) + goto err; } - rtnl_unlock(); register_netdevice_notifier(&bond_netdev_notifier); register_inetaddr_notifier(&bond_inetaddr_notifier); From radheka.godse@intel.com Fri Jul 1 13:42:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:42:02 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61Kg0H9004005 for ; Fri, 1 Jul 2005 13:42:00 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61KeNq8017560; Fri, 1 Jul 2005 20:40:23 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KeNPb028405; Fri, 1 Jul 2005 20:40:23 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KeNSL001674; Fri, 1 Jul 2005 13:40:23 -0700 Date: Fri, 1 Jul 2005 13:39:31 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 3/17] bonding: export some structs to bonding.h Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2593 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch exposes some data structures for use by the sysfs interface. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -152,6 +152,11 @@ u32 arp_targets[BOND_MAX_ARP_TARGETS]; }; +struct bond_parm_tbl { + char *modename; + int mode; +}; + struct vlan_entry { struct list_head vlan_list; u32 vlan_ip; diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -585,7 +585,7 @@ static const char *version = DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n"; -static LIST_HEAD(bond_dev_list); +LIST_HEAD(bond_dev_list); #ifdef CONFIG_PROC_FS static struct proc_dir_entry *bond_proc_dir = NULL; @@ -604,18 +604,14 @@ * command comes from an application using * another ABI version. 
*/ -struct bond_parm_tbl { - char *modename; - int mode; -}; -static struct bond_parm_tbl bond_lacp_tbl[] = { +struct bond_parm_tbl bond_lacp_tbl[] = { { "slow", AD_LACP_SLOW}, { "fast", AD_LACP_FAST}, { NULL, -1}, }; -static struct bond_parm_tbl bond_mode_tbl[] = { +struct bond_parm_tbl bond_mode_tbl[] = { { "balance-rr", BOND_MODE_ROUNDROBIN}, { "active-backup", BOND_MODE_ACTIVEBACKUP}, { "balance-xor", BOND_MODE_XOR}, @@ -626,7 +622,7 @@ { NULL, -1}, }; -static struct bond_parm_tbl xmit_hashtype_tbl[] = { +struct bond_parm_tbl xmit_hashtype_tbl[] = { { "layer2", BOND_XMIT_POLICY_LAYER2}, { "layer3+4", BOND_XMIT_POLICY_LAYER34}, { NULL, -1}, @@ -634,12 +630,11 @@ /*-------------------------- Forward declarations ---------------------------*/ -static inline void bond_set_mode_ops(struct bonding *bond, int mode); static void bond_send_gratuitous_arp(struct bonding *bond); /*---------------------------- General routines -----------------------------*/ -static const char *bond_mode_name(int mode) +const char *bond_mode_name(int mode) { switch (mode) { case BOND_MODE_ROUNDROBIN : From radheka.godse@intel.com Fri Jul 1 13:39:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:39:23 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61Kd9H9003077 for ; Fri, 1 Jul 2005 13:39:09 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61KbWkn012922; Fri, 1 Jul 2005 20:37:32 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KbW7G024918; Fri, 1 Jul 2005 20:37:32 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with 
ESMTP id j61Kb7SL000669; Fri, 1 Jul 2005 13:37:32 -0700 Date: Fri, 1 Jul 2005 13:36:19 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 1/17] bonding: make some functions not static Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2591 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch prepares for adding sysfs functionality to bonding by making some functions in bond_main non-static and adding protos into the header. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -255,6 +255,17 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr); int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev); +void bond_deinit(struct net_device *bond_dev); +int bond_release(struct net_device *bond_dev, struct net_device *slave_dev); +int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev); +void bond_mii_monitor(struct net_device *bond_dev); +void bond_loadbalance_arp_mon(struct net_device *bond_dev); +void bond_activebackup_arp_mon(struct net_device *bond_dev); +void bond_set_mode_ops(struct bonding *bond, int mode); +int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl); +const char *bond_mode_name(int mode); +void bond_select_active_slave(struct bonding *bond); +void 
bond_change_active_slave(struct bonding *bond, struct slave *new_active); #endif /* _LINUX_BONDING_H */ diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -1453,7 +1453,7 @@ * * Warning: Caller must hold curr_slave_lock for writing. */ -static void bond_change_active_slave(struct bonding *bond, struct slave *new_active) +void bond_change_active_slave(struct bonding *bond, struct slave *new_active) { struct slave *old_active = bond->curr_active_slave; @@ -1527,7 +1527,7 @@ * * Warning: Caller must hold curr_slave_lock for writing. */ -static void bond_select_active_slave(struct bonding *bond) +void bond_select_active_slave(struct bonding *bond) { struct slave *best_slave; @@ -1595,7 +1595,7 @@ /*---------------------------------- IOCTL ----------------------------------*/ -static int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev) +int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev) { dprintk("bond_dev=%p\n", bond_dev); dprintk("slave_dev=%p\n", slave_dev); @@ -2030,7 +2030,7 @@ * for Bonded connections: * The first up interface should be left on and all others downed. */ -static int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) +int bond_release(struct net_device *bond_dev, struct net_device *slave_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; @@ -2479,7 +2479,7 @@ /*-------------------------------- Monitoring -------------------------------*/ /* this function is called regularly to monitor each slave's link. 
*/ -static void bond_mii_monitor(struct net_device *bond_dev) +void bond_mii_monitor(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; @@ -2904,7 +2904,7 @@ * arp is transmitted to generate traffic. see activebackup_arp_monitor for * arp monitoring in active backup mode. */ -static void bond_loadbalance_arp_mon(struct net_device *bond_dev) +void bond_loadbalance_arp_mon(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; @@ -3042,7 +3042,7 @@ * may have received. * see loadbalance_arp_monitor for arp monitoring in load balancing mode */ -static void bond_activebackup_arp_mon(struct net_device *bond_dev) +void bond_activebackup_arp_mon(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; struct slave *slave; @@ -4484,7 +4484,7 @@ /* * set bond mode specific net device operations */ -static inline void bond_set_mode_ops(struct bonding *bond, int mode) +void bond_set_mode_ops(struct bonding *bond, int mode) { struct net_device *bond_dev = bond->dev; @@ -4603,7 +4603,7 @@ /* De-initialize device specific data. * Caller must hold rtnl_lock. */ -static inline void bond_deinit(struct net_device *bond_dev) +void bond_deinit(struct net_device *bond_dev) { struct bonding *bond = bond_dev->priv; @@ -4639,7 +4639,7 @@ * Convert string input module parms. Accept either the * number of the mode or its string name. 
*/ -static inline int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl) +int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl) { int i; From radheka.godse@intel.com Fri Jul 1 13:43:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:43:36 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KhXH9004840 for ; Fri, 1 Jul 2005 13:43:33 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Kfuq8018705; Fri, 1 Jul 2005 20:41:56 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KftPb029624; Fri, 1 Jul 2005 20:41:55 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KftSL002146; Fri, 1 Jul 2005 13:41:55 -0700 Date: Fri, 1 Jul 2005 13:41:08 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 4/17] bonding: return pointer to slave from enslave Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2594 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch changes the bond_enslave function so that it (optionally) returns a pointer to the slave struct for a new slave. This functionality is not used by the existing ioctl interface, but will be used by the sysfs interface.
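The optional out-parameter described in this patch can be sketched in plain userspace C. This is only an illustration of the calling convention, not the driver code: the struct layouts, the fixed slot count, and the name enslave() are hypothetical stand-ins, and only the "write through the pointer if it is non-NULL, on success" behaviour mirrors what the patch does with vassal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the kernel structs. */
struct slave { int id; };
struct bonding { struct slave slots[4]; int count; };

/*
 * Add a slave to the bond. If the caller passes a non-NULL out pointer,
 * report the newly created slave through it. Returns 0 on success and
 * -1 if the bond is full; on failure *out is left untouched.
 */
int enslave(struct bonding *bond, int id, struct slave **out)
{
	struct slave *new_slave;

	if (bond->count >= 4)
		return -1;

	new_slave = &bond->slots[bond->count++];
	new_slave->id = id;

	/* Callers that do not need the slave (the ioctl path) pass NULL. */
	if (out)
		*out = new_slave;
	return 0;
}
```

A caller that only cares about the result code passes NULL, as the ioctl path in the patch does; a caller that needs the new slave passes the address of a struct slave pointer.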
This function is also made non-static and a proto is placed in the header. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -262,6 +262,7 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev); int bond_create(char *name, struct bond_params *params, struct bonding **newbond); void bond_deinit(struct net_device *bond_dev); +int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, struct slave **vassal); int bond_release(struct net_device *bond_dev, struct net_device *slave_dev); int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev); void bond_mii_monitor(struct net_device *bond_dev); diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -1601,7 +1601,7 @@ } /* enslave device to bond device */ -static int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) +int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, struct slave **vassal) { struct bonding *bond = bond_dev->priv; struct slave *new_slave = NULL; @@ -1992,6 +1992,8 @@ new_slave->link != BOND_LINK_DOWN ? 
"n up" : " down"); /* enslave is successful */ + if (vassal) + *vassal=new_slave; return 0; /* Undo stages on error */ @@ -4049,6 +4051,7 @@ return -EINVAL; } + down_write(&(bonding_rwsem)); slave_dev = dev_get_by_name(ifr->ifr_slave); dprintk("slave_dev=%p: \n", slave_dev); @@ -4060,7 +4063,7 @@ switch (cmd) { case BOND_ENSLAVE_OLD: case SIOCBONDENSLAVE: - res = bond_enslave(bond_dev, slave_dev); + res = bond_enslave(bond_dev, slave_dev, NULL); break; case BOND_RELEASE_OLD: case SIOCBONDRELEASE: From radheka.godse@intel.com Fri Jul 1 13:45:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:45:53 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KjjH9006613 for ; Fri, 1 Jul 2005 13:45:45 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Ki8q8019807; Fri, 1 Jul 2005 20:44:08 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61Ki8Pb030802; Fri, 1 Jul 2005 20:44:08 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61Ki8SL002587; Fri, 1 Jul 2005 13:44:08 -0700 Date: Fri, 1 Jul 2005 13:43:21 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 5/17] bonding: reset RLB flag during ALB init Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2595 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch makes the ALB init function explicitly clear the RLB flag if RLB is not used. This is required by the sysfs interface, which allows the user to change mode without destroying the bond. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_alb.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c --- linux-2.6.12post/drivers/net/bonding/bond_alb.c 2005-06-17 12:48:29.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c 2005-06-28 18:21:35.000000000 -0700 @@ -1256,6 +1256,8 @@ tlb_deinitialize(bond); return res; } + } else { + bond->alb_info.rlb_enabled = 0; } return 0; From tgraf@suug.ch Fri Jul 1 13:47:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:47:53 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KllH9007212 for ; Fri, 1 Jul 2005 13:47:49 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 56F661C0F3; Fri, 1 Jul 2005 22:46:37 +0200 (CEST) Date: Fri, 1 Jul 2005 22:46:37 +0200 From: Thomas Graf To: Patrick McHardy Cc: Patrick Jenkins , linux-kernel@vger.kernel.org, Maillist netdev Subject: Re: [PATCH] multipath routing algorithm, better patch Message-ID: <20050701204637.GX16076@postel.suug.ch> References: <42C4919A.5000009@trash.net> <20050701174117.GW16076@postel.suug.ch> <42C59ABA.1070305@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42C59ABA.1070305@trash.net> X-archive-position: 2596 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <42C59ABA.1070305@trash.net> 2005-07-01 21:34 > So its no problem but simply missing support.
BTW, do you know if > Stephen's new CVS repository is exported somewhere? cvs -d :pserver:cvsanon@developer.osdl.org/repos cvs login cvs -d :pserver:cvsanon@developer.osdl.org/repos cvs co iproute2 From radheka.godse@intel.com Fri Jul 1 13:48:20 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:48:23 -0700 (PDT) Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KmJH9007295 for ; Fri, 1 Jul 2005 13:48:20 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Kkhxw015076; Fri, 1 Jul 2005 20:46:43 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61Kkh7G031178; Fri, 1 Jul 2005 20:46:43 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KkhSL003239; Fri, 1 Jul 2005 13:46:43 -0700 Date: Fri, 1 Jul 2005 13:45:56 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 6/17] bonding: ALB init kmalloc inside spinlock bugfix Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2597 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch corrects a bug in ALB init where calling kmalloc while holding a lock causes a stack dump in debug mode. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff
linux-2.6.12post/drivers/net/bonding/bond_alb.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c --- linux-2.6.12post/drivers/net/bonding/bond_alb.c 2005-06-17 12:48:29.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c 2005-06-28 18:21:35.000000000 -0700 @@ -198,20 +198,21 @@ { struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); int size = TLB_HASH_TABLE_SIZE * sizeof(struct tlb_client_info); + struct tlb_client_info *new_hashtbl; int i; spin_lock_init(&(bond_info->tx_hashtbl_lock)); - _lock_tx_hashtbl(bond); - - bond_info->tx_hashtbl = kmalloc(size, GFP_KERNEL); - if (!bond_info->tx_hashtbl) { + new_hashtbl = kmalloc(size, GFP_KERNEL); + if (!new_hashtbl) { printk(KERN_ERR DRV_NAME ": Error: %s: Failed to allocate TLB hash table\n", bond->dev->name); - _unlock_tx_hashtbl(bond); return -1; } + _lock_tx_hashtbl(bond); + + bond_info->tx_hashtbl = new_hashtbl; memset(bond_info->tx_hashtbl, 0, size); @@ -798,21 +801,22 @@ { struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct packet_type *pk_type = &(BOND_ALB_INFO(bond).rlb_pkt_type); + struct rlb_client_info *new_hashtbl; int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info); int i; spin_lock_init(&(bond_info->rx_hashtbl_lock)); - _lock_rx_hashtbl(bond); - - bond_info->rx_hashtbl = kmalloc(size, GFP_KERNEL); - if (!bond_info->rx_hashtbl) { + new_hashtbl = kmalloc(size, GFP_KERNEL); + if (!new_hashtbl) { printk(KERN_ERR DRV_NAME ": Error: %s: Failed to allocate RLB hash table\n", bond->dev->name); - _unlock_rx_hashtbl(bond); return -1; } + _lock_rx_hashtbl(bond); + + bond_info->rx_hashtbl = new_hashtbl; bond_info->rx_hashtbl_head = RLB_NULL_INDEX; From radheka.godse@intel.com Fri Jul 1 13:49:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:49:51 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KnSH9008077 for ; Fri, 1 Jul 2005 13:49:48 
-0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Klptp008705; Fri, 1 Jul 2005 20:47:51 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KlpPb000733; Fri, 1 Jul 2005 20:47:51 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KlpSL003353; Fri, 1 Jul 2005 13:47:51 -0700 Date: Fri, 1 Jul 2005 13:47:04 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 7/17] bonding: make sysfs consistent with ifenslave behavior Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2598 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev For consistency with ifenslave, instead of exiting with an error, updated bonding sysfs to close and attempt to enslave an up adapter. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -1665,10 +1665,19 @@ */ if ((slave_dev->flags & IFF_UP)) { printk(KERN_ERR DRV_NAME - ": Error: %s is up\n", - slave_dev->name); + ": %s: Warning: %s is up. 
Closing it " + "before adding to the bond.\n", + bond_dev->name, slave_dev->name); res = -EPERM; - goto err_undo_flags; + res = dev_close(slave_dev); + if (res) + { + printk(KERN_ERR DRV_NAME + ": %s: Error: Failed to close %s.\n", + bond_dev->name, slave_dev->name); + res = -EPERM; + goto err_undo_flags; + } } if (slave_dev->set_mac_address == NULL) { From radheka.godse@intel.com Fri Jul 1 13:51:16 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:51:22 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KpBH9008828 for ; Fri, 1 Jul 2005 13:51:16 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61KnVkn019278; Fri, 1 Jul 2005 20:49:31 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KnV7G032516; Fri, 1 Jul 2005 20:49:31 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KnVSL003497; Fri, 1 Jul 2005 13:49:31 -0700 Date: Fri, 1 Jul 2005 13:48:44 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2599 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This large patch adds the sysfs interface to channel bonding. 
It will allow users to add and remove bonds, add and remove slaves, and change all bonding parameters without using ifenslave. The ifenslave interface still works. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -37,6 +37,7 @@ #include #include #include +#include #include "bond_3ad.h" #include "bond_alb.h" @@ -262,6 +259,13 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev); int bond_create(char *name, struct bond_params *params, struct bonding **newbond); void bond_deinit(struct net_device *bond_dev); +int bond_create_sysfs(void); +void bond_destroy_sysfs(void); +void bond_destroy_sysfs_entry(struct bonding *bond); +int bond_create_sysfs_entry(struct bonding *bond); +int bond_create_slave_symlinks(struct net_device *master, struct net_device *slave); +void bond_destroy_slave_symlinks(struct net_device *master, struct net_device *slave); +int bond_check_abi_ver(void); int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, struct slave **vassal); int bond_release(struct net_device *bond_dev, struct net_device *slave_dev); int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev); diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -591,6 +591,7 @@ static struct proc_dir_entry *bond_proc_dir = NULL; #endif +extern struct rw_semaphore bonding_rwsem; static u32 arp_target[BOND_MAX_ARP_TARGETS] = { 0, } ; static int 
arp_ip_count = 0; static int bond_mode = BOND_MODE_ROUNDROBIN; @@ -1994,6 +1994,9 @@ } } + res = bond_create_slave_symlinks(bond_dev, slave_dev); + if (res) + goto err_unset_master; printk(KERN_INFO DRV_NAME ": %s: enslaving %s as a%s interface with a%s link.\n", bond_dev->name, slave_dev->name, @@ -2165,6 +2176,9 @@ write_unlock_bh(&bond->lock); + /* must do this from outside any spinlocks */ + bond_destroy_slave_symlinks(bond_dev, slave_dev); + bond_del_vlans_from_slave(bond, slave_dev); /* If the mode USES_PRIMARY, then we should only remove its @@ -2256,6 +2263,7 @@ */ write_unlock_bh(&bond->lock); + bond_destroy_slave_symlinks(bond_dev, slave_dev); bond_del_vlans_from_slave(bond, slave_dev); /* If the mode USES_PRIMARY, then we should only remove its @@ -2436,6 +2444,22 @@ } } +int bond_check_abi_ver(void) +{ + int retval = 1; + + if (orig_app_abi_ver == -1) { + orig_app_abi_ver = BOND_ABI_VERSION; + app_abi_ver = BOND_ABI_VERSION; + } + else { + if (app_abi_ver == 0) + retval = 0; + } + + return retval; +} + static int bond_info_query(struct net_device *bond_dev, struct ifbond *info) { struct bonding *bond = bond_dev->priv; @@ -3569,7 +3593,10 @@ bond_remove_proc_entry(bond); bond_create_proc_entry(bond); #endif - + down_write(&(bonding_rwsem)); + bond_destroy_sysfs_entry(bond); + bond_create_sysfs_entry(bond); + up_write(&(bonding_rwsem)); return NOTIFY_DONE; } @@ -4101,6 +4128,7 @@ orig_app_abi_ver = prev_abi_ver; } + up_write(&(bonding_rwsem)); return res; } @@ -5000,18 +4990,22 @@ goto err; } + res = bond_create_sysfs(); + if (res) + goto err; + register_netdevice_notifier(&bond_netdev_notifier); register_inetaddr_notifier(&bond_inetaddr_notifier); - return 0; - -out_err: - /* free and unregister all bonds that were successfully added */ + goto out; +err: + rtnl_lock(); bond_free_all(); - + bond_destroy_sysfs(); rtnl_unlock(); - +out: return res; + } static void __exit bonding_exit(void) @@ -5021,6 +5053,7 @@ rtnl_lock(); bond_free_all(); + 
bond_destroy_sysfs(); rtnl_unlock(); } diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_sysfs.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_sysfs.c --- linux-2.6.12post/drivers/net/bonding/bond_sysfs.c 1969-12-31 16:00:00.000000000 -0800 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_sysfs.c 2005-06-30 13:54:22.000000000 -0700 @@ -0,0 +1,1493 @@ + +/* + * Copyright(c) 2004-2005 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * The full GNU General Public License is included in this distribution in the + * file called LICENSE. + * + * + * Changes: + * + * 2004/12/12 - Mitch Williams + * - Initial creation of sysfs interface. + * + * 2005/06/22 - Radheka Godse + * - Added ifenslave -c type functionality to sysfs + * - Added sysfs files for attributes such as MII Status and + * 802.3ad aggregator that are displayed in /proc + * - Added "name value" format to sysfs "mode" and + * "lacp_rate", for e.g., "active-backup 1" or "slow 0" for + * consistency and ease of script parsing + * - Fixed reversal of octets in arp_ip_targets via sysfs + * - sysfs support to handle bond interface re-naming + * - Moved all sysfs entries into /sys/class/net instead of + * of using a standalone subsystem. 
+ * - Added sysfs symlinks between masters and slaves + * - Corrected bugs in sysfs unload path when creating bonds + * with existing interface names. + * - Removed redundant sysfs stat file since it duplicates slave info + * from the proc file + * - Fixed errors in sysfs show/store arp targets. + * - For consistency with ifenslave, instead of exiting + * with an error, updated bonding sysfs to + * close and attempt to enslave an up adapter. + * - Fixed NULL dereference when adding a slave interface + * that does not exist. + * - Added checks in sysfs bonding to reject invalid ip addresses + * - Synch up with post linux-2.6.12 bonding changes + * - Created sysfs bond attrib for xmit_hash_policy + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* #define BONDING_DEBUG 1 */ +#include "bonding.h" +#define to_class_dev(obj) container_of(obj,struct class_device,kobj) +#define to_net_dev(class) container_of(class, struct net_device, class_dev) +#define to_bond(cd) ((struct bonding *)(to_net_dev(cd)->priv)) + +/*---------------------------- Declarations -------------------------------*/ + +/* Macros for real simple parsing of text. */ +#define eat_nonalnum(str,whence,max) \ + while (whence < max) {if (!isalnum(str[whence])) whence++; else break;}; +#define find_next_nonalpha(str,whence,max) \ + while (whence < max) {if (isalnum(str[whence])) whence++; else break;}; + +extern struct list_head bond_dev_list; +extern struct bond_params bonding_defaults; +extern struct bond_parm_tbl bond_mode_tbl[]; +extern struct bond_parm_tbl bond_lacp_tbl[]; +extern struct bond_parm_tbl xmit_hashtype_tbl[]; + +static struct class *netdev_class; +/*--------------------------- Data Structures -----------------------------*/ + +/* Bonding sysfs lock. Why can't we just use the subsytem lock? + * Because kobject_register tries to acquire the subsystem lock. 
If + * we already hold the lock (which we would if the user was creating + * a new bond through the sysfs interface), we deadlock. + */ + +struct rw_semaphore bonding_rwsem; + + + + +/*------------------------------ Functions --------------------------------*/ + +/* + * "show" function for the bond_masters attribute. + * The class parameter is ignored. + */ +static ssize_t bonding_show_bonds(struct class *cls, char *buffer) +{ + int res = 0; + struct bonding *bond; + + down_read(&(bonding_rwsem)); + + list_for_each_entry(bond, &bond_dev_list, bond_list) { + res += sprintf(buffer + res, "%s ", + bond->dev->name); + if (res > (PAGE_SIZE - IFNAMSIZ)) { + dprintk("eek! too many bonds!\n"); + break; + } + } + res += sprintf(buffer + res, "\n"); + res++; + up_read(&(bonding_rwsem)); + return res; +} + +/* + * "store" function for the bond_masters attribute. This is what + * creates and deletes entire bonds. + * + * The class parameter is ignored. + * + * This function uses the eat_nonalnum and eat_alnum macros, define + * above. Why not use sscanf()? Scanf can get strings, but can't filter + * out inappropriate characters. For example, we can't have bonds named + * "foo/bar" or "foo*bar" or "Does this work?" as these aren't valid + * filenames. While we could use scanf to get strings and then validate + * them, this is quicker. + * The above examples give us these results: + * "foo/bar" gives two bonds, "foo" and "bar". + * "foo*bar" gives two bonds, "foo" and "bar". + * "Does this work?" gives three bonds, "Does", "this", and "work". + */ + +static ssize_t bonding_store_bonds(struct class *cls, const char *buffer, size_t count) +{ + char name[IFNAMSIZ]; + int i, res, found, pos = 0; + struct bonding *bond; + struct bonding *nxt; + + down_write(&(bonding_rwsem)); + /* First process adds */ + eat_nonalnum(buffer, pos, count); + /* Pos now points to the first alpha character. 
*/ + i = pos; + find_next_nonalpha(buffer, i, count); + /* i now points to the next character past the end of the bond name. */ + if (i - pos >= IFNAMSIZ) { + printk(KERN_ERR DRV_NAME "Interface name %.*s too large! Ignoring.\n", + i - pos, buffer + pos); + up_write(&(bonding_rwsem)); + return -EPERM; + } + /* Copy the bond name so we can deal with it separately. */ + strncpy(name, buffer + pos, i - pos); + /* Don't forget the null terminator! */ + name[i - pos] = 0; + while (strlen(name)) { + /* Got a bond name in name. Is it already in the list? */ + found = 0; + list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list) { + if (strnicmp(bond->dev->name, name, IFNAMSIZ) == 0) { + /* Temporarily set a meaningless flag. When + * we get done with the loop, we'll check all of these. + * If the bond doesn't have this flag set, then we need + * to remove the bond. If the flag has it set, then + * we can just clear the flag. + */ + bond->flags |= IFF_DYNAMIC; + found = 1; + break; /* Found it, so go to next name */ + } + } + if (found == 0) { + printk(KERN_INFO DRV_NAME ": %s is being created...\n", name); + res = bond_create(name, &bonding_defaults, &bond); + if (res) { + up_write(&(bonding_rwsem)); + printk(KERN_INFO DRV_NAME ": %s interface already exists. Bond creation failed.\n", name); + return res; + } + printk(KERN_INFO DRV_NAME ": %s created.\n", name); + /* Set the flag so we don't delete + * this interface in the loop below. + */ + bond->flags |= IFF_DYNAMIC; + } + /* Scan for the next name. i still has the location of + * the char just past the end of the last name we handled. + */ + pos = i; + eat_nonalnum(buffer, pos, count); + i = pos; + find_next_nonalpha(buffer, i, count); + if (i - pos >= IFNAMSIZ) { + printk(KERN_ERR DRV_NAME + ": %.*s interface name too large! 
Ignoring.\n", + i - pos, buffer + pos); + up_write(&(bonding_rwsem)); + return -EPERM; + } + strncpy(name, buffer + pos, i - pos); + name[i - pos] = 0; + } /* end of while loop and end of input */ + + /* Now we do deletes. Walk through the list, and check to see + * if the flag didn't get set. If it's set, clear it. If it's + * not set, delete the bond. + */ + list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list) { + if (bond->flags & IFF_DYNAMIC) { + bond->flags &= ~IFF_DYNAMIC; + } else { + printk(KERN_INFO DRV_NAME ": %s is being deleted...\n", bond->dev->name); + rtnl_lock(); + unregister_netdevice(bond->dev); + bond_deinit(bond->dev); + bond_destroy_sysfs_entry(bond); + rtnl_unlock(); + printk(KERN_INFO DRV_NAME ": %s deleted.\n", bond->dev->name); + } + } + /* Always return either count or an error. If you return 0, you'll + * get called forever, which is bad. + */ + up_write(&(bonding_rwsem)); + return count; +} +/* class attribute for bond_masters file. This ends up in /sys/class/net */ +static CLASS_ATTR(bonding_masters, S_IWUSR | S_IRUGO, + bonding_show_bonds, bonding_store_bonds); + +int bond_create_slave_symlinks(struct net_device *master, struct net_device *slave) +{ + char linkname[IFNAMSIZ+7]; + int ret = 0; + + /* first, create a link from the slave back to the master */ + ret = sysfs_create_link(&(slave->class_dev.kobj), &(master->class_dev.kobj), + "master"); + if (ret) + return ret; + /* next, create a link from the master to the slave */ + sprintf(linkname,"slave_%s",slave->name); + ret = sysfs_create_link(&(master->class_dev.kobj), &(slave->class_dev.kobj), + linkname); + return ret; + +} + +void bond_destroy_slave_symlinks(struct net_device *master, struct net_device *slave) +{ + char linkname[IFNAMSIZ+7]; + + sysfs_remove_link(&(slave->class_dev.kobj), "master"); + sprintf(linkname,"slave_%s",slave->name); + sysfs_remove_link(&(master->class_dev.kobj), linkname); +} + + +/* + * Show the slaves in the current bond. 
+ */ +static ssize_t bonding_show_slaves(struct class_device *cd, char *buf) +{ + struct slave *slave; + int i, res = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + + read_lock_bh(&bond->lock); + bond_for_each_slave(bond, slave, i) { + res += sprintf(buf + res, "%s ", slave->dev->name); + if (res > (PAGE_SIZE - IFNAMSIZ)) { + dprintk("eek! too many slaves!\n"); + break; + } + } + read_unlock_bh(&bond->lock); + res += sprintf(buf + res, "\n"); + res++; + up_read(&(bonding_rwsem)); + return res; +} + +/* + * Set the slaves in the current bond. The bond interface must be + * up for this to succeed. + * This function is largely the same flow as bonding_update_bonds(). + */ +static ssize_t bonding_store_slaves(struct class_device *cd, const char *buffer, size_t count) +{ + char name[IFNAMSIZ]; + int i, j, res, found, pos = 0, ret = count; + struct slave *slave; + struct net_device *dev = 0; + struct bonding *bond = to_bond(cd); + + if (!bond_check_abi_ver()) + { + printk(KERN_ERR DRV_NAME + ": Error: your version of ifenslave is incompatible " + "with the sysfs interface in this version of bonding. " + "Upgrade ifenslave to version 1 or greater and reload " + "bonding.\n"); + return -EINVAL; + } + + down_write(&(bonding_rwsem)); + + /* Quick sanity check -- is the bond interface up? */ + if (!(bond->dev->flags & IFF_UP)) { + printk(KERN_ERR DRV_NAME + ": %s: Unable to update slaves because interface is down.\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + /* Note: We can't hold bond->lock here, as bond_create grabs it. */ + + /* First process adds */ + + /* Copy the first name we find. */ + eat_nonalnum(buffer, pos, count); + i = pos; + find_next_nonalpha(buffer, i, count); + if (i - pos >= IFNAMSIZ) { + printk(KERN_ERR DRV_NAME ": %s: Slave name %.*s too large! 
Ignoring.\n", + bond->dev->name, + i - pos, buffer + pos); + ret = -EPERM; + goto out; + } + strncpy(name, buffer + pos, i - pos); + name[i - pos] = 0; + + while (strlen(name)) { + /* Got a slave name in name. Is it already there? */ + found = 0; + read_lock_bh(&bond->lock); + bond_for_each_slave(bond, slave, j) { + if (strnicmp(slave->dev->name, name, IFNAMSIZ) == 0) { + /* Temporarily set a meaningless flag. When + * we get done with the loop, we'll check all of these. + * If the slave doesn't have this flag set, then we need + * to remove the slave. If the flag has it set, then + * we can just clear the flag. + */ + slave->original_flags |= IFF_DYNAMIC; + found = 1; + break; /* Found it, so go to next name */ + } + } + read_unlock_bh(&bond->lock); + if (found == 0) { + printk(KERN_INFO DRV_NAME ": %s: Adding slave %s.\n", + bond->dev->name, name); + dev = dev_get_by_name(name); + if (!dev) { + printk(KERN_INFO DRV_NAME + ": %s: Interface %s does not exist!\n", + bond->dev->name, name); + ret = -EPERM; + goto out; + } + else { + dev_put(dev); + } + + + if (dev->flags & IFF_SLAVE) { + printk(KERN_INFO DRV_NAME + ": %s: Interface %s is already enslaved!\n", + bond->dev->name, name); + ret = -EPERM; + goto out; + } + + /* If this is the first slave, then we need to set + the master's hardware address to be the same as the + slave's. */ + if (!(*((u32 *) & (bond->dev->dev_addr[0])))) { + memcpy(bond->dev->dev_addr, dev->dev_addr, + dev->addr_len); + } + + /* Set the slave's MTU to match the bond */ + if (dev->mtu != bond->dev->mtu) { + if (dev->change_mtu) { + res = dev->change_mtu(dev, + bond->dev->mtu); + if (res) { + ret = res; + goto out; + } + } else { + dev->mtu = bond->dev->mtu; + } + } + rtnl_lock(); + res = bond_enslave(bond->dev, dev, &slave); + rtnl_unlock(); + if (res) { + ret = res; + goto out; + } + slave->original_flags |= IFF_DYNAMIC; + } + /* Get the next name in the buffer. 
*/ + pos = i; + eat_nonalnum(buffer, pos, count); + i = pos; + find_next_nonalpha(buffer, i, count); + if (i - pos >= IFNAMSIZ) { + printk(KERN_ERR DRV_NAME + ": %s: Slave name %.*s too large! Ignoring.\n", + bond->dev->name, i - pos, buffer + pos); + ret = -EPERM; + goto out; + } + strncpy(name, buffer + pos, i - pos); + name[i - pos] = 0; + } /* End of while loop, and end of input. */ + + /* Now handle deletes. + * We can't use bond_for_each_slave here because we might modify + * the list when we're inside the loop. We can't hold the lock + * either because bond_release grabs it. + */ + slave = bond->first_slave; + i = bond->slave_cnt; + while (i > 0) { + BUG_ON(slave == 0); + if (slave->original_flags & IFF_DYNAMIC) { + slave->original_flags &= ~IFF_DYNAMIC; + slave = slave->next; + } else { + /* Didn't find the name of this slave in the list, so + * remove the slave. + */ + printk(KERN_INFO DRV_NAME + ": %s: Removing slave %s\n", + bond->dev->name, + slave->dev->name); + dev = slave->dev; + slave = slave->next; + rtnl_lock(); + res = bond_release(bond->dev, dev); + rtnl_unlock(); + if (res) { + ret = res; + goto out; + } + /* set the slave MTU to the default */ + if (dev->change_mtu) { + dev->change_mtu(dev, 1500); + } else { + dev->mtu = 1500; + } + } + i--; + } + +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(slaves, S_IRUGO | S_IWUSR, bonding_show_slaves, bonding_store_slaves); + +/* + * Show and set the bonding mode. The bond interface must be down to + * change the mode. 
+ */ +static ssize_t bonding_show_mode(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + count = sprintf(buf, "%s %d\n", + bond_mode_tbl[bond->params.mode].modename, + bond->params.mode) + 1; + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_mode(struct class_device *cd, const char *buf, size_t count) +{ + int new_value, ret = count; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + if (bond->dev->flags & IFF_UP) { + printk(KERN_ERR DRV_NAME + "Unable to update mode of %s because interface is up.\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + new_value = bond_parse_parm((char *)buf, bond_mode_tbl); + if (new_value < 0) { + printk(KERN_ERR DRV_NAME + ": %s: Ignoring invalid mode value %.*s.\n", + bond->dev->name, + (int)strlen(buf) - 1, buf); + ret = -EINVAL; + goto out; + } else { + bond->params.mode = new_value; + bond_set_mode_ops(bond, bond->params.mode); + printk(KERN_INFO DRV_NAME ": %s: setting mode to %s (%d).\n", + bond->dev->name, bond_mode_tbl[new_value].modename, new_value); + } +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(mode, S_IRUGO | S_IWUSR, bonding_show_mode, bonding_store_mode); + +/* + * Show and set the bonding transmit hash method. The bond interface must be down to + * change the xmit hash policy. 
+ */ +static ssize_t bonding_show_xmit_hash(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + if ((bond->params.mode != BOND_MODE_XOR) && + (bond->params.mode != BOND_MODE_8023AD)) { + // Not Applicable + count = sprintf(buf, "NA\n") + 1; + } else { + count = sprintf(buf, "%s %d\n", + xmit_hashtype_tbl[bond->params.xmit_policy].modename, + bond->params.xmit_policy) + 1; + } + up_read(&(bonding_rwsem)); + + return count; +} + +static ssize_t bonding_store_xmit_hash(struct class_device *cd, const char *buf, size_t count) +{ + int new_value, ret = count; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + if (bond->dev->flags & IFF_UP) { + printk(KERN_ERR DRV_NAME + "%s: Interface is up. Unable to update xmit policy.\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + if ((bond->params.mode != BOND_MODE_XOR) && + (bond->params.mode != BOND_MODE_8023AD)) { + printk(KERN_ERR DRV_NAME + "%s: Transmit hash policy is irrelevant in this mode.\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + new_value = bond_parse_parm((char *)buf, xmit_hashtype_tbl); + if (new_value < 0) { + printk(KERN_ERR DRV_NAME + ": %s: Ignoring invalid xmit hash policy value %.*s.\n", + bond->dev->name, + (int)strlen(buf) - 1, buf); + ret = -EINVAL; + goto out; + } else { + bond->params.xmit_policy = new_value; + bond_set_mode_ops(bond, bond->params.mode); + printk(KERN_INFO DRV_NAME ": %s: setting xmit hash policy to %s (%d).\n", + bond->dev->name, xmit_hashtype_tbl[new_value].modename, new_value); + } +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(xmit_hash_policy, S_IRUGO | S_IWUSR, bonding_show_xmit_hash, bonding_store_xmit_hash); + +/* + * Show and set the arp timer interval. There are two tricky bits + * here. First, if ARP monitoring is activated, then we must disable + * MII monitoring. 
Second, if the ARP timer isn't running, we must + * start it. + */ +static ssize_t bonding_show_arp_interval(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + count = sprintf(buf, "%d\n", bond->params.arp_interval) + 1; + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_arp_interval(struct class_device *cd, const char *buf, size_t count) +{ + int new_value, ret = count; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + sscanf(buf, "%d", &new_value); + if (new_value < 0) { + printk(KERN_ERR DRV_NAME + ": %s: Invalid arp_interval value %d not in range %d-%d; rejected.\n", + bond->dev->name, new_value, 1, INT_MAX); + ret = -EINVAL; + goto out; + } else { + printk(KERN_INFO DRV_NAME + ": %s: Setting ARP monitoring interval to %d.\n", + bond->dev->name, new_value); + bond->params.arp_interval = new_value; + if (bond->params.miimon) { + printk(KERN_INFO DRV_NAME + ": %s: ARP monitoring cannot be used with MII monitoring. " + "%s Disabling MII monitoring.\n", + bond->dev->name, bond->dev->name); + bond->params.miimon = 0; + /* Kill MII timer, else it brings bond's link down */ + if (bond->arp_timer.function) { + printk(KERN_INFO DRV_NAME + ": %s: Kill MII timer, else it brings bond's link down...\n", + bond->dev->name); + del_timer_sync(&bond->mii_timer); + } + } + if (!bond->params.arp_targets[0]) { + printk(KERN_INFO DRV_NAME + ": %s: ARP monitoring has been set up, " + "but no ARP targets have been specified.\n", + bond->dev->name); + } + if (bond->dev->flags & IFF_UP) { + /* If the interface is up, we may need to fire off + * the ARP timer. If the interface is down, the + * timer will get fired off when the open function + * is called. + */ + if (bond->arp_timer.function) { + /* The timer's already set up, so fire it off */ + mod_timer(&bond->arp_timer, jiffies + 1); + } else { + /* Set up the timer. 
*/ + init_timer(&bond->arp_timer); + bond->arp_timer.expires = jiffies + 1; + bond->arp_timer.data = + (unsigned long) bond->dev; + if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) { + bond->arp_timer.function = + (void *) + &bond_activebackup_arp_mon; + } else { + bond->arp_timer.function = + (void *) + &bond_loadbalance_arp_mon; + } + add_timer(&bond->arp_timer); + } + } + } + +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(arp_interval, S_IRUGO | S_IWUSR , bonding_show_arp_interval, bonding_store_arp_interval); + +/* + * Show and set the arp targets. + */ +static ssize_t bonding_show_arp_targets(struct class_device *cd, char *buf) +{ + int i, res = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + + for (i = 0; (i < BOND_MAX_ARP_TARGETS) && bond->params.arp_targets[i]; + i++) { + res += sprintf(buf + res, "%u.%u.%u.%u", + NIPQUAD(bond->params.arp_targets[i])); + if ((i+1 < BOND_MAX_ARP_TARGETS) && bond->params.arp_targets[i+1]) + res += sprintf(buf + res, " "); + else + res += sprintf(buf + res, "\n"); + res++; + } + + up_read(&(bonding_rwsem)); + return res; +} + +static ssize_t bonding_store_arp_targets(struct class_device *cd, const char *buf, size_t count) +{ + int octet1, octet2, octet3, octet4, ret = count; + int newpos = 0, pos = 0, i = 0; + struct bonding *bond = to_bond(cd); + const char *delimiter; + + down_write(&(bonding_rwsem)); + + memset(bond->params.arp_targets, 0, sizeof(bond->params.arp_targets)); + if (sscanf(buf, "%d.%d.%d.%d%n", &octet1, &octet2, &octet3, &octet4, &pos) != 4) { + printk(KERN_ERR DRV_NAME + ": %s: Invalid arp targets.\n", bond->dev->name); + ret = -EINVAL; + goto out; + } + + /* check for delimiting white space */ + delimiter = buf + pos; + if(*delimiter != ' ' && *delimiter != '\n' ) { + printk(KERN_ERR DRV_NAME + ": %s: Invalid arp targets with missing whitespace.\n", bond->dev->name); + ret = -EINVAL; + goto out; + } + + while ((pos < count) && (i < 
BOND_MAX_ARP_TARGETS)) { + if ( (octet1 < 0) || (octet1 > 255) + || (octet2 < 0) || (octet2 > 255) + || (octet3 < 0) || (octet3 > 255) + || (octet4 < 0) || (octet4 > 255) + || ((octet1 == 0) && (octet2 == 0) && + (octet3 == 0) && (octet4 == 0)) + || ((octet1 == 255) && (octet2 == 255) && + (octet3 == 255) && (octet4 == 255)) + ) { + memset(bond->params.arp_targets, 0, sizeof(bond->params.arp_targets)); + printk(KERN_ERR DRV_NAME + ": %s: Unable to add arp target %d.%d.%d.%d\n", + bond->dev->name, octet1, octet2, octet3, octet4); + ret = -EINVAL; + goto out; + } else { + printk(KERN_INFO DRV_NAME + ": %s: Adding arp target %d.%d.%d.%d\n", + bond->dev->name, octet1, octet2, octet3, octet4); + bond->params.arp_targets[i] = htonl + ((((u32) octet1 & 0xff) << 24) | + (((u32) octet2 & 0xff) << 16) | + (((u32) octet3 & 0xff) << 8) | ((u32) octet4 & + 0xff)); + + i++; + } + + newpos = 0; + if (sscanf(buf + pos, "%d.%d.%d.%d%n", &octet1, &octet2, &octet3, + &octet4, &newpos) != 4 && newpos != 0) { + memset(bond->params.arp_targets, 0, sizeof(bond->params.arp_targets)); + printk(KERN_ERR DRV_NAME + ": %s: Invalid arp targets specified.\n", bond->dev->name); + ret = -EINVAL; + goto out; + } + if (newpos == 0) { + break; + } + + pos += newpos; + /* check for delimiting white space */ + delimiter = buf + pos; + if(*delimiter != ' ' && *delimiter != '\n' ) { + memset(bond->params.arp_targets, 0, sizeof(bond->params.arp_targets)); + printk(KERN_ERR DRV_NAME + ": %s: Invalid arp targets with missing whitespace.\n", bond->dev->name); + ret = -EINVAL; + goto out; + } + } +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(arp_ip_target, S_IRUGO | S_IWUSR , bonding_show_arp_targets, bonding_store_arp_targets); + +/* + * Show and set the up and down delays. These must be multiples of the + * MII monitoring value, and are stored internally as the multiplier. + * Thus, we must translate to MS for the real world. 
+ */ +static ssize_t bonding_show_downdelay(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + count = sprintf(buf, "%d\n", bond->params.downdelay * bond->params.miimon) + 1; + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_downdelay(struct class_device *cd, const char *buf, size_t count) +{ + int new_value, ret = count; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + if (!(bond->params.miimon)) { + printk(KERN_ERR DRV_NAME + ": %s: Unable to set down delay as MII monitoring is disabled\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + sscanf(buf, "%d", &new_value); + if (new_value < 0) { + printk(KERN_ERR DRV_NAME + ": %s: Invalid down delay value %d not in range %d-%d; rejected.\n", + bond->dev->name, new_value, 1, INT_MAX); + ret = -EINVAL; + goto out; + } else { + if ((new_value % bond->params.miimon) != 0) { + printk(KERN_WARNING DRV_NAME + ": %s: Warning: down delay (%d) is not a multiple " + "of miimon (%d), delay rounded to %d ms\n", + bond->dev->name, new_value, bond->params.miimon, + (new_value / bond->params.miimon) * + bond->params.miimon); + } + bond->params.downdelay = new_value / bond->params.miimon; + printk(KERN_INFO DRV_NAME ": %s: Setting down delay to %d.\n", + bond->dev->name, bond->params.downdelay * bond->params.miimon); + + } + +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(down_delay, S_IRUGO | S_IWUSR , bonding_show_downdelay, bonding_store_downdelay); + +static ssize_t bonding_show_updelay(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + count = sprintf(buf, "%d\n", bond->params.updelay * bond->params.miimon) + 1; + up_read(&(bonding_rwsem)); + return count; + +} + +static ssize_t bonding_store_updelay(struct class_device *cd, const char *buf, size_t count) +{ + int new_value, ret = count; + 
struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + if (!(bond->params.miimon)) { + printk(KERN_ERR DRV_NAME + ": %s: Unable to set up delay as MII monitoring is disabled\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + sscanf(buf, "%d", &new_value); + if (new_value < 0) { + printk(KERN_ERR DRV_NAME + ": %s: Invalid up delay value %d not in range %d-%d; rejected.\n", + bond->dev->name, new_value, 1, INT_MAX); + ret = -EINVAL; + goto out; + } else { + if ((new_value % bond->params.miimon) != 0) { + printk(KERN_WARNING DRV_NAME + ": %s: Warning: up delay (%d) is not a multiple " + "of miimon (%d), updelay rounded to %d ms\n", + bond->dev->name, new_value, bond->params.miimon, + (new_value / bond->params.miimon) * + bond->params.miimon); + } + bond->params.updelay = new_value / bond->params.miimon; + printk(KERN_INFO DRV_NAME ": %s: Setting up delay to %d.\n", + bond->dev->name, bond->params.updelay * bond->params.miimon); + + } + +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(up_delay, S_IRUGO | S_IWUSR , bonding_show_updelay, bonding_store_updelay); + +/* + * Show and set the LACP interval. Interface must be down, and the mode + * must be set to 802.3ad mode. 
+ */ +static ssize_t bonding_show_lacp(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + count = sprintf(buf, "%s %d\n", + bond_lacp_tbl[bond->params.lacp_fast].modename, + bond->params.lacp_fast) + 1; + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_lacp(struct class_device *cd, const char *buf, size_t count) +{ + int new_value, ret = count; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + if (bond->dev->flags & IFF_UP) { + printk(KERN_ERR DRV_NAME + ": %s: Unable to update LACP rate because interface is up.\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + if (bond->params.mode != BOND_MODE_8023AD) { + printk(KERN_ERR DRV_NAME + ": %s: Unable to update LACP rate because bond is not in 802.3ad mode.\n", + bond->dev->name); + ret = -EPERM; + goto out; + } + + new_value = bond_parse_parm((char *)buf, bond_lacp_tbl); + + if ((new_value == 1) || (new_value == 0)) { + bond->params.lacp_fast = new_value; + printk(KERN_INFO DRV_NAME + ": %s: Setting LACP rate to %s (%d).\n", + bond->dev->name, bond_lacp_tbl[new_value].modename, new_value); + } else { + printk(KERN_ERR DRV_NAME + ": %s: Ignoring invalid LACP rate value %.*s.\n", + bond->dev->name, (int)strlen(buf) - 1, buf); + ret = -EINVAL; + } +out: + up_write(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(lacp_rate, S_IRUGO | S_IWUSR, bonding_show_lacp, bonding_store_lacp); + +/* + * Show and set the MII monitor interval. There are two tricky bits + * here. First, if MII monitoring is activated, then we must disable + * ARP monitoring. Second, if the timer isn't running, we must + * start it. 
+ */ +static ssize_t bonding_show_miimon(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + count = sprintf(buf, "%d\n", bond->params.miimon) + 1; + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_miimon(struct class_device *cd, const char *buf, size_t count) +{ + int new_value, ret = count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + + sscanf(buf, "%d", &new_value); + if (new_value < 0) { + printk(KERN_ERR DRV_NAME + ": %s: Invalid miimon value %d not in range %d-%d; rejected.\n", + bond->dev->name, new_value, 1, INT_MAX); + ret = -EINVAL; + goto out; + } else { + printk(KERN_INFO DRV_NAME + ": %s: Setting MII monitoring interval to %d.\n", + bond->dev->name, new_value); + bond->params.miimon = new_value; + if(bond->params.updelay) + printk(KERN_INFO DRV_NAME + ": %s: Note: Updating updelay (to %d) " + "since it is a multiple of the miimon value.\n", + bond->dev->name, + bond->params.updelay * bond->params.miimon); + if(bond->params.downdelay) + printk(KERN_INFO DRV_NAME + ": %s: Note: Updating downdelay (to %d) " + "since it is a multiple of the miimon value.\n", + bond->dev->name, + bond->params.downdelay * bond->params.miimon); + if (bond->params.arp_interval) { + printk(KERN_INFO DRV_NAME + ": %s: MII monitoring cannot be used with " + "ARP monitoring. Disabling ARP monitoring...\n", + bond->dev->name); + bond->params.arp_interval = 0; + /* Kill ARP timer, else it brings bond's link down */ + if (bond->mii_timer.function) { + printk(KERN_INFO DRV_NAME + ": %s: Kill ARP timer, else it brings bond's link down...\n", + bond->dev->name); + del_timer_sync(&bond->arp_timer); + } + } + + if (bond->dev->flags & IFF_UP) { + /* If the interface is up, we may need to fire off + * the MII timer. If the interface is down, the + * timer will get fired off when the open function + * is called. 
+ */ + if (bond->mii_timer.function) { + /* The timer's already set up, so fire it off */ + mod_timer(&bond->mii_timer, jiffies + 1); + } else { + /* Set up the timer. */ + init_timer(&bond->mii_timer); + bond->mii_timer.expires = jiffies + 1; + bond->mii_timer.data = + (unsigned long) bond->dev; + bond->mii_timer.function = + (void *) &bond_mii_monitor; + add_timer(&bond->mii_timer); + } + } + } +out: + up_read(&(bonding_rwsem)); + return ret; +} +static CLASS_DEVICE_ATTR(miimon, S_IRUGO | S_IWUSR, bonding_show_miimon, bonding_store_miimon); + +/* + * Show and set the primary slave. The store function is much + * simpler than bonding_store_slaves function because it only needs to + * handle one interface name. + * The bond must be a mode that supports a primary for this be + * set. + */ +static ssize_t bonding_show_primary(struct class_device *cd, char *buf) +{ + int count = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + if (bond->primary_slave) + count = sprintf(buf, "%s\n", bond->primary_slave->dev->name) + 1; + + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_primary(struct class_device *cd, const char *buf, size_t count) +{ + int i; + struct slave *slave; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + write_lock_bh(&bond->lock); + if (!USES_PRIMARY(bond->params.mode)) { + printk(KERN_INFO DRV_NAME + ": %s: Unable to set primary slave; %s is in mode %d\n", + bond->dev->name, bond->dev->name, bond->params.mode); + } else { + bond_for_each_slave(bond, slave, i) { + if (strnicmp + (slave->dev->name, buf, + strlen(slave->dev->name)) == 0) { + printk(KERN_INFO DRV_NAME + ": %s: Setting %s as primary slave.\n", + bond->dev->name, slave->dev->name); + bond->primary_slave = slave; + bond_select_active_slave(bond); + goto out; + } + } + + /* if we got here, then we didn't match the name of any slave */ + + if (strlen(buf) == 0 || buf[0] == '\n') { + printk(KERN_INFO DRV_NAME + ": 
%s: Setting primary slave to None.\n", + bond->dev->name); + bond->primary_slave = 0; + bond_select_active_slave(bond); + } else { + printk(KERN_INFO DRV_NAME + ": %s: Unable to set %.*s as primary slave as it is not a slave.\n", + bond->dev->name, (int)strlen(buf) - 1, buf); + } + } +out: + write_unlock_bh(&bond->lock); + up_write(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(primary, S_IRUGO | S_IWUSR, bonding_show_primary, bonding_store_primary); + +/* + * Show and set the use_carrier flag. + */ +static ssize_t bonding_show_carrier(struct class_device *cd, char *buf) +{ + int count; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + count = sprintf(buf, "%d\n", bond->params.use_carrier) + 1; + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_carrier(struct class_device *cd, const char *buf, size_t count) +{ + int new_value; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + sscanf(buf, "%d", &new_value); + if ((new_value == 0) || (new_value == 1)) { + bond->params.use_carrier = new_value; + printk(KERN_INFO DRV_NAME ": %s: Setting use_carrier to %d.\n", + bond->dev->name, new_value); + } else { + printk(KERN_INFO DRV_NAME + ": %s: Ignoring invalid use_carrier value %d.\n", + bond->dev->name, new_value); + } + up_write(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(use_carrier, S_IRUGO | S_IWUSR, bonding_show_carrier, bonding_store_carrier); + + +/* + * Show and set currently active_slave. + */ +static ssize_t bonding_show_active_slave(struct class_device *cd, char *buf) +{ + struct slave *curr; + struct bonding *bond = to_bond(cd); + int count; + + down_read(&(bonding_rwsem)); + + read_lock(&bond->curr_slave_lock); + curr = bond->curr_active_slave; + read_unlock(&bond->curr_slave_lock); + + if (USES_PRIMARY(bond->params.mode)) { + count = sprintf(buf, "%s\n", (curr) ? 
curr->dev->name : "None") + 1; + } + else { + count = sprintf(buf, "%s\n", "None") + 1; + } + up_read(&(bonding_rwsem)); + return count; +} + +static ssize_t bonding_store_active_slave(struct class_device *cd, const char *buf, size_t count) +{ + int i; + struct slave *slave; + struct slave *old_active = NULL; + struct slave *new_active = NULL; + struct bonding *bond = to_bond(cd); + + down_write(&(bonding_rwsem)); + + write_lock_bh(&bond->lock); + if (!USES_PRIMARY(bond->params.mode)) { + printk(KERN_INFO DRV_NAME + ": %s: Unable to change active slave; %s is in mode %d\n", + bond->dev->name, bond->dev->name, bond->params.mode); + } else { + bond_for_each_slave(bond, slave, i) { + if (strnicmp + (slave->dev->name, buf, + strlen(slave->dev->name)) == 0) { + old_active = bond->curr_active_slave; + new_active = slave; + if (new_active && (new_active == old_active)) { + /* do nothing */ + printk(KERN_INFO DRV_NAME + ": %s: %s is already the current active slave.\n", + bond->dev->name, slave->dev->name); + goto out; + } + else { + if ((new_active) && + (old_active) && + (new_active->link == BOND_LINK_UP) && + IS_UP(new_active->dev)) { + printk(KERN_INFO DRV_NAME + ": %s: Setting %s as active slave.\n", + bond->dev->name, slave->dev->name); + bond_change_active_slave(bond, new_active); + } + else { + printk(KERN_INFO DRV_NAME + ": %s: Could not set %s as active slave; " + "either %s is down or the link is down.\n", + bond->dev->name, slave->dev->name, + slave->dev->name); + } + goto out; + } + } + } + + /* if we got here, then we didn't match the name of any slave */ + + if (strlen(buf) == 0 || buf[0] == '\n') { + printk(KERN_INFO DRV_NAME + ": %s: Setting active slave to None.\n", + bond->dev->name); + bond->primary_slave = 0; + bond_select_active_slave(bond); + } else { + printk(KERN_INFO DRV_NAME + ": %s: Unable to set %.*s as active slave as it is not a slave.\n", + bond->dev->name, (int)strlen(buf) - 1, buf); + } + } +out: + write_unlock_bh(&bond->lock); + 
up_write(&(bonding_rwsem)); + return count; + +} +static CLASS_DEVICE_ATTR(active_slave, S_IRUGO | S_IWUSR, bonding_show_active_slave, bonding_store_active_slave); + + +/* + * Show link status of the bond interface. + */ +static ssize_t bonding_show_mii_status(struct class_device *cd, char *buf) +{ + int count; + struct slave *curr; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + + read_lock(&bond->curr_slave_lock); + curr = bond->curr_active_slave; + read_unlock(&bond->curr_slave_lock); + + count = sprintf(buf, "%s\n", (curr) ? "up" : "down") + 1; + up_read(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(mii_status, S_IRUGO, bonding_show_mii_status, NULL); + + +/* + * Show current 802.3ad aggregator ID. + */ +static ssize_t bonding_show_ad_aggregator(struct class_device *cd, char *buf) +{ + int count = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + if (bond->params.mode == BOND_MODE_8023AD) { + struct ad_info ad_info; + count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ? 0 : ad_info.aggregator_id) + 1; + } + up_read(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(ad_aggregator, S_IRUGO, bonding_show_ad_aggregator, NULL); + + +/* + * Show number of active 802.3ad ports. + */ +static ssize_t bonding_show_ad_num_ports(struct class_device *cd, char *buf) +{ + int count = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + if (bond->params.mode == BOND_MODE_8023AD) { + struct ad_info ad_info; + count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ? 0: ad_info.ports) + 1; + } + up_read(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(ad_num_ports, S_IRUGO, bonding_show_ad_num_ports, NULL); + + +/* + * Show current 802.3ad actor key. 
+ */ +static ssize_t bonding_show_ad_actor_key(struct class_device *cd, char *buf) +{ + int count = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + if (bond->params.mode == BOND_MODE_8023AD) { + struct ad_info ad_info; + count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ? 0 : ad_info.actor_key) + 1; + } + up_read(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(ad_actor_key, S_IRUGO, bonding_show_ad_actor_key, NULL); + + +/* + * Show current 802.3ad partner key. + */ +static ssize_t bonding_show_ad_partner_key(struct class_device *cd, char *buf) +{ + int count = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + if (bond->params.mode == BOND_MODE_8023AD) { + struct ad_info ad_info; + count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ? 0 : ad_info.partner_key) + 1; + } + up_read(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(ad_partner_key, S_IRUGO, bonding_show_ad_partner_key, NULL); + + +/* + * Show current 802.3ad partner mac. 
+ */ +static ssize_t bonding_show_ad_partner_mac(struct class_device *cd, char *buf) +{ + int count = 0; + struct bonding *bond = to_bond(cd); + + down_read(&(bonding_rwsem)); + if (bond->params.mode == BOND_MODE_8023AD) { + struct ad_info ad_info; + if (!bond_3ad_get_active_agg_info(bond, &ad_info)) { + count = sprintf(buf,"%02x:%02x:%02x:%02x:%02x:%02x\n", + ad_info.partner_system[0], + ad_info.partner_system[1], + ad_info.partner_system[2], + ad_info.partner_system[3], + ad_info.partner_system[4], + ad_info.partner_system[5]) + 1; + } + } + up_read(&(bonding_rwsem)); + return count; +} +static CLASS_DEVICE_ATTR(ad_partner_mac, S_IRUGO, bonding_show_ad_partner_mac, NULL); + + + +static struct attribute *per_bond_attrs[] = { + &class_device_attr_slaves.attr, + &class_device_attr_mode.attr, + &class_device_attr_arp_interval.attr, + &class_device_attr_arp_ip_target.attr, + &class_device_attr_down_delay.attr, + &class_device_attr_up_delay.attr, + &class_device_attr_lacp_rate.attr, + &class_device_attr_xmit_hash_policy.attr, + &class_device_attr_miimon.attr, + &class_device_attr_primary.attr, + &class_device_attr_use_carrier.attr, + &class_device_attr_active_slave.attr, + &class_device_attr_mii_status.attr, + &class_device_attr_ad_aggregator.attr, + &class_device_attr_ad_num_ports.attr, + &class_device_attr_ad_actor_key.attr, + &class_device_attr_ad_partner_key.attr, + &class_device_attr_ad_partner_mac.attr, + NULL, +}; + +static struct attribute_group bonding_group = { + .name = "bonding", + .attrs = per_bond_attrs, +}; + +/* + * Initialize sysfs. This sets up the bonding_masters file in + * /sys/class/net. 
+ */ +int bond_create_sysfs(void) +{ + int ret = 0; + struct bonding *firstbond; + + init_rwsem(&bonding_rwsem); + + /* get the netdev class pointer */ + firstbond = container_of(bond_dev_list.next, struct bonding, bond_list); + if (!firstbond) + { + return -ENODEV; + + } + netdev_class = firstbond->dev->class_dev.class; + if (!netdev_class) + { + return -ENODEV; + } + ret = class_create_file(netdev_class, &class_attr_bonding_masters); + + return ret; + +} + +/* + * Remove /sys/class/net/bonding_masters. + */ +void bond_destroy_sysfs(void) +{ + if (netdev_class) + class_remove_file(netdev_class, &class_attr_bonding_masters); +} + +/* + * Initialize sysfs for each bond. This sets up and registers + * the 'bondctl' directory for each individual bond under /sys/class/net. + */ +int bond_create_sysfs_entry(struct bonding *bond) +{ + struct net_device *dev = bond->dev; + int err; + + err = sysfs_create_group(&(dev->class_dev.kobj), &bonding_group); + if (err) { + printk(KERN_EMERG "eek! didn't create group!\n"); + } + + return err; +} +/* + * Remove sysfs entries for each bond. 
+ */ +void bond_destroy_sysfs_entry(struct bonding *bond) +{ + struct net_device *dev = bond->dev; + + sysfs_remove_group(&(dev->class_dev.kobj), &bonding_group); +} + diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/Makefile linux-2.6.12post-sysfs/drivers/net/bonding/Makefile --- linux-2.6.12post/drivers/net/bonding/Makefile 2005-06-17 12:48:29.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/Makefile 2005-06-28 18:22:23.000000000 -0700 @@ -4,5 +4,5 @@ obj-$(CONFIG_BONDING) += bonding.o -bonding-objs := bond_main.o bond_3ad.o bond_alb.o +bonding-objs := bond_main.o bond_3ad.o bond_alb.o bond_sysfs.o From radheka.godse@intel.com Fri Jul 1 13:52:16 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:52:18 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KqFH9009088 for ; Fri, 1 Jul 2005 13:52:15 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Kodq8022675; Fri, 1 Jul 2005 20:50:39 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KodPb002162; Fri, 1 Jul 2005 20:50:39 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KodSL003725; Fri, 1 Jul 2005 13:50:39 -0700 Date: Fri, 1 Jul 2005 13:49:52 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 9/17] bonding: get primary name from slave dev Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 
X-archive-position: 2600 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch fixes a bug in the proc file handler in bonding. The name of the primary slave was taken from the command-line options, instead of from the dev structure of the primary slave itself. This caused an incorrect display if the primary is set or changed via sysfs. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -3369,8 +3369,8 @@ if (USES_PRIMARY(bond->params.mode)) { seq_printf(seq, "Primary Slave: %s\n", - (bond->params.primary[0]) ? - bond->params.primary : "None"); + (bond->primary_slave) ? + bond->primary_slave->dev->name : "None"); seq_printf(seq, "Currently Active Slave: %s\n", (curr) ? 
curr->dev->name : "None"); From radheka.godse@intel.com Fri Jul 1 13:53:16 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:53:19 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KrFH9009713 for ; Fri, 1 Jul 2005 13:53:16 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Kpdkn020138; Fri, 1 Jul 2005 20:51:39 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61Kpd7G001196; Fri, 1 Jul 2005 20:51:39 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KpdSL003806; Fri, 1 Jul 2005 13:51:39 -0700 Date: Fri, 1 Jul 2005 13:50:52 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 10/17] bonding: added missing mode descriptions to modinfo Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2601 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch updates the modinfo description for mode module param to include all available modes as described in bonding.txt. 
Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -564,17 +564,24 @@ module_param(updelay, int, 0); MODULE_PARM_DESC(updelay, "Delay before considering link up, in milliseconds"); module_param(downdelay, int, 0); -MODULE_PARM_DESC(downdelay, "Delay before considering link down, in milliseconds"); +MODULE_PARM_DESC(downdelay, "Delay before considering link down, " + "in milliseconds"); module_param(use_carrier, int, 0); -MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default)"); +MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; " + "0 for off, 1 for on (default)"); module_param(mode, charp, 0); -MODULE_PARM_DESC(mode, "Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor"); +MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, " + "1 for active-backup, 2 for balance-xor, " + "3 for broadcast, 4 for 802.3ad, 5 for balance-tlb, " + "6 for balance-alb"); module_param(primary, charp, 0); MODULE_PARM_DESC(primary, "Primary network device to use"); module_param(lacp_rate, charp, 0); -MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner (slow/fast)"); +MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner " + "(slow/fast)"); module_param(xmit_hash_policy, charp, 0); -MODULE_PARM_DESC(xmit_hash_policy, "XOR hashing method : 0 for layer 2 (default), 1 for layer 3+4"); +MODULE_PARM_DESC(xmit_hash_policy, "XOR hashing method: 0 for layer 2 (default)" + ", 1 for layer 3+4"); module_param(arp_interval, int, 0); MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds"); module_param_array(arp_ip_target, 
charp, NULL, 0); From radheka.godse@intel.com Fri Jul 1 13:54:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:54:21 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KsHH9010291 for ; Fri, 1 Jul 2005 13:54:17 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Kqfq8023577; Fri, 1 Jul 2005 20:52:41 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61Kqf7G001818; Fri, 1 Jul 2005 20:52:41 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KqfSL003880; Fri, 1 Jul 2005 13:52:41 -0700 Date: Fri, 1 Jul 2005 13:51:49 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 11/17] bonding: add driver name to log messages Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2602 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev Trivial fix to include DRVNAME in logs printed to /var/log/messages Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_3ad.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_3ad.c --- linux-2.6.12post/drivers/net/bonding/bond_3ad.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_3ad.c 2005-06-28 
18:21:34.000000000 -0700 @@ -1915,7 +1915,8 @@ struct aggregator *aggregator; if (bond == NULL) { - printk(KERN_ERR "The slave %s is not attached to its bond\n", slave->dev->name); + printk(KERN_ERR DRV_NAME ": %s: The slave %s is not attached to its bond\n", + slave->dev->master->name, slave->dev->name); return -1; } @@ -2085,7 +2086,8 @@ // clear the aggregator ad_clear_agg(temp_aggregator); if (select_new_active_agg) { - printk(KERN_INFO "Removing an active aggregator\n"); + printk(KERN_INFO DRV_NAME ": %s: Removing an active aggregator\n", + slave->dev->master->name); // select new active aggregator ad_agg_selection_logic(__get_first_agg(port)); } @@ -2230,8 +2232,9 @@ // if slave is null, the whole port is not initialized if (!port->slave) { - printk(KERN_WARNING DRV_NAME ": Warning: speed changed for uninitialized port on %s\n", - slave->dev->name); + printk(KERN_WARNING DRV_NAME ": Warning: %s: speed " + "changed for uninitialized port on %s\n", + slave->dev->master->name, slave->dev->name); return; } @@ -2257,8 +2260,9 @@ // if slave is null, the whole port is not initialized if (!port->slave) { - printk(KERN_WARNING DRV_NAME ": Warning: duplex changed for uninitialized port on %s\n", - slave->dev->name); + printk(KERN_WARNING DRV_NAME ": %s: Warning: duplex changed " + "for uninitialized port on %s\n", + slave->dev->master->name, slave->dev->name); return; } @@ -2363,7 +2367,8 @@ } if (bond_3ad_get_active_agg_info(bond, &ad_info)) { - printk(KERN_DEBUG "ERROR: bond_3ad_get_active_agg_info failed\n"); + printk(KERN_DEBUG DRV_NAME ": %s: Error: " + "bond_3ad_get_active_agg_info failed\n", dev->name); goto out; } @@ -2372,7 +2377,9 @@ if (slaves_in_agg == 0) { /*the aggregator is empty*/ - printk(KERN_DEBUG "ERROR: active aggregator is empty\n"); + printk(KERN_DEBUG DRV_NAME ": %s: Error: active " + "aggregator is empty\n", + dev->name); goto out; } diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c 
linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -1884,10 +1884,10 @@ new_slave->dev->name); if (bond->params.mode == BOND_MODE_8023AD) { - printk(KERN_WARNING - "Operation of 802.3ad mode requires ETHTOOL " + printk(KERN_WARNING DRV_NAME + ": %s: Warning: Operation of 802.3ad mode requires ETHTOOL " "support in base driver for proper aggregator " - "selection.\n"); + "selection.\n", bond_dev->name); } } @@ -2716,8 +2706,11 @@ break; default: /* Should not happen */ - printk(KERN_ERR "bonding: Error: %s Illegal value (link=%d)\n", - slave->dev->name, slave->link); + printk(KERN_ERR DRV_NAME + ": %s: Error: %s Illegal value (link=%d)\n", + bond_dev->name, + slave->dev->name, + slave->link); goto out; } /* end of switch (slave->link) */ From radheka.godse@intel.com Fri Jul 1 13:55:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:55:30 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KtOH9010859 for ; Fri, 1 Jul 2005 13:55:25 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Krmkn020954; Fri, 1 Jul 2005 20:53:48 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61Krm7G002460; Fri, 1 Jul 2005 20:53:48 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KrmSL003920; Fri, 1 Jul 2005 13:53:48 -0700 Date: Fri, 1 Jul 2005 13:53:01 -0700 (PDT) From: Radheka Godse X-X-Sender: 
radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 12/17] bonding: prefix bondname to log messages Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2603 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch adds a bond-name prefix to logs printed to /var/log/messages to identify messages on a per-bond basis. Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_3ad.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_3ad.c --- linux-2.6.12post/drivers/net/bonding/bond_3ad.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_3ad.c 2005-06-28 18:21:34.000000000 -0700 @@ -1378,8 +1378,9 @@ } } if (!curr_port) { // meaning: the port was related to an aggregator but was not on the aggregator port list - printk(KERN_WARNING DRV_NAME ": Warning: Port %d (on %s) was " + printk(KERN_WARNING DRV_NAME ": %s: Warning: Port %d (on %s) was " "related to aggregator %d but was not on its port list\n", + port->slave->dev->master->name, port->actor_port_number, port->slave->dev->name, port->aggregator->aggregator_identifier); } @@ -1450,7 +1451,8 @@ dprintk("Port %d joined LAG %d(new LAG)\n", port->actor_port_number, port->aggregator->aggregator_identifier); } else { - printk(KERN_ERR DRV_NAME ": Port %d (on %s) did not find a suitable aggregator\n", + printk(KERN_ERR DRV_NAME ": %s: Port %d (on %s) did not find a suitable aggregator\n", + port->slave->dev->master->name, port->actor_port_number, port->slave->dev->name); } } @@ -1582,8 +1584,9 @@ // check if any partner replys if (best_aggregator->is_individual) { - printk(KERN_WARNING DRV_NAME ": Warning: No 
802.3ad response from the link partner " - "for any adapters in the bond\n"); + printk(KERN_WARNING DRV_NAME ": %s: Warning: No 802.3ad response from " + "the link partner for any adapters in the bond\n", + best_aggregator->slave->dev->master->name); } // check if there are more than one aggregator @@ -1991,7 +1994,9 @@ // if slave is null, the whole port is not initialized if (!port->slave) { - printk(KERN_WARNING DRV_NAME ": Trying to unbind an uninitialized port on %s\n", slave->dev->name); + printk(KERN_WARNING DRV_NAME ": Warning: %s: Trying to " + "unbind an uninitialized port on %s\n", + slave->dev->master->name, slave->dev->name); return; } @@ -2022,7 +2027,8 @@ dprintk("Some port(s) related to LAG %d - replaceing with LAG %d\n", aggregator->aggregator_identifier, new_aggregator->aggregator_identifier); if ((new_aggregator->lag_ports == port) && new_aggregator->is_active) { - printk(KERN_INFO DRV_NAME ": Removing an active aggregator\n"); + printk(KERN_INFO DRV_NAME ": %s: Removing an active aggregator\n", + aggregator->slave->dev->master->name); // select new active aggregator select_new_active_agg = 1; } @@ -2052,15 +2058,17 @@ ad_agg_selection_logic(__get_first_agg(port)); } } else { - printk(KERN_WARNING DRV_NAME ": Warning: unbinding aggregator, " - "and could not find a new aggregator for its ports\n"); + printk(KERN_WARNING DRV_NAME ": %s: Warning: unbinding aggregator, " + "and could not find a new aggregator for its ports\n", + slave->dev->master->name); } } else { // in case that the only port related to this aggregator is the one we want to remove select_new_active_agg = aggregator->is_active; // clear the aggregator ad_clear_agg(aggregator); if (select_new_active_agg) { - printk(KERN_INFO "Removing an active aggregator\n"); + printk(KERN_INFO DRV_NAME ": %s: Removing an active aggregator\n", + slave->dev->master->name); // select new active aggregator ad_agg_selection_logic(__get_first_agg(port)); } @@ -2133,7 +2141,8 @@ // select the active 
aggregator for the bond if ((port = __get_first_port(bond))) { if (!port->slave) { - printk(KERN_WARNING DRV_NAME ": Warning: bond's first port is uninitialized\n"); + printk(KERN_WARNING DRV_NAME ": %s: Warning: bond's first port is " + "uninitialized\n", bond->dev->name); goto re_arm; } @@ -2145,7 +2154,8 @@ // for each port run the state machines for (port = __get_first_port(bond); port; port = __get_next_port(port)) { if (!port->slave) { - printk(KERN_WARNING DRV_NAME ": Warning: Found an uninitialized port\n"); + printk(KERN_WARNING DRV_NAME ": %s: Warning: Found an uninitialized " + "port\n", bond->dev->name); goto re_arm; } @@ -2186,7 +2196,8 @@ port = &(SLAVE_AD_INFO(slave).port); if (!port->slave) { - printk(KERN_WARNING DRV_NAME ": Warning: port of slave %s is uninitialized\n", slave->dev->name); + printk(KERN_WARNING DRV_NAME ": %s: Warning: port of slave %s is " + "uninitialized\n", slave->dev->master->name, slave->dev->name); return; } @@ -2289,8 +2300,9 @@ // if slave is null, the whole port is not initialized if (!port->slave) { - printk(KERN_WARNING DRV_NAME ": Warning: link status changed for uninitialized port on %s\n", - slave->dev->name); + printk(KERN_WARNING DRV_NAME ": Warning: %s: link status changed for " + "uninitialized port on %s\n", + slave->dev->master->name, slave->dev->name); return; } @@ -2397,7 +2409,8 @@ } if (slave_agg_no >= 0) { - printk(KERN_ERR DRV_NAME ": Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id); + printk(KERN_ERR DRV_NAME ": %s: Error: Couldn't find a slave to tx on " + "for aggregator ID %d\n", dev->name, agg_id); goto out; } diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_alb.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c --- linux-2.6.12post/drivers/net/bonding/bond_alb.c 2005-06-17 12:48:29.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c 2005-06-28 18:21:35.000000000 -0700 @@ -515,7 +515,8 @@ client_info->mac_dst); if (!skb) { 
printk(KERN_ERR DRV_NAME - ": Error: failed to create an ARP packet\n"); + ": %s: Error: failed to create an ARP packet\n", + client_info->slave->dev->master->name); continue; } @@ -525,7 +526,8 @@ skb = vlan_put_tag(skb, client_info->vlan_id); if (!skb) { printk(KERN_ERR DRV_NAME - ": Error: failed to insert VLAN tag\n"); + ": %s: Error: failed to insert VLAN tag\n", + client_info->slave->dev->master->name); continue; } } @@ -608,8 +610,9 @@ if (!client_info->slave) { printk(KERN_ERR DRV_NAME - ": Error: found a client with no channel in " - "the client's hash table\n"); + ": %s: Error: found a client with no channel in " + "the client's hash table\n", + bond->dev->name); continue; } /*update all clients using this src_ip, that are not assigned @@ -930,7 +933,8 @@ skb = vlan_put_tag(skb, vlan->vlan_id); if (!skb) { printk(KERN_ERR DRV_NAME - ": Error: failed to insert VLAN tag\n"); + ": %s: Error: failed to insert VLAN tag\n", + bond->dev->name); continue; } } @@ -959,11 +963,11 @@ s_addr.sa_family = dev->type; if (dev_set_mac_address(dev, &s_addr)) { printk(KERN_ERR DRV_NAME - ": Error: dev_set_mac_address of dev %s failed! ALB " + ": %s: Error: dev_set_mac_address of dev %s failed! 
ALB " "mode requires that the base driver support setting " "the hw address also when the network device's " "interface is open\n", - dev->name); + dev->master->name, dev->name); return -EOPNOTSUPP; } return 0; @@ -1113,9 +1117,9 @@ * of the new slave */ printk(KERN_ERR DRV_NAME - ": Error: the hw address of slave %s is not " + ": %s: Error: the hw address of slave %s is not " "unique - cannot enslave it!", - slave->dev->name); + bond->dev->name, slave->dev->name); return -EINVAL; } @@ -1161,16 +1165,16 @@ bond->alb_info.rlb_enabled); printk(KERN_WARNING DRV_NAME - ": Warning: the hw address of slave %s is in use by " + ": %s: Warning: the hw address of slave %s is in use by " "the bond; giving it the hw address of %s\n", - slave->dev->name, free_mac_slave->dev->name); + bond->dev->name, slave->dev->name, free_mac_slave->dev->name); } else if (has_bond_addr) { printk(KERN_ERR DRV_NAME - ": Error: the hw address of slave %s is in use by the " + ": %s: Error: the hw address of slave %s is in use by the " "bond; couldn't find a slave with a free hw address to " "give it (this should not have happened)\n", - slave->dev->name); + bond->dev->name, slave->dev->name); return -EFAULT; } diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -1621,8 +1621,8 @@ if (slave_dev->do_ioctl == NULL) { printk(KERN_WARNING DRV_NAME - ": Warning : no link monitoring support for %s\n", - slave_dev->name); + ": %s: Warning: no link monitoring support for %s\n", + bond_dev->name, slave_dev->name); } /* bond must be initialized by bond_open() before enslaving */ @@ -1643,17 +1643,17 @@ dprintk("%s: NETIF_F_VLAN_CHALLENGED\n", slave_dev->name); if (!list_empty(&bond->vlan_list)) { printk(KERN_ERR DRV_NAME - ": Error: cannot enslave VLAN 
" + ": %s: Error: cannot enslave VLAN " "challenged slave %s on VLAN enabled " - "bond %s\n", slave_dev->name, + "bond %s\n", bond_dev->name, slave_dev->name, bond_dev->name); return -EPERM; } else { printk(KERN_WARNING DRV_NAME - ": Warning: enslaved VLAN challenged " + ": %s: Warning: enslaved VLAN challenged " "slave %s. Adding VLANs will be blocked as " "long as %s is part of bond %s\n", - slave_dev->name, slave_dev->name, + bond_dev->name, slave_dev->name, slave_dev->name, bond_dev->name); bond_dev->features |= NETIF_F_VLAN_CHALLENGED; } @@ -1705,8 +1705,8 @@ */ if (!(slave_dev->flags & IFF_UP)) { printk(KERN_ERR DRV_NAME - ": Error: %s is not running\n", - slave_dev->name); + ": %s: Error: %s is not running\n", + bond_dev->name, slave_dev->name); res = -EINVAL; goto err_undo_flags; } @@ -1715,9 +1715,9 @@ (bond->params.mode == BOND_MODE_TLB) || (bond->params.mode == BOND_MODE_ALB)) { printk(KERN_ERR DRV_NAME - ": Error: to use %s mode, you must upgrade " + ": %s: Error: to use %s mode, you must upgrade " "ifenslave.\n", - bond_mode_name(bond->params.mode)); + bond_dev->name, bond_mode_name(bond->params.mode)); res = -EOPNOTSUPP; goto err_undo_flags; } @@ -1838,21 +1838,21 @@ * the messages for netif_carrier. */ printk(KERN_WARNING DRV_NAME - ": Warning: MII and ETHTOOL support not " + ": %s: Warning: MII and ETHTOOL support not " "available for interface %s, and " "arp_interval/arp_ip_target module parameters " "not specified, thus bonding will not detect " "link failures! 
see bonding.txt for details.\n", - slave_dev->name); + bond_dev->name, slave_dev->name); } else if (link_reporting == -1) { /* unable get link status using mii/ethtool */ printk(KERN_WARNING DRV_NAME - ": Warning: can't get link status from " + ": %s: Warning: can't get link status from " "interface %s; the network driver associated " "with this interface does not support MII or " "ETHTOOL link status reporting, thus miimon " "has no effect on this interface.\n", - slave_dev->name); + bond_dev->name, slave_dev->name); } } @@ -1879,9 +1894,9 @@ if (bond_update_speed_duplex(new_slave) && (new_slave->link != BOND_LINK_DOWN)) { printk(KERN_WARNING DRV_NAME - ": Warning: failed to get speed and duplex from %s, " + ": %s: Warning: failed to get speed and duplex from %s, " "assumed to be 100Mb/sec and Full.\n", - new_slave->dev->name); + bond_dev->name, new_slave->dev->name); if (bond->params.mode == BOND_MODE_8023AD) { printk(KERN_WARNING DRV_NAME @@ -2080,11 +2080,12 @@ ETH_ALEN); if (!mac_addr_differ && (bond->slave_cnt > 1)) { printk(KERN_WARNING DRV_NAME - ": Warning: the permanent HWaddr of %s " + ": %s: Warning: the permanent HWaddr of %s " "- %02X:%02X:%02X:%02X:%02X:%02X - is " "still in use by %s. 
Set the HWaddr of " "%s to a different address to avoid " "conflicts.\n", + bond_dev->name, slave_dev->name, slave->perm_hwaddr[0], slave->perm_hwaddr[1], @@ -2158,19 +2168,20 @@ bond_dev->features |= NETIF_F_VLAN_CHALLENGED; } else { printk(KERN_WARNING DRV_NAME - ": Warning: clearing HW address of %s while it " + ": %s: Warning: clearing HW address of %s while it " "still has VLANs.\n", - bond_dev->name); + bond_dev->name, bond_dev->name); printk(KERN_WARNING DRV_NAME - ": When re-adding slaves, make sure the bond's " - "HW address matches its VLANs'.\n"); + ": %s: When re-adding slaves, make sure the bond's " + "HW address matches its VLANs'.\n", + bond_dev->name); } } else if ((bond_dev->features & NETIF_F_VLAN_CHALLENGED) && !bond_has_challenged_slaves(bond)) { printk(KERN_INFO DRV_NAME - ": last VLAN challenged slave %s " + ": %s: last VLAN challenged slave %s " "left bond %s. VLAN blocking is removed\n", - slave_dev->name, bond_dev->name); + bond_dev->name, slave_dev->name, bond_dev->name); bond_dev->features &= ~NETIF_F_VLAN_CHALLENGED; } @@ -2327,12 +2329,13 @@ bond_dev->features |= NETIF_F_VLAN_CHALLENGED; } else { printk(KERN_WARNING DRV_NAME - ": Warning: clearing HW address of %s while it " + ": %s: Warning: clearing HW address of %s while it " "still has VLANs.\n", - bond_dev->name); + bond_dev->name, bond_dev->name); printk(KERN_WARNING DRV_NAME - ": When re-adding slaves, make sure the bond's " - "HW address matches its VLANs'.\n"); + ": %s: When re-adding slaves, make sure the bond's " + "HW address matches its VLANs'.\n", + bond_dev->name); } printk(KERN_INFO DRV_NAME @@ -2424,8 +2427,9 @@ &endptr, 0); if (*endptr) { printk(KERN_ERR DRV_NAME - ": Error: got invalid ABI " - "version from application\n"); + ": %s: Error: got invalid ABI " + "version from application\n", + bond_dev->name); return -EINVAL; } @@ -4496,8 +4500,9 @@ struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC); if (!skb2) { printk(KERN_ERR DRV_NAME - ": Error: bond_xmit_broadcast(): 
" - "skb_clone() failed\n"); + ": %s: Error: bond_xmit_broadcast(): " + "skb_clone() failed\n", + bond_dev->name); continue; } @@ -4566,7 +4571,8 @@ default: /* Should never happen, mode already checked */ printk(KERN_ERR DRV_NAME - ": Error: Unknown bonding mode %d\n", + ": %s: Error: Unknown bonding mode %d\n", + bond_dev->name, mode); break; } From radheka.godse@intel.com Fri Jul 1 13:56:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:56:44 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KuKH9011318 for ; Fri, 1 Jul 2005 13:56:40 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Ksikn021391; Fri, 1 Jul 2005 20:54:44 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61KsiPb004944; Fri, 1 Jul 2005 20:54:44 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61KshSL004040; Fri, 1 Jul 2005 13:54:44 -0700 Date: Fri, 1 Jul 2005 13:53:51 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 13/17] bonding: make error messages more consistent Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2604 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch attempts to make error reporting more consistent, removes/adds 
newlines as appropriate, and adds the "bonding: <bondname>: " prefix where it is missing (especially in alb and ad messages). Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_3ad.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_3ad.c --- linux-2.6.12post/drivers/net/bonding/bond_3ad.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_3ad.c 2005-06-28 18:21:34.000000000 -0700 @@ -1198,10 +1198,10 @@ // detect loopback situation if (!MAC_ADDRESS_COMPARE(&(lacpdu->actor_system), &(port->actor_system))) { // INFO_RECEIVED_LOOPBACK_FRAMES - printk(KERN_ERR DRV_NAME ": An illegal loopback occurred on adapter (%s)\n", - port->slave->dev->name); - printk(KERN_ERR "Check the configuration to verify that all Adapters " - "are connected to 802.3ad compliant switch ports\n"); + printk(KERN_ERR DRV_NAME ": %s: An illegal loopback occurred on " + "adapter (%s). Check the configuration to verify that all " + "Adapters are connected to 802.3ad compliant switch ports\n", + port->slave->dev->master->name, port->slave->dev->name); __release_rx_machine_lock(port); return; } diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_alb.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c --- linux-2.6.12post/drivers/net/bonding/bond_alb.c 2005-06-17 12:48:29.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c 2005-06-28 18:21:35.000000000 -0700 @@ -206,7 +206,7 @@ new_hashtbl = kmalloc(size, GFP_KERNEL); if (!new_hashtbl) { printk(KERN_ERR DRV_NAME - ": Error: %s: Failed to allocate TLB hash table\n", + ": %s: Error: Failed to allocate TLB hash table\n", bond->dev->name); return -1; } @@ -811,7 +798,7 @@ new_hashtbl = kmalloc(size, GFP_KERNEL); if (!new_hashtbl) { printk(KERN_ERR DRV_NAME - ": Error: %s: Failed to allocate RLB hash table\n", + ": %s: Error: Failed to allocate RLB hash table\n", bond->dev->name); return -1; } diff -urN -X dontdiff 
linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -919,7 +919,7 @@ res = bond_add_vlan(bond, vid); if (res) { printk(KERN_ERR DRV_NAME - ": %s: Failed to add vlan id %d\n", + ": %s: Error: Failed to add vlan id %d\n", bond_dev->name, vid); } } @@ -953,7 +953,7 @@ res = bond_del_vlan(bond, vid); if (res) { printk(KERN_ERR DRV_NAME - ": %s: Failed to remove vlan id %d\n", + ": %s: Error: Failed to remove vlan id %d\n", bond_dev->name, vid); } } @@ -1690,11 +1698,10 @@ if (slave_dev->set_mac_address == NULL) { printk(KERN_ERR DRV_NAME - ": Error: The slave device you specified does " - "not support setting the MAC address.\n"); - printk(KERN_ERR + ": %s: Error: The slave device you specified does " + "not support setting the MAC address. " "Your kernel likely does not support slave " - "devices.\n"); + "devices.\n", bond_dev->name); res = -EOPNOTSUPP; goto err_undo_flags; @@ -2059,7 +2058,7 @@ if (!(slave_dev->flags & IFF_SLAVE) || (slave_dev->master != bond_dev)) { printk(KERN_ERR DRV_NAME - ": Error: %s: cannot release %s.\n", + ": %s: Error: cannot release %s.\n", bond_dev->name, slave_dev->name); return -EINVAL; } @@ -4758,7 +4757,7 @@ if (max_bonds < 1 || max_bonds > INT_MAX) { printk(KERN_WARNING DRV_NAME ": Warning: max_bonds (%d) not in range %d-%d, so it " - "was reset to BOND_DEFAULT_MAX_BONDS (%d)", + "was reset to BOND_DEFAULT_MAX_BONDS (%d)\n", max_bonds, 1, INT_MAX, BOND_DEFAULT_MAX_BONDS); max_bonds = BOND_DEFAULT_MAX_BONDS; } From radheka.godse@intel.com Fri Jul 1 13:57:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 13:57:44 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61KvfH9011902 for ; 
Fri, 1 Jul 2005 13:57:41 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61Ku5tp009751; Fri, 1 Jul 2005 20:56:05 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61Ku57G003690; Fri, 1 Jul 2005 20:56:05 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61Ku0SL004186; Fri, 1 Jul 2005 13:56:04 -0700 Date: Fri, 1 Jul 2005 13:55:08 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 14/17] bonding: spelling and whitespace correction Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2605 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev Trivial patch to fix spelling errors and make some whitespace changes for readability. 
Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_alb.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c --- linux-2.6.12post/drivers/net/bonding/bond_alb.c 2005-06-17 12:48:29.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_alb.c 2005-06-28 18:21:35.000000000 -0700 @@ -1423,7 +1423,7 @@ read_lock(&bond->curr_slave_lock); bond_for_each_slave(bond, slave, i) { - alb_send_learning_packets(slave,slave->dev->dev_addr); + alb_send_learning_packets(slave, slave->dev->dev_addr); } read_unlock(&bond->curr_slave_lock); diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -165,7 +165,7 @@ }; struct slave { - struct net_device *dev; /* first - usefull for panic debug */ + struct net_device *dev; /* first - useful for panic debug */ struct slave *next; struct slave *prev; s16 delay; @@ -191,7 +191,7 @@ * beforehand. 
*/ struct bonding { - struct net_device *dev; /* first - usefull for panic debug */ + struct net_device *dev; /* first - useful for panic debug */ struct slave *first_slave; struct slave *curr_active_slave; struct slave *current_arp_slave; diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -2117,7 +2117,6 @@ /* release the slave from its bond */ bond_detach_slave(bond, slave); - if (bond->primary_slave == slave) { bond->primary_slave = NULL; } @@ -2436,7 +2435,6 @@ if (orig_app_abi_ver == -1) { orig_app_abi_ver = new_abi_ver; } - app_abi_ver = new_abi_ver; } @@ -4226,6 +4224,7 @@ bond_for_each_slave(bond, slave, i) { dprintk("s %p s->p %p c_m %p\n", slave, slave->prev, slave->dev->change_mtu); + res = dev_set_mtu(slave->dev, new_mtu); if (res) { @@ -4623,7 +4622,7 @@ */ bond_dev->features |= NETIF_F_VLAN_CHALLENGED; - /* don't acquire bond device's xmit_lock when + /* don't acquire bond device's xmit_lock when * transmitting */ bond_dev->features |= NETIF_F_LLTX; @@ -5035,7 +4976,6 @@ #ifdef CONFIG_PROC_FS bond_create_proc_dir(); #endif - for (i = 0; i < max_bonds; i++) { sprintf(new_bond_name, "bond%d",i); res = bond_create(new_bond_name,&bonding_defaults, NULL); From radheka.godse@intel.com Fri Jul 1 14:11:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 14:11:55 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61LBOH9013745 for ; Fri, 1 Jul 2005 14:11:46 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61L9kkn028420; Fri, 1 Jul 2005 21:09:46 
GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61L9k7G011307; Fri, 1 Jul 2005 21:09:46 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61L9jSL006114; Fri, 1 Jul 2005 14:09:46 -0700 Date: Fri, 1 Jul 2005 14:08:52 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 15/17] bonding: include ARP and Xmit Hash Policy information in /proc file Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2606 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev For bonds configured to do ARP monitoring, this patch displays polling interval and ip targets info in their respective proc files. For balance-XOR and 802.3ad modes, this patch displays Xmit Hash Policy. 
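The ARP target list described above can be sketched in userspace C as follows. This is only an illustration of the "n.n.n.n, n.n.n.n" formatting, not the driver code: the function name format_arp_targets() and the constant MAX_ARP_TARGETS are hypothetical stand-ins, and the sketch assumes the targets are stored in network byte order in a zero-terminated array, as the patch's use of ntohl() suggests.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's BOND_MAX_ARP_TARGETS. */
#define MAX_ARP_TARGETS 16

/*
 * Format a zero-terminated array of IPv4 targets (network byte order)
 * as " a.b.c.d, e.f.g.h\n". Returns the number of characters written
 * (excluding the terminating NUL), or -1 if the buffer is too small.
 */
static int format_arp_targets(char *buf, size_t len, const uint32_t *targets)
{
    size_t used = 0;
    int i;

    assert(buf != NULL && targets != NULL && len > 0);
    buf[0] = '\0';

    for (i = 0; i < MAX_ARP_TARGETS && targets[i]; i++) {
        /* Convert to host order so the octets can be extracted by shifting. */
        uint32_t t = ntohl(targets[i]);
        /* The last printed entry ends the line; the others get a comma. */
        int last = (i + 1 == MAX_ARP_TARGETS) || !targets[i + 1];
        int w = snprintf(buf + used, len - used, " %u.%u.%u.%u%s",
                         (unsigned)(t >> 24) & 0xff,
                         (unsigned)(t >> 16) & 0xff,
                         (unsigned)(t >> 8) & 0xff,
                         (unsigned)t & 0xff,
                         last ? "\n" : ",");

        if (w < 0 || (size_t)w >= len - used)
            return -1; /* output would have been truncated */
        used += (size_t)w;
    }
    return (int)used;
}
```

With two targets, 10.0.0.1 and 10.0.0.2, the buffer ends up holding " 10.0.0.1, 10.0.0.2" followed by a newline, which matches the shape the seq_printf() calls in the diff below would produce after the "ARP IP target/s" label.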
Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -3370,6 +3370,8 @@ { struct bonding *bond = seq->private; struct slave *curr; + int i; + u32 target; read_lock(&bond->curr_slave_lock); curr = bond->curr_active_slave; @@ -3378,6 +3366,13 @@ seq_printf(seq, "Bonding Mode: %s\n", bond_mode_name(bond->params.mode)); + if (bond->params.mode == BOND_MODE_XOR || + bond->params.mode == BOND_MODE_8023AD) { + seq_printf(seq, "Transmit Hash Policy: %s (%d)\n", + xmit_hashtype_tbl[bond->params.xmit_policy].modename, + bond->params.xmit_policy); + } + if (USES_PRIMARY(bond->params.mode)) { seq_printf(seq, "Primary Slave: %s\n", (bond->primary_slave) ? @@ -3394,6 +3403,24 @@ seq_printf(seq, "Down Delay (ms): %d\n", bond->params.downdelay * bond->params.miimon); + + // ARP information + if(bond->params.arp_interval > 0) { + seq_printf(seq, "ARP Polling Interval (ms): %d\n", + bond->params.arp_interval); + + seq_printf(seq, "ARP IP target/s (n.n.n.n form):"); + + for(i = 0; (i < BOND_MAX_ARP_TARGETS) && bond->params.arp_targets[i] ;i++) { + target = ntohl(bond->params.arp_targets[i]); + seq_printf(seq, " %d.%d.%d.%d", HIPQUAD(target)); + if((i+1 < BOND_MAX_ARP_TARGETS) && bond->params.arp_targets[i+1]) + seq_printf(seq, ","); + else + seq_printf(seq, "\n"); + } + } + if (bond->params.mode == BOND_MODE_8023AD) { struct ad_info ad_info; From radheka.godse@intel.com Fri Jul 1 14:12:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 14:12:45 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61LCfH9014049 for ; Fri, 1 Jul 2005 14:12:41 -0700 Received: 
from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61LB5tp012564; Fri, 1 Jul 2005 21:11:05 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61LB57G013097; Fri, 1 Jul 2005 21:11:05 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61LB5SL006345; Fri, 1 Jul 2005 14:11:05 -0700 Date: Fri, 1 Jul 2005 14:10:17 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 16/17] bonding: version, date and log update Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2607 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev This patch updates the bonding version number and adds a few entries to the change log in bond_main. The major version number is changed to 3 because of the sysfs interface. 
Signed-off-by: Radheka Godse Signed-off-by: Mitch Williams diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h 2005-06-30 13:58:27.000000000 -0700 @@ -29,6 +29,10 @@ * 2005/05/05 - Jason Gabler * - added "xmit_policy" kernel parameter for alternate hashing policy * support for mode 2 + * + * 2005/06/21 - Mitch Williams + * Radheka Godse + * - Added bonding sysfs interface */ #ifndef _LINUX_BONDING_H @@ -41,8 +37,8 @@ #include "bond_3ad.h" #include "bond_alb.h" -#define DRV_VERSION "2.6.3" -#define DRV_RELDATE "June 8, 2005" +#define DRV_VERSION "3.0.0" +#define DRV_RELDATE "June 28, 2005" #define DRV_NAME "bonding" #define DRV_DESCRIPTION "Ethernet Channel Bonding Driver" diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -487,7 +487,36 @@ * * Added xmit_hash_policy_layer34() * - Modified by Jay Vosburgh to also support mode 4. * Set version to 2.6.3. - */ + * 2005/06/24 - Mitch Williams + * - Radheka Godse + * - Added bonding sysfs interface + * + * - pre-work: + * - Split out bond creation code to allow for future addition of + * sysfs interface. + * - Added extra optional parameter to bond_enslave to return a + * a pointer to the new slave. This will be used by future + * sysfs functionality. + * - Removed static declaration on some functions and data items. + * + * - Added sysfs support, including capability to add/remove/change + * any bond at runtime. 
+ * + * - Miscellaneous: + * - Added bonding: : prefix to sysfs log messages + * - added arp_ip_targets to /proc entry + * - trivial fix: added missing modes description to modinfo + * - Corrected bug in ALB init where kmalloc is called inside + * a held lock + * - Corrected behavior to maintain bond link when changing + * from arp monitor to miimon and vice versa + * - Added missing bonding: : prefix to alb, ad log messages + * - Fixed stack dump warnings seen if changing between miimon + * and arp monitoring when the bond interface is down. + * - Fixed stack dump warnings seen when enslaving an e100 + * driver + * Set version to 3.0.0 +*/ //#define BONDING_DEBUG 1 From radheka.godse@intel.com Fri Jul 1 14:16:24 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 14:16:26 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j61LGNH9015503 for ; Fri, 1 Jul 2005 14:16:23 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j61LElq8001904; Fri, 1 Jul 2005 21:14:47 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j61LElPb018068; Fri, 1 Jul 2005 21:14:47 GMT Received: from [134.134.3.92] ([134.134.3.92]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j61LElSL006679; Fri, 1 Jul 2005 14:14:47 -0700 Date: Fri, 1 Jul 2005 14:14:00 -0700 (PDT) From: Radheka Godse X-X-Sender: radheka@localhost.localdomain To: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net cc: netdev@oss.sgi.com Subject: [PATCH 2.6.13-rc1 17/17] bonding: Optimization to read MII only when link status has changed. 
Message-ID: ReplyTo: "Radheka Godse" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2608 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: radheka.godse@intel.com Precedence: bulk X-list: netdev Enhanced bond_mii_monitor fn: to read MII only when link status has changed for an enslaved adapter. Signed-off-by: Radheka Godse diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bond_main.c linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c --- linux-2.6.12post/drivers/net/bonding/bond_main.c 2005-06-28 18:18:03.000000000 -0700 +++ linux-2.6.12post-sysfs/drivers/net/bonding/bond_main.c 2005-06-30 13:53:55.000000000 -0700 @@ -2652,6 +2652,7 @@ if (slave == oldcurrent) { do_failover = 1; } + bond_update_speed_duplex(slave); } else { slave->delay--; } @@ -2742,6 +2675,7 @@ } else { slave->delay--; } + bond_update_speed_duplex(slave); } break; default: @@ -2754,8 +2675,6 @@ goto out; } /* end of switch (slave->link) */ - bond_update_speed_duplex(slave); - if (bond->params.mode == BOND_MODE_8023AD) { if (old_speed != slave->speed) { bond_3ad_adapter_speed_changed(slave); ~ From zdenek@rcn.com Fri Jul 1 19:24:38 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 19:24:43 -0700 (PDT) Received: from smtp05.mrf.mail.rcn.net (smtp05.mrf.mail.rcn.net [207.172.4.64]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j622ObH9001243 for ; Fri, 1 Jul 2005 19:24:38 -0700 Received: from 209-150-50-115.c3-0.nwt-ubr3.sbo-nwt.ma.cable.rcn.com (HELO funex) (209.150.50.115) by smtp05.mrf.mail.rcn.net with SMTP; 01 Jul 2005 22:23:07 -0400 Message-Id: <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> X-IronPort-AV: i="3.93,252,1115006400"; d="scan'208"; a="57200722:sNHT42246956" X-Sender: zdenek@pop.rcn.com X-Mailer: QUALCOMM Windows Eudora Pro Version 4.0 Date: Fri, 01 Jul 2005 22:16:13 -0400 To: 
netdev@oss.sgi.com, linux-net@vger.kernel.org From: Zdenek Radouch Subject: controlling ARP Proxy scope? Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-archive-position: 2609 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zdenek@rcn.com Precedence: bulk X-list: netdev I haven't been able to locate anything discussing how to control the scope of Linux proxy ARP. So, left with only a binary flag in /proc, and network definition on the interface, I assumed (perhaps naively) that the arp would proxy only for the addresses within the subnet defined for the interface (on which the proxy arp is turned on). However, that does not seem to be the case. I have an interface with address 10.1.2.219 and mask 255.255.255.248 with proxy arp turned on on this interface, and the machine is responding (I see that with tcpdump) to arp requests for address 10.1.2.1, i.e., an address outside of the proxy interface's subnet. Can anyone explain the behavior? What is the scope of the proxying? If the scope is not limited to the proxy interface's subnet, then how do I avoid proxying for addresses of machines facing the proxy server? This seems to be quite broken. I am running 2.4.25 kernel. I'll appreciate any pointers. Thanks! 
-Zdenek From dtor_core@ameritech.net Fri Jul 1 22:31:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 01 Jul 2005 22:31:49 -0700 (PDT) Received: from smtp101.sbc.mail.re2.yahoo.com (smtp101.sbc.mail.re2.yahoo.com [68.142.229.104]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j625VdH9013788 for ; Fri, 1 Jul 2005 22:31:40 -0700 Received: (qmail 95981 invoked from network); 2 Jul 2005 05:30:05 -0000 Received: from unknown (HELO mail.corenet.homeip.net) (dtor?core@ameritech.net@68.253.32.177 with login) by smtp101.sbc.mail.re2.yahoo.com with SMTP; 2 Jul 2005 05:30:04 -0000 From: Dmitry Torokhov To: netdev@oss.sgi.com Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Date: Sat, 2 Jul 2005 00:30:03 -0500 User-Agent: KMail/1.8.1 Cc: Radheka Godse , fubar@us.ibm.com, bonding-devel@lists.sourceforge.net References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200507020030.03635.dtor_core@ameritech.net> X-archive-position: 2610 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dtor_core@ameritech.net Precedence: bulk X-list: netdev Hi, On Friday 01 July 2005 15:48, Radheka Godse wrote: > This large patch adds the sysfs interface to channel bonding. It will > allow users to add and remove bonds, add and remove slaves, and change > all bonding parameters without using ifenslave. > The ifenslave interface still works. ... A couple of comments: > @@ -3569,7 +3593,10 @@ > bond_remove_proc_entry(bond); > bond_create_proc_entry(bond); > #endif > - > + down_write(&(bonding_rwsem)); > + bond_destroy_sysfs_entry(bond); > + bond_create_sysfs_entry(bond); > + up_write(&(bonding_rwsem)); Space vs. tab indentation. > return NOTIFY_DONE; > } > > @@ -4101,6 +4128,7 @@ > orig_app_abi_ver = prev_abi_ver; > } > > + up_write(&(bonding_rwsem)); Why extra parens? 
> + * Changes: > + * > + * 2004/12/12 - Mitch Williams > + * - Initial creation of sysfs interface. > + * > + * 2005/06/22 - Radheka Godse > + * - Added ifenslave -c type functionality to sysfs > + * - Added sysfs files for attributes such as MII Status and > + * 802.3ad aggregator that are displayed in /proc > + * - Added "name value" format to sysfs "mode" and > + * "lacp_rate", for e.g., "active-backup 1" or "slow 0" for > + * consistency and ease of script parsing > + * - Fixed reversal of octets in arp_ip_targets via sysfs > + * - sysfs support to handle bond interface re-naming > + * - Moved all sysfs entries into /sys/class/net instead of > + * of using a standalone subsystem. > + * - Added sysfs symlinks between masters and slaves > + * - Corrected bugs in sysfs unload path when creating bonds > + * with existing interface names. > + * - Removed redundant sysfs stat file since it duplicates slave info > + * from the proc file > + * - Fixed errors in sysfs show/store arp targets. > + * - For consistency with ifenslave, instead of exiting > + * with an error, updated bonding sysfs to > + * close and attempt to enslave an up adapter. > + * - Fixed NULL dereference when adding a slave interface > + * that does not exist. > + * - Added checks in sysfs bonding to reject invalid ip addresses > + * - Synch up with post linux-2.6.12 bonding changes > + * - Created sysfs bond attrib for xmit_hash_policy I think we prefer to rely on SCMs to keep changelogs for new modules. > + > +static struct class *netdev_class; > +/*--------------------------- Data Structures -----------------------------*/ > + > +/* Bonding sysfs lock. Why can't we just use the subsytem lock? > + * Because kobject_register tries to acquire the subsystem lock. If > + * we already hold the lock (which we would if the user was creating > + * a new bond through the sysfs interface), we deadlock. > + */ > + > +struct rw_semaphore bonding_rwsem; klists were just added to the kernel proper. 
Does this sentiment still hold true? > + > +/* > + * "show" function for the bond_masters attribute. > + * The class parameter is ignored. > + */ > +static ssize_t bonding_show_bonds(struct class *cls, char *buffer) > +{ > + int res = 0; > + struct bonding *bond; > + > + down_read(&(bonding_rwsem)); Why extra parens? > + list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list) { > + if (strnicmp(bond->dev->name, name, IFNAMSIZ) == 0) { > + /* Temporarily set a meaningless flag. When > + * we get done with the loop, we'll check all of these. > + * If the bond doesn't have this flag set, then we need > + * to remove the bond. If the flag has it set, then > + * we can just clear the flag. > + */ > + bond->flags |= IFF_DYNAMIC; > + found = 1; > + break; /* Found it, so go to next name */ > + } > + } Why is list_for_each_entry_safe used? No elements are being deleted in the loop... > + > + /* first, create a link from the slave back to the master */ > + ret = sysfs_create_link(&(slave->class_dev.kobj), &(master->class_dev.kobj), > + "master"); Extra parens again. > +static ssize_t bonding_show_arp_interval(struct class_device *cd, char *buf) > +{ > + int count; > + struct bonding *bond = to_bond(cd); > + > + down_read(&(bonding_rwsem)); > + count = sprintf(buf, "%d\n", bond->params.arp_interval) + 1; > + up_read(&(bonding_rwsem)); > + return count; > +} What does this lock really protect here? As far as I can see params will not go away... > + > + /* get the netdev class pointer */ > + firstbond = container_of(bond_dev_list.next, struct bonding, bond_list); > + if (!firstbond) > + { Open brace should go on the same line as if. Besides, here it is not needed at all... Thanks! 
-- Dmitry From greg@kroah.com Sat Jul 2 01:16:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 02 Jul 2005 01:16:02 -0700 (PDT) Received: from perch.kroah.org (mail.kroah.org [69.55.234.183]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j628FxH9025464 for ; Sat, 2 Jul 2005 01:16:00 -0700 Received: from [192.168.0.10] (c-24-22-115-24.hsd1.or.comcast.net [24.22.115.24]) (authenticated) by perch.kroah.org (8.11.6/8.11.6) with ESMTP id j628EIq03553; Sat, 2 Jul 2005 01:14:19 -0700 Received: from greg by echidna.kroah.org with local (masqmail 0.2.19) id 1Dod8E-5Q6-00; Sat, 02 Jul 2005 01:13:46 -0700 Date: Sat, 2 Jul 2005 01:13:46 -0700 From: Greg KH To: Radheka Godse Cc: netdev@oss.sgi.com Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: <20050702081346.GA20789@kroah.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.8i X-archive-position: 2611 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greg@kroah.com Precedence: bulk X-list: netdev On Fri, Jul 01, 2005 at 01:48:44PM -0700, Radheka Godse wrote: > This large patch adds the sysfs interface to channel bonding. It will > allow users to add and remove bonds, add and remove slaves, and change > all bonding parameters without using ifenslave. > The ifenslave interface still works. Have a short example of what the sysfs tree looks like, and what the new files contain and expect to be written to? > diff -urN -X dontdiff linux-2.6.12post/drivers/net/bonding/bonding.h > linux-2.6.12post-sysfs/drivers/net/bonding/bonding.h > --- linux-2.6.12post/drivers/net/bonding/bonding.h 2005-06-28 > 18:18:03.000000000 -0700 patch looks linewrapped :( > +/* Bonding sysfs lock. Why can't we just use the subsytem lock? The subsystem lock is no more. 
Well, it's still around, but almost no one uses it, due to the klist changes. You shouldn't need a lock either now. > + * Because kobject_register tries to acquire the subsystem lock. If > + * we already hold the lock (which we would if the user was creating > + * a new bond through the sysfs interface), we deadlock. > + */ > + > +struct rw_semaphore bonding_rwsem; > + > + > + > + > +/*------------------------------ Functions > --------------------------------*/ > + > +/* > + * "show" function for the bond_masters attribute. > + * The class parameter is ignored. > + */ > +static ssize_t bonding_show_bonds(struct class *cls, char *buffer) > +{ > + int res = 0; > + struct bonding *bond; > + > + down_read(&(bonding_rwsem)); > + > + list_for_each_entry(bond, &bond_dev_list, bond_list) { > + res += sprintf(buffer + res, "%s ", > + bond->dev->name); > + if (res > (PAGE_SIZE - IFNAMSIZ)) { > + dprintk("eek! too many bonds!\n"); > + break; > + } > + } > + res += sprintf(buffer + res, "\n"); > + res++; > + up_read(&(bonding_rwsem)); > + return res; This violates the 1-value-per-sysfs file rule. Please fix this up. > +static ssize_t bonding_show_slaves(struct class_device *cd, char *buf) > +{ > + struct slave *slave; > + int i, res = 0; > + struct bonding *bond = to_bond(cd); > + > + down_read(&(bonding_rwsem)); > + > + read_lock_bh(&bond->lock); > + bond_for_each_slave(bond, slave, i) { > + res += sprintf(buf + res, "%s ", slave->dev->name); > + if (res > (PAGE_SIZE - IFNAMSIZ)) { > + dprintk("eek! too many slaves!\n"); > + break; > + } > + } > + read_unlock_bh(&bond->lock); > + res += sprintf(buf + res, "\n"); > + res++; > + up_read(&(bonding_rwsem)); > + return res; > +} Same sysfs violation. 
I think other files also have problems :( thanks, greg k-h From hno@marasystems.com Sat Jul 2 14:24:07 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 02 Jul 2005 14:24:10 -0700 (PDT) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j62LO6H9005944 for ; Sat, 2 Jul 2005 14:24:07 -0700 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id j62LLbV27158; Sat, 2 Jul 2005 23:21:37 +0200 Date: Sat, 2 Jul 2005 23:21:37 +0200 (CEST) From: Henrik Nordstrom To: Zdenek Radouch cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: controlling ARP Proxy scope? In-Reply-To: <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> Message-ID: References: <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 2612 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev On Fri, 1 Jul 2005, Zdenek Radouch wrote: > So, left with only a binary flag in /proc, and network definition on the > interface, > I assumed (perhaps naively) that the arp would proxy only for the addresses > within the subnet defined for the interface (on which the proxy arp is > turned on). > However, that does not seem to be the case. You may be able to tune this with either arp_filter or arp_ignore. > I have an interface with address 10.1.2.219 and mask 255.255.255.248 with > proxy arp turned on on this interface, and the machine is responding > (I see that with tcpdump) to arp requests for address 10.1.2.1, i.e., > an address outside of the proxy interface's subnet. Correct. > Can anyone explain the behavior? proxy_arp simply ARPs if there is a route for the requested destination going out on another interface than where the ARP was seen. 
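The rule Henrik describes can be modelled with a toy routing table. The sketch below is a deliberately simplified userspace illustration, not the kernel's ARP code: struct toy_route, toy_lookup() and would_proxy() are hypothetical names, and the real decision involves further checks (for example the arp_filter and arp_ignore settings mentioned above) that are left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a host-byte-order IPv4 address from four octets. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical, minimal route entry: prefix/mask and outgoing interface. */
struct toy_route {
    uint32_t prefix;  /* host byte order */
    uint32_t mask;    /* contiguous netmask, host byte order */
    int ifindex;      /* interface the route points out of */
};

/* Longest-prefix match over a small table; returns NULL if nothing matches. */
static const struct toy_route *toy_lookup(const struct toy_route *tbl,
                                          size_t n, uint32_t dst)
{
    const struct toy_route *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((dst & tbl[i].mask) != tbl[i].prefix)
            continue;
        /* For contiguous masks, a longer prefix is a numerically larger mask. */
        if (best == NULL || tbl[i].mask > best->mask)
            best = &tbl[i];
    }
    return best;
}

/*
 * Simplified proxy ARP rule: answer a request for 'target' seen on
 * 'in_ifindex' if a route to the target exists and leaves through a
 * different interface.
 */
static int would_proxy(const struct toy_route *tbl, size_t n,
                       uint32_t target, int in_ifindex)
{
    const struct toy_route *rt;

    assert(tbl != NULL);
    rt = toy_lookup(tbl, n, target);
    return rt != NULL && rt->ifindex != in_ifindex;
}
```

Applied to the setup in the question, and assuming (this is not stated in the thread) that the box also has a default route out of a second interface: a table holding 10.1.2.216/29 on interface 2 and a default route on interface 1 makes would_proxy() return 1 for 10.1.2.1 arriving on interface 2, and 0 for 10.1.2.220 arriving on interface 2. The subnet configured on the proxying interface does not bound the scope; the routing table does.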
Regards Henrik From bunk@stusta.de Sat Jul 2 16:52:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 02 Jul 2005 16:53:05 -0700 (PDT) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j62NqrH9015684 for ; Sat, 2 Jul 2005 16:52:58 -0700 Received: (qmail 30658 invoked from network); 2 Jul 2005 23:51:10 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 2 Jul 2005 23:51:10 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id A17DFAFA5A; Sun, 3 Jul 2005 01:51:09 +0200 (CEST) Date: Sun, 3 Jul 2005 01:51:09 +0200 From: Adrian Bunk To: jgarzik@pobox.com Cc: jkmaline@cc.hut.fi, hostap@shmoo.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [-mm patch] net/ieee80211/: remove pci.h #include's Message-ID: <20050702235109.GJ5346@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 2613 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev I was wondering why editing pci.h triggered the rebuild of three files under net/, and as far as I can see, there's no reason for these three files to #include pci.h . 
Signed-off-by: Adrian Bunk --- This patch was already sent on: - 30 May 2005 - 1 May 2005 net/ieee80211/ieee80211_module.c | 1 - net/ieee80211/ieee80211_rx.c | 1 - net/ieee80211/ieee80211_tx.c | 1 - 3 files changed, 3 deletions(-) --- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_module.c.old 2005-04-30 23:23:14.000000000 +0200 +++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_module.c 2005-04-30 23:23:18.000000000 +0200 @@ -40,7 +40,6 @@ #include #include #include -#include #include #include #include --- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_tx.c.old 2005-04-30 23:23:25.000000000 +0200 +++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_tx.c 2005-04-30 23:23:32.000000000 +0200 @@ -33,7 +33,6 @@ #include #include #include -#include #include #include #include --- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_rx.c.old 2005-04-30 23:23:42.000000000 +0200 +++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_rx.c 2005-04-30 23:23:46.000000000 +0200 @@ -23,7 +23,6 @@ #include #include #include -#include #include #include #include From yi.zhu@intel.com Sun Jul 3 22:54:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 03 Jul 2005 22:54:29 -0700 (PDT) Received: from fmsfmr001.fm.intel.com (fmr13.intel.com [192.55.52.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j645sOH9002416 for ; Sun, 3 Jul 2005 22:54:25 -0700 Received: from fmsfmr100.fm.intel.com (fmsfmr100.fm.intel.com [10.253.24.20]) by fmsfmr001.fm.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j645qnqU029685; Mon, 4 Jul 2005 05:52:49 GMT Received: from fmsmsxvs041.fm.intel.com (fmsmsxvs041.fm.intel.com [132.233.42.126]) by fmsfmr100.fm.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j645qi9d031973; Mon, 4 Jul 2005 05:52:49 GMT Received: from debian.sh.intel.com ([172.16.219.38]) by fmsmsxvs041.fm.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005070322524716745 ; Sun, 
03 Jul 2005 22:52:48 -0700 Subject: Re: ipw2100: more cleanups From: Zhu Yi To: Pavel Machek Cc: jketreno@linux.intel.com, netdev@oss.sgi.com, jbohac@suse.cz, jbenc@suse.cz In-Reply-To: <20050513214025.GA1863@elf.ucw.cz> References: <20050513214025.GA1863@elf.ucw.cz> Content-Type: text/plain Organization: Intel Corp. Date: Mon, 04 Jul 2005 13:49:05 +0800 Message-Id: <1120456146.27821.85.camel@debian.sh.intel.com> Mime-Version: 1.0 X-Mailer: Evolution 2.2.2 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2615 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yi.zhu@intel.com Precedence: bulk X-list: netdev Content-Length: 304 Lines: 13 On Fri, 2005-05-13 at 23:40 +0200, Pavel Machek wrote: > More cleanups (relative to ipw2100-1.1.0 version). Highlights include > removing of two printk wrappers. [snip] > -static DEVICE_ATTR(bssinfo, S_IRUGO, show_bssinfo, NULL); Any special reason to remove these driver sysfs entries? Thanks, -yi From pavel@ucw.cz Mon Jul 4 00:06:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 00:06:40 -0700 (PDT) Received: from amd.ucw.cz (gprs189-60.eurotel.cz [160.218.189.60]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6476WH9005409 for ; Mon, 4 Jul 2005 00:06:34 -0700 Received: by amd.ucw.cz (Postfix, from userid 8) id DD4EC8B8D2; Mon, 4 Jul 2005 09:05:06 +0200 (CEST) Date: Mon, 4 Jul 2005 09:05:06 +0200 From: Pavel Machek To: Zhu Yi Cc: jketreno@linux.intel.com, netdev@oss.sgi.com, jbohac@suse.cz, jbenc@suse.cz Subject: Re: ipw2100: more cleanups Message-ID: <20050704070506.GA15370@elf.ucw.cz> References: <20050513214025.GA1863@elf.ucw.cz> <1120456146.27821.85.camel@debian.sh.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1120456146.27821.85.camel@debian.sh.intel.com> X-Warning: Reading this can be dangerous to your mental health. 
User-Agent: Mutt/1.5.9i X-archive-position: 2616 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pavel@suse.cz Precedence: bulk X-list: netdev Content-Length: 589 Lines: 20 Hi! > > More cleanups (relative to ipw2100-1.1.0 version). Highlights include > > removing of two printk wrappers. > > [snip] > > > -static DEVICE_ATTR(bssinfo, S_IRUGO, show_bssinfo, NULL); > > Any special reason to remove these driver sysfs entries? Umm, yes, they should not be needed and we do not want to maintain them in /sys forever. /sys should be "one value per file", and definitely not for debugging stuff like this. (See debugfs) Anyway I think I did drop that patch from newer cleanup attempts. Pavel -- teflon -- maybe it is a trademark, but it should not be. From dada1@cosmosbay.com Mon Jul 4 14:23:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 14:23:55 -0700 (PDT) Received: from smtp.cegetel.net (mf00.sitadelle.com [212.94.174.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64LNnH9008430 for ; Mon, 4 Jul 2005 14:23:50 -0700 Received: from [192.168.0.5] (unknown [84.5.72.184]) by smtp.cegetel.net (Postfix) with ESMTP id EB2911A4205; Mon, 4 Jul 2005 23:22:12 +0200 (CEST) Message-ID: <42C9A884.1030906@cosmosbay.com> Date: Mon, 04 Jul 2005 23:22:12 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Michael Chan Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. 
References: <20050511.141530.57445142.davem@davemloft.net> <42B982D4.9040704@cosmosbay.com> <1119467012.5325.15.camel@rh4> In-Reply-To: <1119467012.5325.15.camel@rh4> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2617 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 4321 Lines: 122 Michael Chan wrote: > On Wed, 2005-06-22 at 17:25 +0200, Eric Dumazet wrote: > > >>Is there anything I can try to tune the coalescing ? >>Being able to handle 100 packets each interrupt instead of one or two would certainly help. >>I dont mind about latency. But of course I would like not to drop packets :) >>But maybe the BCM5702 is not able to delay an interrupt ? >> > > > On the 5702 that supports CLRTCKS mode, you need to play around with the > following parameters in tg3.h. To reduce interrupts, you generally have > to increase the values. > > #define LOW_RXCOL_TICKS_CLRTCKS 0x00000014 > #define LOW_TXCOL_TICKS_CLRTCKS 0x00000048 > #define LOW_RXMAX_FRAMES 0x00000005 > #define LOW_TXMAX_FRAMES 0x00000035 > #define DEFAULT_RXCOAL_TICK_INT_CLRTCKS 0x00000014 > #define DEFAULT_TXCOAL_TICK_INT_CLRTCKS 0x00000014 > #define DEFAULT_RXCOAL_MAXF_INT 0x00000005 > #define DEFAULT_TXCOAL_MAXF_INT 0x00000005 > > > Thanks Michael I tried various settings. 
# ethtool -c eth0 Coalesce parameters for eth0: Adaptive RX: off TX: off stats-block-usecs: 1000000 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 500 rx-frames: 20 rx-usecs-irq: 500 rx-frames-irq: 20 tx-usecs: 600 tx-frames: 53 tx-usecs-irq: 600 tx-frames-irq: 20 rx-usecs-low: 0 rx-frame-low: 0 tx-usecs-low: 0 tx-frame-low: 0 rx-usecs-high: 0 rx-frame-high: 0 tx-usecs-high: 0 tx-frame-high: 0 But it seems something is wrong because when network load becomes high I get : Jul 4 23:01:19 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:19 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:24 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:24 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:29 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:29 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:34 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:34 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:39 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:39 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:44 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:44 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:49 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:49 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:54 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:54 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:01:59 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:01:59 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:04 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:04 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:09 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:09 dada1 
kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:14 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:14 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:19 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:19 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:24 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:24 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:29 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:29 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:34 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:34 dada1 kernel: tg3: eth0: transmit timed out, resetting Jul 4 23:02:39 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Jul 4 23:02:39 dada1 kernel: tg3: eth0: transmit timed out, resetting Then the machine crashes at this point. tg3.c : #define DRV_MODULE_VERSION "3.31" #define DRV_MODULE_RELDATE "June 8, 2005" # ethtool -g eth0 Ring parameters for eth0: Pre-set maximums: RX: 511 RX Mini: 0 RX Jumbo: 255 TX: 0 Current hardware settings: RX: 400 RX Mini: 0 RX Jumbo: 40 TX: 511 Thank you Eric Dumazet From davem@davemloft.net Mon Jul 4 14:29:02 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 14:29:08 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64LT2H9008976 for ; Mon, 4 Jul 2005 14:29:02 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DpYS4-0000vU-AW; Mon, 04 Jul 2005 14:26:04 -0700 Date: Mon, 04 Jul 2005 14:26:04 -0700 (PDT) Message-Id: <20050704.142604.18592223.davem@davemloft.net> To: dada1@cosmosbay.com Cc: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. From: "David S. 
Miller" In-Reply-To: <42C9A884.1030906@cosmosbay.com> References: <42B982D4.9040704@cosmosbay.com> <1119467012.5325.15.camel@rh4> <42C9A884.1030906@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2618 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 599 Lines: 13 From: Eric Dumazet Date: Mon, 04 Jul 2005 23:22:12 +0200 > But it seems something is wrong because when network load becomes high I get : > > Jul 4 23:01:19 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out > Jul 4 23:01:19 dada1 kernel: tg3: eth0: transmit timed out, resetting > Jul 4 23:01:24 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out > Jul 4 23:01:24 dada1 kernel: tg3: eth0: transmit timed out, resetting > Jul 4 23:01:29 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out Does this happen without changing the coalescing settings at all? From dada1@cosmosbay.com Mon Jul 4 14:41:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 14:41:16 -0700 (PDT) Received: from smtp.cegetel.net (mf00.sitadelle.com [212.94.174.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64LfAH9010131 for ; Mon, 4 Jul 2005 14:41:11 -0700 Received: from [192.168.0.5] (unknown [84.5.72.184]) by smtp.cegetel.net (Postfix) with ESMTP id EA0061A4211; Mon, 4 Jul 2005 23:39:35 +0200 (CEST) Message-ID: <42C9AC97.1020902@cosmosbay.com> Date: Mon, 04 Jul 2005 23:39:35 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" Cc: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. 
References: <42B982D4.9040704@cosmosbay.com> <1119467012.5325.15.camel@rh4> <42C9A884.1030906@cosmosbay.com> <20050704.142604.18592223.davem@davemloft.net> In-Reply-To: <20050704.142604.18592223.davem@davemloft.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2619 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 738 Lines: 20 David S. Miller wrote: > From: Eric Dumazet > Date: Mon, 04 Jul 2005 23:22:12 +0200 > > >>But it seems something is wrong because when network load becomes high I get : >> >>Jul 4 23:01:19 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out >>Jul 4 23:01:19 dada1 kernel: tg3: eth0: transmit timed out, resetting >>Jul 4 23:01:24 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out >>Jul 4 23:01:24 dada1 kernel: tg3: eth0: transmit timed out, resetting >>Jul 4 23:01:29 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out > > > Does this happen without changing the coalescing settings > at all? > > Not easy to answer because I have to rebuild a kernel and reboot. Will do that shortly. From davem@davemloft.net Mon Jul 4 14:52:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 14:52:50 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64LqlH9011503 for ; Mon, 4 Jul 2005 14:52:48 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DpYp5-0000xs-6f; Mon, 04 Jul 2005 14:49:51 -0700 Date: Mon, 04 Jul 2005 14:49:51 -0700 (PDT) Message-Id: <20050704.144950.112302038.davem@davemloft.net> To: dada1@cosmosbay.com Cc: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. From: "David S. 
Miller" In-Reply-To: <42C9AC97.1020902@cosmosbay.com> References: <42C9A884.1030906@cosmosbay.com> <20050704.142604.18592223.davem@davemloft.net> <42C9AC97.1020902@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2620 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 181 Lines: 7 From: Eric Dumazet Date: Mon, 04 Jul 2005 23:39:35 +0200 > Not easy to answer because I have to rebuild a kernel and > reboot. Will do that shortly. Thanks. From dada1@cosmosbay.com Mon Jul 4 15:33:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 15:33:37 -0700 (PDT) Received: from smtp.cegetel.net (mf00.sitadelle.com [212.94.174.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64MXVH9014548 for ; Mon, 4 Jul 2005 15:33:32 -0700 Received: from [192.168.0.5] (unknown [84.5.72.184]) by smtp.cegetel.net (Postfix) with ESMTP id 08FD11A419A; Tue, 5 Jul 2005 00:31:56 +0200 (CEST) Message-ID: <42C9B8DB.7020403@cosmosbay.com> Date: Tue, 05 Jul 2005 00:31:55 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Eric Dumazet Cc: "David S. Miller" , mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. 
References: <42B982D4.9040704@cosmosbay.com> <1119467012.5325.15.camel@rh4> <42C9A884.1030906@cosmosbay.com> <20050704.142604.18592223.davem@davemloft.net> <42C9AC97.1020902@cosmosbay.com> In-Reply-To: <42C9AC97.1020902@cosmosbay.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2621 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 1006 Lines: 34 Eric Dumazet wrote: > David S. Miller wrote: > >> From: Eric Dumazet >> Date: Mon, 04 Jul 2005 23:22:12 +0200 >> >> >>> But it seems something is wrong because when network load becomes >>> high I get : >>> >>> Jul 4 23:01:19 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out >>> Jul 4 23:01:19 dada1 kernel: tg3: eth0: transmit timed out, resetting >>> Jul 4 23:01:24 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out >>> Jul 4 23:01:24 dada1 kernel: tg3: eth0: transmit timed out, resetting >>> Jul 4 23:01:29 dada1 kernel: NETDEV WATCHDOG: eth0: transmit timed out >> >> >> >> Does this happen without changing the coalescing settings >> at all? >> >> > Not easy to answer because I have to rebuild a kernel and reboot. Will > do that shortly. > > > Oops, I forgot to tell you I applied the patch : [TG3]: Eliminate all hw IRQ handler spinlocks. (Date: Fri, 03 Jun 2005 12:25:58 -0700 (PDT)) Maybe I should revert to stock 2.6.12 tg3 driver ? 
Eric From dada1@cosmosbay.com Mon Jul 4 15:49:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 15:49:37 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64MnYH9015751 for ; Mon, 4 Jul 2005 15:49:35 -0700 Received: from [192.168.0.5] (unknown [84.5.72.184]) by smtp.cegetel.net (Postfix) with ESMTP id 9772E31841A; Tue, 5 Jul 2005 00:47:59 +0200 (CEST) Message-ID: <42C9BC9C.9070700@cosmosbay.com> Date: Tue, 05 Jul 2005 00:47:56 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Eric Dumazet Cc: "David S. Miller" , mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. References: <42B982D4.9040704@cosmosbay.com> <1119467012.5325.15.camel@rh4> <42C9A884.1030906@cosmosbay.com> <20050704.142604.18592223.davem@davemloft.net> <42C9AC97.1020902@cosmosbay.com> <42C9B8DB.7020403@cosmosbay.com> In-Reply-To: <42C9B8DB.7020403@cosmosbay.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2623 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 458 Lines: 19 Eric Dumazet wrote: > > > Oops, I forgot to tell you I applied the patch : [TG3]: Eliminate all > hw IRQ handler spinlocks. > (Date: Fri, 03 Jun 2005 12:25:58 -0700 (PDT)) > > Maybe I should revert to stock 2.6.12 tg3 driver ? 
> > Eric > > Oh well, I checked 2.6.13-rc1 and it contains this line in tg3_netif_stop() : + tp->dev->trans_start = jiffies; /* prevent tx timeout */ So I am trying driver 3.32 June 24, 2005, included in 2.6.13-rc1 From davem@davemloft.net Mon Jul 4 15:48:56 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 15:49:00 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64MmuH9015666 for ; Mon, 4 Jul 2005 15:48:56 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DpZib-0000kO-1N; Mon, 04 Jul 2005 15:47:13 -0700 Date: Mon, 04 Jul 2005 15:47:12 -0700 (PDT) Message-Id: <20050704.154712.63128211.davem@davemloft.net> To: dada1@cosmosbay.com Cc: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. From: "David S. Miller" In-Reply-To: <42C9B8DB.7020403@cosmosbay.com> References: <20050704.142604.18592223.davem@davemloft.net> <42C9AC97.1020902@cosmosbay.com> <42C9B8DB.7020403@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2622 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 740 Lines: 17 From: Eric Dumazet Date: Tue, 05 Jul 2005 00:31:55 +0200 > Oops, I forgot to tell you I applied the patch : [TG3]: Eliminate all hw IRQ handler spinlocks. > (Date: Fri, 03 Jun 2005 12:25:58 -0700 (PDT)) > > Maybe I should revert to stock 2.6.12 tg3 driver ? Please don't ever do stuff like that :-( That makes the driver version, and any other information you report completely meaningless and useless. You've just wasted a lot of our time. 
I have no idea to even know _WHICH_ IRQ spinlock patch you applied. The one that ended up in Linus's tree has many bug fixes and refinements. The ones which were posted on netdev had many bugs and deadlocks which needed to be cured before pushing the change upstream. From dada1@cosmosbay.com Mon Jul 4 15:57:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 15:57:20 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64MvIH9016997 for ; Mon, 4 Jul 2005 15:57:18 -0700 Received: from [192.168.0.5] (unknown [84.5.72.184]) by smtp.cegetel.net (Postfix) with ESMTP id A33C231840E; Tue, 5 Jul 2005 00:55:41 +0200 (CEST) Message-ID: <42C9BE69.2070008@cosmosbay.com> Date: Tue, 05 Jul 2005 00:55:37 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" Cc: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. References: <20050704.142604.18592223.davem@davemloft.net> <42C9AC97.1020902@cosmosbay.com> <42C9B8DB.7020403@cosmosbay.com> <20050704.154712.63128211.davem@davemloft.net> In-Reply-To: <20050704.154712.63128211.davem@davemloft.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2624 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 691 Lines: 25 David S. Miller wrote: > Please don't ever do stuff like that :-( That makes the driver > version, and any other information you report completely meaningless > and useless. You've just wasted a lot of our time. > Yes. But if you dont want us to test your patches, dont send them to the list. For your information, 2.6.13-rc1 locks too. 
Very easy way to lock it : ping -f some_destination. > I have no idea to even know _WHICH_ IRQ spinlock patch you applied. > > The one that ended up in Linus's tree has many bug fixes and > refinements. The ones which were posted on netdev had many bugs and > deadlocks which needed to be cured before pushing the change upstream. > > From dada1@cosmosbay.com Mon Jul 4 15:59:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 15:59:41 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64MxcH9017537 for ; Mon, 4 Jul 2005 15:59:38 -0700 Received: from [192.168.0.5] (unknown [84.5.72.184]) by smtp.cegetel.net (Postfix) with ESMTP id 65ED13183E9; Tue, 5 Jul 2005 00:58:03 +0200 (CEST) Message-ID: <42C9BEF6.4080402@cosmosbay.com> Date: Tue, 05 Jul 2005 00:57:58 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Eric Dumazet Cc: "David S. Miller" , mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. References: <20050704.142604.18592223.davem@davemloft.net> <42C9AC97.1020902@cosmosbay.com> <42C9B8DB.7020403@cosmosbay.com> <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> In-Reply-To: <42C9BE69.2070008@cosmosbay.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2625 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 188 Lines: 15 Eric Dumazet wrote: > > For your information, 2.6.13-rc1 locks too. > > Very easy way to lock it : > > ping -f some_destination. Arg. False alarm. Sorry. Time for me to sleep. 
From davem@davemloft.net Mon Jul 4 16:02:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 16:02:15 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64N2BH9018411 for ; Mon, 4 Jul 2005 16:02:11 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DpZvW-0000mX-9d; Mon, 04 Jul 2005 16:00:34 -0700 Date: Mon, 04 Jul 2005 16:00:34 -0700 (PDT) Message-Id: <20050704.160034.55511095.davem@davemloft.net> To: dada1@cosmosbay.com Cc: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. From: "David S. Miller" In-Reply-To: <42C9BE69.2070008@cosmosbay.com> References: <42C9B8DB.7020403@cosmosbay.com> <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j64N2BH9018411 X-archive-position: 2626 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 788 Lines: 27 From: Eric Dumazet Date: Tue, 05 Jul 2005 00:55:37 +0200 > David S. Miller wrote: > > > Please don't ever do stuff like that :-( That makes the driver > > version, and any other information you report completely meaningless > > and useless. You've just wasted a lot of our time. > > > > Yes. But if you dont want us to test your patches, dont send them to > the list. The case here is that you didn't even mention that you had the patch applied. If you don't say what changes you've applied to the stock driver we can't properly debug anything. > For your information, 2.6.13-rc1 locks too. 
> > Very easy way to lock it : > > ping -f some_destination. SMP system? What platform and 570x chip revision? Does the stock driver in 2.6.12 hang as well? From davem@davemloft.net Mon Jul 4 16:03:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 04 Jul 2005 16:03:18 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j64N3FH9018782 for ; Mon, 4 Jul 2005 16:03:15 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DpZwa-0000ms-Eu; Mon, 04 Jul 2005 16:01:40 -0700 Date: Mon, 04 Jul 2005 16:01:40 -0700 (PDT) Message-Id: <20050704.160140.21591849.davem@davemloft.net> To: dada1@cosmosbay.com Cc: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. From: "David S. Miller" In-Reply-To: <42C9BEF6.4080402@cosmosbay.com> References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j64N3FH9018782 X-archive-position: 2627 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 247 Lines: 16 From: Eric Dumazet Date: Tue, 05 Jul 2005 00:57:58 +0200 > Eric Dumazet wrote: > > > Very easy way to lock it : > > > > ping -f some_destination. > > > Arg. False alarm. Sorry. > > Time for me to sleep. Oh well... 
From dada1@cosmosbay.com Tue Jul 5 00:40:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 00:40:35 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j657eTH9029339 for ; Tue, 5 Jul 2005 00:40:30 -0700 Received: from [192.168.0.5] (84-4-148-22.dti.cegetel.net [84.4.148.22]) by smtp.cegetel.net (Postfix) with ESMTP id 2A0EC318580; Tue, 5 Jul 2005 09:38:53 +0200 (CEST) Message-ID: <42CA390C.9000801@cosmosbay.com> Date: Tue, 05 Jul 2005 09:38:52 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> In-Reply-To: <20050704.160140.21591849.davem@davemloft.net> Content-Type: multipart/mixed; boundary="------------090908060004010709010307" X-archive-position: 2628 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 1730 Lines: 54 This is a multi-part message in MIME format. --------------090908060004010709010307 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit [NET] : unroll a small loop in pfifo_fast_dequeue(). Compiler generates better code. (Using skb_queue_empty() to test the queue is faster than trying to __skb_dequeue()) oprofile says this function uses now 0.29% instead of 1.22 %, on a x86_64 target. 
Signed-off-by: Eric Dumazet --------------090908060004010709010307 Content-Type: text/plain; name="patch.sch_generic" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch.sch_generic" --- linux-2.6.12/net/sched/sch_generic.c 2005-06-17 21:48:29.000000000 +0200 +++ linux-2.6.12-ed/net/sched/sch_generic.c 2005-07-05 09:11:30.000000000 +0200 @@ -333,18 +333,23 @@ static struct sk_buff * pfifo_fast_dequeue(struct Qdisc* qdisc) { - int prio; struct sk_buff_head *list = qdisc_priv(qdisc); struct sk_buff *skb; - for (prio = 0; prio < 3; prio++, list++) { - skb = __skb_dequeue(list); - if (skb) { - qdisc->q.qlen--; - return skb; - } + for (;;) { + if (!skb_queue_empty(list)) + break; + list++; + if (!skb_queue_empty(list)) + break; + list++; + if (!skb_queue_empty(list)) + break; + return NULL; } - return NULL; + skb = __skb_dequeue(list); + qdisc->q.qlen--; + return skb; } static int --------------090908060004010709010307-- From tgraf@suug.ch Tue Jul 5 04:52:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 04:52:40 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65BqWH9017561 for ; Tue, 5 Jul 2005 04:52:35 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 003BA1C0EB; Tue, 5 Jul 2005 13:51:08 +0200 (CEST) Date: Tue, 5 Jul 2005 13:51:08 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050705115108.GE16076@postel.suug.ch> References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42CA390C.9000801@cosmosbay.com> X-archive-position: 2629 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 574 Lines: 12 * Eric Dumazet <42CA390C.9000801@cosmosbay.com> 2005-07-05 09:38 > [NET] : unroll a small loop in pfifo_fast_dequeue(). Compiler generates > better code. > (Using skb_queue_empty() to test the queue is faster than trying to > __skb_dequeue()) > oprofile says this function uses now 0.29% instead of 1.22 %, on a > x86_64 target. I think this patch is pretty much pointless. __skb_dequeue() and !skb_queue_empty() should produce almost the same code and as soon as you disable profiling and debugging you'll see that the compiler unrolls the loop itself if possible. From tgraf@suug.ch Tue Jul 5 05:04:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 05:04:47 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65C4eH9018676 for ; Tue, 5 Jul 2005 05:04:41 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 0431F1C0F3; Tue, 5 Jul 2005 14:03:27 +0200 (CEST) Date: Tue, 5 Jul 2005 14:03:27 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050705120327.GF16076@postel.suug.ch> References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> <20050705115108.GE16076@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050705115108.GE16076@postel.suug.ch> X-archive-position: 2630 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 706 Lines: 15 * Thomas Graf <20050705115108.GE16076@postel.suug.ch> 2005-07-05 13:51 > * Eric Dumazet <42CA390C.9000801@cosmosbay.com> 2005-07-05 09:38 > > [NET] : unroll a small loop in pfifo_fast_dequeue(). Compiler generates > > better code. > > (Using skb_queue_empty() to test the queue is faster than trying to > > __skb_dequeue()) > > oprofile says this function uses now 0.29% instead of 1.22 %, on a > > x86_64 target. > > I think this patch is pretty much pointless. __skb_dequeue() and > !skb_queue_empty() should produce almost the same code and as soon > as you disable profiling and debugging you'll see that the compiler > unrolls the loop itself if possible. ... given one enables it of course. 
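The two control flows being debated in this thread (the original three-iteration loop and the hand-unrolled version from the patch) can be compared outside the kernel with a small userspace sketch. Everything here is an assumption made for illustration: `struct band` is a simplified stand-in for the kernel's `struct sk_buff_head`, and `dequeue_loop`, `dequeue_unrolled` and `compare_variants` are hypothetical names, not kernel functions. The sketch only checks that both variants drain the bands in the same priority order; it says nothing about what code a given compiler emits or about the oprofile numbers quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define NBANDS   3   /* pfifo_fast scans three priority bands */
#define BAND_CAP 8   /* arbitrary capacity chosen for this sketch */

/* Hypothetical stand-in for one band's queue: a small ring buffer of ints. */
struct band {
    int    items[BAND_CAP];
    size_t head;  /* index of the next item to dequeue */
    size_t len;   /* number of queued items */
};

static int band_empty(const struct band *b)
{
    return b->len == 0;
}

/* Returns 0 on success, -1 if the band is full. */
static int band_enqueue(struct band *b, int v)
{
    if (b->len == BAND_CAP)
        return -1;
    b->items[(b->head + b->len) % BAND_CAP] = v;
    b->len++;
    return 0;
}

/* Returns 1 and stores the item in *out, or 0 if the band is empty. */
static int band_dequeue(struct band *b, int *out)
{
    if (b->len == 0)
        return 0;
    *out = b->items[b->head];
    b->head = (b->head + 1) % BAND_CAP;
    b->len--;
    return 1;
}

/* Loop form: try to dequeue from each band in priority order. */
static int dequeue_loop(struct band *bands, size_t *qlen, int *out)
{
    int prio;

    for (prio = 0; prio < NBANDS; prio++) {
        if (band_dequeue(&bands[prio], out)) {
            (*qlen)--;
            return 1;
        }
    }
    return 0;
}

/* Unrolled form: test emptiness first, then dequeue once at the end. */
static int dequeue_unrolled(struct band *bands, size_t *qlen, int *out)
{
    struct band *b = bands;

    if (band_empty(b)) {
        b++;
        if (band_empty(b)) {
            b++;
            if (band_empty(b))
                return 0;
        }
    }
    band_dequeue(b, out);
    (*qlen)--;
    return 1;
}

/* Feeds identical input to both variants; returns the number of mismatches. */
static int compare_variants(void)
{
    /* {band, value} pairs, deliberately enqueued out of priority order */
    static const int input[][2] = {
        {2, 200}, {0, 1}, {1, 100}, {2, 201}, {0, 2}, {1, 101}
    };
    struct band a[NBANDS] = {0}, b[NBANDS] = {0};
    size_t qlen_a = 0, qlen_b = 0, i;
    int mismatches = 0;

    for (i = 0; i < sizeof(input) / sizeof(input[0]); i++) {
        int rc_a = band_enqueue(&a[input[i][0]], input[i][1]);
        int rc_b = band_enqueue(&b[input[i][0]], input[i][1]);

        assert(rc_a == 0 && rc_b == 0);
        (void)rc_a;
        (void)rc_b;
        qlen_a++;
        qlen_b++;
    }

    for (;;) {
        int va = 0, vb = 0;
        int got_a = dequeue_loop(a, &qlen_a, &va);
        int got_b = dequeue_unrolled(b, &qlen_b, &vb);

        if (got_a != got_b || (got_a && va != vb) || qlen_a != qlen_b)
            mismatches++;
        if (!got_a && !got_b)
            break;
    }
    return mismatches;
}
```

Calling `compare_variants()` from a small `main` is enough to exercise both paths. Whether the unrolled form is actually faster in the kernel depends on the compiler version and flags, which is exactly the point under dispute here.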
From dada1@cosmosbay.com Tue Jul 5 06:06:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 06:06:08 -0700 (PDT) Received: from gw1.cosmosbay.com (gw1.cosmosbay.com [62.23.185.226]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65D63H9023158 for ; Tue, 5 Jul 2005 06:06:04 -0700 Received: from [172.16.0.131] (edumazet-port [172.16.0.131]) by gw1.cosmosbay.com (8.13.3/8.13.3) with ESMTP id j65D4MJi029934; Tue, 5 Jul 2005 15:04:22 +0200 Message-ID: <42CA8555.9050607@cosmosbay.com> Date: Tue, 05 Jul 2005 15:04:21 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Thomas Graf CC: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> <20050705115108.GE16076@postel.suug.ch> In-Reply-To: <20050705115108.GE16076@postel.suug.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [172.16.8.80]); Tue, 05 Jul 2005 15:04:23 +0200 (CEST) X-archive-position: 2631 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 4244 Lines: 86 Thomas Graf wrote: > * Eric Dumazet <42CA390C.9000801@cosmosbay.com> 2005-07-05 09:38 > >>[NET] : unroll a small loop in pfifo_fast_dequeue(). Compiler generates >>better code. >> (Using skb_queue_empty() to test the queue is faster than trying to >> __skb_dequeue()) >> oprofile says this function uses now 0.29% instead of 1.22 %, on a >> x86_64 target. > > > I think this patch is pretty much pointless. 
__skb_dequeue() and > !skb_queue_empty() should produce almost the same code and as soon > as you disable profiling and debugging you'll see that the compiler > unrolls the loop itself if possible. > > OK. At least my compiler (gcc-3.3.1) does NOT unroll the loop : Original 2.6.12 gives : ffffffff802a9790 : /* pfifo_fast_dequeue total: 2904054 1.9531 */ 258371 0.1738 :ffffffff802a9790: lea 0xc0(%rdi),%rcx 273669 0.1841 :ffffffff802a9797: xor %esi,%esi 12533 0.0084 :ffffffff802a9799: mov (%rcx),%rdx 292315 0.1966 :ffffffff802a979c: cmp %rcx,%rdx 11717 0.0079 :ffffffff802a979f: je ffffffff802a97d1 4474 0.0030 :ffffffff802a97a1: mov %rdx,%rax 6238 0.0042 :ffffffff802a97a4: mov (%rdx),%rdx 41 2.8e-05 :ffffffff802a97a7: decl 0x10(%rcx) 6089 0.0041 :ffffffff802a97aa: test %rax,%rax 126 8.5e-05 :ffffffff802a97ad: movq $0x0,0x10(%rax) 39 2.6e-05 :ffffffff802a97b5: mov %rcx,0x8(%rdx) 6974 0.0047 :ffffffff802a97b9: mov %rdx,(%rcx) 2841 0.0019 :ffffffff802a97bc: movq $0x0,0x8(%rax) 366 2.5e-04 :ffffffff802a97c4: movq $0x0,(%rax) 14757 0.0099 :ffffffff802a97cb: je ffffffff802a97d1 288 1.9e-04 :ffffffff802a97cd: decl 0x40(%rdi) 94 6.3e-05 :ffffffff802a97d0: retq 970400 0.6526 :ffffffff802a97d1: inc %esi 982402 0.6607 :ffffffff802a97d3: add $0x18,%rcx 4 2.7e-06 :ffffffff802a97d7: cmp $0x2,%esi 1 6.7e-07 :ffffffff802a97da: jle ffffffff802a9799 59754 0.0402 :ffffffff802a97dc: xor %eax,%eax 561 3.8e-04 :ffffffff802a97de: data16 :ffffffff802a97df: nop :ffffffff802a97e0: retq And new code (2.6.12-ed): ffffffff802b1020 : /* pfifo_fast_dequeue total: 153139 0.2934 */ 27388 0.0525 :ffffffff802b1020: lea 0xc0(%rdi),%rdx 42091 0.0806 :ffffffff802b1027: cmp %rdx,0xc0(%rdi) :ffffffff802b102e: jne ffffffff802b1052 474 9.1e-04 :ffffffff802b1030: lea 0xd8(%rdi),%rdx 5571 0.0107 :ffffffff802b1037: cmp %rdx,0xd8(%rdi) 2 3.8e-06 :ffffffff802b103e: jne ffffffff802b1052 1 1.9e-06 :ffffffff802b1040: lea 0xf0(%rdi),%rdx 20030 0.0384 :ffffffff802b1047: xor %eax,%eax 6 1.1e-05 :ffffffff802b1049: cmp 
%rdx,0xf0(%rdi) 6 1.1e-05 :ffffffff802b1050: je ffffffff802b1086 :ffffffff802b1052: mov (%rdx),%rcx 11796 0.0226 :ffffffff802b1055: xor %eax,%eax :ffffffff802b1057: cmp %rdx,%rcx 8 1.5e-05 :ffffffff802b105a: je ffffffff802b1083 3146 0.0060 :ffffffff802b105c: mov %rcx,%rax 12 2.3e-05 :ffffffff802b105f: mov (%rcx),%rcx 118 2.3e-04 :ffffffff802b1062: decl 0x10(%rdx) 4924 0.0094 :ffffffff802b1065: movq $0x0,0x10(%rax) 65 1.2e-04 :ffffffff802b106d: mov %rdx,0x8(%rcx) 725 0.0014 :ffffffff802b1071: mov %rcx,(%rdx) 11493 0.0220 :ffffffff802b1074: movq $0x0,0x8(%rax) 194 3.7e-04 :ffffffff802b107c: movq $0x0,(%rax) 2995 0.0057 :ffffffff802b1083: decl 0x40(%rdi) 19607 0.0376 :ffffffff802b1086: nop 2487 0.0048 :ffffffff802b1087: retq Please give us the code your compiler produces, and explain me how disabling oprofile can change the generated assembly. :) Debugging has no impact on this code either. Thank you Eric From tgraf@suug.ch Tue Jul 5 06:49:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 06:49:28 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65DnIH9026297 for ; Tue, 5 Jul 2005 06:49:21 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id C19731C0EB; Tue, 5 Jul 2005 15:48:05 +0200 (CEST) Date: Tue, 5 Jul 2005 15:48:05 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050705134805.GH16076@postel.suug.ch> References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> <20050705115108.GE16076@postel.suug.ch> <42CA8555.9050607@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <42CA8555.9050607@cosmosbay.com> X-archive-position: 2632 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 2708 Lines: 118 * Eric Dumazet <42CA8555.9050607@cosmosbay.com> 2005-07-05 15:04 > Thomas Graf a écrit : > >* Eric Dumazet <42CA390C.9000801@cosmosbay.com> 2005-07-05 09:38 > > > >>[NET] : unroll a small loop in pfifo_fast_dequeue(). Compiler generates > >>better code. > >> (Using skb_queue_empty() to test the queue is faster than trying to > >> __skb_dequeue()) > >> oprofile says this function uses now 0.29% instead of 1.22 %, on a > >> x86_64 target. > > > > > >I think this patch is pretty much pointless. __skb_dequeue() and > >!skb_queue_empty() should produce almost the same code and as soon > >as you disable profiling and debugging you'll see that the compiler > >unrolls the loop itself if possible. > > > > > > OK. At least my compiler (gcc-3.3.1) does NOT unroll the loop : Because you don't specify -funroll-loop [...] 
> Please give us the code your compiler produces,

Unrolled version:

pfifo_fast_dequeue:
	pushl %esi
	xorl %edx, %edx
	pushl %ebx
	movl 12(%esp), %esi
	movl 128(%esi), %eax
	leal 128(%esi), %ecx
	cmpl %ecx, %eax
	je .L132
	movl %eax, %edx
	movl (%eax), %eax
	decl 8(%ecx)
	movl $0, 8(%edx)
	movl %ecx, 4(%eax)
	movl %eax, 128(%esi)
	movl $0, 4(%edx)
	movl $0, (%edx)
.L132:
	testl %edx, %edx
	je .L131
	movl 96(%edx), %ebx
	movl 80(%esi), %eax
	decl 40(%esi)
	subl %ebx, %eax
	movl %eax, 80(%esi)
	movl %edx, %eax
.L117:
	popl %ebx
	popl %esi
	ret
.L131:
	movl 20(%ecx), %eax
	leal 20(%ecx), %edx
	xorl %ebx, %ebx
	cmpl %edx, %eax
	je .L137
	movl %eax, %ebx
	movl (%eax), %eax
	decl 8(%edx)
	movl $0, 8(%ebx)
	movl %edx, 4(%eax)
	movl %eax, 20(%ecx)
	movl $0, 4(%ebx)
	movl $0, (%ebx)
.L137:
	testl %ebx, %ebx
	je .L147
.L146:
	movl 96(%ebx), %ecx
	movl 80(%esi), %eax
	decl 40(%esi)
	subl %ecx, %eax
	movl %eax, 80(%esi)
	movl %ebx, %eax
	jmp .L117
.L147:
	movl 40(%ecx), %eax
	leal 40(%ecx), %edx
	xorl %ebx, %ebx
	cmpl %edx, %eax
	je .L142
	movl %eax, %ebx
	movl (%eax), %eax
	decl 8(%edx)
	movl $0, 8(%ebx)
	movl %edx, 4(%eax)
	movl %eax, 40(%ecx)
	movl $0, 4(%ebx)
	movl $0, (%ebx)
.L142:
	xorl %eax, %eax
	testl %ebx, %ebx
	jne .L146
	jmp .L117

> and explain me how
> disabling oprofile can change the generated assembly. :)
> Debugging has no impact on this code either.

I just noticed that this is a local modification of my own, so in
the vanilla tree it indeed doesn't have any impact on the code
generated.

Still, your patch does not make sense to me. The latest tree
also includes my pfifo_fast changes, which modified the code to
maintain a backlog and made it easy to add more fifos at compile
time. If you want the loop unrolled then let the compiler do it
via -funroll-loops. These kinds of optimization seem as unnecessary
to me as all the loopback optimizations.

From patjenk@wam.umd.edu Tue Jul 5 08:19:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 08:19:24 -0700 (PDT) Received: from po2.wam.umd.edu (po2.wam.umd.edu [128.8.10.164]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65FJIH9003084 for ; Tue, 5 Jul 2005 08:19:18 -0700 Received: from rac3.wam.umd.edu (IDENT:sendmail@rac3.wam.umd.edu [128.8.10.143]) by po2.wam.umd.edu (8.12.10/8.12.10) with ESMTP id j65FHeZD016122; Tue, 5 Jul 2005 11:17:40 -0400 (EDT) Received: from localhost (patjenk@localhost) by rac3.wam.umd.edu (8.12.10/8.12.10) with ESMTP id j65FHZl1017138; Tue, 5 Jul 2005 11:17:37 -0400 (EDT) X-Authentication-Warning: rac3.wam.umd.edu: patjenk owned process doing -bs Date: Tue, 5 Jul 2005 11:17:35 -0400 (EDT) From: Patrick Jenkins To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] multipath routing algorithm, better patch Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 2633 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: patjenk@wam.umd.edu Precedence: bulk X-list: netdev Content-Length: 1199 Lines: 32 >Thomas Graf wrote: >> * Patrick McHardy <42C4919A.5000009@trash.net> 2005-07-01 02:43 >> >>>If it isn't set correctly its an iproute problem. Did you actually >>>experience any problems? >> >> Well, my patch for iproute2 to enable multipath algorithm selection >> is currently being merged to Stephen together with the ematch bits. >> We had to work out a dependency on GNU flex first (the berkley >> version uses the same executable names) so the inclusion was >> delayed a bit. > >So its no problem but simply missing support. BTW, do you know if >Stephen's new CVS repository is exported somewhere? Yes, we did experience a problem. The routing doesnt work as advertised (i.e. it wont utilize the multiple routes). Patrick is correct in saying its missing support. 
From what you are saying it seems the problem will be corrected by a new feature in iproute2 that allows the user to select this ability. Is this correct? Also, does this mean my patch is not needed? It seems to me that it should be supported somehow in the kernel seeing as how everyone might not utilize iproute2. Once again, please cc me in the reply because I am not a member of the list. Thanks, Patrick Jenkins From dada1@cosmosbay.com Tue Jul 5 09:00:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 09:00:26 -0700 (PDT) Received: from gw1.cosmosbay.com (gw1.cosmosbay.com [62.23.185.226]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65G0MH9006625 for ; Tue, 5 Jul 2005 09:00:22 -0700 Received: from [172.16.0.131] (edumazet-port [172.16.0.131]) by gw1.cosmosbay.com (8.13.3/8.13.3) with ESMTP id j65FwfLs005806; Tue, 5 Jul 2005 17:58:43 +0200 Message-ID: <42CAAE2F.5070807@cosmosbay.com> Date: Tue, 05 Jul 2005 17:58:39 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Thomas Graf CC: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> <20050705115108.GE16076@postel.suug.ch> <42CA8555.9050607@cosmosbay.com> <20050705134805.GH16076@postel.suug.ch> In-Reply-To: <20050705134805.GH16076@postel.suug.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [172.16.8.80]); Tue, 05 Jul 2005 17:58:43 +0200 (CEST) X-archive-position: 2634 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 5028 Lines: 165 Thomas Graf a écrit : >>OK. At least my compiler (gcc-3.3.1) does NOT unroll the loop : > > > Because you don't specify -funroll-loop I'm using vanilla 2.6.12 : no -funroll-loop in it. Maybe in your tree, not on 99.9% of 2.6.12 trees. Are you suggesting everybody should use this compiler flag ? Something like : net/sched/Makefile: CFLAGS_sch_generic.o := -funroll-loops ? > > [...] 
> > >>Please give us the code your compiler produces, > > > Unrolled version: > > pfifo_fast_dequeue: > pushl %esi > xorl %edx, %edx > pushl %ebx > movl 12(%esp), %esi > movl 128(%esi), %eax > leal 128(%esi), %ecx > cmpl %ecx, %eax > je .L132 > movl %eax, %edx > movl (%eax), %eax > decl 8(%ecx) > movl $0, 8(%edx) > movl %ecx, 4(%eax) > movl %eax, 128(%esi) > movl $0, 4(%edx) > movl $0, (%edx) > .L132: > testl %edx, %edx > je .L131 > movl 96(%edx), %ebx > movl 80(%esi), %eax > decl 40(%esi) > subl %ebx, %eax > movl %eax, 80(%esi) > movl %edx, %eax > .L117: > popl %ebx > popl %esi > ret > .L131: > movl 20(%ecx), %eax > leal 20(%ecx), %edx > xorl %ebx, %ebx > cmpl %edx, %eax > je .L137 > movl %eax, %ebx > movl (%eax), %eax > decl 8(%edx) > movl $0, 8(%ebx) > movl %edx, 4(%eax) > movl %eax, 20(%ecx) > movl $0, 4(%ebx) > movl $0, (%ebx) > .L137: > testl %ebx, %ebx > je .L147 > .L146: > movl 96(%ebx), %ecx > movl 80(%esi), %eax > decl 40(%esi) > subl %ecx, %eax > movl %eax, 80(%esi) > movl %ebx, %eax > jmp .L117 > .L147: > movl 40(%ecx), %eax > leal 40(%ecx), %edx > xorl %ebx, %ebx > cmpl %edx, %eax > je .L142 > movl %eax, %ebx > movl (%eax), %eax > decl 8(%edx) > movl $0, 8(%ebx) > movl %edx, 4(%eax) > movl %eax, 40(%ecx) > movl $0, 4(%ebx) > movl $0, (%ebx) > .L142: > xorl %eax, %eax > testl %ebx, %ebx > jne .L146 > jmp .L117 > OK thanks, but you dont give the code for my version :) shorter and unrolled as you can see, and with nice predicted branches. 

00000fc0 <pfifo_fast_dequeue>:
     fc0:  56                     push   %esi
     fc1:  89 c1                  mov    %eax,%ecx
     fc3:  53                     push   %ebx
     fc4:  8d 98 a0 00 00 00      lea    0xa0(%eax),%ebx
     fca:  39 98 a0 00 00 00      cmp    %ebx,0xa0(%eax)
     fd0:  89 da                  mov    %ebx,%edx
     fd2:  75 22                  jne    ff6
     fd4:  8d 90 c4 00 00 00      lea    0xc4(%eax),%edx
     fda:  39 90 c4 00 00 00      cmp    %edx,0xc4(%eax)
     fe0:  89 d3                  mov    %edx,%ebx
     fe2:  75 12                  jne    ff6
     fe4:  8d 98 e8 00 00 00      lea    0xe8(%eax),%ebx
     fea:  31 f6                  xor    %esi,%esi
     fec:  39 98 e8 00 00 00      cmp    %ebx,0xe8(%eax)
     ff2:  89 da                  mov    %ebx,%edx
     ff4:  74 27                  je     101d
     ff6:  8b 32                  mov    (%edx),%esi
     ff8:  39 d6                  cmp    %edx,%esi
     ffa:  74 26                  je     1022
     ffc:  8b 06                  mov    (%esi),%eax
     ffe:  ff 4b 08               decl   0x8(%ebx)
    1001:  c7 46 08 00 00 00 00   movl   $0x0,0x8(%esi)
    1008:  89 50 04               mov    %edx,0x4(%eax)
    100b:  89 02                  mov    %eax,(%edx)
    100d:  c7 46 04 00 00 00 00   movl   $0x0,0x4(%esi)
    1014:  c7 06 00 00 00 00      movl   $0x0,(%esi)
    101a:  ff 49 28               decl   0x28(%ecx)
    101d:  5b                     pop    %ebx
    101e:  89 f0                  mov    %esi,%eax
    1020:  5e                     pop    %esi
    1021:  c3                     ret
    1022:  ff 49 28               decl   0x28(%ecx)
    1025:  31 f6                  xor    %esi,%esi
    1027:  eb f4                  jmp    101d

>
> I just noticed that this is a local modification of my own, so in
> the vanilla tree it indeed doesn't have any impact on the code
> generated.
>
> Still, your patch does not make sense to me. The latest tree
> also includes my pfifo_fast changes, which modified the code to
> maintain a backlog and made it easy to add more fifos at compile
> time. If you want the loop unrolled then let the compiler do it
> via -funroll-loops. These kinds of optimization seem as unnecessary
> to me as all the loopback optimizations.
>

I don't want to change compiler flags in my tree and lose this
optimization when 2.6.13 is released.

I don't know about loopback optimization, I am not involved with this stuff,
maybe you think I'm another guy ?

It seems to me you give unrelated arguments. I don't know what your plans
are, but mine were not to say you are writing bad code, just to give my
performance analysis and feedback. I'm sorry if it hurts you.

Eric Dumazet From dada1@cosmosbay.com Tue Jul 5 09:16:43 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 09:16:46 -0700 (PDT) Received: from gw1.cosmosbay.com (gw1.cosmosbay.com [62.23.185.226]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65GGgH9008304 for ; Tue, 5 Jul 2005 09:16:43 -0700 Received: from [172.16.0.131] (edumazet-port [172.16.0.131]) by gw1.cosmosbay.com (8.13.3/8.13.3) with ESMTP id j65GExUv006247; Tue, 5 Jul 2005 18:15:00 +0200 Message-ID: <42CAB203.7050700@cosmosbay.com> Date: Tue, 05 Jul 2005 18:14:59 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" CC: mchan@broadcom.com, netdev@oss.sgi.com Subject: Re: [TG3]: About hw coalescing infrastructure. References: <42C9B8DB.7020403@cosmosbay.com> <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <20050704.160034.55511095.davem@davemloft.net> In-Reply-To: <20050704.160034.55511095.davem@davemloft.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [172.16.8.80]); Tue, 05 Jul 2005 18:15:00 +0200 (CEST) X-archive-position: 2635 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 794 Lines: 31 David S. Miller a écrit : > SMP system? What platform and 570x chip revision? > dual opterons 248, Ethernet controller: Broadcom Corporation NetXtreme BCM5702 Gigabit Ethernet (rev 02) > Does the stock driver in 2.6.12 hang as well? Stock driver in 2.6.11 or 2.6.12 cause hangs (crashes ?) too, but no kernel mesages are logged on disk. I think this is crashing in cache_grow() (slab code), not sure because I dont have access to the console. 
The 2.6.13-rc1 tg3 driver seems to survive (running for 18 hours now), and
the CPU time used by network activity was cut down :)

rx-usecs: 500
rx-frames: 30
rx-usecs-irq: 500
rx-frames-irq: 20

tx-usecs: 490
tx-frames: 53
tx-usecs-irq: 490
tx-frames-irq: 5

About 1200 IRQs/second now, with 30000 received packets/s and 25000
transmitted packets/s.

Thank you
Eric

From tgraf@suug.ch Tue Jul 5 10:35:25 2005
Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 10:35:34 -0700 (PDT)
Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65HZOH9013591 for ; Tue, 5 Jul 2005 10:35:25 -0700
Received: by postel.suug.ch (Postfix, from userid 10001) id 5576C1C0EB; Tue, 5 Jul 2005 19:34:11 +0200 (CEST)
Date: Tue, 5 Jul 2005 19:34:11 +0200
From: Thomas Graf
To: Eric Dumazet
Cc: "David S. Miller" , netdev@oss.sgi.com
Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c
Message-ID: <20050705173411.GK16076@postel.suug.ch>
References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> <20050705115108.GE16076@postel.suug.ch> <42CA8555.9050607@cosmosbay.com> <20050705134805.GH16076@postel.suug.ch> <42CAAE2F.5070807@cosmosbay.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <42CAAE2F.5070807@cosmosbay.com>
X-archive-position: 2636
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev
Content-Length: 474
Lines: 12

* Eric Dumazet <42CAAE2F.5070807@cosmosbay.com> 2005-07-05 17:58
> I'm using vanilla 2.6.12 : no -funroll-loop in it. Maybe in your tree, not
> on 99.9% of 2.6.12 trees.
>
> Are you suggesting everybody should use this compiler flag ?

> I don't know about loopback optimization, I am not involved with this stuff,
> maybe you think I'm another guy ?
>
> It seems to me you give unrelated arguments.

Do as you wish, I don't feel like arguing about micro-optimizations.

From leonid.grossman@neterion.com Tue Jul 5 11:32:47 2005
Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 11:32:50 -0700 (PDT)
Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65IWlH9017921 for ; Tue, 5 Jul 2005 11:32:47 -0700
Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j65IVBcx012270; Tue, 5 Jul 2005 14:31:11 -0400 (EDT)
Received: from lgt40 ([10.16.16.64]) by guinness.s2io.com (8.12.6/8.12.6) with ESMTP id j65IV6xS005692; Tue, 5 Jul 2005 14:31:07 -0400 (EDT)
Message-Id: <200507051831.j65IV6xS005692@guinness.s2io.com>
From: "Leonid Grossman"
To:
Cc: , "'Raghavendra Koushik'"
Subject: Msi-X on Opterons?
Date: Tue, 5 Jul 2005 11:31:02 -0700
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook, Build 11.0.5510
Thread-Index: AcWBj60cBTBHZqLpTBm1j+9Oe4KOHQ==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1409
X-Scanned-By: MIMEDefang 2.34
X-archive-position: 2637
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: leonid.grossman@neterion.com
Precedence: bulk
X-list: netdev
Content-Length: 247
Lines: 6

Has anyone had any luck getting MSI-X working on Opteron platforms?
The newer 8132-based systems support MSI-X, but we could not get it working
with our card there.
It works fine on Xeon systems; I wonder if the implementation is
Xeon-centric...

From davem@davemloft.net Tue Jul 5 14:23:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 14:24:00 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65LNsH9031419 for ; Tue, 5 Jul 2005 14:23:54 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Dpurq-0001qd-TQ; Tue, 05 Jul 2005 14:22:10 -0700 Date: Tue, 05 Jul 2005 14:22:10 -0700 (PDT) Message-Id: <20050705.142210.14973612.davem@davemloft.net> To: tgraf@suug.ch Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. Miller" In-Reply-To: <20050705173411.GK16076@postel.suug.ch> References: <20050705134805.GH16076@postel.suug.ch> <42CAAE2F.5070807@cosmosbay.com> <20050705173411.GK16076@postel.suug.ch> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2638 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 752 Lines: 19 From: Thomas Graf Date: Tue, 5 Jul 2005 19:34:11 +0200 > Do as you wish, I don't feel like argueing about micro optimizations. I bet the performance gain really comes from the mispredicted branches in the loop. For loops of fixed duration, say, 5 or 6 iterations or less, it totally defeats the branch prediction logic in most processors. By the time the chip moves the I-cache branch state to "likely" the loop has ended and we eat a mispredict. I think the original patch is OK, hand unrolling the loop in the C code. Adding -funroll-loops to the CFLAGS has lots of implications, and in particular the embedded folks might not be happy with some things that result from that. 
So I'll apply the original unrolling patch for now. From davem@davemloft.net Tue Jul 5 14:28:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 14:28:21 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65LSJH9004901 for ; Tue, 5 Jul 2005 14:28:19 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DpuwE-0002hy-2J; Tue, 05 Jul 2005 14:26:42 -0700 Date: Tue, 05 Jul 2005 14:26:41 -0700 (PDT) Message-Id: <20050705.142641.08323224.davem@davemloft.net> To: dada1@cosmosbay.com Cc: netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. Miller" In-Reply-To: <42CA390C.9000801@cosmosbay.com> References: <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2639 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 423 Lines: 11 Eric, I've told you this before many times. Please do something so that your email client does not corrupt the patches. Once again, your email client turned all the tab characters into spaces and thus made the patch unusable. Even though you used an attachment, the tab-->space transformation still happened somehow. Please fix this, and make a serious mental note to prevent this somehow in the future, thanks a lot. 
From tgraf@suug.ch Tue Jul 5 14:35:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 14:35:15 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65LZBH9005929 for ; Tue, 5 Jul 2005 14:35:11 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 259FD1C0F3; Tue, 5 Jul 2005 23:33:56 +0200 (CEST) Date: Tue, 5 Jul 2005 23:33:55 +0200 From: Thomas Graf To: "David S. Miller" Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050705213355.GM16076@postel.suug.ch> References: <20050705134805.GH16076@postel.suug.ch> <42CAAE2F.5070807@cosmosbay.com> <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050705.142210.14973612.davem@davemloft.net> X-archive-position: 2640 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 248 Lines: 5 * David S. Miller <20050705.142210.14973612.davem@davemloft.net> 2005-07-05 14:22 > So I'll apply the original unrolling patch for now. The patch must be changed to use __qdisc_dequeue_head() instead of __skb_dequeue() or we screw up the backlog. 
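
The backlog concern can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct queue, raw_dequeue() and accounted_dequeue() are hypothetical names, with raw_dequeue() playing the role of a plain list removal such as __skb_dequeue() and accounted_dequeue() playing the role described here for __qdisc_dequeue_head(), that is, a removal which also subtracts the packet length from a byte counter. The real kernel structures and helpers are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet buffer. */
struct pkt {
    struct pkt *next;
    unsigned int len;      /* payload size in bytes */
};

/* Hypothetical stand-in for one queue with byte accounting. */
struct queue {
    struct pkt *head, *tail;
    unsigned int qlen;     /* packets currently queued */
    unsigned int backlog;  /* bytes currently queued */
};

/* Enqueue at the tail and account for both packets and bytes. */
static void enqueue(struct queue *q, struct pkt *p)
{
    assert(p != NULL);
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->qlen++;
    q->backlog += p->len;
}

/* Plain list removal: updates the list and qlen, but not backlog. */
static struct pkt *raw_dequeue(struct queue *q)
{
    struct pkt *p = q->head;

    if (p) {
        q->head = p->next;
        if (!q->head)
            q->tail = NULL;
        p->next = NULL;
        q->qlen--;
    }
    return p;
}

/* Accounting removal: same list operation, plus the backlog update. */
static struct pkt *accounted_dequeue(struct queue *q)
{
    struct pkt *p = raw_dequeue(q);

    if (p)
        q->backlog -= p->len;
    return p;
}
```

In this model, draining a queue with raw_dequeue() leaves qlen at zero while backlog still reports bytes that are no longer queued, which is the kind of inconsistency the message above warns about.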
From davem@davemloft.net Tue Jul 5 14:37:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 14:37:31 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65LbRH9006333 for ; Tue, 5 Jul 2005 14:37:27 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Dpv53-0004ba-4H; Tue, 05 Jul 2005 14:35:49 -0700 Date: Tue, 05 Jul 2005 14:35:48 -0700 (PDT) Message-Id: <20050705.143548.28788459.davem@davemloft.net> To: tgraf@suug.ch Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. Miller" In-Reply-To: <20050705213355.GM16076@postel.suug.ch> References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2641 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 390 Lines: 10 From: Thomas Graf Date: Tue, 5 Jul 2005 23:33:55 +0200 > * David S. Miller <20050705.142210.14973612.davem@davemloft.net> 2005-07-05 14:22 > > So I'll apply the original unrolling patch for now. > > The patch must be changed to use __qdisc_dequeue_head() instead of > __skb_dequeue() or we screw up the backlog. 
Ok, good thing the patch didn't apply correctly anyways :) From dada1@cosmosbay.com Tue Jul 5 16:17:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 16:17:54 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65NHiH9013123 for ; Tue, 5 Jul 2005 16:17:45 -0700 Received: from [192.168.0.5] (84-4-149-97.dti.cegetel.net [84.4.149.97]) by smtp.cegetel.net (Postfix) with ESMTP id 90AED31819E; Wed, 6 Jul 2005 01:16:01 +0200 (CEST) Message-ID: <42CB14B2.5090601@cosmosbay.com> Date: Wed, 06 Jul 2005 01:16:02 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" , tgraf@suug.ch Cc: netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> In-Reply-To: <20050705.143548.28788459.davem@davemloft.net> Content-Type: multipart/mixed; boundary="------------080605050609050601090102" X-archive-position: 2642 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 2313 Lines: 84 This is a multi-part message in MIME format. --------------080605050609050601090102 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit David S. Miller a écrit : > From: Thomas Graf > Date: Tue, 5 Jul 2005 23:33:55 +0200 > > >>* David S. Miller <20050705.142210.14973612.davem@davemloft.net> 2005-07-05 14:22 >> >>>So I'll apply the original unrolling patch for now. >> >>The patch must be changed to use __qdisc_dequeue_head() instead of >>__skb_dequeue() or we screw up the backlog. 
> > > Ok, good thing the patch didn't apply correctly anyways :) > > Oh well, I was unaware of last changes in 2.6.13-rc1 :( Given the fact that the PFIFO_FAST_BANDS macro was introduced, I wonder if the patch should be this one or not... Should we assume PFIFO_FAST_BANDS will stay at 3 or what ? [NET] : unroll a small loop in pfifo_fast_dequeue(). Compiler generates better code. oprofile says this function uses now 0.29% instead of 1.22 %, on a x86_64 target. Signed-off-by: Eric Dumazet --------------080605050609050601090102 Content-Type: text/plain; name="patch.sch_generic" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch.sch_generic" --- linux-2.6.13-rc1/net/sched/sch_generic.c 2005-07-06 00:46:53.000000000 +0200 +++ linux-2.6.13-rc1-ed/net/sched/sch_generic.c 2005-07-06 01:05:04.000000000 +0200 @@ -328,18 +328,31 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc* qdisc) { - int prio; struct sk_buff_head *list = qdisc_priv(qdisc); - for (prio = 0; prio < PFIFO_FAST_BANDS; prio++, list++) { - struct sk_buff *skb = __qdisc_dequeue_head(qdisc, list); - if (skb) { - qdisc->q.qlen--; - return skb; +#if PFIFO_FAST_BANDS == 3 + for (;;) { + if (!skb_queue_empty(list)) + break; + list++; + if (!skb_queue_empty(list)) + break; + list++; + if (!skb_queue_empty(list)) + break; + return NULL; } - } - - return NULL; +#else + int prio; + for (prio = 0;; list++) { + if (!skb_queue_empty(list)) + break; + if (++prio == PFIFO_FAST_BANDS) + return NULL; + } +#endif + qdisc->q.qlen--; + return __qdisc_dequeue_head(qdisc, list); } static int pfifo_fast_requeue(struct sk_buff *skb, struct Qdisc* qdisc) --------------080605050609050601090102-- From tgraf@suug.ch Tue Jul 5 16:42:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 16:42:27 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65NgMH9018222 for ; Tue, 5 Jul 2005 16:42:23 -0700 
Received: by postel.suug.ch (Postfix, from userid 10001) id 7CCA11C0F3; Wed, 6 Jul 2005 01:41:04 +0200 (CEST) Date: Wed, 6 Jul 2005 01:41:04 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050705234104.GR16076@postel.suug.ch> References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42CB14B2.5090601@cosmosbay.com> X-archive-position: 2643 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1555 Lines: 41 * Eric Dumazet <42CB14B2.5090601@cosmosbay.com> 2005-07-06 01:16 > Oh well, I was unaware of last changes in 2.6.13-rc1 :( Ok, this clarifies a lot for me, I was under the impression you knew about these changes. > Given the fact that the PFIFO_FAST_BANDS macro was introduced, I wonder if > the patch should be this one or not... > Should we assume PFIFO_FAST_BANDS will stay at 3 or what ? It is very unlikely to change within mainline but the idea behind it is to allow it be changed at compile time. I still think we can fix this performance issue without manually unrolling the loop or we should at least try to. In the end gcc should notice the constant part of the loop and move it out so basically the only difference should the additional prio++ and possibly a failing branch prediction. What about this? I'm still not sure where exactly all the time is lost so this is a shot in the dark. 
Index: net-2.6/net/sched/sch_generic.c =================================================================== --- net-2.6.orig/net/sched/sch_generic.c +++ net-2.6/net/sched/sch_generic.c @@ -330,10 +330,11 @@ static struct sk_buff *pfifo_fast_dequeu { int prio; struct sk_buff_head *list = qdisc_priv(qdisc); + struct sk_buff *skb; - for (prio = 0; prio < PFIFO_FAST_BANDS; prio++, list++) { - struct sk_buff *skb = __qdisc_dequeue_head(qdisc, list); - if (skb) { + for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) { + if (!skb_queue_empty(list + prio)) { + skb = __qdisc_dequeue_head(qdisc, list); qdisc->q.qlen--; return skb; } From davem@davemloft.net Tue Jul 5 16:46:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 16:46:51 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65NkkH9018911 for ; Tue, 5 Jul 2005 16:46:47 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Dpx67-0000ik-4w; Tue, 05 Jul 2005 16:45:03 -0700 Date: Tue, 05 Jul 2005 16:45:03 -0700 (PDT) Message-Id: <20050705.164503.104035718.davem@davemloft.net> To: tgraf@suug.ch Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. 
Miller" In-Reply-To: <20050705234104.GR16076@postel.suug.ch> References: <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2644 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 680 Lines: 16 From: Thomas Graf Date: Wed, 6 Jul 2005 01:41:04 +0200 > I still think we can fix this performance issue without manually > unrolling the loop or we should at least try to. In the end gcc > should notice the constant part of the loop and move it out so > basically the only difference should the additional prio++ and > possibly a failing branch prediction. But the branch prediction is where I personally think a lot of the lossage is coming from. These can cost upwards of 20 or 30 processor cycles, easily. That's getting close to the cost of a L2 cache miss. I see the difficulties with this change now, why don't we revisit this some time in the future? From tgraf@suug.ch Tue Jul 5 16:56:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 16:56:21 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65NuHH9019820 for ; Tue, 5 Jul 2005 16:56:18 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 8C7211C0F3; Wed, 6 Jul 2005 01:55:04 +0200 (CEST) Date: Wed, 6 Jul 2005 01:55:04 +0200 From: Thomas Graf To: "David S. 
Miller" Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050705235504.GS16076@postel.suug.ch> References: <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <20050705.164503.104035718.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050705.164503.104035718.davem@davemloft.net> X-archive-position: 2645 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1265 Lines: 29 * David S. Miller <20050705.164503.104035718.davem@davemloft.net> 2005-07-05 16:45 > From: Thomas Graf > Date: Wed, 6 Jul 2005 01:41:04 +0200 > > > I still think we can fix this performance issue without manually > > unrolling the loop or we should at least try to. In the end gcc > > should notice the constant part of the loop and move it out so > > basically the only difference should the additional prio++ and > > possibly a failing branch prediction. > > But the branch prediction is where I personally think a lot > of the lossage is coming from. These can cost upwards of 20 > or 30 processor cycles, easily. That's getting close to the > cost of a L2 cache miss. Absolutely. I think what happens is that we produce prediction failures due to the logic within qdisc_dequeue_head(), although I cannot back this up with numbers. > I see the difficulties with this change now, why don't we revisit > this some time in the future? Fine with me. Eric, the patch I just posted should result in the same branch prediction as your loop unrolling. The only additional overhead we still have is the list + prio addition and an additional conditional jump to do the loop. If you have the cycles etc., it would be nice to compare it with your numbers.
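A minimal user-space sketch of the two loop shapes compared above, assuming a toy band type that only counts queued packets. The names toy_band, TOY_BANDS, toy_dequeue_loop and toy_dequeue_unrolled are hypothetical and are not kernel API; the sketch only illustrates that the "test emptiness, then dequeue from list + prio" loop and a manual three-way unrolling pick the same band. It says nothing about cycle counts.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BANDS 3   /* stand-in for PFIFO_FAST_BANDS (assumed to be 3) */

/* Toy band: only tracks how many packets are queued (hypothetical type). */
struct toy_band {
    unsigned int qlen;
};

static int toy_band_empty(const struct toy_band *b)
{
    return b->qlen == 0;
}

/* Loop form: test emptiness first, then dequeue from list + prio.
 * Returns the band index a packet was taken from, or -1 if all are empty. */
static int toy_dequeue_loop(struct toy_band *list)
{
    int prio;

    assert(list != NULL);
    for (prio = 0; prio < TOY_BANDS; prio++) {
        if (!toy_band_empty(list + prio)) {
            list[prio].qlen--;
            return prio;
        }
    }
    return -1;
}

/* Manually unrolled form; only valid while TOY_BANDS == 3. */
static int toy_dequeue_unrolled(struct toy_band *list)
{
    int prio = 0;

    assert(list != NULL);
    if (toy_band_empty(list)) {
        list++;
        prio++;
        if (toy_band_empty(list)) {
            list++;
            prio++;
            if (toy_band_empty(list))
                return -1;
        }
    }
    list->qlen--;
    return prio;
}
```

Both functions walk the bands from prio 0 upward, so any difference between them would have to come from the generated code, not from the band they select.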
From dada1@cosmosbay.com Tue Jul 5 17:34:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 17:34:03 -0700 (PDT) Received: from smtp.cegetel.net (mf00.sitadelle.com [212.94.174.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j660Y0H9022148 for ; Tue, 5 Jul 2005 17:34:00 -0700 Received: from [192.168.0.5] (84-4-149-97.dti.cegetel.net [84.4.149.97]) by smtp.cegetel.net (Postfix) with ESMTP id 8834C1A409E; Wed, 6 Jul 2005 02:32:24 +0200 (CEST) Message-ID: <42CB2698.2080904@cosmosbay.com> Date: Wed, 06 Jul 2005 02:32:24 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> In-Reply-To: <20050705234104.GR16076@postel.suug.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2646 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 2279 Lines: 68 Thomas Graf a écrit : > I still think we can fix this performance issue without manually > unrolling the loop or we should at least try to. In the end gcc > should notice the constant part of the loop and move it out so > basically the only difference should the additional prio++ and > possibly a failing branch prediction. > > What about this? I'm still not sure where exactly all the time > is lost so this is a shot in the dark. 
> > Index: net-2.6/net/sched/sch_generic.c > =================================================================== > --- net-2.6.orig/net/sched/sch_generic.c > +++ net-2.6/net/sched/sch_generic.c > @@ -330,10 +330,11 @@ static struct sk_buff *pfifo_fast_dequeu > { > int prio; > struct sk_buff_head *list = qdisc_priv(qdisc); > + struct sk_buff *skb; > > - for (prio = 0; prio < PFIFO_FAST_BANDS; prio++, list++) { > - struct sk_buff *skb = __qdisc_dequeue_head(qdisc, list); > - if (skb) { > + for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) { > + if (!skb_queue_empty(list + prio)) { > + skb = __qdisc_dequeue_head(qdisc, list); > qdisc->q.qlen--; > return skb; > } > > Hum... shouldnt it be : + skb = __qdisc_dequeue_head(qdisc, list + prio); ? Anyway, the branches misprediction come from the fact that most of packets are queued in the prio=2 list. So each time this function is called, a non unrolled version has to pay 2 to 5 branches misprediction. if ((!skb_queue_empty(list + prio)) /* branch not taken, mispredict when prio=0 */ if ((!skb_queue_empty(list + prio)) /* branch not taken, mispredict when prio=1 */ if ((!skb_queue_empty(list + prio)) /* branch taken (or not if queue is really empty), mispredict when prio=2 */ Maybe we can rewrite the whole thing without branches, examining prio from PFIFO_FAST_BANDS-1 down to 0, at least for modern cpu with conditional mov (cmov) struct sk_buff_head *best = NULL; struct sk_buff_head *list = qdisc_priv(qdisc)+PFIFO_FAST_BANDS-1; if (skb_queue_empty(list)) best = list ; list--; if (skb_queue_empty(list)) best = list ; list--; if (skb_queue_empty(list)) best = list ; if (best != NULL) { qdisc->q.qlen--; return __qdisc_dequeue_head(qdisc, best); } This version should have one branch. 
I will test this after some sleep :) See you Eric From tgraf@suug.ch Tue Jul 5 17:52:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 17:52:59 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j660qsH9023252 for ; Tue, 5 Jul 2005 17:52:55 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 6D5D71C0F2; Wed, 6 Jul 2005 02:51:40 +0200 (CEST) Date: Wed, 6 Jul 2005 02:51:40 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050706005140.GT16076@postel.suug.ch> References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <42CB2698.2080904@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42CB2698.2080904@cosmosbay.com> X-archive-position: 2647 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 869 Lines: 27 * Eric Dumazet <42CB2698.2080904@cosmosbay.com> 2005-07-06 02:32 > Hum... shouldnt it be : > > + skb = __qdisc_dequeue_head(qdisc, list + prio); > Correct. > Anyway, the branches misprediction come from the fact that most of packets > are queued in the prio=2 list. > > So each time this function is called, a non unrolled version has to pay 2 > to 5 branches misprediction. > > if ((!skb_queue_empty(list + prio)) /* branch not taken, mispredict when > prio=0 */ The !expr implies an unlikely so the prediction should be right and equal to your unrolling version. 
> Maybe we can rewrite the whole thing without branches, examining prio from > PFIFO_FAST_BANDS-1 down to 0, at least for modern cpu with conditional mov > (cmov) This would break the whole thing, the qdisc is supposed to try and dequeue from the highest priority queue (prio=0) first. From dada1@cosmosbay.com Tue Jul 5 17:55:06 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 17:55:09 -0700 (PDT) Received: from smtp.cegetel.net (mf00.sitadelle.com [212.94.174.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j660t4H9023684 for ; Tue, 5 Jul 2005 17:55:05 -0700 Received: from [192.168.0.5] (84-4-149-97.dti.cegetel.net [84.4.149.97]) by smtp.cegetel.net (Postfix) with ESMTP id 200FA1A40F2; Wed, 6 Jul 2005 02:53:26 +0200 (CEST) Message-ID: <42CB2B84.50702@cosmosbay.com> Date: Wed, 06 Jul 2005 02:53:24 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Eric Dumazet Cc: Thomas Graf , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <42CB2698.2080904@cosmosbay.com> In-Reply-To: <42CB2698.2080904@cosmosbay.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2648 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 2924 Lines: 87 Eric Dumazet a écrit : > > > Maybe we can rewrite the whole thing without branches, examining prio > from PFIFO_FAST_BANDS-1 down to 0, at least for modern cpu with > conditional mov (cmov) > > struct sk_buff_head *best = NULL; > struct sk_buff_head *list = qdisc_priv(qdisc)+PFIFO_FAST_BANDS-1; > if (skb_queue_empty(list)) best = list ; > list--; > if (skb_queue_empty(list)) best = list ; > list--; > if (skb_queue_empty(list)) best = list ; > if (best != NULL) { > qdisc->q.qlen--; > return __qdisc_dequeue_head(qdisc, best); > } > > This version should have one branch. 
> I will test this after some sleep :)
> See you
> Eric
>
>

(Sorry, still using 2.6.12, but the idea remains)

static struct sk_buff *
pfifo_fast_dequeue(struct Qdisc* qdisc)
{
        struct sk_buff_head *list = qdisc_priv(qdisc);
        struct sk_buff_head *best = NULL;

        list += 2;
        if (!skb_queue_empty(list))
                best = list;
        list--;
        if (!skb_queue_empty(list))
                best = list;
        list--;
        if (!skb_queue_empty(list))
                best = list;
        if (best) {
                qdisc->q.qlen--;
                return __skb_dequeue(best);
        }
        return NULL;
}

At least the compiler output seems promising :

0000000000000550 :
 550: 48 8d 97 f0 00 00 00   lea    0xf0(%rdi),%rdx
 557: 31 c9                  xor    %ecx,%ecx
 559: 48 8d 87 c0 00 00 00   lea    0xc0(%rdi),%rax
 560: 48 39 97 f0 00 00 00   cmp    %rdx,0xf0(%rdi)
 567: 48 0f 45 ca            cmovne %rdx,%rcx
 56b: 48 8d 97 d8 00 00 00   lea    0xd8(%rdi),%rdx
 572: 48 39 97 d8 00 00 00   cmp    %rdx,0xd8(%rdi)
 579: 48 0f 45 ca            cmovne %rdx,%rcx
 57d: 48 39 87 c0 00 00 00   cmp    %rax,0xc0(%rdi)
 584: 48 0f 45 c8            cmovne %rax,%rcx
 588: 31 c0                  xor    %eax,%eax
 58a: 48 85 c9               test   %rcx,%rcx
 58d: 74 32                  je     5c1   // one conditional branch
 58f: ff 4f 40               decl   0x40(%rdi)
 592: 48 8b 11               mov    (%rcx),%rdx
 595: 48 39 ca               cmp    %rcx,%rdx
 598: 74 27                  je     5c1   // never taken branch : always predicted OK
 59a: 48 89 d0               mov    %rdx,%rax
 59d: 48 8b 12               mov    (%rdx),%rdx
 5a0: ff 49 10               decl   0x10(%rcx)
 5a3: 48 c7 40 10 00 00 00   movq   $0x0,0x10(%rax)
 5aa: 00
 5ab: 48 89 4a 08            mov    %rcx,0x8(%rdx)
 5af: 48 89 11               mov    %rdx,(%rcx)
 5b2: 48 c7 40 08 00 00 00   movq   $0x0,0x8(%rax)
 5b9: 00
 5ba: 48 c7 00 00 00 00 00   movq   $0x0,(%rax)
 5c1: 90                     nop
 5c2: c3                     retq

I will post some profiling results tomorrow.
Eric From tgraf@suug.ch Tue Jul 5 18:03:14 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 18:03:16 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6613DH9024750 for ; Tue, 5 Jul 2005 18:03:14 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 973B41C0F3; Wed, 6 Jul 2005 03:02:00 +0200 (CEST) Date: Wed, 6 Jul 2005 03:02:00 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050706010200.GU16076@postel.suug.ch> References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <42CB2698.2080904@cosmosbay.com> <42CB2B84.50702@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42CB2B84.50702@cosmosbay.com> X-archive-position: 2649 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 416 Lines: 10 * Eric Dumazet <42CB2B84.50702@cosmosbay.com> 2005-07-06 02:53 > (Sorry, still using 2.6.12, but the idea remains) I think you got me wrong, the whole point of this qdisc is to prioritize which means that we cannot dequeue from prio 1 as long as the queue in prio 0 is not empty. If you have no traffic at all for prio=0 and prio=1 then the best solution is to replace the qdisc on the device with a simple fifo. 
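A minimal user-space sketch of the reverse scan under discussion, assuming a toy band type that only holds a packet count. The names demo_band, DEMO_BANDS, pick_forward, pick_reverse and scans_agree are hypothetical, not kernel API. It suggests that scanning from the last band down to band 0 and overwriting the candidate each time still ends on the highest-priority non-empty band, because band 0 is examined last. Whether a compiler turns the three assignments into conditional moves depends on the target and optimization level and is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BANDS 3   /* stand-in for PFIFO_FAST_BANDS (assumed to be 3) */

/* Toy band: only a packet count (hypothetical type). */
struct demo_band {
    unsigned int qlen;
};

/* Forward scan: return the first non-empty band, or NULL if all are empty. */
static struct demo_band *pick_forward(struct demo_band *bands)
{
    int prio;

    assert(bands != NULL);
    for (prio = 0; prio < DEMO_BANDS; prio++)
        if (bands[prio].qlen != 0)
            return &bands[prio];
    return NULL;
}

/* Reverse scan: every non-empty band overwrites 'best'.
 * Band 0 is checked last, so it wins whenever it holds a packet. */
static struct demo_band *pick_reverse(struct demo_band *bands)
{
    struct demo_band *best = NULL;
    struct demo_band *list;

    assert(bands != NULL);
    list = bands + DEMO_BANDS - 1;
    if (list->qlen != 0)
        best = list;
    list--;
    if (list->qlen != 0)
        best = list;
    list--;
    if (list->qlen != 0)
        best = list;
    return best;
}

/* Compare both scans over all 8 empty/non-empty combinations.
 * Returns 1 if they always select the same band, 0 otherwise. */
static int scans_agree(void)
{
    unsigned int mask;

    for (mask = 0; mask < 8; mask++) {
        struct demo_band b[DEMO_BANDS];
        int i;

        for (i = 0; i < DEMO_BANDS; i++)
            b[i].qlen = (mask >> i) & 1u;
        if (pick_forward(b) != pick_reverse(b))
            return 0;
    }
    return 1;
}
```

The reverse form always reads all three bands, which matches the case described in the thread where most packets sit in the last band and the forward scan ends up touching all three anyway.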
From dada1@cosmosbay.com Tue Jul 5 18:06:14 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 18:06:16 -0700 (PDT) Received: from smtp.cegetel.net (mf00.sitadelle.com [212.94.174.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6616DH9025527 for ; Tue, 5 Jul 2005 18:06:14 -0700 Received: from [192.168.0.5] (84-4-149-97.dti.cegetel.net [84.4.149.97]) by smtp.cegetel.net (Postfix) with ESMTP id E35A31A4145; Wed, 6 Jul 2005 03:04:37 +0200 (CEST) Message-ID: <42CB2E24.6010303@cosmosbay.com> Date: Wed, 06 Jul 2005 03:04:36 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <42CB2698.2080904@cosmosbay.com> <20050706005140.GT16076@postel.suug.ch> In-Reply-To: <20050706005140.GT16076@postel.suug.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2650 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 636 Lines: 18 Thomas Graf a écrit : >>Maybe we can rewrite the whole thing without branches, examining prio from >>PFIFO_FAST_BANDS-1 down to 0, at least for modern cpu with conditional mov >>(cmov) > > > This would break the whole thing, the qdisc is supposed to try and > dequeue from the highest priority queue (prio=0) first. > > I still dequeue a packet from the highest priority queue. 
But nothing prevents us from looking at the three queues in reverse order, if that avoids the conditional branches. There is no memory penalty, since most of the time we were looking at all three queues anyway, and the 3 sk_buff_head are in the same cache line. From tgraf@suug.ch Tue Jul 5 18:09:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 18:09:13 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66199H9026127 for ; Tue, 5 Jul 2005 18:09:10 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 3F8631C0F3; Wed, 6 Jul 2005 03:07:56 +0200 (CEST) Date: Wed, 6 Jul 2005 03:07:56 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050706010756.GV16076@postel.suug.ch> References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <42CB2698.2080904@cosmosbay.com> <20050706005140.GT16076@postel.suug.ch> <42CB2E24.6010303@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <42CB2E24.6010303@cosmosbay.com> X-archive-position: 2651 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 560 Lines: 17 * Eric Dumazet <42CB2E24.6010303@cosmosbay.com> 2005-07-06 03:04 > Thomas Graf wrote: > > >>Maybe we can rewrite the whole thing without branches, examining prio > >>from PFIFO_FAST_BANDS-1 down to 0, at least for modern cpu with > >>conditional mov (cmov) > > > > > >This would break the whole thing, the qdisc is supposed to try and > >dequeue
from the highest priority queue (prio=0) first. > > > > > > I still dequeue a packet from the highest priority queue. Ahh... sorry, I misread your patch, interesting idea. I'll be waiting for your numbers. From dada1@cosmosbay.com Tue Jul 5 18:10:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 18:10:39 -0700 (PDT) Received: from smtp.cegetel.net (mf00.sitadelle.com [212.94.174.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j661AaH9026585 for ; Tue, 5 Jul 2005 18:10:36 -0700 Received: from [192.168.0.5] (84-4-149-97.dti.cegetel.net [84.4.149.97]) by smtp.cegetel.net (Postfix) with ESMTP id 4E2891A4038; Wed, 6 Jul 2005 03:09:01 +0200 (CEST) Message-ID: <42CB2F2C.7050602@cosmosbay.com> Date: Wed, 06 Jul 2005 03:09:00 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <42CB2698.2080904@cosmosbay.com> <42CB2B84.50702@cosmosbay.com> <20050706010200.GU16076@postel.suug.ch> In-Reply-To: <20050706010200.GU16076@postel.suug.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2652 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 650 Lines: 21 Thomas Graf a écrit : > I think you got me wrong, the whole point of this qdisc > is to prioritize which means that we cannot dequeue from > prio 1 as long as the queue in prio 0 is not empty. 
If prio 0 is not empty, then the last 'if (!skb_queue_empty(list)) best = list;' will set 'best' to the prio 0 list, and we dequeue the packet from this prio 0 list, not from prio 1 or prio 2. > > If you have no traffic at all for prio=0 and prio=1 then > the best solution is to replace the qdisc on the device > with a simple fifo. Yes, sure, but I know that already. Unfortunately I have some traffic on prio=1 and prio=0 (about 5%). Thank you Eric From zdenek@rcn.com Tue Jul 5 19:04:07 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 19:04:10 -0700 (PDT) Received: from smtp05.mrf.mail.rcn.net (smtp05.mrf.mail.rcn.net [207.172.4.64]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66246H9030373 for ; Tue, 5 Jul 2005 19:04:07 -0700 Received: from 209-150-50-115.c3-0.nwt-ubr3.sbo-nwt.ma.cable.rcn.com (HELO funex) (209.150.50.115) by smtp05.mrf.mail.rcn.net with SMTP; 05 Jul 2005 22:02:31 -0400 Message-Id: <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> X-IronPort-AV: i="3.93,263,1115006400"; d="scan'208"; a="58465406:sNHT21009888" X-Sender: zdenek@pop.rcn.com X-Mailer: QUALCOMM Windows Eudora Pro Version 4.0 Date: Tue, 05 Jul 2005 21:55:29 -0400 To: Henrik Nordstrom From: Zdenek Radouch Subject: Re: controlling ARP Proxy scope?
Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: References: <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-archive-position: 2653 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zdenek@rcn.com Precedence: bulk X-list: netdev Content-Length: 2238 Lines: 56 At 11:21 PM 7/2/05 +0200, Henrik Nordstrom wrote: >On Fri, 1 Jul 2005, Zdenek Radouch wrote: > >> So, left with only a binary flag in /proc, and network definition on the >> interface, >> I assumed (perhaps naively) that the arp would proxy only for the addresses >> within the subnet defined for the interface (on which the proxy arp is >> turned on). >> However, that does not seem to be the case. > >You may be able to tune this with either arp_filter or arp_ignore. Unfortunately I can't. Not without adding more code to what is quite obviously a bunch of kludgy patches for an ill-conceived ARP proxy design. > >> I have an interface with address 10.1.2.219 and mask 255.255.255.248 with >> proxy arp turned on on this interface, and the machine is responding >> (I see that with tcpdump) to arp requests for address 10.1.2.1, i.e., >> an address outside of the proxy interface's subnet. > >Correct. > >> Can anyone explain the behavior? > >proxy_arp simply ARPs if there is a route for the requested destination >going out on another interface than where the ARP was seen. In my case, the proxy replies to a request seen on the very same interface to which the route points to. That's wrong no matter how you look at it; this is a route to which this node will be routing, i.e., this node will be ARPing for this route address - it itself should not reply to such requests, nor could it ever successfully do so. I find the idea to proxy based on routing tables quite questionable. 
It may work in some pretty trivial cases, but will very obviously fail with a more complex configuration. I have seven or eight networks attached to the node, and I certainly do not want to proxy for every single address one may find in the routing tables. It is equally mind-boggling to me how this could ever work with a stack allowing source-based routing, that is, a stack allowing coexistence of multiple, possibly conflicting routing tables. Sounds to me like I am going to have to rewrite the module. It needs to be configured manually - the notion that it could work automagically, without external configuration, is quite unrealistic, as one can see from the code in arp_filter and arp_ignore. Thanks for the pointers. -Zdenek From hno@marasystems.com Tue Jul 5 19:22:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 19:22:22 -0700 (PDT) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j662MGH9031846 for ; Tue, 5 Jul 2005 19:22:17 -0700 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id j662KWt31633; Wed, 6 Jul 2005 04:20:32 +0200 Date: Wed, 6 Jul 2005 04:20:32 +0200 (CEST) From: Henrik Nordstrom To: Zdenek Radouch cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: controlling ARP Proxy scope?
In-Reply-To: <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> Message-ID: References: <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 2654 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev Content-Length: 2747 Lines: 65 On Tue, 5 Jul 2005, Zdenek Radouch wrote: >> proxy_arp simply ARPs if there is a route for the requested destination >> going out on another interface than where the ARP was seen. > > In my case, the proxy replies to a request seen on the very same interface > to which the route points to. Are you really sure on this? This part has always worked fine for me with Linux proxy-arp and a large variety of different kernels. > I find the idea to proxy based on routing tables quite questionable. So do I. The manual proxy-arp entries method suits me much better, but is a pain due to lack of range support (probably why it got removed in 2.4) > It may work is some pretty trivial cases, but will very obviously fail > with a more complex configuration. Haven't managed to find a single situation not solveable yet.. and this involves pretty complex configurations.. I don't remember which of the sysctls mentioned earlier did the trick, but once enabled things starts to behave quite sanely even when there is multiple foreign networks unexpectedly carried on the same Ethernet. IIRC the settings I settled for was arp_ignore = 1 arp_announce = 1 > I have seven or eight networks attached to the node, and I certainly do > not want to proxy for every single address one may find in the routing > tables. Then don't. 
> It is equally mind boggling to me how this could ever work with a stack > allowing source-based routing, that is, a stack allowing coexistence of > multiple, possibly conflicting routing tables. Why not? > Sounds to me like I am going to have to rewrite the module. It needs to be > configured manually Well, for most setups it does work automagically. Just bring up the interfaces with the same IP, route the network out on the "main" interface having most hosts, and host (or subnet) route the others out the other interface. ARP then follows automatically. But in messy networks, or when your routing table is not correct, sysctls are needed to restrict when to respond, so that you do not answer ARP requests for outside/foreign networks. It probably isn't very hard to bring back the support for published proxy-arp entries if needed. But without range support it's a pain to maintain in most setups requiring proxy-arp, as you then need an ARP entry for every "other" station on each interface involved in proxy-arp, meaning that if you proxy-arp a /24 network then you need 253 proxy-arp entries (one per station, defining which interface it belongs on). Even in the normal situation where you only act as a proxy-arp gateway for fewer than a handful of stations, this is a significant administrative overhead compared to just configuring routing, which is required anyway.
Regards Henrik From zdenek@rcn.com Tue Jul 5 21:41:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 05 Jul 2005 21:41:26 -0700 (PDT) Received: from smtp05.mrf.mail.rcn.net (smtp05.mrf.mail.rcn.net [207.172.4.64]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j664fMH9012561 for ; Tue, 5 Jul 2005 21:41:23 -0700 Received: from 209-150-50-115.c3-0.nwt-ubr3.sbo-nwt.ma.cable.rcn.com (HELO funex) (209.150.50.115) by smtp05.mrf.mail.rcn.net with SMTP; 06 Jul 2005 00:39:47 -0400 Message-Id: <3u3gb7$1npg80@smtp05.mrf.mail.rcn.net> X-IronPort-AV: i="3.93,263,1115006400"; d="scan'208"; a="58507520:sNHT22991424" X-Sender: zdenek@pop.rcn.com X-Mailer: QUALCOMM Windows Eudora Pro Version 4.0 Date: Wed, 06 Jul 2005 00:32:44 -0400 To: Henrik Nordstrom From: Zdenek Radouch Subject: Re: controlling ARP Proxy scope? Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: References: <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-archive-position: 2655 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zdenek@rcn.com Precedence: bulk X-list: netdev Content-Length: 4535 Lines: 108 At 04:20 AM 7/6/05 +0200, Henrik Nordstrom wrote: >On Tue, 5 Jul 2005, Zdenek Radouch wrote: > >>> proxy_arp simply ARPs if there is a route for the requested destination >>> going out on another interface than where the ARP was seen. >> >> In my case, the proxy replies to a request seen on the very same interface >> to which the route points to. > >Are you really sure on this? This part has always worked fine for me with >Linux proxy-arp and a large variety of different kernels. Yes, I am very sure about it. 
Of all attached networks, only one is a 10.* network, and the address answered, 10.1.2.1, happens to be the main router/gateway to the rest of the world. My machine (with proxy arp) incorrectly answers the ARP requests (for 10.1.2.1) from other machines on the 10.1.2.* subnet. Once it steals the role of the gateway, my machine then forwards the packets to its own default gateway, which is behind the proxied machines. As a result of this mess, I see foreign traffic diverted through my proxy server via a private network to another gateway on the other side. > >> I find the idea to proxy based on routing tables quite questionable. > >So do I. The manual proxy-arp entries method suits me much better, but is >a pain due to lack of range support (probably why it got removed in 2.4) > >> It may work is some pretty trivial cases, but will very obviously fail >> with a more complex configuration. > >Haven't managed to find a single situation not solveable yet.. and this >involves pretty complex configurations.. I don't remember which of the >sysctls mentioned earlier did the trick, but once enabled things starts to >behave quite sanely even when there is multiple foreign networks >unexpectedly carried on the same Ethernet. IIRC the settings I settled for >was > > arp_ignore = 1 > arp_announce = 1 > >> I have seven or eight networks attached to the node, and I certainly do >> not want to proxy for every single address one may find in the routing >> tables. > >Then don't. Well, how do I tell it that I want to proxy for all machines on the 192.168.13.128/29 net attached to eth0.5, but not for any of the machines on 192.168.2.0/24 attached to eth0.6? It just occurred to me that if I misunderstood the semantics, the setup may be wrong. I assumed that enabling proxy arp on interface X would make interface X answer (proxy) the ARP queries. Is that correct? (The alternative would be that setting the bit on interface X would make all other interfaces answer on behalf of X).
>
>> It is equally mind boggling to me how this could ever work with a stack
>> allowing source-based routing, that is, a stack allowing coexistence of
>> multiple, possibly conflicting routing tables.
>
>Why not?

Because in rule-based routing, table entries are only valid when the
corresponding rule hits, based on the source address. In the absence of
a source address, as is the case with an ARP request, how could you
possibly determine which of the routing tables should be consulted to
decide whether the ARP query should be answered?

>
>> Sounds to me like I am going to have to rewrite the module. It needs to be
>> configured manually
>
>Well, for most setups it does work automagically. Just bring up the
>interfaces with the same IP, route the network out on the "main" interface
>having most hosts and host (or subnet) route the other out the other
>interface. ARP then follows automatically.
>
>But in messy networks or when your routing table is not correct then
>sysctls are needed to restrict when to respond to stop you from responding
>to ARP requests to outside/foreign networks.
>
>Probably isn't very hard to bring back the support for published proxy-arp
>entries if needed. But without range support it's a pain to maintain in
>most setups requiring proxy-arp as you then need an ARP entry for every
>"other" station on each interface involved in proxy-arp, meaning that if
>you proxy-arp a /24 network then you need 253 proxy-arp entries (one per
>station, defining which interface it belongs on). In the normal situation
>that you only act as a proxy-arp gateway for less than a handful stations
>this is a significant administrative overhead compared to just configuring
>routing which is required anyway.

I agree that you wouldn't want to enter discrete addresses.
But it could be a simple command using the standard subnet notation:

    arp_proxy --add 10.1.2.0/24 [11:22:33:44:55:66]
    (optional Eth address for 3rd party proxy ARP).
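
A prefix-based scope check of the kind suggested above can be very small.
The following is only a userspace sketch under stated assumptions: ipv4()
and in_prefix() are hypothetical helpers invented for this illustration,
and nothing here is the kernel's proxy ARP code or an existing arp_proxy
tool. It shows how a target address from an ARP request could be tested
against a configured subnet such as 192.168.13.128/29.

```c
#include <assert.h>
#include <stdint.h>

/* Build a host-order IPv4 address from four octets (hypothetical helper). */
static uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * Return 1 if addr lies inside net/len, 0 otherwise.
 * len == 0 is handled separately because shifting a 32-bit value
 * by 32 bits is undefined in C.
 */
static int in_prefix(uint32_t addr, uint32_t net, unsigned len)
{
    uint32_t mask;

    assert(len <= 32);
    mask = len ? UINT32_MAX << (32 - len) : 0;
    return (addr & mask) == (net & mask);
}
```

With such a check, a request for 192.168.13.130 would match the
192.168.13.128/29 entry, while a request for 192.168.2.7 or for
192.168.13.136 would not, so the proxy would stay silent for them.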
Regards -Zdenek From tgraf@suug.ch Wed Jul 6 05:43:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 05:43:34 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66ChQH9031341 for ; Wed, 6 Jul 2005 05:43:27 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id C53701C0F3; Wed, 6 Jul 2005 14:42:06 +0200 (CEST) Date: Wed, 6 Jul 2005 14:42:06 +0200 From: Thomas Graf To: Eric Dumazet Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050706124206.GW16076@postel.suug.ch> References: <20050705173411.GK16076@postel.suug.ch> <20050705.142210.14973612.davem@davemloft.net> <20050705213355.GM16076@postel.suug.ch> <20050705.143548.28788459.davem@davemloft.net> <42CB14B2.5090601@cosmosbay.com> <20050705234104.GR16076@postel.suug.ch> <42CB2698.2080904@cosmosbay.com> <42CB2B84.50702@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42CB2B84.50702@cosmosbay.com> X-archive-position: 2656 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 2350 Lines: 72 * Eric Dumazet <42CB2B84.50702@cosmosbay.com> 2005-07-06 02:53 A short recap after some coffee and sleep: The inital issue you brought up which you backed up with numbers is probably the cause of multiple wrong branch predictions due to the fact that I wrote skb = dequeue(); if (skb) which was assumed to be likely by the compiler. In your patch you fixed this with !skb_queue_empty() which fixed this wrong prediction and also acts as a little optimization due to skb_queue_empty() being really simple to implement for the compiler. 
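
The empty-queue test is cheap because a circular list head that points
back at itself is empty, so the check is one load and one compare. Here
is a minimal userspace sketch of a loop over three priority bands built
on that test. struct qnode, queue_empty() and prio_dequeue() are
hypothetical stand-ins for sk_buff_head, skb_queue_empty() and
pfifo_fast_dequeue(); this is not the kernel code, and the qlen
bookkeeping is left out.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BANDS 3

/* Circular doubly linked list node; a head that points at itself is empty. */
struct qnode {
    struct qnode *next, *prev;
};

static void queue_init(struct qnode *h)
{
    h->next = h->prev = h;
}

/* One load and one compare, like the kernel's empty check. */
static int queue_empty(const struct qnode *h)
{
    return h->next == h;
}

static void queue_add_tail(struct qnode *h, struct qnode *n)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Unlink and return the first node, or NULL if the queue is empty. */
static struct qnode *queue_pop_head(struct qnode *h)
{
    struct qnode *n = h->next;

    if (n == h)
        return NULL;
    h->next = n->next;
    n->next->prev = h;
    n->next = n->prev = NULL;
    return n;
}

/* Scan bands from highest priority (0) to lowest and dequeue from the
 * first non-empty one. */
static struct qnode *prio_dequeue(struct qnode bands[NUM_BANDS])
{
    int prio;

    assert(bands != NULL);
    for (prio = 0; prio < NUM_BANDS; prio++)
        if (!queue_empty(&bands[prio]))
            return queue_pop_head(&bands[prio]);
    return NULL;
}
```

If bands 1 and 2 each hold one packet, prio_dequeue() returns the packet
from band 1 first, because lower band numbers have higher priority.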
The patch I posted should produce almost the same result. Apart from the
additional branch misprediction for the loop, it always has one
misprediction, which is when we hit a non-empty queue. In your unrolled
version you could optimize it even more by taking advantage of the fact
that prio=2 is the most likely non-empty queue, so you could change the
check to likely() and save a branch misprediction for the common case at
the cost of a branch misprediction if all queues are empty.

The patch I posted results in something like this:

pfifo_fast_dequeue:
        pushl   %ebx
        xorl    %ecx, %ecx
        movl    8(%esp), %ebx
        leal    128(%ebx), %edx
.L129:
        movl    (%edx), %eax
        cmpl    %edx, %eax
        jne     .L132           ; if (!skb_queue_empty())
        incl    %ecx
        addl    $20, %edx
        cmpl    $2, %ecx
        jle     .L129           ; end of loop
        xorl    %eax, %eax      ; all queues empty
.L117:
        popl    %ebx
        ret

I regard the miss here as acceptable for the increased flexibility
we get. It can be optimized with a loop unrolling but my opinion
is to try and avoid it if possible.

Now your second thought is quite interesting, although it heavily
depends on the fact that prio=2 is the most often used band. It
will be interesting to see some numbers.

> static struct sk_buff *
> pfifo_fast_dequeue(struct Qdisc* qdisc)
> {
> 	struct sk_buff_head *list = qdisc_priv(qdisc);
> 	struct sk_buff_head *best = NULL;
>
> 	list += 2;
> 	if (!skb_queue_empty(list))
> 		best = list;
> 	list--;
> 	if (!skb_queue_empty(list))
> 		best = list;
> 	list--;
> 	if (!skb_queue_empty(list))
> 		best = list;

Here is what I mean, a likely() should be even better.
> if (best) { > qdisc->q.qlen--; > return __skb_dequeue(best); > } > return NULL; > } From hno@marasystems.com Wed Jul 6 08:33:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 08:34:04 -0700 (PDT) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66FXwH9020140 for ; Wed, 6 Jul 2005 08:33:58 -0700 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id j66FWGq06815; Wed, 6 Jul 2005 17:32:16 +0200 Date: Wed, 6 Jul 2005 17:32:16 +0200 (CEST) From: Henrik Nordstrom To: Zdenek Radouch cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: controlling ARP Proxy scope? In-Reply-To: <3u3gb7$1npg80@smtp05.mrf.mail.rcn.net> Message-ID: References: <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> <3u3gb7$1npg80@smtp05.mrf.mail.rcn.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 2657 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev Content-Length: 2642 Lines: 62 On Wed, 6 Jul 2005, Zdenek Radouch wrote: > Well, how do I tell it that I want to proxy for all machines on the > 192.168.13.128/29 net attached to eth0.5, but not for any of > the machines on 192.168.2.0/24 attached to eth0.6 ? For whom going where? What is eth0.5? A tagged 802.1q vlan on eth0, or something else? (i.e. is it a interface of it's own according to ip link show, or just an alias?) > It just occured to me that if I misunderstood the semantics, the setup > may be wrong. I assumed that turning the proxy arp on on interface X > would make the interface X answer (proxy) the ARP queries. > Is that correct? Yes. 
>>> It is equally mind boggling to me how this could ever work with a stack >>> allowing source-based routing, that is, a stack allowing coexistence of >>> multiple, possibly conflicting routing tables. >> >> Why not? > > Because in rule-based routing, a table entries are only valid when the > corresponding > rule hits, based on the source address. In the absence of a source address > as is the case of an ARP request, how could you possibly determine which > of the routing tables should be consulted to decide whether the ARP query > should be answered? ARP requests do have valid source addresses, at least the normal ARP queries. The duplicate IP check ARP requests and a few other obscure cases don't. > I agree that you wouldn't want to enter discrete addresses. > But it could be a simple command using the standard subnet notation: > > arp_proxy --add 10.1.2.0/24 [11:22:33:44:55:66] > (optional Eth address for 3rd party proxy ARP). Which to my experience is all automatic from the routing table. But I do remember some slight complications in source routed networks but nothing major. Before arp_ignore existed I often used source routing as a substitute for keeping proxy-ARP at bay but I don't remember the exact details. I only experienced problems when there was other IP networks on the same Ethernet but not known to the box. In this case I had to enable the arp_ignore sysctl to stop answering "other" ARP queries for networks not defined on the same Ethernet interface. I also used the arp_announce to keep the source address of ARP responses under control. In addition in a setup which involved stations with IP addresses identical to that of one of my own other Ethernet interfaces I had to hack the kernel slightly to make it possible to ignore ARP duplicate IP check packets on the relevant interfaces. But this is very special setup involving NAT and other nastinesses which shouldn't be encountered in any reasonably normal network situation. 
Regards Henrik From mitch.a.williams@intel.com Wed Jul 6 11:39:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 11:40:49 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66IdhH9009649 for ; Wed, 6 Jul 2005 11:39:43 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j66IbrYn026215; Wed, 6 Jul 2005 18:37:53 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j66Ibrqx025701; Wed, 6 Jul 2005 18:37:53 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.58]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j66IboSL026134; Wed, 6 Jul 2005 11:37:50 -0700 Date: Wed, 6 Jul 2005 11:37:49 -0700 From: Mitch Williams X-X-Sender: mawilli1@mawilli1-desk2.amr.corp.intel.com To: Dmitry Torokhov cc: netdev@oss.sgi.com, Radheka Godse , fubar@us.ibm.com, bonding-devel@lists.sourceforge.net Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) In-Reply-To: <200507020030.03635.dtor_core@ameritech.net> Message-ID: References: <200507020030.03635.dtor_core@ameritech.net> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2658 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 1242 Lines: 40 On Sat, 2 Jul 2005, Dmitry Torokhov wrote: > > Couple of comments: [snip] > > + > > +static struct class *netdev_class; > > +/*--------------------------- Data Structures 
-----------------------------*/
> > +
> > +/* Bonding sysfs lock. Why can't we just use the subsytem lock?
> > + * Because kobject_register tries to acquire the subsystem lock. If
> > + * we already hold the lock (which we would if the user was creating
> > + * a new bond through the sysfs interface), we deadlock.
> > + */
> > +
> > +struct rw_semaphore bonding_rwsem;
>
> klists were just added to the kernel proper. Does this sentiment still
> holds true?

Thanks for reviewing this patch, Dmitry. We appreciate your efforts, and
we'll make the changes you pointed out.

In this case, we hold the lock on access to all bonding-owned sysfs files,
because it's possible for changes to one file to alter the contents and/or
presence of another file. Consider:

  1) process 'foo' opens /sys/class/net/bond1/mode
  2) process 'bar' opens /sys/class/net/bonding_masters
  3) process 'bar' writes to bonding_masters and removes bond1
  4) process 'foo' tries to write
  5) Boom. Or rather, oops.

Thus, we have this lock. I don't think that klists will help here.
-Mitch From mitch.a.williams@intel.com Wed Jul 6 11:55:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 11:55:45 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66ItgH9011667 for ; Wed, 6 Jul 2005 11:55:42 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j66Is4hn008811; Wed, 6 Jul 2005 18:54:04 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j66Is4qx007518; Wed, 6 Jul 2005 18:54:04 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.58]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j66IrSSL027116; Wed, 6 Jul 2005 11:53:38 -0700 Date: Wed, 6 Jul 2005 11:53:13 -0700 From: Mitch Williams X-X-Sender: mawilli1@mawilli1-desk2.amr.corp.intel.com To: Greg KH cc: Radheka Godse , netdev@oss.sgi.com Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) In-Reply-To: <20050702081346.GA20789@kroah.com> Message-ID: References: <20050702081346.GA20789@kroah.com> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2659 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 1367 Lines: 42 On Sat, 2 Jul 2005, Greg KH wrote: > > This violates the 1-value-per-sysfs file rule. Please fix this up. > Thanks for looking at our patch, Greg. We're aware of the "one value" rule, but we really couldn't find any way to do what we wanted to do any other way. 
The kernel docs do indicate that it is "socially acceptable to express
an array of values of the same type", which this certainly is.

In this particular case, the file /sys/class/net/bonding_masters contains
the names of all of the bonds in the system. By default, the module
creates a single bond when it loads, thus:

    $ cat bonding_masters
    bond0

You can add and remove bonds just by writing to the file. In keeping with
the "array of types" concept, you must write the names of all active bonds
back to the file. Thus,

    $ echo "bond0 bond1" > bonding_masters

retains bond0 and adds bond1. Likewise,

    $ echo "bond1 bond2" > bonding_masters

retains bond1, deletes bond0, and adds bond2.

The slaves file in each bond's directory acts the same way, and is used to
add or remove slaves from each individual bond.

We discussed this design extensively before implementation, but really
couldn't come up with anything as elegant or easy to understand as this
scheme. Since it really is an array of similar values, we are hoping that
it will be viewed as socially acceptable.
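
To make the write semantics above concrete, here is a minimal userspace
sketch of how the set of names written to the file could be compared
with the current set to decide which bonds to create and which to
delete. list_has() and list_diff() are hypothetical helpers written for
this illustration; they are not taken from the bonding patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name (len bytes) is one of the whitespace-separated
 * tokens in list, 0 otherwise. */
static int list_has(const char *list, const char *name, size_t len)
{
    const char *p = list;

    while (*p) {
        while (*p && isspace((unsigned char)*p))
            p++;
        const char *start = p;
        while (*p && !isspace((unsigned char)*p))
            p++;
        if (len > 0 && (size_t)(p - start) == len &&
            memcmp(start, name, len) == 0)
            return 1;
    }
    return 0;
}

/* Write into out, space separated, every token of a that is not in b.
 * Tokens that do not fit in outsz bytes are silently dropped. */
static void list_diff(const char *a, const char *b, char *out, size_t outsz)
{
    size_t used = 0;
    const char *p = a;

    assert(outsz > 0);
    out[0] = '\0';
    while (*p) {
        while (*p && isspace((unsigned char)*p))
            p++;
        const char *start = p;
        while (*p && !isspace((unsigned char)*p))
            p++;
        size_t len = (size_t)(p - start);
        if (len == 0 || list_has(b, start, len))
            continue;
        /* room for an optional separator, the token and the terminator */
        if (used + len + (used ? 1 : 0) + 1 > outsz)
            break;
        if (used)
            out[used++] = ' ';
        memcpy(out + used, start, len);
        used += len;
        out[used] = '\0';
    }
}
```

Calling list_diff(written, current, ...) gives the bonds to add and
list_diff(current, written, ...) gives the bonds to remove. A reader
that sees only part of a long list and writes it back would therefore
delete every bond it did not see.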
-Mitch From shemminger@osdl.org Wed Jul 6 12:03:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 12:03:52 -0700 (PDT) Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66J3lH9014430 for ; Wed, 6 Jul 2005 12:03:47 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j66J1vjA021193 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 6 Jul 2005 12:01:57 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j66J1vBh016852; Wed, 6 Jul 2005 12:01:57 -0700 Date: Wed, 6 Jul 2005 12:02:02 -0700 From: Stephen Hemminger To: Mitch Williams Cc: Dmitry Torokhov , netdev@oss.sgi.com, Radheka Godse , fubar@us.ibm.com, bonding-devel@lists.sourceforge.net Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: <20050706120202.6a048b14@dxpl.pdx.osdl.net> In-Reply-To: References: <200507020030.03635.dtor_core@ameritech.net> Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.5 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.111 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 2660 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1459 Lines: 43 On Wed, 6 Jul 2005 11:37:49 -0700 Mitch Williams wrote: > > > On Sat, 2 Jul 2005, Dmitry Torokhov wrote: > > > > > Couple of comments: > [snip] > > > > + > > > +static struct class *netdev_class; > > > +/*--------------------------- Data 
Structures -----------------------------*/ > > > + > > > +/* Bonding sysfs lock. Why can't we just use the subsytem lock? > > > + * Because kobject_register tries to acquire the subsystem lock. If > > > + * we already hold the lock (which we would if the user was creating > > > + * a new bond through the sysfs interface), we deadlock. > > > + */ > > > + > > > +struct rw_semaphore bonding_rwsem; > > > > klists were just added to the kernel proper. Does this sentiment still > > holds true? > > > Thanks for reviewing this patch, Dmitry. We appreciate your efforts, and > we'll make the changes you pointed out. > > In this case, we hold the lock on access to all bonding-owned sysfs files, > because it's possible for changes to one file to alter the contents and/or > presence of another file. Consider: > > 1) process 'foo' opens /sys/class/net/bond1/mode > 2) process 'bar' opens /sys/class/net/bonding_masters > 3) process 'bar' writes to bonding_masters and removes bond1 > 4) process 'foo' tries to write > 5) Boom. Or rather, oops. > > Thus, we have this lock. I don't think that klists will help here. You need to use kobject ref counting then, not the semaphore. 
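
As an illustration of the reference-counting approach, here is a minimal
single-threaded userspace sketch. struct bond_model and its helpers are
hypothetical names made up for this example; in the kernel the
corresponding operations would be kobject_get() and kobject_put(), with
an atomic counter. The point is that removing the bond only drops the
creator's reference, so a process that still has a file open keeps the
object alive until its own put.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model of a reference-counted object (not a real kobject). */
struct bond_model {
    int refcount;                           /* plain int: no concurrency here */
    void (*release)(struct bond_model *);   /* called when the count hits 0 */
};

static int released_count;  /* how many objects have been released so far */

static void bond_model_release(struct bond_model *b)
{
    released_count++;
    free(b);
}

/* Create an object holding one reference, owned by the creator. */
static struct bond_model *bond_model_create(void)
{
    struct bond_model *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->refcount = 1;
    b->release = bond_model_release;
    return b;
}

/* Take an extra reference, e.g. when a sysfs file is opened. */
static struct bond_model *bond_model_get(struct bond_model *b)
{
    assert(b->refcount > 0);
    b->refcount++;
    return b;
}

/* Drop a reference; the object is released only on the last put. */
static void bond_model_put(struct bond_model *b)
{
    assert(b->refcount > 0);
    if (--b->refcount == 0)
        b->release(b);
}
```

In the scenario from the thread, 'foo' would take a reference when it
opens bond1/mode, 'bar' would drop the creator's reference when it
removes bond1, and the memory would only be released after 'foo' is
done and drops its own reference.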
From dtor_core@ameritech.net Wed Jul 6 12:10:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 12:10:49 -0700 (PDT) Received: from web81301.mail.yahoo.com (web81301.mail.yahoo.com [206.190.37.76]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j66JAiH9016798 for ; Wed, 6 Jul 2005 12:10:45 -0700 Received: (qmail 74130 invoked by uid 60001); 6 Jul 2005 19:09:06 -0000 Message-ID: <20050706190906.74128.qmail@web81301.mail.yahoo.com> Received: from [167.167.7.254] by web81301.mail.yahoo.com via HTTP; Wed, 06 Jul 2005 12:09:06 PDT Date: Wed, 6 Jul 2005 12:09:06 -0700 (PDT) From: Dmitry Torokhov Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) To: Mitch Williams Cc: netdev@oss.sgi.com, Radheka Godse , fubar@us.ibm.com, bonding-devel@lists.sourceforge.net In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-archive-position: 2661 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dtor_core@ameritech.net Precedence: bulk X-list: netdev Content-Length: 1834 Lines: 56 Mitch Williams wrote: > On Sat, 2 Jul 2005, Dmitry Torokhov wrote: > > > > > Couple of comments: > [snip] > > > > + > > > +static struct class *netdev_class; > > > +/*--------------------------- Data Structures > -----------------------------*/ > > > + > > > +/* Bonding sysfs lock. Why can't we just use the subsytem lock? > > > + * Because kobject_register tries to acquire the subsystem lock. If > > > + * we already hold the lock (which we would if the user was creating > > > + * a new bond through the sysfs interface), we deadlock. > > > + */ > > > + > > > +struct rw_semaphore bonding_rwsem; > > > > klists were just added to the kernel proper. Does this sentiment still > > holds true? > > > Thanks for reviewing this patch, Dmitry. We appreciate your efforts, and > we'll make the changes you pointed out. 
> > In this case, we hold the lock on access to all bonding-owned sysfs files, > because it's possible for changes to one file to alter the contents and/or > presence of another file. Consider: > > 1) process 'foo' opens /sys/class/net/bond1/mode > 2) process 'bar' opens /sys/class/net/bonding_masters > 3) process 'bar' writes to bonding_masters and removes bond1 > 4) process 'foo' tries to write > 5) Boom. Or rather, oops. > > Thus, we have this lock. I don't think that klists will help here. > Semaphore will not help in scenario you described: 1) process 'bar' opens /sys/class/net/bonding_masters 2) process 'foo' opens /sys/class/net/bond1/mode 3) process 'bar' starts to write to bonding_masters and acquires semaphore 3) process 'foo' tries to write and waits for semaphore to become available 3) process 'bar' finishes writing and removes bond1 4) process 'foo' acquires the semaphore and continues with write 5) Boom. Or rather, oops. -- Dmitry From greg@kroah.com Wed Jul 6 13:21:05 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 13:21:07 -0700 (PDT) Received: from perch.kroah.org (mail.kroah.org [69.55.234.183]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66KL4H9028234 for ; Wed, 6 Jul 2005 13:21:05 -0700 Received: from [192.168.0.10] (c-24-22-115-24.hsd1.or.comcast.net [24.22.115.24]) (authenticated) by perch.kroah.org (8.11.6/8.11.6) with ESMTP id j66KJ2q31217; Wed, 6 Jul 2005 13:19:02 -0700 Received: from greg by echidna.kroah.org with local (masqmail 0.2.19) id 1DqFwe-4rB-00; Wed, 06 Jul 2005 12:52:32 -0700 Date: Wed, 6 Jul 2005 12:52:32 -0700 From: Greg KH To: Mitch Williams Cc: Radheka Godse , netdev@oss.sgi.com Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: <20050706195232.GB18359@kroah.com> References: <20050702081346.GA20789@kroah.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.8i X-archive-position: 
2662 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greg@kroah.com Precedence: bulk X-list: netdev Content-Length: 2129 Lines: 66 On Wed, Jul 06, 2005 at 11:53:13AM -0700, Mitch Williams wrote: > > > On Sat, 2 Jul 2005, Greg KH wrote: > > > > > This violates the 1-value-per-sysfs file rule. Please fix this up. > > > > Thanks for looking at our patch, Greg. We're aware of the "one value" > rule, but we really couldn't find any way to do what we wanted to do > any other way. The kernel docs do indicate that it is "socially > acceptable to express an array of values of values of the same type", > which this certainly is. > > In this particular case, the file /sys/class/net/bonding_masters contains > the names of all of the bonds in the system. By default, the module > creates a single bond when it loads, thus: > > $ cat bonding_masters > bond0 And, if you have a _lot_ of bonds, you will not show them all, right? That would not work well if you read the file, and then append a new one and write it back. > You can add and remove bonds just by writing to the file. In keeping with > the "array of types" concept, you must write the names of all active bonds > back to the file. Thus, > > $ echo "bond0 bond1" > bonding_masters > > retains bond0 and adds bond1. Likewise, > > $ echo "bond1 bond2" > bonding_masters > > retains bond1, deletes bond0, and adds bond2. > > The slaves file in each bond's directory acts the same way, and is used to > add or remove slaves from each individual bond. > > We discussed this design extensively before implementation, but really > couldn't come up with anything as elegant or easy to understand as this > scheme. Since it really is an array of similar values, we are hoping that > it will be viewed as socially acceptable. No. How about this: bond_add - write to this to add a new bond, one value only. bond_remove - write to this to remove a bond that is present. 
bonds/bond0 bonds/bond1 bonds/bond2 ... - list of bonds currently present. If you want, you could make those bondX files directories, and put other info about the individual bonds in there, if you need it (I know nothing about the bonding intrerface, sorry.) Would that work? thanks, greg k-h From szilm@presinet.com Wed Jul 6 15:29:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 15:29:05 -0700 (PDT) Received: from presinet-main.presinet.com (presinet.com [209.53.156.1]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66MT0H9004874 for ; Wed, 6 Jul 2005 15:29:00 -0700 Received: from [10.10.1.152] (10.10.1.152 [10.10.1.152]) by presinet-main.presinet.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2658.3) id 3HW05PPG; Wed, 6 Jul 2005 15:23:12 -0700 Mime-Version: 1.0 (Apple Message framework v730) Content-Transfer-Encoding: 7bit Message-Id: Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed To: netdev@oss.sgi.com From: Stuart Zilm Subject: policy routing inconsistency Date: Wed, 6 Jul 2005 15:27:19 -0700 X-Mailer: Apple Mail (2.730) X-archive-position: 2663 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: szilm@presinet.com Precedence: bulk X-list: netdev Content-Length: 1382 Lines: 39 While trying to do some policy routing recently I discovered an inconsistency in the behavior that selects the source address by route for locally generated outgoing packets. It seems that while routing occurs through the full policy database (rules and routes), the routes source address is always looked up in the main routing table. 
For example: DST=10.10.1.1 SRC=10.10.1.2 ip address add $SRC dev eth0 # this is a secondary address on the interface # this works - the source selected is $SRC ip route add $DST dev eth0 src $SRC # implicit table main # this fails - the source selected is chosen from main ip route del $DST dev eth0 src $SRC # implicit table main - NOTE: if this route remains, this source address will be chosen (from table main!) ip route add $SRC dev eth0 src $SRC table 1 ip rule add fwmark 1 table 1 iptables -t mangle -A OUTPUT -d $DST -j MARK --set-mark 1 I expected my source to come from the route that matches and routes my packets. Instead, it seems like there is a separate lookup done on table main directly to select the source. The behavior is the same on linux 2.4.30 and 2.6.8 kernels. Is this done intentionally? What I hoped to achieve was the ability to have two routes to the same host, using different source addresses and select routes based on packet marks. Is that possible? Stuart Zilm PresiNET Systems From mwallis@serialmonkey.com Wed Jul 6 16:36:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 16:36:31 -0700 (PDT) Received: from mail.qvalent.com (qvfw1.qvalent.com [202.7.65.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j66NaMH9013604 for ; Wed, 6 Jul 2005 16:36:25 -0700 Received: from qvwallis (unknown [192.168.40.150]) by mail.qvalent.com (Postfix) with ESMTP id DCECA53 for ; Thu, 7 Jul 2005 10:23:28 +1000 (EST) From: "Mark Wallis" To: "'NetDev'" Subject: FW: ieee80211 patches Date: Thu, 7 Jul 2005 09:30:34 +1000 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.6353 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527 Thread-Index: AcWBG9pEhVptwaQXReSIqwktGFpe4gBZsb5g Message-Id: <20050707002328.DCECA53@mail.qvalent.com> X-archive-position: 2664 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com
X-original-sender: mwallis@serialmonkey.com
Precedence: bulk
X-list: netdev
Content-Length: 938
Lines: 33

Hi everyone,

On 28/06/2005, at 10:23 PM, Jiri Benc wrote:

Our patches against latest ieee80211 branch of netdev tree can be found at
http://forge.novell.com/modules/xfmod/cvs/cvsbrowse.php/ieee80211/patches-upstream/
(it is possible to download a tarball from this link too).

Should we be expecting to see these patches in the net-dev GIT repository
anytime soon? We are basing our new Ralink driver off the ieee80211 common
stack but haven't seen any commits in there for a while (even though
patches have been posted here on net-dev).

Is there a more appropriate way for us to be keeping up with the latest
ieee80211 stack other than the net-dev GIT repository? Perhaps another
repository we are not aware of?

I apologise if this is a silly question and there is another repository
that exists that we just don't know of.

----
Regards,
Mark Wallis
rt2x00 Project Leader
mwallis@serialmonkey.com
http://rt2x00.serialmonkey.com

From zdenek@rcn.com Wed Jul 6 19:55:16 2005
Received: with ECARTIS (v1.0.0; list netdev); Wed, 06 Jul 2005 19:55:22 -0700 (PDT)
Received: from smtp05.mrf.mail.rcn.net (smtp05.mrf.mail.rcn.net [207.172.4.64])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j672tFH9026492
	for ; Wed, 6 Jul 2005 19:55:15 -0700
Received: from 209-150-50-115.c3-0.nwt-ubr3.sbo-nwt.ma.cable.rcn.com (HELO funex) (209.150.50.115)
	by smtp05.mrf.mail.rcn.net with SMTP; 06 Jul 2005 22:53:37 -0400
Message-Id: <3u3gb7$1o5sb5@smtp05.mrf.mail.rcn.net>
X-IronPort-AV: i="3.93,267,1115006400"; d="scan'208"; a="58913125:sNHT23007186"
X-Sender: zdenek@pop.rcn.com
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.0
Date: Wed, 06 Jul 2005 22:46:33 -0400
To: Henrik Nordstrom
From: Zdenek Radouch
Subject: Re: controlling ARP Proxy scope?
Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: References: <3u3gb7$1npg80@smtp05.mrf.mail.rcn.net> <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1mhk2i@smtp05.mrf.mail.rcn.net> <3u3gb7$1no73u@smtp05.mrf.mail.rcn.net> <3u3gb7$1npg80@smtp05.mrf.mail.rcn.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-archive-position: 2665 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zdenek@rcn.com Precedence: bulk X-list: netdev Content-Length: 3279 Lines: 69 At 05:32 PM 7/6/05 +0200, Henrik Nordstrom wrote: >On Wed, 6 Jul 2005, Zdenek Radouch wrote: > >> Well, how do I tell it that I want to proxy for all machines on the >> 192.168.13.128/29 net attached to eth0.5, but not for any of >> the machines on 192.168.2.0/24 attached to eth0.6 ? > >For whom going where? The setup is actually quite complex. I have a linear array of machines communicating over a proprietary (L1 and L2) protocol. The machines are connected point-to-point, and they use a proprietary VLAN protocol below IP in order to be able to control bandwidth (QoS). One of these VLANs (vsc1) is meant to provide external access. For purposes of redundant access, each node has two external addresses Ai and Bi on vsc1, set up as aliases vsc1:0 and vsc1:1. The two addresses are meant to allow access from the left end of the array (Ai) and from the right end of the array (Bi), so even in case of a line failure one could access all of the nodes. Only the two edge nodes (left-most and right-most) are connected to the [outside] customer network - they are connected via Ethernet (specifically 802.1q vlans) interfaces eth0.1 or eth0.2. Additionally, each node has multiple CPUs communicating on a private network (Ethernet/802.1q) via interface eth0.5 or eth0.6. 
You can ignore the point-to-point nature of the array interconnect - I
have designed an L2 layer that hides this, making it appear as a
bus-style network except there is no ARP (it is not needed).

So in the left-most node for example, I have the following interfaces:

  eth0.1   A1/32        // connection from the customer LAN to the array (left access)
  eth0.5   Private1/29  // private intra-node interconnect
  vsc0     Private2/28  // private array interconnect
  vsc1:0   A1/28        // public access to the array (from the left)
  vsc1:1   B1/28        // public access to the array (from the right)

The routing table has, in addition to the obvious, a default route via
eth0.1 to an address Ax (router attached to the left-most node).
[It is really supposed to have two rule-based routes to be able to
return the packet in the direction it came, but for some reason the
ip rule command does not work, and I have not had a chance to debug
that yet].

The purpose of the proxy ARP is to proxy for the ARP-less nodes (on vsc1)
hidden behind this node, when they are accessed from the left, i.e.,
using the Ai addresses, via eth0.1 which is the only public interface
here. So I turned on the proxy ARP on eth0.1.

My question was, if the proxy ARP is based on the routing table, then
how do I tell it that I want to proxy only for the Ai addresses, and
not for the Bi or any of the private addresses?

And the problem I was observing is that this node proxies for the Ax
address, when the requests for it are seen on eth0.1. These are
legitimate requests of nodes out there trying to talk to the router,
not to my array.

>
>I only experienced problems when there was other IP networks on the same
>Ethernet but not known to the box. In this case I had to enable the
>arp_ignore sysctl to stop answering "other" ARP queries for networks not
>defined on the same Ethernet interface.

This may actually be my problem. The Ax address the proxy ARP wrongly
answers is not part of the eth0.1 subnet.
Regards -Zdenek From linville@bilbo.tuxdriver.com Thu Jul 7 07:27:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 07:27:38 -0700 (PDT) Received: from ra.tuxdriver.com (ra.tuxdriver.com [24.172.12.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67ERXH9009991 for ; Thu, 7 Jul 2005 07:27:34 -0700 Received: from bilbo.tuxdriver.com (azure.tuxdriver.com [24.172.12.5]) by ra.tuxdriver.com (8.13.3/8.13.3) with ESMTP id j67EMBOI021618 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 7 Jul 2005 10:22:12 -0400 Received: from bilbo.tuxdriver.com (localhost.localdomain [127.0.0.1]) by bilbo.tuxdriver.com (8.13.1/8.13.1) with ESMTP id j67EPmPM010078; Thu, 7 Jul 2005 10:25:48 -0400 Received: (from linville@localhost) by bilbo.tuxdriver.com (8.13.1/8.13.1/Submit) id j67EPmZS010077; Thu, 7 Jul 2005 10:25:48 -0400 Date: Thu, 7 Jul 2005 10:25:48 -0400 From: "John W. Linville" To: Greg KH Cc: Mitch Williams , Radheka Godse , netdev@oss.sgi.com Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: <20050707142544.GA9418@tuxdriver.com> Mail-Followup-To: Greg KH , Mitch Williams , Radheka Godse , netdev@oss.sgi.com References: <20050702081346.GA20789@kroah.com> <20050706195232.GB18359@kroah.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050706195232.GB18359@kroah.com> User-Agent: Mutt/1.4.1i X-archive-position: 2667 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linville@tuxdriver.com Precedence: bulk X-list: netdev Content-Length: 887 Lines: 30 On Wed, Jul 06, 2005 at 12:52:32PM -0700, Greg KH wrote: > On Wed, Jul 06, 2005 at 11:53:13AM -0700, Mitch Williams wrote: > > scheme. Since it really is an array of similar values, we are hoping that > > it will be viewed as socially acceptable. > > No. 
> > How about this: > bond_add - write to this to add a new bond, one value only. > bond_remove - write to this to remove a bond that is present. > bonds/bond0 > bonds/bond1 > bonds/bond2 > ... > - list of bonds currently present. If you want, you > could make those bondX files directories, and put > other info about the individual bonds in there, if you > need it (I know nothing about the bonding intrerface, > sorry.) > > Would that work? I like that suggestion. It keeps the interface creation/deletion a little more independent of each other. John -- John W. Linville linville@tuxdriver.com From webmaster@rapidforum.com Thu Jul 7 08:54:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 08:54:29 -0700 (PDT) Received: from rapidforum.com (www.rapidforum.com [80.237.244.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j67FsPH9022523 for ; Thu, 7 Jul 2005 08:54:26 -0700 Received: (qmail 24479 invoked by uid 1004); 7 Jul 2005 15:52:47 -0000 Received: from p549f0a42.dip0.t-ipconnect.de (HELO ?84.159.10.66?) (dragony@84.159.10.66) by www.rapidforum.com with SMTP; 7 Jul 2005 15:52:47 -0000 Message-ID: <42CD4F8F.9060503@rapidforum.com> Date: Thu, 07 Jul 2005 17:51:43 +0200 From: Christian Schmid User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050414 X-Accept-Language: de, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: Bug still exists... Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2668 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webmaster@rapidforum.com Precedence: bulk X-list: netdev Content-Length: 484 Lines: 9 Hi. The vm slow-down bug starting at 4000 sockets still exists. I now have finally traced the issue down. It depends on the number of sockets AND the bandwidth you need. So I suppose it depends on the number of packets flowing through the NIC. 
I managed to completely remove the bug by using two cards and the nexthop-parameter to use eth0 and eth1 with the same gateway-ip. Maybe interrupt-problems? Slow-down appears on intel and broadcom cards. Other cards not tested. Chris From akepner@sgi.com Thu Jul 7 13:20:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 13:20:21 -0700 (PDT) Received: from omx1.americas.sgi.com (omx1-ext.sgi.com [192.48.179.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67KKAH9002865 for ; Thu, 7 Jul 2005 13:20:11 -0700 Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by omx1.americas.sgi.com (8.12.10/8.12.9/linux-outbound_gateway-1.1) with ESMTP id j67KIXxT027098 for ; Thu, 7 Jul 2005 15:18:33 -0500 Received: from [192.168.2.20] (mtv-vpn-sw-corp-0-42.corp.sgi.com [134.15.0.42]) by cthulhu.engr.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id j67KIUdP45231961; Thu, 7 Jul 2005 13:18:31 -0700 (PDT) Date: Thu, 7 Jul 2005 13:14:26 -0700 (PDT) From: Arthur Kepner X-X-Sender: akepner@resonance.WorkGroup To: Herbert Xu cc: netdev@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [RFC/PATCH] "safer ipv4 reassembly" (fwd) Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="32512-1443077357-1120766375=:24321" Content-ID: X-archive-position: 2669 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akepner@sgi.com Precedence: bulk X-list: netdev Content-Length: 18215 Lines: 301 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --32512-1443077357-1120766375=:24321 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII Content-ID: Pine.LNX.4.61.0507071300211.24321@resonance.WorkGroup Version 2 of the rfc/patch is attached. It has been changed as indicated in the commentary below. 
Diffstat: include/linux/sysctl.h | 1 net/ipv4/ip_fragment.c | 195 +++++++++++++++++++++++++++++++++++++++++++++ net/ipv4/sysctl_net_ipv4.c | 11 ++ Signed-off-by: Arthur Kepner On Tue, 28 Jun 2005, Arthur Kepner wrote: > > On Sun, 26 Jun 2005, Herbert Xu wrote: > > > > > > +struct ipc { > > > ...... > > > + struct rcu_head rcu; > > > > Is RCU worth it here? The only time we'd be taking the locks on this > > is when the first fragment of a packet comes in. At that point we'll > > be taking write_lock(&ipfrag_lock) anyway. > > Yes, I think rcu is worth it here. The reason is that to not use rcu would necessitate grabbing the (global) ipfrag_lock an additional time, when we free an ipc. Adding an "ipc" to the hashtable could be done under the ipfrag_lock, as you mention. But removing an ipc shouldn't be done at the same time that fragments are destroyed, because the common case is that another fragment queue will soon be created for the same (src,dst,proto). Better to save the ipc for a while to avoid freeing and then immediately recreating it. Since the freeing of the ipc has to be deferred until well after the last associated fragment queue has been freed, we can't take advantage of the fact that the ipfrag_lock is held when the fragment queue is freed. So when finally freeing the ipc, we can either grab the global ipfrag_lock again, or use some other, finer-grained lock to protect the ipc_hash entries. I'd prefer to avoid introducing new uses of global locks. If we use the finer-grained ipc_hash[].lock locks then rcu allows us to avoid taking any locks in ipc_find when we create a new fragment chain and there already happens to be an ipc for the associated (src,dst,proto). (I suspect this would be a fairly common case.) > > The only other use of RCU in your patch is ip_count. That should be > > changed to be done in ip_defrag instead. At that point you can simply > > find the ipc by deferencing ipq, so no need for __ipc_find and hence > > RCU. 
> > > > The reason you need to change it in this way is because you can't make > > assumptions about ip_rcv_finish being the first place where a packet > > is defragmented. With connection tracking enabled conntrack is the first > > place where defragmentation occurs. > > > ..... This has been fixed. ip_input.c isn't changed by this version of the patch. But there's the caveat that I mentioned earlier: > > There is a (big) advantage to doing this in ip_defrag() - this > becomes a no-op for non-fragmented datagrams. The disadvantage > is that there could be a situation where you receive: > > 1) first fragment of datagram X [for a particular (src,dst,proto)] > 2) a zillion non-fragmented datagrams [for the same (src,dst,proto)] > 3) last fragment of datagram X [for (src,dst,proto)] > > and no "disorder" would be detected for the datagrams associated > with (src,dst,proto), even though the ip id could have wrapped in the > meantime. This seems like a very uncommon case, however. > > > > > +#define IPC_HASHSZ IPQ_HASHSZ > > > +static struct { > > > + struct hlist_head head; > > > + spinlock_t lock; > > > +} ipc_hash[IPC_HASHSZ]; > > > > I'd store ipc entries in the main ipq hash table since they can use > > the same keys for lookup as ipq entries. You just need to set protocol > > to zero and map the user to values specific to ipc for ipc entries. > > One mapping would be to set the top bit of user for ipc entries, e.g. > > > > #define IP_DEFRAG_IPC 0x80000000 > > ipc->user = ipq->user | IP_DEFRAG_IPC; > > > > Of course you also need to make sure that the two structures share > > the leading elements. You can then use the user field to distinguish > > between ipc/ipq entries. > I thought about this point, but I dislike reusing the same structure for such different purposes, so left this unchanged. Comments? 
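For anyone reading without a MIME decoder at hand: the attached patch appears to compare a per-(src,dst,proto) counter against a window that starts at the value recorded when the fragment queue was created. A minimal userspace sketch of such a window test follows. It uses unsigned arithmetic so that counter wraparound is well defined; the name `seq_in_window` and the `uint32_t` types are assumptions for illustration, not the patch's exact code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if seq lies in [bottom, bottom + size), treating all values
 * as 32-bit counters that may wrap. Unsigned subtraction is performed
 * modulo 2^32, so (seq - bottom) is the forward distance from bottom
 * to seq even when seq has wrapped past zero.
 */
static int seq_in_window(uint32_t bottom, uint32_t size, uint32_t seq)
{
    return (uint32_t)(seq - bottom) < size;
}

/* A few sanity checks, including a window that straddles the wrap. */
static void seq_in_window_selfcheck(void)
{
    assert(seq_in_window(10, 5, 10));            /* bottom is inside  */
    assert(!seq_in_window(10, 5, 15));           /* top is outside    */
    assert(seq_in_window(0xFFFFFFFEu, 5, 1));    /* wrapped, inside   */
}
```

A fragment queue whose counter falls outside the window would be treated as stale and dropped, which is the "disorder" case discussed above.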
-- Arthur --32512-1443077357-1120766375=:24321 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME="ip_reasm.patch.2" Content-Transfer-Encoding: BASE64 Content-ID: Pine.LNX.4.61.0507071259350.24321@resonance.WorkGroup Content-Description: ip_reasm.patch.2 Content-Disposition: ATTACHMENT; FILENAME="ip_reasm.patch.2" ZGlmZiAtcnVwIGxpbnV4Lm9yaWcvaW5jbHVkZS9saW51eC9zeXNjdGwuaCBsaW51eC9pbmNsdWRl L2xpbnV4L3N5c2N0bC5oDQotLS0gbGludXgub3JpZy9pbmNsdWRlL2xpbnV4L3N5c2N0bC5oCTIw MDUtMDctMDYgMTI6MzI6MDMuMjI0NTQ2OTUzIC0wNzAwDQorKysgbGludXgvaW5jbHVkZS9saW51 eC9zeXNjdGwuaAkyMDA1LTA3LTA3IDA5OjU2OjQyLjg1NDg0NTA4OSAtMDcwMA0KQEAgLTM0Myw2 ICszNDMsNyBAQCBlbnVtDQogCU5FVF9UQ1BfQklDX0JFVEE9MTA4LA0KIAlORVRfSVBWNF9JQ01Q X0VSUk9SU19VU0VfSU5CT1VORF9JRkFERFI9MTA5LA0KIAlORVRfVENQX0NPTkdfQ09OVFJPTD0x MTAsDQorCU5FVF9JUFY0X1JFQVNNX0NPVU5UPTExMSwNCiB9Ow0KIA0KIGVudW0gew0KZGlmZiAt cnVwIGxpbnV4Lm9yaWcvbmV0L2lwdjQvaXBfZnJhZ21lbnQuYyBsaW51eC9uZXQvaXB2NC9pcF9m cmFnbWVudC5jDQotLS0gbGludXgub3JpZy9uZXQvaXB2NC9pcF9mcmFnbWVudC5jCTIwMDUtMDct MDYgMTI6MzA6NTEuMDMzMzgwODMwIC0wNzAwDQorKysgbGludXgvbmV0L2lwdjQvaXBfZnJhZ21l bnQuYwkyMDA1LTA3LTA3IDA5OjU2OjQyLjg1Njc5ODIzNCAtMDcwMA0KQEAgLTU2LDYgKzU2LDgg QEANCiBpbnQgc3lzY3RsX2lwZnJhZ19oaWdoX3RocmVzaCA9IDI1NioxMDI0Ow0KIGludCBzeXNj dGxfaXBmcmFnX2xvd190aHJlc2ggPSAxOTIqMTAyNDsNCiANCitpbnQgc3lzY3RsX2lwX3JlYXNz ZW1ibHlfY291bnQ7DQorDQogLyogSW1wb3J0YW50IE5PVEUhIEZyYWdtZW50IHF1ZXVlIG11c3Qg YmUgZGVzdHJveWVkIGJlZm9yZSBNU0wgZXhwaXJlcy4NCiAgKiBSRkM3OTEgaXMgd3JvbmcgcHJv cG9zaW5nIHRvIHByb2xvbmdhdGUgdGltZXIgZWFjaCBmcmFnbWVudCBhcnJpdmFsIGJ5IFRUTC4N CiAgKi8NCkBAIC02OSw2ICs3MSwyNSBAQCBzdHJ1Y3QgaXBmcmFnX3NrYl9jYg0KIA0KICNkZWZp bmUgRlJBR19DQihza2IpCSgoc3RydWN0IGlwZnJhZ19za2JfY2IqKSgoc2tiKS0+Y2IpKQ0KIA0K Ky8qIHN0cnVjdCBpcGMgY29udGFpbnMgYSBjb3VudCBvZiB0aGUgbnVtYmVyIG9mIElQIGRhdGFn cmFtcyANCisgKiByZWNlaXZlZCBmb3IgYSAoc2FkZHIsIGRhZGRyLCBwcm90b2NvbCkgdHVwbGUg LSBidXQgb25lIG9mIA0KKyAqIHRoZXNlIHN0cnVjdHVyZXMgZXhpc3RzIGZvciBhIGdpdmVuIChz 
YWRkciwgZGFkZHIsIHByb3RvY29sKSANCisgKiBpZiBhbmQgb25seSBpZiB0aGVyZSBpcyBhIHF1 ZXVlIG9mIElQIGZyYWdtZW50cyBhc3NvY2lhdGVkIA0KKyAqIHdpdGggdGhhdCAzLXR1cGxlIGFu ZCBzeXNjdGxfaXBfcmVhc3NlbWJseV9jb3VudCBpcyBub24temVyby4NCisgKi8NCitzdHJ1Y3Qg aXBjIHsNCisJc3RydWN0IGhsaXN0X25vZGUJbm9kZTsNCisJdTMyCQkJc2FkZHI7DQorCXUzMgkJ CWRhZGRyOw0KKwl1OAkJCXByb3RvY29sOw0KKwlhdG9taWNfdAkJcmVmY250OwkvKiBob3cgbWFu eSBpcHFzIGhvbGQgcmVmcyB0byB1cyAqLw0KKwlhdG9taWNfdAkJc2VxOwkvKiBob3cgbWFueSBp cCBmcmFnbWVudHMgZm9yIHRoaXMgDQorCQkJCQkgKiAoc2FkZHIsZGFkZHIscHJvdG9jb2wpIHNp bmNlIHdlIA0KKwkJCQkJICogd2VyZSBjcmVhdGVkICovDQorCXN0cnVjdCB0aW1lcl9saXN0CXRp bWVyOw0KKwlzdHJ1Y3QgcmN1X2hlYWQJCXJjdTsNCit9Ow0KKw0KIC8qIERlc2NyaWJlIGFuIGVu dHJ5IGluIHRoZSAiaW5jb21wbGV0ZSBkYXRhZ3JhbXMiIHF1ZXVlLiAqLw0KIHN0cnVjdCBpcHEg ew0KIAlzdHJ1Y3QgaXBxCSpuZXh0OwkJLyogbGlua2VkIGxpc3QgcG9pbnRlcnMJCQkqLw0KQEAg LTkyLDYgKzExMywxNCBAQCBzdHJ1Y3QgaXBxIHsNCiAJc3RydWN0IGlwcQkqKnBwcmV2Ow0KIAlp bnQJCWlpZjsNCiAJc3RydWN0IHRpbWV2YWwJc3RhbXA7DQorCXN0cnVjdCBpcGMJKmlwYzsNCisJ YXRvbWljX3QJc2VxOwkJDQorCS8qIGlwcS0+c2VxIGRlZmluZXMgdGhlICJib3R0b20iIG9mIHRo ZSB3aW5kb3cgb2Ygc2VxdWVuY2UgbnVtYmVycyANCisJICogdGhhdCBhcmUgdmFsaWQgZm9yIHRo aXMgZnJhZ21lbnQgLSB0aGUgInRvcCIgb2YgdGhlIHdpbmRvdyBpcyANCisJICogKGlwcS0+c2Vx ICsgc3lzY3RsX2lwX3JlYXNzZW1ibHlfY291bnQpLiBpcHEtPnNlcSBpcyBpbml0aWFsaXplZA0K KwkgKiB0byB0aGUgdmFsdWUgaW4gdGhlIGFzc29jaWF0ZWQgaXBjIHdoZW4gdGhlIGZyYWdtZW50 IHF1ZXVlIGlzIA0KKwkgKiBjcmVhdGVkLCBhbmQgaW5jcmVtZW50ZWQgZWFjaCB0aW1lIGEgZnJh Z21lbnQgaXMgYWRkZWQgdG8gdGhlIA0KKwkgKiBxdWV1ZSAqLw0KIH07DQogDQogLyogSGFzaCB0 YWJsZS4gKi8NCkBAIC0xMDUsNiArMTM0LDEyIEBAIHN0YXRpYyB1MzIgaXBmcmFnX2hhc2hfcm5k Ow0KIHN0YXRpYyBMSVNUX0hFQUQoaXBxX2xydV9saXN0KTsNCiBpbnQgaXBfZnJhZ19ucXVldWVz ID0gMDsNCiANCisjZGVmaW5lIElQQ19IQVNIU1oJSVBRX0hBU0hTWg0KK3N0YXRpYyBzdHJ1Y3Qg ew0KKwlzdHJ1Y3QgaGxpc3RfaGVhZCBoZWFkOw0KKwlzcGlubG9ja190IGxvY2s7DQorfSBpcGNf aGFzaFtJUENfSEFTSFNaXTsNCisNCiBzdGF0aWMgX19pbmxpbmVfXyB2b2lkIF9faXBxX3VubGlu 
ayhzdHJ1Y3QgaXBxICpxcCkNCiB7DQogCWlmKHFwLT5uZXh0KQ0KQEAgLTEyMSw2ICsxNTYsMTEg QEAgc3RhdGljIF9faW5saW5lX18gdm9pZCBpcHFfdW5saW5rKHN0cnVjdA0KIAl3cml0ZV91bmxv Y2soJmlwZnJhZ19sb2NrKTsNCiB9DQogDQorc3RhdGljIHVuc2lnbmVkIGludCBpcGNoYXNoZm4o dTMyIHNhZGRyLCB1MzIgZGFkZHIsIHU4IHByb3QpDQorew0KKwlyZXR1cm4gamhhc2hfM3dvcmRz KHByb3QsIHNhZGRyLCBkYWRkciwgMCkgJiAoSVBDX0hBU0hTWiAtIDEpOw0KK30NCisNCiBzdGF0 aWMgdW5zaWduZWQgaW50IGlwcWhhc2hmbih1MTYgaWQsIHUzMiBzYWRkciwgdTMyIGRhZGRyLCB1 OCBwcm90KQ0KIHsNCiAJcmV0dXJuIGpoYXNoXzN3b3JkcygodTMyKWlkIDw8IDE2IHwgcHJvdCwg c2FkZHIsIGRhZGRyLA0KQEAgLTIzMSw4ICsyNzEsMTYgQEAgc3RhdGljIF9faW5saW5lX18gdm9p ZCBpcHFfcHV0KHN0cnVjdCBpcA0KICAqLw0KIHN0YXRpYyB2b2lkIGlwcV9raWxsKHN0cnVjdCBp cHEgKmlwcSkNCiB7DQorCXN0cnVjdCBpcGMgKmNwID0gaXBxLT5pcGM7DQorDQogCWlmIChkZWxf dGltZXIoJmlwcS0+dGltZXIpKQ0KIAkJYXRvbWljX2RlYygmaXBxLT5yZWZjbnQpOw0KKwlpZiAo Y3ApIHsNCisJCWF0b21pY19kZWMoJmNwLT5yZWZjbnQpOw0KKwkJLyogbm8gcGFydGljdWxhciBy ZWFzb24gdG8gdXNlIHN5c2N0bF9pcGZyYWdfdGltZSANCisJCSAqIGZvciB0aGlzIHRpbWVyICov DQorCQltb2RfdGltZXIoJmNwLT50aW1lciwgamlmZmllcyArIHN5c2N0bF9pcGZyYWdfdGltZSk7 DQorCX0NCiANCiAJaWYgKCEoaXBxLT5sYXN0X2luICYgQ09NUExFVEUpKSB7DQogCQlpcHFfdW5s aW5rKGlwcSk7DQpAQCAtMzQ4LDEwICszOTYsMTEwIEBAIHN0YXRpYyBzdHJ1Y3QgaXBxICppcF9m cmFnX2ludGVybih1bnNpZ24NCiAJcmV0dXJuIHFwOw0KIH0NCiANCitzdGF0aWMgaW5saW5lIHZv aWQgX19pcGNfZGVzdHJveShzdHJ1Y3QgcmN1X2hlYWQgKmhlYWQpDQorew0KKwlrZnJlZShjb250 YWluZXJfb2YoaGVhZCwgc3RydWN0IGlwYywgcmN1KSk7DQorfQ0KKw0KK3N0YXRpYyB2b2lkIGlw Y19kZXN0cm95KHVuc2lnbmVkIGxvbmcgYXJnKSANCit7DQorCXN0cnVjdCBpcGMgKmNwID0gKHN0 cnVjdCBpcGMgKikgYXJnOw0KKwl1bnNpZ25lZCBpbnQgaGFzaCA9IGlwY2hhc2hmbihjcC0+c2Fk ZHIsIGNwLT5kYWRkciwgY3AtPnByb3RvY29sKTsNCisNCisJc3Bpbl9sb2NrKCZpcGNfaGFzaFto YXNoXS5sb2NrKTsNCisJQlVHX09OKChhdG9taWNfcmVhZCgmY3AtPnJlZmNudCkpIDwgMCk7DQor DQorCWlmIChhdG9taWNfcmVhZCgmY3AtPnJlZmNudCkgPT0gMCkgew0KKwkJaGxpc3RfZGVsX3Jj dSgmY3AtPm5vZGUpOw0KKwkJY2FsbF9yY3UoJmNwLT5yY3UsIF9faXBjX2Rlc3Ryb3kpOw0KKwl9 
DQorCXNwaW5fdW5sb2NrKCZpcGNfaGFzaFtoYXNoXS5sb2NrKTsNCit9DQorDQorLyogDQorICog bXVzdCBob2xkIHNwaW5sb2NrIGZvciB0aGUgYXBwcm9wcmlhdGUgaGFzaCBsaXN0IGhlYWQgd2hl biANCisgKiBfX2lwY19jcmVhdGUgaXMgY2FsbGVkIA0KKyAqLw0KKw0KK3N0YXRpYyBpbmxpbmUg c3RydWN0IGlwYyAqX19pcGNfY3JlYXRlKHN0cnVjdCBpcGhkciAqaXBoLCANCisJCQkJICAgICAg IGNvbnN0IHVuc2lnbmVkIGludCBoYXNoKSANCit7DQorCXN0cnVjdCBpcGMgKmNwID0ga21hbGxv YyhzaXplb2Yoc3RydWN0IGlwYyksIEdGUF9BVE9NSUMpOw0KKwkvKiBYWFggc2hvdWxkIHdlIGFj Y291bnQgc2l6ZSB0byBpcF9mcmFnX21lbSA/Pz8gKi8NCisJaWYgKGNwKSB7DQorCQljcC0+c2Fk ZHIgPSBpcGgtPnNhZGRyOw0KKwkJY3AtPmRhZGRyID0gaXBoLT5kYWRkcjsNCisJCWNwLT5wcm90 b2NvbCA9IGlwaC0+cHJvdG9jb2w7DQorCQlhdG9taWNfc2V0KCZjcC0+c2VxLCAwKTsNCisJCWF0 b21pY19zZXQoJmNwLT5yZWZjbnQsIDEpOw0KKwkJSU5JVF9ITElTVF9OT0RFKCZjcC0+bm9kZSk7 DQorCQlobGlzdF9hZGRfaGVhZF9yY3UoJmNwLT5ub2RlLCAmaXBjX2hhc2hbaGFzaF0uaGVhZCk7 DQorCQlpbml0X3RpbWVyKCZjcC0+dGltZXIpOw0KKwkJY3AtPnRpbWVyLmRhdGEgPSAodW5zaWdu ZWQgbG9uZykgY3A7DQorCQljcC0+dGltZXIuZnVuY3Rpb24gPSBpcGNfZGVzdHJveTsNCisJfSBl bHNlIHsNCisJCU5FVERFQlVHKGlmIChuZXRfcmF0ZWxpbWl0KCkpIA0KKwkJCXByaW50ayhLRVJO X0VSUiAiX19pcGNfY3JlYXRlOiBubyBtZW1vcnkgbGVmdCAhXG4iKSk7DQorCX0NCisJcmV0dXJu IGNwOw0KK30NCisNCisvKiANCisgKiBtdXN0IGJlICJyY3Ugc2FmZSIgd2hlbiBfX2lwY19maW5k IGlzIGNhbGxlZCAtIGVpdGhlciB1c2UgDQorICogcmN1X3JlYWRfbG9jayAoaWYgeW91IGludGVu ZCBvbmx5IHRvIHJlYWQgdGhlIHJldHVybmVkIHN0cnVjdCkgDQorICogb3IgZ3JhYiB0aGUgc3Bp bmxvY2sgZm9yIHRoZSBhcHByb3ByaWF0ZSBoYXNoIGxpc3QgaGVhZCAoaWYgDQorICogeW91IG1p Z2h0IG1vZGlmeSB0aGUgcmV0dXJuZWQgc3RydWN0KSANCisgKi8NCitzdGF0aWMgaW5saW5lIHN0 cnVjdCBpcGMgKl9faXBjX2ZpbmQodTMyIHNhZGRyLCB1MzIgZGFkZHIsIHU4IHByb3RvY29sLCAN CisJCQkJICAgICBjb25zdCB1bnNpZ25lZCBpbnQgaGFzaCkNCit7DQorCXN0cnVjdCBobGlzdF9u b2RlICpwOw0KKw0KKwlobGlzdF9mb3JfZWFjaF9yY3UocCwgJmlwY19oYXNoW2hhc2hdLmhlYWQp IHsNCisJCXN0cnVjdCBpcGMgKiBjcCA9IChzdHJ1Y3QgaXBjICopcDsNCisJCWlmKGNwLT5zYWRk ciA9PSBzYWRkciAmJg0KKwkJICAgY3AtPmRhZGRyID09IGRhZGRyICYmDQorCQkgICBjcC0+cHJv 
dG9jb2wgPT0gcHJvdG9jb2wpIHsNCisJCQlyZXR1cm4gY3A7DQorCQl9DQorCX0NCisJcmV0dXJu IE5VTEw7DQorfQ0KKw0KK3N0YXRpYyBzdHJ1Y3QgaXBjICppcGNfZmluZChzdHJ1Y3QgaXBoZHIg KmlwaCkNCit7DQorCXN0cnVjdCBpcGMgKmNwOw0KKwl1bnNpZ25lZCBpbnQgaGFzaCA9IGlwY2hh c2hmbihpcGgtPnNhZGRyLCBpcGgtPmRhZGRyLCBpcGgtPnByb3RvY29sKTsNCisNCisJcmN1X3Jl YWRfbG9jaygpOw0KKwlpZigoY3AgPSBfX2lwY19maW5kKGlwaC0+c2FkZHIsIGlwaC0+ZGFkZHIs IA0KKwkJCSAgICBpcGgtPnByb3RvY29sLCBoYXNoKSkgIT0gTlVMTCkgew0KKwkJYXRvbWljX2lu YygmY3AtPnJlZmNudCk7DQorCQlyY3VfcmVhZF91bmxvY2soKTsNCisJCXJldHVybiBjcDsNCisJ fQ0KKwlyY3VfcmVhZF91bmxvY2soKTsNCisJc3Bpbl9sb2NrKCZpcGNfaGFzaFtoYXNoXS5sb2Nr KTsNCisJaWYoKGNwID0gX19pcGNfZmluZChpcGgtPnNhZGRyLCBpcGgtPmRhZGRyLCANCisJCQkg ICAgaXBoLT5wcm90b2NvbCwgaGFzaCkpICE9IE5VTEwpIHsNCisJCWF0b21pY19pbmMoJmNwLT5y ZWZjbnQpOw0KKwkJc3Bpbl91bmxvY2soJmlwY19oYXNoW2hhc2hdLmxvY2spOw0KKwkJcmV0dXJu IGNwOw0KKwl9DQorCWNwID0gX19pcGNfY3JlYXRlKGlwaCwgaGFzaCk7DQorCXNwaW5fdW5sb2Nr KCZpcGNfaGFzaFtoYXNoXS5sb2NrKTsNCisJcmV0dXJuIGNwOw0KK30NCisNCisNCiAvKiBBZGQg YW4gZW50cnkgdG8gdGhlICdpcHEnIHF1ZXVlIGZvciBhIG5ld2x5IHJlY2VpdmVkIElQIGRhdGFn cmFtLiAqLw0KIHN0YXRpYyBzdHJ1Y3QgaXBxICppcF9mcmFnX2NyZWF0ZSh1bnNpZ25lZCBoYXNo LCBzdHJ1Y3QgaXBoZHIgKmlwaCwgdTMyIHVzZXIpDQogew0KIAlzdHJ1Y3QgaXBxICpxcDsNCisJ c3RydWN0IGlwYyAqY3AgPSBOVUxMOw0KKw0KKwlpZiAoc3lzY3RsX2lwX3JlYXNzZW1ibHlfY291 bnQgJiYgKGNwID0gaXBjX2ZpbmQoaXBoKSkgPT0gTlVMTCkNCisJCXJldHVybiBOVUxMOw0KIA0K IAlpZiAoKHFwID0gZnJhZ19hbGxvY19xdWV1ZSgpKSA9PSBOVUxMKQ0KIAkJZ290byBvdXRfbm9t ZW07DQpAQCAtMzY2LDYgKzUxNCwxMCBAQCBzdGF0aWMgc3RydWN0IGlwcSAqaXBfZnJhZ19jcmVh dGUodW5zaWduDQogCXFwLT5tZWF0ID0gMDsNCiAJcXAtPmZyYWdtZW50cyA9IE5VTEw7DQogCXFw LT5paWYgPSAwOw0KKwlxcC0+aXBjID0gY3A7DQorCWlmIChzeXNjdGxfaXBfcmVhc3NlbWJseV9j b3VudCAmJiBjcCkgew0KKwkJYXRvbWljX3NldCgmcXAtPnNlcSwgYXRvbWljX3JlYWQoJmNwLT5z ZXEpKTsNCisJfQ0KIA0KIAkvKiBJbml0aWFsaXplIGEgdGltZXIgZm9yIHRoaXMgZW50cnkuICov DQogCWluaXRfdGltZXIoJnFwLT50aW1lcik7DQpAQCAtMzgxLDYgKzUzMywzOSBAQCBvdXRfbm9t 
ZW06DQogCXJldHVybiBOVUxMOw0KIH0NCiANCitzdGF0aWMgaW5saW5lIGludCBpbl93aW5kb3co aW50IGJvdHRvbSwgaW50IHNpemUsIGludCBzZXEpIHsNCisJcmV0dXJuICgoKHNlcSAtIGJvdHRv bSkgPj0gMCkgJiYgKChzZXEgLSAoYm90dG9tICsgc2l6ZSkpIDwgMCkpOw0KK30NCisNCitzdGF0 aWMgaW50IF9faXBfcmVhc3NlbWJseV9jb3VudF9jaGVjayhjb25zdCBzdHJ1Y3QgaXBoZHIgKmlw aCwgc3RydWN0IGlwcSAqcXApDQorew0KKwlzdHJ1Y3QgaXBjICpjcCA9IHFwLT5pcGM7DQorCWlu dCBjc2VxLCBxc2VxOw0KKw0KKwkvKiBxcC0+aXBjIG1heSBiZSBOVUxMIGlmIHN5c2N0bF9pcF9y ZWFzc2VtYmx5X2NvdW50IHdhcyBvZmYgDQorCSAqIGF0IHRoZSB0aW1lIHRoZSBmcmFnbWVudCBx dWV1ZSB3YXMgY3JlYXRlZCAqLw0KKwlpZiAoY3AgPT0gTlVMTCkNCisJCXJldHVybiAwOw0KKw0K Kwljc2VxID0gYXRvbWljX2luY19yZXR1cm4oJmNwLT5zZXEpOw0KKwlxc2VxID0gYXRvbWljX2lu Y19yZXR1cm4oJnFwLT5zZXEpOw0KKw0KKwlpZiAoIWluX3dpbmRvdyhxc2VxLCBzeXNjdGxfaXBf cmVhc3NlbWJseV9jb3VudCwgY3NlcSkpIHsNCisJCWF0b21pY19pbmMoJnFwLT5yZWZjbnQpOw0K KwkJcmVhZF91bmxvY2soJmlwZnJhZ19sb2NrKTsNCisJCXNwaW5fbG9jaygmcXAtPmxvY2spOw0K KwkJaWYgKCEocXAtPmxhc3RfaW4mQ09NUExFVEUpKQ0KKwkJCWlwcV9raWxsKHFwKTsNCisJCXNw aW5fdW5sb2NrKCZxcC0+bG9jayk7DQorCQlpcHFfcHV0KHFwLCBOVUxMKTsNCisJCUlQX0lOQ19T VEFUU19CSChJUFNUQVRTX01JQl9SRUFTTUZBSUxTKTsNCisJCXJlYWRfbG9jaygmaXBmcmFnX2xv Y2spOw0KKwkJcmV0dXJuIDE7DQorCX0NCisJcmV0dXJuIDA7DQorfQ0KKw0KKw0KIC8qIEZpbmQg dGhlIGNvcnJlY3QgZW50cnkgaW4gdGhlICJpbmNvbXBsZXRlIGRhdGFncmFtcyIgcXVldWUgZm9y DQogICogdGhpcyBJUCBkYXRhZ3JhbSwgYW5kIGNyZWF0ZSBuZXcgb25lLCBpZiBub3RoaW5nIGlz IGZvdW5kLg0KICAqLw0KQEAgLTQwMCw2ICs1ODUsMTAgQEAgc3RhdGljIGlubGluZSBzdHJ1Y3Qg aXBxICppcF9maW5kKHN0cnVjdA0KIAkJICAgcXAtPmRhZGRyID09IGRhZGRyCSYmDQogCQkgICBx cC0+cHJvdG9jb2wgPT0gcHJvdG9jb2wgJiYNCiAJCSAgIHFwLT51c2VyID09IHVzZXIpIHsNCisJ CQlpZiAoc3lzY3RsX2lwX3JlYXNzZW1ibHlfY291bnQgJiYNCisJCQkJX19pcF9yZWFzc2VtYmx5 X2NvdW50X2NoZWNrKGlwaCwgcXApKSB7DQorCQkJCWJyZWFrOw0KKwkJCX0NCiAJCQlhdG9taWNf aW5jKCZxcC0+cmVmY250KTsNCiAJCQlyZWFkX3VubG9jaygmaXBmcmFnX2xvY2spOw0KIAkJCXJl dHVybiBxcDsNCkBAIC02NzksOSArODY4LDE1IEBAIHN0cnVjdCBza19idWZmICppcF9kZWZyYWco 
c3RydWN0IHNrX2J1ZmYNCiANCiB2b2lkIGlwZnJhZ19pbml0KHZvaWQpDQogew0KKwlpbnQgaTsN CiAJaXBmcmFnX2hhc2hfcm5kID0gKHUzMikgKChudW1fcGh5c3BhZ2VzIF4gKG51bV9waHlzcGFn ZXM+PjcpKSBeDQogCQkJCSAoamlmZmllcyBeIChqaWZmaWVzID4+IDYpKSk7DQogDQorCWZvciAo aSA9IDA7IGkgPCBJUENfSEFTSFNaOyBpKysgKSB7DQorCQlJTklUX0hMSVNUX0hFQUQoJmlwY19o YXNoW2ldLmhlYWQpOw0KKwkJc3Bpbl9sb2NrX2luaXQoJmlwY19oYXNoW2ldLmxvY2spOw0KKwl9 DQorDQogCWluaXRfdGltZXIoJmlwZnJhZ19zZWNyZXRfdGltZXIpOw0KIAlpcGZyYWdfc2VjcmV0 X3RpbWVyLmZ1bmN0aW9uID0gaXBmcmFnX3NlY3JldF9yZWJ1aWxkOw0KIAlpcGZyYWdfc2VjcmV0 X3RpbWVyLmV4cGlyZXMgPSBqaWZmaWVzICsgc3lzY3RsX2lwZnJhZ19zZWNyZXRfaW50ZXJ2YWw7 DQpkaWZmIC1ydXAgbGludXgub3JpZy9uZXQvaXB2NC9zeXNjdGxfbmV0X2lwdjQuYyBsaW51eC9u ZXQvaXB2NC9zeXNjdGxfbmV0X2lwdjQuYw0KLS0tIGxpbnV4Lm9yaWcvbmV0L2lwdjQvc3lzY3Rs X25ldF9pcHY0LmMJMjAwNS0wNy0wNiAxMjozNDoxNi42ODU4Njk1MzggLTA3MDANCisrKyBsaW51 eC9uZXQvaXB2NC9zeXNjdGxfbmV0X2lwdjQuYwkyMDA1LTA3LTA3IDA5OjU2OjQyLjg1Nzc3NDgw NiAtMDcwMA0KQEAgLTMwLDYgKzMwLDcgQEAgZXh0ZXJuIGludCBzeXNjdGxfaXBmcmFnX2xvd190 aHJlc2g7DQogZXh0ZXJuIGludCBzeXNjdGxfaXBmcmFnX2hpZ2hfdGhyZXNoOyANCiBleHRlcm4g aW50IHN5c2N0bF9pcGZyYWdfdGltZTsNCiBleHRlcm4gaW50IHN5c2N0bF9pcGZyYWdfc2VjcmV0 X2ludGVydmFsOw0KK2V4dGVybiBpbnQgc3lzY3RsX2lwX3JlYXNzZW1ibHlfY291bnQ7DQogDQog LyogRnJvbSBpcF9vdXRwdXQuYyAqLw0KIGV4dGVybiBpbnQgc3lzY3RsX2lwX2R5bmFkZHI7DQpA QCAtNTAsNiArNTEsNyBAQCBleHRlcm4gaW50IGluZXRfcGVlcl9nY19taW50aW1lOw0KIGV4dGVy biBpbnQgaW5ldF9wZWVyX2djX21heHRpbWU7DQogDQogI2lmZGVmIENPTkZJR19TWVNDVEwNCitz dGF0aWMgaW50IHplcm87DQogc3RhdGljIGludCB0Y3BfcmV0cjFfbWF4ID0gMjU1OyANCiBzdGF0 aWMgaW50IGlwX2xvY2FsX3BvcnRfcmFuZ2VfbWluW10gPSB7IDEsIDEgfTsNCiBzdGF0aWMgaW50 IGlwX2xvY2FsX3BvcnRfcmFuZ2VfbWF4W10gPSB7IDY1NTM1LCA2NTUzNSB9Ow0KQEAgLTY0Myw2 ICs2NDUsMTUgQEAgY3RsX3RhYmxlIGlwdjRfdGFibGVbXSA9IHsNCiAJCS5zdHJhdGVneQk9ICZz eXNjdGxfamlmZmllcw0KIAl9LA0KIAl7DQorCQkuY3RsX25hbWUJPSBORVRfSVBWNF9SRUFTTV9D T1VOVCwNCisJCS5wcm9jbmFtZQk9ICJpcF9yZWFzc2VtYmx5X2NvdW50IiwNCisJCS5kYXRhCQk9 
ICZzeXNjdGxfaXBfcmVhc3NlbWJseV9jb3VudCwNCisJCS5tYXhsZW4JCT0gc2l6ZW9mKGludCks DQorCQkubW9kZQkJPSAwNjQ0LA0KKwkJLnByb2NfaGFuZGxlcgk9ICZwcm9jX2RvaW50dmVjX21p bm1heCwNCisJCS5leHRyYTEJCT0gJnplcm8NCisJfSwNCisJew0KIAkJLmN0bF9uYW1lCT0gTkVU X1RDUF9OT19NRVRSSUNTX1NBVkUsDQogCQkucHJvY25hbWUJPSAidGNwX25vX21ldHJpY3Nfc2F2 ZSIsDQogCQkuZGF0YQkJPSAmc3lzY3RsX3RjcF9ub21ldHJpY3Nfc2F2ZSwNCg== --32512-1443077357-1120766375=:24321-- From davem@davemloft.net Thu Jul 7 14:19:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 14:19:18 -0700 (PDT) Received: from sunset.davemloft.net ([216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67LJBH9008028 for ; Thu, 7 Jul 2005 14:19:12 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DqdkE-0002uA-UF; Thu, 07 Jul 2005 14:17:18 -0700 Date: Thu, 07 Jul 2005 14:17:18 -0700 (PDT) Message-Id: <20050707.141718.85410359.davem@davemloft.net> To: tgraf@suug.ch Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. Miller" In-Reply-To: <20050706124206.GW16076@postel.suug.ch> References: <42CB2698.2080904@cosmosbay.com> <42CB2B84.50702@cosmosbay.com> <20050706124206.GW16076@postel.suug.ch> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2670 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1381 Lines: 33 From: Thomas Graf Date: Wed, 6 Jul 2005 14:42:06 +0200 > The inital issue you brought up which you backed up with numbers > is probably the cause of multiple wrong branch predictions due > to the fact that I wrote skb = dequeue(); if (skb) which was > assumed to be likely by the compiler. 
In your patch you fixed > this with !skb_queue_empty() which fixed this wrong prediction > and also acts as a little optimization due to skb_queue_empty() > being really simple to implement for the compiler. As an aside, this reminds me that as part of my quest to make sk_buff smaller, I intend to walk across the tree and change all tests of the form: if (!skb_queue_len(list)) into: if (skb_queue_empty(list)) It would be really nice, after the above transformation and some others, to get rid of sk_buff_head->qlen. Why? Because that also allows us to remove the skb->list member as well, as it's the only reason for existing is so that the SKB queue removal routines can decrement the queue length. That's kind of silly, and most SKB lists in the kernel do not care about the queue length at all. Rather, they care about empty and non-empty. The cases that do care (mostly packet schedulers) can keep track of the queue length themselves in their private data structures. When they remove packets, they _know_ which queue to decrement the queue length of. From tgraf@suug.ch Thu Jul 7 14:36:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 14:36:19 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67LaFH9009605 for ; Thu, 7 Jul 2005 14:36:15 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id B07871C0F3; Thu, 7 Jul 2005 23:34:50 +0200 (CEST) Date: Thu, 7 Jul 2005 23:34:50 +0200 From: Thomas Graf To: "David S. 
Miller" Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Message-ID: <20050707213450.GB16076@postel.suug.ch> References: <42CB2698.2080904@cosmosbay.com> <42CB2B84.50702@cosmosbay.com> <20050706124206.GW16076@postel.suug.ch> <20050707.141718.85410359.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050707.141718.85410359.davem@davemloft.net> X-archive-position: 2671 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 851 Lines: 23 * David S. Miller <20050707.141718.85410359.davem@davemloft.net> 2005-07-07 14:17 > As an aside, this reminds me that as part of my quest to make > sk_buff smaller, I intend to walk across the tree and change > all tests of the form: > > if (!skb_queue_len(list)) > > into: > > if (skb_queue_empty(list)) > > [...] > > The cases that do care (mostly packet schedulers) can > keep track of the queue length themselves in their private data > structures. When they remove packets, they _know_ which queue to > decrement the queue length of. Since I'm changing the classful qdiscs to use a generic API for queue management anyway I could take care of this if you want. WRT the leaf qdiscs it's a bit more complicated since we have to change the new API to take a new struct which includes the qlen and the sk_buff_head but not a problem either. 
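To illustrate the split being discussed (a list head that only knows empty versus non-empty, with the length kept by whoever actually needs it), here is a small userspace sketch. It is not kernel code: `struct pkt`, `struct pkt_head` and `struct sched_queue` are hypothetical stand-ins for sk_buff, sk_buff_head and a qdisc's private data, and there is no locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for sk_buff: just the list linkage. */
struct pkt {
    struct pkt *next, *prev;
};

/* List head without a length field; an embedded sentinel node makes
 * the list circular, so "empty" is a single pointer comparison. */
struct pkt_head {
    struct pkt sentinel;
};

static void pkt_head_init(struct pkt_head *h)
{
    h->sentinel.next = h->sentinel.prev = &h->sentinel;
}

static int pkt_queue_empty(const struct pkt_head *h)
{
    return h->sentinel.next == &h->sentinel;
}

static void pkt_queue_tail(struct pkt_head *h, struct pkt *p)
{
    p->prev = h->sentinel.prev;
    p->next = &h->sentinel;
    h->sentinel.prev->next = p;
    h->sentinel.prev = p;
}

static struct pkt *pkt_dequeue(struct pkt_head *h)
{
    struct pkt *p = h->sentinel.next;

    if (p == &h->sentinel)
        return NULL;            /* queue is empty */
    h->sentinel.next = p->next;
    p->next->prev = &h->sentinel;
    p->next = p->prev = NULL;
    return p;
}

/* A user that does care about the length keeps it next to the list
 * and updates it itself, since it knows which queue it is touching. */
struct sched_queue {
    struct pkt_head list;
    unsigned int qlen;
};

static void sched_init(struct sched_queue *q)
{
    pkt_head_init(&q->list);
    q->qlen = 0;
}

static void sched_enqueue(struct sched_queue *q, struct pkt *p)
{
    pkt_queue_tail(&q->list, p);
    q->qlen++;
}

static struct pkt *sched_dequeue(struct sched_queue *q)
{
    struct pkt *p = pkt_dequeue(&q->list);

    if (p)
        q->qlen--;
    return p;
}
```

Because the removal routine never touches a counter in the head, a packet in this sketch needs no back-pointer to its list, which is the saving the thread is after.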
From raghavendra.koushik@neterion.com Thu Jul 7 15:17:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:17:51 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MHYH9012776 for ; Thu, 7 Jul 2005 15:17:35 -0700 Received: by linux.site (Postfix, from userid 0) id A8AC689826; Thu, 7 Jul 2005 15:05:06 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 1/12] S2io: Code cleanup Message-Id: <20050707220506.A8AC689826@linux.site> Date: Thu, 7 Jul 2005 15:05:06 -0700 (PDT) X-archive-position: 2672 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 142306 Lines: 4202 Hi, We are submitting a series of 12 patches to support our Xframe I and Xframe II line of products. The patches can be categorized as follows: Patches 1-8 : Changes applicable to both Xframe I and II Patches 9-11: Xframe II specific features Patch 12: Addresses issues found during testing cycle. Please review the patches and let us know your comments. Starting with patch 1 below. This patch involves cosmetic changes(tabs and indentation, regrouping of transmit and receive data structures, typecasting, code cleanup). 
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendr Koushik --- diff -urpN vanilla_kernel/drivers/net/s2io-regs.h linux-2.6.12-rc6/drivers/net/s2io-regs.h --- vanilla_kernel/drivers/net/s2io-regs.h 2005-06-06 08:22:29.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io-regs.h 2005-06-27 06:21:32.000000000 -0700 @@ -77,19 +77,18 @@ typedef struct _XENA_dev_config { #define ADAPTER_ECC_EN BIT(55) u64 serr_source; -#define SERR_SOURCE_PIC BIT(0) -#define SERR_SOURCE_TXDMA BIT(1) -#define SERR_SOURCE_RXDMA BIT(2) +#define SERR_SOURCE_PIC BIT(0) +#define SERR_SOURCE_TXDMA BIT(1) +#define SERR_SOURCE_RXDMA BIT(2) #define SERR_SOURCE_MAC BIT(3) #define SERR_SOURCE_MC BIT(4) #define SERR_SOURCE_XGXS BIT(5) -#define SERR_SOURCE_ANY (SERR_SOURCE_PIC | \ - SERR_SOURCE_TXDMA | \ - SERR_SOURCE_RXDMA | \ - SERR_SOURCE_MAC | \ - SERR_SOURCE_MC | \ - SERR_SOURCE_XGXS) - +#define SERR_SOURCE_ANY (SERR_SOURCE_PIC | \ + SERR_SOURCE_TXDMA | \ + SERR_SOURCE_RXDMA | \ + SERR_SOURCE_MAC | \ + SERR_SOURCE_MC | \ + SERR_SOURCE_XGXS) u8 unused_0[0x800 - 0x120]; diff -urpN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-06 08:22:29.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-27 06:21:32.000000000 -0700 @@ -11,29 +11,28 @@ * See the file COPYING in this distribution for more information. * * Credits: - * Jeff Garzik : For pointing out the improper error condition - * check in the s2io_xmit routine and also some - * issues in the Tx watch dog function. Also for - * patiently answering all those innumerable + * Jeff Garzik : For pointing out the improper error condition + * check in the s2io_xmit routine and also some + * issues in the Tx watch dog function. Also for + * patiently answering all those innumerable * questions regaring the 2.6 porting issues. * Stephen Hemminger : Providing proper 2.6 porting mechanism for some * macros available only in 2.6 Kernel. 
- * Francois Romieu : For pointing out all code part that were + * Francois Romieu : For pointing out all code part that were * deprecated and also styling related comments. - * Grant Grundler : For helping me get rid of some Architecture + * Grant Grundler : For helping me get rid of some Architecture * dependent code. * Christopher Hellwig : Some more 2.6 specific issues in the driver. - * + * * The module loadable parameters that are supported by the driver and a brief * explaination of all the variables. - * rx_ring_num : This can be used to program the number of receive rings used - * in the driver. - * rx_ring_len: This defines the number of descriptors each ring can have. This + * rx_ring_num : This can be used to program the number of receive rings used + * in the driver. + * rx_ring_len: This defines the number of descriptors each ring can have. This * is also an array of size 8. * tx_fifo_num: This defines the number of Tx FIFOs thats used int the driver. - * tx_fifo_len: This too is an array of 8. Each element defines the number of + * tx_fifo_len: This too is an array of 8. Each element defines the number of * Tx descriptors that can be associated with each corresponding FIFO. - * in PCI Configuration space. ************************************************************************/ #include @@ -56,19 +55,19 @@ #include #include -#include #include #include +#include /* local include */ #include "s2io.h" #include "s2io-regs.h" /* S2io Driver name & version. */ -static char s2io_driver_name[] = "s2io"; -static char s2io_driver_version[] = "Version 1.7.7.1"; +static char s2io_driver_name[] = "Neterion"; +static char s2io_driver_version[] = "Version 1.7.7"; -/* +/* * Cards with following subsystem_id have a link state indication * problem, 600B, 600C, 600D, 640B, 640C and 640D. * macro below identifies these cards given the subsystem_id. 
@@ -85,9 +84,13 @@ static char s2io_driver_version[] = "Ver
 static inline int rx_buffer_level(nic_t * sp, int rxb_size, int ring)
 {
 	int level = 0;
-	if ((sp->pkt_cnt[ring] - rxb_size) > 16) {
+	mac_info_t *mac_control;
+
+	mac_control = &sp->mac_control;
+	if ((mac_control->rings[ring].pkt_cnt - rxb_size) > 16) {
 		level = LOW;
-		if ((sp->pkt_cnt[ring] - rxb_size) < MAX_RXDS_PER_BLOCK) {
+		if ((mac_control->rings[ring].pkt_cnt - rxb_size) <
+				MAX_RXDS_PER_BLOCK) {
 			level = PANIC;
 		}
 	}
@@ -152,8 +155,7 @@ static char ethtool_stats_keys[][ETH_GST
 #define S2IO_TEST_LEN sizeof(s2io_gstrings) / ETH_GSTRING_LEN
 #define S2IO_STRINGS_LEN S2IO_TEST_LEN * ETH_GSTRING_LEN
-
-/*
+/*
  * Constants to be programmed into the Xena's registers, to configure
  * the XAUI.
  */
@@ -195,8 +197,7 @@ static u64 default_dtx_cfg[] = {
 	END_SIGN
 };
-
-/*
+/*
  * Constants for Fixing the MacAddress problem seen mostly on
  * Alpha machines.
  */
@@ -226,6 +227,8 @@ static unsigned int rx_ring_num = 1;
 static unsigned int rx_ring_sz[MAX_RX_RINGS] =
 	{[0 ...(MAX_RX_RINGS - 1)] = 0 };
 static unsigned int Stats_refresh_time = 4;
+static unsigned int rts_frm_len[MAX_RX_RINGS] =
+	{[0 ...(MAX_RX_RINGS - 1)] = 0 };
 static unsigned int rmac_pause_time = 65535;
 static unsigned int mc_pause_threshold_q0q3 = 187;
 static unsigned int mc_pause_threshold_q4q7 = 187;
@@ -236,9 +239,9 @@ static unsigned int rmac_util_period = 5
 static unsigned int indicate_max_pkts;
 #endif
-/*
+/*
  * S2IO device table.
- * This table lists all the devices that this driver supports.
+ * This table lists all the devices that this driver supports.
*/ static struct pci_device_id s2io_tbl[] __devinitdata = { {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_S2IO_WIN, @@ -246,9 +249,9 @@ static struct pci_device_id s2io_tbl[] _ {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_S2IO_UNI, PCI_ANY_ID, PCI_ANY_ID}, {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_HERC_WIN, - PCI_ANY_ID, PCI_ANY_ID}, - {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_HERC_UNI, - PCI_ANY_ID, PCI_ANY_ID}, + PCI_ANY_ID, PCI_ANY_ID}, + {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_HERC_UNI, + PCI_ANY_ID, PCI_ANY_ID}, {0,} }; @@ -267,8 +270,8 @@ static struct pci_driver s2io_driver = { /** * init_shared_mem - Allocation and Initialization of Memory * @nic: Device private variable. - * Description: The function allocates all the memory areas shared - * between the NIC and the driver. This includes Tx descriptors, + * Description: The function allocates all the memory areas shared + * between the NIC and the driver. This includes Tx descriptors, * Rx descriptors and the statistics block. */ @@ -278,11 +281,11 @@ static int init_shared_mem(struct s2io_n void *tmp_v_addr, *tmp_v_addr_next; dma_addr_t tmp_p_addr, tmp_p_addr_next; RxD_block_t *pre_rxd_blk = NULL; - int i, j, blk_cnt; + int i, j, blk_cnt, rx_sz, tx_sz; int lst_size, lst_per_page; struct net_device *dev = nic->dev; #ifdef CONFIG_2BUFF_MODE - unsigned long tmp; + u64 tmp; buffAdd_t *ba; #endif @@ -307,28 +310,34 @@ static int init_shared_mem(struct s2io_n } lst_size = (sizeof(TxD_t) * config->max_txds); + tx_sz = lst_size * size; lst_per_page = PAGE_SIZE / lst_size; for (i = 0; i < config->tx_fifo_num; i++) { int fifo_len = config->tx_cfg[i].fifo_len; int list_holder_size = fifo_len * sizeof(list_info_hold_t); - nic->list_info[i] = kmalloc(list_holder_size, GFP_KERNEL); - if (!nic->list_info[i]) { + mac_control->fifos[i].list_info = kmalloc(list_holder_size, + GFP_KERNEL); + if (!mac_control->fifos[i].list_info) { DBG_PRINT(ERR_DBG, "Malloc failed for list_info\n"); return -ENOMEM; } - memset(nic->list_info[i], 0, list_holder_size); + 
memset(mac_control->fifos[i].list_info, 0, list_holder_size); } for (i = 0; i < config->tx_fifo_num; i++) { int page_num = TXD_MEM_PAGE_CNT(config->tx_cfg[i].fifo_len, lst_per_page); - mac_control->tx_curr_put_info[i].offset = 0; - mac_control->tx_curr_put_info[i].fifo_len = + mac_control->fifos[i].tx_curr_put_info.offset = 0; + mac_control->fifos[i].tx_curr_put_info.fifo_len = config->tx_cfg[i].fifo_len - 1; - mac_control->tx_curr_get_info[i].offset = 0; - mac_control->tx_curr_get_info[i].fifo_len = + mac_control->fifos[i].tx_curr_get_info.offset = 0; + mac_control->fifos[i].tx_curr_get_info.fifo_len = config->tx_cfg[i].fifo_len - 1; + mac_control->fifos[i].fifo_no = i; + mac_control->fifos[i].nic = nic; + mac_control->fifos[i].max_txds = MAX_SKB_FRAGS; + for (j = 0; j < page_num; j++) { int k = 0; dma_addr_t tmp_p; @@ -344,16 +353,15 @@ static int init_shared_mem(struct s2io_n while (k < lst_per_page) { int l = (j * lst_per_page) + k; if (l == config->tx_cfg[i].fifo_len) - goto end_txd_alloc; - nic->list_info[i][l].list_virt_addr = + break; + mac_control->fifos[i].list_info[l].list_virt_addr = tmp_v + (k * lst_size); - nic->list_info[i][l].list_phy_addr = + mac_control->fifos[i].list_info[l].list_phy_addr = tmp_p + (k * lst_size); k++; } } } - end_txd_alloc: /* Allocation and initialization of RXDs in Rings */ size = 0; @@ -366,21 +374,26 @@ static int init_shared_mem(struct s2io_n return FAILURE; } size += config->rx_cfg[i].num_rxd; - nic->block_count[i] = + mac_control->rings[i].block_count = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); - nic->pkt_cnt[i] = - config->rx_cfg[i].num_rxd - nic->block_count[i]; + mac_control->rings[i].pkt_cnt = + config->rx_cfg[i].num_rxd - mac_control->rings[i].block_count; } + size = (size * (sizeof(RxD_t))); + rx_sz = size; for (i = 0; i < config->rx_ring_num; i++) { - mac_control->rx_curr_get_info[i].block_index = 0; - mac_control->rx_curr_get_info[i].offset = 0; - mac_control->rx_curr_get_info[i].ring_len = + 
mac_control->rings[i].rx_curr_get_info.block_index = 0; + mac_control->rings[i].rx_curr_get_info.offset = 0; + mac_control->rings[i].rx_curr_get_info.ring_len = config->rx_cfg[i].num_rxd - 1; - mac_control->rx_curr_put_info[i].block_index = 0; - mac_control->rx_curr_put_info[i].offset = 0; - mac_control->rx_curr_put_info[i].ring_len = + mac_control->rings[i].rx_curr_put_info.block_index = 0; + mac_control->rings[i].rx_curr_put_info.offset = 0; + mac_control->rings[i].rx_curr_put_info.ring_len = config->rx_cfg[i].num_rxd - 1; + mac_control->rings[i].nic = nic; + mac_control->rings[i].ring_no = i; + blk_cnt = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); /* Allocating all the Rx blocks */ @@ -394,32 +407,36 @@ static int init_shared_mem(struct s2io_n &tmp_p_addr); if (tmp_v_addr == NULL) { /* - * In case of failure, free_shared_mem() - * is called, which should free any - * memory that was alloced till the + * In case of failure, free_shared_mem() + * is called, which should free any + * memory that was alloced till the * failure happened. 
*/ - nic->rx_blocks[i][j].block_virt_addr = + mac_control->rings[i].rx_blocks[j].block_virt_addr = tmp_v_addr; return -ENOMEM; } memset(tmp_v_addr, 0, size); - nic->rx_blocks[i][j].block_virt_addr = tmp_v_addr; - nic->rx_blocks[i][j].block_dma_addr = tmp_p_addr; + mac_control->rings[i].rx_blocks[j].block_virt_addr = + tmp_v_addr; + mac_control->rings[i].rx_blocks[j].block_dma_addr = + tmp_p_addr; } /* Interlinking all Rx Blocks */ for (j = 0; j < blk_cnt; j++) { - tmp_v_addr = nic->rx_blocks[i][j].block_virt_addr; + tmp_v_addr = + mac_control->rings[i].rx_blocks[j].block_virt_addr; tmp_v_addr_next = - nic->rx_blocks[i][(j + 1) % + mac_control->rings[i].rx_blocks[(j + 1) % blk_cnt].block_virt_addr; - tmp_p_addr = nic->rx_blocks[i][j].block_dma_addr; + tmp_p_addr = + mac_control->rings[i].rx_blocks[j].block_dma_addr; tmp_p_addr_next = - nic->rx_blocks[i][(j + 1) % + mac_control->rings[i].rx_blocks[(j + 1) % blk_cnt].block_dma_addr; pre_rxd_blk = (RxD_block_t *) tmp_v_addr; - pre_rxd_blk->reserved_1 = END_OF_BLOCK; /* last RxD + pre_rxd_blk->reserved_1 = END_OF_BLOCK; /* last RxD * marker. */ #ifndef CONFIG_2BUFF_MODE @@ -432,43 +449,43 @@ static int init_shared_mem(struct s2io_n } #ifdef CONFIG_2BUFF_MODE - /* + /* * Allocation of Storages for buffer addresses in 2BUFF mode * and the buffers as well. 
*/ for (i = 0; i < config->rx_ring_num; i++) { blk_cnt = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); - nic->ba[i] = kmalloc((sizeof(buffAdd_t *) * blk_cnt), + mac_control->rings[i].ba = kmalloc((sizeof(buffAdd_t *) * blk_cnt), GFP_KERNEL); - if (!nic->ba[i]) + if (!mac_control->rings[i].ba) return -ENOMEM; for (j = 0; j < blk_cnt; j++) { int k = 0; - nic->ba[i][j] = kmalloc((sizeof(buffAdd_t) * + mac_control->rings[i].ba[j] = kmalloc((sizeof(buffAdd_t) * (MAX_RXDS_PER_BLOCK + 1)), GFP_KERNEL); - if (!nic->ba[i][j]) + if (!mac_control->rings[i].ba[j]) return -ENOMEM; while (k != MAX_RXDS_PER_BLOCK) { - ba = &nic->ba[i][j][k]; + ba = &mac_control->rings[i].ba[j][k]; - ba->ba_0_org = kmalloc + ba->ba_0_org = (void *) kmalloc (BUF0_LEN + ALIGN_SIZE, GFP_KERNEL); if (!ba->ba_0_org) return -ENOMEM; - tmp = (unsigned long) ba->ba_0_org; + tmp = (u64) ba->ba_0_org; tmp += ALIGN_SIZE; - tmp &= ~((unsigned long) ALIGN_SIZE); + tmp &= ~((u64) ALIGN_SIZE); ba->ba_0 = (void *) tmp; - ba->ba_1_org = kmalloc + ba->ba_1_org = (void *) kmalloc (BUF1_LEN + ALIGN_SIZE, GFP_KERNEL); if (!ba->ba_1_org) return -ENOMEM; - tmp = (unsigned long) ba->ba_1_org; + tmp = (u64) ba->ba_1_org; tmp += ALIGN_SIZE; - tmp &= ~((unsigned long) ALIGN_SIZE); + tmp &= ~((u64) ALIGN_SIZE); ba->ba_1 = (void *) tmp; k++; } @@ -482,9 +499,9 @@ static int init_shared_mem(struct s2io_n (nic->pdev, size, &mac_control->stats_mem_phy); if (!mac_control->stats_mem) { - /* - * In case of failure, free_shared_mem() is called, which - * should free any memory that was alloced till the + /* + * In case of failure, free_shared_mem() is called, which + * should free any memory that was alloced till the * failure happened. 
*/ return -ENOMEM; @@ -494,15 +511,14 @@ static int init_shared_mem(struct s2io_n tmp_v_addr = mac_control->stats_mem; mac_control->stats_info = (StatInfo_t *) tmp_v_addr; memset(tmp_v_addr, 0, size); - DBG_PRINT(INIT_DBG, "%s:Ring Mem PHY: 0x%llx\n", dev->name, (unsigned long long) tmp_p_addr); return SUCCESS; } -/** - * free_shared_mem - Free the allocated Memory +/** + * free_shared_mem - Free the allocated Memory * @nic: Device private variable. * Description: This function is to free all memory locations allocated by * the init_shared_mem() function and return it to the kernel. @@ -532,15 +548,18 @@ static void free_shared_mem(struct s2io_ lst_per_page); for (j = 0; j < page_num; j++) { int mem_blks = (j * lst_per_page); - if (!nic->list_info[i][mem_blks].list_virt_addr) + if (!mac_control->fifos[i].list_info[mem_blks]. + list_virt_addr) break; pci_free_consistent(nic->pdev, PAGE_SIZE, - nic->list_info[i][mem_blks]. + mac_control->fifos[i]. + list_info[mem_blks]. list_virt_addr, - nic->list_info[i][mem_blks]. + mac_control->fifos[i]. + list_info[mem_blks]. list_phy_addr); } - kfree(nic->list_info[i]); + kfree(mac_control->fifos[i].list_info); } #ifndef CONFIG_2BUFF_MODE @@ -549,10 +568,12 @@ static void free_shared_mem(struct s2io_ size = SIZE_OF_BLOCK; #endif for (i = 0; i < config->rx_ring_num; i++) { - blk_cnt = nic->block_count[i]; + blk_cnt = mac_control->rings[i].block_count; for (j = 0; j < blk_cnt; j++) { - tmp_v_addr = nic->rx_blocks[i][j].block_virt_addr; - tmp_p_addr = nic->rx_blocks[i][j].block_dma_addr; + tmp_v_addr = mac_control->rings[i].rx_blocks[j]. + block_virt_addr; + tmp_p_addr = mac_control->rings[i].rx_blocks[j]. 
+ block_dma_addr; if (tmp_v_addr == NULL) break; pci_free_consistent(nic->pdev, size, @@ -565,35 +586,21 @@ static void free_shared_mem(struct s2io_ for (i = 0; i < config->rx_ring_num; i++) { blk_cnt = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); - if (!nic->ba[i]) - goto end_free; for (j = 0; j < blk_cnt; j++) { int k = 0; - if (!nic->ba[i][j]) { - kfree(nic->ba[i]); - goto end_free; - } + if (!mac_control->rings[i].ba[j]) + continue; while (k != MAX_RXDS_PER_BLOCK) { - buffAdd_t *ba = &nic->ba[i][j][k]; - if (!ba || !ba->ba_0_org || !ba->ba_1_org) - { - kfree(nic->ba[i]); - kfree(nic->ba[i][j]); - if(ba->ba_0_org) - kfree(ba->ba_0_org); - if(ba->ba_1_org) - kfree(ba->ba_1_org); - goto end_free; - } + buffAdd_t *ba = &mac_control->rings[i].ba[j][k]; kfree(ba->ba_0_org); kfree(ba->ba_1_org); k++; } - kfree(nic->ba[i][j]); + kfree(mac_control->rings[i].ba[j]); } - kfree(nic->ba[i]); + if (mac_control->rings[i].ba) + kfree(mac_control->rings[i].ba); } -end_free: #endif if (mac_control->stats_mem) { @@ -604,12 +611,12 @@ end_free: } } -/** - * init_nic - Initialization of hardware +/** + * init_nic - Initialization of hardware * @nic: device peivate variable - * Description: The function sequentially configures every block - * of the H/W from their reset values. - * Return Value: SUCCESS on success and + * Description: The function sequentially configures every block + * of the H/W from their reset values. + * Return Value: SUCCESS on success and * '-1' on failure (endian settings incorrect). 
*/ @@ -625,12 +632,13 @@ static int init_nic(struct s2io_nic *nic struct config_param *config; int mdio_cnt = 0, dtx_cnt = 0; unsigned long long mem_share; + int mem_size; mac_control = &nic->mac_control; config = &nic->config; - /* Initialize swapper control register */ - if (s2io_set_swapper(nic)) { + /* to set the swapper control on the card */ + if(s2io_set_swapper(nic)) { DBG_PRINT(ERR_DBG,"ERROR: Setting Swapper failed\n"); return -1; } @@ -638,8 +646,8 @@ static int init_nic(struct s2io_nic *nic /* Remove XGXS from reset state */ val64 = 0; writeq(val64, &bar0->sw_reset); - val64 = readq(&bar0->sw_reset); msleep(500); + val64 = readq(&bar0->sw_reset); /* Enable Receiving broadcasts */ add = &bar0->mac_cfg; @@ -659,18 +667,18 @@ static int init_nic(struct s2io_nic *nic val64 = dev->mtu; writeq(vBIT(val64, 2, 14), &bar0->rmac_max_pyld_len); - /* - * Configuring the XAUI Interface of Xena. + /* + * Configuring the XAUI Interface of Xena. * *************************************** - * To Configure the Xena's XAUI, one has to write a series - * of 64 bit values into two registers in a particular - * sequence. Hence a macro 'SWITCH_SIGN' has been defined - * which will be defined in the array of configuration values - * (default_dtx_cfg & default_mdio_cfg) at appropriate places - * to switch writing from one regsiter to another. We continue + * To Configure the Xena's XAUI, one has to write a series + * of 64 bit values into two registers in a particular + * sequence. Hence a macro 'SWITCH_SIGN' has been defined + * which will be defined in the array of configuration values + * (default_dtx_cfg & default_mdio_cfg) at appropriate places + * to switch writing from one regsiter to another. We continue * writing these values until we encounter the 'END_SIGN' macro. 
- * For example, After making a series of 21 writes into - * dtx_control register the 'SWITCH_SIGN' appears and hence we + * For example, After making a series of 21 writes into + * dtx_control register the 'SWITCH_SIGN' appears and hence we * start writing into mdio_control until we encounter END_SIGN. */ while (1) { @@ -751,8 +759,8 @@ static int init_nic(struct s2io_nic *nic DBG_PRINT(INIT_DBG, "Fifo partition at: 0x%p is: 0x%llx\n", &bar0->tx_fifo_partition_0, (unsigned long long) val64); - /* - * Initialization of Tx_PA_CONFIG register to ignore packet + /* + * Initialization of Tx_PA_CONFIG register to ignore packet * integrity checking. */ val64 = readq(&bar0->tx_pa_cfg); @@ -769,54 +777,54 @@ static int init_nic(struct s2io_nic *nic } writeq(val64, &bar0->rx_queue_priority); - /* - * Allocating equal share of memory to all the + /* + * Allocating equal share of memory to all the * configured Rings. */ val64 = 0; + mem_size = 64; for (i = 0; i < config->rx_ring_num; i++) { switch (i) { case 0: - mem_share = (64 / config->rx_ring_num + - 64 % config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num + + mem_size % config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q0_SZ(mem_share); continue; case 1: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q1_SZ(mem_share); continue; case 2: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q2_SZ(mem_share); continue; case 3: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q3_SZ(mem_share); continue; case 4: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q4_SZ(mem_share); continue; case 5: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q5_SZ(mem_share); continue; case 6: - mem_share = (64 / 
config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q6_SZ(mem_share); continue; case 7: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q7_SZ(mem_share); continue; } } writeq(val64, &bar0->rx_queue_cfg); - /* - * Initializing the Tx round robin registers to 0. - * Filling Tx and Rx round robin registers as per the - * number of FIFOs and Rings is still TODO. + /* Initializing the Tx round robin registers to 0 + * filling tx and rx round robin registers as per + * the number of FIFOs and Rings is still TODO */ writeq(0, &bar0->tx_w_round_robin_0); writeq(0, &bar0->tx_w_round_robin_1); @@ -824,30 +832,30 @@ static int init_nic(struct s2io_nic *nic writeq(0, &bar0->tx_w_round_robin_3); writeq(0, &bar0->tx_w_round_robin_4); - /* + /* * TODO - * Disable Rx steering. Hard coding all packets be steered to - * Queue 0 for now. + * Disable Rx steering. Hard coding all packets to be steered to + * Queue 0 for now. */ val64 = 0x8080808080808080ULL; writeq(val64, &bar0->rts_qos_steering); /* UDP Fix */ val64 = 0; - for (i = 1; i < 8; i++) + for (i = 0; i < 8; i++) writeq(val64, &bar0->rts_frm_len_n[i]); - /* Set rts_frm_len register for fifo 0 */ - writeq(MAC_RTS_FRM_LEN_SET(dev->mtu + 22), - &bar0->rts_frm_len_n[0]); + /* Set the default rts frame length for ring0 */ + writeq(MAC_RTS_FRM_LEN_SET(dev->mtu+22), + &bar0->rts_frm_len_n[0]); - /* Enable statistics */ + /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); val64 = SET_UPDT_PERIOD(Stats_refresh_time) | STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; writeq(val64, &bar0->stat_cfg); - /* + /* * Initializing the sampling rate for the device to calculate the * bandwidth utilization. 
*/ @@ -856,11 +864,12 @@ static int init_nic(struct s2io_nic *nic writeq(val64, &bar0->mac_link_util); - /* - * Initializing the Transmit and Receive Traffic Interrupt + /* + * Initializing the Transmit and Receive Traffic Interrupt * Scheme. */ - /* TTI Initialization. Default Tx timer gets us about + /* + * TTI Initialization. Default Tx timer gets us about * 250 interrupts per sec. Continuous interrupts are enabled * by default. */ @@ -879,7 +888,7 @@ static int init_nic(struct s2io_nic *nic val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD; writeq(val64, &bar0->tti_command_mem); - /* + /* * Once the operation completes, the Strobe bit of the command * register will be reset. We poll for this particular condition * We wait for a maximum of 500ms for the operation to complete, @@ -916,7 +925,7 @@ static int init_nic(struct s2io_nic *nic val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD; writeq(val64, &bar0->rti_command_mem); - /* + /* * Once the operation completes, the Strobe bit of the command * register will be reset. We poll for this particular condition * We wait for a maximum of 500ms for the operation to complete, @@ -925,7 +934,7 @@ static int init_nic(struct s2io_nic *nic time = 0; while (TRUE) { val64 = readq(&bar0->rti_command_mem); - if (!(val64 & TTI_CMD_MEM_STROBE_NEW_CMD)) { + if (!(val64 & RTI_CMD_MEM_STROBE_NEW_CMD)) { break; } if (time > 10) { @@ -937,15 +946,15 @@ static int init_nic(struct s2io_nic *nic msleep(50); } - /* - * Initializing proper values as Pause threshold into all + /* + * Initializing proper values as Pause threshold into all * the 8 Queues on Rx side. 
*/ writeq(0xffbbffbbffbbffbbULL, &bar0->mc_pause_thresh_q0q3); writeq(0xffbbffbbffbbffbbULL, &bar0->mc_pause_thresh_q4q7); /* Disable RMAC PAD STRIPPING */ - add = &bar0->mac_cfg; + add = (void *) &bar0->mac_cfg; val64 = readq(&bar0->mac_cfg); val64 &= ~(MAC_CFG_RMAC_STRIP_PAD); writeq(RMAC_CFG_KEY(0x4C0D), &bar0->rmac_cfg_key); @@ -954,8 +963,8 @@ static int init_nic(struct s2io_nic *nic writel((u32) (val64 >> 32), (add + 4)); val64 = readq(&bar0->mac_cfg); - /* - * Set the time value to be inserted in the pause frame + /* + * Set the time value to be inserted in the pause frame * generated by xena. */ val64 = readq(&bar0->rmac_pause_cfg); @@ -963,7 +972,7 @@ static int init_nic(struct s2io_nic *nic val64 |= RMAC_PAUSE_HG_PTIME(nic->mac_control.rmac_pause_time); writeq(val64, &bar0->rmac_pause_cfg); - /* + /* * Set the Threshold Limit for Generating the pause frame * If the amount of data in any Queue exceeds ratio of * (mac_control.mc_pause_threshold_q0q3 or q4q7)/256 @@ -987,8 +996,8 @@ static int init_nic(struct s2io_nic *nic } writeq(val64, &bar0->mc_pause_thresh_q4q7); - /* - * TxDMA will stop Read request if the number of read split has + /* + * TxDMA will stop Read request if the number of read split has * exceeded the limit pointed by shared_splits */ val64 = readq(&bar0->pic_control); @@ -998,14 +1007,14 @@ static int init_nic(struct s2io_nic *nic return SUCCESS; } -/** - * en_dis_able_nic_intrs - Enable or Disable the interrupts +/** + * en_dis_able_nic_intrs - Enable or Disable the interrupts * @nic: device private variable, * @mask: A mask indicating which Intr block must be modified and, * @flag: A flag indicating whether to enable or disable the Intrs. * Description: This function will either disable or enable the interrupts - * depending on the flag argument. The mask argument can be used to - * enable/disable any Intr block. + * depending on the flag argument. The mask argument can be used to + * enable/disable any Intr block. * Return Value: NONE. 
*/ @@ -1023,20 +1032,20 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* + /* * Disabled all PCIX, Flash, MDIO, IIC and GPIO - * interrupts for now. - * TODO + * interrupts for now. + * TODO */ writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); - /* + /* * No MSI Support is available presently, so TTI and * RTI interrupts are also disabled. */ } else if (flag == DISABLE_INTRS) { - /* - * Disable PIC Intrs in the general - * intr mask register + /* + * Disable PIC Intrs in the general + * intr mask register */ writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); temp64 = readq(&bar0->general_int_mask); @@ -1054,27 +1063,27 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* - * Keep all interrupts other than PFC interrupt + /* + * Keep all interrupts other than PFC interrupt * and PCC interrupt disabled in DMA level. 
*/ val64 = DISABLE_ALL_INTRS & ~(TXDMA_PFC_INT_M | TXDMA_PCC_INT_M); writeq(val64, &bar0->txdma_int_mask); - /* - * Enable only the MISC error 1 interrupt in PFC block + /* + * Enable only the MISC error 1 interrupt in PFC block */ val64 = DISABLE_ALL_INTRS & (~PFC_MISC_ERR_1); writeq(val64, &bar0->pfc_err_mask); - /* - * Enable only the FB_ECC error interrupt in PCC block + /* + * Enable only the FB_ECC error interrupt in PCC block */ val64 = DISABLE_ALL_INTRS & (~PCC_FB_ECC_ERR); writeq(val64, &bar0->pcc_err_mask); } else if (flag == DISABLE_INTRS) { - /* - * Disable TxDMA Intrs in the general intr mask - * register + /* + * Disable TxDMA Intrs in the general intr mask + * register */ writeq(DISABLE_ALL_INTRS, &bar0->txdma_int_mask); writeq(DISABLE_ALL_INTRS, &bar0->pfc_err_mask); @@ -1092,15 +1101,15 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* - * All RxDMA block interrupts are disabled for now - * TODO + /* + * All RxDMA block interrupts are disabled for now + * TODO */ writeq(DISABLE_ALL_INTRS, &bar0->rxdma_int_mask); } else if (flag == DISABLE_INTRS) { - /* - * Disable RxDMA Intrs in the general intr mask - * register + /* + * Disable RxDMA Intrs in the general intr mask + * register */ writeq(DISABLE_ALL_INTRS, &bar0->rxdma_int_mask); temp64 = readq(&bar0->general_int_mask); @@ -1117,8 +1126,8 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* - * All MAC block error interrupts are disabled for now + /* + * All MAC block error interrupts are disabled for now * except the link status change interrupt. 
* TODO */ @@ -1131,8 +1140,8 @@ static void en_dis_able_nic_intrs(struct val64 &= ~((u64) RMAC_LINK_STATE_CHANGE_INT); writeq(val64, &bar0->mac_rmac_err_mask); } else if (flag == DISABLE_INTRS) { - /* - * Disable MAC Intrs in the general intr mask register + /* + * Disable MAC Intrs in the general intr mask register */ writeq(DISABLE_ALL_INTRS, &bar0->mac_int_mask); writeq(DISABLE_ALL_INTRS, @@ -1151,14 +1160,14 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* + /* * All XGXS block error interrupts are disabled for now - * TODO + * TODO */ writeq(DISABLE_ALL_INTRS, &bar0->xgxs_int_mask); } else if (flag == DISABLE_INTRS) { - /* - * Disable MC Intrs in the general intr mask register + /* + * Disable MC Intrs in the general intr mask register */ writeq(DISABLE_ALL_INTRS, &bar0->xgxs_int_mask); temp64 = readq(&bar0->general_int_mask); @@ -1174,9 +1183,9 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* - * All MC block error interrupts are disabled for now - * TODO + /* + * All MC block error interrupts are disabled for now. + * TODO */ writeq(DISABLE_ALL_INTRS, &bar0->mc_int_mask); } else if (flag == DISABLE_INTRS) { @@ -1198,14 +1207,14 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* + /* * Enable all the Tx side interrupts - * writing 0 Enables all 64 TX interrupt levels + * writing 0 Enables all 64 TX interrupt levels */ writeq(0x0, &bar0->tx_traffic_mask); } else if (flag == DISABLE_INTRS) { - /* - * Disable Tx Traffic Intrs in the general intr mask + /* + * Disable Tx Traffic Intrs in the general intr mask * register. 
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->tx_traffic_mask);
@@ -1225,8 +1234,8 @@ static void en_dis_able_nic_intrs(struct
 			/* writing 0 Enables all 8 RX interrupt levels */
 			writeq(0x0, &bar0->rx_traffic_mask);
 		} else if (flag == DISABLE_INTRS) {
-			/*
-			 * Disable Rx Traffic Intrs in the general intr mask
+			/*
+			 * Disable Rx Traffic Intrs in the general intr mask
 			 * register.
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->rx_traffic_mask);
@@ -1237,20 +1246,42 @@ static void en_dis_able_nic_intrs(struct
 	}
 }
-/**
- * verify_xena_quiescence - Checks whether the H/W is ready
+static int check_prc_pcc_state(u64 val64, int flag)
+{
+	int ret = 0;
+
+	if (flag == FALSE) {
+		if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) &&
+		    ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) ==
+		     ADAPTER_STATUS_RC_PRC_QUIESCENT)) {
+			ret = 1;
+		}
+	} else {
+		if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) ==
+		     ADAPTER_STATUS_RMAC_PCC_IDLE) &&
+		    (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) ||
+		     ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) ==
+		      ADAPTER_STATUS_RC_PRC_QUIESCENT))) {
+			ret = 1;
+		}
+	}
+
+	return ret;
+}
+/**
+ * verify_xena_quiescence - Checks whether the H/W is ready
  * @val64 : Value read from adapter status register.
  * @flag : indicates if the adapter enable bit was ever written once
  * before.
  * Description: Returns whether the H/W is ready to go or not. Depending
- * on whether adapter enable bit was written or not the comparison
+ * on whether adapter enable bit was written or not the comparison
  * differs and the calling function passes the input argument flag to
  * indicate this.
- * Return: 1 If xena is quiescence + * Return: 1 If xena is quiescence * 0 If Xena is not quiescence */ -static int verify_xena_quiescence(u64 val64, int flag) +static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag) { int ret = 0; u64 tmp64 = ~((u64) val64); @@ -1262,25 +1293,7 @@ static int verify_xena_quiescence(u64 va ADAPTER_STATUS_PIC_QUIESCENT | ADAPTER_STATUS_MC_DRAM_READY | ADAPTER_STATUS_MC_QUEUES_READY | ADAPTER_STATUS_M_PLL_LOCK | ADAPTER_STATUS_P_PLL_LOCK))) { - if (flag == FALSE) { - if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT)) { - - ret = 1; - - } - } else { - if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == - ADAPTER_STATUS_RMAC_PCC_IDLE) && - (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT))) { - - ret = 1; - - } - } + ret = check_prc_pcc_state(val64, flag); } return ret; @@ -1289,12 +1302,12 @@ static int verify_xena_quiescence(u64 va /** * fix_mac_address - Fix for Mac addr problem on Alpha platforms * @sp: Pointer to device specifc structure - * Description : + * Description : * New procedure to clear mac address reading problems on Alpha platforms * */ -static void fix_mac_address(nic_t * sp) +void fix_mac_address(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64; @@ -1302,20 +1315,21 @@ static void fix_mac_address(nic_t * sp) while (fix_mac[i] != END_SIGN) { writeq(fix_mac[i++], &bar0->gpio_control); + udelay(10); val64 = readq(&bar0->gpio_control); } } /** - * start_nic - Turns the device on + * start_nic - Turns the device on * @nic : device private variable. - * Description: - * This function actually turns the device on. Before this function is - * called,all Registers are configured from their reset states - * and shared memory is allocated but the NIC is still quiescent. On + * Description: + * This function actually turns the device on. 
Before this function is + * called,all Registers are configured from their reset states + * and shared memory is allocated but the NIC is still quiescent. On * calling this function, the device interrupts are cleared and the NIC is * literally switched on by writing into the adapter control register. - * Return Value: + * Return Value: * SUCCESS on success and -1 on failure. */ @@ -1324,8 +1338,8 @@ static int start_nic(struct s2io_nic *ni XENA_dev_config_t __iomem *bar0 = nic->bar0; struct net_device *dev = nic->dev; register u64 val64 = 0; - u16 interruptible, i; - u16 subid; + u16 interruptible; + u16 subid, i; mac_info_t *mac_control; struct config_param *config; @@ -1334,7 +1348,7 @@ static int start_nic(struct s2io_nic *ni /* PRC Initialization and configuration */ for (i = 0; i < config->rx_ring_num; i++) { - writeq((u64) nic->rx_blocks[i][0].block_dma_addr, + writeq((u64) mac_control->rings[i].rx_blocks[0].block_dma_addr, &bar0->prc_rxd0_n[i]); val64 = readq(&bar0->prc_ctrl_n[i]); @@ -1353,7 +1367,7 @@ static int start_nic(struct s2io_nic *ni writeq(val64, &bar0->rx_pa_cfg); #endif - /* + /* * Enabling MC-RLDRAM. After enabling the device, we timeout * for around 100ms, which is approximately the time required * for the device to be ready for operation. @@ -1363,27 +1377,27 @@ static int start_nic(struct s2io_nic *ni SPECIAL_REG_WRITE(val64, &bar0->mc_rldram_mrs, UF); val64 = readq(&bar0->mc_rldram_mrs); - msleep(100); /* Delay by around 100 ms. */ + msleep(100); /* Delay by around 100 ms. */ /* Enabling ECC Protection. */ val64 = readq(&bar0->adapter_control); val64 &= ~ADAPTER_ECC_EN; writeq(val64, &bar0->adapter_control); - /* - * Clearing any possible Link state change interrupts that + /* + * Clearing any possible Link state change interrupts that * could have popped up just before Enabling the card. 
*/ val64 = readq(&bar0->mac_rmac_err_reg); if (val64) writeq(val64, &bar0->mac_rmac_err_reg); - /* - * Verify if the device is ready to be enabled, if so enable + /* + * Verify if the device is ready to be enabled, if so enable * it. */ val64 = readq(&bar0->adapter_status); - if (!verify_xena_quiescence(val64, nic->device_enabled_once)) { + if (!verify_xena_quiescence(nic, val64, nic->device_enabled_once)) { DBG_PRINT(ERR_DBG, "%s: device is not ready, ", dev->name); DBG_PRINT(ERR_DBG, "Adapter status reads: 0x%llx\n", (unsigned long long) val64); @@ -1395,12 +1409,12 @@ static int start_nic(struct s2io_nic *ni RX_MAC_INTR; en_dis_able_nic_intrs(nic, interruptible, ENABLE_INTRS); - /* + /* * With some switches, link might be already up at this point. - * Because of this weird behavior, when we enable laser, - * we may not get link. We need to handle this. We cannot - * figure out which switch is misbehaving. So we are forced to - * make a global change. + * Because of this weird behavior, when we enable laser, + * we may not get link. We need to handle this. We cannot + * figure out which switch is misbehaving. So we are forced to + * make a global change. */ /* Enabling Laser. */ @@ -1415,17 +1429,17 @@ static int start_nic(struct s2io_nic *ni val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); val64 = 0x0411040400000000ULL; - writeq(val64, (void __iomem *) bar0 + 0x2700); + writeq(val64, (void __iomem *) ((u8 *) bar0 + 0x2700)); } - /* - * Don't see link state interrupts on certain switches, so + /* + * Don't see link state interrupts on certain switches, so * directly scheduling a link state task from here. */ schedule_work(&nic->set_link_task); - /* - * Here we are performing soft reset on XGXS to + /* + * Here we are performing soft reset on XGXS to * force link down. 
Since link is already up, we will get * link state change interrupt after this reset */ @@ -1442,12 +1456,12 @@ static int start_nic(struct s2io_nic *ni return SUCCESS; } -/** - * free_tx_buffers - Free all queued Tx buffers +/** + * free_tx_buffers - Free all queued Tx buffers * @nic : device private variable. - * Description: + * Description: * Free all queued Tx buffers. - * Return Value: void + * Return Value: void */ static void free_tx_buffers(struct s2io_nic *nic) @@ -1465,7 +1479,7 @@ static void free_tx_buffers(struct s2io_ for (i = 0; i < config->tx_fifo_num; i++) { for (j = 0; j < config->tx_cfg[i].fifo_len - 1; j++) { - txdp = (TxD_t *) nic->list_info[i][j]. + txdp = (TxD_t *) mac_control->fifos[i].list_info[j]. list_virt_addr; skb = (struct sk_buff *) ((unsigned long) txdp-> @@ -1481,16 +1495,16 @@ static void free_tx_buffers(struct s2io_ DBG_PRINT(INTR_DBG, "%s:forcibly freeing %d skbs on FIFO%d\n", dev->name, cnt, i); - mac_control->tx_curr_get_info[i].offset = 0; - mac_control->tx_curr_put_info[i].offset = 0; + mac_control->fifos[i].tx_curr_get_info.offset = 0; + mac_control->fifos[i].tx_curr_put_info.offset = 0; } } -/** - * stop_nic - To stop the nic +/** + * stop_nic - To stop the nic * @nic ; device private variable. - * Description: - * This function does exactly the opposite of what the start_nic() + * Description: + * This function does exactly the opposite of what the start_nic() * function does. This function is called to stop the device. * Return Value: * void. @@ -1520,11 +1534,11 @@ static void stop_nic(struct s2io_nic *ni } } -/** - * fill_rx_buffers - Allocates the Rx side skbs +/** + * fill_rx_buffers - Allocates the Rx side skbs * @nic: device private variable - * @ring_no: ring number - * Description: + * @ring_no: ring number + * Description: * The function allocates Rx side skbs and puts the physical * address of these buffers into the RxD buffer pointers, so that the NIC * can DMA the received frame into these locations. 
@@ -1532,8 +1546,8 @@ static void stop_nic(struct s2io_nic *ni * 1. single buffer, * 2. three buffer and * 3. Five buffer modes. - * Each mode defines how many fragments the received frame will be split - * up into by the NIC. The frame is split into L3 header, L4 Header, + * Each mode defines how many fragments the received frame will be split + * up into by the NIC. The frame is split into L3 header, L4 Header, * L4 payload in three buffer mode and in 5 buffer mode, L4 payload itself * is split into 3 fragments. As of now only single buffer mode is * supported. @@ -1541,7 +1555,7 @@ static void stop_nic(struct s2io_nic *ni * SUCCESS on success or an appropriate -ve value on failure. */ -static int fill_rx_buffers(struct s2io_nic *nic, int ring_no) +int fill_rx_buffers(struct s2io_nic *nic, int ring_no) { struct net_device *dev = nic->dev; struct sk_buff *skb; @@ -1549,14 +1563,13 @@ static int fill_rx_buffers(struct s2io_n int off, off1, size, block_no, block_no1; int offset, offset1; u32 alloc_tab = 0; - u32 alloc_cnt = nic->pkt_cnt[ring_no] - - atomic_read(&nic->rx_bufs_left[ring_no]); + u32 alloc_cnt; mac_info_t *mac_control; struct config_param *config; #ifdef CONFIG_2BUFF_MODE RxD_t *rxdpnext; int nextblk; - unsigned long tmp; + u64 tmp; buffAdd_t *ba; dma_addr_t rxdpphys; #endif @@ -1566,17 +1579,18 @@ static int fill_rx_buffers(struct s2io_n mac_control = &nic->mac_control; config = &nic->config; - + alloc_cnt = mac_control->rings[ring_no].pkt_cnt - + atomic_read(&nic->rx_bufs_left[ring_no]); size = dev->mtu + HEADER_ETHERNET_II_802_3_SIZE + HEADER_802_2_SIZE + HEADER_SNAP_SIZE; while (alloc_tab < alloc_cnt) { - block_no = mac_control->rx_curr_put_info[ring_no]. + block_no = mac_control->rings[ring_no].rx_curr_put_info. block_index; - block_no1 = mac_control->rx_curr_get_info[ring_no]. + block_no1 = mac_control->rings[ring_no].rx_curr_get_info. 
block_index; - off = mac_control->rx_curr_put_info[ring_no].offset; - off1 = mac_control->rx_curr_get_info[ring_no].offset; + off = mac_control->rings[ring_no].rx_curr_put_info.offset; + off1 = mac_control->rings[ring_no].rx_curr_get_info.offset; #ifndef CONFIG_2BUFF_MODE offset = block_no * (MAX_RXDS_PER_BLOCK + 1) + off; offset1 = block_no1 * (MAX_RXDS_PER_BLOCK + 1) + off1; @@ -1585,7 +1599,7 @@ static int fill_rx_buffers(struct s2io_n offset1 = block_no1 * (MAX_RXDS_PER_BLOCK) + off1; #endif - rxdp = nic->rx_blocks[ring_no][block_no]. + rxdp = mac_control->rings[ring_no].rx_blocks[block_no]. block_virt_addr + off; if ((offset == offset1) && (rxdp->Host_Control)) { DBG_PRINT(INTR_DBG, "%s: Get and Put", dev->name); @@ -1594,15 +1608,15 @@ static int fill_rx_buffers(struct s2io_n } #ifndef CONFIG_2BUFF_MODE if (rxdp->Control_1 == END_OF_BLOCK) { - mac_control->rx_curr_put_info[ring_no]. + mac_control->rings[ring_no].rx_curr_put_info. block_index++; - mac_control->rx_curr_put_info[ring_no]. - block_index %= nic->block_count[ring_no]; - block_no = mac_control->rx_curr_put_info - [ring_no].block_index; + mac_control->rings[ring_no].rx_curr_put_info. + block_index %= mac_control->rings[ring_no].block_count; + block_no = mac_control->rings[ring_no].rx_curr_put_info. + block_index; off++; off %= (MAX_RXDS_PER_BLOCK + 1); - mac_control->rx_curr_put_info[ring_no].offset = + mac_control->rings[ring_no].rx_curr_put_info.offset = off; rxdp = (RxD_t *) ((unsigned long) rxdp->Control_2); DBG_PRINT(INTR_DBG, "%s: Next block at: %p\n", @@ -1610,30 +1624,30 @@ static int fill_rx_buffers(struct s2io_n } #ifndef CONFIG_S2IO_NAPI spin_lock_irqsave(&nic->put_lock, flags); - nic->put_pos[ring_no] = + mac_control->rings[ring_no].put_pos = (block_no * (MAX_RXDS_PER_BLOCK + 1)) + off; spin_unlock_irqrestore(&nic->put_lock, flags); #endif #else if (rxdp->Host_Control == END_OF_BLOCK) { - mac_control->rx_curr_put_info[ring_no]. + mac_control->rings[ring_no].rx_curr_put_info. 
block_index++; - mac_control->rx_curr_put_info[ring_no]. - block_index %= nic->block_count[ring_no]; - block_no = mac_control->rx_curr_put_info - [ring_no].block_index; + mac_control->rings[ring_no].rx_curr_put_info.block_index + %= mac_control->rings[ring_no].block_count; + block_no = mac_control->rings[ring_no].rx_curr_put_info + .block_index; off = 0; DBG_PRINT(INTR_DBG, "%s: block%d at: 0x%llx\n", dev->name, block_no, (unsigned long long) rxdp->Control_1); - mac_control->rx_curr_put_info[ring_no].offset = + mac_control->rings[ring_no].rx_curr_put_info.offset = off; - rxdp = nic->rx_blocks[ring_no][block_no]. + rxdp = mac_control->rings[ring_no].rx_blocks[block_no]. block_virt_addr; } #ifndef CONFIG_S2IO_NAPI spin_lock_irqsave(&nic->put_lock, flags); - nic->put_pos[ring_no] = (block_no * + mac_control->rings[ring_no].put_pos = (block_no * (MAX_RXDS_PER_BLOCK + 1)) + off; spin_unlock_irqrestore(&nic->put_lock, flags); #endif @@ -1645,27 +1659,27 @@ static int fill_rx_buffers(struct s2io_n if (rxdp->Control_2 & BIT(0)) #endif { - mac_control->rx_curr_put_info[ring_no]. + mac_control->rings[ring_no].rx_curr_put_info. offset = off; goto end; } #ifdef CONFIG_2BUFF_MODE - /* - * RxDs Spanning cache lines will be replenished only - * if the succeeding RxD is also owned by Host. It - * will always be the ((8*i)+3) and ((8*i)+6) - * descriptors for the 48 byte descriptor. The offending + /* + * RxDs Spanning cache lines will be replenished only + * if the succeeding RxD is also owned by Host. It + * will always be the ((8*i)+3) and ((8*i)+6) + * descriptors for the 48 byte descriptor. The offending * decsriptor is of-course the 3rd descriptor. */ - rxdpphys = nic->rx_blocks[ring_no][block_no]. + rxdpphys = mac_control->rings[ring_no].rx_blocks[block_no]. block_dma_addr + (off * sizeof(RxD_t)); if (((u64) (rxdpphys)) % 128 > 80) { - rxdpnext = nic->rx_blocks[ring_no][block_no]. + rxdpnext = mac_control->rings[ring_no].rx_blocks[block_no]. 
block_virt_addr + (off + 1); if (rxdpnext->Host_Control == END_OF_BLOCK) { nextblk = (block_no + 1) % - (nic->block_count[ring_no]); - rxdpnext = nic->rx_blocks[ring_no] + (mac_control->rings[ring_no].block_count); + rxdpnext = mac_control->rings[ring_no].rx_blocks [nextblk].block_virt_addr; } if (rxdpnext->Control_2 & BIT(0)) @@ -1694,11 +1708,11 @@ static int fill_rx_buffers(struct s2io_n rxdp->Control_1 |= RXD_OWN_XENA; off++; off %= (MAX_RXDS_PER_BLOCK + 1); - mac_control->rx_curr_put_info[ring_no].offset = off; + mac_control->rings[ring_no].rx_curr_put_info.offset = off; #else - ba = &nic->ba[ring_no][block_no][off]; + ba = &mac_control->rings[ring_no].ba[block_no][off]; skb_reserve(skb, BUF0_LEN); - tmp = (unsigned long) skb->data; + tmp = (u64) skb->data; tmp += ALIGN_SIZE; tmp &= ~ALIGN_SIZE; skb->data = (void *) tmp; @@ -1722,8 +1736,9 @@ static int fill_rx_buffers(struct s2io_n rxdp->Host_Control = (u64) ((unsigned long) (skb)); rxdp->Control_1 |= RXD_OWN_XENA; off++; - mac_control->rx_curr_put_info[ring_no].offset = off; + mac_control->rings[ring_no].rx_curr_put_info.offset = off; #endif + atomic_inc(&nic->rx_bufs_left[ring_no]); alloc_tab++; } @@ -1733,9 +1748,9 @@ static int fill_rx_buffers(struct s2io_n } /** - * free_rx_buffers - Frees all Rx buffers + * free_rx_buffers - Frees all Rx buffers * @sp: device private variable. - * Description: + * Description: * This function will free all Rx buffers allocated by host. * Return Value: * NONE. @@ -1759,7 +1774,8 @@ static void free_rx_buffers(struct s2io_ for (i = 0; i < config->rx_ring_num; i++) { for (j = 0, blk = 0; j < config->rx_cfg[i].num_rxd; j++) { off = j % (MAX_RXDS_PER_BLOCK + 1); - rxdp = sp->rx_blocks[i][blk].block_virt_addr + off; + rxdp = mac_control->rings[i].rx_blocks[blk]. 
+ block_virt_addr + off; #ifndef CONFIG_2BUFF_MODE if (rxdp->Control_1 == END_OF_BLOCK) { @@ -1794,7 +1810,7 @@ static void free_rx_buffers(struct s2io_ HEADER_SNAP_SIZE, PCI_DMA_FROMDEVICE); #else - ba = &sp->ba[i][blk][off]; + ba = &mac_control->rings[i].ba[blk][off]; pci_unmap_single(sp->pdev, (dma_addr_t) rxdp->Buffer0_ptr, BUF0_LEN, @@ -1814,10 +1830,10 @@ static void free_rx_buffers(struct s2io_ } memset(rxdp, 0, sizeof(RxD_t)); } - mac_control->rx_curr_put_info[i].block_index = 0; - mac_control->rx_curr_get_info[i].block_index = 0; - mac_control->rx_curr_put_info[i].offset = 0; - mac_control->rx_curr_get_info[i].offset = 0; + mac_control->rings[i].rx_curr_put_info.block_index = 0; + mac_control->rings[i].rx_curr_get_info.block_index = 0; + mac_control->rings[i].rx_curr_put_info.offset = 0; + mac_control->rings[i].rx_curr_get_info.offset = 0; atomic_set(&sp->rx_bufs_left[i], 0); DBG_PRINT(INIT_DBG, "%s:Freed 0x%x Rx Buffers on ring%d\n", dev->name, buf_cnt, i); @@ -1827,7 +1843,7 @@ static void free_rx_buffers(struct s2io_ /** * s2io_poll - Rx interrupt handler for NAPI support * @dev : pointer to the device structure. - * @budget : The number of packets that were budgeted to be processed + * @budget : The number of packets that were budgeted to be processed * during one pass through the 'Poll" function. * Description: * Comes into picture only if NAPI support has been incorporated. It does @@ -1837,160 +1853,35 @@ static void free_rx_buffers(struct s2io_ * 0 on success and 1 if there are No Rx packets to be processed. 
*/ -#ifdef CONFIG_S2IO_NAPI +#if defined(CONFIG_S2IO_NAPI) static int s2io_poll(struct net_device *dev, int *budget) { nic_t *nic = dev->priv; - XENA_dev_config_t __iomem *bar0 = nic->bar0; - int pkts_to_process = *budget, pkt_cnt = 0; - register u64 val64 = 0; - rx_curr_get_info_t get_info, put_info; - int i, get_block, put_block, get_offset, put_offset, ring_bufs; -#ifndef CONFIG_2BUFF_MODE - u16 val16, cksum; -#endif - struct sk_buff *skb; - RxD_t *rxdp; + int pkt_cnt = 0, org_pkts_to_process; mac_info_t *mac_control; struct config_param *config; -#ifdef CONFIG_2BUFF_MODE - buffAdd_t *ba; -#endif + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + u64 val64; + int i; mac_control = &nic->mac_control; config = &nic->config; - if (pkts_to_process > dev->quota) - pkts_to_process = dev->quota; + nic->pkts_to_process = *budget; + if (nic->pkts_to_process > dev->quota) + nic->pkts_to_process = dev->quota; + org_pkts_to_process = nic->pkts_to_process; val64 = readq(&bar0->rx_traffic_int); writeq(val64, &bar0->rx_traffic_int); for (i = 0; i < config->rx_ring_num; i++) { - get_info = mac_control->rx_curr_get_info[i]; - get_block = get_info.block_index; - put_info = mac_control->rx_curr_put_info[i]; - put_block = put_info.block_index; - ring_bufs = config->rx_cfg[i].num_rxd; - rxdp = nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; -#ifndef CONFIG_2BUFF_MODE - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + - put_info.offset; - while ((!(rxdp->Control_1 & RXD_OWN_XENA)) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - if (--pkts_to_process < 0) { - goto no_rx; - } - if (rxdp->Control_1 == END_OF_BLOCK) { - rxdp = - (RxD_t *) ((unsigned long) rxdp-> - Control_2); - get_info.offset++; - get_info.offset %= - (MAX_RXDS_PER_BLOCK + 1); - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. 
- offset = get_info.offset; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - continue; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - skb = - (struct sk_buff *) ((unsigned long) rxdp-> - Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - goto no_rx; - } - val64 = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); - val16 = (u16) (val64 >> 48); - cksum = RXD_GET_L4_CKSUM(rxdp->Control_1); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - dev->mtu + - HEADER_ETHERNET_II_802_3_SIZE + - HEADER_802_2_SIZE + - HEADER_SNAP_SIZE, - PCI_DMA_FROMDEVICE); - rx_osm_handler(nic, val16, rxdp, i); - pkt_cnt++; - get_info.offset++; - get_info.offset %= (MAX_RXDS_PER_BLOCK + 1); - rxdp = - nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - } -#else - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + - put_info.offset; - while (((!(rxdp->Control_1 & RXD_OWN_XENA)) && - !(rxdp->Control_2 & BIT(0))) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - if (--pkts_to_process < 0) { - goto no_rx; - } - skb = (struct sk_buff *) ((unsigned long) - rxdp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - goto no_rx; - } - - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - BUF0_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer1_ptr, - BUF1_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer2_ptr, - dev->mtu + BUF0_LEN + 4, - PCI_DMA_FROMDEVICE); - ba = &nic->ba[i][get_block][get_info.offset]; - - rx_osm_handler(nic, rxdp, i, ba); - - get_info.offset++; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - rxdp = - 
nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; - - if (get_info.offset && - (!(get_info.offset % MAX_RXDS_PER_BLOCK))) { - get_info.offset = 0; - mac_control->rx_curr_get_info[i]. - offset = get_info.offset; - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - rxdp = - nic->rx_blocks[i][get_block]. - block_virt_addr; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - pkt_cnt++; + rx_intr_handler(&mac_control->rings[i]); + pkt_cnt = org_pkts_to_process - nic->pkts_to_process; + if (!nic->pkts_to_process) { + /* Quota for the current iteration has been met */ + goto no_rx; } -#endif } if (!pkt_cnt) pkt_cnt = 1; @@ -2010,7 +1901,7 @@ static int s2io_poll(struct net_device * en_dis_able_nic_intrs(nic, RX_TRAFFIC_INTR, ENABLE_INTRS); return 0; - no_rx: +no_rx: dev->quota -= pkt_cnt; *budget -= pkt_cnt; @@ -2023,277 +1914,213 @@ static int s2io_poll(struct net_device * } return 1; } -#else -/** +#endif + +/** * rx_intr_handler - Rx interrupt handler * @nic: device private variable. - * Description: - * If the interrupt is because of a received frame or if the + * Description: + * If the interrupt is because of a received frame or if the * receive ring contains fresh as yet un-processed frames,this function is - * called. It picks out the RxD at which place the last Rx processing had - * stopped and sends the skb to the OSM's Rx handler and then increments + * called. It picks out the RxD at which place the last Rx processing had + * stopped and sends the skb to the OSM's Rx handler and then increments * the offset. * Return Value: * NONE. 
*/ - -static void rx_intr_handler(struct s2io_nic *nic) +static void rx_intr_handler(ring_info_t *ring_data) { + nic_t *nic = ring_data->nic; struct net_device *dev = (struct net_device *) nic->dev; - XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + XENA_dev_config_t __iomem *bar0 = nic->bar0; + int get_block, get_offset, put_block, put_offset, ring_bufs; rx_curr_get_info_t get_info, put_info; RxD_t *rxdp; struct sk_buff *skb; -#ifndef CONFIG_2BUFF_MODE - u16 val16, cksum; -#endif - register u64 val64 = 0; - int get_block, get_offset, put_block, put_offset, ring_bufs; - int i, pkt_cnt = 0; - mac_info_t *mac_control; - struct config_param *config; -#ifdef CONFIG_2BUFF_MODE - buffAdd_t *ba; +#ifndef CONFIG_S2IO_NAPI + int pkt_cnt = 0; #endif + register u64 val64; - mac_control = &nic->mac_control; - config = &nic->config; - - /* - * rx_traffic_int reg is an R1 register, hence we read and write back - * the samevalue in the register to clear it. + /* + * rx_traffic_int reg is an R1 register, hence we read and write + * back the same value in the register to clear it */ - val64 = readq(&bar0->rx_traffic_int); - writeq(val64, &bar0->rx_traffic_int); + val64 = readq(&bar0->rx_traffic_int); + writeq(val64, &bar0->rx_traffic_int); - for (i = 0; i < config->rx_ring_num; i++) { - get_info = mac_control->rx_curr_get_info[i]; - get_block = get_info.block_index; - put_info = mac_control->rx_curr_put_info[i]; - put_block = put_info.block_index; - ring_bufs = config->rx_cfg[i].num_rxd; - rxdp = nic->rx_blocks[i][get_block].block_virt_addr + + get_info = ring_data->rx_curr_get_info; + get_block = get_info.block_index; + put_info = ring_data->rx_curr_put_info; + put_block = put_info.block_index; + ring_bufs = get_info.ring_len+1; + rxdp = ring_data->rx_blocks[get_block].block_virt_addr + get_info.offset; -#ifndef CONFIG_2BUFF_MODE - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - spin_lock(&nic->put_lock); - put_offset = nic->put_pos[i]; - 
spin_unlock(&nic->put_lock); - while ((!(rxdp->Control_1 & RXD_OWN_XENA)) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - if (rxdp->Control_1 == END_OF_BLOCK) { - rxdp = (RxD_t *) ((unsigned long) - rxdp->Control_2); - get_info.offset++; - get_info.offset %= - (MAX_RXDS_PER_BLOCK + 1); - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. - offset = get_info.offset; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - continue; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - skb = (struct sk_buff *) ((unsigned long) - rxdp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - return; - } - val64 = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); - val16 = (u16) (val64 >> 48); - cksum = RXD_GET_L4_CKSUM(rxdp->Control_1); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - dev->mtu + - HEADER_ETHERNET_II_802_3_SIZE + - HEADER_802_2_SIZE + - HEADER_SNAP_SIZE, - PCI_DMA_FROMDEVICE); - rx_osm_handler(nic, val16, rxdp, i); - get_info.offset++; - get_info.offset %= (MAX_RXDS_PER_BLOCK + 1); - rxdp = - nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - pkt_cnt++; - if ((indicate_max_pkts) - && (pkt_cnt > indicate_max_pkts)) - break; + get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + + get_info.offset; +#ifndef CONFIG_S2IO_NAPI + spin_lock(&nic->put_lock); + put_offset = ring_data->put_pos; + spin_unlock(&nic->put_lock); +#else + put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + + put_info.offset; +#endif + while ((!(rxdp->Control_1 & RXD_OWN_XENA)) && +#ifdef CONFIG_2BUFF_MODE + (!(rxdp->Control_2 & BIT(0))) && +#endif + (((get_offset + 1) % ring_bufs) != put_offset)) { + skb = (struct sk_buff *) ((unsigned long)rxdp->Host_Control); + if (skb == NULL) { + DBG_PRINT(ERR_DBG, "%s: The skb is ", + dev->name); + 
DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); + return; } +#ifndef CONFIG_2BUFF_MODE + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer0_ptr, + dev->mtu + + HEADER_ETHERNET_II_802_3_SIZE + + HEADER_802_2_SIZE + + HEADER_SNAP_SIZE, + PCI_DMA_FROMDEVICE); #else - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer0_ptr, + BUF0_LEN, PCI_DMA_FROMDEVICE); + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer1_ptr, + BUF1_LEN, PCI_DMA_FROMDEVICE); + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer2_ptr, + dev->mtu + BUF0_LEN + 4, + PCI_DMA_FROMDEVICE); +#endif + rx_osm_handler(ring_data, rxdp); + get_info.offset++; + ring_data->rx_curr_get_info.offset = get_info.offset; - spin_lock(&nic->put_lock); - put_offset = nic->put_pos[i]; - spin_unlock(&nic->put_lock); - while (((!(rxdp->Control_1 & RXD_OWN_XENA)) && - !(rxdp->Control_2 & BIT(0))) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - skb = (struct sk_buff *) ((unsigned long) - rxdp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - return; - } - - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - BUF0_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer1_ptr, - BUF1_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer2_ptr, - dev->mtu + BUF0_LEN + 4, - PCI_DMA_FROMDEVICE); - ba = &nic->ba[i][get_block][get_info.offset]; - - rx_osm_handler(nic, rxdp, i, ba); - - get_info.offset++; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - rxdp = - nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; + rxdp = ring_data->rx_blocks[get_block].block_virt_addr + + get_info.offset; + if (get_info.offset && + (!(get_info.offset % MAX_RXDS_PER_BLOCK))) { + get_info.offset = 0; + ring_data->rx_curr_get_info.offset + = get_info.offset; + get_block++; + get_block %= 
ring_data->block_count; + ring_data->rx_curr_get_info.block_index + = get_block; + rxdp = ring_data->rx_blocks[get_block].block_virt_addr; + } - if (get_info.offset && - (!(get_info.offset % MAX_RXDS_PER_BLOCK))) { - get_info.offset = 0; - mac_control->rx_curr_get_info[i]. - offset = get_info.offset; - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - rxdp = - nic->rx_blocks[i][get_block]. - block_virt_addr; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + + get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + get_info.offset; - pkt_cnt++; - if ((indicate_max_pkts) - && (pkt_cnt > indicate_max_pkts)) - break; - } -#endif +#ifdef CONFIG_S2IO_NAPI + nic->pkts_to_process -= 1; + if (!nic->pkts_to_process) + break; +#else + pkt_cnt++; if ((indicate_max_pkts) && (pkt_cnt > indicate_max_pkts)) break; +#endif } } -#endif -/** + +/** * tx_intr_handler - Transmit interrupt handler * @nic : device private variable - * Description: - * If an interrupt was raised to indicate DMA complete of the - * Tx packet, this function is called. It identifies the last TxD - * whose buffer was freed and frees all skbs whose data have already + * Description: + * If an interrupt was raised to indicate DMA complete of the + * Tx packet, this function is called. It identifies the last TxD + * whose buffer was freed and frees all skbs whose data have already * DMA'ed into the NICs internal memory. 
* Return Value: * NONE */ -static void tx_intr_handler(struct s2io_nic *nic) +static void tx_intr_handler(fifo_info_t *fifo_data) { + nic_t *nic = fifo_data->nic; XENA_dev_config_t __iomem *bar0 = nic->bar0; struct net_device *dev = (struct net_device *) nic->dev; tx_curr_get_info_t get_info, put_info; struct sk_buff *skb; TxD_t *txdlp; - register u64 val64 = 0; - int i; u16 j, frg_cnt; - mac_info_t *mac_control; - struct config_param *config; - - mac_control = &nic->mac_control; - config = &nic->config; + register u64 val64 = 0; - /* - * tx_traffic_int reg is an R1 register, hence we read and write - * back the samevalue in the register to clear it. + /* + * tx_traffic_int reg is an R1 register, hence we read and write + * back the same value in the register to clear it */ val64 = readq(&bar0->tx_traffic_int); writeq(val64, &bar0->tx_traffic_int); - for (i = 0; i < config->tx_fifo_num; i++) { - get_info = mac_control->tx_curr_get_info[i]; - put_info = mac_control->tx_curr_put_info[i]; - txdlp = (TxD_t *) nic->list_info[i][get_info.offset]. - list_virt_addr; - while ((!(txdlp->Control_1 & TXD_LIST_OWN_XENA)) && - (get_info.offset != put_info.offset) && - (txdlp->Host_Control)) { - /* Check for TxD errors */ - if (txdlp->Control_1 & TXD_T_CODE) { - unsigned long long err; - err = txdlp->Control_1 & TXD_T_CODE; - DBG_PRINT(ERR_DBG, "***TxD error %llx\n", - err); - } - - skb = (struct sk_buff *) ((unsigned long) - txdlp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: Null skb ", - dev->name); - DBG_PRINT(ERR_DBG, "in Tx Free Intr\n"); - return; - } - nic->tx_pkt_count++; + get_info = fifo_data->tx_curr_get_info; + put_info = fifo_data->tx_curr_put_info; + txdlp = (TxD_t *) fifo_data->list_info[get_info.offset]. 
+ list_virt_addr; + while ((!(txdlp->Control_1 & TXD_LIST_OWN_XENA)) && + (get_info.offset != put_info.offset) && + (txdlp->Host_Control)) { + /* Check for TxD errors */ + if (txdlp->Control_1 & TXD_T_CODE) { + unsigned long long err; + err = txdlp->Control_1 & TXD_T_CODE; + DBG_PRINT(ERR_DBG, "***TxD error %llx\n", + err); + } + + skb = (struct sk_buff *) ((unsigned long) + txdlp->Host_Control); + if (skb == NULL) { + DBG_PRINT(ERR_DBG, "%s: Null skb ", + __FUNCTION__); + DBG_PRINT(ERR_DBG, "in Tx Free Intr\n"); + return; + } - frg_cnt = skb_shinfo(skb)->nr_frags; + frg_cnt = skb_shinfo(skb)->nr_frags; + nic->tx_pkt_count++; - /* For unfragmented skb */ - pci_unmap_single(nic->pdev, (dma_addr_t) - txdlp->Buffer_Pointer, - skb->len - skb->data_len, - PCI_DMA_TODEVICE); - if (frg_cnt) { - TxD_t *temp = txdlp; - txdlp++; - for (j = 0; j < frg_cnt; j++, txdlp++) { - skb_frag_t *frag = - &skb_shinfo(skb)->frags[j]; - pci_unmap_page(nic->pdev, - (dma_addr_t) - txdlp-> - Buffer_Pointer, - frag->size, - PCI_DMA_TODEVICE); - } - txdlp = temp; + pci_unmap_single(nic->pdev, (dma_addr_t) + txdlp->Buffer_Pointer, + skb->len - skb->data_len, + PCI_DMA_TODEVICE); + if (frg_cnt) { + TxD_t *temp; + temp = txdlp; + txdlp++; + for (j = 0; j < frg_cnt; j++, txdlp++) { + skb_frag_t *frag = + &skb_shinfo(skb)->frags[j]; + pci_unmap_page(nic->pdev, + (dma_addr_t) + txdlp-> + Buffer_Pointer, + frag->size, + PCI_DMA_TODEVICE); } - memset(txdlp, 0, - (sizeof(TxD_t) * config->max_txds)); - - /* Updating the statistics block */ - nic->stats.tx_packets++; - nic->stats.tx_bytes += skb->len; - dev_kfree_skb_irq(skb); - - get_info.offset++; - get_info.offset %= get_info.fifo_len + 1; - txdlp = (TxD_t *) nic->list_info[i] - [get_info.offset].list_virt_addr; - mac_control->tx_curr_get_info[i].offset = - get_info.offset; + txdlp = temp; } + memset(txdlp, 0, + (sizeof(TxD_t) * fifo_data->max_txds)); + + /* Updating the statistics block */ + nic->stats.tx_packets++; + nic->stats.tx_bytes += skb->len; 
+ dev_kfree_skb_irq(skb); + + get_info.offset++; + get_info.offset %= get_info.fifo_len + 1; + txdlp = (TxD_t *) fifo_data->list_info + [get_info.offset].list_virt_addr; + fifo_data->tx_curr_get_info.offset = + get_info.offset; } spin_lock(&nic->tx_lock); @@ -2302,13 +2129,13 @@ static void tx_intr_handler(struct s2io_ spin_unlock(&nic->tx_lock); } -/** +/** * alarm_intr_handler - Alarm Interrrupt handler * @nic: device private variable - * Description: If the interrupt was neither because of Rx packet or Tx + * Description: If the interrupt was neither because of Rx packet or Tx * complete, this function is called. If the interrupt was to indicate - * a loss of link, the OSM link status handler is invoked for any other - * alarm interrupt the block that raised the interrupt is displayed + * a loss of link, the OSM link status handler is invoked for any other + * alarm interrupt the block that raised the interrupt is displayed * and a H/W reset is issued. * Return Value: * NONE @@ -2339,7 +2166,7 @@ static void alarm_intr_handler(struct s2 /* * Also as mentioned in the latest Errata sheets if the PCC_FB_ECC * Error occurs, the adapter will be recycled by disabling the - * adapter enable bit and enabling it again after the device + * adapter enable bit and enabling it again after the device * becomes Quiescent. */ val64 = readq(&bar0->pcc_err_reg); @@ -2355,18 +2182,18 @@ static void alarm_intr_handler(struct s2 /* Other type of interrupts are not being handled now, TODO */ } -/** +/** * wait_for_cmd_complete - waits for a command to complete. - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * Description: Function that waits for a command to Write into RMAC - * ADDR DATA registers to be completed and returns either success or - * error depending on whether the command was complete or not. 
+ * Description: Function that waits for a command to Write into RMAC + * ADDR DATA registers to be completed and returns either success or + * error depending on whether the command was complete or not. * Return value: * SUCCESS on success and FAILURE on failure. */ -static int wait_for_cmd_complete(nic_t * sp) +int wait_for_cmd_complete(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; int ret = FAILURE, cnt = 0; @@ -2386,17 +2213,17 @@ static int wait_for_cmd_complete(nic_t * return ret; } -/** - * s2io_reset - Resets the card. +/** + * s2io_reset - Resets the card. * @sp : private member of the device structure. * Description: Function to Reset the card. This function then also - * restores the previously saved PCI configuration space registers as + * restores the previously saved PCI configuration space registers as * the card reset also resets the configuration space. * Return value: * void. */ -static void s2io_reset(nic_t * sp) +void s2io_reset(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64; @@ -2405,10 +2232,10 @@ static void s2io_reset(nic_t * sp) val64 = SW_RESET_ALL; writeq(val64, &bar0->sw_reset); - /* - * At this stage, if the PCI write is indeed completed, the - * card is reset and so is the PCI Config space of the device. - * So a read cannot be issued at this stage on any of the + /* + * At this stage, if the PCI write is indeed completed, the + * card is reset and so is the PCI Config space of the device. + * So a read cannot be issued at this stage on any of the * registers to ensure the write into "sw_reset" register * has gone through. * Question: Is there any system call that will explicitly force @@ -2421,10 +2248,17 @@ static void s2io_reset(nic_t * sp) /* Restore the PCI state saved during initializarion. 
*/ pci_restore_state(sp->pdev); + s2io_init_pci(sp); msleep(250); + /* Set swapper to enable I/O register access */ + s2io_set_swapper(sp); + + /* Reset device statistics maintained by OS */ + memset(&sp->stats, 0, sizeof (struct net_device_stats)); + /* SXE-002: Configure link and activity LED to turn it off */ subid = sp->pdev->subsystem_device; if ((subid & 0xFF) >= 0x07) { @@ -2432,29 +2266,29 @@ static void s2io_reset(nic_t * sp) val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); val64 = 0x0411040400000000ULL; - writeq(val64, (void __iomem *) bar0 + 0x2700); + writeq(val64, (void __iomem *) ((u8 *) bar0 + 0x2700)); } sp->device_enabled_once = FALSE; } /** - * s2io_set_swapper - to set the swapper controle on the card - * @sp : private member of the device structure, + * s2io_set_swapper - to set the swapper controle on the card + * @sp : private member of the device structure, * pointer to the s2io_nic structure. - * Description: Function to set the swapper control on the card + * Description: Function to set the swapper control on the card * correctly depending on the 'endianness' of the system. * Return value: * SUCCESS on success and FAILURE on failure. */ -static int s2io_set_swapper(nic_t * sp) +int s2io_set_swapper(nic_t * sp) { struct net_device *dev = sp->dev; XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64, valt, valr; - /* + /* * Set proper endian settings and verify the same by reading * the PIF Feed-back register. 
*/ @@ -2506,8 +2340,9 @@ static int s2io_set_swapper(nic_t * sp) i++; } if(i == 4) { + unsigned long long x = val64; DBG_PRINT(ERR_DBG, "Write failed, Xmsi_addr "); - DBG_PRINT(ERR_DBG, "reads:0x%llx\n",val64); + DBG_PRINT(ERR_DBG, "reads:0x%llx\n", x); return FAILURE; } } @@ -2515,8 +2350,8 @@ static int s2io_set_swapper(nic_t * sp) val64 &= 0xFFFF000000000000ULL; #ifdef __BIG_ENDIAN - /* - * The device by default set to a big endian format, so a + /* + * The device by default set to a big endian format, so a * big endian driver need not set anything. */ val64 |= (SWAPPER_CTRL_TXP_FE | @@ -2532,9 +2367,9 @@ static int s2io_set_swapper(nic_t * sp) SWAPPER_CTRL_STATS_FE | SWAPPER_CTRL_STATS_SE); writeq(val64, &bar0->swapper_ctrl); #else - /* + /* * Initially we enable all bits to make it accessible by the - * driver, then we selectively enable only those bits that + * driver, then we selectively enable only those bits that * we want to set. */ val64 |= (SWAPPER_CTRL_TXP_FE | @@ -2556,8 +2391,8 @@ static int s2io_set_swapper(nic_t * sp) #endif val64 = readq(&bar0->swapper_ctrl); - /* - * Verifying if endian settings are accurate by reading a + /* + * Verifying if endian settings are accurate by reading a * feedback register. */ val64 = readq(&bar0->pif_rd_swapper_fb); @@ -2577,25 +2412,25 @@ static int s2io_set_swapper(nic_t * sp) * Functions defined below concern the OS part of the driver * * ********************************************************* */ -/** +/** * s2io_open - open entry point of the driver * @dev : pointer to the device structure. * Description: * This function is the open entry point of the driver. It mainly calls a * function to allocate Rx buffers and inserts them into the buffer - * descriptors and then enables the Rx part of the NIC. + * descriptors and then enables the Rx part of the NIC. * Return value: * 0 on success and an appropriate (-)ve integer as defined in errno.h * file on failure. 
*/ -static int s2io_open(struct net_device *dev) +int s2io_open(struct net_device *dev) { nic_t *sp = dev->priv; int err = 0; - /* - * Make sure you have link off by default every time + /* + * Make sure you have link off by default every time * Nic is initialized */ netif_carrier_off(dev); @@ -2605,27 +2440,34 @@ static int s2io_open(struct net_device * if (s2io_card_up(sp)) { DBG_PRINT(ERR_DBG, "%s: H/W initialization failed\n", dev->name); - return -ENODEV; + err = -ENODEV; + goto hw_init_failed; } /* After proper initialization of H/W, register ISR */ - err = request_irq((int) sp->irq, s2io_isr, SA_SHIRQ, + err = request_irq((int) sp->pdev->irq, s2io_isr, SA_SHIRQ, sp->name, dev); if (err) { - s2io_reset(sp); DBG_PRINT(ERR_DBG, "%s: ISR registration failed\n", dev->name); - return err; + goto isr_registration_failed; } if (s2io_set_mac_addr(dev, dev->dev_addr) == FAILURE) { DBG_PRINT(ERR_DBG, "Set Mac Address Failed\n"); - s2io_reset(sp); - return -ENODEV; + err = -ENODEV; + goto setting_mac_address_failed; } netif_start_queue(dev); return 0; + +setting_mac_address_failed: + free_irq(sp->pdev->irq, dev); +isr_registration_failed: + s2io_reset(sp); +hw_init_failed: + return err; } /** @@ -2641,16 +2483,15 @@ static int s2io_open(struct net_device * * file on failure. */ -static int s2io_close(struct net_device *dev) +int s2io_close(struct net_device *dev) { nic_t *sp = dev->priv; - flush_scheduled_work(); netif_stop_queue(dev); /* Reset card, kill tasklet and free Tx and Rx buffers. */ s2io_card_down(sp); - free_irq(dev->irq, dev); + free_irq(sp->pdev->irq, dev); sp->device_close_flag = TRUE; /* Device is shut down. */ return 0; } @@ -2668,7 +2509,7 @@ static int s2io_close(struct net_device * 0 on success & 1 on failure. 
*/ -static int s2io_xmit(struct sk_buff *skb, struct net_device *dev) +int s2io_xmit(struct sk_buff *skb, struct net_device *dev) { nic_t *sp = dev->priv; u16 frg_cnt, frg_len, i, queue, queue_len, put_off, get_off; @@ -2686,22 +2527,24 @@ static int s2io_xmit(struct sk_buff *skb mac_control = &sp->mac_control; config = &sp->config; - DBG_PRINT(TX_DBG, "%s: In S2IO Tx routine\n", dev->name); + DBG_PRINT(TX_DBG, "%s: In Neterion Tx routine\n", dev->name); spin_lock_irqsave(&sp->tx_lock, flags); - if (atomic_read(&sp->card_state) == CARD_DOWN) { - DBG_PRINT(ERR_DBG, "%s: Card going down for reset\n", + DBG_PRINT(TX_DBG, "%s: Card going down for reset\n", dev->name); spin_unlock_irqrestore(&sp->tx_lock, flags); - return 1; + dev_kfree_skb(skb); + return 0; } queue = 0; - put_off = (u16) mac_control->tx_curr_put_info[queue].offset; - get_off = (u16) mac_control->tx_curr_get_info[queue].offset; - txdp = (TxD_t *) sp->list_info[queue][put_off].list_virt_addr; - queue_len = mac_control->tx_curr_put_info[queue].fifo_len + 1; + put_off = (u16) mac_control->fifos[queue].tx_curr_put_info.offset; + get_off = (u16) mac_control->fifos[queue].tx_curr_get_info.offset; + txdp = (TxD_t *) mac_control->fifos[queue].list_info[put_off]. 
+ list_virt_addr; + + queue_len = mac_control->fifos[queue].tx_curr_put_info.fifo_len + 1; /* Avoid "put" pointer going beyond "get" pointer */ if (txdp->Host_Control || (((put_off + 1) % queue_len) == get_off)) { DBG_PRINT(ERR_DBG, "Error in xmit, No free TXDs.\n"); @@ -2721,9 +2564,9 @@ static int s2io_xmit(struct sk_buff *skb frg_cnt = skb_shinfo(skb)->nr_frags; frg_len = skb->len - skb->data_len; - txdp->Host_Control = (unsigned long) skb; txdp->Buffer_Pointer = pci_map_single (sp->pdev, skb->data, frg_len, PCI_DMA_TODEVICE); + txdp->Host_Control = (unsigned long) skb; if (skb->ip_summed == CHECKSUM_HW) { txdp->Control_2 |= (TXD_TX_CKO_IPV4_EN | TXD_TX_CKO_TCP_EN | @@ -2748,11 +2591,12 @@ static int s2io_xmit(struct sk_buff *skb txdp->Control_1 |= TXD_GATHER_CODE_LAST; tx_fifo = mac_control->tx_FIFO_start[queue]; - val64 = sp->list_info[queue][put_off].list_phy_addr; + val64 = mac_control->fifos[queue].list_info[put_off].list_phy_addr; writeq(val64, &tx_fifo->TxDL_Pointer); val64 = (TX_FIFO_LAST_TXD_NUM(frg_cnt) | TX_FIFO_FIRST_LIST | TX_FIFO_LAST_LIST); + #ifdef NETIF_F_TSO if (mss) val64 |= TX_FIFO_SPECIAL_FUNC; @@ -2763,8 +2607,8 @@ static int s2io_xmit(struct sk_buff *skb val64 = readq(&bar0->general_int_status); put_off++; - put_off %= mac_control->tx_curr_put_info[queue].fifo_len + 1; - mac_control->tx_curr_put_info[queue].offset = put_off; + put_off %= mac_control->fifos[queue].tx_curr_put_info.fifo_len + 1; + mac_control->fifos[queue].tx_curr_put_info.offset = put_off; /* Avoid "put" pointer going beyond "get" pointer */ if (((put_off + 1) % queue_len) == get_off) { @@ -2785,13 +2629,13 @@ static int s2io_xmit(struct sk_buff *skb * @irq: the irq of the device. * @dev_id: a void pointer to the dev structure of the NIC. * @pt_regs: pointer to the registers pushed on the stack. - * Description: This function is the ISR handler of the device. It - * identifies the reason for the interrupt and calls the relevant - * service routines. 
As a contongency measure, this ISR allocates the + * Description: This function is the ISR handler of the device. It + * identifies the reason for the interrupt and calls the relevant + * service routines. As a contongency measure, this ISR allocates the * recv buffers, if their numbers are below the panic value which is * presently set to 25% of the original number of rcv buffers allocated. * Return value: - * IRQ_HANDLED: will be returned if IRQ was handled by this routine + * IRQ_HANDLED: will be returned if IRQ was handled by this routine * IRQ_NONE: will be returned if interrupt is not from our device */ static irqreturn_t s2io_isr(int irq, void *dev_id, struct pt_regs *regs) @@ -2799,9 +2643,7 @@ static irqreturn_t s2io_isr(int irq, voi struct net_device *dev = (struct net_device *) dev_id; nic_t *sp = dev->priv; XENA_dev_config_t __iomem *bar0 = sp->bar0; -#ifndef CONFIG_S2IO_NAPI - int i, ret; -#endif + int i; u64 reason = 0; mac_info_t *mac_control; struct config_param *config; @@ -2809,13 +2651,13 @@ static irqreturn_t s2io_isr(int irq, voi mac_control = &sp->mac_control; config = &sp->config; - /* + /* * Identify the cause for interrupt and call the appropriate * interrupt handler. Causes for the interrupt could be; * 1. Rx of packet. * 2. Tx complete. * 3. Link down. - * 4. Error in any functional blocks of the NIC. + * 4. Error in any functional blocks of the NIC. 
*/ reason = readq(&bar0->general_int_status); @@ -2824,12 +2666,6 @@ static irqreturn_t s2io_isr(int irq, voi return IRQ_NONE; } - /* If Intr is because of Tx Traffic */ - if (reason & GEN_INTR_TXTRAFFIC) { - tx_intr_handler(sp); - } - - /* If Intr is because of an error */ if (reason & (GEN_ERROR_INTR)) alarm_intr_handler(sp); @@ -2844,17 +2680,26 @@ static irqreturn_t s2io_isr(int irq, voi #else /* If Intr is because of Rx Traffic */ if (reason & GEN_INTR_RXTRAFFIC) { - rx_intr_handler(sp); + for (i = 0; i < config->rx_ring_num; i++) { + rx_intr_handler(&mac_control->rings[i]); + } } #endif - /* - * If the Rx buffer count is below the panic threshold then - * reallocate the buffers from the interrupt handler itself, + /* If Intr is because of Tx Traffic */ + if (reason & GEN_INTR_TXTRAFFIC) { + for (i = 0; i < config->tx_fifo_num; i++) + tx_intr_handler(&mac_control->fifos[i]); + } + + /* + * If the Rx buffer count is below the panic threshold then + * reallocate the buffers from the interrupt handler itself, * else schedule a tasklet to reallocate the buffers. */ #ifndef CONFIG_S2IO_NAPI for (i = 0; i < config->rx_ring_num; i++) { + int ret; int rxb_size = atomic_read(&sp->rx_bufs_left[i]); int level = rx_buffer_level(sp, rxb_size, i); @@ -2879,29 +2724,33 @@ static irqreturn_t s2io_isr(int irq, voi } /** - * s2io_get_stats - Updates the device statistics structure. + * s2io_get_stats - Updates the device statistics structure. * @dev : pointer to the device structure. * Description: - * This function updates the device statistics structure in the s2io_nic + * This function updates the device statistics structure in the s2io_nic * structure and returns a pointer to the same. * Return value: * pointer to the updated net_device_stats structure. 
*/ -static struct net_device_stats *s2io_get_stats(struct net_device *dev) +struct net_device_stats *s2io_get_stats(struct net_device *dev) { nic_t *sp = dev->priv; mac_info_t *mac_control; struct config_param *config; + mac_control = &sp->mac_control; config = &sp->config; - sp->stats.tx_errors = mac_control->stats_info->tmac_any_err_frms; - sp->stats.rx_errors = mac_control->stats_info->rmac_drop_frms; - sp->stats.multicast = mac_control->stats_info->rmac_vld_mcst_frms; + sp->stats.tx_errors = + le32_to_cpu(mac_control->stats_info->tmac_any_err_frms); + sp->stats.rx_errors = + le32_to_cpu(mac_control->stats_info->rmac_drop_frms); + sp->stats.multicast = + le32_to_cpu(mac_control->stats_info->rmac_vld_mcst_frms); sp->stats.rx_length_errors = - mac_control->stats_info->rmac_long_frms; + le32_to_cpu(mac_control->stats_info->rmac_long_frms); return (&sp->stats); } @@ -2910,8 +2759,8 @@ static struct net_device_stats *s2io_get * s2io_set_multicast - entry point for multicast address enable/disable. * @dev : pointer to the device structure * Description: - * This function is a driver entry point which gets called by the kernel - * whenever multicast addresses must be enabled/disabled. This also gets + * This function is a driver entry point which gets called by the kernel + * whenever multicast addresses must be enabled/disabled. This also gets * called to set/reset promiscuous mode. Depending on the deivce flag, we * determine, if multicast address must be enabled or if promiscuous mode * is to be disabled etc. 
@@ -3011,7 +2860,7 @@ static void s2io_set_multicast(struct ne writeq(RMAC_ADDR_DATA0_MEM_ADDR(dis_addr), &bar0->rmac_addr_data0_mem); writeq(RMAC_ADDR_DATA1_MEM_MASK(0ULL), - &bar0->rmac_addr_data1_mem); + &bar0->rmac_addr_data1_mem); val64 = RMAC_ADDR_CMD_MEM_WE | RMAC_ADDR_CMD_MEM_STROBE_NEW_CMD | RMAC_ADDR_CMD_MEM_OFFSET @@ -3040,8 +2889,7 @@ static void s2io_set_multicast(struct ne writeq(RMAC_ADDR_DATA0_MEM_ADDR(mac_addr), &bar0->rmac_addr_data0_mem); writeq(RMAC_ADDR_DATA1_MEM_MASK(0ULL), - &bar0->rmac_addr_data1_mem); - + &bar0->rmac_addr_data1_mem); val64 = RMAC_ADDR_CMD_MEM_WE | RMAC_ADDR_CMD_MEM_STROBE_NEW_CMD | RMAC_ADDR_CMD_MEM_OFFSET @@ -3060,12 +2908,12 @@ static void s2io_set_multicast(struct ne } /** - * s2io_set_mac_addr - Programs the Xframe mac address + * s2io_set_mac_addr - Programs the Xframe mac address * @dev : pointer to the device structure. * @addr: a uchar pointer to the new mac address which is to be set. - * Description : This procedure will program the Xframe to receive + * Description : This procedure will program the Xframe to receive * frames with new Mac Address - * Return value: SUCCESS on success and an appropriate (-)ve integer + * Return value: SUCCESS on success and an appropriate (-)ve integer * as defined in errno.h file on failure. */ @@ -3076,10 +2924,10 @@ int s2io_set_mac_addr(struct net_device register u64 val64, mac_addr = 0; int i; - /* + /* * Set the new MAC address as the new unicast filter and reflect this * change on the device address registered with the OS. It will be - * at offset 0. + * at offset 0. */ for (i = 0; i < ETH_ALEN; i++) { mac_addr <<= 8; @@ -3103,12 +2951,12 @@ int s2io_set_mac_addr(struct net_device } /** - * s2io_ethtool_sset - Sets different link parameters. + * s2io_ethtool_sset - Sets different link parameters. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. 
* @info: pointer to the structure with parameters given by ethtool to set * link information. * Description: - * The function sets different link parameters provided by the user onto + * The function sets different link parameters provided by the user onto * the NIC. * Return value: * 0 on success. @@ -3130,7 +2978,7 @@ static int s2io_ethtool_sset(struct net_ } /** - * s2io_ethtol_gset - Return link specific information. + * s2io_ethtol_gset - Return link specific information. * @sp : private member of the device structure, pointer to the * s2io_nic structure. * @info : pointer to the structure with parameters given by ethtool @@ -3162,8 +3010,8 @@ static int s2io_ethtool_gset(struct net_ } /** - * s2io_ethtool_gdrvinfo - Returns driver specific information. - * @sp : private member of the device structure, which is a pointer to the + * s2io_ethtool_gdrvinfo - Returns driver specific information. + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @info : pointer to the structure with parameters given by ethtool to * return driver information. @@ -3191,9 +3039,9 @@ static void s2io_ethtool_gdrvinfo(struct /** * s2io_ethtool_gregs - dumps the entire space of Xfame into the buffer. - * @sp: private member of the device structure, which is a pointer to the + * @sp: private member of the device structure, which is a pointer to the * s2io_nic structure. - * @regs : pointer to the structure with parameters given by ethtool for + * @regs : pointer to the structure with parameters given by ethtool for * dumping the registers. * @reg_space: The input argumnet into which all the registers are dumped. * Description: @@ -3222,11 +3070,11 @@ static void s2io_ethtool_gregs(struct ne /** * s2io_phy_id - timer function that alternates adapter LED. 
- * @data : address of the private member of the device structure, which + * @data : address of the private member of the device structure, which * is a pointer to the s2io_nic structure, provided as an u32. - * Description: This is actually the timer function that alternates the - * adapter LED bit of the adapter control bit to set/reset every time on - * invocation. The timer is set for 1/2 a second, hence tha NIC blinks + * Description: This is actually the timer function that alternates the + * adapter LED bit of the adapter control bit to set/reset every time on + * invocation. The timer is set for 1/2 a second, hence tha NIC blinks * once every second. */ static void s2io_phy_id(unsigned long data) @@ -3254,12 +3102,12 @@ static void s2io_phy_id(unsigned long da * s2io_ethtool_idnic - To physically identify the nic on the system. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @id : pointer to the structure with identification parameters given by + * @id : pointer to the structure with identification parameters given by * ethtool. * Description: Used to physically identify the NIC on the system. - * The Link LED will blink for a time specified by the user for + * The Link LED will blink for a time specified by the user for * identification. - * NOTE: The Link has to be Up to be able to blink the LED. Hence + * NOTE: The Link has to be Up to be able to blink the LED. Hence * identification is possible only if it's link is up. 
* Return value: * int , returns 0 on success @@ -3289,9 +3137,9 @@ static int s2io_ethtool_idnic(struct net } mod_timer(&sp->id_timer, jiffies); if (data) - msleep(data * 1000); + msleep_interruptible(data * HZ); else - msleep(0xFFFFFFFF); + msleep_interruptible(MAX_FLICKER_TIME); del_timer_sync(&sp->id_timer); if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { @@ -3304,7 +3152,8 @@ static int s2io_ethtool_idnic(struct net /** * s2io_ethtool_getpause_data -Pause frame frame generation and reception. - * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. + * @sp : private member of the device structure, which is a pointer to the + * s2io_nic structure. * @ep : pointer to the structure with pause parameters given by ethtool. * Description: * Returns the Pause frame generation and reception capability of the NIC. @@ -3328,7 +3177,7 @@ static void s2io_ethtool_getpause_data(s /** * s2io_ethtool_setpause_data - set/reset pause frame generation. - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @ep : pointer to the structure with pause parameters given by ethtool. * Description: @@ -3339,7 +3188,7 @@ static void s2io_ethtool_getpause_data(s */ static int s2io_ethtool_setpause_data(struct net_device *dev, - struct ethtool_pauseparam *ep) + struct ethtool_pauseparam *ep) { u64 val64; nic_t *sp = dev->priv; @@ -3360,13 +3209,13 @@ static int s2io_ethtool_setpause_data(st /** * read_eeprom - reads 4 bytes of data from user given offset. - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @off : offset at which the data must be written * @data : Its an output parameter where the data read at the given - * offset is stored. + * offset is stored. 
* Description: - * Will read 4 bytes of data from the user given offset and return the + * Will read 4 bytes of data from the user given offset and return the * read data. * NOTE: Will allow to read only part of the EEPROM visible through the * I2C bus. @@ -3407,7 +3256,7 @@ static int read_eeprom(nic_t * sp, int o * s2io_nic structure. * @off : offset at which the data must be written * @data : The data that is to be written - * @cnt : Number of bytes of the data that are actually to be written into + * @cnt : Number of bytes of the data that are actually to be written into * the Eeprom. (max of 3) * Description: * Actually writes the relevant part of the data value into the Eeprom @@ -3444,7 +3293,7 @@ static int write_eeprom(nic_t * sp, int /** * s2io_ethtool_geeprom - reads the value stored in the Eeprom. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @eeprom : pointer to the user level structure provided by ethtool, + * @eeprom : pointer to the user level structure provided by ethtool, * containing all relevant information. * @data_buf : user defined value to be written into Eeprom. * Description: Reads the values stored in the Eeprom at given offset @@ -3455,7 +3304,7 @@ static int write_eeprom(nic_t * sp, int */ static int s2io_ethtool_geeprom(struct net_device *dev, - struct ethtool_eeprom *eeprom, u8 * data_buf) + struct ethtool_eeprom *eeprom, u8 * data_buf) { u32 data, i, valid; nic_t *sp = dev->priv; @@ -3480,7 +3329,7 @@ static int s2io_ethtool_geeprom(struct n * s2io_ethtool_seeprom - tries to write the user provided value in Eeprom * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @eeprom : pointer to the user level structure provided by ethtool, + * @eeprom : pointer to the user level structure provided by ethtool, * containing all relevant information. * @data_buf ; user defined value to be written into Eeprom. 
* Description: @@ -3528,8 +3377,8 @@ static int s2io_ethtool_seeprom(struct n } /** - * s2io_register_test - reads and writes into all clock domains. - * @sp : private member of the device structure, which is a pointer to the + * s2io_register_test - reads and writes into all clock domains. + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @data : variable that returns the result of each of the test conducted b * by the driver. @@ -3546,8 +3395,8 @@ static int s2io_register_test(nic_t * sp u64 val64 = 0; int fail = 0; - val64 = readq(&bar0->pcc_enable); - if (val64 != 0xff00000000000000ULL) { + val64 = readq(&bar0->pif_rd_swapper_fb); + if (val64 != 0x123456789abcdefULL) { fail = 1; DBG_PRINT(INFO_DBG, "Read Test level 1 fails\n"); } @@ -3591,13 +3440,13 @@ static int s2io_register_test(nic_t * sp } /** - * s2io_eeprom_test - to verify that EEprom in the xena can be programmed. + * s2io_eeprom_test - to verify that EEprom in the xena can be programmed. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @data:variable that returns the result of each of the test conducted by * the driver. * Description: - * Verify that EEPROM in the xena can be programmed using I2C_CONTROL + * Verify that EEPROM in the xena can be programmed using I2C_CONTROL * register. * Return value: * 0 on success. @@ -3662,14 +3511,14 @@ static int s2io_eeprom_test(nic_t * sp, /** * s2io_bist_test - invokes the MemBist test of the card . - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @data:variable that returns the result of each of the test conducted by + * @data:variable that returns the result of each of the test conducted by * the driver. * Description: * This invokes the MemBist test of the card. We give around * 2 secs time for the Test to complete. 
If it's still not complete - * within this peiod, we consider that the test failed. + * within this peiod, we consider that the test failed. * Return value: * 0 on success and -1 on failure. */ @@ -3698,13 +3547,13 @@ static int s2io_bist_test(nic_t * sp, ui } /** - * s2io-link_test - verifies the link state of the nic - * @sp ; private member of the device structure, which is a pointer to the + * s2io-link_test - verifies the link state of the nic + * @sp ; private member of the device structure, which is a pointer to the * s2io_nic structure. * @data: variable that returns the result of each of the test conducted by * the driver. * Description: - * The function verifies the link state of the NIC and updates the input + * The function verifies the link state of the NIC and updates the input * argument 'data' appropriately. * Return value: * 0 on success. @@ -3723,13 +3572,13 @@ static int s2io_link_test(nic_t * sp, ui } /** - * s2io_rldram_test - offline test for access to the RldRam chip on the NIC - * @sp - private member of the device structure, which is a pointer to the + * s2io_rldram_test - offline test for access to the RldRam chip on the NIC + * @sp - private member of the device structure, which is a pointer to the * s2io_nic structure. - * @data - variable that returns the result of each of the test + * @data - variable that returns the result of each of the test * conducted by the driver. * Description: - * This is one of the offline test that tests the read and write + * This is one of the offline test that tests the read and write * access to the RldRam chip on the NIC. * Return value: * 0 on success. @@ -3834,7 +3683,7 @@ static int s2io_rldram_test(nic_t * sp, * s2io_nic structure. * @ethtest : pointer to a ethtool command specific structure that will be * returned to the user. - * @data : variable that returns the result of each of the test + * @data : variable that returns the result of each of the test * conducted by the driver. 
* Description: * This function conducts 6 tests ( 4 offline and 2 online) to determine @@ -3852,23 +3701,18 @@ static void s2io_ethtool_test(struct net if (ethtest->flags == ETH_TEST_FL_OFFLINE) { /* Offline Tests. */ - if (orig_state) { + if (orig_state) s2io_close(sp->dev); - s2io_set_swapper(sp); - } else - s2io_set_swapper(sp); if (s2io_register_test(sp, &data[0])) ethtest->flags |= ETH_TEST_FL_FAILED; s2io_reset(sp); - s2io_set_swapper(sp); if (s2io_rldram_test(sp, &data[3])) ethtest->flags |= ETH_TEST_FL_FAILED; s2io_reset(sp); - s2io_set_swapper(sp); if (s2io_eeprom_test(sp, &data[1])) ethtest->flags |= ETH_TEST_FL_FAILED; @@ -3952,20 +3796,19 @@ static void s2io_get_ethtool_stats(struc tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_tcp); } -static int s2io_ethtool_get_regs_len(struct net_device *dev) +int s2io_ethtool_get_regs_len(struct net_device *dev) { return (XENA_REG_SPACE); } -static u32 s2io_ethtool_get_rx_csum(struct net_device * dev) +u32 s2io_ethtool_get_rx_csum(struct net_device * dev) { nic_t *sp = dev->priv; return (sp->rx_csum); } - -static int s2io_ethtool_set_rx_csum(struct net_device *dev, u32 data) +int s2io_ethtool_set_rx_csum(struct net_device *dev, u32 data) { nic_t *sp = dev->priv; @@ -3976,19 +3819,17 @@ static int s2io_ethtool_set_rx_csum(stru return 0; } - -static int s2io_get_eeprom_len(struct net_device *dev) +int s2io_get_eeprom_len(struct net_device *dev) { return (XENA_EEPROM_SPACE); } -static int s2io_ethtool_self_test_count(struct net_device *dev) +int s2io_ethtool_self_test_count(struct net_device *dev) { return (S2IO_TEST_LEN); } - -static void s2io_ethtool_get_strings(struct net_device *dev, - u32 stringset, u8 * data) +void s2io_ethtool_get_strings(struct net_device *dev, + u32 stringset, u8 * data) { switch (stringset) { case ETH_SS_TEST: @@ -3999,13 +3840,12 @@ static void s2io_ethtool_get_strings(str sizeof(ethtool_stats_keys)); } } - static int s2io_ethtool_get_stats_count(struct net_device *dev) { return 
(S2IO_STAT_LEN); } -static int s2io_ethtool_op_set_tx_csum(struct net_device *dev, u32 data) +int s2io_ethtool_op_set_tx_csum(struct net_device *dev, u32 data) { if (data) dev->features |= NETIF_F_IP_CSUM; @@ -4047,21 +3887,18 @@ static struct ethtool_ops netdev_ethtool }; /** - * s2io_ioctl - Entry point for the Ioctl + * s2io_ioctl - Entry point for the Ioctl * @dev : Device pointer. * @ifr : An IOCTL specefic structure, that can contain a pointer to * a proprietary structure used to pass information to the driver. * @cmd : This is used to distinguish between the different commands that * can be passed to the IOCTL functions. * Description: - * This function has support for ethtool, adding multiple MAC addresses on - * the NIC and some DBG commands for the util tool. - * Return value: - * Currently the IOCTL supports no operations, hence by default this - * function returns OP NOT SUPPORTED value. + * Currently there are no special functionality supported in IOCTL, hence + * function always return EOPNOTSUPPORTED */ -static int s2io_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) +int s2io_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) { return -EOPNOTSUPP; } @@ -4077,7 +3914,7 @@ static int s2io_ioctl(struct net_device * file on failure. */ -static int s2io_change_mtu(struct net_device *dev, int new_mtu) +int s2io_change_mtu(struct net_device *dev, int new_mtu) { nic_t *sp = dev->priv; XENA_dev_config_t __iomem *bar0 = sp->bar0; @@ -4085,7 +3922,7 @@ static int s2io_change_mtu(struct net_de if (netif_running(dev)) { DBG_PRINT(ERR_DBG, "%s: Must be stopped to ", dev->name); - DBG_PRINT(ERR_DBG, "change its MTU \n"); + DBG_PRINT(ERR_DBG, "change its MTU\n"); return -EBUSY; } @@ -4109,9 +3946,9 @@ static int s2io_change_mtu(struct net_de * @dev_adr : address of the device structure in dma_addr_t format. * Description: * This is the tasklet or the bottom half of the ISR. 
This is - * an extension of the ISR which is scheduled by the scheduler to be run + * an extension of the ISR which is scheduled by the scheduler to be run * when the load on the CPU is low. All low priority tasks of the ISR can - * be pushed into the tasklet. For now the tasklet is used only to + * be pushed into the tasklet. For now the tasklet is used only to * replenish the Rx buffers in the Rx buffer descriptors. * Return value: * void. @@ -4167,14 +4004,14 @@ static void s2io_set_link(unsigned long } subid = nic->pdev->subsystem_device; - /* - * Allow a small delay for the NICs self initiated + /* + * Allow a small delay for the NICs self initiated * cleanup to complete. */ msleep(100); val64 = readq(&bar0->adapter_status); - if (verify_xena_quiescence(val64, nic->device_enabled_once)) { + if (verify_xena_quiescence(nic, val64, nic->device_enabled_once)) { if (LINK_IS_UP(val64)) { val64 = readq(&bar0->adapter_control); val64 |= ADAPTER_CNTL_EN; @@ -4225,8 +4062,9 @@ static void s2io_card_down(nic_t * sp) register u64 val64 = 0; /* If s2io_set_link task is executing, wait till it completes. */ - while (test_and_set_bit(0, &(sp->link_state))) + while (test_and_set_bit(0, &(sp->link_state))) { msleep(50); + } atomic_set(&sp->card_state, CARD_DOWN); /* disable Tx and Rx traffic on the NIC */ @@ -4238,7 +4076,7 @@ static void s2io_card_down(nic_t * sp) /* Check if the device is Quiescent and then Reset the NIC */ do { val64 = readq(&bar0->adapter_status); - if (verify_xena_quiescence(val64, sp->device_enabled_once)) { + if (verify_xena_quiescence(sp, val64, sp->device_enabled_once)) { break; } @@ -4277,8 +4115,8 @@ static int s2io_card_up(nic_t * sp) return -ENODEV; } - /* - * Initializing the Rx buffers. For now we are considering only 1 + /* + * Initializing the Rx buffers. 
For now we are considering only 1 * Rx ring and initializing buffers into 30 Rx blocks */ mac_control = &sp->mac_control; @@ -4316,12 +4154,12 @@ static int s2io_card_up(nic_t * sp) return 0; } -/** +/** * s2io_restart_nic - Resets the NIC. * @data : long pointer to the device private structure * Description: * This function is scheduled to be run by the s2io_tx_watchdog - * function after 0.5 secs to reset the NIC. The idea is to reduce + * function after 0.5 secs to reset the NIC. The idea is to reduce * the run time of the watch dog routine which is run holding a * spin lock. */ @@ -4339,10 +4177,11 @@ static void s2io_restart_nic(unsigned lo netif_wake_queue(dev); DBG_PRINT(ERR_DBG, "%s: was reset by Tx watchdog timer\n", dev->name); + } -/** - * s2io_tx_watchdog - Watchdog for transmit side. +/** + * s2io_tx_watchdog - Watchdog for transmit side. * @dev : Pointer to net device structure * Description: * This function is triggered if the Tx Queue is stopped @@ -4370,7 +4209,7 @@ static void s2io_tx_watchdog(struct net_ * @len : length of the packet * @cksum : FCS checksum of the frame. * @ring_no : the ring from which this RxD was extracted. - * Description: + * Description: * This function is called by the Tx interrupt serivce routine to perform * some OS related operations on the SKB before passing it to the upper * layers. It mainly checks if the checksum is OK, if so adds it to the @@ -4380,35 +4219,63 @@ static void s2io_tx_watchdog(struct net_ * Return value: * SUCCESS on success and -1 on failure. 
*/ -#ifndef CONFIG_2BUFF_MODE -static int rx_osm_handler(nic_t * sp, u16 len, RxD_t * rxdp, int ring_no) -#else -static int rx_osm_handler(nic_t * sp, RxD_t * rxdp, int ring_no, - buffAdd_t * ba) -#endif +static int rx_osm_handler(ring_info_t *ring_data, RxD_t * rxdp) { + nic_t *sp = ring_data->nic; struct net_device *dev = (struct net_device *) sp->dev; - struct sk_buff *skb = - (struct sk_buff *) ((unsigned long) rxdp->Host_Control); + struct sk_buff *skb = (struct sk_buff *) + ((unsigned long) rxdp->Host_Control); + int ring_no = ring_data->ring_no; u16 l3_csum, l4_csum; #ifdef CONFIG_2BUFF_MODE - int buf0_len, buf2_len; + int buf0_len = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); + int buf2_len = RXD_GET_BUFFER2_SIZE(rxdp->Control_2); + int get_block = ring_data->rx_curr_get_info.block_index; + int get_off = ring_data->rx_curr_get_info.offset; + buffAdd_t *ba = &ring_data->ba[get_block][get_off]; unsigned char *buff; +#else + u16 len = (u16) ((RXD_GET_BUFFER0_SIZE(rxdp->Control_2)) >> 48);; #endif + skb->dev = dev; + if (rxdp->Control_1 & RXD_T_CODE) { + unsigned long long err = rxdp->Control_1 & RXD_T_CODE; + DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%llx\n", + dev->name, err); + } - l3_csum = RXD_GET_L3_CKSUM(rxdp->Control_1); - if ((rxdp->Control_1 & TCP_OR_UDP_FRAME) && (sp->rx_csum)) { + /* Updating statistics */ + rxdp->Host_Control = 0; + sp->rx_pkt_count++; + sp->stats.rx_packets++; +#ifndef CONFIG_2BUFF_MODE + sp->stats.rx_bytes += len; +#else + sp->stats.rx_bytes += buf0_len + buf2_len; +#endif + +#ifndef CONFIG_2BUFF_MODE + skb_put(skb, len); +#else + buff = skb_push(skb, buf0_len); + memcpy(buff, ba->ba_0, buf0_len); + skb_put(skb, buf2_len); +#endif + + if ((rxdp->Control_1 & TCP_OR_UDP_FRAME) && + (sp->rx_csum)) { + l3_csum = RXD_GET_L3_CKSUM(rxdp->Control_1); l4_csum = RXD_GET_L4_CKSUM(rxdp->Control_1); if ((l3_csum == L3_CKSUM_OK) && (l4_csum == L4_CKSUM_OK)) { - /* + /* * NIC verifies if the Checksum of the received * frame is Ok or not and 
accordingly returns * a flag in the RxD. */ skb->ip_summed = CHECKSUM_UNNECESSARY; } else { - /* - * Packet with erroneous checksum, let the + /* + * Packet with erroneous checksum, let the * upper layers deal with it. */ skb->ip_summed = CHECKSUM_NONE; @@ -4417,44 +4284,14 @@ static int rx_osm_handler(nic_t * sp, Rx skb->ip_summed = CHECKSUM_NONE; } - if (rxdp->Control_1 & RXD_T_CODE) { - unsigned long long err = rxdp->Control_1 & RXD_T_CODE; - DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%llx\n", - dev->name, err); - } -#ifdef CONFIG_2BUFF_MODE - buf0_len = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); - buf2_len = RXD_GET_BUFFER2_SIZE(rxdp->Control_2); -#endif - - skb->dev = dev; -#ifndef CONFIG_2BUFF_MODE - skb_put(skb, len); - skb->protocol = eth_type_trans(skb, dev); -#else - buff = skb_push(skb, buf0_len); - memcpy(buff, ba->ba_0, buf0_len); - skb_put(skb, buf2_len); skb->protocol = eth_type_trans(skb, dev); -#endif - #ifdef CONFIG_S2IO_NAPI netif_receive_skb(skb); #else netif_rx(skb); #endif - dev->last_rx = jiffies; - sp->rx_pkt_count++; - sp->stats.rx_packets++; -#ifndef CONFIG_2BUFF_MODE - sp->stats.rx_bytes += len; -#else - sp->stats.rx_bytes += buf0_len + buf2_len; -#endif - atomic_dec(&sp->rx_bufs_left[ring_no]); - rxdp->Host_Control = 0; return SUCCESS; } @@ -4465,13 +4302,13 @@ static int rx_osm_handler(nic_t * sp, Rx * @link : inidicates whether link is UP/DOWN. * Description: * This function stops/starts the Tx queue depending on whether the link - * status of the NIC is is down or up. This is called by the Alarm - * interrupt handler whenever a link change interrupt comes up. + * status of the NIC is is down or up. This is called by the Alarm + * interrupt handler whenever a link change interrupt comes up. * Return value: * void. 
*/ -static void s2io_link(nic_t * sp, int link) +void s2io_link(nic_t * sp, int link) { struct net_device *dev = (struct net_device *) sp->dev; @@ -4488,8 +4325,25 @@ static void s2io_link(nic_t * sp, int li } /** - * s2io_init_pci -Initialization of PCI and PCI-X configuration registers . - * @sp : private member of the device structure, which is a pointer to the + * get_xena_rev_id - to identify revision ID of xena. + * @pdev : PCI Dev structure + * Description: + * Function to identify the Revision ID of xena. + * Return value: + * returns the revision ID of the device. + */ + +int get_xena_rev_id(struct pci_dev *pdev) +{ + u8 id = 0; + int ret; + ret = pci_read_config_byte(pdev, PCI_REVISION_ID, (u8 *) & id); + return id; +} + +/** + * s2io_init_pci -Initialization of PCI and PCI-X configuration registers . + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * Description: * This function initializes a few of the PCI and PCI-X configuration registers @@ -4500,15 +4354,15 @@ static void s2io_link(nic_t * sp, int li static void s2io_init_pci(nic_t * sp) { - u16 pci_cmd = 0; + u16 pci_cmd = 0, pcix_cmd = 0; /* Enable Data Parity Error Recovery in PCI-X command register. */ pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - (sp->pcix_cmd | 1)); + (pcix_cmd | 1)); pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); /* Set the PErr Response bit in PCI command register. */ pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); @@ -4517,34 +4371,36 @@ static void s2io_init_pci(nic_t * sp) pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); /* Set MMRB count to 1024 in PCI-X Command register. 
*/ - sp->pcix_cmd &= 0xFFF3; - pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, (sp->pcix_cmd | (0x1 << 2))); /* MMRBC 1K */ + pcix_cmd &= 0xFFF3; + pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, + (pcix_cmd | (0x1 << 2))); /* MMRBC 1K */ pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); /* Setting Maximum outstanding splits based on system type. */ - sp->pcix_cmd &= 0xFF8F; - - sp->pcix_cmd |= XENA_MAX_OUTSTANDING_SPLITS(0x1); /* 2 splits. */ + pcix_cmd &= 0xFF8F; + pcix_cmd |= XENA_MAX_OUTSTANDING_SPLITS(0x1); /* 2 splits. */ pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - sp->pcix_cmd); + pcix_cmd); pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); + /* Forcibly disabling relaxed ordering capability of the card. */ - sp->pcix_cmd &= 0xfffd; + pcix_cmd &= 0xfffd; pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - sp->pcix_cmd); + pcix_cmd); pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); } MODULE_AUTHOR("Raghavendra Koushik "); MODULE_LICENSE("GPL"); module_param(tx_fifo_num, int, 0); -module_param_array(tx_fifo_len, int, NULL, 0); module_param(rx_ring_num, int, 0); -module_param_array(rx_ring_sz, int, NULL, 0); +module_param_array(tx_fifo_len, uint, NULL, 0); +module_param_array(rx_ring_sz, uint, NULL, 0); module_param(Stats_refresh_time, int, 0); +module_param_array(rts_frm_len, uint, NULL, 0); module_param(rmac_pause_time, int, 0); module_param(mc_pause_threshold_q0q3, int, 0); module_param(mc_pause_threshold_q4q7, int, 0); @@ -4554,15 +4410,16 @@ module_param(rmac_util_period, int, 0); #ifndef CONFIG_S2IO_NAPI module_param(indicate_max_pkts, int, 0); #endif + /** - * s2io_init_nic - Initialization of the adapter . + * s2io_init_nic - Initialization of the adapter . * @pdev : structure containing the PCI related information of the device. * @pre: List of PCI devices supported by the driver listed in s2io_tbl. 
* Description: * The function initializes an adapter identified by the pci_dec structure. - * All OS related initialization including memory and device structure and - * initlaization of the device private variable is done. Also the swapper - * control register is initialized to enable read and write into the I/O + * All OS related initialization including memory and device structure and + * initlaization of the device private variable is done. Also the swapper + * control register is initialized to enable read and write into the I/O * registers of the device. * Return value: * returns 0 on success and negative on failure. @@ -4573,7 +4430,6 @@ s2io_init_nic(struct pci_dev *pdev, cons { nic_t *sp; struct net_device *dev; - char *dev_name = "S2IO 10GE NIC"; int i, j, ret; int dma_flag = FALSE; u32 mac_up, mac_down; @@ -4583,9 +4439,9 @@ s2io_init_nic(struct pci_dev *pdev, cons mac_info_t *mac_control; struct config_param *config; - - DBG_PRINT(ERR_DBG, "Loading S2IO driver with %s\n", - s2io_driver_version); +#ifdef CONFIG_S2IO_NAPI + DBG_PRINT(ERR_DBG, "NAPI support has been enabled\n"); +#endif if ((ret = pci_enable_device(pdev))) { DBG_PRINT(ERR_DBG, @@ -4596,7 +4452,6 @@ s2io_init_nic(struct pci_dev *pdev, cons if (!pci_set_dma_mask(pdev, 0xffffffffffffffffULL)) { DBG_PRINT(INIT_DBG, "s2io_init_nic: Using 64bit DMA\n"); dma_flag = TRUE; - if (pci_set_consistent_dma_mask (pdev, 0xffffffffffffffffULL)) { DBG_PRINT(ERR_DBG, @@ -4636,21 +4491,17 @@ s2io_init_nic(struct pci_dev *pdev, cons memset(sp, 0, sizeof(nic_t)); sp->dev = dev; sp->pdev = pdev; - sp->vendor_id = pdev->vendor; - sp->device_id = pdev->device; sp->high_dma_flag = dma_flag; - sp->irq = pdev->irq; sp->device_enabled_once = FALSE; - strcpy(sp->name, dev_name); /* Initialize some PCI/PCI-X fields of the NIC. */ s2io_init_pci(sp); - /* + /* * Setting the device configuration parameters. 
- * Most of these parameters can be specified by the user during - * module insertion as they are module loadable parameters. If - * these parameters are not not specified during load time, they + * Most of these parameters can be specified by the user during + * module insertion as they are module loadable parameters. If + * these parameters are not not specified during load time, they * are initialized with default values. */ mac_control = &sp->mac_control; @@ -4664,6 +4515,10 @@ s2io_init_nic(struct pci_dev *pdev, cons config->tx_cfg[i].fifo_priority = i; } + /* mapping the QoS priority to the configured fifos */ + for (i = 0; i < MAX_TX_FIFOS; i++) + config->fifo_mapping[i] = fifo_map[config->tx_fifo_num][i]; + config->tx_intr_type = TXD_INT_TYPE_UTILZ; for (i = 0; i < config->tx_fifo_num; i++) { config->tx_cfg[i].f_no_snoop = @@ -4744,13 +4599,14 @@ s2io_init_nic(struct pci_dev *pdev, cons dev->do_ioctl = &s2io_ioctl; dev->change_mtu = &s2io_change_mtu; SET_ETHTOOL_OPS(dev, &netdev_ethtool_ops); + /* * will use eth_mac_addr() for dev->set_mac_address * mac address will be set every time dev->open() is called */ -#ifdef CONFIG_S2IO_NAPI +#if defined(CONFIG_S2IO_NAPI) dev->poll = s2io_poll; - dev->weight = 90; + dev->weight = 32; #endif dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM; @@ -4777,22 +4633,14 @@ s2io_init_nic(struct pci_dev *pdev, cons goto set_swap_failed; } - /* Fix for all "FFs" MAC address problems observed on Alpha platforms */ + /* + * Fix for all "FFs" MAC address problems observed on + * Alpha platforms + */ fix_mac_address(sp); s2io_reset(sp); /* - * Setting swapper control on the NIC, so the MAC address can be read. - */ - if (s2io_set_swapper(sp)) { - DBG_PRINT(ERR_DBG, - "%s: S2IO: swapper settings are wrong\n", - dev->name); - ret = -EAGAIN; - goto set_swap_failed; - } - - /* * MAC address initialization. * For now only one mac address will be read and used. 
*/ @@ -4829,23 +4677,22 @@ s2io_init_nic(struct pci_dev *pdev, cons memcpy(dev->dev_addr, sp->def_mac_addr, ETH_ALEN); /* - * Initialize the tasklet status and link state flags + * Initialize the tasklet status and link state flags * and the card statte parameter */ atomic_set(&(sp->card_state), 0); sp->tasklet_status = 0; sp->link_state = 0; - /* Initialize spinlocks */ spin_lock_init(&sp->tx_lock); #ifndef CONFIG_S2IO_NAPI spin_lock_init(&sp->put_lock); #endif - /* - * SXE-002: Configure link and activity LED to init state - * on driver load. + /* + * SXE-002: Configure link and activity LED to init state + * on driver load. */ subid = sp->pdev->subsystem_device; if ((subid & 0xFF) >= 0x07) { @@ -4865,9 +4712,9 @@ s2io_init_nic(struct pci_dev *pdev, cons goto register_failed; } - /* - * Make Link state as off at this point, when the Link change - * interrupt comes the state will be automatically changed to + /* + * Make Link state as off at this point, when the Link change + * interrupt comes the state will be automatically changed to * the right state. */ netif_carrier_off(dev); @@ -4892,11 +4739,11 @@ s2io_init_nic(struct pci_dev *pdev, cons } /** - * s2io_rem_nic - Free the PCI device + * s2io_rem_nic - Free the PCI device * @pdev: structure containing the PCI related information of the device. - * Description: This function is called by the Pci subsystem to release a + * Description: This function is called by the Pci subsystem to release a * PCI device and free up all resource held up by the device. This could - * be in response to a Hot plug event or when the driver is to be removed + * be in response to a Hot plug event or when the driver is to be removed * from memory. 
*/ @@ -4920,7 +4767,6 @@ static void __devexit s2io_rem_nic(struc pci_disable_device(pdev); pci_release_regions(pdev); pci_set_drvdata(pdev, NULL); - free_netdev(dev); } @@ -4936,11 +4782,11 @@ int __init s2io_starter(void) } /** - * s2io_closer - Cleanup routine for the driver + * s2io_closer - Cleanup routine for the driver * Description: This function is the cleanup routine for the driver. It unregist * ers the driver. */ -static void s2io_closer(void) +void s2io_closer(void) { pci_unregister_driver(&s2io_driver); DBG_PRINT(INIT_DBG, "cleanup done\n"); diff -urpN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h --- vanilla_kernel/drivers/net/s2io.h 2005-06-06 08:22:29.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-27 06:21:32.000000000 -0700 @@ -31,6 +31,9 @@ #define SUCCESS 0 #define FAILURE -1 +/* Maximum time to flicker LED when asked to identify NIC using ethtool */ +#define MAX_FLICKER_TIME 60000 /* 60 Secs */ + /* Maximum outstanding splits to be configured into xena. */ typedef enum xena_max_outstanding_splits { XENA_ONE_SPLIT_TRANSACTION = 0, @@ -45,10 +48,10 @@ typedef enum xena_max_outstanding_splits #define XENA_MAX_OUTSTANDING_SPLITS(n) (n << 4) /* OS concerned variables and constants */ -#define WATCH_DOG_TIMEOUT 5*HZ -#define EFILL 0x1234 -#define ALIGN_SIZE 127 -#define PCIX_COMMAND_REGISTER 0x62 +#define WATCH_DOG_TIMEOUT 15*HZ +#define EFILL 0x1234 +#define ALIGN_SIZE 127 +#define PCIX_COMMAND_REGISTER 0x62 /* * Debug related variables. @@ -61,7 +64,7 @@ typedef enum xena_max_outstanding_splits #define INTR_DBG 4 /* Global variable that defines the present debug level of the driver. */ -static int debug_level = ERR_DBG; /* Default level. */ +int debug_level = ERR_DBG; /* Default level. */ /* DEBUG message print. */ #define DBG_PRINT(dbg_level, args...) 
if(!(debug_level> 48) @@ -382,7 +408,7 @@ typedef struct _RxD_t { #endif } RxD_t; -/* Structure that represents the Rx descriptor block which contains +/* Structure that represents the Rx descriptor block which contains * 128 Rx descriptors. */ #ifndef CONFIG_2BUFF_MODE @@ -392,11 +418,11 @@ typedef struct _RxD_block { u64 reserved_0; #define END_OF_BLOCK 0xFEFFFFFFFFFFFFFFULL - u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last + u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last * Rxd in this blk */ u64 reserved_2_pNext_RxD_block; /* Logical ptr to next */ u64 pNext_RxD_Blk_physical; /* Buff0_ptr.In a 32 bit arch - * the upper 32 bits should + * the upper 32 bits should * be 0 */ } RxD_block_t; #else @@ -405,13 +431,13 @@ typedef struct _RxD_block { RxD_t rxd[MAX_RXDS_PER_BLOCK]; #define END_OF_BLOCK 0xFEFFFFFFFFFFFFFFULL - u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last Rxd + u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last Rxd * in this blk */ u64 pNext_RxD_Blk_physical; /* Phy ponter to next blk. */ } RxD_block_t; #define SIZE_OF_BLOCK 4096 -/* Structure to hold virtual addresses of Buf0 and Buf1 in +/* Structure to hold virtual addresses of Buf0 and Buf1 in * 2buf mode. */ typedef struct bufAdd { void *ba_0_org; @@ -423,8 +449,8 @@ typedef struct bufAdd { /* Structure which stores all the MAC control parameters */ -/* This structure stores the offset of the RxD in the ring - * from which the Rx Interrupt processor can start picking +/* This structure stores the offset of the RxD in the ring + * from which the Rx Interrupt processor can start picking * up the RxDs for processing. */ typedef struct _rx_curr_get_info_t { @@ -436,7 +462,7 @@ typedef struct _rx_curr_get_info_t { typedef rx_curr_get_info_t rx_curr_put_info_t; /* This structure stores the offset of the TxDl in the FIFO - * from which the Tx Interrupt processor can start picking + * from which the Tx Interrupt processor can start picking * up the TxDLs for send complete interrupt processing. 
*/ typedef struct { @@ -446,32 +472,96 @@ typedef struct { typedef tx_curr_get_info_t tx_curr_put_info_t; -/* Infomation related to the Tx and Rx FIFOs and Rings of Xena - * is maintained in this structure. - */ -typedef struct mac_info { -/* rx side stuff */ - /* Put pointer info which indictes which RxD has to be replenished +/* Structure that holds the Phy and virt addresses of the Blocks */ +typedef struct rx_block_info { + RxD_t *block_virt_addr; + dma_addr_t block_dma_addr; +} rx_block_info_t; + +/* pre declaration of the nic structure */ +typedef struct s2io_nic nic_t; + +/* Ring specific structure */ +typedef struct ring_info { + /* The ring number */ + int ring_no; + + /* + * Place holders for the virtual and physical addresses of + * all the Rx Blocks + */ + rx_block_info_t rx_blocks[MAX_RX_BLOCKS_PER_RING]; + int block_count; + int pkt_cnt; + + /* + * Put pointer info which indictes which RxD has to be replenished * with a new buffer. */ - rx_curr_put_info_t rx_curr_put_info[MAX_RX_RINGS]; + rx_curr_put_info_t rx_curr_put_info; - /* Get pointer info which indictes which is the last RxD that was + /* + * Get pointer info which indictes which is the last RxD that was * processed by the driver. */ - rx_curr_get_info_t rx_curr_get_info[MAX_RX_RINGS]; + rx_curr_get_info_t rx_curr_get_info; - u16 rmac_pause_time; - u16 mc_pause_threshold_q0q3; - u16 mc_pause_threshold_q4q7; +#ifndef CONFIG_S2IO_NAPI + /* Index to the absolute position of the put pointer of Rx ring */ + int put_pos; +#endif + +#ifdef CONFIG_2BUFF_MODE + /* Buffer Address store. */ + buffAdd_t **ba; +#endif + nic_t *nic; +} ring_info_t; +/* Fifo specific structure */ +typedef struct fifo_info { + /* FIFO number */ + int fifo_no; + + /* Maximum TxDs per TxDL */ + int max_txds; + + /* Place holder of all the TX List's Phy and Virt addresses. 
*/ + list_info_hold_t *list_info; + + /* + * Current offset within the tx FIFO where driver would write + * new Tx frame + */ + tx_curr_put_info_t tx_curr_put_info; + + /* + * Current offset within tx FIFO from where the driver would start freeing + * the buffers + */ + tx_curr_get_info_t tx_curr_get_info; + + nic_t *nic; +}fifo_info_t; + +/* Infomation related to the Tx and Rx FIFOs and Rings of Xena + * is maintained in this structure. + */ +typedef struct mac_info { /* tx side stuff */ /* logical pointer of start of each Tx FIFO */ TxFIFO_element_t __iomem *tx_FIFO_start[MAX_TX_FIFOS]; -/* Current offset within tx_FIFO_start, where driver would write new Tx frame*/ - tx_curr_put_info_t tx_curr_put_info[MAX_TX_FIFOS]; - tx_curr_get_info_t tx_curr_get_info[MAX_TX_FIFOS]; + /* Fifo specific structure */ + fifo_info_t fifos[MAX_TX_FIFOS]; + +/* rx side stuff */ + /* Ring specific structure */ + ring_info_t rings[MAX_RX_RINGS]; + + u16 rmac_pause_time; + u16 mc_pause_threshold_q0q3; + u16 mc_pause_threshold_q4q7; void *stats_mem; /* orignal pointer to allocated mem */ dma_addr_t stats_mem_phy; /* Physical address of the stat block */ @@ -485,12 +575,6 @@ typedef struct { int usage_cnt; } usr_addr_t; -/* Structure that holds the Phy and virt addresses of the Blocks */ -typedef struct rx_block_info { - RxD_t *block_virt_addr; - dma_addr_t block_dma_addr; -} rx_block_info_t; - /* Default Tunable parameters of the NIC. */ #define DEFAULT_FIFO_LEN 4096 #define SMALL_RXD_CNT 30 * (MAX_RXDS_PER_BLOCK+1) @@ -499,7 +583,20 @@ typedef struct rx_block_info { #define LARGE_BLK_CNT 100 /* Structure representing one instance of the NIC */ -typedef struct s2io_nic { +struct s2io_nic { +#ifdef CONFIG_S2IO_NAPI + /* + * Count of packets to be processed in a given iteration, it will be indicated + * by the quota field of the device structure when NAPI is enabled. 
+ */ + int pkts_to_process; +#endif + struct net_device *dev; + mac_info_t mac_control; + struct config_param config; + struct pci_dev *pdev; + void __iomem *bar0; + void __iomem *bar1; #define MAX_MAC_SUPPORTED 16 #define MAX_SUPPORTED_MULTICASTS MAX_MAC_SUPPORTED @@ -507,33 +604,17 @@ typedef struct s2io_nic { macaddr_t pre_mac_addr[MAX_MAC_SUPPORTED]; struct net_device_stats stats; - void __iomem *bar0; - void __iomem *bar1; - struct config_param config; - mac_info_t mac_control; int high_dma_flag; int device_close_flag; int device_enabled_once; - char name[32]; + char name[50]; struct tasklet_struct task; volatile unsigned long tasklet_status; - struct timer_list timer; - struct net_device *dev; - struct pci_dev *pdev; - u16 vendor_id; - u16 device_id; - u16 ccmd; - u32 cbar0_1; - u32 cbar0_2; - u32 cbar1_1; - u32 cbar1_2; - u32 cirq; - u8 cache_line; - u32 rom_expansion; - u16 pcix_cmd; - u32 irq; + /* Space to back up the PCI config space */ + u32 config_space[256 / sizeof(u32)]; + atomic_t rx_bufs_left[MAX_RX_RINGS]; spinlock_t tx_lock; @@ -558,27 +639,11 @@ typedef struct s2io_nic { u16 tx_err_count; u16 rx_err_count; -#ifndef CONFIG_S2IO_NAPI - /* Index to the absolute position of the put pointer of Rx ring. */ - int put_pos[MAX_RX_RINGS]; -#endif - - /* - * Place holders for the virtual and physical addresses of - * all the Rx Blocks - */ - rx_block_info_t rx_blocks[MAX_RX_RINGS][MAX_RX_BLOCKS_PER_RING]; - int block_count[MAX_RX_RINGS]; - int pkt_cnt[MAX_RX_RINGS]; - - /* Place holder of all the TX List's Phy and Virt addresses. */ - list_info_hold_t *list_info[MAX_TX_FIFOS]; - /* Id timer, used to blink NIC to physically identify NIC. */ struct timer_list id_timer; /* Restart timer, used to restart NIC if the device is stuck and - * a schedule task that will set the correct Link state once the + * a schedule task that will set the correct Link state once the * NIC's PHY has stabilized after a state change. 
*/ #ifdef INIT_TQUEUE @@ -589,12 +654,12 @@ typedef struct s2io_nic { struct work_struct set_link_task; #endif - /* Flag that can be used to turn on or turn off the Rx checksum + /* Flag that can be used to turn on or turn off the Rx checksum * offload feature. */ int rx_csum; - /* after blink, the adapter must be restored with original + /* after blink, the adapter must be restored with original * values. */ u64 adapt_ctrl_org; @@ -604,16 +669,12 @@ typedef struct s2io_nic { #define LINK_DOWN 1 #define LINK_UP 2 -#ifdef CONFIG_2BUFF_MODE - /* Buffer Address store. */ - buffAdd_t **ba[MAX_RX_RINGS]; -#endif int task_flag; #define CARD_DOWN 1 #define CARD_UP 2 atomic_t card_state; volatile unsigned long link_state; -} nic_t; +}; #define RESET_ERROR 1; #define CMD_ERROR 2; @@ -622,9 +683,10 @@ typedef struct s2io_nic { #ifndef readq static inline u64 readq(void __iomem *addr) { - u64 ret = readl(addr + 4); - ret <<= 32; - ret |= readl(addr); + u64 ret = 0; + ret = readl(addr + 4); + (u64) ret <<= 32; + (u64) ret |= readl(addr); return ret; } @@ -637,10 +699,10 @@ static inline void writeq(u64 val, void writel((u32) (val >> 32), (addr + 4)); } -/* In 32 bit modes, some registers have to be written in a +/* In 32 bit modes, some registers have to be written in a * particular order to expect correct hardware operation. The - * macro SPECIAL_REG_WRITE is used to perform such ordered - * writes. Defines UF (Upper First) and LF (Lower First) will + * macro SPECIAL_REG_WRITE is used to perform such ordered + * writes. Defines UF (Upper First) and LF (Lower First) will * be used to specify the required write order. */ #define UF 1 @@ -716,6 +778,7 @@ static inline void SPECIAL_REG_WRITE(u64 #define PCC_FB_ECC_ERR vBIT(0xff, 16, 8) /* Interrupt to indicate PCC_FB_ECC Error. */ +#define RXD_GET_VLAN_TAG(Control_2) (u16)(Control_2 & MASK_VLAN_TAG) /* * Prototype declaration. 
*/ @@ -725,36 +788,29 @@ static void __devexit s2io_rem_nic(struc static int init_shared_mem(struct s2io_nic *sp); static void free_shared_mem(struct s2io_nic *sp); static int init_nic(struct s2io_nic *nic); -#ifndef CONFIG_S2IO_NAPI -static void rx_intr_handler(struct s2io_nic *sp); -#endif -static void tx_intr_handler(struct s2io_nic *sp); +static void rx_intr_handler(ring_info_t *ring_data); +static void tx_intr_handler(fifo_info_t *fifo_data); static void alarm_intr_handler(struct s2io_nic *sp); static int s2io_starter(void); -static void s2io_closer(void); +void s2io_closer(void); static void s2io_tx_watchdog(struct net_device *dev); static void s2io_tasklet(unsigned long dev_addr); static void s2io_set_multicast(struct net_device *dev); -#ifndef CONFIG_2BUFF_MODE -static int rx_osm_handler(nic_t * sp, u16 len, RxD_t * rxdp, int ring_no); -#else -static int rx_osm_handler(nic_t * sp, RxD_t * rxdp, int ring_no, - buffAdd_t * ba); -#endif -static void s2io_link(nic_t * sp, int link); -static void s2io_reset(nic_t * sp); -#ifdef CONFIG_S2IO_NAPI +static int rx_osm_handler(ring_info_t *ring_data, RxD_t * rxdp); +void s2io_link(nic_t * sp, int link); +void s2io_reset(nic_t * sp); +#if defined(CONFIG_S2IO_NAPI) static int s2io_poll(struct net_device *dev, int *budget); #endif static void s2io_init_pci(nic_t * sp); -static int s2io_set_mac_addr(struct net_device *dev, u8 * addr); +int s2io_set_mac_addr(struct net_device *dev, u8 * addr); static irqreturn_t s2io_isr(int irq, void *dev_id, struct pt_regs *regs); -static int verify_xena_quiescence(u64 val64, int flag); +static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag); static struct ethtool_ops netdev_ethtool_ops; static void s2io_set_link(unsigned long data); -static int s2io_set_swapper(nic_t * sp); -static void s2io_card_down(nic_t * nic); -static int s2io_card_up(nic_t * nic); - +int s2io_set_swapper(nic_t * sp); +static void s2io_card_down(nic_t *nic); +static int s2io_card_up(nic_t *nic); +int 
get_xena_rev_id(struct pci_dev *pdev);
 #endif /* _S2IO_H */

From raghavendra.koushik@neterion.com Thu Jul 7 15:23:01 2005
Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:23:10 -0700 (PDT)
Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MN1H9013553 for ; Thu, 7 Jul 2005 15:23:01 -0700
Received: by linux.site (Postfix, from userid 0) id C38D189828; Thu, 7 Jul 2005 15:10:33 -0700 (PDT)
To: jgarzik@pobox.com, netdev@oss.sgi.com
Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com
From: raghavendra.koushik@neterion.com
Subject: [PATCH 2.6.12.1 2/12] S2io: Hardware fixes
Message-Id: <20050707221033.C38D189828@linux.site>
Date: Thu, 7 Jul 2005 15:10:33 -0700 (PDT)
X-archive-position: 2673
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: raghavendra.koushik@neterion.com
Precedence: bulk
X-list: netdev
Content-Length: 23258
Lines: 666

Hi,

The patch below addresses a few h/w specific issues.

1. Check for an additional ownership bit on the Rx path before starting Rx processing.
2. Enable only 4 PCCs (Per Context Controllers) for Xframe I revisions less than 4.
3. Program the Rx and Tx round robin registers depending on the number of rings/FIFOs.
4. Tx continuous interrupts are now a loadable parameter.
5. Reset the card if we get double-bit ECC errors.
6. The soft reset of XGXS that was done to force a link state change has been eliminated.
7. After a reset, clear the "parity error detected" bit, the PCI-X ECC status register, and the PCI_STATUS bit in the tx_pic_int register.
8. The error in the implementation of disabling allmulticast has been rectified.
9. Leave the PCI-X parameters MMRBC, OST etc. at their BIOS/system defaults.
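
As a rough illustration of items 1 and 2 (not part of the patch itself), the two checks can be sketched in plain user-space C. Everything prefixed with SK_ or sk_ is a hypothetical name made up for this sketch. The bit helpers assume that bit 0 is the most significant bit of the 64-bit word, and the positions chosen for the ownership bit and the marker field are assumptions rather than values taken from s2io.h.

```c
#include <stdint.h>

/*
 * Assumed bit helpers: bit 0 is the MOST significant bit of the 64-bit
 * word. These are illustrative stand-ins, not the driver's own macros.
 */
#define SK_BIT(loc)            (0x8000000000000000ULL >> (loc))
#define SK_VBIT(val, loc, sz)  (((uint64_t)(val)) << (64 - (loc) - (sz)))

/* Hypothetical field layout, chosen for this sketch only. */
#define SK_RXD_OWN_XENA        SK_BIT(7)                  /* NIC owns the RxD */
#define SK_RXD_MARK            0x3ULL                     /* driver-written marker */
#define SK_RXD_MARKER_FIELD    SK_VBIT(SK_RXD_MARK, 0, 2) /* top two bits */
#define SK_GET_RXD_MARKER(c2)  (((c2) & SK_RXD_MARKER_FIELD) >> 62)

/* Mask that enables four PCCs, mirroring vBIT(0x0F,0,8) in the patch. */
#define SK_PCC_ENABLE_FOUR     SK_VBIT(0x0F, 0, 8)

/* Minimal stand-in for the two control words of an Rx descriptor. */
struct sk_rxd {
	uint64_t control_1;
	uint64_t control_2;
};

/*
 * Item 1: a descriptor is ready for Rx processing only if the NIC has
 * released ownership AND the marker written by the driver is gone.
 */
static int sk_rxd_is_up2dt(const struct sk_rxd *rxdp)
{
	return !(rxdp->control_1 & SK_RXD_OWN_XENA) &&
	       (SK_GET_RXD_MARKER(rxdp->control_2) != SK_RXD_MARK);
}

/*
 * Item 2: revisions below 4 get only four PCCs enabled; later revisions
 * are left at whatever the hardware default is.
 */
static int sk_needs_four_pcc(uint8_t rev_id)
{
	return rev_id < 4;
}
```

The point of the second condition in sk_rxd_is_up2dt() is that the ownership bit alone is not treated as sufficient; the descriptor is only handed to the Rx path once both indications agree.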
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_kernel/drivers/net/s2io-regs.h linux-2.6.12-rc6/drivers/net/s2io-regs.h --- vanilla_kernel/drivers/net/s2io-regs.h 2005-06-27 06:25:09.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io-regs.h 2005-06-27 06:25:50.000000000 -0700 @@ -62,6 +62,7 @@ typedef struct _XENA_dev_config { #define ADAPTER_STATUS_RMAC_REMOTE_FAULT BIT(6) #define ADAPTER_STATUS_RMAC_LOCAL_FAULT BIT(7) #define ADAPTER_STATUS_RMAC_PCC_IDLE vBIT(0xFF,8,8) +#define ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE vBIT(0x0F,8,8) #define ADAPTER_STATUS_RC_PRC_QUIESCENT vBIT(0xFF,16,8) #define ADAPTER_STATUS_MC_DRAM_READY BIT(24) #define ADAPTER_STATUS_MC_QUEUES_READY BIT(25) @@ -245,6 +246,7 @@ typedef struct _XENA_dev_config { #define STAT_TRSF_PER(n) TBD #define PER_SEC 0x208d5 #define SET_UPDT_PERIOD(n) vBIT((PER_SEC*n),32,32) +#define SET_UPDT_CLICKS(val) vBIT(val, 32, 32) u64 stat_addr; @@ -289,6 +291,7 @@ typedef struct _XENA_dev_config { u64 pcc_err_reg; #define PCC_FB_ECC_DB_ERR vBIT(0xFF, 16, 8) +#define PCC_ENABLE_FOUR vBIT(0x0F,0,8) u64 pcc_err_mask; u64 pcc_err_alarm; @@ -690,6 +693,10 @@ typedef struct _XENA_dev_config { #define MC_ERR_REG_MIRI_CRI_ERR_0 BIT(22) #define MC_ERR_REG_MIRI_CRI_ERR_1 BIT(23) #define MC_ERR_REG_SM_ERR BIT(31) +#define MC_ERR_REG_ECC_ALL_SNG (BIT(6) | \ + BIT(7) | BIT(17) | BIT(19)) +#define MC_ERR_REG_ECC_ALL_DBL (BIT(14) | \ + BIT(15) | BIT(18) | BIT(20)) u64 mc_err_mask; u64 mc_err_alarm; diff -uprN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-27 06:25:09.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-27 06:25:50.000000000 -0700 @@ -67,6 +67,16 @@ static char s2io_driver_name[] = "Neterion"; static char s2io_driver_version[] = "Version 1.7.7"; +static inline int RXD_IS_UP2DT(RxD_t *rxdp) +{ + int ret; + + ret = ((!(rxdp->Control_1 & RXD_OWN_XENA)) && + (GET_RXD_MARKER(rxdp->Control_2) != 
THE_RXD_MARK)); + + return ret; +} + /* * Cards with following subsystem_id have a link state indication * problem, 600B, 600C, 600D, 640B, 640C and 640D. @@ -229,6 +239,7 @@ static unsigned int rx_ring_sz[MAX_RX_RI static unsigned int Stats_refresh_time = 4; static unsigned int rts_frm_len[MAX_RX_RINGS] = {[0 ...(MAX_RX_RINGS - 1)] = 0 }; +static unsigned int use_continuous_tx_intrs = 1; static unsigned int rmac_pause_time = 65535; static unsigned int mc_pause_threshold_q0q3 = 187; static unsigned int mc_pause_threshold_q4q7 = 187; @@ -637,7 +648,7 @@ static int init_nic(struct s2io_nic *nic mac_control = &nic->mac_control; config = &nic->config; - /* to set the swapper control on the card */ + /* to set the swapper controle on the card */ if(s2io_set_swapper(nic)) { DBG_PRINT(ERR_DBG,"ERROR: Setting Swapper failed\n"); return -1; @@ -755,6 +766,13 @@ static int init_nic(struct s2io_nic *nic val64 |= BIT(0); /* To enable the FIFO partition. */ writeq(val64, &bar0->tx_fifo_partition_0); + /* + * Disable 4 PCCs for Xena1, 2 and 3 as per H/W bug + * SXE-008 TRANSMIT DMA ARBITRATION ISSUE. + */ + if (get_xena_rev_id(nic->pdev) < 4) + writeq(PCC_ENABLE_FOUR, &bar0->pcc_enable); + val64 = readq(&bar0->tx_fifo_partition_0); DBG_PRINT(INIT_DBG, "Fifo partition at: 0x%p is: 0x%llx\n", &bar0->tx_fifo_partition_0, (unsigned long long) val64); @@ -822,37 +840,250 @@ static int init_nic(struct s2io_nic *nic } writeq(val64, &bar0->rx_queue_cfg); - /* Initializing the Tx round robin registers to 0 - * filling tx and rx round robin registers as per - * the number of FIFOs and Rings is still TODO - */ - writeq(0, &bar0->tx_w_round_robin_0); - writeq(0, &bar0->tx_w_round_robin_1); - writeq(0, &bar0->tx_w_round_robin_2); - writeq(0, &bar0->tx_w_round_robin_3); - writeq(0, &bar0->tx_w_round_robin_4); - - /* - * TODO - * Disable Rx steering. Hard coding all packets to be steered to - * Queue 0 for now. 
+ /* + * Filling Tx round robin registers + * as per the number of FIFOs */ - val64 = 0x8080808080808080ULL; - writeq(val64, &bar0->rts_qos_steering); + switch (config->tx_fifo_num) { + case 1: + val64 = 0x0000000000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + writeq(val64, &bar0->tx_w_round_robin_1); + writeq(val64, &bar0->tx_w_round_robin_2); + writeq(val64, &bar0->tx_w_round_robin_3); + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 2: + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0100000100000100ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0001000001000001ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0100000000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 3: + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0001020000010001ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0200000100010200ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0001020000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 4: + val64 = 0x0001020300010200ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0100000102030001ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0200010000010203ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0001020001000001ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0203000100000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 5: + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 
0x0001000000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 6: + val64 = 0x0001020304000102ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0304050001020001ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0203000100000102ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0304000102030405ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0001000200000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 7: + val64 = 0x0001020001020300ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0102030400010203ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0405060001020001ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0304050000010200ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0102030000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 8: + val64 = 0x0001020300040105ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0200030106000204ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0103000502010007ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0304010002060500ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0103020400000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + } + + /* Filling the Rx round robin registers as per the + * number of Rings and steering based on QoS. 
+ */ + switch (config->rx_ring_num) { + case 1: + val64 = 0x8080808080808080ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 2: + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0100000100000100ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0001000001000001ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0100000000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080808040404040ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 3: + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0001020000010001ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0200000100010200ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0001020000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080804040402020ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 4: + val64 = 0x0001020300010200ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0100000102030001ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0200010000010203ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0001020001000001ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0203000100000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080404020201010ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 5: + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0001000000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 
0x8080404020201008ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 6: + val64 = 0x0001020304000102ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0304050001020001ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0203000100000102ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0304000102030405ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0001000200000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080404020100804ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 7: + val64 = 0x0001020001020300ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0102030400010203ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0405060001020001ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0304050000010200ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0102030000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080402010080402ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 8: + val64 = 0x0001020300040105ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0200030106000204ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0103000502010007ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0304010002060500ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0103020400000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8040201008040201ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + } /* UDP Fix */ val64 = 0; for (i = 0; i < 8; i++) writeq(val64, &bar0->rts_frm_len_n[i]); - /* Set the default rts frame length for ring0 */ - writeq(MAC_RTS_FRM_LEN_SET(dev->mtu+22), - &bar0->rts_frm_len_n[0]); + /* Set the default rts frame length for the rings configured */ + val64 = MAC_RTS_FRM_LEN_SET(dev->mtu+22); + for (i = 0 ; i < config->rx_ring_num ; i++) + writeq(val64, &bar0->rts_frm_len_n[i]); + + /* Set the frame length for the configured rings + * desired by 
the user + */ + for (i = 0; i < config->rx_ring_num; i++) { + /* If rts_frm_len[i] == 0 then it is assumed that user not + * specified frame length steering. + * If the user provides the frame length then program + * the rts_frm_len register for those values or else + * leave it as it is. + */ + if (rts_frm_len[i] != 0) { + writeq(MAC_RTS_FRM_LEN_SET(rts_frm_len[i]), + &bar0->rts_frm_len_n[i]); + } + } /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); val64 = SET_UPDT_PERIOD(Stats_refresh_time) | - STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; + STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; writeq(val64, &bar0->stat_cfg); /* @@ -876,13 +1107,14 @@ static int init_nic(struct s2io_nic *nic val64 = TTI_DATA1_MEM_TX_TIMER_VAL(0x2078) | TTI_DATA1_MEM_TX_URNG_A(0xA) | TTI_DATA1_MEM_TX_URNG_B(0x10) | - TTI_DATA1_MEM_TX_URNG_C(0x30) | TTI_DATA1_MEM_TX_TIMER_AC_EN | - TTI_DATA1_MEM_TX_TIMER_CI_EN; + TTI_DATA1_MEM_TX_URNG_C(0x30) | TTI_DATA1_MEM_TX_TIMER_AC_EN; + if (use_continuous_tx_intrs) + val64 |= TTI_DATA1_MEM_TX_TIMER_CI_EN; writeq(val64, &bar0->tti_data1_mem); val64 = TTI_DATA2_MEM_TX_UFC_A(0x10) | TTI_DATA2_MEM_TX_UFC_B(0x20) | - TTI_DATA2_MEM_TX_UFC_C(0x40) | TTI_DATA2_MEM_TX_UFC_D(0x80); + TTI_DATA2_MEM_TX_UFC_C(0x70) | TTI_DATA2_MEM_TX_UFC_D(0x80); writeq(val64, &bar0->tti_data2_mem); val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD; @@ -926,10 +1158,11 @@ static int init_nic(struct s2io_nic *nic writeq(val64, &bar0->rti_command_mem); /* - * Once the operation completes, the Strobe bit of the command - * register will be reset. We poll for this particular condition - * We wait for a maximum of 500ms for the operation to complete, - * if it's not complete by then we return error. + * Once the operation completes, the Strobe bit of the + * command register will be reset. We poll for this + * particular condition. We wait for a maximum of 500ms + * for the operation to complete, if it's not complete + * by then we return error. 
*/ time = 0; while (TRUE) { @@ -1184,10 +1417,10 @@ static void en_dis_able_nic_intrs(struct temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); /* - * All MC block error interrupts are disabled for now. - * TODO + * Enable all MC Intrs. */ - writeq(DISABLE_ALL_INTRS, &bar0->mc_int_mask); + writeq(0x0, &bar0->mc_int_mask); + writeq(0x0, &bar0->mc_err_mask); } else if (flag == DISABLE_INTRS) { /* * Disable MC Intrs in the general intr mask register @@ -1246,23 +1479,41 @@ static void en_dis_able_nic_intrs(struct } } -static int check_prc_pcc_state(u64 val64, int flag) +static int check_prc_pcc_state(u64 val64, int flag, int rev_id) { int ret = 0; if (flag == FALSE) { - if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT)) { - ret = 1; + if (rev_id >= 4) { + if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT)) { + ret = 1; + } + } else { + if (!(val64 & ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) && + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT)) { + ret = 1; + } } } else { - if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == - ADAPTER_STATUS_RMAC_PCC_IDLE) && - (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT))) { - ret = 1; + if (rev_id >= 4) { + if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == + ADAPTER_STATUS_RMAC_PCC_IDLE) && + (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT))) { + ret = 1; + } + } else { + if (((val64 & ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) == + ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) && + (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT))) { + ret = 1; + } } } @@ -1285,6 +1536,7 @@ static int verify_xena_quiescence(nic_t { int 
ret = 0; u64 tmp64 = ~((u64) val64); + int rev_id = get_xena_rev_id(sp->pdev); if (! (tmp64 & @@ -1293,7 +1545,7 @@ static int verify_xena_quiescence(nic_t ADAPTER_STATUS_PIC_QUIESCENT | ADAPTER_STATUS_MC_DRAM_READY | ADAPTER_STATUS_MC_QUEUES_READY | ADAPTER_STATUS_M_PLL_LOCK | ADAPTER_STATUS_P_PLL_LOCK))) { - ret = check_prc_pcc_state(val64, flag); + ret = check_prc_pcc_state(val64, flag, rev_id); } return ret; @@ -1406,7 +1658,7 @@ static int start_nic(struct s2io_nic *ni /* Enable select interrupts */ interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR; + RX_MAC_INTR | MC_INTR; en_dis_able_nic_intrs(nic, interruptible, ENABLE_INTRS); /* @@ -1438,21 +1690,6 @@ static int start_nic(struct s2io_nic *ni */ schedule_work(&nic->set_link_task); - /* - * Here we are performing soft reset on XGXS to - * force link down. Since link is already up, we will get - * link state change interrupt after this reset - */ - SPECIAL_REG_WRITE(0x80010515001E0000ULL, &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); - udelay(50); - SPECIAL_REG_WRITE(0x80010515001E00E0ULL, &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); - udelay(50); - SPECIAL_REG_WRITE(0x80070515001F00E4ULL, &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); - udelay(50); - return SUCCESS; } @@ -1523,7 +1760,7 @@ static void stop_nic(struct s2io_nic *ni /* Disable all interrupts */ interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR; + RX_MAC_INTR | MC_INTR; en_dis_able_nic_intrs(nic, interruptible, DISABLE_INTRS); /* Disable PRCs */ @@ -1738,6 +1975,7 @@ int fill_rx_buffers(struct s2io_nic *nic off++; mac_control->rings[ring_no].rx_curr_put_info.offset = off; #endif + rxdp->Control_2 |= SET_RXD_MARKER; atomic_inc(&nic->rx_bufs_left[ring_no]); alloc_tab++; @@ -1966,11 +2204,8 @@ static void rx_intr_handler(ring_info_t put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + put_info.offset; #endif - while ((!(rxdp->Control_1 & 
RXD_OWN_XENA)) && -#ifdef CONFIG_2BUFF_MODE - (!rxdp->Control_2 & BIT(0)) && -#endif - (((get_offset + 1) % ring_bufs) != put_offset)) { + while (RXD_IS_UP2DT(rxdp) && + (((get_offset + 1) % ring_bufs) != put_offset)) { skb = (struct sk_buff *) ((unsigned long)rxdp->Host_Control); if (skb == NULL) { DBG_PRINT(ERR_DBG, "%s: The skb is ", @@ -2154,6 +2389,21 @@ static void alarm_intr_handler(struct s2 schedule_work(&nic->set_link_task); } + /* Handling Ecc errors */ + val64 = readq(&bar0->mc_err_reg); + writeq(val64, &bar0->mc_err_reg); + if (val64 & (MC_ERR_REG_ECC_ALL_SNG | MC_ERR_REG_ECC_ALL_DBL)) { + if (val64 & MC_ERR_REG_ECC_ALL_DBL) { + DBG_PRINT(ERR_DBG, "%s: Device indicates ", + dev->name); + DBG_PRINT(ERR_DBG, "double ECC error!!\n"); + netif_stop_queue(dev); + schedule_work(&nic->rst_timer_task); + } else { + /* Device can recover from Single ECC errors */ + } + } + /* In case of a serious error, the device will be Reset. */ val64 = readq(&bar0->serr_source); if (val64 & SERR_SOURCE_ANY) { @@ -2227,7 +2477,7 @@ void s2io_reset(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64; - u16 subid; + u16 subid, pci_cmd; val64 = SW_RESET_ALL; writeq(val64, &bar0->sw_reset); @@ -2256,6 +2506,18 @@ void s2io_reset(nic_t * sp) /* Set swapper to enable I/O register access */ s2io_set_swapper(sp); + /* Clear certain PCI/PCI-X fields after reset */ + pci_read_config_word(sp->pdev, PCI_STATUS, &pci_cmd); + pci_cmd &= 0x7FFF; /* Clear parity err detect bit */ + pci_write_config_word(sp->pdev, PCI_STATUS, pci_cmd); + + /* Clearing PCIX Ecc status register */ + pci_write_config_dword(sp->pdev, 0x68, 0); + + val64 = readq(&bar0->txpic_int_reg); + val64 &= ~BIT(62); /* Clearing PCI_STATUS error reflected here */ + writeq(val64, &bar0->txpic_int_reg); + /* Reset device statistics maintained by OS */ memset(&sp->stats, 0, sizeof (struct net_device_stats)); @@ -2798,6 +3060,8 @@ static void s2io_set_multicast(struct ne /* Disable all Multicast addresses */ 
writeq(RMAC_ADDR_DATA0_MEM_ADDR(dis_addr), &bar0->rmac_addr_data0_mem); + writeq(RMAC_ADDR_DATA1_MEM_MASK(0x0), + &bar0->rmac_addr_data1_mem); val64 = RMAC_ADDR_CMD_MEM_WE | RMAC_ADDR_CMD_MEM_STROBE_NEW_CMD | RMAC_ADDR_CMD_MEM_OFFSET(sp->all_multi_pos); @@ -4370,21 +4634,6 @@ static void s2io_init_pci(nic_t * sp) (pci_cmd | PCI_COMMAND_PARITY)); pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); - /* Set MMRB count to 1024 in PCI-X Command register. */ - pcix_cmd &= 0xFFF3; - pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - (pcix_cmd | (0x1 << 2))); /* MMRBC 1K */ - pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(pcix_cmd)); - - /* Setting Maximum outstanding splits based on system type. */ - pcix_cmd &= 0xFF8F; - pcix_cmd |= XENA_MAX_OUTSTANDING_SPLITS(0x1); /* 2 splits. */ - pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - pcix_cmd); - pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(pcix_cmd)); - /* Forcibly disabling relaxed ordering capability of the card. 
*/ pcix_cmd &= 0xfffd; pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, @@ -4401,6 +4650,7 @@ module_param_array(tx_fifo_len, uint, NU module_param_array(rx_ring_sz, uint, NULL, 0); module_param(Stats_refresh_time, int, 0); module_param_array(rts_frm_len, uint, NULL, 0); +module_param(use_continuous_tx_intrs, int, 1); module_param(rmac_pause_time, int, 0); module_param(mc_pause_threshold_q0q3, int, 0); module_param(mc_pause_threshold_q4q7, int, 0); diff -uprN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h --- vanilla_kernel/drivers/net/s2io.h 2005-06-27 06:25:09.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-27 06:25:50.000000000 -0700 @@ -372,6 +372,10 @@ typedef struct _RxD_t { #define RXD_GET_L4_CKSUM(val) ((u16)(val) & 0xFFFF) u64 Control_2; +#define THE_RXD_MARK 0x3 +#define SET_RXD_MARKER vBIT(THE_RXD_MARK, 0, 2) +#define GET_RXD_MARKER(ctrl) ((ctrl & SET_RXD_MARKER) >> 62) + #ifndef CONFIG_2BUFF_MODE #define MASK_BUFFER0_SIZE vBIT(0x3FFF,2,14) #define SET_BUFFER0_SIZE(val) vBIT(val,2,14) From davem@davemloft.net Thu Jul 7 15:26:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:26:17 -0700 (PDT) Received: from sunset.davemloft.net ([216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MQBH9014292 for ; Thu, 7 Jul 2005 15:26:11 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Dqen3-00033k-Qk; Thu, 07 Jul 2005 15:24:17 -0700 Date: Thu, 07 Jul 2005 15:24:17 -0700 (PDT) Message-Id: <20050707.152417.59653729.davem@davemloft.net> To: tgraf@suug.ch Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. 
Miller" In-Reply-To: <20050707213450.GB16076@postel.suug.ch> References: <20050706124206.GW16076@postel.suug.ch> <20050707.141718.85410359.davem@davemloft.net> <20050707213450.GB16076@postel.suug.ch> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2674 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 28488 Lines: 804

From: Thomas Graf
Date: Thu, 7 Jul 2005 23:34:50 +0200

> Since I'm changing the classful qdiscs to use a generic API
> for queue management anyway I could take care of this if you want.
> WRT the leaf qdiscs it's a bit more complicated since we have
> to change the new API to take a new struct which includes the qlen
> and the sk_buff_head but not a problem either.

Ok. I'm going to check something like the following into my tree. It takes care of the obvious cases of direct binary test of queue length being zero vs. non-zero.

This uncovered some seriously questionable stuff along the way. For example, take a look at drivers/usb/net/usbnet.c:usbnet_stop(). That code seems to want to wait until all the SKB queues are empty, but the way it is coded it only waits if all the queues have at least one packet. I preserved the behavior there, but if someone could verify my analysis and post a bug fix, I'd really appreciate it.

Thanks.
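The usbnet_stop() observation can be illustrated with a small user-space sketch. It is not kernel code: struct mock_queue and the mock_* helpers are hypothetical stand-ins for struct sk_buff_head, skb_queue_len() and skb_queue_empty(), and only the queue length is modelled. The first predicate mirrors the condition as the patch preserves it (all three tests joined with &&); the second shows what the "wait for deletions to finish" comment appears to intend (joined with ||). Whether the || form is the right fix for usbnet is an assumption that still needs the verification asked for above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct sk_buff_head; only the length is modelled. */
struct mock_queue {
	unsigned int qlen;
};

/* Mirrors the intent of skb_queue_len(): number of queued buffers. */
static unsigned int mock_queue_len(const struct mock_queue *q)
{
	return q->qlen;
}

/* Mirrors the intent of skb_queue_empty(): true when nothing is queued. */
static bool mock_queue_empty(const struct mock_queue *q)
{
	return mock_queue_len(q) == 0;
}

/* Condition as preserved by the patch: keep waiting only while ALL queues hold data. */
static bool must_wait_as_coded(const struct mock_queue *rxq,
			       const struct mock_queue *txq,
			       const struct mock_queue *done)
{
	return !mock_queue_empty(rxq) &&
	       !mock_queue_empty(txq) &&
	       !mock_queue_empty(done);
}

/* Assumed intent: keep waiting while ANY queue still holds data. */
static bool must_wait_until_drained(const struct mock_queue *rxq,
				    const struct mock_queue *txq,
				    const struct mock_queue *done)
{
	return !mock_queue_empty(rxq) ||
	       !mock_queue_empty(txq) ||
	       !mock_queue_empty(done);
}
```

With rxq holding two buffers, txq empty and done holding one, must_wait_as_coded() returns false, so the loop would exit with buffers still outstanding, while must_wait_until_drained() returns true. The two predicates agree only when every queue is empty or every queue is non-empty.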
diff --git a/drivers/bluetooth/hci_vhci.c b/drivers/bluetooth/hci_vhci.c --- a/drivers/bluetooth/hci_vhci.c +++ b/drivers/bluetooth/hci_vhci.c @@ -120,7 +120,7 @@ static unsigned int hci_vhci_chr_poll(st poll_wait(file, &hci_vhci->read_wait, wait); - if (skb_queue_len(&hci_vhci->readq)) + if (!skb_queue_empty(&hci_vhci->readq)) return POLLIN | POLLRDNORM; return POLLOUT | POLLWRNORM; diff --git a/drivers/isdn/hisax/isdnl1.c b/drivers/isdn/hisax/isdnl1.c --- a/drivers/isdn/hisax/isdnl1.c +++ b/drivers/isdn/hisax/isdnl1.c @@ -279,7 +279,8 @@ BChannel_proc_xmt(struct BCState *bcs) if (test_and_clear_bit(FLG_L1_PULL_REQ, &st->l1.Flags)) st->l1.l1l2(st, PH_PULL | CONFIRM, NULL); if (!test_bit(BC_FLG_ACTIV, &bcs->Flag)) { - if (!test_bit(BC_FLG_BUSY, &bcs->Flag) && (!skb_queue_len(&bcs->squeue))) { + if (!test_bit(BC_FLG_BUSY, &bcs->Flag) && + skb_queue_empty(&bcs->squeue)) { st->l2.l2l1(st, PH_DEACTIVATE | CONFIRM, NULL); } } diff --git a/drivers/isdn/hisax/isdnl2.c b/drivers/isdn/hisax/isdnl2.c --- a/drivers/isdn/hisax/isdnl2.c +++ b/drivers/isdn/hisax/isdnl2.c @@ -108,7 +108,8 @@ static int l2addrsize(struct Layer2 *l2) static void set_peer_busy(struct Layer2 *l2) { test_and_set_bit(FLG_PEER_BUSY, &l2->flag); - if (skb_queue_len(&l2->i_queue) || skb_queue_len(&l2->ui_queue)) + if (!skb_queue_empty(&l2->i_queue) || + !skb_queue_empty(&l2->ui_queue)) test_and_set_bit(FLG_L2BLOCK, &l2->flag); } @@ -754,7 +755,7 @@ l2_restart_multi(struct FsmInst *fi, int st->l2.l2l3(st, DL_ESTABLISH | INDICATION, NULL); if ((ST_L2_7==state) || (ST_L2_8 == state)) - if (skb_queue_len(&st->l2.i_queue) && cansend(st)) + if (!skb_queue_empty(&st->l2.i_queue) && cansend(st)) st->l2.l2l1(st, PH_PULL | REQUEST, NULL); } @@ -810,7 +811,7 @@ l2_connected(struct FsmInst *fi, int eve if (pr != -1) st->l2.l2l3(st, pr, NULL); - if (skb_queue_len(&st->l2.i_queue) && cansend(st)) + if (!skb_queue_empty(&st->l2.i_queue) && cansend(st)) st->l2.l2l1(st, PH_PULL | REQUEST, NULL); } @@ -1014,7 +1015,7 @@ 
l2_st7_got_super(struct FsmInst *fi, int if(typ != RR) FsmDelTimer(&st->l2.t203, 9); restart_t200(st, 12); } - if (skb_queue_len(&st->l2.i_queue) && (typ == RR)) + if (!skb_queue_empty(&st->l2.i_queue) && (typ == RR)) st->l2.l2l1(st, PH_PULL | REQUEST, NULL); } else nrerrorrecovery(fi); @@ -1120,7 +1121,7 @@ l2_got_iframe(struct FsmInst *fi, int ev return; } - if (skb_queue_len(&st->l2.i_queue) && (fi->state == ST_L2_7)) + if (!skb_queue_empty(&st->l2.i_queue) && (fi->state == ST_L2_7)) st->l2.l2l1(st, PH_PULL | REQUEST, NULL); if (test_and_clear_bit(FLG_ACK_PEND, &st->l2.flag)) enquiry_cr(st, RR, RSP, 0); @@ -1138,7 +1139,7 @@ l2_got_tei(struct FsmInst *fi, int event test_and_set_bit(FLG_L3_INIT, &st->l2.flag); } else FsmChangeState(fi, ST_L2_4); - if (skb_queue_len(&st->l2.ui_queue)) + if (!skb_queue_empty(&st->l2.ui_queue)) tx_ui(st); } @@ -1301,7 +1302,7 @@ l2_pull_iqueue(struct FsmInst *fi, int e FsmDelTimer(&st->l2.t203, 13); FsmAddTimer(&st->l2.t200, st->l2.T200, EV_L2_T200, NULL, 11); } - if (skb_queue_len(&l2->i_queue) && cansend(st)) + if (!skb_queue_empty(&l2->i_queue) && cansend(st)) st->l2.l2l1(st, PH_PULL | REQUEST, NULL); } @@ -1347,7 +1348,7 @@ l2_st8_got_super(struct FsmInst *fi, int } invoke_retransmission(st, nr); FsmChangeState(fi, ST_L2_7); - if (skb_queue_len(&l2->i_queue) && cansend(st)) + if (!skb_queue_empty(&l2->i_queue) && cansend(st)) st->l2.l2l1(st, PH_PULL | REQUEST, NULL); } else nrerrorrecovery(fi); diff --git a/drivers/isdn/hisax/isdnl3.c b/drivers/isdn/hisax/isdnl3.c --- a/drivers/isdn/hisax/isdnl3.c +++ b/drivers/isdn/hisax/isdnl3.c @@ -302,7 +302,7 @@ release_l3_process(struct l3_process *p) !test_bit(FLG_PTP, &p->st->l2.flag)) { if (p->debug) l3_debug(p->st, "release_l3_process: last process"); - if (!skb_queue_len(&p->st->l3.squeue)) { + if (skb_queue_empty(&p->st->l3.squeue)) { if (p->debug) l3_debug(p->st, "release_l3_process: release link"); if (p->st->protocol != ISDN_PTYPE_NI1) diff --git a/drivers/isdn/i4l/isdn_tty.c 
b/drivers/isdn/i4l/isdn_tty.c --- a/drivers/isdn/i4l/isdn_tty.c +++ b/drivers/isdn/i4l/isdn_tty.c @@ -1223,7 +1223,7 @@ isdn_tty_write(struct tty_struct *tty, c total += c; } atomic_dec(&info->xmit_lock); - if ((info->xmit_count) || (skb_queue_len(&info->xmit_queue))) { + if ((info->xmit_count) || !skb_queue_empty(&info->xmit_queue)) { if (m->mdmreg[REG_DXMT] & BIT_DXMT) { isdn_tty_senddown(info); isdn_tty_tint(info); @@ -1284,7 +1284,7 @@ isdn_tty_flush_chars(struct tty_struct * if (isdn_tty_paranoia_check(info, tty->name, "isdn_tty_flush_chars")) return; - if ((info->xmit_count) || (skb_queue_len(&info->xmit_queue))) + if ((info->xmit_count) || !skb_queue_empty(&info->xmit_queue)) isdn_timer_ctrl(ISDN_TIMER_MODEMXMIT, 1); } diff --git a/drivers/isdn/icn/icn.c b/drivers/isdn/icn/icn.c --- a/drivers/isdn/icn/icn.c +++ b/drivers/isdn/icn/icn.c @@ -304,12 +304,12 @@ icn_pollbchan_send(int channel, icn_card isdn_ctrl cmd; if (!(card->sndcount[channel] || card->xskb[channel] || - skb_queue_len(&card->spqueue[channel]))) + !skb_queue_empty(&card->spqueue[channel]))) return; if (icn_trymaplock_channel(card, mch)) { while (sbfree && (card->sndcount[channel] || - skb_queue_len(&card->spqueue[channel]) || + !skb_queue_empty(&card->spqueue[channel]) || card->xskb[channel])) { spin_lock_irqsave(&card->lock, flags); if (card->xmit_lock[channel]) { diff --git a/drivers/net/hamradio/scc.c b/drivers/net/hamradio/scc.c --- a/drivers/net/hamradio/scc.c +++ b/drivers/net/hamradio/scc.c @@ -304,7 +304,7 @@ static inline void scc_discard_buffers(s scc->tx_buff = NULL; } - while (skb_queue_len(&scc->tx_queue)) + while (!skb_queue_empty(&scc->tx_queue)) dev_kfree_skb(skb_dequeue(&scc->tx_queue)); spin_unlock_irqrestore(&scc->lock, flags); @@ -1126,8 +1126,7 @@ static void t_dwait(unsigned long channe if (scc->stat.tx_state == TXS_WAIT) /* maxkeyup or idle timeout */ { - if (skb_queue_len(&scc->tx_queue) == 0) /* nothing to send */ - { + if (skb_queue_empty(&scc->tx_queue)) { /* nothing 
to send */ scc->stat.tx_state = TXS_IDLE; netif_wake_queue(scc->dev); /* t_maxkeyup locked it. */ return; diff --git a/drivers/net/ppp_async.c b/drivers/net/ppp_async.c --- a/drivers/net/ppp_async.c +++ b/drivers/net/ppp_async.c @@ -364,7 +364,7 @@ ppp_asynctty_receive(struct tty_struct * spin_lock_irqsave(&ap->recv_lock, flags); ppp_async_input(ap, buf, cflags, count); spin_unlock_irqrestore(&ap->recv_lock, flags); - if (skb_queue_len(&ap->rqueue)) + if (!skb_queue_empty(&ap->rqueue)) tasklet_schedule(&ap->tsk); ap_put(ap); if (test_and_clear_bit(TTY_THROTTLED, &tty->flags) diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c --- a/drivers/net/ppp_generic.c +++ b/drivers/net/ppp_generic.c @@ -1237,8 +1237,8 @@ static int ppp_mp_explode(struct ppp *pp pch = list_entry(list, struct channel, clist); navail += pch->avail = (pch->chan != NULL); if (pch->avail) { - if (skb_queue_len(&pch->file.xq) == 0 - || !pch->had_frag) { + if (skb_queue_empty(&pch->file.xq) || + !pch->had_frag) { pch->avail = 2; ++nfree; } @@ -1374,8 +1374,8 @@ static int ppp_mp_explode(struct ppp *pp /* try to send it down the channel */ chan = pch->chan; - if (skb_queue_len(&pch->file.xq) - || !chan->ops->start_xmit(chan, frag)) + if (!skb_queue_empty(&pch->file.xq) || + !chan->ops->start_xmit(chan, frag)) skb_queue_tail(&pch->file.xq, frag); pch->had_frag = 1; p += flen; @@ -1412,7 +1412,7 @@ ppp_channel_push(struct channel *pch) spin_lock_bh(&pch->downl); if (pch->chan != 0) { - while (skb_queue_len(&pch->file.xq) > 0) { + while (!skb_queue_empty(&pch->file.xq)) { skb = skb_dequeue(&pch->file.xq); if (!pch->chan->ops->start_xmit(pch->chan, skb)) { /* put the packet back and try again later */ @@ -1426,7 +1426,7 @@ ppp_channel_push(struct channel *pch) } spin_unlock_bh(&pch->downl); /* see if there is anything from the attached unit to be sent */ - if (skb_queue_len(&pch->file.xq) == 0) { + if (skb_queue_empty(&pch->file.xq)) { read_lock_bh(&pch->upl); ppp = pch->ppp; if (ppp != 0) 
diff --git a/drivers/net/ppp_synctty.c b/drivers/net/ppp_synctty.c --- a/drivers/net/ppp_synctty.c +++ b/drivers/net/ppp_synctty.c @@ -406,7 +406,7 @@ ppp_sync_receive(struct tty_struct *tty, spin_lock_irqsave(&ap->recv_lock, flags); ppp_sync_input(ap, buf, cflags, count); spin_unlock_irqrestore(&ap->recv_lock, flags); - if (skb_queue_len(&ap->rqueue)) + if (!skb_queue_empty(&ap->rqueue)) tasklet_schedule(&ap->tsk); sp_put(ap); if (test_and_clear_bit(TTY_THROTTLED, &tty->flags) diff --git a/drivers/net/tun.c b/drivers/net/tun.c --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -215,7 +215,7 @@ static unsigned int tun_chr_poll(struct poll_wait(file, &tun->read_wait, wait); - if (skb_queue_len(&tun->readq)) + if (!skb_queue_empty(&tun->readq)) mask |= POLLIN | POLLRDNORM; return mask; diff --git a/drivers/net/wireless/airo.c b/drivers/net/wireless/airo.c --- a/drivers/net/wireless/airo.c +++ b/drivers/net/wireless/airo.c @@ -2374,7 +2374,7 @@ void stop_airo_card( struct net_device * /* * Clean out tx queue */ - if (test_bit(FLAG_MPI, &ai->flags) && skb_queue_len (&ai->txq) > 0) { + if (test_bit(FLAG_MPI, &ai->flags) && !skb_queue_empty(&ai->txq)) { struct sk_buff *skb = NULL; for (;(skb = skb_dequeue(&ai->txq));) dev_kfree_skb(skb); @@ -3287,7 +3287,7 @@ exitrx: if (status & EV_TXEXC) get_tx_error(apriv, -1); spin_lock_irqsave(&apriv->aux_lock, flags); - if (skb_queue_len (&apriv->txq)) { + if (!skb_queue_empty(&apriv->txq)) { spin_unlock_irqrestore(&apriv->aux_lock,flags); mpi_send_packet (dev); } else { diff --git a/drivers/s390/net/claw.c b/drivers/s390/net/claw.c --- a/drivers/s390/net/claw.c +++ b/drivers/s390/net/claw.c @@ -428,7 +428,7 @@ claw_pack_skb(struct claw_privbk *privpt new_skb = NULL; /* assume no dice */ pkt_cnt = 0; CLAW_DBF_TEXT(4,trace,"PackSKBe"); - if (skb_queue_len(&p_ch->collect_queue) > 0) { + if (!skb_queue_empty(&p_ch->collect_queue)) { /* some data */ held_skb = skb_dequeue(&p_ch->collect_queue); if (p_env->packing != DO_PACKED) @@ 
-1254,7 +1254,7 @@ claw_write_next ( struct chbk * p_ch ) privptr = (struct claw_privbk *) dev->priv; claw_free_wrt_buf( dev ); if ((privptr->write_free_count > 0) && - (skb_queue_len(&p_ch->collect_queue) > 0)) { + !skb_queue_empty(&p_ch->collect_queue)) { pk_skb = claw_pack_skb(privptr); while (pk_skb != NULL) { rc = claw_hw_tx( pk_skb, dev,1); diff --git a/drivers/s390/net/ctctty.c b/drivers/s390/net/ctctty.c --- a/drivers/s390/net/ctctty.c +++ b/drivers/s390/net/ctctty.c @@ -156,7 +156,7 @@ ctc_tty_readmodem(ctc_tty_info *info) skb_queue_head(&info->rx_queue, skb); else { kfree_skb(skb); - ret = skb_queue_len(&info->rx_queue); + ret = !skb_queue_empty(&info->rx_queue); } } } @@ -530,7 +530,7 @@ ctc_tty_write(struct tty_struct *tty, co total += c; count -= c; } - if (skb_queue_len(&info->tx_queue)) { + if (!skb_queue_empty(&info->tx_queue)) { info->lsr &= ~UART_LSR_TEMT; tasklet_schedule(&info->tasklet); } @@ -594,7 +594,7 @@ ctc_tty_flush_chars(struct tty_struct *t return; if (ctc_tty_paranoia_check(info, tty->name, "ctc_tty_flush_chars")) return; - if (tty->stopped || tty->hw_stopped || (!skb_queue_len(&info->tx_queue))) + if (tty->stopped || tty->hw_stopped || skb_queue_empty(&info->tx_queue)) return; tasklet_schedule(&info->tasklet); } diff --git a/drivers/usb/net/usbnet.c b/drivers/usb/net/usbnet.c --- a/drivers/usb/net/usbnet.c +++ b/drivers/usb/net/usbnet.c @@ -3227,9 +3227,9 @@ static int usbnet_stop (struct net_devic temp = unlink_urbs (dev, &dev->txq) + unlink_urbs (dev, &dev->rxq); // maybe wait for deletions to finish. 
- while (skb_queue_len (&dev->rxq) - && skb_queue_len (&dev->txq) - && skb_queue_len (&dev->done)) { + while (!skb_queue_empty(&dev->rxq) && + !skb_queue_empty(&dev->txq) && + !skb_queue_empty(&dev->done)) { msleep(UNLINK_TIMEOUT_MS); if (netif_msg_ifdown (dev)) devdbg (dev, "waited for %d urb completions", temp); diff --git a/include/net/irda/irda_device.h b/include/net/irda/irda_device.h --- a/include/net/irda/irda_device.h +++ b/include/net/irda/irda_device.h @@ -224,7 +224,7 @@ int irda_device_is_receiving(struct net /* Interface for internal use */ static inline int irda_device_txqueue_empty(const struct net_device *dev) { - return (skb_queue_len(&dev->qdisc->q) == 0); + return skb_queue_empty(&dev->qdisc->q); } int irda_device_set_raw_mode(struct net_device* self, int status); struct net_device *alloc_irdadev(int sizeof_priv); diff --git a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -991,7 +991,7 @@ static __inline__ void tcp_fast_path_on( static inline void tcp_fast_path_check(struct sock *sk, struct tcp_sock *tp) { - if (skb_queue_len(&tp->out_of_order_queue) == 0 && + if (skb_queue_empty(&tp->out_of_order_queue) && tp->rcv_wnd && atomic_read(&sk->sk_rmem_alloc) < sk->sk_rcvbuf && !tp->urg_data) diff --git a/net/bluetooth/cmtp/core.c b/net/bluetooth/cmtp/core.c --- a/net/bluetooth/cmtp/core.c +++ b/net/bluetooth/cmtp/core.c @@ -213,7 +213,7 @@ static int cmtp_send_frame(struct cmtp_s return kernel_sendmsg(sock, &msg, &iv, 1, len); } -static int cmtp_process_transmit(struct cmtp_session *session) +static void cmtp_process_transmit(struct cmtp_session *session) { struct sk_buff *skb, *nskb; unsigned char *hdr; @@ -223,7 +223,7 @@ static int cmtp_process_transmit(struct if (!(nskb = alloc_skb(session->mtu, GFP_ATOMIC))) { BT_ERR("Can't allocate memory for new frame"); - return -ENOMEM; + return; } while ((skb = skb_dequeue(&session->transmit))) { @@ -275,8 +275,6 @@ static int cmtp_process_transmit(struct 
cmtp_send_frame(session, nskb->data, nskb->len); kfree_skb(nskb); - - return skb_queue_len(&session->transmit); } static int cmtp_session(void *arg) diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c --- a/net/bluetooth/hidp/core.c +++ b/net/bluetooth/hidp/core.c @@ -428,7 +428,7 @@ static int hidp_send_frame(struct socket return kernel_sendmsg(sock, &msg, &iv, 1, len); } -static int hidp_process_transmit(struct hidp_session *session) +static void hidp_process_transmit(struct hidp_session *session) { struct sk_buff *skb; @@ -453,9 +453,6 @@ static int hidp_process_transmit(struct hidp_set_timer(session); kfree_skb(skb); } - - return skb_queue_len(&session->ctrl_transmit) + - skb_queue_len(&session->intr_transmit); } static int hidp_session(void *arg) diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c --- a/net/bluetooth/rfcomm/sock.c +++ b/net/bluetooth/rfcomm/sock.c @@ -590,8 +590,11 @@ static long rfcomm_sock_data_wait(struct for (;;) { set_current_state(TASK_INTERRUPTIBLE); - if (skb_queue_len(&sk->sk_receive_queue) || sk->sk_err || (sk->sk_shutdown & RCV_SHUTDOWN) || - signal_pending(current) || !timeo) + if (!skb_queue_empty(&sk->sk_receive_queue) || + sk->sk_err || + (sk->sk_shutdown & RCV_SHUTDOWN) || + signal_pending(current) || + !timeo) break; set_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags); diff --git a/net/bluetooth/rfcomm/tty.c b/net/bluetooth/rfcomm/tty.c --- a/net/bluetooth/rfcomm/tty.c +++ b/net/bluetooth/rfcomm/tty.c @@ -781,7 +781,7 @@ static int rfcomm_tty_chars_in_buffer(st BT_DBG("tty %p dev %p", tty, dev); - if (skb_queue_len(&dlc->tx_queue)) + if (!skb_queue_empty(&dlc->tx_queue)) return dlc->mtu; return 0; diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c --- a/net/decnet/af_decnet.c +++ b/net/decnet/af_decnet.c @@ -536,7 +536,7 @@ static void dn_keepalive(struct sock *sk * we are double checking that we are not sending too * many of these keepalive frames. 
*/ - if (skb_queue_len(&scp->other_xmit_queue) == 0) + if (skb_queue_empty(&scp->other_xmit_queue)) dn_nsp_send_link(sk, DN_NOCHANGE, 0); } @@ -1191,7 +1191,7 @@ static unsigned int dn_poll(struct file struct dn_scp *scp = DN_SK(sk); int mask = datagram_poll(file, sock, wait); - if (skb_queue_len(&scp->other_receive_queue)) + if (!skb_queue_empty(&scp->other_receive_queue)) mask |= POLLRDBAND; return mask; @@ -1214,7 +1214,7 @@ static int dn_ioctl(struct socket *sock, case SIOCATMARK: lock_sock(sk); - val = (skb_queue_len(&scp->other_receive_queue) != 0); + val = !skb_queue_empty(&scp->other_receive_queue); if (scp->state != DN_RUN) val = -ENOTCONN; release_sock(sk); @@ -1630,7 +1630,7 @@ static int dn_data_ready(struct sock *sk int len = 0; if (flags & MSG_OOB) - return skb_queue_len(q) ? 1 : 0; + return !skb_queue_empty(q) ? 1 : 0; while(skb != (struct sk_buff *)q) { struct dn_skb_cb *cb = DN_SKB_CB(skb); @@ -1707,7 +1707,7 @@ static int dn_recvmsg(struct kiocb *iocb if (sk->sk_err) goto out; - if (skb_queue_len(&scp->other_receive_queue)) { + if (!skb_queue_empty(&scp->other_receive_queue)) { if (!(flags & MSG_OOB)) { msg->msg_flags |= MSG_OOB; if (!scp->other_report) { diff --git a/net/decnet/dn_nsp_out.c b/net/decnet/dn_nsp_out.c --- a/net/decnet/dn_nsp_out.c +++ b/net/decnet/dn_nsp_out.c @@ -342,7 +342,8 @@ int dn_nsp_xmit_timeout(struct sock *sk) dn_nsp_output(sk); - if (skb_queue_len(&scp->data_xmit_queue) || skb_queue_len(&scp->other_xmit_queue)) + if (!skb_queue_empty(&scp->data_xmit_queue) || + !skb_queue_empty(&scp->other_xmit_queue)) scp->persist = dn_nsp_persist(sk); return 0; diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1105,7 +1105,7 @@ static void tcp_prequeue_process(struct struct sk_buff *skb; struct tcp_sock *tp = tcp_sk(sk); - NET_ADD_STATS_USER(LINUX_MIB_TCPPREQUEUED, skb_queue_len(&tp->ucopy.prequeue)); + NET_INC_STATS_USER(LINUX_MIB_TCPPREQUEUED); /* RX process wants to run with disabled BHs, 
though it is not * necessary */ @@ -1369,7 +1369,7 @@ int tcp_recvmsg(struct kiocb *iocb, stru * is not empty. It is more elegant, but eats cycles, * unfortunately. */ - if (skb_queue_len(&tp->ucopy.prequeue)) + if (!skb_queue_empty(&tp->ucopy.prequeue)) goto do_prequeue; /* __ Set realtime policy in scheduler __ */ @@ -1394,7 +1394,7 @@ int tcp_recvmsg(struct kiocb *iocb, stru } if (tp->rcv_nxt == tp->copied_seq && - skb_queue_len(&tp->ucopy.prequeue)) { + !skb_queue_empty(&tp->ucopy.prequeue)) { do_prequeue: tcp_prequeue_process(sk); @@ -1476,7 +1476,7 @@ skip_copy: } while (len > 0); if (user_recv) { - if (skb_queue_len(&tp->ucopy.prequeue)) { + if (!skb_queue_empty(&tp->ucopy.prequeue)) { int chunk; tp->ucopy.len = copied > 0 ? len : 0; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2802,7 +2802,7 @@ static void tcp_sack_remove(struct tcp_s int this_sack; /* Empty ofo queue, hence, all the SACKs are eaten. Clear. */ - if (skb_queue_len(&tp->out_of_order_queue) == 0) { + if (skb_queue_empty(&tp->out_of_order_queue)) { tp->rx_opt.num_sacks = 0; tp->rx_opt.eff_sacks = tp->rx_opt.dsack; return; @@ -2935,13 +2935,13 @@ queue_and_out: if(th->fin) tcp_fin(skb, sk, th); - if (skb_queue_len(&tp->out_of_order_queue)) { + if (!skb_queue_empty(&tp->out_of_order_queue)) { tcp_ofo_queue(sk); /* RFC2581. 4.2. SHOULD send immediate ACK, when * gap in queue is filled. */ - if (!skb_queue_len(&tp->out_of_order_queue)) + if (skb_queue_empty(&tp->out_of_order_queue)) tp->ack.pingpong = 0; } @@ -3249,9 +3249,8 @@ static int tcp_prune_queue(struct sock * * This must not ever occur. */ /* First, purge the out_of_order queue. */ - if (skb_queue_len(&tp->out_of_order_queue)) { - NET_ADD_STATS_BH(LINUX_MIB_OFOPRUNED, - skb_queue_len(&tp->out_of_order_queue)); + if (!skb_queue_empty(&tp->out_of_order_queue)) { + NET_INC_STATS_BH(LINUX_MIB_OFOPRUNED); __skb_queue_purge(&tp->out_of_order_queue); /* Reset SACK state. 
A conforming SACK implementation will diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -231,11 +231,10 @@ static void tcp_delack_timer(unsigned lo } tp->ack.pending &= ~TCP_ACK_TIMER; - if (skb_queue_len(&tp->ucopy.prequeue)) { + if (!skb_queue_empty(&tp->ucopy.prequeue)) { struct sk_buff *skb; - NET_ADD_STATS_BH(LINUX_MIB_TCPSCHEDULERFAILED, - skb_queue_len(&tp->ucopy.prequeue)); + NET_INC_STATS_BH(LINUX_MIB_TCPSCHEDULERFAILED); while ((skb = __skb_dequeue(&tp->ucopy.prequeue)) != NULL) sk->sk_backlog_rcv(sk, skb); diff --git a/net/irda/irlap.c b/net/irda/irlap.c --- a/net/irda/irlap.c +++ b/net/irda/irlap.c @@ -445,9 +445,8 @@ void irlap_disconnect_request(struct irl IRDA_ASSERT(self->magic == LAP_MAGIC, return;); /* Don't disconnect until all data frames are successfully sent */ - if (skb_queue_len(&self->txq) > 0) { + if (!skb_queue_empty(&self->txq)) { self->disconnect_pending = TRUE; - return; } diff --git a/net/irda/irlap_event.c b/net/irda/irlap_event.c --- a/net/irda/irlap_event.c +++ b/net/irda/irlap_event.c @@ -191,7 +191,7 @@ static void irlap_start_poll_timer(struc * Send out the RR frames faster if our own transmit queue is empty, or * if the peer is busy. The effect is a much faster conversation */ - if ((skb_queue_len(&self->txq) == 0) || (self->remote_busy)) { + if (skb_queue_empty(&self->txq) || self->remote_busy) { if (self->fast_RR == TRUE) { /* * Assert that the fast poll timer has not reached the @@ -263,7 +263,7 @@ void irlap_do_event(struct irlap_cb *sel IRDA_DEBUG(2, "%s() : queue len = %d\n", __FUNCTION__, skb_queue_len(&self->txq)); - if (skb_queue_len(&self->txq)) { + if (!skb_queue_empty(&self->txq)) { /* Prevent race conditions with irlap_data_request() */ self->local_busy = TRUE; @@ -1074,7 +1074,7 @@ static int irlap_state_xmit_p(struct irl #else /* CONFIG_IRDA_DYNAMIC_WINDOW */ /* Window has been adjusted for the max packet * size, so much simpler... 
- Jean II */ - nextfit = (skb_queue_len(&self->txq) > 0); + nextfit = !skb_queue_empty(&self->txq); #endif /* CONFIG_IRDA_DYNAMIC_WINDOW */ /* * Send data with poll bit cleared only if window > 1 @@ -1814,7 +1814,7 @@ static int irlap_state_xmit_s(struct irl #else /* CONFIG_IRDA_DYNAMIC_WINDOW */ /* Window has been adjusted for the max packet * size, so much simpler... - Jean II */ - nextfit = (skb_queue_len(&self->txq) > 0); + nextfit = !skb_queue_empty(&self->txq); #endif /* CONFIG_IRDA_DYNAMIC_WINDOW */ /* * Send data with final bit cleared only if window > 1 @@ -1937,7 +1937,7 @@ static int irlap_state_nrm_s(struct irla irlap_data_indication(self, skb, FALSE); /* Any pending data requests? */ - if ((skb_queue_len(&self->txq) > 0) && + if (!skb_queue_empty(&self->txq) && (self->window > 0)) { self->ack_required = TRUE; @@ -2038,7 +2038,7 @@ static int irlap_state_nrm_s(struct irla /* * Any pending data requests? */ - if ((skb_queue_len(&self->txq) > 0) && + if (!skb_queue_empty(&self->txq) && (self->window > 0) && !self->remote_busy) { irlap_data_indication(self, skb, TRUE); @@ -2069,7 +2069,7 @@ static int irlap_state_nrm_s(struct irla */ nr_status = irlap_validate_nr_received(self, info->nr); if (nr_status == NR_EXPECTED) { - if ((skb_queue_len( &self->txq) > 0) && + if (!skb_queue_empty(&self->txq) && (self->window > 0)) { self->remote_busy = FALSE; diff --git a/net/irda/irlap_frame.c b/net/irda/irlap_frame.c --- a/net/irda/irlap_frame.c +++ b/net/irda/irlap_frame.c @@ -1018,11 +1018,10 @@ void irlap_resend_rejected_frames(struct /* * We can now fill the window with additional data frames */ - while (skb_queue_len( &self->txq) > 0) { + while (!skb_queue_empty(&self->txq)) { IRDA_DEBUG(0, "%s(), sending additional frames!\n", __FUNCTION__); - if ((skb_queue_len( &self->txq) > 0) && - (self->window > 0)) { + if (self->window > 0) { skb = skb_dequeue( &self->txq); IRDA_ASSERT(skb != NULL, return;); @@ -1031,8 +1030,7 @@ void irlap_resend_rejected_frames(struct * 
bit cleared */ if ((self->window > 1) && - skb_queue_len(&self->txq) > 0) - { + !skb_queue_empty(&self->txq)) { irlap_send_data_primary(self, skb); } else { irlap_send_data_primary_poll(self, skb); diff --git a/net/irda/irttp.c b/net/irda/irttp.c --- a/net/irda/irttp.c +++ b/net/irda/irttp.c @@ -1513,7 +1513,7 @@ int irttp_disconnect_request(struct tsap /* * Check if there is still data segments in the transmit queue */ - if (skb_queue_len(&self->tx_queue) > 0) { + if (!skb_queue_empty(&self->tx_queue)) { if (priority == P_HIGH) { /* * No need to send the queued data, if we are diff --git a/net/llc/llc_c_ev.c b/net/llc/llc_c_ev.c --- a/net/llc/llc_c_ev.c +++ b/net/llc/llc_c_ev.c @@ -84,7 +84,7 @@ static u16 llc_util_nr_inside_tx_window( if (llc->dev->flags & IFF_LOOPBACK) goto out; rc = 1; - if (!skb_queue_len(&llc->pdu_unack_q)) + if (skb_queue_empty(&llc->pdu_unack_q)) goto out; skb = skb_peek(&llc->pdu_unack_q); pdu = llc_pdu_sn_hdr(skb); diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -858,7 +858,7 @@ static inline void netlink_rcv_wake(stru { struct netlink_sock *nlk = nlk_sk(sk); - if (!skb_queue_len(&sk->sk_receive_queue)) + if (skb_queue_empty(&sk->sk_receive_queue)) clear_bit(0, &nlk->state); if (!test_bit(0, &nlk->state)) wake_up_interruptible(&nlk->wait); diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c --- a/net/sched/sch_red.c +++ b/net/sched/sch_red.c @@ -385,7 +385,7 @@ static int red_change(struct Qdisc *sch, memcpy(q->Stab, RTA_DATA(tb[TCA_RED_STAB-1]), 256); q->qcount = -1; - if (skb_queue_len(&sch->q) == 0) + if (skb_queue_empty(&sch->q)) PSCHED_SET_PASTPERFECT(q->qidlestart); sch_tree_unlock(sch); return 0; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -302,7 +302,7 @@ static void unix_write_space(struct sock * may receive messages only from that peer. 
*/ static void unix_dgram_disconnected(struct sock *sk, struct sock *other) { - if (skb_queue_len(&sk->sk_receive_queue)) { + if (!skb_queue_empty(&sk->sk_receive_queue)) { skb_queue_purge(&sk->sk_receive_queue); wake_up_interruptible_all(&unix_sk(sk)->peer_wait); @@ -1619,7 +1619,7 @@ static long unix_stream_data_wait(struct for (;;) { prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); - if (skb_queue_len(&sk->sk_receive_queue) || + if (!skb_queue_empty(&sk->sk_receive_queue) || sk->sk_err || (sk->sk_shutdown & RCV_SHUTDOWN) || signal_pending(current) || From raghavendra.koushik@neterion.com Thu Jul 7 15:31:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:31:38 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MVZH9016324 for ; Thu, 7 Jul 2005 15:31:35 -0700 Received: by linux.site (Postfix, from userid 0) id 9C65089846; Thu, 7 Jul 2005 15:19:07 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 3/12] S2io: Software fixes Message-Id: <20050707221907.9C65089846@linux.site> Date: Thu, 7 Jul 2005 15:19:07 -0700 (PDT) X-archive-position: 2675 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 9733 Lines: 325 Hi, The patch below includes fixes for a few purely software bugs identified since the last release. 1. Keep track of and display (as part of the ethtool command output) the number of single-bit and double-bit ECC errors. 2. Handle the race condition between the interrupt handler and the "interface down" routine. 3.
Initial link state setting modified so that the link state displayed after "interface Up" is correct. 4. Fix for "Incorrect Tx packet count when TSO is enabled". 5. Disable periodic DMA of statistics and schedule one-shot DMA only when required. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-27 06:29:59.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-27 06:30:09.000000000 -0700 @@ -157,6 +157,9 @@ static char ethtool_stats_keys[][ETH_GST {"rmac_pause_cnt"}, {"rmac_accepted_ip"}, {"rmac_err_tcp"}, + {"\n DRIVER STATISTICS"}, + {"single_bit_ecc_errs"}, + {"double_bit_ecc_errs"}, }; #define S2IO_STAT_LEN sizeof(ethtool_stats_keys)/ ETH_GSTRING_LEN @@ -236,7 +239,6 @@ static unsigned int tx_fifo_len[MAX_TX_F static unsigned int rx_ring_num = 1; static unsigned int rx_ring_sz[MAX_RX_RINGS] = {[0 ...(MAX_RX_RINGS - 1)] = 0 }; -static unsigned int Stats_refresh_time = 4; static unsigned int rts_frm_len[MAX_RX_RINGS] = {[0 ...(MAX_RX_RINGS - 1)] = 0 }; static unsigned int use_continuous_tx_intrs = 1; @@ -1082,9 +1084,6 @@ static int init_nic(struct s2io_nic *nic /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); - val64 = SET_UPDT_PERIOD(Stats_refresh_time) | - STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; - writeq(val64, &bar0->stat_cfg); /* * Initializing the sampling rate for the device to calculate the @@ -2102,6 +2101,7 @@ static int s2io_poll(struct net_device * u64 val64; int i; + atomic_inc(&nic->isr_cnt); mac_control = &nic->mac_control; config = &nic->config; @@ -2137,6 +2137,7 @@ static int s2io_poll(struct net_device * } /* Re enable the Rx interrupts. 
*/ en_dis_able_nic_intrs(nic, RX_TRAFFIC_INTR, ENABLE_INTRS); + atomic_dec(&nic->isr_cnt); return 0; no_rx: @@ -2150,6 +2151,7 @@ no_rx: break; } } + atomic_dec(&nic->isr_cnt); return 1; } #endif @@ -2180,6 +2182,13 @@ static void rx_intr_handler(ring_info_t #endif register u64 val64; + spin_lock(&nic->rx_lock); + if (atomic_read(&nic->card_state) == CARD_DOWN) { + DBG_PRINT(ERR_DBG, "%s: %s going down for reset\n", + __FUNCTION__, dev->name); + spin_unlock(&nic->rx_lock); + } + /* * rx_traffic_int reg is an R1 register, hence we read and write * back the same value in the register to clear it @@ -2211,6 +2220,7 @@ static void rx_intr_handler(ring_info_t DBG_PRINT(ERR_DBG, "%s: The skb is ", dev->name); DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); + spin_unlock(&nic->rx_lock); return; } #ifndef CONFIG_2BUFF_MODE @@ -2263,6 +2273,7 @@ static void rx_intr_handler(ring_info_t break; #endif } + spin_unlock(&nic->rx_lock); } /** @@ -2346,7 +2357,6 @@ static void tx_intr_handler(fifo_info_t (sizeof(TxD_t) * fifo_data->max_txds)); /* Updating the statistics block */ - nic->stats.tx_packets++; nic->stats.tx_bytes += skb->len; dev_kfree_skb_irq(skb); @@ -2394,13 +2404,16 @@ static void alarm_intr_handler(struct s2 writeq(val64, &bar0->mc_err_reg); if (val64 & (MC_ERR_REG_ECC_ALL_SNG | MC_ERR_REG_ECC_ALL_DBL)) { if (val64 & MC_ERR_REG_ECC_ALL_DBL) { + nic->mac_control.stats_info->sw_stat. + double_ecc_errs++; DBG_PRINT(ERR_DBG, "%s: Device indicates ", dev->name); DBG_PRINT(ERR_DBG, "double ECC error!!\n"); netif_stop_queue(dev); schedule_work(&nic->rst_timer_task); } else { - /* Device can recover from Single ECC errors */ + nic->mac_control.stats_info->sw_stat. 
+ single_ecc_errs++; } } @@ -2696,7 +2709,7 @@ int s2io_open(struct net_device *dev) * Nic is initialized */ netif_carrier_off(dev); - sp->last_link_state = LINK_DOWN; + sp->last_link_state = 0; /* Unkown link state */ /* Initialize H/W and enable interrupts */ if (s2io_card_up(sp)) { @@ -2910,6 +2923,7 @@ static irqreturn_t s2io_isr(int irq, voi mac_info_t *mac_control; struct config_param *config; + atomic_inc(&sp->isr_cnt); mac_control = &sp->mac_control; config = &sp->config; @@ -2925,6 +2939,7 @@ static irqreturn_t s2io_isr(int irq, voi if (!reason) { /* The interrupt was not raised by Xena. */ + atomic_dec(&sp->isr_cnt); return IRQ_NONE; } @@ -2973,6 +2988,7 @@ static irqreturn_t s2io_isr(int irq, voi dev->name); DBG_PRINT(ERR_DBG, " in ISR!!\n"); clear_bit(0, (&sp->tasklet_status)); + atomic_dec(&sp->isr_cnt); return IRQ_HANDLED; } clear_bit(0, (&sp->tasklet_status)); @@ -2982,10 +2998,37 @@ static irqreturn_t s2io_isr(int irq, voi } #endif + atomic_dec(&sp->isr_cnt); return IRQ_HANDLED; } /** + * s2io_updt_stats - + */ +static void s2io_updt_stats(nic_t *sp) +{ + XENA_dev_config_t __iomem *bar0 = sp->bar0; + u64 val64; + int cnt = 0; + + if (atomic_read(&sp->card_state) == CARD_UP) { + /* Apprx 30us on a 133 MHz bus */ + val64 = SET_UPDT_CLICKS(10) | + STAT_CFG_ONE_SHOT_EN | STAT_CFG_STAT_EN; + writeq(val64, &bar0->stat_cfg); + do { + udelay(100); + val64 = readq(&bar0->stat_cfg); + if (!(val64 & BIT(0))) + break; + cnt++; + if (cnt == 5) + break; /* Updt failed */ + } while(1); + } +} + +/** * s2io_get_stats - Updates the device statistics structure. * @dev : pointer to the device structure. 
* Description: @@ -3005,6 +3048,11 @@ struct net_device_stats *s2io_get_stats( mac_control = &sp->mac_control; config = &sp->config; + /* Configure Stats for immediate updt */ + s2io_updt_stats(sp); + + sp->stats.tx_packets = + le32_to_cpu(mac_control->stats_info->tmac_frms); sp->stats.tx_errors = le32_to_cpu(mac_control->stats_info->tmac_any_err_frms); sp->stats.rx_errors = @@ -4019,6 +4067,7 @@ static void s2io_get_ethtool_stats(struc nic_t *sp = dev->priv; StatInfo_t *stat_info = sp->mac_control.stats_info; + s2io_updt_stats(sp); tmp_stats[i++] = le32_to_cpu(stat_info->tmac_frms); tmp_stats[i++] = le32_to_cpu(stat_info->tmac_data_octets); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_drop_frms); @@ -4058,6 +4107,9 @@ static void s2io_get_ethtool_stats(struc tmp_stats[i++] = le32_to_cpu(stat_info->rmac_pause_cnt); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_accepted_ip); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_tcp); + tmp_stats[i++] = 0; + tmp_stats[i++] = stat_info->sw_stat.single_ecc_errs; + tmp_stats[i++] = stat_info->sw_stat.double_ecc_errs; } int s2io_ethtool_get_regs_len(struct net_device *dev) @@ -4354,14 +4406,27 @@ static void s2io_card_down(nic_t * sp) break; } } while (1); - spin_lock_irqsave(&sp->tx_lock, flags); s2io_reset(sp); - /* Free all unused Tx and Rx buffers */ + /* Waiting till all Interrupt handlers are complete */ + cnt = 0; + do { + msleep(10); + if (!atomic_read(&sp->isr_cnt)) + break; + cnt++; + } while(cnt < 5); + + spin_lock_irqsave(&sp->tx_lock, flags); + /* Free all Tx buffers */ free_tx_buffers(sp); + spin_unlock_irqrestore(&sp->tx_lock, flags); + + /* Free all Rx buffers */ + spin_lock_irqsave(&sp->rx_lock, flags); free_rx_buffers(sp); + spin_unlock_irqrestore(&sp->rx_lock, flags); - spin_unlock_irqrestore(&sp->tx_lock, flags); clear_bit(0, &(sp->link_state)); } @@ -4648,7 +4713,6 @@ module_param(tx_fifo_num, int, 0); module_param(rx_ring_num, int, 0); module_param_array(tx_fifo_len, uint, NULL, 0); 
module_param_array(rx_ring_sz, uint, NULL, 0); -module_param(Stats_refresh_time, int, 0); module_param_array(rts_frm_len, uint, NULL, 0); module_param(use_continuous_tx_intrs, int, 1); module_param(rmac_pause_time, int, 0); @@ -4805,6 +4869,9 @@ s2io_init_nic(struct pci_dev *pdev, cons for (i = 0; i < config->rx_ring_num; i++) atomic_set(&sp->rx_bufs_left[i], 0); + /* Initialize the number of ISRs currently running */ + atomic_set(&sp->isr_cnt, 0); + /* initialize the shared memory used by the NIC and the host */ if (init_shared_mem(sp)) { DBG_PRINT(ERR_DBG, "%s: Memory allocation failed\n", @@ -4939,6 +5006,7 @@ s2io_init_nic(struct pci_dev *pdev, cons #ifndef CONFIG_S2IO_NAPI spin_lock_init(&sp->put_lock); #endif + spin_lock_init(&sp->rx_lock); /* * SXE-002: Configure link and activity LED to init state @@ -4962,13 +5030,16 @@ s2io_init_nic(struct pci_dev *pdev, cons goto register_failed; } + /* Initialize device name */ + strcpy(sp->name, dev->name); + strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); + /* * Make Link state as off at this point, when the Link change * interrupt comes the state will be automatically changed to * the right state. 
*/ netif_carrier_off(dev); - sp->last_link_state = LINK_DOWN; return 0; diff -uprN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h --- vanilla_kernel/drivers/net/s2io.h 2005-06-27 06:29:59.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-27 06:30:09.000000000 -0700 @@ -195,6 +195,9 @@ typedef struct stat_block { u32 rxd_rd_cnt; u32 rxf_wr_cnt; u32 txf_rd_cnt; + +/* Software statistics maintained by driver */ + swStat_t sw_stat; } StatInfo_t; /* @@ -678,6 +681,8 @@ struct s2io_nic { #define CARD_UP 2 atomic_t card_state; volatile unsigned long link_state; + spinlock_t rx_lock; + atomic_t isr_cnt; }; #define RESET_ERROR 1; From davem@davemloft.net Thu Jul 7 15:32:31 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:32:35 -0700 (PDT) Received: from sunset.davemloft.net ([216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MWUH9016568 for ; Thu, 7 Jul 2005 15:32:30 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DqetB-00034m-Bm; Thu, 07 Jul 2005 15:30:37 -0700 Date: Thu, 07 Jul 2005 15:30:37 -0700 (PDT) Message-Id: <20050707.153037.18577551.davem@davemloft.net> To: jesse.brandeburg@intel.com Cc: tgraf@suug.ch, dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. 
Miller" In-Reply-To: References: X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2676 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 249 Lines: 7 From: "Brandeburg, Jesse" Date: Thu, 7 Jul 2005 15:02:17 -0700 > Arg, this thread wasn't on the new list, is there any chance we can just > get netdev@oss.sgi.com to forward to netdev@vger.kernel.org? It does already. From raghavendra.koushik@neterion.com Thu Jul 7 15:36:06 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:36:10 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67Ma5H9017561 for ; Thu, 7 Jul 2005 15:36:05 -0700 Received: by linux.site (Postfix, from userid 0) id 1081A89826; Thu, 7 Jul 2005 15:23:38 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 4/12] S2io: Removed memory leaks Message-Id: <20050707222338.1081A89826@linux.site> Date: Thu, 7 Jul 2005 15:23:38 -0700 (PDT) X-archive-position: 2677 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 2063 Lines: 67 Hi, This patch fixes certain memory leaks discovered in free_tx_buffers() and rx_osm_handler() Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-27 
06:34:23.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-27 06:34:38.000000000 -0700 @@ -1708,7 +1708,7 @@ static void free_tx_buffers(struct s2io_ int i, j; mac_info_t *mac_control; struct config_param *config; - int cnt = 0; + int cnt = 0, frg_cnt; mac_control = &nic->mac_control; config = &nic->config; @@ -1721,11 +1721,33 @@ static void free_tx_buffers(struct s2io_ (struct sk_buff *) ((unsigned long) txdp-> Host_Control); if (skb == NULL) { - memset(txdp, 0, sizeof(TxD_t)); + memset(txdp, 0, sizeof(TxD_t) * + config->max_txds); continue; } + frg_cnt = skb_shinfo(skb)->nr_frags; + pci_unmap_single(nic->pdev, (dma_addr_t) + txdp->Buffer_Pointer, + skb->len - skb->data_len, + PCI_DMA_TODEVICE); + if (frg_cnt) { + TxD_t *temp; + temp = txdp; + txdp++; + for (j = 0; j < frg_cnt; j++, txdp++) { + skb_frag_t *frag = + &skb_shinfo(skb)->frags[j]; + pci_unmap_page(nic->pdev, + (dma_addr_t) + txdp-> + Buffer_Pointer, + frag->size, + PCI_DMA_TODEVICE); + } + txdp = temp; + } dev_kfree_skb(skb); - memset(txdp, 0, sizeof(TxD_t)); + memset(txdp, 0, sizeof(TxD_t) * config->max_txds); cnt++; } DBG_PRINT(INTR_DBG, @@ -4571,6 +4593,11 @@ static int rx_osm_handler(ring_info_t *r unsigned long long err = rxdp->Control_1 & RXD_T_CODE; DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%llx\n", dev->name, err); + dev_kfree_skb(skb); + sp->stats.rx_crc_errors++; + atomic_dec(&sp->rx_bufs_left[ring_no]); + rxdp->Host_Control = 0; + return 0; } /* Updating statistics */ From raghavendra.koushik@neterion.com Thu Jul 7 15:42:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:42:33 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MgUH9018677 for ; Thu, 7 Jul 2005 15:42:30 -0700 Received: by linux.site (Postfix, from userid 0) id E452789826; Thu, 7 Jul 2005 15:30:02 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: 
raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 6/12] S2io: Support for runtime MTU change Message-Id: <20050707223002.E452789826@linux.site> Date: Thu, 7 Jul 2005 15:30:02 -0700 (PDT) X-archive-position: 2679 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 1853 Lines: 60 Hi, Patch below supports MTU change on-the-fly(without bringing interface down) Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-28 02:04:52.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-28 02:07:26.000000000 -0700 @@ -2850,6 +2850,7 @@ int s2io_xmit(struct sk_buff *skb, struc } txdp->Control_2 |= config->tx_intr_type; + txdp->Control_1 |= (TXD_BUFFER0_SIZE(frg_len) | TXD_GATHER_CODE_FIRST); txdp->Control_1 |= TXD_LIST_OWN_XENA; @@ -4247,14 +4248,6 @@ int s2io_ioctl(struct net_device *dev, s int s2io_change_mtu(struct net_device *dev, int new_mtu) { nic_t *sp = dev->priv; - XENA_dev_config_t __iomem *bar0 = sp->bar0; - register u64 val64; - - if (netif_running(dev)) { - DBG_PRINT(ERR_DBG, "%s: Must be stopped to ", dev->name); - DBG_PRINT(ERR_DBG, "change its MTU\n"); - return -EBUSY; - } if ((new_mtu < MIN_MTU) || (new_mtu > S2IO_JUMBO_SIZE)) { DBG_PRINT(ERR_DBG, "%s: MTU size is invalid.\n", @@ -4262,11 +4255,22 @@ int s2io_change_mtu(struct net_device *d return -EPERM; } - /* Set the new MTU into the PYLD register of the NIC */ - val64 = new_mtu; - writeq(vBIT(val64, 2, 14), &bar0->rmac_max_pyld_len); - dev->mtu = new_mtu; + if (netif_running(dev)) { + s2io_card_down(sp); + netif_stop_queue(dev); + if (s2io_card_up(sp)) { + 
DBG_PRINT(ERR_DBG, "%s: Device bring up failed\n", + __FUNCTION__); + } + if (netif_queue_stopped(dev)) + netif_wake_queue(dev); + } else { /* Device is down */ + XENA_dev_config_t __iomem *bar0 = sp->bar0; + u64 val64 = new_mtu; + + writeq(vBIT(val64, 2, 14), &bar0->rmac_max_pyld_len); + } return 0; } From raghavendra.koushik@neterion.com Thu Jul 7 15:40:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:40:13 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67Me9H9018357 for ; Thu, 7 Jul 2005 15:40:09 -0700 Received: by linux.site (Postfix, from userid 0) id 71C3E89826; Thu, 7 Jul 2005 15:27:41 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Message-Id: <20050707222741.71C3E89826@linux.site> Date: Thu, 7 Jul 2005 15:27:41 -0700 (PDT) X-archive-position: 2678 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 5732 Lines: 166 Hi, This patch consists mostly of performance-related changes. 1. Fixed incorrect computation of PANIC level in rx_buffer_level(). 2. Removed unnecessary PIOs (read/write of tx_traffic_int and rx_traffic_int) from the interrupt handler and removed the read of the general_int_status register from the xmit routine. 3. Enable two-buffer mode (for the Rx path) automatically for SGI systems. This improves Rx performance dramatically on SGI systems.
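Item 2 can be illustrated with a small user-space model of a write-1-to-clear register (what the patch comments call an "R1" register). This is only a sketch under stated assumptions: struct w1c_reg and the functions w1c_read(), w1c_write(), ack_read_write_back() and ack_write_all_ones() are hypothetical names made up for illustration and are not part of the s2io driver. The model simply counts simulated register accesses to show that acknowledging with an all-ones write needs no read.

```c
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear register; not driver code. */
struct w1c_reg {
    uint64_t bits;        /* currently latched interrupt-cause bits */
    unsigned int reads;   /* number of simulated PIO reads */
    unsigned int writes;  /* number of simulated PIO writes */
};

/* Simulated PIO read: returns the latched bits and counts the access. */
static uint64_t w1c_read(struct w1c_reg *r)
{
    r->reads++;
    return r->bits;
}

/* Simulated PIO write: every 1 bit in v clears the matching latched bit. */
static void w1c_write(struct w1c_reg *r, uint64_t v)
{
    r->writes++;
    r->bits &= ~v;
}

/* Old style: read the register, then write the same value back. */
static void ack_read_write_back(struct w1c_reg *r)
{
    uint64_t v = w1c_read(r);

    w1c_write(r, v);
}

/* New style: write all 1's, so no read is needed to acknowledge. */
static void ack_write_all_ones(struct w1c_reg *r)
{
    w1c_write(r, 0xFFFFFFFFFFFFFFFFULL);
}
```

In this model both functions leave the register clear, but ack_write_all_ones() does so with one write and no read. One trade-off, at least in the model, is that the all-ones write clears every latched bit, including any the caller has not inspected.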
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-28 02:00:08.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-28 02:00:18.000000000 -0700 @@ -99,8 +99,7 @@ static inline int rx_buffer_level(nic_t mac_control = &sp->mac_control; if ((mac_control->rings[ring].pkt_cnt - rxb_size) > 16) { level = LOW; - if ((mac_control->rings[ring].pkt_cnt - rxb_size) < - MAX_RXDS_PER_BLOCK) { + if (rxb_size <= MAX_RXDS_PER_BLOCK) { level = PANIC; } } @@ -2194,7 +2193,6 @@ static void rx_intr_handler(ring_info_t { nic_t *nic = ring_data->nic; struct net_device *dev = (struct net_device *) nic->dev; - XENA_dev_config_t __iomem *bar0 = nic->bar0; int get_block, get_offset, put_block, put_offset, ring_bufs; rx_curr_get_info_t get_info, put_info; RxD_t *rxdp; @@ -2202,8 +2200,6 @@ static void rx_intr_handler(ring_info_t #ifndef CONFIG_S2IO_NAPI int pkt_cnt = 0; #endif - register u64 val64; - spin_lock(&nic->rx_lock); if (atomic_read(&nic->card_state) == CARD_DOWN) { DBG_PRINT(ERR_DBG, "%s: %s going down for reset\n", @@ -2211,13 +2207,6 @@ static void rx_intr_handler(ring_info_t spin_unlock(&nic->rx_lock); } - /* - * rx_traffic_int reg is an R1 register, hence we read and write - * back the same value in the register to clear it - */ - val64 = readq(&bar0->tx_traffic_int); - writeq(val64, &bar0->tx_traffic_int); - get_info = ring_data->rx_curr_get_info; get_block = get_info.block_index; put_info = ring_data->rx_curr_put_info; @@ -2313,20 +2302,11 @@ static void rx_intr_handler(ring_info_t static void tx_intr_handler(fifo_info_t *fifo_data) { nic_t *nic = fifo_data->nic; - XENA_dev_config_t __iomem *bar0 = nic->bar0; struct net_device *dev = (struct net_device *) nic->dev; tx_curr_get_info_t get_info, put_info; struct sk_buff *skb; TxD_t *txdlp; u16 j, frg_cnt; - register u64 val64 = 0; - - /* - * tx_traffic_int reg is an R1 
register, hence we read and write - * back the same value in the register to clear it - */ - val64 = readq(&bar0->tx_traffic_int); - writeq(val64, &bar0->tx_traffic_int); get_info = fifo_data->tx_curr_get_info; put_info = fifo_data->tx_curr_put_info; @@ -2819,7 +2799,6 @@ int s2io_xmit(struct sk_buff *skb, struc #endif mac_info_t *mac_control; struct config_param *config; - XENA_dev_config_t __iomem *bar0 = sp->bar0; mac_control = &sp->mac_control; config = &sp->config; @@ -2871,7 +2850,6 @@ int s2io_xmit(struct sk_buff *skb, struc } txdp->Control_2 |= config->tx_intr_type; - txdp->Control_1 |= (TXD_BUFFER0_SIZE(frg_len) | TXD_GATHER_CODE_FIRST); txdp->Control_1 |= TXD_LIST_OWN_XENA; @@ -2891,6 +2869,8 @@ int s2io_xmit(struct sk_buff *skb, struc val64 = mac_control->fifos[queue].list_info[put_off].list_phy_addr; writeq(val64, &tx_fifo->TxDL_Pointer); + wmb(); + val64 = (TX_FIFO_LAST_TXD_NUM(frg_cnt) | TX_FIFO_FIRST_LIST | TX_FIFO_LAST_LIST); @@ -2900,9 +2880,6 @@ int s2io_xmit(struct sk_buff *skb, struc #endif writeq(val64, &tx_fifo->List_Control); - /* Perform a PCI read to flush previous writes */ - val64 = readq(&bar0->general_int_status); - put_off++; put_off %= mac_control->fifos[queue].tx_curr_put_info.fifo_len + 1; mac_control->fifos[queue].tx_curr_put_info.offset = put_off; @@ -2941,7 +2918,7 @@ static irqreturn_t s2io_isr(int irq, voi nic_t *sp = dev->priv; XENA_dev_config_t __iomem *bar0 = sp->bar0; int i; - u64 reason = 0; + u64 reason = 0, val64; mac_info_t *mac_control; struct config_param *config; @@ -2979,6 +2956,13 @@ static irqreturn_t s2io_isr(int irq, voi #else /* If Intr is because of Rx Traffic */ if (reason & GEN_INTR_RXTRAFFIC) { + /* + * rx_traffic_int reg is an R1 register, writing all 1's + * will ensure that the actual interrupt causing bit get's + * cleared and hence a read can be avoided. 
+ */ + val64 = 0xFFFFFFFFFFFFFFFFULL; + writeq(val64, &bar0->rx_traffic_int); for (i = 0; i < config->rx_ring_num; i++) { rx_intr_handler(&mac_control->rings[i]); } @@ -2987,6 +2971,14 @@ static irqreturn_t s2io_isr(int irq, voi /* If Intr is because of Tx Traffic */ if (reason & GEN_INTR_TXTRAFFIC) { + /* + * tx_traffic_int reg is an R1 register, writing all 1's + * will ensure that the actual interrupt causing bit get's + * cleared and hence a read can be avoided. + */ + val64 = 0xFFFFFFFFFFFFFFFFULL; + writeq(val64, &bar0->tx_traffic_int); + for (i = 0; i < config->tx_fifo_num; i++) tx_intr_handler(&mac_control->fifos[i]); } diff -uprN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h --- vanilla_kernel/drivers/net/s2io.h 2005-06-28 02:00:08.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-28 02:00:18.000000000 -0700 @@ -13,6 +13,11 @@ #ifndef _S2IO_H #define _S2IO_H +/* Enable 2 buffer mode by default for SGI system */ +#ifdef CONFIG_IA64_SGI_SN2 +#define CONFIG_2BUFF_MODE +#endif + #define TBD 0 #define BIT(loc) (0x8000000000000000ULL >> (loc)) #define vBIT(val, loc, sz) (((u64)val) << (64-loc-sz)) From raghavendra.koushik@neterion.com Thu Jul 7 15:45:14 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:45:18 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MjEH9019552 for ; Thu, 7 Jul 2005 15:45:14 -0700 Received: by linux.site (Postfix, from userid 0) id 5D6FA89828; Thu, 7 Jul 2005 15:32:46 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 7/12] S2io: Timer based slow path handling Message-Id: <20050707223246.5D6FA89828@linux.site> Date: Thu, 7 Jul 2005 15:32:46 -0700 (PDT) X-archive-position: 
2680 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 3241 Lines: 97 Hi, This patch moves the slow-path handling (link state changes, hardware errors) into a periodic timer. It is no longer done in the interrupt handler, as it was previously. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -urpN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-28 02:59:46.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-28 02:59:54.000000000 -0700 @@ -167,6 +167,12 @@ static char ethtool_stats_keys[][ETH_GST #define S2IO_TEST_LEN sizeof(s2io_gstrings) / ETH_GSTRING_LEN #define S2IO_STRINGS_LEN S2IO_TEST_LEN * ETH_GSTRING_LEN +#define S2IO_TIMER_CONF(timer, handle, arg, exp) \ + init_timer(&timer); \ + timer.function = handle; \ + timer.data = (unsigned long) arg; \ + mod_timer(&timer, (jiffies + exp)) \ + /* * Constants to be programmed into the Xena's registers, to configure * the XAUI. @@ -2742,6 +2748,7 @@ int s2io_open(struct net_device *dev) setting_mac_address_failed: free_irq(sp->pdev->irq, dev); isr_registration_failed: + del_timer_sync(&sp->alarm_timer); s2io_reset(sp); hw_init_failed: return err; @@ -2899,6 +2906,15 @@ int s2io_xmit(struct sk_buff *skb, struc return 0; } +static void +s2io_alarm_handle(unsigned long data) +{ + nic_t *sp = (nic_t *)data; + + alarm_intr_handler(sp); + mod_timer(&sp->alarm_timer, jiffies + HZ / 2); +} + /** * s2io_isr - ISR handler of the device . * @irq: the irq of the device. 
@@ -2943,9 +2959,6 @@ static irqreturn_t s2io_isr(int irq, voi return IRQ_NONE; } - if (reason & (GEN_ERROR_INTR)) - alarm_intr_handler(sp); - #ifdef CONFIG_S2IO_NAPI if (reason & GEN_INTR_RXTRAFFIC) { if (netif_rx_schedule_prep(dev)) { @@ -4395,6 +4408,7 @@ static void s2io_card_down(nic_t * sp) unsigned long flags; register u64 val64 = 0; + del_timer_sync(&sp->alarm_timer); /* If s2io_set_link task is executing, wait till it completes. */ while (test_and_set_bit(0, &(sp->link_state))) { msleep(50); @@ -4497,6 +4511,8 @@ static int s2io_card_up(nic_t * sp) return -ENODEV; } + S2IO_TIMER_CONF(sp->alarm_timer, s2io_alarm_handle, sp, (HZ/2)); + atomic_set(&sp->card_state, CARD_UP); return 0; } diff -urpN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h --- vanilla_kernel/drivers/net/s2io.h 2005-06-28 02:59:46.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-28 02:59:54.000000000 -0700 @@ -624,6 +624,9 @@ struct s2io_nic { struct tasklet_struct task; volatile unsigned long tasklet_status; + /* Timer that handles I/O errors/exceptions */ + struct timer_list alarm_timer; + /* Space to back up the PCI config space */ u32 config_space[256 / sizeof(u32)]; @@ -819,6 +822,7 @@ static int s2io_poll(struct net_device * #endif static void s2io_init_pci(nic_t * sp); int s2io_set_mac_addr(struct net_device *dev, u8 * addr); +static void s2io_alarm_handle(unsigned long data); static irqreturn_t s2io_isr(int irq, void *dev_id, struct pt_regs *regs); static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag); static struct ethtool_ops netdev_ethtool_ops; From raghavendra.koushik@neterion.com Thu Jul 7 15:46:52 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:46:56 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MkqH9020125 for ; Thu, 7 Jul 2005 15:46:52 -0700 Received: by linux.site (Postfix, from userid 
0) id 8EC3889826; Thu, 7 Jul 2005 15:34:24 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 8/12] S2io: VLAN support Message-Id: <20050707223424.8EC3889826@linux.site> Date: Thu, 7 Jul 2005 15:34:24 -0700 (PDT) X-archive-position: 2681 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 4090 Lines: 132 Hi, Patch below adds VLAN support to the driver. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -urpN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-28 03:12:25.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-28 03:14:54.000000000 -0700 @@ -54,6 +54,7 @@ #include #include #include +#include #include #include @@ -173,6 +174,30 @@ static char ethtool_stats_keys[][ETH_GST timer.data = (unsigned long) arg; \ mod_timer(&timer, (jiffies + exp)) \ +/* Add the vlan */ +static void s2io_vlan_rx_register(struct net_device *dev, + struct vlan_group *grp) +{ + nic_t *nic = dev->priv; + unsigned long flags; + + spin_lock_irqsave(&nic->tx_lock, flags); + nic->vlgrp = grp; + spin_unlock_irqrestore(&nic->tx_lock, flags); +} + +/* Unregister the vlan */ +static void s2io_vlan_rx_kill_vid(struct net_device *dev, unsigned long vid) +{ + nic_t *nic = dev->priv; + unsigned long flags; + + spin_lock_irqsave(&nic->tx_lock, flags); + if (nic->vlgrp) + nic->vlgrp->vlan_devices[vid] = NULL; + spin_unlock_irqrestore(&nic->tx_lock, flags); +} + /* * Constants to be programmed into the Xena's registers, to configure * the XAUI. 
@@ -2804,6 +2829,8 @@ int s2io_xmit(struct sk_buff *skb, struc #ifdef NETIF_F_TSO int mss; #endif + u16 vlan_tag = 0; + int vlan_priority = 0; mac_info_t *mac_control; struct config_param *config; @@ -2822,6 +2849,13 @@ int s2io_xmit(struct sk_buff *skb, struc queue = 0; + /* Get Fifo number to Transmit based on vlan priority */ + if (sp->vlgrp && vlan_tx_tag_present(skb)) { + vlan_tag = vlan_tx_tag_get(skb); + vlan_priority = vlan_tag >> 13; + queue = config->fifo_mapping[vlan_priority]; + } + put_off = (u16) mac_control->fifos[queue].tx_curr_put_info.offset; get_off = (u16) mac_control->fifos[queue].tx_curr_get_info.offset; txdp = (TxD_t *) mac_control->fifos[queue].list_info[put_off]. @@ -2858,6 +2892,11 @@ int s2io_xmit(struct sk_buff *skb, struc txdp->Control_2 |= config->tx_intr_type; + if (sp->vlgrp && vlan_tx_tag_present(skb)) { + txdp->Control_2 |= TXD_VLAN_ENABLE; + txdp->Control_2 |= TXD_VLAN_TAG(vlan_tag); + } + txdp->Control_1 |= (TXD_BUFFER0_SIZE(frg_len) | TXD_GATHER_CODE_FIRST); txdp->Control_1 |= TXD_LIST_OWN_XENA; @@ -4654,10 +4693,23 @@ static int rx_osm_handler(ring_info_t *r skb->protocol = eth_type_trans(skb, dev); #ifdef CONFIG_S2IO_NAPI - netif_receive_skb(skb); + if (sp->vlgrp && RXD_GET_VLAN_TAG(rxdp->Control_2)) { + /* Queueing the vlan frame to the upper layer */ + vlan_hwaccel_receive_skb(skb, sp->vlgrp, + RXD_GET_VLAN_TAG(rxdp->Control_2)); + } else { + netif_receive_skb(skb); + } #else - netif_rx(skb); + if (sp->vlgrp && RXD_GET_VLAN_TAG(rxdp->Control_2)) { + /* Queueing the vlan frame to the upper layer */ + vlan_hwaccel_rx(skb, sp->vlgrp, + RXD_GET_VLAN_TAG(rxdp->Control_2)); + } else { + netif_rx(skb); + } #endif + dev->last_rx = jiffies; atomic_dec(&sp->rx_bufs_left[ring_no]); return SUCCESS; @@ -4955,6 +5007,9 @@ s2io_init_nic(struct pci_dev *pdev, cons dev->do_ioctl = &s2io_ioctl; dev->change_mtu = &s2io_change_mtu; SET_ETHTOOL_OPS(dev, &netdev_ethtool_ops); + dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX; + 
dev->vlan_rx_register = s2io_vlan_rx_register; + dev->vlan_rx_kill_vid = (void *)s2io_vlan_rx_kill_vid; /* * will use eth_mac_addr() for dev->set_mac_address diff -urpN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h --- vanilla_kernel/drivers/net/s2io.h 2005-06-28 03:12:25.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-28 03:14:54.000000000 -0700 @@ -689,6 +689,8 @@ struct s2io_nic { #define CARD_UP 2 atomic_t card_state; volatile unsigned long link_state; + struct vlan_group *vlgrp; + spinlock_t rx_lock; atomic_t isr_cnt; }; From raghavendra.koushik@neterion.com Thu Jul 7 15:48:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:48:24 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MmKH9020749 for ; Thu, 7 Jul 2005 15:48:20 -0700 Received: by linux.site (Postfix, from userid 0) id 47B8E89826; Thu, 7 Jul 2005 15:35:53 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 9/12] S2io: Support for Xframe II NIC Message-Id: <20050707223553.47B8E89826@linux.site> Date: Thu, 7 Jul 2005 15:35:53 -0700 (PDT) X-archive-position: 2682 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 30779 Lines: 936 Hi, This patch provides basic support for the Xframe II adapter. It includes the following changes: 1. New values to program the XAUI interface. 2. Print the PCI/PCI-X mode (bus frequency, width). 3. Remove EOI from reset during initialization. 4. Enable all 8 PCCs on an Xframe II adapter. 5. Program the RLDRAM size depending on the device. 
(Note: RLDRAM size on XFRAME-I is 64 Mb whereas on XFRAME-II it's 32 Mb). 6. Enable extended (64-bit) statistics counters. 7. Program timer interrupt duration based on PCI/PCI-X clock speed. 8. Saving/restoring PCI config space before/after reset is not required. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -urpN vanilla_kernel/drivers/net/s2io-regs.h linux-2.6.12-rc6/drivers/net/s2io-regs.h --- vanilla_kernel/drivers/net/s2io-regs.h 2005-06-28 03:24:56.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io-regs.h 2005-06-28 03:25:04.000000000 -0700 @@ -91,7 +91,21 @@ typedef struct _XENA_dev_config { SERR_SOURCE_MC | \ SERR_SOURCE_XGXS) - u8 unused_0[0x800 - 0x120]; + u64 pci_mode; +#define GET_PCI_MODE(val) ((val & vBIT(0xF, 0, 4)) >> 60) +#define PCI_MODE_PCI_33 0 +#define PCI_MODE_PCI_66 0x1 +#define PCI_MODE_PCIX_M1_66 0x2 +#define PCI_MODE_PCIX_M1_100 0x3 +#define PCI_MODE_PCIX_M1_133 0x4 +#define PCI_MODE_PCIX_M2_66 0x5 +#define PCI_MODE_PCIX_M2_100 0x6 +#define PCI_MODE_PCIX_M2_133 0x7 +#define PCI_MODE_UNSUPPORTED BIT(0) +#define PCI_MODE_32_BITS BIT(8) +#define PCI_MODE_UNKNOWN_MODE BIT(9) + + u8 unused_0[0x800 - 0x128]; /* PCI-X Controller registers */ u64 pic_int_status; @@ -223,19 +237,16 @@ typedef struct _XENA_dev_config { u64 xmsi_data; u64 rx_mat; +#define RX_MAT_SET(ring, msi) vBIT(msi, (8 * ring), 8) u8 unused6[0x8]; - u64 tx_mat0_7; - u64 tx_mat8_15; - u64 tx_mat16_23; - u64 tx_mat24_31; - u64 tx_mat32_39; - u64 tx_mat40_47; - u64 tx_mat48_55; - u64 tx_mat56_63; + u64 tx_mat0_n[0x8]; +#define TX_MAT_SET(fifo, msi) vBIT(msi, (8 * fifo), 8) - u8 unused_1[0x10]; + u8 unused_1[0x8]; + u64 stat_byte_cnt; +#define STAT_BC(n) vBIT(n,4,12) /* Automated statistics collection */ u64 stat_cfg; @@ -269,7 +280,12 @@ typedef struct _XENA_dev_config { u64 gpio_control; #define GPIO_CTRL_GPIO_0 BIT(8) - u8 unused7[0x600]; + u8 unused7_1[0x240 - 0x200]; + + u64 wreq_split_mask; +#define WREQ_SPLIT_MASK_SET_MASK(val) vBIT(val, 52, 12) + 
+ u8 unused7_2[0x800 - 0x248]; /* TxDMA registers */ u64 txdma_int_status; @@ -470,6 +486,7 @@ typedef struct _XENA_dev_config { #define PRC_CTRL_NO_SNOOP (BIT(22)|BIT(23)) #define PRC_CTRL_NO_SNOOP_DESC BIT(22) #define PRC_CTRL_NO_SNOOP_BUFF BIT(23) +#define PRC_CTRL_BIMODAL_INTERRUPT BIT(37) #define PRC_CTRL_RXD_BACKOFF_INTERVAL(val) vBIT(val,40,24) u64 prc_alarm_action; @@ -742,7 +759,19 @@ typedef struct _XENA_dev_config { u64 mc_rldram_test_d1; u8 unused24[0x300 - 0x288]; u64 mc_rldram_test_d2; - u8 unused25[0x700 - 0x308]; + + u8 unused24_1[0x360 - 0x308]; + u64 mc_rldram_ctrl; +#define MC_RLDRAM_ENABLE_ODT BIT(7) + + u8 unused24_2[0x640 - 0x368]; + u64 mc_rldram_ref_per_herc; +#define MC_RLDRAM_SET_REF_PERIOD(val) vBIT(val, 0, 16) + + u8 unused24_3[0x660 - 0x648]; + u64 mc_rldram_mrs_herc; + + u8 unused25[0x700 - 0x668]; u64 mc_debug_ctrl; u8 unused26[0x3000 - 0x2f08]; diff -urpN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-28 03:24:56.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-28 03:25:04.000000000 -0700 @@ -83,9 +83,10 @@ static inline int RXD_IS_UP2DT(RxD_t *rx * problem, 600B, 600C, 600D, 640B, 640C and 640D. * macro below identifies these cards given the subsystem_id. */ -#define CARDS_WITH_FAULTY_LINK_INDICATORS(subid) \ - (((subid >= 0x600B) && (subid <= 0x600D)) || \ - ((subid >= 0x640B) && (subid <= 0x640D))) ? 1 : 0 +#define CARDS_WITH_FAULTY_LINK_INDICATORS(dev_type, subid) \ + (dev_type == XFRAME_I_DEVICE) ? \ + ((((subid >= 0x600B) && (subid <= 0x600D)) || \ + ((subid >= 0x640B) && (subid <= 0x640D))) ? 
1 : 0) : 0 #define LINK_IS_UP(val64) (!(val64 & (ADAPTER_STATUS_RMAC_REMOTE_FAULT | \ ADAPTER_STATUS_RMAC_LOCAL_FAULT))) @@ -206,7 +207,24 @@ static void s2io_vlan_rx_kill_vid(struct #define SWITCH_SIGN 0xA5A5A5A5A5A5A5A5ULL #define END_SIGN 0x0 -static u64 default_mdio_cfg[] = { +static u64 herc_act_dtx_cfg[] = { + /* Set address */ + 0x80000515BA750000ULL, 0x80000515BA7500E0ULL, + /* Write data */ + 0x80000515BA750004ULL, 0x80000515BA7500E4ULL, + /* Set address */ + 0x80010515003F0000ULL, 0x80010515003F00E0ULL, + /* Write data */ + 0x80010515003F0004ULL, 0x80010515003F00E4ULL, + /* Set address */ + 0x80020515F2100000ULL, 0x80020515F21000E0ULL, + /* Write data */ + 0x80020515F2100004ULL, 0x80020515F21000E4ULL, + /* Done */ + END_SIGN +}; + +static u64 xena_mdio_cfg[] = { /* Reset PMA PLL */ 0xC001010000000000ULL, 0xC0010100000000E0ULL, 0xC0010100008000E4ULL, @@ -216,7 +234,7 @@ static u64 default_mdio_cfg[] = { END_SIGN }; -static u64 default_dtx_cfg[] = { +static u64 xena_dtx_cfg[] = { 0x8000051500000000ULL, 0x80000515000000E0ULL, 0x80000515D93500E4ULL, 0x8001051500000000ULL, 0x80010515000000E0ULL, 0x80010515001E00E4ULL, @@ -655,6 +673,87 @@ static void free_shared_mem(struct s2io_ } /** + * s2io_verify_pci_mode - + */ + +static int s2io_verify_pci_mode(nic_t *nic) +{ + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + register u64 val64 = 0; + int mode; + + val64 = readq(&bar0->pci_mode); + mode = (u8)GET_PCI_MODE(val64); + + if ( val64 & PCI_MODE_UNKNOWN_MODE) + return -1; /* Unknown PCI mode */ + return mode; +} + + +/** + * s2io_print_pci_mode - + */ +static int s2io_print_pci_mode(nic_t *nic) +{ + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + register u64 val64 = 0; + int mode; + struct config_param *config = &nic->config; + + val64 = readq(&bar0->pci_mode); + mode = (u8)GET_PCI_MODE(val64); + + if ( val64 & PCI_MODE_UNKNOWN_MODE) + return -1; /* Unknown PCI mode */ + + if (val64 & PCI_MODE_32_BITS) { + DBG_PRINT(ERR_DBG, "%s: 
Device is on 32 bit ", nic->dev->name); + } else { + DBG_PRINT(ERR_DBG, "%s: Device is on 64 bit ", nic->dev->name); + } + + switch(mode) { + case PCI_MODE_PCI_33: + DBG_PRINT(ERR_DBG, "33MHz PCI bus\n"); + config->bus_speed = 33; + break; + case PCI_MODE_PCI_66: + DBG_PRINT(ERR_DBG, "66MHz PCI bus\n"); + config->bus_speed = 133; + break; + case PCI_MODE_PCIX_M1_66: + DBG_PRINT(ERR_DBG, "66MHz PCIX(M1) bus\n"); + config->bus_speed = 133; /* Herc doubles the clock rate */ + break; + case PCI_MODE_PCIX_M1_100: + DBG_PRINT(ERR_DBG, "100MHz PCIX(M1) bus\n"); + config->bus_speed = 200; + break; + case PCI_MODE_PCIX_M1_133: + DBG_PRINT(ERR_DBG, "133MHz PCIX(M1) bus\n"); + config->bus_speed = 266; + break; + case PCI_MODE_PCIX_M2_66: + DBG_PRINT(ERR_DBG, "133MHz PCIX(M2) bus\n"); + config->bus_speed = 133; + break; + case PCI_MODE_PCIX_M2_100: + DBG_PRINT(ERR_DBG, "200MHz PCIX(M2) bus\n"); + config->bus_speed = 200; + break; + case PCI_MODE_PCIX_M2_133: + DBG_PRINT(ERR_DBG, "266MHz PCIX(M2) bus\n"); + config->bus_speed = 266; + break; + default: + return -1; /* Unsupported bus speed */ + } + + return mode; +} + +/** * init_nic - Initialization of hardware * @nic: device peivate variable * Description: The function sequentially configures every block @@ -686,6 +785,16 @@ static int init_nic(struct s2io_nic *nic return -1; } + /* + * Herc requires EOI to be removed from reset before XGXS, so.. + */ + if (nic->device_type & XFRAME_II_DEVICE) { + val64 = 0xA500000000ULL; + writeq(val64, &bar0->sw_reset); + msleep(500); + val64 = readq(&bar0->sw_reset); + } + /* Remove XGXS from reset state */ val64 = 0; writeq(val64, &bar0->sw_reset); @@ -717,41 +826,51 @@ static int init_nic(struct s2io_nic *nic * of 64 bit values into two registers in a particular * sequence. 
Hence a macro 'SWITCH_SIGN' has been defined * which will be defined in the array of configuration values - * (default_dtx_cfg & default_mdio_cfg) at appropriate places + * (xena_dtx_cfg & xena_mdio_cfg) at appropriate places * to switch writing from one regsiter to another. We continue * writing these values until we encounter the 'END_SIGN' macro. * For example, After making a series of 21 writes into * dtx_control register the 'SWITCH_SIGN' appears and hence we * start writing into mdio_control until we encounter END_SIGN. */ - while (1) { - dtx_cfg: - while (default_dtx_cfg[dtx_cnt] != END_SIGN) { - if (default_dtx_cfg[dtx_cnt] == SWITCH_SIGN) { - dtx_cnt++; - goto mdio_cfg; - } - SPECIAL_REG_WRITE(default_dtx_cfg[dtx_cnt], + if (nic->device_type & XFRAME_II_DEVICE) { + while (herc_act_dtx_cfg[dtx_cnt] != END_SIGN) { + SPECIAL_REG_WRITE(herc_act_dtx_cfg[dtx_cnt], &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); + if (dtx_cnt & 0x1) + msleep(1); /* Necessary!! */ dtx_cnt++; } - mdio_cfg: - while (default_mdio_cfg[mdio_cnt] != END_SIGN) { - if (default_mdio_cfg[mdio_cnt] == SWITCH_SIGN) { + } else { + while (1) { + dtx_cfg: + while (xena_dtx_cfg[dtx_cnt] != END_SIGN) { + if (xena_dtx_cfg[dtx_cnt] == SWITCH_SIGN) { + dtx_cnt++; + goto mdio_cfg; + } + SPECIAL_REG_WRITE(xena_dtx_cfg[dtx_cnt], + &bar0->dtx_control, UF); + val64 = readq(&bar0->dtx_control); + dtx_cnt++; + } + mdio_cfg: + while (xena_mdio_cfg[mdio_cnt] != END_SIGN) { + if (xena_mdio_cfg[mdio_cnt] == SWITCH_SIGN) { + mdio_cnt++; + goto dtx_cfg; + } + SPECIAL_REG_WRITE(xena_mdio_cfg[mdio_cnt], + &bar0->mdio_control, UF); + val64 = readq(&bar0->mdio_control); mdio_cnt++; + } + if ((xena_dtx_cfg[dtx_cnt] == END_SIGN) && + (xena_mdio_cfg[mdio_cnt] == END_SIGN)) { + break; + } else { goto dtx_cfg; } - SPECIAL_REG_WRITE(default_mdio_cfg[mdio_cnt], - &bar0->mdio_control, UF); - val64 = readq(&bar0->mdio_control); - mdio_cnt++; - } - if ((default_dtx_cfg[dtx_cnt] == END_SIGN) && - 
(default_mdio_cfg[mdio_cnt] == END_SIGN)) { - break; - } else { - goto dtx_cfg; } } @@ -802,7 +921,8 @@ static int init_nic(struct s2io_nic *nic * Disable 4 PCCs for Xena1, 2 and 3 as per H/W bug * SXE-008 TRANSMIT DMA ARBITRATION ISSUE. */ - if (get_xena_rev_id(nic->pdev) < 4) + if ((nic->device_type == XFRAME_I_DEVICE) && + (get_xena_rev_id(nic->pdev) < 4)) writeq(PCC_ENABLE_FOUR, &bar0->pcc_enable); val64 = readq(&bar0->tx_fifo_partition_0); @@ -832,7 +952,11 @@ static int init_nic(struct s2io_nic *nic * configured Rings. */ val64 = 0; - mem_size = 64; + if (nic->device_type & XFRAME_II_DEVICE) + mem_size = 32; + else + mem_size = 64; + for (i = 0; i < config->rx_ring_num; i++) { switch (i) { case 0: @@ -1115,6 +1239,11 @@ static int init_nic(struct s2io_nic *nic /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); + if (nic->device_type == XFRAME_II_DEVICE) { + val64 = STAT_BC(0x320); + writeq(val64, &bar0->stat_byte_cnt); + } + /* * Initializing the sampling rate for the device to calculate the * bandwidth utilization. @@ -1133,12 +1262,18 @@ static int init_nic(struct s2io_nic *nic * 250 interrupts per sec. Continuous interrupts are enabled * by default. 
*/ - val64 = TTI_DATA1_MEM_TX_TIMER_VAL(0x2078) | - TTI_DATA1_MEM_TX_URNG_A(0xA) | + if (nic->device_type == XFRAME_II_DEVICE) { + int count = (nic->config.bus_speed * 125)/2; + val64 = TTI_DATA1_MEM_TX_TIMER_VAL(count); + } else { + + val64 = TTI_DATA1_MEM_TX_TIMER_VAL(0x2078); + } + val64 |= TTI_DATA1_MEM_TX_URNG_A(0xA) | TTI_DATA1_MEM_TX_URNG_B(0x10) | TTI_DATA1_MEM_TX_URNG_C(0x30) | TTI_DATA1_MEM_TX_TIMER_AC_EN; - if (use_continuous_tx_intrs) - val64 |= TTI_DATA1_MEM_TX_TIMER_CI_EN; + if (use_continuous_tx_intrs) + val64 |= TTI_DATA1_MEM_TX_TIMER_CI_EN; writeq(val64, &bar0->tti_data1_mem); val64 = TTI_DATA2_MEM_TX_UFC_A(0x10) | @@ -1170,9 +1305,19 @@ static int init_nic(struct s2io_nic *nic time++; } + /* RTI Initialization */ - val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF) | - RTI_DATA1_MEM_RX_URNG_A(0xA) | + if (nic->device_type == XFRAME_II_DEVICE) { + /* + * Programmed to generate Apprx 500 Intrs per + * second + */ + int count = (nic->config.bus_speed * 125)/4; + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count); + } else { + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF); + } + val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) | RTI_DATA1_MEM_RX_URNG_B(0x10) | RTI_DATA1_MEM_RX_URNG_C(0x30) | RTI_DATA1_MEM_RX_TIMER_AC_EN; @@ -1266,6 +1411,15 @@ static int init_nic(struct s2io_nic *nic val64 |= PIC_CNTL_SHARED_SPLITS(shared_splits); writeq(val64, &bar0->pic_control); + /* + * Programming the Herc to split every write transaction + * that does not start on an ADB to reduce disconnects. 
+ */ + if (nic->device_type == XFRAME_II_DEVICE) { + val64 = WREQ_SPLIT_MASK_SET_MASK(255); + writeq(val64, &bar0->wreq_split_mask); + } + return SUCCESS; } @@ -1508,18 +1662,18 @@ static void en_dis_able_nic_intrs(struct } } -static int check_prc_pcc_state(u64 val64, int flag, int rev_id) +static int check_prc_pcc_state(u64 val64, int flag, int rev_id, int herc) { int ret = 0; if (flag == FALSE) { - if (rev_id >= 4) { + if ((!herc && (rev_id >= 4)) || herc) { if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == ADAPTER_STATUS_RC_PRC_QUIESCENT)) { ret = 1; } - } else { + }else { if (!(val64 & ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) && ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == ADAPTER_STATUS_RC_PRC_QUIESCENT)) { @@ -1527,7 +1681,7 @@ static int check_prc_pcc_state(u64 val64 } } } else { - if (rev_id >= 4) { + if ((!herc && (rev_id >= 4)) || herc) { if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == ADAPTER_STATUS_RMAC_PCC_IDLE) && (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || @@ -1563,10 +1717,11 @@ static int check_prc_pcc_state(u64 val64 static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag) { - int ret = 0; + int ret = 0, herc; u64 tmp64 = ~((u64) val64); int rev_id = get_xena_rev_id(sp->pdev); + herc = (sp->device_type == XFRAME_II_DEVICE); if (! 
(tmp64 & (ADAPTER_STATUS_TDMA_READY | ADAPTER_STATUS_RDMA_READY | @@ -1574,7 +1729,7 @@ static int verify_xena_quiescence(nic_t ADAPTER_STATUS_PIC_QUIESCENT | ADAPTER_STATUS_MC_DRAM_READY | ADAPTER_STATUS_MC_QUEUES_READY | ADAPTER_STATUS_M_PLL_LOCK | ADAPTER_STATUS_P_PLL_LOCK))) { - ret = check_prc_pcc_state(val64, flag, rev_id); + ret = check_prc_pcc_state(val64, flag, rev_id, herc); } return ret; @@ -1705,7 +1860,8 @@ static int start_nic(struct s2io_nic *ni /* SXE-002: Initialize link and activity LED */ subid = nic->pdev->subsystem_device; - if ((subid & 0xFF) >= 0x07) { + if (((subid & 0xFF) >= 0x07) && + (nic->device_type == XFRAME_I_DEVICE)) { val64 = readq(&bar0->gpio_control); val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); @@ -2542,9 +2698,12 @@ void s2io_reset(nic_t * sp) */ msleep(250); + if (!(sp->device_type & XFRAME_II_DEVICE)) { /* Restore the PCI state saved during initializarion. */ - pci_restore_state(sp->pdev); - + pci_restore_state(sp->pdev); + } else { + pci_set_master(sp->pdev); + } s2io_init_pci(sp); msleep(250); @@ -2569,7 +2728,8 @@ void s2io_reset(nic_t * sp) /* SXE-002: Configure link and activity LED to turn it off */ subid = sp->pdev->subsystem_device; - if ((subid & 0xFF) >= 0x07) { + if (((subid & 0xFF) >= 0x07) && + (sp->device_type == XFRAME_I_DEVICE)) { val64 = readq(&bar0->gpio_control); val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); @@ -2577,6 +2737,15 @@ void s2io_reset(nic_t * sp) writeq(val64, (void __iomem *) ((u8 *) bar0 + 0x2700)); } + /* + * Clear spurious ECC interrupts that would have occured on + * XFRAME II cards after reset. 
+ */ + if (sp->device_type == XFRAME_II_DEVICE) { + val64 = readq(&bar0->pcc_err_reg); + writeq(val64, &bar0->pcc_err_reg); + } + sp->device_enabled_once = FALSE; } @@ -3464,7 +3633,8 @@ static void s2io_phy_id(unsigned long da u16 subid; subid = sp->pdev->subsystem_device; - if ((subid & 0xFF) >= 0x07) { + if ((sp->device_type == XFRAME_II_DEVICE) || + ((subid & 0xFF) >= 0x07)) { val64 = readq(&bar0->gpio_control); val64 ^= GPIO_CTRL_GPIO_0; writeq(val64, &bar0->gpio_control); @@ -3501,7 +3671,8 @@ static int s2io_ethtool_idnic(struct net subid = sp->pdev->subsystem_device; last_gpio_ctrl_val = readq(&bar0->gpio_control); - if ((subid & 0xFF) < 0x07) { + if ((sp->device_type == XFRAME_I_DEVICE) && + ((subid & 0xFF) < 0x07)) { val64 = readq(&bar0->adapter_control); if (!(val64 & ADAPTER_CNTL_EN)) { printk(KERN_ERR @@ -3521,7 +3692,7 @@ static int s2io_ethtool_idnic(struct net msleep_interruptible(MAX_FLICKER_TIME); del_timer_sync(&sp->id_timer); - if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { + if (CARDS_WITH_FAULTY_LINK_INDICATORS(sp->device_type, subid)) { writeq(last_gpio_ctrl_val, &bar0->gpio_control); last_gpio_ctrl_val = readq(&bar0->gpio_control); } @@ -4135,44 +4306,91 @@ static void s2io_get_ethtool_stats(struc StatInfo_t *stat_info = sp->mac_control.stats_info; s2io_updt_stats(sp); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_data_octets); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_data_octets_oflow) << 32 | + le32_to_cpu(stat_info->tmac_data_octets); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_drop_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_mcst_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_bcst_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_mcst_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_mcst_frms); + tmp_stats[i++] = + 
(u64)le32_to_cpu(stat_info->tmac_bcst_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_bcst_frms); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_pause_ctrl_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_any_err_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_any_err_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_any_err_frms); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_vld_ip_octets); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_vld_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_drop_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_icmp); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_rst_tcp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_vld_ip_oflow) << 32 | + le32_to_cpu(stat_info->tmac_vld_ip); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_drop_ip_oflow) << 32 | + le32_to_cpu(stat_info->tmac_drop_ip); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_icmp_oflow) << 32 | + le32_to_cpu(stat_info->tmac_icmp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_rst_tcp_oflow) << 32 | + le32_to_cpu(stat_info->tmac_rst_tcp); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_tcp); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_udp); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_vld_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_data_octets); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->tmac_udp_oflow) << 32 | + le32_to_cpu(stat_info->tmac_udp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_vld_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_vld_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_data_octets_oflow) << 32 | + le32_to_cpu(stat_info->rmac_data_octets); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_fcs_err_frms); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_drop_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_vld_mcst_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_vld_bcst_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_vld_mcst_frms_oflow) << 32 | + 
le32_to_cpu(stat_info->rmac_vld_mcst_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_vld_bcst_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_vld_bcst_frms); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_in_rng_len_err_frms); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_long_frms); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_pause_ctrl_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_discarded_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_usized_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_osized_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_frag_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_jabber_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_ip); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_discarded_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_discarded_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_usized_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_usized_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_osized_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_osized_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_frag_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_frag_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_jabber_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_jabber_frms); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_ip_oflow) << 32 | + le32_to_cpu(stat_info->rmac_ip); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_ip_octets); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_hdr_err_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_drop_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_icmp); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_drop_ip_oflow) << 32 | + le32_to_cpu(stat_info->rmac_drop_ip); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_icmp_oflow) << 32 | + le32_to_cpu(stat_info->rmac_icmp); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_tcp); - tmp_stats[i++] = 
le32_to_cpu(stat_info->rmac_udp); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_drp_udp); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_pause_cnt); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_accepted_ip); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_udp_oflow) << 32 | + le32_to_cpu(stat_info->rmac_udp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_err_drp_udp_oflow) << 32 | + le32_to_cpu(stat_info->rmac_err_drp_udp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_pause_cnt_oflow) << 32 | + le32_to_cpu(stat_info->rmac_pause_cnt); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_accepted_ip_oflow) << 32 | + le32_to_cpu(stat_info->rmac_accepted_ip); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_tcp); tmp_stats[i++] = 0; tmp_stats[i++] = stat_info->sw_stat.single_ecc_errs; @@ -4402,7 +4620,8 @@ static void s2io_set_link(unsigned long val64 = readq(&bar0->adapter_control); val64 |= ADAPTER_CNTL_EN; writeq(val64, &bar0->adapter_control); - if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { + if (CARDS_WITH_FAULTY_LINK_INDICATORS(nic->device_type, + subid)) { val64 = readq(&bar0->gpio_control); val64 |= GPIO_CTRL_GPIO_0; writeq(val64, &bar0->gpio_control); @@ -4424,7 +4643,8 @@ static void s2io_set_link(unsigned long } s2io_link(nic, LINK_UP); } else { - if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { + if (CARDS_WITH_FAULTY_LINK_INDICATORS(nic->device_type, + subid)) { val64 = readq(&bar0->gpio_control); val64 &= ~GPIO_CTRL_GPIO_0; writeq(val64, &bar0->gpio_control); @@ -4709,7 +4929,6 @@ static int rx_osm_handler(ring_info_t *r netif_rx(skb); } #endif - dev->last_rx = jiffies; atomic_dec(&sp->rx_bufs_left[ring_no]); return SUCCESS; @@ -4843,6 +5062,7 @@ s2io_init_nic(struct pci_dev *pdev, cons u16 subid; mac_info_t *mac_control; struct config_param *config; + int mode; #ifdef CONFIG_S2IO_NAPI DBG_PRINT(ERR_DBG, "NAPI support has been enabled\n"); @@ -4899,6 +5119,12 @@ s2io_init_nic(struct pci_dev *pdev, cons sp->high_dma_flag = 
dma_flag; sp->device_enabled_once = FALSE; + if ((pdev->device == PCI_DEVICE_ID_HERC_WIN) || + (pdev->device == PCI_DEVICE_ID_HERC_UNI)) + sp->device_type = XFRAME_II_DEVICE; + else + sp->device_type = XFRAME_I_DEVICE; + /* Initialize some PCI/PCI-X fields of the NIC. */ s2io_init_pci(sp); @@ -5034,7 +5260,9 @@ s2io_init_nic(struct pci_dev *pdev, cons INIT_WORK(&sp->set_link_task, (void (*)(void *)) s2io_set_link, sp); - pci_save_state(sp->pdev); + if (!(sp->device_type & XFRAME_II_DEVICE)) { + pci_save_state(sp->pdev); + } /* Setting swapper control on the NIC, for proper reset operation */ if (s2io_set_swapper(sp)) { @@ -5044,12 +5272,26 @@ s2io_init_nic(struct pci_dev *pdev, cons goto set_swap_failed; } - /* - * Fix for all "FFs" MAC address problems observed on - * Alpha platforms - */ - fix_mac_address(sp); - s2io_reset(sp); + /* Verify if the Herc works on the slot its placed into */ + if (sp->device_type & XFRAME_II_DEVICE) { + mode = s2io_verify_pci_mode(sp); + if (mode < 0) { + DBG_PRINT(ERR_DBG, "%s: ", __FUNCTION__); + DBG_PRINT(ERR_DBG, " Unsupported PCI bus mode\n"); + ret = -EBADSLT; + goto set_swap_failed; + } + } + + /* Not needed for Herc */ + if (sp->device_type & XFRAME_I_DEVICE) { + /* + * Fix for all "FFs" MAC address problems observed on + * Alpha platforms + */ + fix_mac_address(sp); + s2io_reset(sp); + } /* * MAC address initialization. 
@@ -5074,22 +5316,13 @@ s2io_init_nic(struct pci_dev *pdev, cons sp->def_mac_addr[0].mac_addr[5] = (u8) (mac_down >> 16); sp->def_mac_addr[0].mac_addr[4] = (u8) (mac_down >> 24); - DBG_PRINT(INIT_DBG, - "DEFAULT MAC ADDR:0x%02x-%02x-%02x-%02x-%02x-%02x\n", - sp->def_mac_addr[0].mac_addr[0], - sp->def_mac_addr[0].mac_addr[1], - sp->def_mac_addr[0].mac_addr[2], - sp->def_mac_addr[0].mac_addr[3], - sp->def_mac_addr[0].mac_addr[4], - sp->def_mac_addr[0].mac_addr[5]); - /* Set the factory defined MAC address initially */ dev->addr_len = ETH_ALEN; memcpy(dev->dev_addr, sp->def_mac_addr, ETH_ALEN); /* * Initialize the tasklet status and link state flags - * and the card statte parameter + * and the card state parameter */ atomic_set(&(sp->card_state), 0); sp->tasklet_status = 0; @@ -5124,9 +5357,46 @@ s2io_init_nic(struct pci_dev *pdev, cons goto register_failed; } + if (sp->device_type & XFRAME_II_DEVICE) { + DBG_PRINT(ERR_DBG, "%s: Neterion Xframe II 10GbE adapter ", + dev->name); + DBG_PRINT(ERR_DBG, "(rev %d), Driver %s\n", + get_xena_rev_id(sp->pdev), + s2io_driver_version); + DBG_PRINT(ERR_DBG, "MAC ADDR: %02x:%02x:%02x:%02x:%02x:%02x\n", + sp->def_mac_addr[0].mac_addr[0], + sp->def_mac_addr[0].mac_addr[1], + sp->def_mac_addr[0].mac_addr[2], + sp->def_mac_addr[0].mac_addr[3], + sp->def_mac_addr[0].mac_addr[4], + sp->def_mac_addr[0].mac_addr[5]); + int mode = s2io_print_pci_mode(sp); + if (mode < 0) { + DBG_PRINT(ERR_DBG, " Unsupported PCI bus mode "); + ret = -EBADSLT; + goto set_swap_failed; + } + } else { + DBG_PRINT(ERR_DBG, "%s: Neterion Xframe I 10GbE adapter ", + dev->name); + DBG_PRINT(ERR_DBG, "(rev %d), Driver %s\n", + get_xena_rev_id(sp->pdev), + s2io_driver_version); + DBG_PRINT(ERR_DBG, "MAC ADDR: %02x:%02x:%02x:%02x:%02x:%02x\n", + sp->def_mac_addr[0].mac_addr[0], + sp->def_mac_addr[0].mac_addr[1], + sp->def_mac_addr[0].mac_addr[2], + sp->def_mac_addr[0].mac_addr[3], + sp->def_mac_addr[0].mac_addr[4], + sp->def_mac_addr[0].mac_addr[5]); + } + /* 
Initialize device name */ strcpy(sp->name, dev->name); - strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); + if (sp->device_type & XFRAME_II_DEVICE) + strcat(sp->name, ": Neterion Xframe II 10GbE adapter"); + else + strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); /* * Make Link state as off at this point, when the Link change diff -urpN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h --- vanilla_kernel/drivers/net/s2io.h 2005-06-28 03:24:56.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-28 03:25:04.000000000 -0700 @@ -201,6 +201,67 @@ typedef struct stat_block { u32 rxf_wr_cnt; u32 txf_rd_cnt; +/* Tx MAC statistics overflow counters. */ + u32 tmac_data_octets_oflow; + u32 tmac_frms_oflow; + u32 tmac_bcst_frms_oflow; + u32 tmac_mcst_frms_oflow; + u32 tmac_ucst_frms_oflow; + u32 tmac_ttl_octets_oflow; + u32 tmac_any_err_frms_oflow; + u32 tmac_nucst_frms_oflow; + u64 tmac_vlan_frms; + u32 tmac_drop_ip_oflow; + u32 tmac_vld_ip_oflow; + u32 tmac_rst_tcp_oflow; + u32 tmac_icmp_oflow; + u32 tpa_unknown_protocol; + u32 tmac_udp_oflow; + u32 reserved_10; + u32 tpa_parse_failure; + +/* Rx MAC Statistics overflow counters. 
*/ + u32 rmac_data_octets_oflow; + u32 rmac_vld_frms_oflow; + u32 rmac_vld_bcst_frms_oflow; + u32 rmac_vld_mcst_frms_oflow; + u32 rmac_accepted_ucst_frms_oflow; + u32 rmac_ttl_octets_oflow; + u32 rmac_discarded_frms_oflow; + u32 rmac_accepted_nucst_frms_oflow; + u32 rmac_usized_frms_oflow; + u32 rmac_drop_events_oflow; + u32 rmac_frag_frms_oflow; + u32 rmac_osized_frms_oflow; + u32 rmac_ip_oflow; + u32 rmac_jabber_frms_oflow; + u32 rmac_icmp_oflow; + u32 rmac_drop_ip_oflow; + u32 rmac_err_drp_udp_oflow; + u32 rmac_udp_oflow; + u32 reserved_11; + u32 rmac_pause_cnt_oflow; + u64 rmac_ttl_1519_4095_frms; + u64 rmac_ttl_4096_8191_frms; + u64 rmac_ttl_8192_max_frms; + u64 rmac_ttl_gt_max_frms; + u64 rmac_osized_alt_frms; + u64 rmac_jabber_alt_frms; + u64 rmac_gt_max_alt_frms; + u64 rmac_vlan_frms; + u32 rmac_len_discard; + u32 rmac_fcs_discard; + u32 rmac_pf_discard; + u32 rmac_da_discard; + u32 rmac_red_discard; + u32 rmac_rts_discard; + u32 reserved_12; + u32 rmac_ingm_full_discard; + u32 reserved_13; + u32 rmac_accepted_ip_oflow; + u32 reserved_14; + u32 link_fault_cnt; + /* Software statistics maintained by driver */ swStat_t sw_stat; } StatInfo_t; @@ -690,6 +751,9 @@ struct s2io_nic { atomic_t card_state; volatile unsigned long link_state; struct vlan_group *vlgrp; +#define XFRAME_I_DEVICE 1 +#define XFRAME_II_DEVICE 2 + u8 device_type; spinlock_t rx_lock; atomic_t isr_cnt; From raghavendra.koushik@neterion.com Thu Jul 7 15:50:06 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:50:15 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MnoH9021401 for ; Thu, 7 Jul 2005 15:50:06 -0700 Received: by linux.site (Postfix, from userid 0) id 053B389828; Thu, 7 Jul 2005 15:37:23 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, 
rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 10/12] S2io: Support for Bimodal interrupts Message-Id: <20050707223723.053B389828@linux.site> Date: Thu, 7 Jul 2005 15:37:23 -0700 (PDT) X-archive-position: 2683 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 6376 Lines: 199 Hi, This is a patch to provide bimodal interrupt moderation support for Xframe II adapter. Basically, in this moderation scheme, the adapter raises a traffic interrupt if the no. of packets transmitted and/or received reaches a programmable threshold. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -urpN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-28 03:30:19.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-28 03:30:38.000000000 -0700 @@ -296,6 +296,7 @@ static unsigned int mc_pause_threshold_q static unsigned int shared_splits; static unsigned int tmac_util_period = 5; static unsigned int rmac_util_period = 5; +static unsigned int bimodal = 0; #ifndef CONFIG_S2IO_NAPI static unsigned int indicate_max_pkts; #endif @@ -1305,52 +1306,86 @@ static int init_nic(struct s2io_nic *nic time++; } + if (nic->config.bimodal) { + int k = 0; + for (k = 0; k < config->rx_ring_num; k++) { + val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD; + val64 |= TTI_CMD_MEM_OFFSET(0x38+k); + writeq(val64, &bar0->tti_command_mem); - /* RTI Initialization */ - if (nic->device_type == XFRAME_II_DEVICE) { /* - * Programmed to generate Apprx 500 Intrs per - * second - */ - int count = (nic->config.bus_speed * 125)/4; - val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count); + * Once the operation completes, the Strobe bit of the command + * register will be reset. 
We poll for this particular condition + * We wait for a maximum of 500ms for the operation to complete, + * if it's not complete by then we return error. + */ + time = 0; + while (TRUE) { + val64 = readq(&bar0->tti_command_mem); + if (!(val64 & TTI_CMD_MEM_STROBE_NEW_CMD)) { + break; + } + if (time > 10) { + DBG_PRINT(ERR_DBG, + "%s: TTI init Failed\n", + dev->name); + return -1; + } + time++; + msleep(50); + } + } } else { - val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF); - } - val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) | - RTI_DATA1_MEM_RX_URNG_B(0x10) | - RTI_DATA1_MEM_RX_URNG_C(0x30) | RTI_DATA1_MEM_RX_TIMER_AC_EN; - - writeq(val64, &bar0->rti_data1_mem); - val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) | - RTI_DATA2_MEM_RX_UFC_B(0x2) | - RTI_DATA2_MEM_RX_UFC_C(0x40) | RTI_DATA2_MEM_RX_UFC_D(0x80); - writeq(val64, &bar0->rti_data2_mem); + /* RTI Initialization */ + if (nic->device_type == XFRAME_II_DEVICE) { + /* + * Programmed to generate Apprx 500 Intrs per + * second + */ + int count = (nic->config.bus_speed * 125)/4; + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count); + } else { + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF); + } + val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) | + RTI_DATA1_MEM_RX_URNG_B(0x10) | + RTI_DATA1_MEM_RX_URNG_C(0x30) | RTI_DATA1_MEM_RX_TIMER_AC_EN; + + writeq(val64, &bar0->rti_data1_mem); + + val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) | + RTI_DATA2_MEM_RX_UFC_B(0x2) | + RTI_DATA2_MEM_RX_UFC_C(0x40) | RTI_DATA2_MEM_RX_UFC_D(0x80); + writeq(val64, &bar0->rti_data2_mem); - val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD; - writeq(val64, &bar0->rti_command_mem); + for (i = 0; i < config->rx_ring_num; i++) { + val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD + | RTI_CMD_MEM_OFFSET(i); + writeq(val64, &bar0->rti_command_mem); - /* - * Once the operation completes, the Strobe bit of the - * command register will be reset. We poll for this - * particular condition. 
We wait for a maximum of 500ms - * for the operation to complete, if it's not complete - * by then we return error. - */ - time = 0; - while (TRUE) { - val64 = readq(&bar0->rti_command_mem); - if (!(val64 & RTI_CMD_MEM_STROBE_NEW_CMD)) { - break; - } - if (time > 10) { - DBG_PRINT(ERR_DBG, "%s: RTI init Failed\n", - dev->name); - return -1; + /* + * Once the operation completes, the Strobe bit of the + * command register will be reset. We poll for this + * particular condition. We wait for a maximum of 500ms + * for the operation to complete, if it's not complete + * by then we return error. + */ + time = 0; + while (TRUE) { + val64 = readq(&bar0->rti_command_mem); + if (!(val64 & RTI_CMD_MEM_STROBE_NEW_CMD)) { + break; + } + if (time > 10) { + DBG_PRINT(ERR_DBG, "%s: RTI init Failed\n", + dev->name); + return -1; + } + time++; + msleep(50); + } } - time++; - msleep(50); } /* @@ -1788,6 +1823,8 @@ static int start_nic(struct s2io_nic *ni &bar0->prc_rxd0_n[i]); val64 = readq(&bar0->prc_ctrl_n[i]); + if (nic->config.bimodal) + val64 |= PRC_CTRL_BIMODAL_INTERRUPT; #ifndef CONFIG_2BUFF_MODE val64 |= PRC_CTRL_RC_ENABLED; #else @@ -5031,6 +5068,7 @@ module_param(mc_pause_threshold_q4q7, in module_param(shared_splits, int, 0); module_param(tmac_util_period, int, 0); module_param(rmac_util_period, int, 0); +module_param(bimodal, bool, 0); #ifndef CONFIG_S2IO_NAPI module_param(indicate_max_pkts, int, 0); #endif @@ -5398,6 +5436,14 @@ s2io_init_nic(struct pci_dev *pdev, cons else strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); + /* Initialize bimodal Interrupts */ + sp->config.bimodal = bimodal; + if (!(sp->device_type & XFRAME_II_DEVICE) && bimodal) { + sp->config.bimodal = 0; + DBG_PRINT(ERR_DBG,"%s:Bimodal intr not supported by Xframe I\n", + dev->name); + } + /* * Make Link state as off at this point, when the Link change * interrupt comes the state will be automatically changed to diff -urpN vanilla_kernel/drivers/net/s2io.h linux-2.6.12-rc6/drivers/net/s2io.h 
--- vanilla_kernel/drivers/net/s2io.h 2005-06-28 03:30:19.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.h 2005-06-28 03:30:38.000000000 -0700 @@ -261,8 +261,6 @@ typedef struct stat_block { u32 rmac_accepted_ip_oflow; u32 reserved_14; u32 link_fault_cnt; - -/* Software statistics maintained by driver */ swStat_t sw_stat; } StatInfo_t; @@ -349,6 +347,7 @@ struct config_param { #define MAX_RX_BLOCKS_PER_RING 150 rx_ring_config_t rx_cfg[MAX_RX_RINGS]; /*Per-Rx Ring config */ + u8 bimodal; /*Flag for setting bimodal interrupts*/ #define HEADER_ETHERNET_II_802_3_SIZE 14 #define HEADER_802_2_SIZE 3 From raghavendra.koushik@neterion.com Thu Jul 7 15:52:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:52:38 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MqXH9022271 for ; Thu, 7 Jul 2005 15:52:33 -0700 Received: by linux.site (Postfix, from userid 0) id 2DD9589826; Thu, 7 Jul 2005 15:40:05 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 11/12] S2io: New link handling scheme for Xframe II Message-Id: <20050707224005.2DD9589826@linux.site> Date: Thu, 7 Jul 2005 15:40:05 -0700 (PDT) X-archive-position: 2684 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 9073 Lines: 276 Hi, The below patch implements a new "Link state change handling" scheme supported by the Xframe II adapter. It also bumps up the driver version to 2.0.2.0. 
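[Editorial sketch, not part of the original mail.] The link handling scheme described above can be pictured as re-arming a pair of interrupt masks after every link event: only the interrupt for the opposite transition is left unmasked. The function below is a minimal illustration of that idea. All names and bit positions in it are hypothetical and chosen for readability; they are not the driver's register layout or its BIT() convention.

```c
#include <stdint.h>

/* Hypothetical mask bits for illustration; a set bit means "interrupt masked". */
#define SKETCH_MASK_LINK_DOWN (UINT64_C(1) << 1)
#define SKETCH_MASK_LINK_UP   (UINT64_C(1) << 2)

/* Hypothetical link states, standing in for the driver's LINK_UP/LINK_DOWN. */
enum sketch_link_state { SKETCH_LINK_DOWN = 1, SKETCH_LINK_UP = 2 };

/*
 * Given the current mask value and the last known link state, return the
 * mask that should be written back: while the link is up only a link-down
 * event is of interest, and the reverse while the link is down.
 */
static uint64_t sketch_rearm_link_mask(uint64_t mask,
                                       enum sketch_link_state last)
{
    if (last == SKETCH_LINK_UP) {
        mask &= ~SKETCH_MASK_LINK_DOWN; /* unmask the link-down interrupt */
        mask |= SKETCH_MASK_LINK_UP;    /* mask the link-up interrupt */
    } else {
        mask &= ~SKETCH_MASK_LINK_UP;   /* unmask the link-up interrupt */
        mask |= SKETCH_MASK_LINK_DOWN;  /* mask the link-down interrupt */
    }
    return mask;
}
```

In a real handler the mask would be read from and written back to a device register; here it is a plain value so the logic can be followed in isolation.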
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_kernel/drivers/net/s2io-regs.h linux-2.6.12-rc6/drivers/net/s2io-regs.h --- vanilla_kernel/drivers/net/s2io-regs.h 2005-06-28 06:00:05.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io-regs.h 2005-06-28 06:00:14.000000000 -0700 @@ -167,7 +167,11 @@ typedef struct _XENA_dev_config { u8 unused4[0x08]; u64 gpio_int_reg; +#define GPIO_INT_REG_LINK_DOWN BIT(1) +#define GPIO_INT_REG_LINK_UP BIT(2) u64 gpio_int_mask; +#define GPIO_INT_MASK_LINK_DOWN BIT(1) +#define GPIO_INT_MASK_LINK_UP BIT(2) u64 gpio_alarms; u8 unused5[0x38]; @@ -279,8 +283,10 @@ typedef struct _XENA_dev_config { u64 gpio_control; #define GPIO_CTRL_GPIO_0 BIT(8) + u64 misc_control; +#define MISC_LINK_STABILITY_PRD(val) vBIT(val,29,3) - u8 unused7_1[0x240 - 0x200]; + u8 unused7_1[0x240 - 0x208]; u64 wreq_split_mask; #define WREQ_SPLIT_MASK_SET_MASK(val) vBIT(val, 52, 12) diff -uprN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-28 06:00:05.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-28 06:00:14.000000000 -0700 @@ -66,7 +66,7 @@ /* S2io Driver name & version. 
*/ static char s2io_driver_name[] = "Neterion"; -static char s2io_driver_version[] = "Version 1.7.7"; +static char s2io_driver_version[] = "Version 2.0.2.0"; static inline int RXD_IS_UP2DT(RxD_t *rxdp) { @@ -1455,8 +1455,28 @@ static int init_nic(struct s2io_nic *nic writeq(val64, &bar0->wreq_split_mask); } + /* Setting Link stability period to 64 ms */ + if (nic->device_type == XFRAME_II_DEVICE) { + val64 = MISC_LINK_STABILITY_PRD(3); + writeq(val64, &bar0->misc_control); + } + return SUCCESS; } +#define LINK_UP_DOWN_INTERRUPT 1 +#define MAC_RMAC_ERR_TIMER 2 + +#if defined(CONFIG_MSI_MODE) || defined(CONFIG_MSIX_MODE) +#define s2io_link_fault_indication(x) MAC_RMAC_ERR_TIMER +#else +int s2io_link_fault_indication(nic_t *nic) +{ + if (nic->device_type == XFRAME_II_DEVICE) + return LINK_UP_DOWN_INTERRUPT; + else + return MAC_RMAC_ERR_TIMER; +} +#endif /** * en_dis_able_nic_intrs - Enable or Disable the interrupts @@ -1484,11 +1504,22 @@ static void en_dis_able_nic_intrs(struct temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); /* - * Disabled all PCIX, Flash, MDIO, IIC and GPIO + * If Hercules adapter enable GPIO otherwise + * disabled all PCIX, Flash, MDIO, IIC and GPIO * interrupts for now. * TODO */ - writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); + if (s2io_link_fault_indication(nic) == + LINK_UP_DOWN_INTERRUPT ) { + temp64 = readq(&bar0->pic_int_mask); + temp64 &= ~((u64) PIC_INT_GPIO); + writeq(temp64, &bar0->pic_int_mask); + temp64 = readq(&bar0->gpio_int_mask); + temp64 &= ~((u64) GPIO_INT_MASK_LINK_UP); + writeq(temp64, &bar0->gpio_int_mask); + } else { + writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); + } /* * No MSI Support is available presently, so TTI and * RTI interrupts are also disabled. @@ -1579,17 +1610,8 @@ static void en_dis_able_nic_intrs(struct writeq(temp64, &bar0->general_int_mask); /* * All MAC block error interrupts are disabled for now - * except the link status change interrupt. 
* TODO */ - val64 = MAC_INT_STATUS_RMAC_INT; - temp64 = readq(&bar0->mac_int_mask); - temp64 &= ~((u64) val64); - writeq(temp64, &bar0->mac_int_mask); - - val64 = readq(&bar0->mac_rmac_err_mask); - val64 &= ~((u64) RMAC_LINK_STATE_CHANGE_INT); - writeq(val64, &bar0->mac_rmac_err_mask); } else if (flag == DISABLE_INTRS) { /* * Disable MAC Intrs in the general intr mask register @@ -1878,8 +1900,10 @@ static int start_nic(struct s2io_nic *ni } /* Enable select interrupts */ - interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR | MC_INTR; + interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | MC_INTR; + interruptible |= TX_PIC_INTR | RX_PIC_INTR; + interruptible |= TX_MAC_INTR | RX_MAC_INTR; + en_dis_able_nic_intrs(nic, interruptible, ENABLE_INTRS); /* @@ -2003,8 +2027,9 @@ static void stop_nic(struct s2io_nic *ni config = &nic->config; /* Disable all interrupts */ - interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR | MC_INTR; + interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | MC_INTR; + interruptible |= TX_PIC_INTR | RX_PIC_INTR; + interruptible |= TX_MAC_INTR | RX_MAC_INTR; en_dis_able_nic_intrs(nic, interruptible, DISABLE_INTRS); /* Disable PRCs */ @@ -2619,10 +2644,12 @@ static void alarm_intr_handler(struct s2 register u64 val64 = 0, err_reg = 0; /* Handling link status change error Intr */ - err_reg = readq(&bar0->mac_rmac_err_reg); - writeq(err_reg, &bar0->mac_rmac_err_reg); - if (err_reg & RMAC_LINK_STATE_CHANGE_INT) { - schedule_work(&nic->set_link_task); + if (s2io_link_fault_indication(nic) == MAC_RMAC_ERR_TIMER) { + err_reg = readq(&bar0->mac_rmac_err_reg); + writeq(err_reg, &bar0->mac_rmac_err_reg); + if (err_reg & RMAC_LINK_STATE_CHANGE_INT) { + schedule_work(&nic->set_link_task); + } } /* Handling Ecc errors */ @@ -2948,7 +2975,7 @@ int s2io_open(struct net_device *dev) * Nic is initialized */ netif_carrier_off(dev); - sp->last_link_state = 0; /* Unkown link state */ + sp->last_link_state = 
LINK_DOWN; /* Initialize H/W and enable interrupts */ if (s2io_card_up(sp)) { @@ -3160,6 +3187,53 @@ s2io_alarm_handle(unsigned long data) mod_timer(&sp->alarm_timer, jiffies + HZ / 2); } +static void s2io_txpic_intr_handle(nic_t *sp) +{ + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) sp->bar0; + u64 val64; + + val64 = readq(&bar0->pic_int_status); + if (val64 & PIC_INT_GPIO) { + val64 = readq(&bar0->gpio_int_reg); + if ((val64 & GPIO_INT_REG_LINK_DOWN) && + (val64 & GPIO_INT_REG_LINK_UP)) { + val64 |= GPIO_INT_REG_LINK_DOWN; + val64 |= GPIO_INT_REG_LINK_UP; + writeq(val64, &bar0->gpio_int_reg); + goto masking; + } + + if (((sp->last_link_state == LINK_UP) && + (val64 & GPIO_INT_REG_LINK_DOWN)) || + ((sp->last_link_state == LINK_DOWN) && + (val64 & GPIO_INT_REG_LINK_UP))) { + val64 = readq(&bar0->gpio_int_mask); + val64 |= GPIO_INT_MASK_LINK_DOWN; + val64 |= GPIO_INT_MASK_LINK_UP; + writeq(val64, &bar0->gpio_int_mask); + s2io_set_link((unsigned long)sp); + } +masking: + if (sp->last_link_state == LINK_UP) { + /*enable down interrupt */ + val64 = readq(&bar0->gpio_int_mask); + /* unmasks link down intr */ + val64 &= ~GPIO_INT_MASK_LINK_DOWN; + /* masks link up intr */ + val64 |= GPIO_INT_MASK_LINK_UP; + writeq(val64, &bar0->gpio_int_mask); + } else { + /*enable UP Interrupt */ + val64 = readq(&bar0->gpio_int_mask); + /* unmasks link up interrupt */ + val64 &= ~GPIO_INT_MASK_LINK_UP; + /* masks link down interrupt */ + val64 |= GPIO_INT_MASK_LINK_DOWN; + writeq(val64, &bar0->gpio_int_mask); + } + } +} + /** * s2io_isr - ISR handler of the device . * @irq: the irq of the device. 
@@ -3242,6 +3316,8 @@ static irqreturn_t s2io_isr(int irq, voi tx_intr_handler(&mac_control->fifos[i]); } + if (reason & GEN_INTR_TXPIC) + s2io_txpic_intr_handle(sp); /* * If the Rx buffer count is below the panic threshold then * reallocate the buffers from the interrupt handler itself, @@ -4645,11 +4721,13 @@ static void s2io_set_link(unsigned long } subid = nic->pdev->subsystem_device; - /* - * Allow a small delay for the NICs self initiated - * cleanup to complete. - */ - msleep(100); + if (s2io_link_fault_indication(nic) == MAC_RMAC_ERR_TIMER) { + /* + * Allow a small delay for the NICs self initiated + * cleanup to complete. + */ + msleep(100); + } val64 = readq(&bar0->adapter_status); if (verify_xena_quiescence(nic, val64, nic->device_enabled_once)) { @@ -4667,13 +4745,16 @@ static void s2io_set_link(unsigned long val64 |= ADAPTER_LED_ON; writeq(val64, &bar0->adapter_control); } - val64 = readq(&bar0->adapter_status); - if (!LINK_IS_UP(val64)) { - DBG_PRINT(ERR_DBG, "%s:", dev->name); - DBG_PRINT(ERR_DBG, " Link down"); - DBG_PRINT(ERR_DBG, "after "); - DBG_PRINT(ERR_DBG, "enabling "); - DBG_PRINT(ERR_DBG, "device \n"); + if (s2io_link_fault_indication(nic) == + MAC_RMAC_ERR_TIMER) { + val64 = readq(&bar0->adapter_status); + if (!LINK_IS_UP(val64)) { + DBG_PRINT(ERR_DBG, "%s:", dev->name); + DBG_PRINT(ERR_DBG, " Link down"); + DBG_PRINT(ERR_DBG, "after "); + DBG_PRINT(ERR_DBG, "enabling "); + DBG_PRINT(ERR_DBG, "device \n"); + } } if (nic->device_enabled_once == FALSE) { nic->device_enabled_once = TRUE; From raghavendra.koushik@neterion.com Thu Jul 7 15:55:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:55:14 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MspH9023120 for ; Thu, 7 Jul 2005 15:55:11 -0700 Received: by linux.site (Postfix, from userid 0) id DA22889828; Thu, 7 Jul 2005 15:42:23 -0700 (PDT) To: 
jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.12.1 12/12] S2io: Miscellaneous fixes Message-Id: <20050707224223.DA22889828@linux.site> Date: Thu, 7 Jul 2005 15:42:23 -0700 (PDT) X-archive-position: 2685 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 5058 Lines: 144 Hi, The last patch in this series fixes the following issues found during testing. 1. Ensure we don't pass zero sized buffers to the card(which can lockup) 2. Restore the PCI-X parameters(in case of Xframe I adapter) after a reset. 3. Make sure total size of all FIFOs does not exceed 8192. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_kernel/drivers/net/s2io.c linux-2.6.12-rc6/drivers/net/s2io.c --- vanilla_kernel/drivers/net/s2io.c 2005-06-30 02:26:26.000000000 -0700 +++ linux-2.6.12-rc6/drivers/net/s2io.c 2005-06-30 03:39:18.000000000 -0700 @@ -364,10 +364,9 @@ static int init_shared_mem(struct s2io_n size += config->tx_cfg[i].fifo_len; } if (size > MAX_AVAILABLE_TXDS) { - DBG_PRINT(ERR_DBG, "%s: Total number of Tx FIFOs ", - dev->name); - DBG_PRINT(ERR_DBG, "exceeds the maximum value "); - DBG_PRINT(ERR_DBG, "that can be used\n"); + DBG_PRINT(ERR_DBG, "%s: Requested TxDs too high, ", + __FUNCTION__); + DBG_PRINT(ERR_DBG, "Requested: %d, max supported: 8192\n", size); return FAILURE; } @@ -610,8 +609,9 @@ static void free_shared_mem(struct s2io_ lst_per_page); for (j = 0; j < page_num; j++) { int mem_blks = (j * lst_per_page); - if (!mac_control->fifos[i].list_info[mem_blks]. - list_virt_addr) + if ((!mac_control->fifos[i].list_info) || + (!mac_control->fifos[i].list_info[mem_blks]. 
+ list_virt_addr)) break; pci_free_consistent(nic->pdev, PAGE_SIZE, mac_control->fifos[i]. @@ -2595,6 +2595,8 @@ static void tx_intr_handler(fifo_info_t for (j = 0; j < frg_cnt; j++, txdlp++) { skb_frag_t *frag = &skb_shinfo(skb)->frags[j]; + if (!txdlp->Buffer_Pointer) + break; pci_unmap_page(nic->pdev, (dma_addr_t) txdlp-> @@ -2745,6 +2747,10 @@ void s2io_reset(nic_t * sp) u64 val64; u16 subid, pci_cmd; + /* Back up the PCI-X CMD reg, dont want to lose MMRBC, OST settings */ + if (sp->device_type == XFRAME_I_DEVICE) + pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, &(pci_cmd)); + val64 = SW_RESET_ALL; writeq(val64, &bar0->sw_reset); @@ -2763,8 +2769,10 @@ void s2io_reset(nic_t * sp) msleep(250); if (!(sp->device_type & XFRAME_II_DEVICE)) { - /* Restore the PCI state saved during initializarion. */ + /* Restore the PCI state saved during initializarion. */ pci_restore_state(sp->pdev); + pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, + pci_cmd); } else { pci_set_master(sp->pdev); } @@ -2975,7 +2983,7 @@ int s2io_open(struct net_device *dev) * Nic is initialized */ netif_carrier_off(dev); - sp->last_link_state = LINK_DOWN; + sp->last_link_state = 0; /* Initialize H/W and enable interrupts */ if (s2io_card_up(sp)) { @@ -3103,6 +3111,15 @@ int s2io_xmit(struct sk_buff *skb, struc spin_unlock_irqrestore(&sp->tx_lock, flags); return 0; } + + /* A buffer with no data will be dropped */ + if (!skb->len) { + DBG_PRINT(TX_DBG, "%s:Buffer has no data..\n", dev->name); + dev_kfree_skb(skb); + spin_unlock_irqrestore(&sp->tx_lock, flags); + return 0; + } + #ifdef NETIF_F_TSO mss = skb_shinfo(skb)->tso_size; if (mss) { @@ -3137,6 +3154,9 @@ int s2io_xmit(struct sk_buff *skb, struc /* For fragmented SKB. 
*/ for (i = 0; i < frg_cnt; i++) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + /* A '0' length fragment will be ignored */ + if (!frag->size) + continue; txdp++; txdp->Buffer_Pointer = (u64) pci_map_page (sp->pdev, frag->page, frag->page_offset, @@ -5258,7 +5278,8 @@ s2io_init_nic(struct pci_dev *pdev, cons config = &sp->config; /* Tx side parameters. */ - tx_fifo_len[0] = DEFAULT_FIFO_LEN; /* Default value. */ + if (tx_fifo_len[0] == 0) + tx_fifo_len[0] = DEFAULT_FIFO_LEN; /* Default value. */ config->tx_fifo_num = tx_fifo_num; for (i = 0; i < MAX_TX_FIFOS; i++) { config->tx_cfg[i].fifo_len = tx_fifo_len[i]; @@ -5281,7 +5302,8 @@ s2io_init_nic(struct pci_dev *pdev, cons config->max_txds = MAX_SKB_FRAGS; /* Rx side parameters. */ - rx_ring_sz[0] = SMALL_BLK_CNT; /* Default value. */ + if (rx_ring_sz[0] == 0) + rx_ring_sz[0] = SMALL_BLK_CNT; /* Default value. */ config->rx_ring_num = rx_ring_num; for (i = 0; i < MAX_RX_RINGS; i++) { config->rx_cfg[i].num_rxd = rx_ring_sz[i] * @@ -5311,7 +5333,7 @@ s2io_init_nic(struct pci_dev *pdev, cons /* initialize the shared memory used by the NIC and the host */ if (init_shared_mem(sp)) { DBG_PRINT(ERR_DBG, "%s: Memory allocation failed\n", - dev->name); + __FUNCTION__); ret = -ENOMEM; goto mem_alloc_failed; } @@ -5489,7 +5511,7 @@ s2io_init_nic(struct pci_dev *pdev, cons sp->def_mac_addr[0].mac_addr[3], sp->def_mac_addr[0].mac_addr[4], sp->def_mac_addr[0].mac_addr[5]); - int mode = s2io_print_pci_mode(sp); + mode = s2io_print_pci_mode(sp); if (mode < 0) { DBG_PRINT(ERR_DBG, " Unsupported PCI bus mode "); ret = -EBADSLT; From jesse.brandeburg@intel.com Thu Jul 7 16:05:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 16:05:48 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67N5iH9024525 for ; Thu, 7 Jul 2005 16:05:44 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by 
orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j67N3YEO012961; Thu, 7 Jul 2005 23:03:34 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j67N3EBX003784; Thu, 7 Jul 2005 23:03:34 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs041.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005070716033125440 ; Thu, 07 Jul 2005 16:03:31 -0700 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Thu, 7 Jul 2005 16:03:31 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: netdev @ sgi still processing emails? WAS (RE: [PATCH] loop unrolling in net/sched/sch_generic.c) Date: Thu, 7 Jul 2005 16:03:30 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: netdev @ sgi still processing emails? WAS (RE: [PATCH] loop unrolling in net/sched/sch_generic.c) thread-index: AcWDQ4uvE//TYIrPRoWkCBiCAYib7AAApyzQ From: "Brandeburg, Jesse" To: "David S. Miller" Cc: X-OriginalArrivalTime: 07 Jul 2005 23:03:31.0530 (UTC) FILETIME=[181C02A0:01C58348] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j67N5iH9024525 X-archive-position: 2686 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jesse.brandeburg@intel.com Precedence: bulk X-list: netdev Content-Length: 1142 Lines: 33 >From: David S. Miller [mailto:davem@davemloft.net] >> Arg, this thread wasn't on the new list, is there any chance we can just >> get netdev@oss.sgi.com to forward to netdev@vger.kernel.org? > >It does already. 
Odd, I didn't think it was, because this thread didn't show up where I am subscribed only to netdev@vger.

I'm assuming you're correct, but I'm still receiving mail directly from the reflector at oss.sgi:

Received: from oss.sgi.com (oss.sgi.com [192.48.159.27]) by fmsfmr005.fm.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j67MPaJF014398 for ; Thu, 7 Jul 2005 22:25:36 GMT
Received: from oss (localhost [127.0.0.1]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MQxH9014528; Thu, 7 Jul 2005 15:26:59 -0700
Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 15:26:17 -0700 (PDT)
Received: from sunset.davemloft.net ([216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67MQBH9014292

Is there some reason to keep both lists open? It does appear the marc.theaims is getting both.

Jesse

From mitch.a.williams@intel.com Thu Jul 7 16:08:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 16:08:35 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67N8WH9025206 for ; Thu, 7 Jul 2005 16:08:32 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j67N6isX031742; Thu, 7 Jul 2005 23:06:44 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j67N6hVc024874; Thu, 7 Jul 2005 23:06:43 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.58]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j67N6dSL012674; Thu, 7 Jul 2005 16:06:41 -0700 Date: Thu, 7 Jul 2005 16:06:38 -0700 From: Mitch Williams X-X-Sender:
mawilli1@mawilli1-desk2.amr.corp.intel.com To: "John W. Linville" cc: Greg KH , Mitch Williams , Radheka Godse , netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, fubar@us.ibm.com Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) In-Reply-To: <20050707142544.GA9418@tuxdriver.com> Message-ID: References: <20050702081346.GA20789@kroah.com> <20050706195232.GB18359@kroah.com> <20050707142544.GA9418@tuxdriver.com> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2687 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 3432 Lines: 81

On Thu, 7 Jul 2005, John W. Linville wrote:

> On Wed, Jul 06, 2005 at 12:52:32PM -0700, Greg KH wrote:
> > On Wed, Jul 06, 2005 at 11:53:13AM -0700, Mitch Williams wrote:
> > >
> > How about this:
> >   bond_add - write to this to add a new bond, one value only.
> >   bond_remove - write to this to remove a bond that is present.
> >   bonds/bond0
> >   bonds/bond1
> >   bonds/bond2
> >   ...
> >     - list of bonds currently present. If you want, you
> >       could make those bondX files directories, and put
> >       other info about the individual bonds in there, if you
> >       need it (I know nothing about the bonding interface,
> >       sorry.)
> >
> > Would that work?
>
> I like that suggestion. It keeps the interface creation/deletion a
> little more independent of each other.
>

We looked into a scheme much like this, but rejected it early on. Actually, what Greg is proposing is two things: 1) move the individual bond interface directories down a level, into /sys/class/net, and 2) change bonding_masters into a set of control files, instead of one file.

Moving the individual bond directories to a bonds/ directory is problematic.
Because each bond shows up as just another network interface, they show up in /sys/class/net automatically. We'd have to make a bunch of changes to the device model to account for bonding, and we'd break the semantics of the /sys/class/net hierarchy.

Instead, what we do is piggyback on the device model by adding new files (attributes) to each bond device's directory (kobject).

The problem, then, becomes one of separating the bond interfaces from the non-bond interfaces. The bonding_masters file is a simple solution to this problem. Reading the file gives the set of active bonds, and writing the file changes the set of active bonds. As I stated before, a cursory reading of Documentation/filesystems/sysfs.txt indicates that such a usage is "socially acceptable". (Or at least it was to Patrick Mochel back in January of 2003.)

My other major difficulty with the bond_add/bond_remove scheme is that these files would act differently than any other sysfs files, in that their read and write semantics are not the same. What I mean is that any given sysfs "file" will appear to contain the same data for both read and write. Most scripts that handle sysfs do some sort of read-modify-write operation. This would not be possible with the type of scheme Greg KH has outlined.

Furthermore, what happens when you read bond_add and bond_remove? What do you use to get a list of existing bond interfaces? Reading a file named "bond_add" to get a list of bonds is counterintuitive at best, and adding an extra read-only file just to get a list of bonds seems cumbersome.

Greg also (in another message) mentioned problems with appending to bonding_masters. This currently is a problem, since sysfs itself does not handle appends properly. Since there is no concept of a file pointer or offset or such when the underlying methods are called, and since sysfs happily allows seek operations to succeed, appending ends up destroying the contents of the file.
I submitted two patches that addressed this issue several months ago but got a frosty reception and gave up. Of course, all our discussions don't really mean anything, since none of us is the bonding maintainer. I'd really like to know what Jay Vosburgh thinks about all this. Are you there, Jay? -Mitch From davem@davemloft.net Thu Jul 7 16:09:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 16:09:30 -0700 (PDT) Received: from sunset.davemloft.net ([216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67N9RH9025569 for ; Thu, 7 Jul 2005 16:09:27 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DqfT3-0001TB-8S; Thu, 07 Jul 2005 16:07:41 -0700 Date: Thu, 07 Jul 2005 16:07:40 -0700 (PDT) Message-Id: <20050707.160740.88477107.davem@davemloft.net> To: jesse.brandeburg@intel.com Cc: netdev@oss.sgi.com Subject: Re: netdev @ sgi still processing emails? WAS From: "David S. Miller" In-Reply-To: References: X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2688 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 357 Lines: 9 From: "Brandeburg, Jesse" Date: Thu, 7 Jul 2005 16:03:30 -0700 > Is there some reason to keep both lists open? It does appear the > marc.theaims is getting both. It might be wise to kill it off in a month or two. After that much time has passed, there really isn't any reason not to have moved over to the new list location. 
From akepner@sgi.com Thu Jul 7 16:22:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 16:22:09 -0700 (PDT) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67NM4H9030735 for ; Thu, 7 Jul 2005 16:22:04 -0700 Received: from nodin.corp.sgi.com (fddi-nodin.corp.sgi.com [198.29.75.193]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id j681CV6e022381 for ; Thu, 7 Jul 2005 18:12:31 -0700 Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by nodin.corp.sgi.com (SGI-8.12.5/8.12.10/SGI_generic_relay-1.2) with ESMTP id j67NKQbT85273697 for ; Thu, 7 Jul 2005 16:20:26 -0700 (PDT) Received: from [192.168.2.20] (mtv-vpn-sw-corp-0-42.corp.sgi.com [134.15.0.42]) by cthulhu.engr.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id j67NJPdP45504187; Thu, 7 Jul 2005 16:19:25 -0700 (PDT) Date: Thu, 7 Jul 2005 16:15:20 -0700 (PDT) From: Arthur Kepner X-X-Sender: akepner@resonance.WorkGroup To: raghavendra.koushik@neterion.com cc: jgarzik@pobox.com, netdev@oss.sgi.com, netdev@vger.kernel.org, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements In-Reply-To: <20050707222741.71C3E89826@linux.site> Message-ID: References: <20050707222741.71C3E89826@linux.site> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2689 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akepner@sgi.com Precedence: bulk X-list: netdev Content-Length: 1331 Lines: 40 On Thu, 7 Jul 2005 raghavendra.koushik@neterion.com wrote: > ....... > 2. Removed unnecessary PIOs(read/write of tx_traffic_int and > rx_traffic_int) from interrupt handler and removed read of > general_int_status register from xmit routine. > ...... 
> @@ -2891,6 +2869,8 @@ int s2io_xmit(struct sk_buff *skb, struc > val64 = mac_control->fifos[queue].list_info[put_off].list_phy_addr; > writeq(val64, &tx_fifo->TxDL_Pointer); > > + wmb(); > + > val64 = (TX_FIFO_LAST_TXD_NUM(frg_cnt) | TX_FIFO_FIRST_LIST | > TX_FIFO_LAST_LIST); > > @@ -2900,9 +2880,6 @@ int s2io_xmit(struct sk_buff *skb, struc > #endif > writeq(val64, &tx_fifo->List_Control); > > - /* Perform a PCI read to flush previous writes */ > - val64 = readq(&bar0->general_int_status); > - > put_off++; I thought that an mmiowb() was called for here (to order the PIO writes above more cheaply than doing the readq()). I posted a patch like this some time ago: http://marc.theaimsgroup.com/?l=linux-netdev&m=111508292028110&w=2 FWIW, I've done quite a few performance measurements with the patch I posted earlier, and it's worked well. For 1500 byte mtus throughput goes up by ~20%. Is even the mmiowb() unnecessary? What is the wmb() above for? -- Arthur From seb_vincent@yahoo.com Thu Jul 7 16:29:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 16:29:09 -0700 (PDT) Received: from web54104.mail.yahoo.com (web54104.mail.yahoo.com [206.190.37.239]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j67NSxH9031722 for ; Thu, 7 Jul 2005 16:29:00 -0700 Received: (qmail 73492 invoked by uid 60001); 7 Jul 2005 23:27:19 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:Received:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=6nmQXheF0Gx7awM8IGp+G/l3sB6ecFpwZRKbIofNfw1/9FURsQGLyg0cgL1Ry6iY9cpe5cbHZ2sr+zMdV712AUsgoJLXzuO4MjVda5ui3cEBPtl7/EEoXix/MCMw+SVjKDqgwbEikYuLgDZmJYfWsqCbt606eRBPHl19OlGaBq4= ; Message-ID: <20050707232719.73490.qmail@web54104.mail.yahoo.com> Received: from [81.86.159.14] by web54104.mail.yahoo.com via HTTP; Thu, 07 Jul 2005 16:27:19 PDT Date: Thu, 7 Jul 2005 16:27:19 -0700 (PDT) From: Sebastien Vincent Subject: transceiver problem To: venza@brownhat.org, 
netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="0-924250615-1120778839=:64549" Content-Transfer-Encoding: 8bit X-archive-position: 2690 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: seb_vincent@yahoo.com Precedence: bulk X-list: netdev Content-Length: 19016 Lines: 339

--0-924250615-1120778839=:64549 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Content-Id: Content-Disposition: inline

I am having a speed problem with my sis900 card. It goes at only 1.4MB and 500k download, whereas other computers on my Ethernet network are going much faster. I've attached the dmesg output, but I suspect that it is a transceiver problem, as I see the following message:

0000:00:04.0: Unknown PHY transceiver found at address 1.

I am using a recent version of Gentoo:

uname -a
Linux provxtest2 2.6.11-gentoo-r9 #1 SMP Thu Jul 7 23:01:03 GMT 2005 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux

The current version of sis900.c in this version is 1.08.06, and I also tried the latest 1.08.08 with the same result. I've been trying to use sis900-list-phy-ids.diff, or an older patched version of the driver, with no success on the 2.6.11 kernel, and I am wondering if there is an equivalent patch for this version.

Anyway, any help would be welcome, as I am totally stuck.

Thanks.

____________________________________________________ Sell on Yahoo! Auctions – no fees. Bid on great items.
http://auctions.yahoo.com/ --0-924250615-1120778839=:64549 Content-Type: application/octet-stream; name="dmesg.output" Content-Transfer-Encoding: base64 Content-Description: 4223327508-dmesg.output Content-Disposition: attachment; filename="dmesg.output" TGludXggdmVyc2lvbiAyLjYuMTEtZ2VudG9vLXI5IChyb290QHByb3Z4dGVz dDIpIChnY2MgdmVyc2lvbiAzLjMuNS0yMDA1MDEzMCAoR2VudG9vIDMuMy41 LjIwMDUwMTMwLXIxLCBzc3AtMy4zLjUuMjAwNTAxMzAtMSwgcGllLTguNy43 LjEpKSAjMSBTTVAgVGh1IEp1bCA3IDIzOjAxOjAzIEdNVCAyMDA1CkJJT1Mt cHJvdmlkZWQgcGh5c2ljYWwgUkFNIG1hcDoKIEJJT1MtZTgyMDogMDAwMDAw MDAwMDAwMDAwMCAtIDAwMDAwMDAwMDAwOWY4MDAgKHVzYWJsZSkKIEJJT1Mt ZTgyMDogMDAwMDAwMDAwMDA5ZjgwMCAtIDAwMDAwMDAwMDAwYTAwMDAgKHJl c2VydmVkKQogQklPUy1lODIwOiAwMDAwMDAwMDAwMGYwMDAwIC0gMDAwMDAw MDAwMDEwMDAwMCAocmVzZXJ2ZWQpCiBCSU9TLWU4MjA6IDAwMDAwMDAwMDAx MDAwMDAgLSAwMDAwMDAwMDFkZmYwMDAwICh1c2FibGUpCiBCSU9TLWU4MjA6 IDAwMDAwMDAwMWRmZjAwMDAgLSAwMDAwMDAwMDFkZmYzMDAwIChBQ1BJIE5W UykKIEJJT1MtZTgyMDogMDAwMDAwMDAxZGZmMzAwMCAtIDAwMDAwMDAwMWUw MDAwMDAgKEFDUEkgZGF0YSkKIEJJT1MtZTgyMDogMDAwMDAwMDBmZWMwMDAw MCAtIDAwMDAwMDAxMDAwMDAwMDAgKHJlc2VydmVkKQo0NzlNQiBMT1dNRU0g YXZhaWxhYmxlLgpmb3VuZCBTTVAgTVAtdGFibGUgYXQgMDAwZjU2ZTAKT24g bm9kZSAwIHRvdGFscGFnZXM6IDEyMjg2NAogIERNQSB6b25lOiA0MDk2IHBh Z2VzLCBMSUZPIGJhdGNoOjEKICBOb3JtYWwgem9uZTogMTE4NzY4IHBhZ2Vz LCBMSUZPIGJhdGNoOjE2CiAgSGlnaE1lbSB6b25lOiAwIHBhZ2VzLCBMSUZP IGJhdGNoOjEKRE1JIDIuMiBwcmVzZW50LgpBQ1BJOiBSU0RQICh2MDAwIEFX QVJEICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKSBAIDB4MDAw ZjcxNTAKQUNQSTogUlNEVCAodjAwMSBBV0FSRCAgQVdSREFDUEkgMHg0MjMw MmUzMSBBV1JEIDB4MDAwMDAwMDApIEAgMHgxZGZmMzAwMApBQ1BJOiBGQURU ICh2MDAxIEFXQVJEICBBV1JEQUNQSSAweDQyMzAyZTMxIEFXUkQgMHgwMDAw MDAwMCkgQCAweDFkZmYzMDQwCkFDUEk6IE1BRFQgKHYwMDEgQVdBUkQgIEFX UkRBQ1BJIDB4NDIzMDJlMzEgQVdSRCAweDAwMDAwMDAwKSBAIDB4MWRmZjY3 YzAKQUNQSTogRFNEVCAodjAwMSBBV0FSRCAgQVdSREFDUEkgMHgwMDAwMTAw MCBNU0ZUIDB4MDEwMDAwMGQpIEAgMHgwMDAwMDAwMApBQ1BJOiBMb2NhbCBB UElDIGFkZHJlc3MgMHhmZWUwMDAwMApBQ1BJOiBMQVBJQyAoYWNwaV9pZFsw 
eDAwXSBsYXBpY19pZFsweDAwXSBlbmFibGVkKQpQcm9jZXNzb3IgIzAgMTU6 NCBBUElDIHZlcnNpb24gMjAKQUNQSTogTEFQSUMgKGFjcGlfaWRbMHgwMV0g bGFwaWNfaWRbMHgwMV0gZW5hYmxlZCkKUHJvY2Vzc29yICMxIDE1OjQgQVBJ QyB2ZXJzaW9uIDIwCkFDUEk6IExBUElDX05NSSAoYWNwaV9pZFsweDAwXSBo aWdoIGVkZ2UgbGludFsweDFdKQpBQ1BJOiBMQVBJQ19OTUkgKGFjcGlfaWRb MHgwMV0gaGlnaCBlZGdlIGxpbnRbMHgxXSkKQUNQSTogSU9BUElDIChpZFsw eDAyXSBhZGRyZXNzWzB4ZmVjMDAwMDBdIGdzaV9iYXNlWzBdKQpJT0FQSUNb MF06IGFwaWNfaWQgMiwgdmVyc2lvbiAyMCwgYWRkcmVzcyAweGZlYzAwMDAw LCBHU0kgMC0yMwpBQ1BJOiBJTlRfU1JDX09WUiAoYnVzIDAgYnVzX2lycSAw IGdsb2JhbF9pcnEgMiBkZmwgZGZsKQpBQ1BJOiBJTlRfU1JDX09WUiAoYnVz IDAgYnVzX2lycSA5IGdsb2JhbF9pcnEgOSBkZmwgZGZsKQpBQ1BJOiBJUlEw IHVzZWQgYnkgb3ZlcnJpZGUuCkFDUEk6IElSUTIgdXNlZCBieSBvdmVycmlk ZS4KQUNQSTogSVJROSB1c2VkIGJ5IG92ZXJyaWRlLgpFbmFibGluZyBBUElD IG1vZGU6ICBGbGF0LiAgVXNpbmcgMSBJL08gQVBJQ3MKVXNpbmcgQUNQSSAo TUFEVCkgZm9yIFNNUCBjb25maWd1cmF0aW9uIGluZm9ybWF0aW9uCkFsbG9j YXRpbmcgUENJIHJlc291cmNlcyBzdGFydGluZyBhdCAxZTAwMDAwMCAoZ2Fw OiAxZTAwMDAwMDplMGMwMDAwMCkKQnVpbHQgMSB6b25lbGlzdHMKS2VybmVs IGNvbW1hbmQgbGluZTogcm9vdD0vZGV2L3JhbTAgaW5pdD0vbGludXhyYyBy ZWFsX3Jvb3Q9L2Rldi9oZGEzIHJhbWRpc2s9ODE5MgptYXBwZWQgQVBJQyB0 byBmZmZmZDAwMCAoZmVlMDAwMDApCm1hcHBlZCBJT0FQSUMgdG8gZmZmZmMw MDAgKGZlYzAwMDAwKQpJbml0aWFsaXppbmcgQ1BVIzAKUElEIGhhc2ggdGFi bGUgZW50cmllczogMjA0OCAob3JkZXI6IDExLCAzMjc2OCBieXRlcykKRGV0 ZWN0ZWQgMzAwMS4wMDkgTUh6IHByb2Nlc3Nvci4KVXNpbmcgdHNjIGZvciBo aWdoLXJlcyB0aW1lc291cmNlClNwZWFrdXAgdi0yLjAwIENWUzogV2VkIE1h ciAyIDIwOjIyOjAyIEVTVCAyMDA1IDogaW5pdGlhbGl6ZWQKU3BlYWt1cDog IGxvYWRpbmcgbW9kdWxlICJzcGVha3VwX24iCnJlcXVlc3RfbW9kdWxlOiBy dW5hd2F5IGxvb3AgbW9kcHJvYmUgc3BlYWt1cF9uCkNvbnNvbGU6IGNvbG91 ciBWR0ErIDgweDI1CkRlbnRyeSBjYWNoZSBoYXNoIHRhYmxlIGVudHJpZXM6 IDY1NTM2IChvcmRlcjogNiwgMjYyMTQ0IGJ5dGVzKQpJbm9kZS1jYWNoZSBo YXNoIHRhYmxlIGVudHJpZXM6IDMyNzY4IChvcmRlcjogNSwgMTMxMDcyIGJ5 dGVzKQpNZW1vcnk6IDQ4MDU2MGsvNDkxNDU2ayBhdmFpbGFibGUgKDI4NjBr IGtlcm5lbCBjb2RlLCAxMDMzNmsgcmVzZXJ2ZWQsIDEwNTdrIGRhdGEsIDIy 
NGsgaW5pdCwgMGsgaGlnaG1lbSkKQ2hlY2tpbmcgaWYgdGhpcyBwcm9jZXNz b3IgaG9ub3VycyB0aGUgV1AgYml0IGV2ZW4gaW4gc3VwZXJ2aXNvciBtb2Rl Li4uIE9rLgpDYWxpYnJhdGluZyBkZWxheSBsb29wLi4uIDU5MzEuMDAgQm9n b01JUFMgKGxwaj0yOTY1NTA0KQpNb3VudC1jYWNoZSBoYXNoIHRhYmxlIGVu dHJpZXM6IDUxMiAob3JkZXI6IDAsIDQwOTYgYnl0ZXMpCkNvdWxkbid0IGlu aXRpYWxpemUgbWlzY2RldmljZSAvZGV2L3N5bnRoLgpDUFU6IEFmdGVyIGdl bmVyaWMgaWRlbnRpZnksIGNhcHM6IGJmZWJmYmZmIDAwMDAwMDAwIDAwMDAw MDAwIDAwMDAwMDAwIDAwMDA0NDFkIDAwMDAwMDAwIDAwMDAwMDAwCkNQVTog QWZ0ZXIgdmVuZG9yIGlkZW50aWZ5LCBjYXBzOiBiZmViZmJmZiAwMDAwMDAw MCAwMDAwMDAwMCAwMDAwMDAwMCAwMDAwNDQxZCAwMDAwMDAwMCAwMDAwMDAw MAptb25pdG9yL213YWl0IGZlYXR1cmUgcHJlc2VudC4KdXNpbmcgbXdhaXQg aW4gaWRsZSB0aHJlYWRzLgpDUFU6IFRyYWNlIGNhY2hlOiAxMksgdW9wcywg TDEgRCBjYWNoZTogMTZLCkNQVTogTDIgY2FjaGU6IDEwMjRLCkNQVTogUGh5 c2ljYWwgUHJvY2Vzc29yIElEOiAwCkNQVTogQWZ0ZXIgYWxsIGluaXRzLCBj YXBzOiBiZmViZmJmZiAwMDAwMDAwMCAwMDAwMDAwMCAwMDAwMDA4MCAwMDAw NDQxZCAwMDAwMDAwMCAwMDAwMDAwMApJbnRlbCBtYWNoaW5lIGNoZWNrIGFy Y2hpdGVjdHVyZSBzdXBwb3J0ZWQuCkludGVsIG1hY2hpbmUgY2hlY2sgcmVw b3J0aW5nIGVuYWJsZWQgb24gQ1BVIzAuCkNQVTA6IEludGVsIFA0L1hlb24g RXh0ZW5kZWQgTUNFIE1TUnMgKDEyKSBhdmFpbGFibGUKQ1BVMDogVGhlcm1h bCBtb25pdG9yaW5nIGVuYWJsZWQKRW5hYmxpbmcgZmFzdCBGUFUgc2F2ZSBh bmQgcmVzdG9yZS4uLiBkb25lLgpFbmFibGluZyB1bm1hc2tlZCBTSU1EIEZQ VSBleGNlcHRpb24gc3VwcG9ydC4uLiBkb25lLgpDaGVja2luZyAnaGx0JyBp bnN0cnVjdGlvbi4uLiBPSy4KQ1BVMDogSW50ZWwoUikgUGVudGl1bShSKSA0 IENQVSAzLjAwR0h6IHN0ZXBwaW5nIDAxCnBlci1DUFUgdGltZXNsaWNlIGN1 dG9mZjogMjkyNS4wNSB1c2Vjcy4KdGFzayBtaWdyYXRpb24gY2FjaGUgZGVj YXkgdGltZW91dDogMyBtc2Vjcy4KQm9vdGluZyBwcm9jZXNzb3IgMS8xIGVp cCAzMDAwCkluaXRpYWxpemluZyBDUFUjMQpDYWxpYnJhdGluZyBkZWxheSBs b29wLi4uIDU5OTYuNTQgQm9nb01JUFMgKGxwaj0yOTk4MjcyKQpDUFU6IEFm dGVyIGdlbmVyaWMgaWRlbnRpZnksIGNhcHM6IGJmZWJmYmZmIDAwMDAwMDAw IDAwMDAwMDAwIDAwMDAwMDAwIDAwMDA0NDFkIDAwMDAwMDAwIDAwMDAwMDAw CkNQVTogQWZ0ZXIgdmVuZG9yIGlkZW50aWZ5LCBjYXBzOiBiZmViZmJmZiAw MDAwMDAwMCAwMDAwMDAwMCAwMDAwMDAwMCAwMDAwNDQxZCAwMDAwMDAwMCAw 
MDAwMDAwMAptb25pdG9yL213YWl0IGZlYXR1cmUgcHJlc2VudC4KQ1BVOiBU cmFjZSBjYWNoZTogMTJLIHVvcHMsIEwxIEQgY2FjaGU6IDE2SwpDUFU6IEwy IGNhY2hlOiAxMDI0SwpDUFU6IFBoeXNpY2FsIFByb2Nlc3NvciBJRDogMApD UFU6IEFmdGVyIGFsbCBpbml0cywgY2FwczogYmZlYmZiZmYgMDAwMDAwMDAg MDAwMDAwMDAgMDAwMDAwODAgMDAwMDQ0MWQgMDAwMDAwMDAgMDAwMDAwMDAK SW50ZWwgbWFjaGluZSBjaGVjayBhcmNoaXRlY3R1cmUgc3VwcG9ydGVkLgpJ bnRlbCBtYWNoaW5lIGNoZWNrIHJlcG9ydGluZyBlbmFibGVkIG9uIENQVSMx LgpDUFUxOiBJbnRlbCBQNC9YZW9uIEV4dGVuZGVkIE1DRSBNU1JzICgxMikg YXZhaWxhYmxlCkNQVTE6IFRoZXJtYWwgbW9uaXRvcmluZyBlbmFibGVkCkNQ VTE6IEludGVsKFIpIFBlbnRpdW0oUikgNCBDUFUgMy4wMEdIeiBzdGVwcGlu ZyAwMQpUb3RhbCBvZiAyIHByb2Nlc3NvcnMgYWN0aXZhdGVkICgxMTkyNy41 NSBCb2dvTUlQUykuCkVOQUJMSU5HIElPLUFQSUMgSVJRcwouLlRJTUVSOiB2 ZWN0b3I9MHgzMSBwaW4xPTIgcGluMj0tMQouLk1QLUJJT1MgYnVnOiA4MjU0 IHRpbWVyIG5vdCBjb25uZWN0ZWQgdG8gSU8tQVBJQwouLi50cnlpbmcgdG8g c2V0IHVwIHRpbWVyIChJUlEwKSB0aHJvdWdoIHRoZSA4MjU5QSAuLi4gIGZh aWxlZC4KLi4udHJ5aW5nIHRvIHNldCB1cCB0aW1lciBhcyBWaXJ0dWFsIFdp cmUgSVJRLi4uIHdvcmtzLgpjaGVja2luZyBUU0Mgc3luY2hyb25pemF0aW9u IGFjcm9zcyAyIENQVXM6IHBhc3NlZC4KQnJvdWdodCB1cCAyIENQVXMKQ1BV MCBhdHRhY2hpbmcgc2NoZWQtZG9tYWluOgogZG9tYWluIDA6IHNwYW4gMwog IGdyb3VwczogMSAyCiAgZG9tYWluIDE6IHNwYW4gMwogICBncm91cHM6IDMK Q1BVMSBhdHRhY2hpbmcgc2NoZWQtZG9tYWluOgogZG9tYWluIDA6IHNwYW4g MwogIGdyb3VwczogMiAxCiAgZG9tYWluIDE6IHNwYW4gMwogICBncm91cHM6 IDMKY2hlY2tpbmcgaWYgaW1hZ2UgaXMgaW5pdHJhbWZzLi4uaXQgaXNuJ3Qg KG5vIGNwaW8gbWFnaWMpOyBsb29rcyBsaWtlIGFuIGluaXRyZApGcmVlaW5n IGluaXRyZCBtZW1vcnk6IDE1OTNrIGZyZWVkCk5FVDogUmVnaXN0ZXJlZCBw cm90b2NvbCBmYW1pbHkgMTYKUENJOiBQQ0kgQklPUyByZXZpc2lvbiAyLjEw IGVudHJ5IGF0IDB4ZmJiMjAsIGxhc3QgYnVzPTEKUENJOiBVc2luZyBjb25m aWd1cmF0aW9uIHR5cGUgMQptdHJyOiB2Mi4wICgyMDAyMDUxOSkKQUNQSTog U3Vic3lzdGVtIHJldmlzaW9uIDIwMDUwMjExCkFDUEk6IEludGVycHJldGVy IGVuYWJsZWQKQUNQSTogVXNpbmcgSU9BUElDIGZvciBpbnRlcnJ1cHQgcm91 dGluZwpBQ1BJOiBQQ0kgUm9vdCBCcmlkZ2UgW1BDSTBdICgwMDowMCkKUENJ OiBQcm9iaW5nIFBDSSBoYXJkd2FyZSAoYnVzIDAwKQpQQ0k6IElnbm9yaW5n 
IEJBUjAtMyBvZiBJREUgY29udHJvbGxlciAwMDAwOjAwOjAyLjUKQUNQSTog UENJIEludGVycnVwdCBSb3V0aW5nIFRhYmxlIFtcX1NCXy5QQ0kwLl9QUlRd CkFDUEk6IFBDSSBJbnRlcnJ1cHQgTGluayBbTE5LQV0gKElSUXMgMyA0IDUg NiA3IDkgKjEwIDExIDEyIDE0IDE1KQpBQ1BJOiBQQ0kgSW50ZXJydXB0IExp bmsgW0xOS0JdIChJUlFzIDMgNCA1IDYgNyA5IDEwIDExIDEyIDE0IDE1KSAq MCwgZGlzYWJsZWQuCkFDUEk6IFBDSSBJbnRlcnJ1cHQgTGluayBbTE5LQ10g KElSUXMgMyA0ICo1IDYgNyA5IDEwIDExIDEyIDE0IDE1KQpBQ1BJOiBQQ0kg SW50ZXJydXB0IExpbmsgW0xOS0RdIChJUlFzIDMgNCA1IDYgNyA5IDEwICox MSAxMiAxNCAxNSkKQUNQSTogUENJIEludGVycnVwdCBMaW5rIFtMTktFXSAo SVJRcyAzIDQgNSA2IDcgOSAqMTAgMTEgMTIgMTQgMTUpCkFDUEk6IFBDSSBJ bnRlcnJ1cHQgTGluayBbTE5LRl0gKElSUXMgMyA0IDUgNiA3IDkgMTAgKjEx IDEyIDE0IDE1KQpBQ1BJOiBQQ0kgSW50ZXJydXB0IExpbmsgW0xOS0ddIChJ UlFzIDMgNCA1IDYgNyAqOSAxMCAxMSAxMiAxNCAxNSkKQUNQSTogUENJIElu dGVycnVwdCBMaW5rIFtMTktIXSAoSVJRcyAqMyA0IDUgNiA3IDkgMTAgMTEg MTIgMTQgMTUpCkxpbnV4IFBsdWcgYW5kIFBsYXkgU3VwcG9ydCB2MC45NyAo YykgQWRhbSBCZWxheQpwbnA6IFBuUCBBQ1BJIGluaXQKcG5wOiBQblAgQUNQ STogZm91bmQgOSBkZXZpY2VzClNDU0kgc3Vic3lzdGVtIGluaXRpYWxpemVk ClBDSTogVXNpbmcgQUNQSSBmb3IgSVJRIHJvdXRpbmcKKiogUENJIGludGVy cnVwdHMgYXJlIG5vIGxvbmdlciByb3V0ZWQgYXV0b21hdGljYWxseS4gIElm IHRoaXMKKiogY2F1c2VzIGEgZGV2aWNlIHRvIHN0b3Agd29ya2luZywgaXQg aXMgcHJvYmFibHkgYmVjYXVzZSB0aGUKKiogZHJpdmVyIGZhaWxlZCB0byBj YWxsIHBjaV9lbmFibGVfZGV2aWNlKCkuICBBcyBhIHRlbXBvcmFyeQoqKiB3 b3JrYXJvdW5kLCB0aGUgInBjaT1yb3V0ZWlycSIgYXJndW1lbnQgcmVzdG9y ZXMgdGhlIG9sZAoqKiBiZWhhdmlvci4gIElmIHRoaXMgYXJndW1lbnQgbWFr ZXMgdGhlIGRldmljZSB3b3JrIGFnYWluLAoqKiBwbGVhc2UgZW1haWwgdGhl IG91dHB1dCBvZiAibHNwY2kiIHRvIGJqb3JuLmhlbGdhYXNAaHAuY29tCioq IHNvIEkgY2FuIGZpeCB0aGUgZHJpdmVyLgpNYWNoaW5lIGNoZWNrIGV4Y2Vw dGlvbiBwb2xsaW5nIHRpbWVyIHN0YXJ0ZWQuCmFwbTogQklPUyB2ZXJzaW9u IDEuMiBGbGFncyAweDA3IChEcml2ZXIgdmVyc2lvbiAxLjE2YWMpCmFwbTog ZGlzYWJsZWQgLSBBUE0gaXMgbm90IFNNUCBzYWZlLgppbm90aWZ5IGRldmlj ZSBtaW5vcj02MwpkZXZmczogMjAwNC0wMS0zMSBSaWNoYXJkIEdvb2NoIChy Z29vY2hAYXRuZi5jc2lyby5hdSkKZGV2ZnM6IGJvb3Rfb3B0aW9uczogMHgw 
Ckluc3RhbGxpbmcga25mc2QgKGNvcHlyaWdodCAoQykgMTk5NiBva2lyQG1v bmFkLnN3Yi5kZSkuClNHSSBYRlMgd2l0aCBsYXJnZSBibG9jayBudW1iZXJz LCBubyBkZWJ1ZyBlbmFibGVkCkluaXRpYWxpemluZyBDcnlwdG9ncmFwaGlj IEFQSQpSZWFsIFRpbWUgQ2xvY2sgRHJpdmVyIHYxLjEyCk5vbi12b2xhdGls ZSBtZW1vcnkgZHJpdmVyIHYxLjIKQUNQSTogUG93ZXIgQnV0dG9uIChGRikg W1BXUkZdCkFDUEk6IEZhbiBbRkFOXSAob24pCkFDUEk6IFRoZXJtYWwgWm9u ZSBbVEhSTV0gKDQxIEMpCnNlcmlvOiBpODA0MiBBVVggcG9ydCBhdCAweDYw LDB4NjQgaXJxIDEyCnNlcmlvOiBpODA0MiBLQkQgcG9ydCBhdCAweDYwLDB4 NjQgaXJxIDEKU2VyaWFsOiA4MjUwLzE2NTUwIGRyaXZlciAkUmV2aXNpb246 IDEuOTAgJCA4IHBvcnRzLCBJUlEgc2hhcmluZyBkaXNhYmxlZAp0dHlTMCBh dCBJL08gMHgzZjggKGlycSA9IDQpIGlzIGEgMTY1NTBBCnR0eVMwIGF0IEkv TyAweDNmOCAoaXJxID0gNCkgaXMgYSAxNjU1MEEKbWljZTogUFMvMiBtb3Vz ZSBkZXZpY2UgY29tbW9uIGZvciBhbGwgbWljZQppbnB1dDogQVQgVHJhbnNs YXRlZCBTZXQgMiBrZXlib2FyZCBvbiBpc2EwMDYwL3NlcmlvMAppbnB1dDog UFMvMiBHZW5lcmljIE1vdXNlIG9uIGlzYTAwNjAvc2VyaW8xCmlvIHNjaGVk dWxlciBub29wIHJlZ2lzdGVyZWQKaW8gc2NoZWR1bGVyIGFudGljaXBhdG9y eSByZWdpc3RlcmVkCmlvIHNjaGVkdWxlciBkZWFkbGluZSByZWdpc3RlcmVk CmlvIHNjaGVkdWxlciBjZnEgcmVnaXN0ZXJlZApGbG9wcHkgZHJpdmUocyk6 IGZkMCBpcyAxLjQ0TQpGREMgMCBpcyBhIHBvc3QtMTk5MSA4MjA3NwpSQU1E SVNLIGRyaXZlciBpbml0aWFsaXplZDogMTYgUkFNIGRpc2tzIG9mIDgxOTJL IHNpemUgMTAyNCBibG9ja3NpemUKbG9vcDogbG9hZGVkIChtYXggOCBkZXZp Y2VzKQpVbmlmb3JtIE11bHRpLVBsYXRmb3JtIEUtSURFIGRyaXZlciBSZXZp c2lvbjogNy4wMGFscGhhMgppZGU6IEFzc3VtaW5nIDMzTUh6IHN5c3RlbSBi dXMgc3BlZWQgZm9yIFBJTyBtb2Rlczsgb3ZlcnJpZGUgd2l0aCBpZGVidXM9 eHgKU0lTNTUxMzogSURFIGNvbnRyb2xsZXIgYXQgUENJIHNsb3QgMDAwMDow MDowMi41CkFDUEk6IFBDSSBpbnRlcnJ1cHQgMDAwMDowMDowMi41W0FdIC0+ IEdTSSAxNiAobGV2ZWwsIGxvdykgLT4gSVJRIDE2ClNJUzU1MTM6IGNoaXBz ZXQgcmV2aXNpb24gMQpTSVM1NTEzOiBub3QgMTAwJSBuYXRpdmUgbW9kZTog d2lsbCBwcm9iZSBpcnFzIGxhdGVyClNJUzU1MTM6IFNpUyA5NjIvOTYzIE11 VElPTCBJREUgVURNQTEzMyBjb250cm9sbGVyCiAgICBpZGUwOiBCTS1ETUEg YXQgMHg0MDAwLTB4NDAwNywgQklPUyBzZXR0aW5nczogaGRhOkRNQSwgaGRi OnBpbwogICAgaWRlMTogQk0tRE1BIGF0IDB4NDAwOC0weDQwMGYsIEJJT1Mg 
c2V0dGluZ3M6IGhkYzpETUEsIGhkZDpwaW8KUHJvYmluZyBJREUgaW50ZXJm YWNlIGlkZTAuLi4KaGRhOiBNYXh0b3IgNlkxMjBMMCwgQVRBIERJU0sgZHJp dmUKaWRlMCBhdCAweDFmMC0weDFmNywweDNmNiBvbiBpcnEgMTQKUHJvYmlu ZyBJREUgaW50ZXJmYWNlIGlkZTEuLi4KaGRjOiBDT01CTy01MlgxNkMsIEFU QVBJIENEL0RWRC1ST00gZHJpdmUKaWRlMSBhdCAweDE3MC0weDE3NywweDM3 NiBvbiBpcnEgMTUKUHJvYmluZyBJREUgaW50ZXJmYWNlIGlkZTIuLi4KUHJv YmluZyBJREUgaW50ZXJmYWNlIGlkZTMuLi4KUHJvYmluZyBJREUgaW50ZXJm YWNlIGlkZTQuLi4KUHJvYmluZyBJREUgaW50ZXJmYWNlIGlkZTUuLi4KaGRh OiBtYXggcmVxdWVzdCBzaXplOiAxMjhLaUIKaGRhOiAyNDAxMjE3Mjggc2Vj dG9ycyAoMTIyOTQyIE1CKSB3LzIwNDhLaUIgQ2FjaGUsIENIUz02NTUzNS8x Ni82MywgVURNQSgxMzMpCmhkYTogY2FjaGUgZmx1c2hlcyBzdXBwb3J0ZWQK IC9kZXYvaWRlL2hvc3QwL2J1czAvdGFyZ2V0MC9sdW4wOiBwMSBwMiBwMwpo ZGM6IEFUQVBJIDUyWCBEVkQtUk9NIENELVIvUlcgZHJpdmUsIDIwNDhrQiBD YWNoZSwgVURNQSgzMykKVW5pZm9ybSBDRC1ST00gZHJpdmVyIFJldmlzaW9u OiAzLjIwCnN0OiBWZXJzaW9uIDIwMDQxMDI1LCBmaXhlZCBidWZzaXplIDMy NzY4LCBzL2cgc2VncyAyNTYKTkVUOiBSZWdpc3RlcmVkIHByb3RvY29sIGZh bWlseSAyCklQOiByb3V0aW5nIGNhY2hlIGhhc2ggdGFibGUgb2YgNDA5NiBi dWNrZXRzLCAzMktieXRlcwpUQ1AgZXN0YWJsaXNoZWQgaGFzaCB0YWJsZSBl bnRyaWVzOiAxNjM4NCAob3JkZXI6IDUsIDEzMTA3MiBieXRlcykKVENQIGJp bmQgaGFzaCB0YWJsZSBlbnRyaWVzOiAxNjM4NCAob3JkZXI6IDUsIDEzMTA3 MiBieXRlcykKVENQOiBIYXNoIHRhYmxlcyBjb25maWd1cmVkIChlc3RhYmxp c2hlZCAxNjM4NCBiaW5kIDE2Mzg0KQpORVQ6IFJlZ2lzdGVyZWQgcHJvdG9j b2wgZmFtaWx5IDEKTkVUOiBSZWdpc3RlcmVkIHByb3RvY29sIGZhbWlseSAx NwpwNC1jbG9ja21vZDogUDQvWGVvbihUTSkgQ1BVIE9uLURlbWFuZCBDbG9j ayBNb2R1bGF0aW9uIGF2YWlsYWJsZQpTdGFydGluZyBiYWxhbmNlZF9pcnEK QUNQSSB3YWtldXAgZGV2aWNlczogClBDSTAgVVNCMCBVU0IxIFVTQjIgVVNC MyBNQUMwIEFNUjAgVUFSMSAKQUNQSTogKHN1cHBvcnRzIFMwIFMxIFM0IFM1 KQpSQU1ESVNLOiBDb21wcmVzc2VkIGltYWdlIGZvdW5kIGF0IGJsb2NrIDAK VkZTOiBNb3VudGVkIHJvb3QgKGV4dDIgZmlsZXN5c3RlbSkgcmVhZG9ubHku CkZyZWVpbmcgdW51c2VkIGtlcm5lbCBtZW1vcnk6IDIyNGsgZnJlZWQKdXNi Y29yZTogcmVnaXN0ZXJlZCBuZXcgZHJpdmVyIHVzYmZzCnVzYmNvcmU6IHJl Z2lzdGVyZWQgbmV3IGRyaXZlciBodWIKQUNQSTogUENJIGludGVycnVwdCAw 
MDAwOjAwOjAzLjNbRF0gLT4gR1NJIDIzIChsZXZlbCwgbG93KSAtPiBJUlEg MjMKZWhjaV9oY2QgMDAwMDowMDowMy4zOiBTaWxpY29uIEludGVncmF0ZWQg U3lzdGVtcyBbU2lTXSBVU0IgMi4wIENvbnRyb2xsZXIKZWhjaV9oY2QgMDAw MDowMDowMy4zOiBpcnEgMjMsIHBjaSBtZW0gMHhlZDUwMTAwMAplaGNpX2hj ZCAwMDAwOjAwOjAzLjM6IG5ldyBVU0IgYnVzIHJlZ2lzdGVyZWQsIGFzc2ln bmVkIGJ1cyBudW1iZXIgMQpQQ0k6IGNhY2hlIGxpbmUgc2l6ZSBvZiAxMjgg aXMgbm90IHN1cHBvcnRlZCBieSBkZXZpY2UgMDAwMDowMDowMy4zCmVoY2lf aGNkIDAwMDA6MDA6MDMuMzogVVNCIDIuMCBpbml0aWFsaXplZCwgRUhDSSAx LjAwLCBkcml2ZXIgMTAgRGVjIDIwMDQKaHViIDEtMDoxLjA6IFVTQiBodWIg Zm91bmQKaHViIDEtMDoxLjA6IDggcG9ydHMgZGV0ZWN0ZWQKdXNiY29yZTog cmVnaXN0ZXJlZCBuZXcgZHJpdmVyIGhpZGRldgp1c2Jjb3JlOiByZWdpc3Rl cmVkIG5ldyBkcml2ZXIgdXNiaGlkCmRyaXZlcnMvdXNiL2lucHV0L2hpZC1j b3JlLmM6IHYyLjA6VVNCIEhJRCBjb3JlIGRyaXZlcgpJbml0aWFsaXppbmcg VVNCIE1hc3MgU3RvcmFnZSBkcml2ZXIuLi4KdXNiY29yZTogcmVnaXN0ZXJl ZCBuZXcgZHJpdmVyIHVzYi1zdG9yYWdlClVTQiBNYXNzIFN0b3JhZ2Ugc3Vw cG9ydCByZWdpc3RlcmVkLgpVU0IgVW5pdmVyc2FsIEhvc3QgQ29udHJvbGxl ciBJbnRlcmZhY2UgZHJpdmVyIHYyLjIKb2hjaV9oY2Q6IDIwMDQgTm92IDA4 IFVTQiAxLjEgJ09wZW4nIEhvc3QgQ29udHJvbGxlciAoT0hDSSkgRHJpdmVy IChQQ0kpCkFDUEk6IFBDSSBpbnRlcnJ1cHQgMDAwMDowMDowMy4wW0FdIC0+ IEdTSSAyMCAobGV2ZWwsIGxvdykgLT4gSVJRIDIwCm9oY2lfaGNkIDAwMDA6 MDA6MDMuMDogU2lsaWNvbiBJbnRlZ3JhdGVkIFN5c3RlbXMgW1NpU10gVVNC IDEuMCBDb250cm9sbGVyCm9oY2lfaGNkIDAwMDA6MDA6MDMuMDogaXJxIDIw LCBwY2kgbWVtIDB4ZWQ1MDMwMDAKb2hjaV9oY2QgMDAwMDowMDowMy4wOiBu ZXcgVVNCIGJ1cyByZWdpc3RlcmVkLCBhc3NpZ25lZCBidXMgbnVtYmVyIDIK aHViIDItMDoxLjA6IFVTQiBodWIgZm91bmQKaHViIDItMDoxLjA6IDMgcG9y dHMgZGV0ZWN0ZWQKQUNQSTogUENJIGludGVycnVwdCAwMDAwOjAwOjAzLjFb Ql0gLT4gR1NJIDIxIChsZXZlbCwgbG93KSAtPiBJUlEgMjEKb2hjaV9oY2Qg MDAwMDowMDowMy4xOiBTaWxpY29uIEludGVncmF0ZWQgU3lzdGVtcyBbU2lT XSBVU0IgMS4wIENvbnRyb2xsZXIgKCMyKQpvaGNpX2hjZCAwMDAwOjAwOjAz LjE6IGlycSAyMSwgcGNpIG1lbSAweGVkNTA0MDAwCm9oY2lfaGNkIDAwMDA6 MDA6MDMuMTogbmV3IFVTQiBidXMgcmVnaXN0ZXJlZCwgYXNzaWduZWQgYnVz IG51bWJlciAzCmh1YiAzLTA6MS4wOiBVU0IgaHViIGZvdW5kCmh1YiAzLTA6 
MS4wOiAzIHBvcnRzIGRldGVjdGVkCkFDUEk6IFBDSSBpbnRlcnJ1cHQgMDAw MDowMDowMy4yW0NdIC0+IEdTSSAyMiAobGV2ZWwsIGxvdykgLT4gSVJRIDIy Cm9oY2lfaGNkIDAwMDA6MDA6MDMuMjogU2lsaWNvbiBJbnRlZ3JhdGVkIFN5 c3RlbXMgW1NpU10gVVNCIDEuMCBDb250cm9sbGVyICgjMykKb2hjaV9oY2Qg MDAwMDowMDowMy4yOiBpcnEgMjIsIHBjaSBtZW0gMHhlZDUwMDAwMApvaGNp X2hjZCAwMDAwOjAwOjAzLjI6IG5ldyBVU0IgYnVzIHJlZ2lzdGVyZWQsIGFz c2lnbmVkIGJ1cyBudW1iZXIgNApodWIgNC0wOjEuMDogVVNCIGh1YiBmb3Vu ZApodWIgNC0wOjEuMDogMiBwb3J0cyBkZXRlY3RlZApzYnAyOiAkUmV2OiAx MjE5ICQgQmVuIENvbGxpbnMgPGJjb2xsaW5zQGRlYmlhbi5vcmc+CmRldmlj ZS1tYXBwZXI6IDQuNC4wLWlvY3RsICgyMDA1LTAxLTEyKSBpbml0aWFsaXNl ZDogZG0tZGV2ZWxAcmVkaGF0LmNvbQpsaWJhdGEgdmVyc2lvbiAxLjEwIGxv YWRlZC4KUmVpc2VyRlM6IGhkYTM6IGZvdW5kIHJlaXNlcmZzIGZvcm1hdCAi My42IiB3aXRoIHN0YW5kYXJkIGpvdXJuYWwKUmVpc2VyRlM6IGhkYTM6IHVz aW5nIG9yZGVyZWQgZGF0YSBtb2RlClJlaXNlckZTOiBoZGEzOiBqb3VybmFs IHBhcmFtczogZGV2aWNlIGhkYTMsIHNpemUgODE5Miwgam91cm5hbCBmaXJz dCBibG9jayAxOCwgbWF4IHRyYW5zIGxlbiAxMDI0LCBtYXggYmF0Y2ggOTAw LCBtYXggY29tbWl0IGFnZSAzMCwgbWF4IHRyYW5zIGFnZSAzMApSZWlzZXJG UzogaGRhMzogY2hlY2tpbmcgdHJhbnNhY3Rpb24gbG9nIChoZGEzKQpSZWlz ZXJGUzogaGRhMzogVXNpbmcgcjUgaGFzaCB0byBzb3J0IG5hbWVzCkFkZGlu ZyAxOTUzNDk2ayBzd2FwIG9uIC9kZXYvaGRhMi4gIFByaW9yaXR5Oi0xIGV4 dGVudHM6MQpMaW51eCBhZ3BnYXJ0IGludGVyZmFjZSB2MC4xMDAgKGMpIERh dmUgSm9uZXMKYWdwZ2FydDogRGV0ZWN0ZWQgU2lTIDY2MSBjaGlwc2V0CmFn cGdhcnQ6IE1heGltdW0gbWFpbiBtZW1vcnkgdG8gdXNlIGZvciBhZ3AgbWVt b3J5OiA0MDlNCmFncGdhcnQ6IEFHUCBhcGVydHVyZSBpcyA2NE0gQCAweGU4 MDAwMDAwCkFDUEk6IFBDSSBpbnRlcnJ1cHQgMDAwMDowMDowMi43W0NdIC0+ IEdTSSAxOCAobGV2ZWwsIGxvdykgLT4gSVJRIDE4CmludGVsOHgwX21lYXN1 cmVfYWM5N19jbG9jazogbWVhc3VyZWQgNTA2ODUgdXNlY3MKaW50ZWw4eDA6 IGNsb2NraW5nIHRvIDQ4MDAwCnNpczkwMC5jOiB2MS4wOC4wNyAxMS8wMi8y MDAzCkFDUEk6IFBDSSBpbnRlcnJ1cHQgMDAwMDowMDowNC4wW0FdIC0+IEdT SSAxOSAobGV2ZWwsIGxvdykgLT4gSVJRIDE5CjAwMDA6MDA6MDQuMDogVW5r bm93biBQSFkgdHJhbnNjZWl2ZXIgZm91bmQgYXQgYWRkcmVzcyAxLgowMDAw OjAwOjA0LjA6IFVzaW5nIHRyYW5zY2VpdmVyIGZvdW5kIGF0IGFkZHJlc3Mg 
MSBhcyBkZWZhdWx0CmV0aDA6IFNpUyA5MDAgUENJIEZhc3QgRXRoZXJuZXQg YXQgMHhlODAwLCBJUlEgMTksIDAwOjAxOjZjOmEwOmE4OjI2LgpwYXJwb3J0 OiBQblBCSU9TIHBhcnBvcnQgZGV0ZWN0ZWQuCnBhcnBvcnQwOiBQQy1zdHls ZSBhdCAweDM3OCAoMHg3NzgpLCBpcnEgNyBbUENTUFAsVFJJU1RBVEVdCg== --0-924250615-1120778839=:64549-- From mitch.a.williams@intel.com Thu Jul 7 16:34:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 16:34:21 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67NYJH9032540 for ; Thu, 7 Jul 2005 16:34:19 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j67NWXGR029834; Thu, 7 Jul 2005 23:32:33 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j67NWWVc004723; Thu, 7 Jul 2005 23:32:32 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.58]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j67NWWSL014581; Thu, 7 Jul 2005 16:32:32 -0700 Date: Thu, 7 Jul 2005 16:32:30 -0700 From: Mitch Williams X-X-Sender: mawilli1@mawilli1-desk2.amr.corp.intel.com To: Dmitry Torokhov cc: "Williams, Mitch A" , netdev@oss.sgi.com, "Godse, Radheka" , fubar@us.ibm.com, bonding-devel@lists.sourceforge.net Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) In-Reply-To: <20050706190906.74128.qmail@web81301.mail.yahoo.com> Message-ID: References: <20050706190906.74128.qmail@web81301.mail.yahoo.com> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2691 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 253 Lines: 12 On Wed, 6 Jul 2005, Dmitry Torokhov wrote: > Semaphore will not help in scenario you described: Phooey. You're absolutely right. I'll have to go back and rework this stuff using the reference count. Thanks again for your efforts on this. -Mitch From greg@kroah.com Thu Jul 7 16:37:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 16:37:24 -0700 (PDT) Received: from perch.kroah.org (mail.kroah.org [69.55.234.183]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j67NbLH9000723 for ; Thu, 7 Jul 2005 16:37:21 -0700 Received: from [192.168.0.10] (c-24-22-115-24.hsd1.or.comcast.net [24.22.115.24]) (authenticated) by perch.kroah.org (8.11.6/8.11.6) with ESMTP id j67NZ8q31177; Thu, 7 Jul 2005 16:35:09 -0700 Received: from greg by echidna.kroah.org with local (masqmail 0.2.19) id 1Dqfa0-0GZ-00; Thu, 07 Jul 2005 16:14:52 -0700 Date: Thu, 7 Jul 2005 16:14:52 -0700 From: Greg KH To: Mitch Williams Cc: "John W. Linville" , Radheka Godse , netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, fubar@us.ibm.com Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: <20050707231451.GA936@kroah.com> References: <20050702081346.GA20789@kroah.com> <20050706195232.GB18359@kroah.com> <20050707142544.GA9418@tuxdriver.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.8i X-archive-position: 2692 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greg@kroah.com Precedence: bulk X-list: netdev Content-Length: 3901 Lines: 102 On Thu, Jul 07, 2005 at 04:06:38PM -0700, Mitch Williams wrote: > On Thu, 7 Jul 2005, John W. 
Linville wrote: > > On Wed, Jul 06, 2005 at 12:52:32PM -0700, Greg KH wrote: > > > On Wed, Jul 06, 2005 at 11:53:13AM -0700, Mitch Williams wrote: > > > > > > > > How about this: > > > bond_add - write to this to add a new bond, one value only. > > > bond_remove - write to this to remove a bond that is present. > > > bonds/bond0 > > > bonds/bond1 > > > bonds/bond2 > > > ... > > > - list of bonds currently present. If you want, you > > > could make those bondX files directories, and put > > > other info about the individual bonds in there, if you > > > need it (I know nothing about the bonding intrerface, > > > sorry.) > > > > > > Would that work? > > > > I like that suggestion. It keeps the interface creation/deletion a > > little more independent of each other. > > > > We looked into a scheme much like this, but rejected it early on. > > Actually, what Greg is proposing is two things: 1) move the individual > bond interface directories down a level, into /sys/class/net, and 2) > change bonding_masters into a set of control files, instead of one file. > > Moving the individual bond directories to a bonds/ directory > is problematic. Because each bond shows up a just another network > interface, they show up in /sys/class/net automatically. We'd have to > make a bunch of changes to the device model to account for bonding, and > we'd break the semantics of the /sys/class/net hierarchy. Why not just put them in /sys/class/bond/ instead? > The problem, then, becomes one of separating the bond interfaces from the > non-bond interfaces. See proposal above. > The bonding_masters file is a simple solution to > this problem. Reading the file gives the set of active bonds, and writing > the file changes the set of active bonds. As I stated before, a cursory > reading of Documentation/filesystems/sysfs.txt indicates that such a usage > is "socially acceptable". (Or at least it was to Patrick Mochel back in > January of 2003.) Pat was just trying to be nice. I'm not. 
:)

Also, if you have too many bonds, your code will fail.

> My other major difficulty with the bond_add/bond_remove scheme is that
> these files would act differently than any other sysfs files, in that
> their read and write semantics are not the same.

Not so at all.

Just don't make them readable. Lots of sysfs files are write only.

> What I mean is that any given sysfs "file" will appear to contain the same
> data for both read and write. Most scripts that handle sysfs do some sort
> of read-modify-write operation. This would not be possible with the type
> of scheme Greg KH has outlined.

Again, no, lots of sysfs files work this way today.

> Furthermore, what happens when you read bond_add and bond_remove?

-EPERM is returned? Or is it -EIO? I think it's the latter if you just
don't provide a read function; I haven't tried it in a while.

> What do you use to get a list of existing bond interfaces?

ls /sys/bond/bonds/

> Reading a file named "bond_add" to get a list of bonds is
> counterintuitive at best, and adding an extra read-only file just to
> get a list of bonds seems cumbersome.

I never suggested reading any files.

> Greg also (in another message) mentioned problems with appending to
> bonding_masters. This currently is a problem, since sysfs itself does not
> handle appends properly.

Because you are not supposed to do that.

> Since there is no concept of a file pointer or
> offset or such when the underlying methods are called, and since sysfs
> happily allows seek operations to succeed, appending ends up destroying
> the contents of the file. I submitted two patches that addressed this
> issue several months ago but got a frosty reception and gave up.

Again, because you are not supposed to do that.
thanks, greg k-h From aaron.f.brown@intel.com Thu Jul 7 17:34:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 17:34:30 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j680YIH9005000 for ; Thu, 7 Jul 2005 17:34:18 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j680W7GR024822; Fri, 8 Jul 2005 00:32:07 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j680VvBM022654; Fri, 8 Jul 2005 00:32:05 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005070717320417073 ; Thu, 07 Jul 2005 17:32:04 -0700 Received: from orsmsx401.amr.corp.intel.com ([192.168.65.207]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Thu, 7 Jul 2005 17:32:04 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Subject: RE: [Bonding-devel] Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Date: Thu, 7 Jul 2005 17:32:03 -0700 Message-ID: <31F5998A44B92447BD334F8FBBA0B01F099F2656@orsmsx401.amr.corp.intel.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [Bonding-devel] Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Thread-Index: AcWDTKtb7F0CvWM4R62+AOcieE95jgAB2pnQ From: "Brown, Aaron F" To: "Greg KH" , "Williams, Mitch A" Cc: "John W. 
Linville" , "Godse, Radheka" , , , X-OriginalArrivalTime: 08 Jul 2005 00:32:04.0884 (UTC) FILETIME=[771D7D40:01C58354] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j680YIH9005000 X-archive-position: 2693 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aaron.f.brown@intel.com Precedence: bulk X-list: netdev Content-Length: 5941 Lines: 166 I'm looking at this from an "end user" perspective. E.g. from the perspective of somebody writing scripts to manipulate bonds, rather then from the perspective of how bonding should be implemented. >-----Original Message----- >From: bonding-devel-admin@lists.sourceforge.net [mailto:bonding-devel- >admin@lists.sourceforge.net] On Behalf Of Greg KH >Sent: Thursday, July 07, 2005 4:15 PM >To: Williams, Mitch A >Cc: John W. Linville; Godse, Radheka; netdev@oss.sgi.com; bonding- >devel@lists.sourceforge.net; fubar@us.ibm.com >Subject: [Bonding-devel] Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS >INTERFACE (large) > >On Thu, Jul 07, 2005 at 04:06:38PM -0700, Mitch Williams wrote: >> On Thu, 7 Jul 2005, John W. Linville wrote: >> > On Wed, Jul 06, 2005 at 12:52:32PM -0700, Greg KH wrote: >> > > On Wed, Jul 06, 2005 at 11:53:13AM -0700, Mitch Williams wrote: >> > >> > > >> > > How about this: >> > > bond_add - write to this to add a new bond, one value only. >> > > bond_remove - write to this to remove a bond that is present. >> > > bonds/bond0 >> > > bonds/bond1 >> > > bonds/bond2 >> > > ... >> > > - list of bonds currently present. If you want, you >> > > could make those bondX files directories, and put >> > > other info about the individual bonds in there, if you >> > > need it (I know nothing about the bonding intrerface, >> > > sorry.) >> > > >> > > Would that work? >> > >> > I like that suggestion. 
It keeps the interface creation/deletion a
>> > little more independent of each other.
>> >
>>
>> We looked into a scheme much like this, but rejected it early on.
>>
>> Actually, what Greg is proposing is two things: 1) move the individual
>> bond interface directories down a level, into /sys/class/net, and 2)
>> change bonding_masters into a set of control files, instead of one file.
>>
>> Moving the individual bond directories to a bonds/ directory
>> is problematic. Because each bond shows up a just another network
>> interface, they show up in /sys/class/net automatically. We'd have to
>> make a bunch of changes to the device model to account for bonding, and
>> we'd break the semantics of the /sys/class/net hierarchy.
>
>Why not just put them in /sys/class/bond/ instead?

Bonding creates a virtual network device, so it seems to fit logically
in /sys/class/net much better than at a level all to itself.

>
>> The problem, then, becomes one of separating the bond interfaces from the
>> non-bond interfaces.
>
>See proposal above.
>
>> The bonding_masters file is a simple solution to
>> this problem. Reading the file gives the set of active bonds, and writing
>> the file changes the set of active bonds. As I stated before, a cursory
>> reading of Documentation/filesystems/sysfs.txt indicates that such a usage
>> is "socially acceptable". (Or at least it was to Patrick Mochel back in
>> January of 2003.)
>
>Pat was just trying to be nice. I'm not. :)
>
>Also, if you have too many bonds, your code will fail.

This is true, but an unlikely event in any real system I am aware of.
If I use the max_bonds load parameter and create, say, 600 bonds (which
will be named bond0, bond1 ... bond599) and then cat the bonding_masters
file, I only see 524 bonds (bond0 ... bond523). However, since a bond
interface requires at least one, and usually more, physical network
devices to be of much benefit, I think it unlikely that anybody will
ever have a real need for that many bonds.
Since bonding really is used for combining 2 or more adapters into a single logical channel it could handle 1048 ports set up in bonds of 2 before this type of failure would appear. > >> My other major difficulty with the bond_add/bond_remove scheme is that >> these files would act differently than any other sysfs files, in that >> their read and write semantics are not the same. > >Not so at all. > >Just don't make them readable. Lots of sysfs files are write only. > >> What I mean is that any given sysfs "file" will appear to contain the >same >> data for both read and write. Most scripts that handle sysfs do some >sort >> of read-modify-write operation. This would not be possible with the type >> of scheme Greg KH has outlined. > >Again, no, lots of sysfs files work this way today. > >> Furthermore, what happens when you read bond_add and bond_remove? > >-EPERM is returned? Or is it -EIO, I think it's the later if you just >don't provide a read function, haven't tried it in a while. > >> What do you use to get a list of existing bond interfaces? > >ls /sys/bond/bonds/ > >> Reading a file named "bond_add" to get a list of bonds is >> counterintuitive at best, and adding an extra read-only file just to >> get a list of bonds seems cumbersome. > >I never suggested reading any files. > >> Greg also (in another message) mentioned problems with appending to >> bonding_masters. This currently is a problem, since sysfs itself does >not >> handle appends properly. > >Because you are not supposed to do that. > >> Since there is no concept of a file pointer or >> offset or such when the underlying methods are called, and since sysfs >> happily allows seek operations to succeed, appending ends up destroying >> the contents of the file. I submitted two patches that addressed this >> issue several months ago but got a frosty reception and gave up. > >Again, because you are not supposed to do that. 
> >thanks, > >greg k-h > > >------------------------------------------------------- >This SF.Net email is sponsored by the 'Do More With Dual!' webinar >happening >July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual >core and dual graphics technology at this free one hour event hosted by HP, >AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar >_______________________________________________ >Bonding-devel mailing list >Bonding-devel@lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/bonding-devel From raghavendra.koushik@neterion.com Thu Jul 7 18:08:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 18:08:08 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68183H9008021 for ; Thu, 7 Jul 2005 18:08:04 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j6816Pcx023424; Thu, 7 Jul 2005 21:06:25 -0400 (EDT) Received: from rkoushik ([10.16.16.57]) by guinness.s2io.com (8.12.6/8.12.6) with ESMTP id j6816NKP022996; Thu, 7 Jul 2005 21:06:23 -0400 (EDT) Message-Id: <200507080106.j6816NKP022996@guinness.s2io.com> From: "Raghavendra Koushik" To: "'Arthur Kepner'" Cc: , , , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Thu, 7 Jul 2005 18:06:19 -0700 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 In-Reply-To: Thread-Index: AcWDSnaRMTyyX4uaSceTxBRTBWaSkAAC7jAg X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2694 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 2821 Lines: 85 > I thought that an mmiowb() was called 
for here (to order the PIO
> writes above more cheaply than doing the readq()). I posted a
> patch like this some time ago:
>
> http://marc.theaimsgroup.com/?l=linux-netdev&m=111508292028110&w=2

On an Altix machine I believe the readq was necessary to flush
the PIO writes. How long did you run the tests? I had seen
in long-duration tests that an occasional write
(TXDL control word and the address) would be missed and the xmit
gets stuck.

>
> FWIW, I've done quite a few performance measurements with the patch
> I posted earlier, and it's worked well. For 1500 byte mtus throughput
> goes up by ~20%. Is even the mmiowb() unnecessary?
>

Was this on a 2.4 kernel? I ask because I think the readq would not have
a significant impact on 2.6 kernels due to TSO (with TSO on, the number
of packets that actually enter the xmit routine would be reduced
approximately 40 times).

> What is the wmb() above for?

wmb() is to ensure ordered PIO writes.

Thanks
- Koushik

> -----Original Message-----
> From: Arthur Kepner [mailto:akepner@sgi.com]
> Sent: Thursday, July 07, 2005 4:15 PM
> To: raghavendra.koushik@neterion.com
> Cc: jgarzik@pobox.com; netdev@oss.sgi.com;
> netdev@vger.kernel.org; ravinandan.arakali@neterion.com;
> leonid.grossman@neterion.com; rapuru.sriram@neterion.com
> Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements
>
>
> On Thu, 7 Jul 2005 raghavendra.koushik@neterion.com wrote:
>
> > .......
> > 2. Removed unnecessary PIOs(read/write of tx_traffic_int and
> > rx_traffic_int) from interrupt handler and removed read of
> > general_int_status register from xmit routine.
> >
> ......
> > @@ -2891,6 +2869,8 @@ int s2io_xmit(struct sk_buff *skb, struc > > val64 = > mac_control->fifos[queue].list_info[put_off].list_phy_addr; > > writeq(val64, &tx_fifo->TxDL_Pointer); > > > > + wmb(); > > + > > val64 = (TX_FIFO_LAST_TXD_NUM(frg_cnt) | TX_FIFO_FIRST_LIST | > > TX_FIFO_LAST_LIST); > > > > @@ -2900,9 +2880,6 @@ int s2io_xmit(struct sk_buff *skb, struc > > #endif > > writeq(val64, &tx_fifo->List_Control); > > > > - /* Perform a PCI read to flush previous writes */ > > - val64 = readq(&bar0->general_int_status); > > - > > put_off++; > > I thought that an mmiowb() was called for here (to order the PIO > writes above more cheaply than doing the readq()). I posted a > patch like this some time ago: > > http://marc.theaimsgroup.com/?l=linux-netdev&m=111508292028110&w=2 > > FWIW, I've done quite a few performance measurements with the patch > I posted earlier, and it's worked well. For 1500 byte mtus throughput > goes up by ~20%. Is even the mmiowb() unnecessary? > > What is the wmb() above for? > > -- > Arthur > From davem@davemloft.net Thu Jul 7 20:02:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 20:02:24 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6832KH9015383 for ; Thu, 7 Jul 2005 20:02:21 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Dqj6Q-0002d2-Fm; Thu, 07 Jul 2005 20:00:34 -0700 Date: Thu, 07 Jul 2005 20:00:34 -0700 (PDT) Message-Id: <20050707.200034.74747399.davem@davemloft.net> To: raghavendra.koushik@neterion.com Cc: akepner@sgi.com, jgarzik@pobox.com, netdev@oss.sgi.com, netdev@vger.kernel.org, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements From: "David S. 
Miller" In-Reply-To: <200507080106.j6816NKP022996@guinness.s2io.com> References: <200507080106.j6816NKP022996@guinness.s2io.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2695 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 526 Lines: 14 From: "Raghavendra Koushik" Date: Thu, 7 Jul 2005 18:06:19 -0700 > wmb() is to ensure ordered PIO writes. wmb() does no such thing. It only has influence on load and store instructions done by the local processor, it has no effect on what the PCI bus may do with PIO writes (ie. post them). If you need a PIO to complete in a specific order, you have to read it back. If you need PIO operations to occur in a specific order wrt. cpu memory operations, mmiowb() is what you need to use. From jgarzik@pobox.com Thu Jul 7 20:09:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 07 Jul 2005 20:09:49 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6839kH9016488 for ; Thu, 7 Jul 2005 20:09:46 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.51 #1 (Red Hat Linux)) id 1DqjDi-0006qK-0i; Fri, 08 Jul 2005 03:08:07 +0000 Message-ID: <42CDEE12.5030100@pobox.com> Date: Thu, 07 Jul 2005 23:08:02 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.2-6 (X11/20050513) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. 
Miller" , raghavendra.koushik@neterion.com CC: akepner@sgi.com, netdev@oss.sgi.com, netdev@vger.kernel.org, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements References: <200507080106.j6816NKP022996@guinness.s2io.com> <20050707.200034.74747399.davem@davemloft.net> In-Reply-To: <20050707.200034.74747399.davem@davemloft.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2696 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 513 Lines: 17 David S. Miller wrote: > If you need a PIO to complete in a specific order, you > have to read it back. If you need PIO operations to occur Correct. A PCI read is the only way to ensure that all the CPU/PCI bridge buffers are flushed to the device. Whenever Arjan and I complain about "PCI posting" problems, we are indicating a need for additional readl() calls to ensure ordering/flushing. Delaying immediately after a writel() is a classic PCI posting mistake. Assuming ordering is another. Jeff From davem@davemloft.net Fri Jul 8 00:32:05 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 00:32:09 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j687W4H9007597 for ; Fri, 8 Jul 2005 00:32:05 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DqnJO-0002UA-Gt; Fri, 08 Jul 2005 00:30:14 -0700 Date: Fri, 08 Jul 2005 00:30:14 -0700 (PDT) Message-Id: <20050708.003014.125896217.davem@davemloft.net> To: dada1@cosmosbay.com Cc: tgraf@suug.ch, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. 
Miller" In-Reply-To: <42CE22CE.7030902@cosmosbay.com> References: <20050706124206.GW16076@postel.suug.ch> <20050707.141718.85410359.davem@davemloft.net> <42CE22CE.7030902@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2698 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 3075 Lines: 90 From: Eric Dumazet Date: Fri, 08 Jul 2005 08:53:02 +0200 > About making sk_buff smaller, I use this patch to declare 'struct > sec_path *sp' only ifdef CONFIG_XFRM, what do you think ? I also > use a patch to declare nfcache, nfctinfo and nfct only if > CONFIG_NETFILTER_CONNTRACK or CONFIG_NETFILTER_CONNTRACK_MODULE are > defined, but thats more intrusive. Also, tc_index is not used if > CONFIG_NET_SCHED only is declared but none of CONFIG_NET_SCH_* In my > case, I am using CONFIG_NET_SCHED only to be able to do : tc -s -d > qdisc Distributions enable all of the ifdefs, and that is thus the size and resultant performance most users see. That's why I'm working on shrinking the size assuming all the config options are enabled, because that is the reality for most installations. For all of this stuff we could consider stealing some ideas from BSD, namely doing something similar to their MBUF tags. If a subsystem wants to add a cookie to a networking buffer, it allocates a tag and links it into the struct. So, you basically get away with only one pointer (a struct hlist_head). We could use this for the security, netfilter, and TC stuff. 
I don't know exactly what our tags would look like, but perhaps:

	struct skb_tag;
	struct skb_tag_type {
		void (*destructor)(struct skb_tag *);
		kmem_cache_t *slab_cache;
		const char *name;
	};
	struct skb_tag {
		struct hlist_node list;
		struct skb_tag_type *owner;
		int tag_id;
		char data[0];
	};

	struct sk_buff {
		...
		struct hlist_head tag_list;
		...
	};

Then netfilter does stuff like:

	struct sk_buff *skb;
	struct skb_tag *tag;
	struct conntrack_skb_info *info;

	tag = skb_find_tag(skb, SKB_TAG_NETFILTER_CONNTRACK);
	info = (struct conntrack_skb_info *) tag->data;

etc. etc.

The downsides to this approach are:

1) Tagging an SKB eats a memory allocation, which isn't nice.
   This is mainly why I haven't mentioned this idea before.

   It may be that, on an active system, the per-cpu SLAB caches
   for such tag objects might keep the allocation costs real low.
   Another factor is that tags are relatively tiny, so a large
   number of them fit in one SLAB.

   But on the other hand we've been trying to remove per-packet
   kmalloc() counts, see the SKB fast-clone discussions about that.
   And people ask for SKB recycling all the time.

2) skb_clone() would get more expensive. This is because you'd
   need to clone the SKB tags as well.

   There is the possibility to hang the tags off of the
   skb_shinfo() area. I know this idea sounds crazy, but the theory
   goes that if the netfilter et al. info would change (and thus, so
   would the associated tags), then you'd need to COW the SKB
   anyways.

   This is actually an idea worth considering regardless of whether
   we do tags or not. It would result in less reference counting
   when we clone an SKB with netfilter stuff or security stuff
   attached.

Overall I'm not too thrilled with the idea, but I'm enthusiastic about
being convinced otherwise since this would shrink sk_buff dramatically.
:-) From dada1@cosmosbay.com Fri Jul 8 01:21:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 01:21:36 -0700 (PDT) Received: from gw1.cosmosbay.com (gw1.cosmosbay.com [62.23.185.226]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j688LWH9010905 for ; Fri, 8 Jul 2005 01:21:32 -0700 Received: from [172.16.0.131] (edumazet-port [172.16.0.131]) by gw1.cosmosbay.com (8.13.3/8.13.3) with ESMTP id j688JkOL019253; Fri, 8 Jul 2005 10:19:46 +0200 Message-ID: <42CE3722.3070208@cosmosbay.com> Date: Fri, 08 Jul 2005 10:19:46 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" CC: tgraf@suug.ch, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c References: <20050706124206.GW16076@postel.suug.ch> <20050707.141718.85410359.davem@davemloft.net> <42CE22CE.7030902@cosmosbay.com> <20050708.003014.125896217.davem@davemloft.net> In-Reply-To: <20050708.003014.125896217.davem@davemloft.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [172.16.8.80]); Fri, 08 Jul 2005 10:19:46 +0200 (CEST) X-archive-position: 2699 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 1077 Lines: 31 David S. Miller a écrit : > From: Eric Dumazet > Date: Fri, 08 Jul 2005 08:53:02 +0200 > > >>About making sk_buff smaller, I use this patch to declare 'struct >>sec_path *sp' only ifdef CONFIG_XFRM, what do you think ? I also >>use a patch to declare nfcache, nfctinfo and nfct only if >>CONFIG_NETFILTER_CONNTRACK or CONFIG_NETFILTER_CONNTRACK_MODULE are >>defined, but thats more intrusive. 
Also, tc_index is not used if >>CONFIG_NET_SCHED only is declared but none of CONFIG_NET_SCH_* In my >>case, I am using CONFIG_NET_SCHED only to be able to do : tc -s -d >>qdisc > > > Distributions enable all of the ifdefs, and that is thus the > size and resultant performance most users see. Well, I had this idea because I found another similar use in include/linux/ip.h struct inet_sock { /* sk and pinet6 has to be the first two members of inet_sock */ struct sock sk; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) struct ipv6_pinfo *pinet6; #endif You are right such conditions are distributions nightmare :( Eric From arnaldo.melo@gmail.com Fri Jul 8 04:10:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 04:10:42 -0700 (PDT) Received: from zproxy.gmail.com (zproxy.gmail.com [64.233.162.200]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68BAWH9023634 for ; Fri, 8 Jul 2005 04:10:32 -0700 Received: by zproxy.gmail.com with SMTP id p8so183385nzb for ; Fri, 08 Jul 2005 04:08:54 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:references; b=pvOtFjpFVGDjCkTLOeTyT/IvAFfxMYvxjIezp3v0K58pi2eEEUe2XNsFR0s7gFmCaxpUpVaVpE2Fwl1f4imeK1B36pl7jdBdMeDFSOs0v9NLWpWztQ4SaQBhFr84JQhshkgNE42VhgcFcusXL5UFPWyorx6RVwMvpTAaY0EHPOY= Received: by 10.36.66.5 with SMTP id o5mr669349nza; Fri, 08 Jul 2005 04:08:54 -0700 (PDT) Received: by 10.36.56.14 with HTTP; Fri, 8 Jul 2005 04:08:54 -0700 (PDT) Message-ID: <39e6f6c7050708040877702f6f@mail.gmail.com> Date: Fri, 8 Jul 2005 08:08:54 -0300 From: Arnaldo Carvalho de Melo Reply-To: Arnaldo Carvalho de Melo To: Eric Dumazet Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c Cc: "David S. 
Miller" , tgraf@suug.ch, netdev@oss.sgi.com In-Reply-To: <42CE3722.3070208@cosmosbay.com> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_1221_18469590.1120820934586" References: <20050706124206.GW16076@postel.suug.ch> <20050707.141718.85410359.davem@davemloft.net> <42CE22CE.7030902@cosmosbay.com> <20050708.003014.125896217.davem@davemloft.net> <42CE3722.3070208@cosmosbay.com> X-archive-position: 2700 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: arnaldo.melo@gmail.com Precedence: bulk X-list: netdev Content-Length: 3635 Lines: 88 ------=_Part_1221_18469590.1120820934586 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On 7/8/05, Eric Dumazet wrote: > > David S. Miller a écrit : > > From: Eric Dumazet > > Date: Fri, 08 Jul 2005 08:53:02 +0200 > > > > > >>About making sk_buff smaller, I use this patch to declare 'struct > >>sec_path *sp' only ifdef CONFIG_XFRM, what do you think ? I also > >>use a patch to declare nfcache, nfctinfo and nfct only if > >>CONFIG_NETFILTER_CONNTRACK or CONFIG_NETFILTER_CONNTRACK_MODULE are > >>defined, but thats more intrusive. Also, tc_index is not used if > >>CONFIG_NET_SCHED only is declared but none of CONFIG_NET_SCH_* In my > >>case, I am using CONFIG_NET_SCHED only to be able to do : tc -s -d > >>qdisc > > > > > > Distributions enable all of the ifdefs, and that is thus the > > size and resultant performance most users see. > > Well, I had this idea because I found another similar use in > include/linux/ip.h > > struct inet_sock { > /* sk and pinet6 has to be the first two members of inet_sock */ > struct sock sk; > #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) > struct ipv6_pinfo *pinet6; > #endif /me pleads guilty, Dave, any problem with removing this #ifdef?
Humm, I'll think about using the skb_alloc_extension() idea for struct sock, but this pinet6 sucker is a bit more difficult I guess... - Arnaldo ------=_Part_1221_18469590.1120820934586 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

------=_Part_1221_18469590.1120820934586-- From akepner@sgi.com Fri Jul 8 08:37:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 08:37:11 -0700 (PDT) Received: from omx1.americas.sgi.com (omx1-ext.sgi.com [192.48.179.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68Fb7H9024525 for ; Fri, 8 Jul 2005 08:37:07 -0700 Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by omx1.americas.sgi.com (8.12.10/8.12.9/linux-outbound_gateway-1.1) with ESMTP id j68FZTxT011238 for ; Fri, 8 Jul 2005 10:35:30 -0500 Received: from [192.168.2.20] (mtv-vpn-sw-corp-0-42.corp.sgi.com [134.15.0.42]) by cthulhu.engr.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id j68FZSdP45758656; Fri, 8 Jul 2005 08:35:28 -0700 (PDT) Date: Fri, 8 Jul 2005 08:31:21 -0700 (PDT) From: Arthur Kepner X-X-Sender: akepner@resonance.WorkGroup To: Raghavendra Koushik cc: jgarzik@pobox.com, netdev@oss.sgi.com, netdev@vger.kernel.org, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements In-Reply-To: <200507080106.j6816NKP022996@guinness.s2io.com> Message-ID: References: <200507080106.j6816NKP022996@guinness.s2io.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2701 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akepner@sgi.com Precedence: bulk X-list: netdev Content-Length: 1302 Lines: 37 On Thu, 7 Jul 2005, Raghavendra Koushik wrote: > .... > On an Altix machine I believe the readq was necessary to flush > the PIO writes. How long did you run the tests? I had seen > in long duration tests that an occasional write > (TXDL control word and the address) would be missed and the xmit > Get's stuck. > The most recent tests I did used pktgen, and they ran for a total time of ~.5 hours (changing pkt_size every 30 seconds or so). 
The pktgen tests and other tests (like nttcp) have been run several times, so I've exercised the card for a total of several hours without any problems. > > > > > FWIW, I've done quite a few performance measurements with the patch > > I posted earlier, and it's worked well. For 1500 byte mtus throughput > > goes up by ~20%. Is even the mmiowb() unnecessary? > > > > Was this on 2.4 kernel because I think the readq would not have a > significant impact on 2.6 kernels due to TSO. > (with TSO on the number of packets that actually enter the > Xmit routine would be reduced apprx 40 times). > ..... This was with a 2.6 kernel (with TSO on). PIO reads are pretty expensive on Altix, so eliminating them really helps us. For big mtus (>=4KBytes) the benefit of replacing the readq() with mmiowb() in s2io_xmit() is negligible. -- Arthur From greg@kroah.com Fri Jul 8 10:55:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 10:55:55 -0700 (PDT) Received: from perch.kroah.org (mail.kroah.org [69.55.234.183]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68HtiH9014508 for ; Fri, 8 Jul 2005 10:55:45 -0700 Received: from [192.168.0.10] (c-24-22-115-24.hsd1.or.comcast.net [24.22.115.24]) (authenticated) by perch.kroah.org (8.11.6/8.11.6) with ESMTP id j68HrWq12325; Fri, 8 Jul 2005 10:53:32 -0700 Received: from greg by echidna.kroah.org with local (masqmail 0.2.19) id 1DqwaY-82C-00; Fri, 08 Jul 2005 10:24:34 -0700 Date: Fri, 8 Jul 2005 10:24:34 -0700 From: Greg KH To: "Brown, Aaron F" Cc: "Williams, Mitch A" , "John W. 
Linville" , "Godse, Radheka" , netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, fubar@us.ibm.com Subject: Re: [Bonding-devel] Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: <20050708172434.GI29606@kroah.com> References: <31F5998A44B92447BD334F8FBBA0B01F099F2656@orsmsx401.amr.corp.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <31F5998A44B92447BD334F8FBBA0B01F099F2656@orsmsx401.amr.corp.intel.com> User-Agent: Mutt/1.5.8i X-archive-position: 2702 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greg@kroah.com Precedence: bulk X-list: netdev Content-Length: 1977 Lines: 51 On Thu, Jul 07, 2005 at 05:32:03PM -0700, Brown, Aaron F wrote: > >Why not just put them in /sys/class/bond/ instead? > > Bonding creates a virtual network device, it seems to logically fit down > in /sys/class/net much better then at a level all to itself. Ok, fair enough, it's up to you all where you want to put it, I was just offering a suggestion. > >> The problem, then, becomes one of separating the bond interfaces from > the > >> non-bond interfaces. > > > >See proposal above. > > > >> The bonding_masters file is a simple solution to > >> this problem. Reading the file gives the set of active bonds, and > >writing > >> the file changes the set of active bonds. As I stated before, a > cursory > >> reading of Documentation/filesystems/sysfs.txt indicates that such a > >usage > >> is "socially acceptable". (Or at least it was to Patrick Mochel back > in > >> January of 2003.) > > > >Pat was just trying to be nice. I'm not. :) > > > >Also, if you have too many bonds, your code will fail. > > This is true, but an unlikely event in any real system I am aware of. > If I use the max_bonds load parameter and create say 600 bonds (which > will be named bond0, bond1... 
bond599) then cat out the bonding_masters > file I only see 524 bonds (bond0...bond523.) Yup, don't want to have that happen. So I'm glad you agree with me :) > However, as a bond interface requires at least one but usually more > physical network devices to be of much benefit I see it unlikely that > anybody will really ever have a real need for that many bonds. Since > bonding really is used for combining 2 or more adapters into a single > logical channel it could handle 1048 ports set up in bonds of 2 before > this type of failure would appear. Can you guarantee that no one wants that many bonds? I can't, and I don't think you want to redo your userinterface some time in the future when people ask for this. My proposal has no such limitations. thanks, greg k-h From raghavendra.koushik@neterion.com Fri Jul 8 11:18:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 11:18:04 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68II0H9016607 for ; Fri, 8 Jul 2005 11:18:00 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j68IGLcx026721; Fri, 8 Jul 2005 14:16:21 -0400 (EDT) Received: from rkoushik ([10.16.16.61]) by guinness.s2io.com (8.12.6/8.12.6) with ESMTP id j68IGIKP024269; Fri, 8 Jul 2005 14:16:18 -0400 (EDT) Message-Id: <200507081816.j68IGIKP024269@guinness.s2io.com> From: "Raghavendra Koushik" To: "'Arthur Kepner'" Cc: , , , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Fri, 8 Jul 2005 11:16:14 -0700 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 In-Reply-To: Thread-Index: AcWD0q0SS5czwbPMREWCPAhX7ThQggAFhp3A X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2703 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 1905 Lines: 57 I'll include this fix in the next patch that incorporates any other review comments coming my way.. Thanks for pointing it out. -Koushik > -----Original Message----- > From: Arthur Kepner [mailto:akepner@sgi.com] > Sent: Friday, July 08, 2005 8:31 AM > To: Raghavendra Koushik > Cc: jgarzik@pobox.com; netdev@oss.sgi.com; > netdev@vger.kernel.org; ravinandan.arakali@neterion.com; > leonid.grossman@neterion.com; rapuru.sriram@neterion.com > Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements > > On Thu, 7 Jul 2005, Raghavendra Koushik wrote: > > > .... > > On an Altix machine I believe the readq was necessary to flush > > the PIO writes. How long did you run the tests? I had seen > > in long duration tests that an occasional write > > (TXDL control word and the address) would be missed and the xmit > > Get's stuck. > > > > The most recent tests I did used pktgen, and they ran for a total > time of ~.5 hours (changing pkt_size every 30 seconds or so). The > pktgen tests and other tests (like nttcp) have been run > several times, > so I've exercised the card for a total of several hours without > any problems. > > > > > > > > > FWIW, I've done quite a few performance measurements with > the patch > > > I posted earlier, and it's worked well. For 1500 byte > mtus throughput > > > goes up by ~20%. Is even the mmiowb() unnecessary? > > > > > > > Was this on 2.4 kernel because I think the readq would not have a > > significant impact on 2.6 kernels due to TSO. > > (with TSO on the number of packets that actually enter the > > Xmit routine would be reduced apprx 40 times). > > ..... > > This was with a 2.6 kernel (with TSO on). PIO reads are pretty > expensive on Altix, so eliminating them really helps us. 
> > For big mtus (>=4KBytes) the benefit of replacing the readq() > with mmiowb() in s2io_xmit() is negligible. > > -- > Arthur > From ravinandan.arakali@neterion.com Fri Jul 8 11:19:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 11:19:20 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68IJEH9016906 for ; Fri, 8 Jul 2005 11:19:14 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j68IHZcx026729; Fri, 8 Jul 2005 14:17:35 -0400 (EDT) Received: from rarakali ([10.16.16.57]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j68IHXKP024485; Fri, 8 Jul 2005 14:17:34 -0400 (EDT) From: "Ravinandan Arakali" To: "'Arthur Kepner'" , "'Raghavendra Koushik'" Cc: , , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Fri, 8 Jul 2005 11:17:29 -0700 Message-ID: <000201c583e9$4dad0350$3910100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) In-Reply-To: X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2704 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 1858 Lines: 56 Arthur/David/Jeff, Thanks for pointing that out. We will wait for any other comments on our 12 patches. If there are no other, will send out a patch13 to include the mmiowb() change. 
Thanks, Ravi -----Original Message----- From: Arthur Kepner [mailto:akepner@sgi.com] Sent: Friday, July 08, 2005 8:31 AM To: Raghavendra Koushik Cc: jgarzik@pobox.com; netdev@oss.sgi.com; netdev@vger.kernel.org; ravinandan.arakali@neterion.com; leonid.grossman@neterion.com; rapuru.sriram@neterion.com Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements On Thu, 7 Jul 2005, Raghavendra Koushik wrote: > .... > On an Altix machine I believe the readq was necessary to flush > the PIO writes. How long did you run the tests? I had seen > in long duration tests that an occasional write > (TXDL control word and the address) would be missed and the xmit > Get's stuck. > The most recent tests I did used pktgen, and they ran for a total time of ~.5 hours (changing pkt_size every 30 seconds or so). The pktgen tests and other tests (like nttcp) have been run several times, so I've exercised the card for a total of several hours without any problems. > > > > > FWIW, I've done quite a few performance measurements with the patch > > I posted earlier, and it's worked well. For 1500 byte mtus throughput > > goes up by ~20%. Is even the mmiowb() unnecessary? > > > > Was this on 2.4 kernel because I think the readq would not have a > significant impact on 2.6 kernels due to TSO. > (with TSO on the number of packets that actually enter the > Xmit routine would be reduced apprx 40 times). > ..... This was with a 2.6 kernel (with TSO on). PIO reads are pretty expensive on Altix, so eliminating them really helps us. For big mtus (>=4KBytes) the benefit of replacing the readq() with mmiowb() in s2io_xmit() is negligible. 
-- Arthur From mitch.a.williams@intel.com Fri Jul 8 14:16:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 14:16:50 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68LGlH9008958 for ; Fri, 8 Jul 2005 14:16:47 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j68LExKF013466; Fri, 8 Jul 2005 21:14:59 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j68LExhj028712; Fri, 8 Jul 2005 21:14:59 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.58]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j68LEsSL009987; Fri, 8 Jul 2005 14:14:54 -0700 Date: Fri, 8 Jul 2005 14:14:54 -0700 From: Mitch Williams X-X-Sender: mawilli1@mawilli1-desk2.amr.corp.intel.com To: Greg KH cc: Mitch Williams , "John W. 
Linville" , Radheka Godse , netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, fubar@us.ibm.com, netdev@vger.kernel.org Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) In-Reply-To: <20050707231451.GA936@kroah.com> Message-ID: References: <20050702081346.GA20789@kroah.com> <20050706195232.GB18359@kroah.com> <20050707142544.GA9418@tuxdriver.com> <20050707231451.GA936@kroah.com> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2705 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 1925 Lines: 46 (Adding vger to Cc: list, so as to spread the joy around.) On Thu, 7 Jul 2005, Greg KH wrote: > > > > Moving the individual bond directories to a bonds/ directory > > is problematic. Because each bond shows up as just another network > > interface, they show up in /sys/class/net automatically. We'd have to > > make a bunch of changes to the device model to account for bonding, and > > we'd break the semantics of the /sys/class/net hierarchy. > > Why not just put them in /sys/class/bond/ instead? As Aaron Brown noted in another reply, it's inappropriate for bonding to do this. Bonding is really just another network driver, and its interfaces show up in /sys/class/net courtesy of the device model. We don't need to replicate this capability; we just need to extend it a bit. Adding /sys/class/bond/ would cause a much larger protest than what we're dealing with now (i.e. pretty much only you). > > reading of Documentation/filesystems/sysfs.txt indicates that such a usage > > is "socially acceptable". (Or at least it was to Patrick Mochel back in > > January of 2003.) > > Pat was just trying to be nice. I'm not. :) Well, if "nice" isn't the policy any more, then the doc needs to change. > Just don't make them readable.
Lots of sysfs files are write only. OK, you got me there. I did some poking and you are absolutely correct. However, I did a little more poking, and found a bunch of files that have more than one item in them. HW resource listings, block device scheduling stuff, plenty of examples. Are they socially acceptable? > > bonding_masters. This currently is a problem, since sysfs itself does not > > handle appends properly. > > Because you are not supposed to do that. Sysfs will happily accept O_APPEND opens, and will happily report success on seek attempts, although neither operation is permitted. Whether or not you're supposed to, this behavior is just plain wrong. From greg@kroah.com Fri Jul 8 14:33:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 08 Jul 2005 14:33:15 -0700 (PDT) Received: from perch.kroah.org (mail.kroah.org [69.55.234.183]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j68LXBH9010858 for ; Fri, 8 Jul 2005 14:33:12 -0700 Received: from [192.168.0.10] (c-24-22-115-24.hsd1.or.comcast.net [24.22.115.24]) (authenticated) by perch.kroah.org (8.11.6/8.11.6) with ESMTP id j68LUoq00817; Fri, 8 Jul 2005 14:30:50 -0700 Received: from greg by echidna.kroah.org with local (masqmail 0.2.19) id 1Dr0R7-2wj-00; Fri, 08 Jul 2005 14:31:05 -0700 Date: Fri, 8 Jul 2005 14:31:05 -0700 From: Greg KH To: Mitch Williams Cc: "John W. 
Linville" , Radheka Godse , netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net, fubar@us.ibm.com, netdev@vger.kernel.org Subject: Re: [PATCH 2.6.13-rc1 8/17] bonding: SYSFS INTERFACE (large) Message-ID: <20050708213105.GB22465@kroah.com> References: <20050702081346.GA20789@kroah.com> <20050706195232.GB18359@kroah.com> <20050707142544.GA9418@tuxdriver.com> <20050707231451.GA936@kroah.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.8i X-archive-position: 2706 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greg@kroah.com Precedence: bulk X-list: netdev Content-Length: 2788 Lines: 67 On Fri, Jul 08, 2005 at 02:14:54PM -0700, Mitch Williams wrote: > (Adding vger to Cc: list, so as to spread the joy around.) > > On Thu, 7 Jul 2005, Greg KH wrote: > > > > > > > Moving the individual bond directories to a bonds/ directory > > > is problematic. Because each bond shows up as just another network > > > interface, they show up in /sys/class/net automatically. We'd have to > > > make a bunch of changes to the device model to account for bonding, and > > > we'd break the semantics of the /sys/class/net hierarchy. > > > > Why not just put them in /sys/class/bond/ instead? > As Aaron Brown noted in another reply, it's inappropriate for bonding to > do this. Bonding is really just another network driver, and its > interfaces show up in /sys/class/net courtesy of the device model. We > don't need to replicate this capability; we just need to extend it a bit. > > Adding /sys/class/bond/ would cause a much larger protest than what we're > dealing with now (i.e. pretty much only you). Ok, as I said before, I don't care where you put it, just writing and reading multiple values at once in sysfs files is not to be tolerated.
> > > reading of Documentation/filesystems/sysfs.txt indicates that such a usage > > > is "socially acceptable". (Or at least it was to Patrick Mochel back in > > > January of 2003.) > > > > Pat was just trying to be nice. I'm not. :) > > Well, if "nice" isn't the policy any more, then the doc needs to change. Ok, I will do so. > However, I did a little more poking, and found a bunch of files that have > more than one item in them. HW resource listings, block device scheduling > stuff, plenty of examples. Are they socially acceptable? The block device stuff I understand as they want a single snapshot. They also can't create multiple files for every value or the 20,000 disk system admins will complain :) As for the other ones, have specific examples? And none of them are "multiple value write", correct? That, and due to the fact that your implementation will drop needed data is why I object to your patch so much. I'm sure you can understand that designing in a limitation that could cause data to be dropped that people really want is not a good thing. > > > bonding_masters. This currently is a problem, since sysfs itself does not > > > handle appends properly. > > > > Because you are not supposed to do that. > > Sysfs will happily accept O_APPEND opens, and will happily report success > on seek attempts, although neither operation is permitted. Whether or not > you're supposed to, this behavior is just plain wrong. Ok, care to resend the patches in different emails? From what I last remember, they either were not able to be applied, or some other trivial problem with them.
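[Archive note] The dropped-data concern in this thread (600 bonds created, only bond0...bond523 listed by bonding_masters) can be reproduced with simple arithmetic. The sketch below is a userspace illustration only, not the code from the patch under review. It assumes a 4096-byte buffer for one sysfs read, a 16-byte maximum interface name, one separator byte after each name, and a writer that stops as soon as fewer than 16 bytes remain. The names SKETCH_PAGE_SIZE, SKETCH_IFNAMSIZ and count_listed_bonds() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed values for illustration; neither is stated in the thread. */
#define SKETCH_PAGE_SIZE 4096  /* assumed size of one sysfs read buffer */
#define SKETCH_IFNAMSIZ  16    /* assumed max interface name length, incl. NUL */

/*
 * Hypothetical helper: count how many names "bond0", "bond1", ... fit
 * when each is written as "<name><separator>" into one page-sized
 * buffer and the writer stops once fewer than SKETCH_IFNAMSIZ bytes
 * are guaranteed to remain.
 */
static int count_listed_bonds(int total_bonds)
{
    const size_t limit = (size_t)(SKETCH_PAGE_SIZE - SKETCH_IFNAMSIZ);
    char name[SKETCH_IFNAMSIZ];
    size_t used = 0;
    int listed = 0;

    assert(total_bonds >= 0);
    for (int i = 0; i < total_bonds; i++) {
        if (used > limit)
            break;                      /* no room guaranteed for another name */
        int len = snprintf(name, sizeof(name), "bond%d", i);
        used += (size_t)len + 1;        /* name plus one separator byte */
        listed++;
    }
    return listed;
}
```

Under these assumptions count_listed_bonds(600) returns 524, which matches the bond0...bond523 listing reported earlier in the thread. A one-directory-per-bond layout has no such ceiling because no single file has to hold every name.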
thanks, greg k-h From manfred@colorfullife.com Sun Jul 10 04:11:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 04:11:26 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6ABBLH9025739 for ; Sun, 10 Jul 2005 04:11:22 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6ABCLwY016410; Sun, 10 Jul 2005 13:12:22 +0200 Message-ID: <42D101EC.6000608@colorfullife.com> Date: Sun, 10 Jul 2005 13:09:32 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Netdev , renaud.lienhart@free.fr Subject: [PATCH] forcedeth: Additional ethtool support Content-Type: multipart/mixed; boundary="------------050905000803070200080800" X-archive-position: 2707 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 10141 Lines: 272 This is a multi-part message in MIME format. --------------050905000803070200080800 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi Jeff, The attached patch adds ethtool support for -r (restart auto-negotiation) and -d (dump registers). It also contains the PCI_DEVICE changes from Renaud, a bugfix for the jumbo frame patch (the packet size limit remained at 1500, even for the nics that support jumbo frames) and a cleanup for the selection of the jumbo frame capable/incapable nics. 
Signed-Off-By: Manfred Spraul --------------050905000803070200080800 Content-Type: text/plain; name="patch-forcedeth-037-ethtool" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-037-ethtool" --- 2.6/drivers/net/forcedeth.c 2005-07-10 12:38:53.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-07-10 12:51:12.000000000 +0200 @@ -85,7 +85,8 @@ * 0.33: 16 May 2005: Support for MCP51 added. * 0.34: 18 Jun 2005: Add DEV_NEED_LINKTIMER to all nForce nics. * 0.35: 26 Jun 2005: Support for MCP55 added. - * 0.36: 28 Jul 2005: Add jumbo frame support. + * 0.36: 28 Jun 2005: Add jumbo frame support. + * 0.37: 10 Jul 2005: Additional ethtool support, cleanup of pci id list * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -137,6 +138,7 @@ #define DEV_IRQMASK_2 0x0004 /* use NVREG_IRQMASK_WANTED_2 for irq mask */ #define DEV_NEED_TIMERIRQ 0x0008 /* set the timer irq flag in the irq mask */ #define DEV_NEED_LINKTIMER 0x0010 /* poll link settings. 
Relies on the timer irq */ +#define DEV_HAS_LARGEDESC 0x0020 /* device supports jumbo frames and needs packet format 2 */ enum { NvRegIrqStatus = 0x000, @@ -1846,6 +1848,50 @@ return 0; } +#define FORCEDETH_REGS_VER 1 +#define FORCEDETH_REGS_SIZE 0x400 /* 256 32-bit registers */ + +static int nv_get_regs_len(struct net_device *dev) +{ + return FORCEDETH_REGS_SIZE; +} + +static void nv_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *buf) +{ + struct fe_priv *np = get_nvpriv(dev); + u8 __iomem *base = get_hwbase(dev); + u32 *rbuf = (u32 *)buf; + int i; + + regs->version = FORCEDETH_REGS_VER; + spin_lock_irq(&np->lock); + for (i=0;i<FORCEDETH_REGS_SIZE/sizeof(u32);i++) + rbuf[i] = readl(base + i*sizeof(u32)); + spin_unlock_irq(&np->lock); +} + +static int nv_nway_reset(struct net_device *dev) +{ + struct fe_priv *np = get_nvpriv(dev); + int ret; + + spin_lock_irq(&np->lock); + if (np->autoneg) { + int bmcr; + + bmcr = mii_rw(dev, np->phyaddr, MII_BMCR, MII_READ); + bmcr |= (BMCR_ANENABLE | BMCR_ANRESTART); + mii_rw(dev, np->phyaddr, MII_BMCR, bmcr); + + ret = 0; + } else { + ret = -EINVAL; + } + spin_unlock_irq(&np->lock); + + return ret; +} + static struct ethtool_ops ops = { .get_drvinfo = nv_get_drvinfo, .get_link = ethtool_op_get_link, @@ -1853,6 +1899,9 @@ .set_wol = nv_set_wol, .get_settings = nv_get_settings, .set_settings = nv_set_settings, + .get_regs_len = nv_get_regs_len, + .get_regs = nv_get_regs, + .nway_reset = nv_nway_reset, }; static int nv_open(struct net_device *dev) @@ -2092,16 +2141,11 @@ } /* handle different descriptor versions */ - if (pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_1 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13) { - np->desc_ver = DESC_VER_1; - np->pkt_limit = NV_PKTLIMIT_1; - } else { + np->desc_ver = DESC_VER_1; + np->pkt_limit = NV_PKTLIMIT_1; + if (id->driver_data & DEV_HAS_LARGEDESC) { np->desc_ver = DESC_VER_2; - np->pkt_limit =
NV_PKTLIMIT_1; + np->pkt_limit = NV_PKTLIMIT_2; } err = -ENOMEM; @@ -2284,109 +2328,74 @@ static struct pci_device_id pci_tbl[] = { { /* nForce Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_1, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_1), .driver_data = DEV_IRQMASK_1|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce2 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_2, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_2), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_3, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_3), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_4, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_4), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_5, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_5), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ - 
.vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_6, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_6), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_7, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_7), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_8, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_8), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_9, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_9), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_10, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + 
PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_10), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_11, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_11), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP51 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_12, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_12), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP51 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_13, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_13), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP55 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_14, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_14), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP55 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_15, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, 
PCI_DEVICE_ID_NVIDIA_NVENET_15), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, {0,}, }; --------------050905000803070200080800-- From dr@cluenet.de Sun Jul 10 07:16:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 07:16:55 -0700 (PDT) Received: from mail1.cluenet.de (mail1.cluenet.de [195.20.121.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6AEGkH9009193 for ; Sun, 10 Jul 2005 07:16:47 -0700 Received: by mail1.cluenet.de (Postfix, from userid 500) id ABD561795C; Sun, 10 Jul 2005 16:15:00 +0200 (CEST) Date: Sun, 10 Jul 2005 16:15:00 +0200 From: Daniel Roesen To: Ricardo Scop Cc: netdev@oss.sgi.com Subject: Re: Fwd: Re: GRE tunnel keepalive support? Message-ID: <20050710141500.GA25229@srv01.cluenet.de> Mail-Followup-To: Ricardo Scop , netdev@oss.sgi.com References: <200506281934.41732.rscop@matrix.com.br> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200506281934.41732.rscop@matrix.com.br> User-Agent: Mutt/1.4.1i X-archive-position: 2708 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dr@cluenet.de Precedence: bulk X-list: netdev Content-Length: 465 Lines: 17 Hi Ricardo, On Tue, Jun 28, 2005 at 07:34:41PM -0300, Ricardo Scop wrote: > Any further development of the subject above? Neither I want to > duplicate efforts... ;) > > Please CC to my private address as I'm not yet subscribed to this list. Nope, didn't get around to do any work on that yet. And am losing motivation as my immediate use for it is going away soon. 
Best regards,
Daniel

-- 
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0

From romieu@fr.zoreil.com Sun Jul 10 10:33:44 2005
Received: with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 10:33:47 -0700 (PDT)
Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6AHXgH9023511 for ; Sun, 10 Jul 2005 10:33:43 -0700
Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.4/8.12.1) with ESMTP id j6AHSYsV002185; Sun, 10 Jul 2005 19:28:34 +0200
Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.4/8.12.1) id j6AHSXs4002184; Sun, 10 Jul 2005 19:28:33 +0200
Date: Sun, 10 Jul 2005 19:28:33 +0200
From: Francois Romieu
To: Manfred Spraul
Cc: Jeff Garzik , Netdev , renaud.lienhart@free.fr
Subject: Re: [PATCH] forcedeth: Additional ethtool support
Message-ID: <20050710172833.GA1951@electric-eye.fr.zoreil.com>
References: <42D101EC.6000608@colorfullife.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <42D101EC.6000608@colorfullife.com>
User-Agent: Mutt/1.4.2.1i
X-Organisation: Land of Sunshine Inc.
X-archive-position: 2709
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: romieu@fr.zoreil.com
Precedence: bulk
X-list: netdev
Content-Length: 1431
Lines: 51

Hi Manfred,

Manfred Spraul :
[...]
> --- 2.6/drivers/net/forcedeth.c 2005-07-10 12:38:53.000000000 +0200
> +++ build-2.6/drivers/net/forcedeth.c 2005-07-10 12:51:12.000000000 +0200
[...]
> +static void nv_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *buf)
> +{
> +	struct fe_priv *np = get_nvpriv(dev);
> +	u8 __iomem *base = get_hwbase(dev);
> +	u32 *rbuf = (u32 *)buf;

Unneeded cast from void *

> +	int i;
> +
> +	regs->version = FORCEDETH_REGS_VER;
> +	spin_lock_irq(&np->lock);
> +	for (i=0;i
> +		rbuf[i] = readl(base + i*sizeof(u32));

memcpy_fromio ?

[...]
> @@ -2092,16 +2141,11 @@
>  	}
>  
>  	/* handle different descriptor versions */
> -	if (pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_1 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13) {
> -		np->desc_ver = DESC_VER_1;
> -		np->pkt_limit = NV_PKTLIMIT_1;
> -	} else {
> +	np->desc_ver = DESC_VER_1;
> +	np->pkt_limit = NV_PKTLIMIT_1;
  ^^ (nit) a space hides before the tab.
> +	if (id->driver_data & DEV_HAS_LARGEDESC) {
>  		np->desc_ver = DESC_VER_2;
> -		np->pkt_limit = NV_PKTLIMIT_1;
> +		np->pkt_limit = NV_PKTLIMIT_2;
  ^^ (nit) a space hides before the tab.
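
For readers following the last hunk: the change replaces a chain of device-ID comparisons with a capability bit carried in the ID table's driver_data. A minimal user-space sketch of that pattern follows. It is not the forcedeth code: every sketch_* / SKETCH_* name, the device numbers and the two packet limits are placeholders assumed for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's flag and constants. */
#define SKETCH_NEED_LINKTIMER 0x0010u
#define SKETCH_HAS_LARGEDESC  0x0020u

enum { SKETCH_DESC_VER_1 = 1, SKETCH_DESC_VER_2 = 2 };
enum { SKETCH_PKTLIMIT_1 = 1500, SKETCH_PKTLIMIT_2 = 9000 }; /* illustrative values */

struct sketch_id {
	uint16_t device;           /* made-up device number, 0 ends the table */
	unsigned long driver_data; /* per-device capability bits */
};

struct sketch_priv {
	int desc_ver;
	int pkt_limit;
};

/* Capabilities live next to the ID, so adding a device touches one place. */
static const struct sketch_id sketch_tbl[] = {
	{ 0x0001, SKETCH_NEED_LINKTIMER },
	{ 0x0002, SKETCH_NEED_LINKTIMER | SKETCH_HAS_LARGEDESC },
	{ 0, 0 },
};

/* Linear search up to the zero sentinel; returns NULL for unknown devices. */
static const struct sketch_id *sketch_lookup(uint16_t device)
{
	const struct sketch_id *id;

	for (id = sketch_tbl; id->device != 0; id++)
		if (id->device == device)
			return id;
	return NULL;
}

/* Default to the old descriptor format, upgrade only when the bit is set. */
static void sketch_pick_desc(struct sketch_priv *np, const struct sketch_id *id)
{
	assert(np != NULL && id != NULL);

	np->desc_ver = SKETCH_DESC_VER_1;
	np->pkt_limit = SKETCH_PKTLIMIT_1;
	if (id->driver_data & SKETCH_HAS_LARGEDESC) {
		np->desc_ver = SKETCH_DESC_VER_2;
		np->pkt_limit = SKETCH_PKTLIMIT_2;
	}
}
```

Calling sketch_pick_desc() with the entry returned by sketch_lookup(0x0002) selects the second descriptor version; the entry for 0x0001 keeps the default. The probe code then never needs to know individual device IDs.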

-- 
Ueimor

From manfred@colorfullife.com Sun Jul 10 11:19:45 2005
Received: with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 11:19:48 -0700 (PDT)
Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6AIJhH9026469 for ; Sun, 10 Jul 2005 11:19:44 -0700
Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6AIKmdN017653; Sun, 10 Jul 2005 20:20:48 +0200
Message-ID: <42D16656.6000207@colorfullife.com>
Date: Sun, 10 Jul 2005 20:17:58 +0200
From: Manfred Spraul
User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Francois Romieu
CC: Jeff Garzik , Netdev , renaud.lienhart@free.fr
Subject: Re: [PATCH] forcedeth: Additional ethtool support
References: <42D101EC.6000608@colorfullife.com> <20050710172833.GA1951@electric-eye.fr.zoreil.com>
In-Reply-To: <20050710172833.GA1951@electric-eye.fr.zoreil.com>
Content-Type: multipart/mixed; boundary="------------050500090402000102020602"
X-archive-position: 2710
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: manfred@colorfullife.com
Precedence: bulk
X-list: netdev
Content-Length: 10336
Lines: 287

This is a multi-part message in MIME format.
--------------050500090402000102020602
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Francois Romieu wrote:

>>+	int i;
>>+
>>+	regs->version = FORCEDETH_REGS_VER;
>>+	spin_lock_irq(&np->lock);
>>+	for (i=0;i
>>+		rbuf[i] = readl(base + i*sizeof(u32));
>>
>>
>
>memcpy_fromio ?
>
>
>
Not for a nic without complete documentation: What if an arch uses 64-bit reads to read two registers at the same time? Not all nics like that, for example IIRC natsemi explicitly mandates 32-bit reads. x86-64 doesn't, it uses 32-bit reads, but I don't like the idea of using memcpy to read registers.

I agree with your other remarks, updated patch attached.

--
    Manfred

--------------050500090402000102020602
Content-Type: text/plain; name="patch-forcedeth-037-ethtool"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="patch-forcedeth-037-ethtool"

--- 2.6/drivers/net/forcedeth.c 2005-07-10 12:38:53.000000000 +0200
+++ build-2.6/drivers/net/forcedeth.c 2005-07-10 20:10:16.000000000 +0200
@@ -85,7 +85,8 @@
  *	0.33: 16 May 2005: Support for MCP51 added.
  *	0.34: 18 Jun 2005: Add DEV_NEED_LINKTIMER to all nForce nics.
  *	0.35: 26 Jun 2005: Support for MCP55 added.
- *	0.36: 28 Jul 2005: Add jumbo frame support.
+ *	0.36: 28 Jun 2005: Add jumbo frame support.
+ *	0.37: 10 Jul 2005: Additional ethtool support, cleanup of pci id list
  *
  * Known bugs:
  * We suspect that on some hardware no TX done interrupts are generated.
@@ -137,6 +138,7 @@
 #define DEV_IRQMASK_2 0x0004 /* use NVREG_IRQMASK_WANTED_2 for irq mask */
 #define DEV_NEED_TIMERIRQ 0x0008 /* set the timer irq flag in the irq mask */
 #define DEV_NEED_LINKTIMER 0x0010 /* poll link settings.
Relies on the timer irq */ +#define DEV_HAS_LARGEDESC 0x0020 /* device supports jumbo frames and needs packet format 2 */ enum { NvRegIrqStatus = 0x000, @@ -1846,6 +1848,50 @@ return 0; } +#define FORCEDETH_REGS_VER 1 +#define FORCEDETH_REGS_SIZE 0x400 /* 256 32-bit registers */ + +static int nv_get_regs_len(struct net_device *dev) +{ + return FORCEDETH_REGS_SIZE; +} + +static void nv_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *buf) +{ + struct fe_priv *np = get_nvpriv(dev); + u8 __iomem *base = get_hwbase(dev); + u32 *rbuf = buf; + int i; + + regs->version = FORCEDETH_REGS_VER; + spin_lock_irq(&np->lock); + for (i=0;ilock); +} + +static int nv_nway_reset(struct net_device *dev) +{ + struct fe_priv *np = get_nvpriv(dev); + int ret; + + spin_lock_irq(&np->lock); + if (np->autoneg) { + int bmcr; + + bmcr = mii_rw(dev, np->phyaddr, MII_BMCR, MII_READ); + bmcr |= (BMCR_ANENABLE | BMCR_ANRESTART); + mii_rw(dev, np->phyaddr, MII_BMCR, bmcr); + + ret = 0; + } else { + ret = -EINVAL; + } + spin_unlock_irq(&np->lock); + + return ret; +} + static struct ethtool_ops ops = { .get_drvinfo = nv_get_drvinfo, .get_link = ethtool_op_get_link, @@ -1853,6 +1899,9 @@ .set_wol = nv_set_wol, .get_settings = nv_get_settings, .set_settings = nv_set_settings, + .get_regs_len = nv_get_regs_len, + .get_regs = nv_get_regs, + .nway_reset = nv_nway_reset, }; static int nv_open(struct net_device *dev) @@ -2092,16 +2141,11 @@ } /* handle different descriptor versions */ - if (pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_1 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13) { - np->desc_ver = DESC_VER_1; - np->pkt_limit = NV_PKTLIMIT_1; - } else { + np->desc_ver = DESC_VER_1; + np->pkt_limit = NV_PKTLIMIT_1; + if (id->driver_data & DEV_HAS_LARGEDESC) { np->desc_ver = DESC_VER_2; - np->pkt_limit = 
NV_PKTLIMIT_1; + np->pkt_limit = NV_PKTLIMIT_2; } err = -ENOMEM; @@ -2284,109 +2328,74 @@ static struct pci_device_id pci_tbl[] = { { /* nForce Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_1, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_1), .driver_data = DEV_IRQMASK_1|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce2 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_2, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_2), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_3, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_3), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_4, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_4), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_5, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_5), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ - 
.vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_6, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_6), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_7, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_7), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_8, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_8), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_9, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_9), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_10, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + 
PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_10), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_11, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_11), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP51 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_12, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_12), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP51 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_13, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_13), .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP55 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_14, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_14), + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| + DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP55 Ethernet Controller */ - .vendor = PCI_VENDOR_ID_NVIDIA, - .device = PCI_DEVICE_ID_NVIDIA_NVENET_15, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, 
PCI_DEVICE_ID_NVIDIA_NVENET_15),
+		.driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|
+			DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC,
 	},
 	{0,},
 };

--------------050500090402000102020602--

From olh@suse.de Sun Jul 10 12:38:07 2005
Received: with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 12:38:09 -0700 (PDT)
Received: from mx2.suse.de (ns2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6AJc6H9001588 for ; Sun, 10 Jul 2005 12:38:06 -0700
Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 46AA21D85D; Sun, 10 Jul 2005 21:36:26 +0200 (CEST)
Date: Sun, 10 Jul 2005 19:36:26 +0000
From: Olaf Hering
To: Andrew Morton , linux-kernel@vger.kernel.org
Cc: Jeff Garzik , netdev@oss.sgi.com
Subject: [PATCH 78/82] remove linux/version.h from net/ieee80211/
Message-ID: <20050710193626.78.eBhnWL4343.2247.olh@nectarine.suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
X-DOS: I got your 640K Real Mode Right Here Buddy!
X-Homeland-Security: You are not supposed to read this line! You are a terrorist!
User-Agent: Mutt und vi sind doch schneller als Notes (und GroupWise)
In-Reply-To: <20050710193508.0.PmFpst2252.2247.olh@nectarine.suse.de>
X-archive-position: 2713
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: olh@suse.de
Precedence: bulk
X-list: netdev
Content-Length: 3924
Lines: 111

changing CONFIG_LOCALVERSION rebuilds too much, for no apparent reason.
Signed-off-by: Olaf Hering net/ieee80211/ieee80211_crypt.c | 1 - net/ieee80211/ieee80211_crypt_ccmp.c | 1 - net/ieee80211/ieee80211_crypt_tkip.c | 1 - net/ieee80211/ieee80211_crypt_wep.c | 1 - net/ieee80211/ieee80211_module.c | 1 - net/ieee80211/ieee80211_rx.c | 1 - net/ieee80211/ieee80211_tx.c | 1 - net/ieee80211/ieee80211_wx.c | 1 - 8 files changed, 8 deletions(-) Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_crypt.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt.c @@ -12,7 +12,6 @@ */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt_ccmp.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_crypt_ccmp.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt_ccmp.c @@ -10,7 +10,6 @@ */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt_tkip.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_crypt_tkip.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt_tkip.c @@ -10,7 +10,6 @@ */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt_wep.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_crypt_wep.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_crypt_wep.c @@ -10,7 +10,6 @@ */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_module.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_module.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_module.c @@ -46,7 +46,6 @@ #include #include #include -#include #include #include #include 
Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_rx.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_rx.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_rx.c @@ -29,7 +29,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_tx.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_tx.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_tx.c @@ -39,7 +39,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_wx.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/net/ieee80211/ieee80211_wx.c +++ linux-2.6.13-rc2-mm1/net/ieee80211/ieee80211_wx.c @@ -30,7 +30,6 @@ ******************************************************************************/ #include -#include #include #include From olh@suse.de Sun Jul 10 12:37:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 12:37:27 -0700 (PDT) Received: from mx1.suse.de (mx1.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6AJbHH9001468 for ; Sun, 10 Jul 2005 12:37:18 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id C3A6EEE23; Sun, 10 Jul 2005 21:35:34 +0200 (CEST) Date: Sun, 10 Jul 2005 19:35:34 +0000 From: Olaf Hering To: Andrew Morton , linux-kernel@vger.kernel.org Cc: Jeff Garzik , netdev@oss.sgi.com, irda-users@lists.sourceforge.net Subject: [PATCH 26/82] remove linux/version.h from drivers/net/ Message-ID: <20050710193534.26.SAdzTe2972.2247.olh@nectarine.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline X-DOS: I got your 640K Real Mode Right Here 
Buddy!
X-Homeland-Security: You are not supposed to read this line! You are a terrorist!
User-Agent: Mutt und vi sind doch schneller als Notes (und GroupWise)
In-Reply-To: <20050710193508.0.PmFpst2252.2247.olh@nectarine.suse.de>
X-archive-position: 2711
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: olh@suse.de
Precedence: bulk
X-list: netdev
Content-Length: 29687
Lines: 838

changing CONFIG_LOCALVERSION rebuilds too much, for no apparent reason.

remove code for obsolete kernels in acenic, irda, tokenring, typhoon and wireless.
drivers/net/wan/farsync.c has fstioc_info->kernelVersion, according to the comment the content doesn't matter

Signed-off-by: Olaf Hering

 drivers/net/acenic.c | 11 ----
 drivers/net/acenic.h | 4 -
 drivers/net/b44.c | 1
 drivers/net/dm9000.c | 1
 drivers/net/gianfar.c | 1
 drivers/net/gianfar.h | 1
 drivers/net/gianfar_ethtool.c | 1
 drivers/net/gianfar_phy.c | 1
 drivers/net/hp100.c | 1
 drivers/net/ibmveth.c | 1
 drivers/net/irda/sir_kthread.c | 4 -
 drivers/net/irda/vlsi_ir.h | 22 --------
 drivers/net/iseries_veth.c | 1
 drivers/net/mac8390.c | 1
 drivers/net/mv643xx_eth.h | 1
 drivers/net/ppp_mppe.c | 1
 drivers/net/s2io.c | 1
 drivers/net/sk98lin/h/skdrv1st.h | 3 -
 drivers/net/sk_mca.c | 1
 drivers/net/sk_mca.h | 2
 drivers/net/starfire.c | 3 -
 drivers/net/tokenring/lanstreamer.c | 59 ------------------------
 drivers/net/tokenring/lanstreamer.h | 14 -----
 drivers/net/typhoon.c | 9 ---
 drivers/net/via-velocity.c | 1
 drivers/net/wan/farsync.c | 2
 drivers/net/wireless/hostap/hostap.c | 1
 drivers/net/wireless/hostap/hostap_crypt_ccmp.c | 1
 drivers/net/wireless/hostap/hostap_crypt_tkip.c | 1
 drivers/net/wireless/hostap/hostap_crypt_wep.c | 1
 drivers/net/wireless/hostap/hostap_cs.c | 1
 drivers/net/wireless/hostap/hostap_hw.c | 1
 drivers/net/wireless/hostap/hostap_pci.c | 1
 drivers/net/wireless/hostap/hostap_plx.c | 1
 drivers/net/wireless/ipw2100.c | 1
 drivers/net/wireless/ipw2100.h | 1
 drivers/net/wireless/ipw2200.c | 8 ---
 drivers/net/wireless/ipw2200.h | 10 ----
 drivers/net/wireless/orinoco.h | 1
 drivers/net/wireless/prism54/isl_38xx.c | 1
 drivers/net/wireless/prism54/isl_38xx.h | 1
 drivers/net/wireless/prism54/isl_ioctl.c | 1
 drivers/net/wireless/prism54/islpci_dev.c | 1
 drivers/net/wireless/prism54/islpci_dev.h | 1
 drivers/net/wireless/prism54/islpci_eth.c | 1
 drivers/net/wireless/prism54/islpci_hotplug.c | 1
 46 files changed, 5 insertions(+), 179 deletions(-)

Index: linux-2.6.13-rc2-mm1/drivers/net/acenic.c
===================================================================
--- linux-2.6.13-rc2-mm1.orig/drivers/net/acenic.c
+++ linux-2.6.13-rc2-mm1/drivers/net/acenic.c
@@ -53,7 +53,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -164,12 +163,6 @@ MODULE_DEVICE_TABLE(pci, acenic_pci_tbl)
 #define SET_NETDEV_DEV(net, pdev) do{} while(0)
 #endif
 
-#if LINUX_VERSION_CODE >= 0x2051c
-#define ace_sync_irq(irq) synchronize_irq(irq)
-#else
-#define ace_sync_irq(irq) synchronize_irq()
-#endif
-
 #ifndef offset_in_page
 #define offset_in_page(ptr) ((unsigned long)(ptr) & ~PAGE_MASK)
 #endif
@@ -657,7 +650,7 @@ static void __devexit acenic_remove_one(
 	 * Then release the RX buffers - jumbo buffers were
 	 * already released in ace_close().
*/ - ace_sync_irq(dev->irq); + synchronize_irq(dev->irq); for (i = 0; i < RX_STD_RING_ENTRIES; i++) { struct sk_buff *skb = ap->skb->rx_std_skbuff[i].skb; @@ -2657,7 +2650,7 @@ static int ace_change_mtu(struct net_dev } } else { while (test_and_set_bit(0, &ap->jumbo_refill_busy)); - ace_sync_irq(dev->irq); + synchronize_irq(dev->irq); ace_set_rxtx_parms(dev, 0); if (ap->jumbo) { struct cmd cmd; Index: linux-2.6.13-rc2-mm1/drivers/net/acenic.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/acenic.h +++ linux-2.6.13-rc2-mm1/drivers/net/acenic.h @@ -1,8 +1,6 @@ #ifndef _ACENIC_H_ #define _ACENIC_H_ -#include - /* * Generate TX index update each time, when TX ring is closed. * Normally, this is not useful, because results in more dma (and irqs @@ -747,7 +745,7 @@ static inline void ace_mask_irq(struct n else writel(readl(&regs->HostCtrl) | MASK_INTS, &regs->HostCtrl); - ace_sync_irq(dev->irq); + synchronize_irq(dev->irq); } Index: linux-2.6.13-rc2-mm1/drivers/net/b44.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/b44.c +++ linux-2.6.13-rc2-mm1/drivers/net/b44.c @@ -18,7 +18,6 @@ #include #include #include -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/dm9000.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/dm9000.c +++ linux-2.6.13-rc2-mm1/drivers/net/dm9000.c @@ -56,7 +56,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/gianfar.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/gianfar.c +++ linux-2.6.13-rc2-mm1/drivers/net/gianfar.c @@ -94,7 +94,6 @@ #include #include #include -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/gianfar.h =================================================================== ---
linux-2.6.13-rc2-mm1.orig/drivers/net/gianfar.h +++ linux-2.6.13-rc2-mm1/drivers/net/gianfar.h @@ -43,7 +43,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/gianfar_ethtool.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/gianfar_ethtool.c +++ linux-2.6.13-rc2-mm1/drivers/net/gianfar_ethtool.c @@ -34,7 +34,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/gianfar_phy.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/gianfar_phy.c +++ linux-2.6.13-rc2-mm1/drivers/net/gianfar_phy.c @@ -36,7 +36,6 @@ #include #include #include -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/hp100.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/hp100.c +++ linux-2.6.13-rc2-mm1/drivers/net/hp100.c @@ -96,7 +96,6 @@ #undef HP100_MULTICAST_FILTER /* Need to be debugged... 
*/ -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/ibmveth.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/ibmveth.c +++ linux-2.6.13-rc2-mm1/drivers/net/ibmveth.c @@ -35,7 +35,6 @@ #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/irda/sir_kthread.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/irda/sir_kthread.c +++ linux-2.6.13-rc2-mm1/drivers/net/irda/sir_kthread.c @@ -14,7 +14,6 @@ #include #include -#include #include #include #include @@ -140,9 +139,6 @@ static int irda_thread(void *startup) run_irda_queue(); } -#if LINUX_VERSION_CODE < KERNEL_VERSION(2,5,35) - reparent_to_init(); -#endif complete_and_exit(&irda_rq_queue.exit, 0); /* never reached */ return 0; Index: linux-2.6.13-rc2-mm1/drivers/net/irda/vlsi_ir.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/irda/vlsi_ir.h +++ linux-2.6.13-rc2-mm1/drivers/net/irda/vlsi_ir.h @@ -49,26 +49,6 @@ typedef void irqreturn_t; #define IRQ_RETVAL(x) #endif -/* some stuff need to check kernelversion. 
Not all 2.5 stuff was present - * in early 2.5.x - the test is merely to separate 2.4 from 2.5 - */ -#include - -#if LINUX_VERSION_CODE < KERNEL_VERSION(2,5,0) - -/* PDE() introduced in 2.5.4 */ -#ifdef CONFIG_PROC_FS -#define PDE(inode) ((inode)->u.generic_ip) -#endif - -/* irda crc16 calculation exported in 2.5.42 */ -#define irda_calc_crc16(fcs,buf,len) (GOOD_FCS) - -/* we use this for unified pci device name access */ -#define PCIDEV_NAME(pdev) ((pdev)->name) - -#else /* 2.5 or later */ - /* recent 2.5/2.6 stores pci device names at varying places ;-) */ #ifdef CONFIG_PCI_NAMES /* human readable name */ @@ -78,8 +58,6 @@ typedef void irqreturn_t; #define PCIDEV_NAME(pdev) (pci_name(pdev)) #endif -#endif - /* ================================================================ */ /* non-standard PCI registers */ Index: linux-2.6.13-rc2-mm1/drivers/net/iseries_veth.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/iseries_veth.c +++ linux-2.6.13-rc2-mm1/drivers/net/iseries_veth.c @@ -57,7 +57,6 @@ #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/mac8390.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/mac8390.c +++ linux-2.6.13-rc2-mm1/drivers/net/mac8390.c @@ -15,7 +15,6 @@ * and fixed access to Sonic Sys card which masquerades as a Farallon * by rayk@knightsmanor.org */ -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/mv643xx_eth.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/mv643xx_eth.h +++ linux-2.6.13-rc2-mm1/drivers/net/mv643xx_eth.h @@ -1,7 +1,6 @@ #ifndef __MV643XX_ETH_H__ #define __MV643XX_ETH_H__ -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/s2io.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/s2io.c 
+++ linux-2.6.13-rc2-mm1/drivers/net/s2io.c @@ -54,7 +54,6 @@ #include #include #include -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/sk98lin/h/skdrv1st.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/sk98lin/h/skdrv1st.h +++ linux-2.6.13-rc2-mm1/drivers/net/sk98lin/h/skdrv1st.h @@ -39,9 +39,6 @@ #ifndef __INC_SKDRV1ST_H #define __INC_SKDRV1ST_H -/* Check kernel version */ -#include - typedef struct s_AC SK_AC; /* Set card versions */ Index: linux-2.6.13-rc2-mm1/drivers/net/sk_mca.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/sk_mca.c +++ linux-2.6.13-rc2-mm1/drivers/net/sk_mca.c @@ -93,7 +93,6 @@ History: #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/sk_mca.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/sk_mca.h +++ linux-2.6.13-rc2-mm1/drivers/net/sk_mca.h @@ -1,5 +1,3 @@ -#include - #ifndef _SK_MCA_INCLUDE_ #define _SK_MCA_INCLUDE_ Index: linux-2.6.13-rc2-mm1/drivers/net/starfire.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/starfire.c +++ linux-2.6.13-rc2-mm1/drivers/net/starfire.c @@ -143,7 +143,6 @@ TODO: - fix forced speed/duplexing code #define DRV_RELDATE "January 19, 2005" #include -#include #include #include #include @@ -257,7 +256,7 @@ static int full_duplex[MAX_UNITS] = {0, * This SUCKS. * We need a much better method to determine if dma_addr_t is 64-bit. 
*/ -#if (defined(__i386__) && defined(CONFIG_HIGHMEM) && (LINUX_VERSION_CODE > 0x20500 || defined(CONFIG_HIGHMEM64G))) || defined(__x86_64__) || defined (__ia64__) || defined(__mips64__) || (defined(__mips__) && defined(CONFIG_HIGHMEM) && defined(CONFIG_64BIT_PHYS_ADDR)) +#if (defined(__i386__) && defined(CONFIG_HIGHMEM)) || defined(__x86_64__) || defined (__ia64__) || defined(__mips64__) || (defined(__mips__) && defined(CONFIG_HIGHMEM) && defined(CONFIG_64BIT_PHYS_ADDR)) /* 64-bit dma_addr_t */ #define ADDR_64BITS /* This chip uses 64 bit addresses. */ #define cpu_to_dma(x) cpu_to_le64(x) Index: linux-2.6.13-rc2-mm1/drivers/net/tokenring/lanstreamer.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/tokenring/lanstreamer.c +++ linux-2.6.13-rc2-mm1/drivers/net/tokenring/lanstreamer.c @@ -120,7 +120,6 @@ #include #include #include -#include #include #include @@ -1932,64 +1931,6 @@ static int sprintf_info(char *buffer, st #endif #endif -#if STREAMER_IOCTL && (LINUX_VERSION_CODE < KERNEL_VERSION(2,5,0)) -static int streamer_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) -{ - int i; - struct streamer_private *streamer_priv = (struct streamer_private *) dev->priv; - u8 __iomem *streamer_mmio = streamer_priv->streamer_mmio; - - switch(cmd) { - case IOCTL_SISR_MASK: - writew(SISR_MI, streamer_mmio + SISR_MASK_SUM); - break; - case IOCTL_SPIN_LOCK_TEST: - printk(KERN_INFO "spin_lock() called.n"); - spin_lock(&streamer_priv->streamer_lock); - spin_unlock(&streamer_priv->streamer_lock); - printk(KERN_INFO "spin_unlock() finished.n"); - break; - case IOCTL_PRINT_BDAS: - printk(KERN_INFO "bdas: RXBDA: %x RXLBDA: %x TX2FDA: %x TX2LFDA: %xn", - readw(streamer_mmio + RXBDA), - readw(streamer_mmio + RXLBDA), - readw(streamer_mmio + TX2FDA), - readw(streamer_mmio + TX2LFDA)); - break; - case IOCTL_PRINT_REGISTERS: - printk(KERN_INFO "registers:n"); - printk(KERN_INFO "SISR: %04x MISR: %04x LISR: %04x BCTL: %04x 
BMCTL: %04xnmask %04x mask %04xn", - readw(streamer_mmio + SISR), - readw(streamer_mmio + MISR_RUM), - readw(streamer_mmio + LISR), - readw(streamer_mmio + BCTL), - readw(streamer_mmio + BMCTL_SUM), - readw(streamer_mmio + SISR_MASK), - readw(streamer_mmio + MISR_MASK)); - break; - case IOCTL_PRINT_RX_BUFS: - printk(KERN_INFO "Print rx bufs:n"); - for(i=0; istreamer_rx_ring[i].status); - break; - case IOCTL_PRINT_TX_BUFS: - printk(KERN_INFO "Print tx bufs:n"); - for(i=0; istreamer_tx_ring[i].status); - break; - case IOCTL_RX_CMD: - streamer_rx(dev); - printk(KERN_INFO "Sent rx command.n"); - break; - default: - printk(KERN_INFO "Bad ioctl!n"); - } - return 0; -} -#endif - static struct pci_driver streamer_pci_driver = { .name = "lanstreamer", .id_table = streamer_pci_tbl, Index: linux-2.6.13-rc2-mm1/drivers/net/tokenring/lanstreamer.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/tokenring/lanstreamer.h +++ linux-2.6.13-rc2-mm1/drivers/net/tokenring/lanstreamer.h @@ -60,20 +60,6 @@ * */ -#include - -#if STREAMER_IOCTL && (LINUX_VERSION_CODE < KERNEL_VERSION(2,5,0)) -#include -#define IOCTL_PRINT_RX_BUFS SIOCDEVPRIVATE -#define IOCTL_PRINT_TX_BUFS SIOCDEVPRIVATE+1 -#define IOCTL_RX_CMD SIOCDEVPRIVATE+2 -#define IOCTL_TX_CMD SIOCDEVPRIVATE+3 -#define IOCTL_PRINT_REGISTERS SIOCDEVPRIVATE+4 -#define IOCTL_PRINT_BDAS SIOCDEVPRIVATE+5 -#define IOCTL_SPIN_LOCK_TEST SIOCDEVPRIVATE+6 -#define IOCTL_SISR_MASK SIOCDEVPRIVATE+7 -#endif - /* MAX_INTR - the maximum number of times we can loop * inside the interrupt function before returning * control to the OS (maximum value is 256) Index: linux-2.6.13-rc2-mm1/drivers/net/typhoon.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/typhoon.c +++ linux-2.6.13-rc2-mm1/drivers/net/typhoon.c @@ -128,7 +128,6 @@ static const int multicast_filter_limit #include #include #include -#include #include #include 
"typhoon.h" @@ -333,12 +332,6 @@ enum state_values { #define TYPHOON_RESET_TIMEOUT_NOSLEEP ((6 * 1000000) / TYPHOON_UDELAY) #define TYPHOON_WAIT_TIMEOUT ((1000000 / 2) / TYPHOON_UDELAY) -#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 5, 28) -#define typhoon_synchronize_irq(x) synchronize_irq() -#else -#define typhoon_synchronize_irq(x) synchronize_irq(x) -#endif - #if defined(NETIF_F_TSO) #define skb_tso_size(x) (skb_shinfo(x)->tso_size) #define TSO_NUM_DESCRIPTORS 2 @@ -2173,7 +2166,7 @@ typhoon_close(struct net_device *dev) printk(KERN_ERR "%s: unable to stop runtimen", dev->name); /* Make sure there is no irq handler running on a different CPU. */ - typhoon_synchronize_irq(dev->irq); + synchronize_irq(dev->irq); free_irq(dev->irq, dev); typhoon_free_rx_rings(tp); Index: linux-2.6.13-rc2-mm1/drivers/net/via-velocity.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/via-velocity.c +++ linux-2.6.13-rc2-mm1/drivers/net/via-velocity.c @@ -61,7 +61,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wan/farsync.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wan/farsync.c +++ linux-2.6.13-rc2-mm1/drivers/net/wan/farsync.c @@ -17,7 +17,6 @@ #include #include -#include #include #include #include @@ -1807,7 +1806,6 @@ gather_conf_info(struct fst_card_info *c memset(info, 0, sizeof (struct fstioc_info)); i = port->index; - info->kernelVersion = LINUX_VERSION_CODE; info->nports = card->nports; info->type = card->type; info->state = card->state; Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/orinoco.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/orinoco.h +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/orinoco.h @@ -13,7 +13,6 @@ #include #include #include -#include #include "hermes.h" Index: 
linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/isl_38xx.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/prism54/isl_38xx.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/isl_38xx.c @@ -18,7 +18,6 @@ * */ -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/isl_38xx.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/prism54/isl_38xx.h +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/isl_38xx.h @@ -20,7 +20,6 @@ #ifndef _ISL_38XX_H #define _ISL_38XX_H -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/isl_ioctl.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/prism54/isl_ioctl.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/isl_ioctl.c @@ -20,7 +20,6 @@ * */ -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_dev.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/prism54/islpci_dev.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_dev.c @@ -19,7 +19,6 @@ * */ -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_dev.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/prism54/islpci_dev.h +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_dev.h @@ -23,7 +23,6 @@ #ifndef _ISLPCI_DEV_H #define _ISLPCI_DEV_H -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_eth.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/prism54/islpci_eth.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_eth.c @@ 
-17,7 +17,6 @@ * */ -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_hotplug.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/prism54/islpci_hotplug.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/prism54/islpci_hotplug.c @@ -18,7 +18,6 @@ * */ -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/ppp_mppe.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/ppp_mppe.c +++ linux-2.6.13-rc2-mm1/drivers/net/ppp_mppe.c @@ -44,7 +44,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap.c @@ -17,7 +17,6 @@ #endif #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_crypt_ccmp.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap_crypt_ccmp.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_crypt_ccmp.c @@ -10,7 +10,6 @@ */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_crypt_tkip.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap_crypt_tkip.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_crypt_tkip.c @@ -10,7 +10,6 @@ */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_crypt_wep.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap_crypt_wep.c +++ 
linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_crypt_wep.c @@ -10,7 +10,6 @@ */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_cs.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap_cs.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_cs.c @@ -1,7 +1,6 @@ #define PRISM2_PCCARD #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_hw.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap_hw.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_hw.c @@ -31,7 +31,6 @@ #include -#include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_pci.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap_pci.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_pci.c @@ -5,7 +5,6 @@ * Andy Warner */ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_plx.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/hostap/hostap_plx.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/hostap/hostap_plx.c @@ -8,7 +8,6 @@ #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2100.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/ipw2100.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2100.c @@ -159,7 +159,6 @@ that only one external action is invoked #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2100.h 
=================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/ipw2100.h +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2100.h @@ -37,7 +37,6 @@ #include #include #include -#include #include // new driver API #include Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2200.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/ipw2200.c +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2200.c @@ -7242,11 +7242,7 @@ static int ipw_pci_suspend(struct pci_de /* Remove the PRESENT state of the device */ netif_device_detach(dev); -#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,10) - pci_save_state(pdev, priv->pm_state); -#else pci_save_state(pdev); -#endif pci_disable_device(pdev); pci_set_power_state(pdev, state); @@ -7263,11 +7259,7 @@ static int ipw_pci_resume(struct pci_dev pci_set_power_state(pdev, 0); pci_enable_device(pdev); -#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,10) - pci_restore_state(pdev, priv->pm_state); -#else pci_restore_state(pdev); -#endif /* * Suspend/Resume resets the PCI configuration space, so we have to * re-disable the RETRY_TIMEOUT register (0x41) to keep PCI Tx retries Index: linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2200.h =================================================================== --- linux-2.6.13-rc2-mm1.orig/drivers/net/wireless/ipw2200.h +++ linux-2.6.13-rc2-mm1/drivers/net/wireless/ipw2200.h @@ -34,7 +34,6 @@ #include #include -#include #include #include #include @@ -62,15 +61,6 @@ typedef void irqreturn_t; #define IRQ_RETVAL(x) #endif -#if ( LINUX_VERSION_CODE < KERNEL_VERSION(2,6,9) ) -#define __iomem -#endif - -#if ( LINUX_VERSION_CODE < KERNEL_VERSION(2,6,5) ) -#define pci_dma_sync_single_for_cpu pci_dma_sync_single -#define pci_dma_sync_single_for_device pci_dma_sync_single -#endif - #ifndef HAVE_FREE_NETDEV #define free_netdev(x) kfree(x) #endif From olh@suse.de Sun Jul 10 12:37:59 2005 Received: 
with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 12:38:04 -0700 (PDT) Received: from mx1.suse.de (mail.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6AJbxH9001548 for ; Sun, 10 Jul 2005 12:37:59 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 384C9EDFC; Sun, 10 Jul 2005 21:36:19 +0200 (CEST) Date: Sun, 10 Jul 2005 19:36:19 +0000 From: Olaf Hering To: Andrew Morton , linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com Subject: [PATCH 71/82] remove linux/version.h from include/linux/if_wanpipe_common.h Message-ID: <20050710193619.71.OhUmOh4153.2247.olh@nectarine.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline X-DOS: I got your 640K Real Mode Right Here Buddy! X-Homeland-Security: You are not supposed to read this line! You are a terrorist! User-Agent: Mutt und vi sind doch schneller als Notes (und GroupWise) In-Reply-To: <20050710193508.0.PmFpst2252.2247.olh@nectarine.suse.de> X-archive-position: 2712 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: olh@suse.de Precedence: bulk X-list: netdev Content-Length: 637 Lines: 21 Changing CONFIG_LOCALVERSION rebuilds too much, for no apparent reason. 
Signed-off-by: Olaf Hering

 include/linux/if_wanpipe_common.h |    2 --
 1 files changed, 2 deletions(-)

Index: linux-2.6.13-rc2-mm1/include/linux/if_wanpipe_common.h
===================================================================
--- linux-2.6.13-rc2-mm1.orig/include/linux/if_wanpipe_common.h
+++ linux-2.6.13-rc2-mm1/include/linux/if_wanpipe_common.h
@@ -17,8 +17,6 @@
 #ifndef _WANPIPE_SOCK_DRIVER_COMMON_H
 #define _WANPIPE_SOCK_DRIVER_COMMON_H
 
-#include <linux/version.h>
-
 typedef struct {
 	struct net_device *slave;
 	atomic_t packet_sent;

From romieu@fr.zoreil.com Sun Jul 10 16:29:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 10 Jul 2005 16:29:55 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6ANTgH9019977 for ; Sun, 10 Jul 2005 16:29:43 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.4/8.12.1) with ESMTP id j6ANQ4bD007343; Mon, 11 Jul 2005 01:26:04 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.4/8.12.1) id j6ANQ3Mt007342; Mon, 11 Jul 2005 01:26:03 +0200 Date: Mon, 11 Jul 2005 01:26:03 +0200 From: Francois Romieu To: Manfred Spraul Cc: Jeff Garzik , Netdev , renaud.lienhart@free.fr Subject: Re: [PATCH] forcedeth: Additional ethtool support Message-ID: <20050710232603.GA4335@electric-eye.fr.zoreil.com> References: <42D101EC.6000608@colorfullife.com> <20050710172833.GA1951@electric-eye.fr.zoreil.com> <42D16656.6000207@colorfullife.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42D16656.6000207@colorfullife.com> User-Agent: Mutt/1.4.2.1i X-Organisation: Land of Sunshine Inc. 
X-archive-position: 2714 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Content-Length: 457 Lines: 16 Manfred Spraul : [memcpy_fromio] > Not for a nic without complete documentation: What if an arch uses > 64-bit reads to read two registers at the same time? So far, no citizen of arch/ does. Afaik it would probably be a bad idea on pci-x. > that, for example IIRC natsemi explicitely mandates 32-bit reads. > x86-64 doesn't, it uses 32-bit reads, but I don't like the idea of using > memcpy to read registers. Ok. -- Ueimor From davem@davemloft.net Mon Jul 11 21:04:49 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 11 Jul 2005 21:04:55 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6C44mH9013404 for ; Mon, 11 Jul 2005 21:04:49 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DsBz2-0004im-3x; Mon, 11 Jul 2005 21:03:00 -0700 Date: Mon, 11 Jul 2005 21:02:59 -0700 (PDT) Message-Id: <20050711.210259.77057135.davem@davemloft.net> To: arnaldo.melo@gmail.com Cc: dada1@cosmosbay.com, tgraf@suug.ch, netdev@oss.sgi.com Subject: Re: [PATCH] loop unrolling in net/sched/sch_generic.c From: "David S. 
Miller" In-Reply-To: <39e6f6c7050708040877702f6f@mail.gmail.com> References: <20050708.003014.125896217.davem@davemloft.net> <42CE3722.3070208@cosmosbay.com> <39e6f6c7050708040877702f6f@mail.gmail.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2716 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 537 Lines: 15 From: Arnaldo Carvalho de Melo Date: Fri, 8 Jul 2005 08:08:54 -0300 > > struct inet_sock { > > /* sk and pinet6 has to be the first two members of inet_sock */ > > struct sock sk; > > #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) > > struct ipv6_pinfo *pinet6; > > #endif > > /me pleads guilty, Dave, any problem with removing this #ifdef? Humm, I'll > think about using Just leave it for now. If we come up with some spectacularly nice way to deal with this in the future, we can change it then. 
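The layout rule in the quoted comment (sk first, pinet6 second) can be checked mechanically. The following is only a minimal userspace sketch: the struct bodies are tiny stand-ins for the real kernel types, the struct is shown with the IPv6 member present, and daddr is a hypothetical trailing member added so there is something to compare against. It does not say why the kernel needs this ordering, only how one might verify it.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the real kernel structures are far larger. */
struct sock { int sk_family; };
struct ipv6_pinfo { int hop_limit; };

/* Mirrors the quoted layout, shown with the IPv6 member present. */
struct inet_sock {
	struct sock sk;			/* has to stay first */
	struct ipv6_pinfo *pinet6;	/* has to stay second */
	unsigned int daddr;		/* hypothetical trailing member */
};

/* Fails the build if someone moves sk away from offset 0. */
_Static_assert(offsetof(struct inet_sock, sk) == 0,
	       "sk must be the first member of inet_sock");

/* Valid only because sk sits at offset 0 of struct inet_sock. */
static struct inet_sock *inet_sk(struct sock *sk)
{
	return (struct inet_sock *)sk;
}

/* Returns 1 when pinet6 directly follows sk and precedes everything else. */
static int inet_sock_layout_ok(void)
{
	return offsetof(struct inet_sock, pinet6) >= sizeof(struct sock) &&
	       offsetof(struct inet_sock, pinet6) <
	       offsetof(struct inet_sock, daddr);
}
```

With a check like this in place, removing or keeping the #ifdef becomes a question of whether the second-member rule still holds in both configurations, which is the part the thread leaves open.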
From SRS0+05568e4e9645ae537b16+688+infradead.org+hch@pentafluge.srs.infradead.org Tue Jul 12 13:29:51 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 13:29:55 -0700 (PDT) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CKTgH9028808 for ; Tue, 12 Jul 2005 13:29:51 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.52 #1 (Red Hat Linux)) id 1DsRMA-0002rS-Ii; Tue, 12 Jul 2005 21:27:54 +0100 Date: Tue, 12 Jul 2005 21:27:54 +0100 From: Christoph Hellwig To: raghavendra.koushik@neterion.com Cc: jgarzik@pobox.com, netdev@oss.sgi.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Message-ID: <20050712202754.GA10768@infradead.org> References: <20050707222741.71C3E89826@linux.site> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050707222741.71C3E89826@linux.site> User-Agent: Mutt/1.4.2.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2718 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 570 Lines: 16 > 3. Enable two-buffer mode(for Rx path) automatically for SGI > systems. This improves Rx performance dramatically on > SGI systems. > +/* Enable 2 buffer mode by default for SGI system */ > +#ifdef CONFIG_IA64_SGI_SN2 > +#define CONFIG_2BUFF_MODE > +#endif this enabled it only on kernel that are built to only run on SN2 hardware, which is completely useless in practice. Besides that defining a CONFIG_ symbol from source files is a big no-go. What exactly is the 2buff mode, and more specific what are the downsides of enabling it on non-SGI hardware? 
From davem@davemloft.net Tue Jul 12 13:35:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 13:35:52 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CKZkH9029750 for ; Tue, 12 Jul 2005 13:35:46 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DsRS8-0001uM-7r; Tue, 12 Jul 2005 13:34:04 -0700 Date: Tue, 12 Jul 2005 13:34:04 -0700 (PDT) Message-Id: <20050712.133404.52118192.davem@davemloft.net> To: hch@infradead.org Cc: raghavendra.koushik@neterion.com, jgarzik@pobox.com, netdev@oss.sgi.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements From: "David S. Miller" In-Reply-To: <20050712202754.GA10768@infradead.org> References: <20050707222741.71C3E89826@linux.site> <20050712202754.GA10768@infradead.org> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2719 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 458 Lines: 13 From: Christoph Hellwig Date: Tue, 12 Jul 2005 21:27:54 +0100 > > +/* Enable 2 buffer mode by default for SGI system */ > > +#ifdef CONFIG_IA64_SGI_SN2 > > +#define CONFIG_2BUFF_MODE > > +#endif > > this enabled it only on kernel that are built to only run on SN2 > hardware, which is completely useless in practice. Besides that defining > a CONFIG_ symbol from source files is a big no-go. Yes, do this in the Kconfig file instead. 
From bunk@stusta.de Tue Jul 12 13:29:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 13:29:33 -0700 (PDT) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j6CKTPH9028774 for ; Tue, 12 Jul 2005 13:29:26 -0700 Received: (qmail 29619 invoked from network); 12 Jul 2005 20:27:40 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 12 Jul 2005 20:27:40 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 768E0BB4E6; Tue, 12 Jul 2005 22:27:40 +0200 (CEST) Date: Tue, 12 Jul 2005 22:27:40 +0200 From: Adrian Bunk To: Andrew Morton Cc: jgarzik@pobox.com, jkmaline@cc.hut.fi, hostap@shmoo.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [-mm patch] net/ieee80211/: remove pci.h #include's Message-ID: <20050712202740.GL4034@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 2717 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev Content-Length: 1666 Lines: 49 I was wondering why editing pci.h triggered the rebuild of three files under net/, and as far as I can see, there's no reason for these three files to #include pci.h . 
Signed-off-by: Adrian Bunk

---

This patch was already sent on:
- 3 Jul 2005
- 30 May 2005
- 1 May 2005

 net/ieee80211/ieee80211_module.c |    1 -
 net/ieee80211/ieee80211_rx.c     |    1 -
 net/ieee80211/ieee80211_tx.c     |    1 -
 3 files changed, 3 deletions(-)

--- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_module.c.old	2005-04-30 23:23:14.000000000 +0200
+++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_module.c	2005-04-30 23:23:18.000000000 +0200
@@ -40,7 +40,6 @@
 #include
 #include
 #include
-#include <linux/pci.h>
 #include
 #include
 #include
--- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_tx.c.old	2005-04-30 23:23:25.000000000 +0200
+++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_tx.c	2005-04-30 23:23:32.000000000 +0200
@@ -33,7 +33,6 @@
 #include
 #include
 #include
-#include <linux/pci.h>
 #include
 #include
 #include
--- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_rx.c.old	2005-04-30 23:23:42.000000000 +0200
+++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_rx.c	2005-04-30 23:23:46.000000000 +0200
@@ -23,7 +23,6 @@
 #include
 #include
 #include
-#include <linux/pci.h>
 #include
 #include
 #include

From leonid.grossman@neterion.com Tue Jul 12 13:59:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 13:59:11 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CKx4H9031443 for ; Tue, 12 Jul 2005 13:59:04 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j6CKv3cx012278; Tue, 12 Jul 2005 16:57:03 -0400 (EDT) Received: from lgt40 ([10.16.16.68]) by guinness.s2io.com (8.12.6/8.12.6) with ESMTP id j6CKv1KP011554; Tue, 12 Jul 2005 16:57:01 -0400 (EDT) Message-Id: <200507122057.j6CKv1KP011554@guinness.s2io.com> From: "Leonid Grossman" To: "'Christoph Hellwig'" , Cc: , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 12 Jul 2005 13:56:55 -0700 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" 
Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1409 Thread-Index: AcWHIDPwq2/XNgNfRGumCm1RvxxEMwAAW6Cg In-Reply-To: <20050712202754.GA10768@infradead.org> X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2720 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: leonid.grossman@neterion.com Precedence: bulk X-list: netdev Content-Length: 1899 Lines: 49 > -----Original Message----- > From: Christoph Hellwig [mailto:hch@infradead.org] > Sent: Tuesday, July 12, 2005 1:28 PM > To: raghavendra.koushik@neterion.com > Cc: jgarzik@pobox.com; netdev@oss.sgi.com; > ravinandan.arakali@neterion.com; > leonid.grossman@neterion.com; rapuru.sriram@neterion.com > Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements > > > 3. Enable two-buffer mode(for Rx path) automatically for SGI > > systems. This improves Rx performance dramatically on > > SGI systems. > > > +/* Enable 2 buffer mode by default for SGI system */ #ifdef > > +CONFIG_IA64_SGI_SN2 #define CONFIG_2BUFF_MODE #endif > > this enabled it only on kernel that are built to only run on > SN2 hardware, which is completely useless in practice. > Besides that defining a CONFIG_ symbol from source files is a > big no-go. > > What exactly is the 2buff mode, and more specific what are > the downsides of enabling it on non-SGI hardware? In short, this is one of the ASIC modes where headers and payload are separated by the hardware, and placed in separate receive buffers. (More details are in the ASIC programming manual that is posted on ns1.s2io.com). On SGI platforms, the two buffer mode results in a significant rx performance boost since it allows to achieve both aligned transfers on the bus and aligned data copies. It has been tested on Altix systems quite a bit (the card is OEMed and shipped by SGI). 
On other platforms, we haven't seen significant benefits from header separation modes (unless they are combined with different features, which is another story), and we decided not to introduce an extra bus transfer unless it is clearly justified. Also, the feature has not been sufficiently tested on other platforms. For these reasons, it is left as a feature that is specific to the SGI chipset, but it is really beneficial there. Leonid From ravinandan.arakali@neterion.com Tue Jul 12 14:03:20 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 14:03:23 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CL3JH9032073 for ; Tue, 12 Jul 2005 14:03:20 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j6CL10cx012292; Tue, 12 Jul 2005 17:01:00 -0400 (EDT) Received: from rarakali ([10.16.16.79]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j6CL0wKP012521; Tue, 12 Jul 2005 17:00:58 -0400 (EDT) From: "Ravinandan Arakali" To: "'David S. 
Miller'" , Cc: , , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 12 Jul 2005 14:00:52 -0700 Message-ID: <000a01c58724$ca41a7c0$4f10100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 In-Reply-To: <20050712.133404.52118192.davem@davemloft.net> Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2721 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 990 Lines: 29 The two-buffer mode was added as a configurable option to Kconfig file several months ago. Hence the macro is CONFIG_2BUFF_MODE. The two-buffer receive mode involves two buffers (128 byte aligned) for each packet. This mode drastically increases performance on SGI platforms and hence enabled only for these platforms. On other platforms, there's no difference compared to one-buffer mode but the added complexity and extra memory allocated does not make it worthwhile to enable this mode for non-SGI platforms. Also, most of our QA cycle on non-SGI platforms has been done with one-buffer mode. Thanks, Ravi > > +/* Enable 2 buffer mode by default for SGI system */ > > +#ifdef CONFIG_IA64_SGI_SN2 > > +#define CONFIG_2BUFF_MODE > > +#endif > > this enabled it only on kernel that are built to only run on SN2 > hardware, which is completely useless in practice. Besides that defining > a CONFIG_ symbol from source files is a big no-go. Yes, do this in the Kconfig file instead. 
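To make the two-buffer receive mode described above more concrete, here is a rough userspace sketch of the idea: each packet gets two 128-byte-aligned buffers, with headers in one and payload in the other. Everything here is an assumption for illustration. The names (two_buf_rx, rx_split, rx_align_ptr) are hypothetical, the split is done with memcpy whereas the ASIC does it in hardware, and the header length is simply passed in by the caller. It is not the s2io driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RX_ALIGN 128u	/* buffer alignment quoted in the thread */

/* Hypothetical descriptor: one buffer for headers, one for payload. */
struct two_buf_rx {
	unsigned char *hdr;
	unsigned char *payload;
	size_t hdr_len;
	size_t payload_len;
};

/* Round a pointer up to the next RX_ALIGN boundary. */
static unsigned char *rx_align_ptr(unsigned char *p)
{
	uintptr_t a = ((uintptr_t)p + RX_ALIGN - 1) & ~(uintptr_t)(RX_ALIGN - 1);

	return (unsigned char *)a;
}

/*
 * Software stand-in for the split the hardware performs: the first
 * hdr_len bytes go to rx->hdr, the rest to rx->payload.
 * Returns 0 on success, -1 if the frame is shorter than its header.
 */
static int rx_split(struct two_buf_rx *rx, const unsigned char *frame,
		    size_t frame_len, size_t hdr_len)
{
	if (hdr_len > frame_len)
		return -1;
	memcpy(rx->hdr, frame, hdr_len);
	memcpy(rx->payload, frame + hdr_len, frame_len - hdr_len);
	rx->hdr_len = hdr_len;
	rx->payload_len = frame_len - hdr_len;
	return 0;
}

/* Usage: carve two aligned buffers out of one array, split a 64-byte frame. */
static int two_buf_selftest(void)
{
	static unsigned char raw[4 * RX_ALIGN];
	unsigned char frame[64];
	struct two_buf_rx rx;
	size_t i;

	for (i = 0; i < sizeof(frame); i++)
		frame[i] = (unsigned char)i;

	rx.hdr = rx_align_ptr(raw);
	rx.payload = rx.hdr + RX_ALIGN;
	if ((uintptr_t)rx.hdr % RX_ALIGN || (uintptr_t)rx.payload % RX_ALIGN)
		return -1;
	/* 14 is the length of an untagged Ethernet header. */
	if (rx_split(&rx, frame, sizeof(frame), 14) != 0)
		return -1;
	return (rx.hdr_len == 14 && rx.payload_len == 50 &&
		rx.hdr[0] == 0 && rx.payload[0] == 14) ? 0 : -1;
}
```

The sketch also shows the cost mentioned in the thread: two buffers and two transfers per packet instead of one, which is only worth paying where the aligned copies actually help.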
From davem@davemloft.net Tue Jul 12 14:05:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 14:05:58 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CL5sH9000376 for ; Tue, 12 Jul 2005 14:05:54 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DsRvH-00045x-Im; Tue, 12 Jul 2005 14:04:11 -0700 Date: Tue, 12 Jul 2005 14:04:11 -0700 (PDT) Message-Id: <20050712.140411.41107257.davem@davemloft.net> To: ravinandan.arakali@neterion.com Cc: hch@infradead.org, raghavendra.koushik@neterion.com, jgarzik@pobox.com, netdev@oss.sgi.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements From: "David S. Miller" In-Reply-To: <000a01c58724$ca41a7c0$4f10100a@pc.s2io.com> References: <20050712.133404.52118192.davem@davemloft.net> <000a01c58724$ca41a7c0$4f10100a@pc.s2io.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2722 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 522 Lines: 13 From: "Ravinandan Arakali" Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 12 Jul 2005 14:00:52 -0700 > The two-buffer mode was added as a configurable option > to Kconfig file several months ago. Hence the macro > is CONFIG_2BUFF_MODE. We're saying that you should choose CONFIG_2BUFF_MODE, when CONFIG_IA64_SGI_SN2 is set, inside the Kconfig file using the "default" Kconfig directive. You should never change the setting of CONFIG_* macros in C source. 
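Seen from the C side, the rule above means a source file only tests a CONFIG_ symbol and never defines it. A minimal sketch under that assumption follows. RX_BUFFERS_PER_PACKET and rx_buffers_per_packet() are hypothetical names, and in this userspace form the symbol would be supplied on the compiler command line (for example with -DCONFIG_2BUFF_MODE) as a stand-in for the generated kernel configuration.

```c
#include <assert.h>

/*
 * CONFIG_2BUFF_MODE is deliberately not defined anywhere in this file.
 * The build configuration decides; the source only consumes the result.
 */
#ifdef CONFIG_2BUFF_MODE
#define RX_BUFFERS_PER_PACKET 2
#else
#define RX_BUFFERS_PER_PACKET 1
#endif

/* Number of receive buffers the driver would allocate per packet. */
static int rx_buffers_per_packet(void)
{
	return RX_BUFFERS_PER_PACKET;
}
```

Keeping the decision in the configuration means every file that tests the symbol sees the same value, which a #define buried in one source file cannot guarantee.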
From SRS0+05568e4e9645ae537b16+688+infradead.org+hch@pentafluge.srs.infradead.org Tue Jul 12 14:08:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 14:09:01 -0700 (PDT) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CL8wH9000989 for ; Tue, 12 Jul 2005 14:08:58 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.52 #1 (Red Hat Linux)) id 1DsRy5-00033w-3X; Tue, 12 Jul 2005 22:07:05 +0100 Date: Tue, 12 Jul 2005 22:07:04 +0100 From: Christoph Hellwig To: "David S. Miller" Cc: ravinandan.arakali@neterion.com, hch@infradead.org, raghavendra.koushik@neterion.com, jgarzik@pobox.com, netdev@oss.sgi.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Message-ID: <20050712210704.GA11622@infradead.org> References: <20050712.133404.52118192.davem@davemloft.net> <000a01c58724$ca41a7c0$4f10100a@pc.s2io.com> <20050712.140411.41107257.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050712.140411.41107257.davem@davemloft.net> User-Agent: Mutt/1.4.2.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2723 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 731 Lines: 18 On Tue, Jul 12, 2005 at 02:04:11PM -0700, David S. Miller wrote: > From: "Ravinandan Arakali" > Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements > Date: Tue, 12 Jul 2005 14:00:52 -0700 > > > The two-buffer mode was added as a configurable option > > to Kconfig file several months ago. Hence the macro > > is CONFIG_2BUFF_MODE. 
> > We're saying that you should choose CONFIG_2BUFF_MODE, when > CONFIG_IA64_SGI_SN2 is set, inside the Kconfig file using the > "default" Kconfig directive. And that doesn't help either, CONFIG_IA64_SGI_SN2 isn't used in practice, any production Altix with 26 kernels runs CONFIG_IA64_GENERIC kernels. You have to make this run-time selectable. From ak@suse.de Tue Jul 12 14:28:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 14:28:42 -0700 (PDT) Received: from mx2.suse.de (ns2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CLSaH9005994 for ; Tue, 12 Jul 2005 14:28:37 -0700 Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 0524A1E13E; Tue, 12 Jul 2005 23:26:54 +0200 (CEST) To: Christoph Hellwig Cc: netdev@oss.sgi.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements References: <20050712.133404.52118192.davem@davemloft.net> <000a01c58724$ca41a7c0$4f10100a@pc.s2io.com> <20050712.140411.41107257.davem@davemloft.net> <20050712210704.GA11622@infradead.org> From: Andi Kleen Date: 12 Jul 2005 23:26:41 +0200 In-Reply-To: <20050712210704.GA11622@infradead.org> Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 2724 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev Content-Length: 244 Lines: 8 Christoph Hellwig writes: > > And that doesn't help either, CONFIG_IA64_SGI_SN2 isn't used in practice, > any production Altix with 26 kernels runs CONFIG_IA64_GENERIC kernels. At least SLES9/ia64 has a SN2 kernel. 
-Andi From ravinandan.arakali@neterion.com Tue Jul 12 14:57:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 14:57:07 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6CLv2H9007627 for ; Tue, 12 Jul 2005 14:57:03 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j6CLsfcx012536; Tue, 12 Jul 2005 17:54:41 -0400 (EDT) Received: from rarakali ([10.16.16.79]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j6CLsXKP022852; Tue, 12 Jul 2005 17:54:33 -0400 (EDT) From: "Ravinandan Arakali" To: "'David S. Miller'" Cc: , , , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 12 Jul 2005 14:54:28 -0700 Message-ID: <000c01c5872c$47829c60$4f10100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 In-Reply-To: <20050712.140411.41107257.davem@davemloft.net> Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2725 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 998 Lines: 29 Okay, got it. Will submit patch on Kconfig file after the macro for SGI platforms is confirmed. Ravi -----Original Message----- From: David S. 
Miller [mailto:davem@davemloft.net] Sent: Tuesday, July 12, 2005 2:04 PM To: ravinandan.arakali@neterion.com Cc: hch@infradead.org; raghavendra.koushik@neterion.com; jgarzik@pobox.com; netdev@oss.sgi.com; leonid.grossman@neterion.com; rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements From: "Ravinandan Arakali" Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 12 Jul 2005 14:00:52 -0700 > The two-buffer mode was added as a configurable option > to Kconfig file several months ago. Hence the macro > is CONFIG_2BUFF_MODE. We're saying that you should choose CONFIG_2BUFF_MODE, when CONFIG_IA64_SGI_SN2 is set, inside the Kconfig file using the "default" Kconfig directive. You should never change the setting of CONFIG_* macros in C source. From aaron.f.brown@intel.com Tue Jul 12 17:14:29 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 12 Jul 2005 17:14:32 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6D0ESH9018655 for ; Tue, 12 Jul 2005 17:14:28 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j6D0CkUF017467 for ; Wed, 13 Jul 2005 00:12:46 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j6D0CjCu003624 for ; Wed, 13 Jul 2005 00:12:46 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs041.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005071217124611963 for ; Tue, 12 Jul 2005 17:12:46 -0700 Received: from orsmsx401.amr.corp.intel.com ([192.168.65.207]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Tue, 12 Jul 2005 17:11:13 -0700 X-MimeOLE: Produced By 
Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C5873F.60C09F5E" Subject: RE: New Platform Test position thoughts... Date: Tue, 12 Jul 2005 17:11:12 -0700 Message-ID: <31F5998A44B92447BD334F8FBBA0B01F09285748@orsmsx401.amr.corp.intel.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: New Platform Test position thoughts... Thread-Index: AcWHNGXMhzO/5cLSQL2pQMv+81VVnwAClztw From: "Brown, Aaron F" To: X-OriginalArrivalTime: 13 Jul 2005 00:11:13.0224 (UTC) FILETIME=[61221480:01C5873F] X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2726 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aaron.f.brown@intel.com Precedence: bulk X-list: netdev Content-Length: 6055 Lines: 217 This is a multi-part message in MIME format. ------_=_NextPart_001_01C5873F.60C09F5E Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable Please ignore this message thread. It was only intended as a message to my direct work colleagues and I have absolutely no idea how I managed to include the netdev@oss.sgi.com list. I guess my only condolence is that I managed to spam this out to the old list address and not the current one ;-) =20 Head hung down in embarrassment... =20 -Aaron Brown =20 ________________________________ From: Brown, Aaron F=20 Sent: Tuesday, July 12, 2005 3:53 PM To: Frost, Stephen C; Francis, Joey L; Overdorf, Sam; Glick, Kevin; Tantilov, Emil S Cc: Perkins, John S; netdev@oss.sgi.com Subject: New Platform Test position thoughts... =20 Hi all, =20 I'm working on defining this new platform test position for the Unix group that I volunteered for. =20 =20 ------_=_NextPart_001_01C5873F.60C09F5E Content-Type: text/html; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01C5873F.60C09F5E-- From dhollis@davehollis.com Wed Jul 13 06:24:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 13 Jul 2005 06:24:23 -0700 (PDT) Received: from ms-smtp-01.tampabay.rr.com (ms-smtp-01-smtplb.tampabay.rr.com [65.32.5.131]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6DDOEH9023120 for ; Wed, 13 Jul 2005 06:24:15 -0700 Received: from smtp.davehollis.com (111-50.26-24.tampabay.res.rr.com [24.26.50.111]) by ms-smtp-01.tampabay.rr.com (8.12.10/8.12.7) with ESMTP id j6DChdEg025949; Wed, 13 Jul 2005 08:43:39 -0400 (EDT) Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.davehollis.com (Postfix) with ESMTP id D5287B35AD; Wed, 13 Jul 2005 08:43:37 -0400 (EDT) Received: from smtp.davehollis.com ([127.0.0.1]) by localhost (bender.davehollis.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 08065-09; Wed, 13 Jul 2005 08:43:29 -0400 (EDT) Received: from [10.11.0.10] (static-68-238-170-130.tampfl.fios.verizon.net [68.238.170.130]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) by smtp.davehollis.com (Postfix) with ESMTP id 0FCADB35AC; Wed, 13 Jul 2005 08:43:27 -0400 (EDT) Subject: Re: [PATCH] forcedeth: Additional ethtool support From: David Hollis To: Manfred Spraul Cc: Jeff Garzik , Netdev , renaud.lienhart@free.fr In-Reply-To: <42D101EC.6000608@colorfullife.com> References: <42D101EC.6000608@colorfullife.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-Z/rD2cz0FCO9rdb3WWp/" Date: Wed, 13 Jul 2005 08:39:21 -0400 Message-Id: <1121258362.5023.1.camel@dhollis-lnx.sunera.com> Mime-Version: 1.0 X-Mailer: Evolution 2.2.2 (2.2.2-5) X-archive-position: 2727 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dhollis@davehollis.com Precedence: bulk X-list: netdev Content-Length: 1354 Lines: 54 --=-Z/rD2cz0FCO9rdb3WWp/ Content-Type: 
text/plain Content-Transfer-Encoding: quoted-printable On Sun, 2005-07-10 at 13:09 +0200, Manfred Spraul wrote: > + > +static int nv_nway_reset(struct net_device *dev) > +{ > + struct fe_priv *np =3D get_nvpriv(dev); > + int ret; > + > + spin_lock_irq(&np->lock); > + if (np->autoneg) { > + int bmcr; > + > + bmcr =3D mii_rw(dev, np->phyaddr, MII_BMCR, MII_READ); > + bmcr |=3D (BMCR_ANENABLE | BMCR_ANRESTART); > + mii_rw(dev, np->phyaddr, MII_BMCR, bmcr); > + > + ret =3D 0; > + } else { > + ret =3D -EINVAL; > + } > + spin_unlock_irq(&np->lock); > + > + return ret; > +} > + This seems almost completely generic-ified (except for the np->autoneg part) and should be able to operate on any NIC. Do you think there'd be some way to whip up a stock ethtool_nway_reset() type of function that drivers can use if they don't need to do anything fancy? Would help save a lot of code duplication. --=20 David Hollis --=-Z/rD2cz0FCO9rdb3WWp/ Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQBC1Qt5xasLqOyGHncRAp8jAJ9XLG/8d8Dkspyk3dgkxE4DLdsbogCguIYg 0GiUeyFb1vikkowDrHfdktg= =69hb -----END PGP SIGNATURE----- --=-Z/rD2cz0FCO9rdb3WWp/-- From sri@us.ibm.com Wed Jul 13 17:56:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 13 Jul 2005 17:56:42 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6E0uUH9010701 for ; Wed, 13 Jul 2005 17:56:37 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e34.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j6E0slfj224704 for ; Wed, 13 Jul 2005 20:54:47 -0400 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j6E0sl5b248264 for ; Wed, 13 Jul 2005 18:54:47 -0600 Received: from 
d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j6E0skmH021177 for ; Wed, 13 Jul 2005 18:54:46 -0600 Received: from w-sridhar2.beaverton.ibm.com (w-sridhar2.beaverton.ibm.com [9.47.18.20]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j6E0sjqW021159; Wed, 13 Jul 2005 18:54:45 -0600 Subject: [PATCH] [SCTP] Fix potential null pointer dereference while handling an icmp error From: Sridhar Samudrala To: davem@davemloft.net Cc: lksctp-developers@lists.sourceforge.net, netdev@oss.sgi.com, cdeboys@stevens.edu Content-Type: text/plain Date: Wed, 13 Jul 2005 17:54:45 -0700 Message-Id: <1121302485.5841.30.camel@w-sridhar2.beaverton.ibm.com> Mime-Version: 1.0 X-Mailer: Evolution 2.2.2 (2.2.2-5) Content-Transfer-Encoding: 7bit X-archive-position: 2730 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sri@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 6316 Lines: 210 Dave, Please apply the following patch which fixes a potential null pointer dereference while handling an icmp error in sctp_icmp_* routines. These routines assume that we are passing a valid asoc. 
Reported by Charles-Henri de Boysson Thanks Sridhar --------------------------------------------------------------------- [SCTP] Fix potential null pointer dereference while handling an icmp error Signed-off-by: Sridhar Samudrala diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h --- a/include/net/sctp/sctp.h +++ b/include/net/sctp/sctp.h @@ -167,15 +167,12 @@ void sctp_unhash_established(struct sctp void sctp_hash_endpoint(struct sctp_endpoint *); void sctp_unhash_endpoint(struct sctp_endpoint *); struct sock *sctp_err_lookup(int family, struct sk_buff *, - struct sctphdr *, struct sctp_endpoint **, - struct sctp_association **, + struct sctphdr *, struct sctp_association **, struct sctp_transport **); -void sctp_err_finish(struct sock *, struct sctp_endpoint *, - struct sctp_association *); +void sctp_err_finish(struct sock *, struct sctp_association *); void sctp_icmp_frag_needed(struct sock *, struct sctp_association *, struct sctp_transport *t, __u32 pmtu); void sctp_icmp_proto_unreachable(struct sock *sk, - struct sctp_endpoint *ep, struct sctp_association *asoc, struct sctp_transport *t); diff --git a/net/sctp/input.c b/net/sctp/input.c --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -351,7 +351,6 @@ void sctp_icmp_frag_needed(struct sock * * */ void sctp_icmp_proto_unreachable(struct sock *sk, - struct sctp_endpoint *ep, struct sctp_association *asoc, struct sctp_transport *t) { @@ -367,7 +366,6 @@ void sctp_icmp_proto_unreachable(struct /* Common lookup code for icmp/icmpv6 error handler. 
*/ struct sock *sctp_err_lookup(int family, struct sk_buff *skb, struct sctphdr *sctphdr, - struct sctp_endpoint **epp, struct sctp_association **app, struct sctp_transport **tpp) { @@ -375,11 +373,10 @@ struct sock *sctp_err_lookup(int family, union sctp_addr daddr; struct sctp_af *af; struct sock *sk = NULL; - struct sctp_endpoint *ep = NULL; struct sctp_association *asoc = NULL; struct sctp_transport *transport = NULL; - *app = NULL; *epp = NULL; *tpp = NULL; + *app = NULL; *tpp = NULL; af = sctp_get_af_specific(family); if (unlikely(!af)) { @@ -394,26 +391,15 @@ struct sock *sctp_err_lookup(int family, * packet. */ asoc = __sctp_lookup_association(&saddr, &daddr, &transport); - if (!asoc) { - /* If there is no matching association, see if it matches any - * endpoint. This may happen for an ICMP error generated in - * response to an INIT_ACK. - */ - ep = __sctp_rcv_lookup_endpoint(&daddr); - if (!ep) { - return NULL; - } - } + if (!asoc) + return NULL; - if (asoc) { - sk = asoc->base.sk; + sk = asoc->base.sk; - if (ntohl(sctphdr->vtag) != asoc->c.peer_vtag) { - ICMP_INC_STATS_BH(ICMP_MIB_INERRORS); - goto out; - } - } else - sk = ep->base.sk; + if (ntohl(sctphdr->vtag) != asoc->c.peer_vtag) { + ICMP_INC_STATS_BH(ICMP_MIB_INERRORS); + goto out; + } sctp_bh_lock_sock(sk); @@ -423,7 +409,6 @@ struct sock *sctp_err_lookup(int family, if (sock_owned_by_user(sk)) NET_INC_STATS_BH(LINUX_MIB_LOCKDROPPEDICMPS); - *epp = ep; *app = asoc; *tpp = transport; return sk; @@ -432,21 +417,16 @@ out: sock_put(sk); if (asoc) sctp_association_put(asoc); - if (ep) - sctp_endpoint_put(ep); return NULL; } /* Common cleanup code for icmp/icmpv6 error handler. 
*/ -void sctp_err_finish(struct sock *sk, struct sctp_endpoint *ep, - struct sctp_association *asoc) +void sctp_err_finish(struct sock *sk, struct sctp_association *asoc) { sctp_bh_unlock_sock(sk); sock_put(sk); if (asoc) sctp_association_put(asoc); - if (ep) - sctp_endpoint_put(ep); } /* @@ -471,7 +451,6 @@ void sctp_v4_err(struct sk_buff *skb, __ int type = skb->h.icmph->type; int code = skb->h.icmph->code; struct sock *sk; - struct sctp_endpoint *ep; struct sctp_association *asoc; struct sctp_transport *transport; struct inet_sock *inet; @@ -488,7 +467,7 @@ void sctp_v4_err(struct sk_buff *skb, __ savesctp = skb->h.raw; skb->nh.iph = iph; skb->h.raw = (char *)sh; - sk = sctp_err_lookup(AF_INET, skb, sh, &ep, &asoc, &transport); + sk = sctp_err_lookup(AF_INET, skb, sh, &asoc, &transport); /* Put back, the original pointers. */ skb->nh.raw = saveip; skb->h.raw = savesctp; @@ -515,7 +494,7 @@ void sctp_v4_err(struct sk_buff *skb, __ } else { if (ICMP_PROT_UNREACH == code) { - sctp_icmp_proto_unreachable(sk, ep, asoc, + sctp_icmp_proto_unreachable(sk, asoc, transport); goto out_unlock; } @@ -544,7 +523,7 @@ void sctp_v4_err(struct sk_buff *skb, __ } out_unlock: - sctp_err_finish(sk, ep, asoc); + sctp_err_finish(sk, asoc); } /* diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -91,7 +91,6 @@ SCTP_STATIC void sctp_v6_err(struct sk_b struct ipv6hdr *iph = (struct ipv6hdr *)skb->data; struct sctphdr *sh = (struct sctphdr *)(skb->data + offset); struct sock *sk; - struct sctp_endpoint *ep; struct sctp_association *asoc; struct sctp_transport *transport; struct ipv6_pinfo *np; @@ -105,7 +104,7 @@ SCTP_STATIC void sctp_v6_err(struct sk_b savesctp = skb->h.raw; skb->nh.ipv6h = iph; skb->h.raw = (char *)sh; - sk = sctp_err_lookup(AF_INET6, skb, sh, &ep, &asoc, &transport); + sk = sctp_err_lookup(AF_INET6, skb, sh, &asoc, &transport); /* Put back, the original pointers. 
*/ skb->nh.raw = saveip; skb->h.raw = savesctp; @@ -124,7 +123,7 @@ SCTP_STATIC void sctp_v6_err(struct sk_b goto out_unlock; case ICMPV6_PARAMPROB: if (ICMPV6_UNK_NEXTHDR == code) { - sctp_icmp_proto_unreachable(sk, ep, asoc, transport); + sctp_icmp_proto_unreachable(sk, asoc, transport); goto out_unlock; } break; @@ -142,7 +141,7 @@ SCTP_STATIC void sctp_v6_err(struct sk_b } out_unlock: - sctp_err_finish(sk, ep, asoc); + sctp_err_finish(sk, asoc); out: if (likely(idev != NULL)) in6_dev_put(idev); From ric@emc.com Thu Jul 14 05:30:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 14 Jul 2005 05:30:24 -0700 (PDT) Received: from mailhub.lss.emc.com (mailhub.lss.emc.com [168.159.2.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6ECUCH9004748 for ; Thu, 14 Jul 2005 05:30:13 -0700 Received: from [192.168.1.103] ([10.4.9.42]) by mailhub.lss.emc.com (Switch-3.1.6/Switch-3.1.6) with ESMTP id j6ECRgPJ014655; Thu, 14 Jul 2005 08:27:43 -0400 (EDT) Message-ID: <42D65A41.7070403@emc.com> Date: Thu, 14 Jul 2005 08:27:45 -0400 From: Ric Wheeler User-Agent: Mozilla Thunderbird 1.0 (X11/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Patrick McHardy , Herbert Xu , Jozsef Kadlecsik CC: Yair Itzhaki , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org, "Chitrapu, Kishore" , "Mellors, Andrew" Subject: Re: Re-routing packets via netfilter (ip_rt_bug) References: <4151C0F9B9C25C47B3328922A6297A3286CFB8@post.arx.com> In-Reply-To: <4151C0F9B9C25C47B3328922A6297A3286CFB8@post.arx.com> Content-Type: text/plain; charset=iso-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 4.7.1.128075, Antispam-Engine: 2.0.3.2, Antispam-Data: 2005.7.14.8 X-PerlMx-Spam: Gauge=, SPAM=7%, Reasons='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_TEXT_ONLY 0, __MIME_VERSION 0, __SANE_MSGID 0, __USER_AGENT 0' X-archive-position: 2731 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ric@emc.com Precedence: bulk X-list: netdev Content-Length: 2172 Lines: 73 Patrick, Herbert, This issue still seems to be in the latest trees - is this patch or a variation on it still bumping around? Thanks! Yair Itzhaki wrote: >Can anyone propose a patch that I can start checking? > >I have come up with the following: > >--- net/core/netfilter.c.orig 2005-04-18 21:55:30.000000000 +0300 >+++ net/core/netfilter.c 2005-05-02 17:35:20.000000000 +0300 >@@ -622,9 +622,10 @@ > /* some non-standard hacks like ipt_REJECT.c:send_reset() can cause > * packets with foreign saddr to appear on the NF_IP_LOCAL_OUT hook. > */ >- if (inet_addr_type(iph->saddr) == RTN_LOCAL) { >+ if ((inet_addr_type(iph->saddr) == RTN_LOCAL) || >+ (inet_addr_type(iph->daddr) == RTN_LOCAL)) { > fl.nl_u.ip4_u.daddr = iph->daddr; >- fl.nl_u.ip4_u.saddr = iph->saddr; >+ fl.nl_u.ip4_u.saddr = 0; > fl.nl_u.ip4_u.tos = RT_TOS(iph->tos); > fl.oif = (*pskb)->sk ? (*pskb)->sk->sk_bound_dev_if : 0; > #ifdef CONFIG_IP_ROUTE_FWMARK > >Please advise, >Yair > > > > >>-----Original Message----- >>From: Patrick McHardy [mailto:kaber@trash.net] >>Sent: Wednesday, April 27, 2005 14:05 >>To: Herbert Xu >>Cc: Jozsef Kadlecsik; netdev@oss.sgi.com; >>netfilter-devel@lists.netfilter.org; Yair Itzhaki; >>linux-kernel@vger.kernel.org >>Subject: Re: Re-routing packets via netfilter (ip_rt_bug) >> >> >>Herbert Xu wrote: >> >> >>>Here is another reason why these packets should go through FORWARD. >>>They were generated in response to packets in INPUT/FORWARD/OUTPUT. >>>The original packet has not undergone SNAT in any of these cases. >>> >>>However, if we feed the response packet through LOCAL_OUT it will >>>be subject to DNAT. This creates a NAT asymmetry and we may end >>>up with the wrong destination address. >>> >>>By pushing it through FORWARD it will only undergo SNAT which is >>>correct since the original packet would have undergone DNAT.
>>> >>> >>This is only a problem since the recent NAT changes, but I agree >>that we should fix it by moving these packets to FORWARD. >> >>Regards >>Patrick >> >> >> > > > From hahn@physics.mcmaster.ca Thu Jul 14 18:11:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 14 Jul 2005 18:11:47 -0700 (PDT) Received: from coriana6.cis.mcmaster.ca (coriana6.CIS.McMaster.CA [130.113.128.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6F1BiH9029293 for ; Thu, 14 Jul 2005 18:11:45 -0700 Received: from coffee.psychology.mcmaster.ca (coffee.Psychology.McMaster.CA [130.113.218.59]) by coriana6.cis.mcmaster.ca (8.12.11/8.12.11) with ESMTP id j6F19vr8016798 for ; Thu, 14 Jul 2005 21:09:57 -0400 (EDT) Received: by coffee.psychology.mcmaster.ca (Postfix, from userid 502) id 251438D74; Thu, 14 Jul 2005 21:09:57 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by coffee.psychology.mcmaster.ca (Postfix) with ESMTP id 236308B25 for ; Thu, 14 Jul 2005 21:09:57 -0400 (EDT) Date: Thu, 14 Jul 2005 21:09:57 -0400 (EDT) From: Mark Hahn X-X-Sender: hahn@coffee.psychology.mcmaster.ca To: netdev@oss.sgi.com Subject: per-route TCP windows? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-PMX-Version-Mac: 4.7.1.128075, Antispam-Engine: 2.0.3.2, Antispam-Data: 2005.7.14.34 X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_TEXT_ONLY 0, __MIME_VERSION 0, __SANE_MSGID 0' X-archive-position: 2732 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hahn@physics.mcmaster.ca Precedence: bulk X-list: netdev Content-Length: 522 Lines: 13 I was recently fiddling with TCP performance on long-fat links, and noticed that iproute2 seems to allow one to set a per-route window. is this implemented in the stack? I gave it one try and it didn't seem to work, but I admit I haven't delved into the code yet. 
we'd like to tune for some 10Gb, 3-6 ms links, mainly to move large files between our sites. this is not CERN-scale stuff, but being able to configure large windows on particular routes sounds terribly convenient... thanks, mark hahn sharcnet-mcmaster From davem@davemloft.net Fri Jul 15 07:30:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 15 Jul 2005 07:30:16 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6FEUCH9028012 for ; Fri, 15 Jul 2005 07:30:13 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DtRBG-0005QJ-OV; Fri, 15 Jul 2005 07:28:46 -0700 Date: Fri, 15 Jul 2005 07:28:46 -0700 (PDT) Message-Id: <20050715.072846.74728413.davem@davemloft.net> To: hahn@physics.mcmaster.ca Cc: netdev@oss.sgi.com Subject: Re: per-route TCP windows? From: "David S. Miller" In-Reply-To: References: X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2733 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 367 Lines: 8 From: Mark Hahn Date: Thu, 14 Jul 2005 21:09:57 -0400 (EDT) > we'd like to tune for some 10Gb, 3-6 ms links, mainly to move large files > between our sites. this is not CERN-scale stuff, but being able to configure > large windows on particular routes sounds terribly convenient... You can set the window, but you must lock the setting. 
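As a rough guide to what "large windows" means for the links described above: the window needed to keep a path full is about the bandwidth-delay product. The following is a minimal sketch of that arithmetic, assuming the quoted 3-6 ms figures are round-trip times (the mail does not say). The function name bdp_bytes() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bandwidth-delay product in bytes:
 *   (bits per second / 8) * round-trip time in seconds.
 * rtt_us is the round-trip time in microseconds.
 */
static uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_us)
{
	assert(rtt_us > 0);
	/* Convert bits to bytes first so the intermediate product stays small. */
	return (bits_per_sec / 8) * rtt_us / 1000000;
}
```

For a 10 Gb/s link this works out to 3,750,000 bytes at 3 ms and 7,500,000 bytes at 6 ms, so a per-route window would have to be in that range to fill the pipe.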
From manfred@colorfullife.com Sat Jul 16 07:07:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 07:07:13 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GE78H9012950 for ; Sat, 16 Jul 2005 07:07:09 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6GE8KMZ017572; Sat, 16 Jul 2005 16:08:20 +0200 Message-ID: <42D9141E.3070401@colorfullife.com> Date: Sat, 16 Jul 2005 16:05:18 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Linux Kernel Mailing List , Netdev CC: Ayaz Abdulla Subject: [PATCH] forcedeth: TX handler changes (experimental) Content-Type: multipart/mixed; boundary="------------070800010001050908070209" X-archive-position: 2735 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 10794 Lines: 268 This is a multi-part message in MIME format. --------------070800010001050908070209 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit [If you receive the mail twice - sorry. I forgot to attach the actual patch] Hi, Attached is a patch that modifies the tx interrupt handling of the nForce nic. It's part of the attempts to figure out what causes the nic hangs (see bug 4552). The change is experimental: It affects all nForce versions. I've tested it on my nForce 250-Gb. Please test it. And especially: If you experience a nic hang, please send me the debug output. That's the block starting with << NETDEV WATCHDOG: eth1: transmit timed out eth1: Got tx_timeout. irq: 00000000 eth1: Ring at ...
<< Thanks, Manfred --------------070800010001050908070209 Content-Type: text/plain; name="patch-forcedeth-038-txirq" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-038-txirq" --- 2.6/drivers/net/forcedeth.c 2005-07-16 13:10:30.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-07-16 15:58:03.000000000 +0200 @@ -87,6 +87,8 @@ * 0.35: 26 Jun 2005: Support for MCP55 added. * 0.36: 28 Jun 2005: Add jumbo frame support. * 0.37: 10 Jul 2005: Additional ethtool support, cleanup of pci id list + * 0.38: 16 Jul 2005: tx irq rewrite: Use global flags instead of + * per-packet flags. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -98,7 +100,7 @@ * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few * superfluous timer interrupts from the nic. */ -#define FORCEDETH_VERSION "0.36" +#define FORCEDETH_VERSION "0.38" #define DRV_NAME "forcedeth" #include @@ -133,12 +135,9 @@ * Hardware access: */ -#define DEV_NEED_LASTPACKET1 0x0001 /* set LASTPACKET1 in tx flags */ -#define DEV_IRQMASK_1 0x0002 /* use NVREG_IRQMASK_WANTED_1 for irq mask */ -#define DEV_IRQMASK_2 0x0004 /* use NVREG_IRQMASK_WANTED_2 for irq mask */ -#define DEV_NEED_TIMERIRQ 0x0008 /* set the timer irq flag in the irq mask */ -#define DEV_NEED_LINKTIMER 0x0010 /* poll link settings. Relies on the timer irq */ -#define DEV_HAS_LARGEDESC 0x0020 /* device supports jumbo frames and needs packet format 2 */ +#define DEV_NEED_TIMERIRQ 0x0001 /* set the timer irq flag in the irq mask */ +#define DEV_NEED_LINKTIMER 0x0002 /* poll link settings. 
Relies on the timer irq */ +#define DEV_HAS_LARGEDESC 0x0004 /* device supports jumbo frames and needs packet format 2 */ enum { NvRegIrqStatus = 0x000, @@ -149,13 +148,16 @@ #define NVREG_IRQ_RX 0x0002 #define NVREG_IRQ_RX_NOBUF 0x0004 #define NVREG_IRQ_TX_ERR 0x0008 -#define NVREG_IRQ_TX2 0x0010 +#define NVREG_IRQ_TX_OK 0x0010 #define NVREG_IRQ_TIMER 0x0020 #define NVREG_IRQ_LINK 0x0040 +#define NVREG_IRQ_TX_ERROR 0x0080 #define NVREG_IRQ_TX1 0x0100 -#define NVREG_IRQMASK_WANTED_1 0x005f -#define NVREG_IRQMASK_WANTED_2 0x0147 -#define NVREG_IRQ_UNKNOWN (~(NVREG_IRQ_RX_ERROR|NVREG_IRQ_RX|NVREG_IRQ_RX_NOBUF|NVREG_IRQ_TX_ERR|NVREG_IRQ_TX2|NVREG_IRQ_TIMER|NVREG_IRQ_LINK|NVREG_IRQ_TX1)) +#define NVREG_IRQMASK_WANTED 0x00df + +#define NVREG_IRQ_UNKNOWN (~(NVREG_IRQ_RX_ERROR|NVREG_IRQ_RX|NVREG_IRQ_RX_NOBUF|NVREG_IRQ_TX_ERR| \ + NVREG_IRQ_TX_OK|NVREG_IRQ_TIMER|NVREG_IRQ_LINK|NVREG_IRQ_TX_ERROR| \ + NVREG_IRQ_TX1)) NvRegUnknownSetupReg6 = 0x008, #define NVREG_UNKSETUP6_VAL 3 @@ -296,7 +298,7 @@ #define NV_TX_LASTPACKET (1<<16) #define NV_TX_RETRYERROR (1<<19) -#define NV_TX_LASTPACKET1 (1<<24) +#define NV_TX_FORCED_INTERRUPT (1<<24) #define NV_TX_DEFERRED (1<<26) #define NV_TX_CARRIERLOST (1<<27) #define NV_TX_LATECOLLISION (1<<28) @@ -306,7 +308,7 @@ #define NV_TX2_LASTPACKET (1<<29) #define NV_TX2_RETRYERROR (1<<18) -#define NV_TX2_LASTPACKET1 (1<<23) +#define NV_TX2_FORCED_INTERRUPT (1<<30) #define NV_TX2_DEFERRED (1<<25) #define NV_TX2_CARRIERLOST (1<<26) #define NV_TX2_LATECOLLISION (1<<27) @@ -1013,9 +1015,39 @@ struct fe_priv *np = get_nvpriv(dev); u8 __iomem *base = get_hwbase(dev); - dprintk(KERN_DEBUG "%s: Got tx_timeout. irq: %08x\n", dev->name, + printk(KERN_INFO "%s: Got tx_timeout.
irq: %08x\n", dev->name, readl(base + NvRegIrqStatus) & NVREG_IRQSTAT_MASK); + { + int i; + + printk(KERN_INFO "%s: Ring at %lx: next %d nic %d\n", + dev->name, (unsigned long)np->ring_addr, + np->next_tx, np->nic_tx); + printk(KERN_INFO "%s: Dumping tx registers\n", dev->name); + for (i=0;i<0x400;i+= 32) { + printk(KERN_INFO "%3x: %08x %08x %08x %08x %08x %08x %08x %08x\n", + i, + readl(base + i + 0), readl(base + i + 4), + readl(base + i + 8), readl(base + i + 12), + readl(base + i + 16), readl(base + i + 20), + readl(base + i + 24), readl(base + i + 28)); + } + printk(KERN_INFO "%s: Dumping tx ring\n", dev->name); + for (i=0;itx_ring[i].PacketBuffer), + le32_to_cpu(np->tx_ring[i].FlagLen), + le32_to_cpu(np->tx_ring[i+1].PacketBuffer), + le32_to_cpu(np->tx_ring[i+1].FlagLen), + le32_to_cpu(np->tx_ring[i+2].PacketBuffer), + le32_to_cpu(np->tx_ring[i+2].FlagLen), + le32_to_cpu(np->tx_ring[i+3].PacketBuffer), + le32_to_cpu(np->tx_ring[i+3].FlagLen)); + } + } + spin_lock_irq(&np->lock); /* 1) stop tx engine */ @@ -1557,7 +1589,7 @@ if (!(events & np->irqmask)) break; - if (events & (NVREG_IRQ_TX1|NVREG_IRQ_TX2|NVREG_IRQ_TX_ERR)) { + if (events & (NVREG_IRQ_TX1|NVREG_IRQ_TX_OK|NVREG_IRQ_TX_ERROR|NVREG_IRQ_TX_ERR)) { spin_lock(&np->lock); nv_tx_done(dev); spin_unlock(&np->lock); @@ -2213,17 +2245,10 @@ if (np->desc_ver == DESC_VER_1) { np->tx_flags = NV_TX_LASTPACKET|NV_TX_VALID; - if (id->driver_data & DEV_NEED_LASTPACKET1) - np->tx_flags |= NV_TX_LASTPACKET1; } else { np->tx_flags = NV_TX2_LASTPACKET|NV_TX2_VALID; - if (id->driver_data & DEV_NEED_LASTPACKET1) - np->tx_flags |= NV_TX2_LASTPACKET1; } - if (id->driver_data & DEV_IRQMASK_1) - np->irqmask = NVREG_IRQMASK_WANTED_1; - if (id->driver_data & DEV_IRQMASK_2) - np->irqmask = NVREG_IRQMASK_WANTED_2; + np->irqmask = NVREG_IRQMASK_WANTED; if (id->driver_data & DEV_NEED_TIMERIRQ) np->irqmask |= NVREG_IRQ_TIMER; if (id->driver_data & DEV_NEED_LINKTIMER) { @@ -2329,73 +2354,63 @@ static struct pci_device_id pci_tbl[] 
= { { /* nForce Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_1), - .driver_data = DEV_IRQMASK_1|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce2 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_2), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_3), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_4), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_5), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_6), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_7), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_8), - .driver_data = 
DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_9), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_10), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_11), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP51 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_12), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP51 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_13), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_14), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_15), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - 
DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, {0,}, }; --------------070800010001050908070209-- From manfred@colorfullife.com Sat Jul 16 07:06:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 07:06:14 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GE67H9012865 for ; Sat, 16 Jul 2005 07:06:10 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6GE77sN017561; Sat, 16 Jul 2005 16:07:08 +0200 Message-ID: <42D913D6.5050902@colorfullife.com> Date: Sat, 16 Jul 2005 16:04:06 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Linux Kernel Mailing List , Netdev CC: Ayaz Abdulla Subject: [PATCH] forcedeth: TX handler changes (experimental) Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2734 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 517 Lines: 19 Hi, Attached is a patch that modifies the tx interrupt handling of the nForce nic. It's part of the attempts to figure out what causes the nic hangs (see bug 4552). The change is experimental: It affects all nForce versions. I've tested it on my nForce 250-Gb. Please test it. And especially: If you experience a nic hang, please send me the debug output. That's the block starting with << NETDEV WATCHDOG: eth1: transmit timed out eth1: Got tx_timeout. irq: 00000000 eth1: Ring at ... 
<< Thanks, Manfred From dsd@gentoo.org Sat Jul 16 09:16:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 09:16:49 -0700 (PDT) Received: from mta07-winn.ispmail.ntl.com (mta07-winn.ispmail.ntl.com [81.103.221.47]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GGGhH9024576 for ; Sat, 16 Jul 2005 09:16:43 -0700 Received: from aamta09-winn.ispmail.ntl.com ([81.103.221.35]) by mta07-winn.ispmail.ntl.com with ESMTP id <20050716161457.MKYE481.mta07-winn.ispmail.ntl.com@aamta09-winn.ispmail.ntl.com>; Sat, 16 Jul 2005 17:14:57 +0100 Received: from zog.reactivated.net ([81.99.81.161]) by aamta09-winn.ispmail.ntl.com with ESMTP id <20050716161456.NZZH29368.aamta09-winn.ispmail.ntl.com@zog.reactivated.net>; Sat, 16 Jul 2005 17:14:56 +0100 Received: from [192.168.0.2] (dsd [192.168.0.2]) by zog.reactivated.net (Postfix) with ESMTP id 4B7BC7BAD06; Sat, 16 Jul 2005 17:39:30 +0100 (BST) Message-ID: <42D932E2.20005@gentoo.org> Date: Sat, 16 Jul 2005 17:16:34 +0100 From: Daniel Drake User-Agent: Mozilla Thunderbird 1.0.5 (X11/20050715) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Manfred Spraul Cc: Linux Kernel Mailing List , Netdev , Ayaz Abdulla Subject: Re: [PATCH] forcedeth: TX handler changes (experimental) References: <42D9141E.3070401@colorfullife.com> In-Reply-To: <42D9141E.3070401@colorfullife.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2737 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dsd@gentoo.org Precedence: bulk X-list: netdev Content-Length: 1190 Lines: 37 Hi, Manfred Spraul wrote: > Attached is a patch that modifies the tx interrupt handling of the > nForce nic. It's part of the attempts to figure out what causes the nic > hangs (see bug 4552). > The change is experimental: It affects all nForce versions. I've tested > it on my nForce 250-Gb. 
This patch doesn't apply to 2.6.13-rc3: patching file drivers/net/forcedeth.c Hunk #1 FAILED at 87. Hunk #2 FAILED at 100. Hunk #3 FAILED at 135. Hunk #4 succeeded at 145 (offset -3 lines). Hunk #5 succeeded at 295 (offset -3 lines). Hunk #6 succeeded at 305 (offset -3 lines). Hunk #7 succeeded at 995 (offset -20 lines). Hunk #8 succeeded at 1502 (offset -87 lines). Hunk #9 succeeded at 2112 (offset -133 lines). Hunk #10 FAILED at 2221. 4 out of 10 hunks FAILED -- saving rejects to file drivers/net/forcedeth.c.rej I think this is because 2.6.13-rc3 has forcedeth 0.35. I can't find the patch for 0.35 --> 0.36. (Is this when the netdev archives were in limbo?) I found the patch for 0.36 --> 0.37 here : http://marc.theaimsgroup.com/?l=linux-netdev&m=112101962422678&w=2 Are the earlier changes a prerequisite, or can I just fix the TX handler rejects manually? Thanks, Daniel From manfred@colorfullife.com Sat Jul 16 09:50:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 09:51:01 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GGouH9026524 for ; Sat, 16 Jul 2005 09:50:57 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6GGq3wH017994; Sat, 16 Jul 2005 18:52:04 +0200 Message-ID: <42D93A7E.5090807@colorfullife.com> Date: Sat, 16 Jul 2005 18:49:02 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniel Drake CC: Linux Kernel Mailing List , Netdev , Ayaz Abdulla Subject: Re: [PATCH] forcedeth: TX handler changes (experimental) References: <42D9141E.3070401@colorfullife.com> <42D932E2.20005@gentoo.org> In-Reply-To: <42D932E2.20005@gentoo.org> Content-Type: multipart/mixed; boundary="------------060503060906020809090406" X-archive-position: 2738 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 10509 Lines: 177 This is a multi-part message in MIME format. --------------060503060906020809090406 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Daniel Drake wrote: > Hi, > > Manfred Spraul wrote: > >> Attached is a patch that modifies the tx interrupt handling of the >> nForce nic. It's part of the attempts to figure out what causes the >> nic hangs (see bug 4552). >> The change is experimental: It affects all nForce versions. I've >> tested it on my nForce 250-Gb. > > > This patch doesn't apply to 2.6.13-rc3: > > patching file drivers/net/forcedeth.c > Hunk #1 FAILED at 87. > Hunk #2 FAILED at 100. > Hunk #3 FAILED at 135. > Hunk #4 succeeded at 145 (offset -3 lines). > Hunk #5 succeeded at 295 (offset -3 lines). > Hunk #6 succeeded at 305 (offset -3 lines). > Hunk #7 succeeded at 995 (offset -20 lines). > Hunk #8 succeeded at 1502 (offset -87 lines). > Hunk #9 succeeded at 2112 (offset -133 lines). > Hunk #10 FAILED at 2221. > 4 out of 10 hunks FAILED -- saving rejects to file > drivers/net/forcedeth.c.rej > > I think this is because 2.6.13-rc3 has forcedeth 0.35. > > I can't find the patch for 0.35 --> 0.36. (Is this when the netdev > archives were in limbo?) > Either that, or I just forgot to cc netdev. I've uploaded all recent patches to http://www.colorfullife.com/~manfred/Linux-kernel/forcedeth/ 0.36 is attached. 
-- Manfred --------------060503060906020809090406 Content-Type: text/plain; name="patch-forcedeth-036-jumbo" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename="patch-forcedeth-036-jumbo" [base64-encoded attachment body omitted: it decodes to the same patch-forcedeth-036-jumbo diff that appears in plain text in the next message] --------------060503060906020809090406-- From manfred@colorfullife.com Sat Jul 16 11:16:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 11:16:39 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GIGYH9031042 for ; Sat, 16 Jul 2005 11:16:35 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6GIHlTl018231; Sat, 16 Jul 2005 20:17:48 +0200 Message-ID: <42D94E95.1060303@colorfullife.com> Date: Sat, 16 Jul 2005 20:14:45 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Netdev Subject: 
[PATCH] forcedeth: jumbo frame support Content-Type: multipart/mixed; boundary="------------020905030206060006040904" X-archive-position: 2739 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 7279 Lines: 246 This is a multi-part message in MIME format. --------------020905030206060006040904 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi, Attached is the missing 0.36 patch that adds jumbo frame support for the nForce versions that support it. The performance difference is quite huge: - MTU 1500: 57 MB/sec, around 85% cpu load - MTU 7200: 92 MB/sec, around 37% cpu load Signed-Off-By: Manfred Spraul --------------020905030206060006040904 Content-Type: text/plain; name="patch-forcedeth-036-jumbo" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-036-jumbo" --- 2.6/drivers/net/forcedeth.c 2005-06-28 22:51:26.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-06-28 22:51:40.000000000 +0200 @@ -85,6 +85,7 @@ * 0.33: 16 May 2005: Support for MCP51 added. * 0.34: 18 Jun 2005: Add DEV_NEED_LINKTIMER to all nForce nics. * 0.35: 26 Jun 2005: Support for MCP55 added. + * 0.36: 28 Jul 2005: Add jumbo frame support. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -96,7 +97,7 @@ * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few * superfluous timer interrupts from the nic. */ -#define FORCEDETH_VERSION "0.35" +#define FORCEDETH_VERSION "0.36" #define DRV_NAME "forcedeth" #include @@ -379,9 +380,13 @@ #define TX_LIMIT_START 62 /* rx/tx mac addr + type + vlan + align + slack*/ -#define RX_NIC_BUFSIZE (ETH_DATA_LEN + 64) -/* even more slack */ -#define RX_ALLOC_BUFSIZE (ETH_DATA_LEN + 128) +#define NV_RX_HEADERS (64) +/* even more slack. 
*/ +#define NV_RX_ALLOC_PAD (64) + +/* maximum mtu size */ +#define NV_PKTLIMIT_1 ETH_DATA_LEN /* hard limit not known */ +#define NV_PKTLIMIT_2 9100 /* Actual limit according to NVidia: 9202 */ #define OOM_REFILL (1+HZ/20) #define POLL_WAIT (1+HZ/100) @@ -473,6 +478,7 @@ struct sk_buff *rx_skbuff[RX_RING]; dma_addr_t rx_dma[RX_RING]; unsigned int rx_buf_sz; + unsigned int pkt_limit; struct timer_list oom_kick; struct timer_list nic_poll; @@ -792,7 +798,7 @@ nr = refill_rx % RX_RING; if (np->rx_skbuff[nr] == NULL) { - skb = dev_alloc_skb(RX_ALLOC_BUFSIZE); + skb = dev_alloc_skb(np->rx_buf_sz + NV_RX_ALLOC_PAD); if (!skb) break; @@ -805,7 +811,7 @@ PCI_DMA_FROMDEVICE); np->rx_ring[nr].PacketBuffer = cpu_to_le32(np->rx_dma[nr]); wmb(); - np->rx_ring[nr].FlagLen = cpu_to_le32(RX_NIC_BUFSIZE | NV_RX_AVAIL); + np->rx_ring[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX_AVAIL); dprintk(KERN_DEBUG "%s: nv_alloc_rx: Packet %d marked as Available\n", dev->name, refill_rx); refill_rx++; @@ -831,19 +837,31 @@ enable_irq(dev->irq); } -static int nv_init_ring(struct net_device *dev) +static void nv_init_rx(struct net_device *dev) { struct fe_priv *np = get_nvpriv(dev); int i; - np->next_tx = np->nic_tx = 0; - for (i = 0; i < TX_RING; i++) - np->tx_ring[i].FlagLen = 0; - np->cur_rx = RX_RING; np->refill_rx = 0; for (i = 0; i < RX_RING; i++) np->rx_ring[i].FlagLen = 0; +} + +static void nv_init_tx(struct net_device *dev) +{ + struct fe_priv *np = get_nvpriv(dev); + int i; + + np->next_tx = np->nic_tx = 0; + for (i = 0; i < TX_RING; i++) + np->tx_ring[i].FlagLen = 0; +} + +static int nv_init_ring(struct net_device *dev) +{ + nv_init_tx(dev); + nv_init_rx(dev); return nv_alloc_rx(dev); } @@ -1207,15 +1225,82 @@ } } +static void set_bufsize(struct net_device *dev) +{ + struct fe_priv *np = netdev_priv(dev); + + if (dev->mtu <= ETH_DATA_LEN) + np->rx_buf_sz = ETH_DATA_LEN + NV_RX_HEADERS; + else + np->rx_buf_sz = dev->mtu + NV_RX_HEADERS; +} + /* * nv_change_mtu: dev->change_mtu 
function * Called with dev_base_lock held for read. */ static int nv_change_mtu(struct net_device *dev, int new_mtu) { - if (new_mtu > ETH_DATA_LEN) + struct fe_priv *np = get_nvpriv(dev); + int old_mtu; + + if (new_mtu < 64 || new_mtu > np->pkt_limit) return -EINVAL; + + old_mtu = dev->mtu; dev->mtu = new_mtu; + + /* return early if the buffer sizes will not change */ + if (old_mtu <= ETH_DATA_LEN && new_mtu <= ETH_DATA_LEN) + return 0; + if (old_mtu == new_mtu) + return 0; + + /* synchronized against open : rtnl_lock() held by caller */ + if (netif_running(dev)) { + u8 *base = get_hwbase(dev); + /* + * It seems that the nic preloads valid ring entries into an + * internal buffer. The procedure for flushing everything is + * guessed, there is probably a simpler approach. + * Changing the MTU is a rare event, it shouldn't matter. + */ + disable_irq(dev->irq); + spin_lock_bh(&dev->xmit_lock); + spin_lock(&np->lock); + /* stop engines */ + nv_stop_rx(dev); + nv_stop_tx(dev); + nv_txrx_reset(dev); + /* drain rx queue */ + nv_drain_rx(dev); + nv_drain_tx(dev); + /* reinit driver view of the rx queue */ + nv_init_rx(dev); + nv_init_tx(dev); + /* alloc new rx buffers */ + set_bufsize(dev); + if (nv_alloc_rx(dev)) { + if (!np->in_shutdown) + mod_timer(&np->oom_kick, jiffies + OOM_REFILL); + } + /* reinit nic view of the rx queue */ + writel(np->rx_buf_sz, base + NvRegOffloadConfig); + writel((u32) np->ring_addr, base + NvRegRxRingPhysAddr); + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + writel( ((RX_RING-1) << NVREG_RINGSZ_RXSHIFT) + ((TX_RING-1) << NVREG_RINGSZ_TXSHIFT), + base + NvRegRingSizes); + pci_push(base); + writel(NVREG_TXRXCTL_KICK|np->desc_ver, get_hwbase(dev) + NvRegTxRxControl); + pci_push(base); + + /* restart rx engine */ + nv_start_rx(dev); + nv_start_tx(dev); + spin_unlock(&np->lock); + spin_unlock_bh(&dev->xmit_lock); + enable_irq(dev->irq); + } return 0; } @@ -1792,6 +1877,7 @@ writel(0, base + 
NvRegAdapterControl); /* 2) initialize descriptor rings */ + set_bufsize(dev); oom = nv_init_ring(dev); writel(0, base + NvRegLinkSpeed); @@ -1837,7 +1923,7 @@ writel(NVREG_MISC1_FORCE | NVREG_MISC1_HD, base + NvRegMisc1); writel(readl(base + NvRegTransmitterStatus), base + NvRegTransmitterStatus); writel(NVREG_PFF_ALWAYS, base + NvRegPacketFilterFlags); - writel(NVREG_OFFLOAD_NORMAL, base + NvRegOffloadConfig); + writel(np->rx_buf_sz, base + NvRegOffloadConfig); writel(readl(base + NvRegReceiverStatus), base + NvRegReceiverStatus); get_random_bytes(&i, sizeof(i)); @@ -2007,13 +2093,16 @@ /* handle different descriptor versions */ if (pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_1 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13) + pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 || + pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 || + pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 || + pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13) { np->desc_ver = DESC_VER_1; - else + np->pkt_limit = NV_PKTLIMIT_1; + } else { np->desc_ver = DESC_VER_2; + np->pkt_limit = NV_PKTLIMIT_1; + } err = -ENOMEM; np->base = ioremap(addr, NV_PCI_REGSZ); --------------020905030206060006040904-- From dsd@gentoo.org Sat Jul 16 12:57:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 12:58:06 -0700 (PDT) Received: from mta07-winn.ispmail.ntl.com (mta07-winn.ispmail.ntl.com [81.103.221.47]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GJv3H9007584 for ; Sat, 16 Jul 2005 12:57:04 -0700 Received: from aamta11-winn.ispmail.ntl.com ([81.103.221.35]) by mta07-winn.ispmail.ntl.com with ESMTP id <20050716195517.QDVE481.mta07-winn.ispmail.ntl.com@aamta11-winn.ispmail.ntl.com>; Sat, 16 Jul 2005 20:55:17 +0100 Received: from zog.reactivated.net ([81.99.81.161]) by 
aamta11-winn.ispmail.ntl.com with ESMTP id <20050716195517.PCMI9845.aamta11-winn.ispmail.ntl.com@zog.reactivated.net>; Sat, 16 Jul 2005 20:55:17 +0100 Received: from [192.168.0.2] (dsd [192.168.0.2]) by zog.reactivated.net (Postfix) with ESMTP id 97C587BAD40; Sat, 16 Jul 2005 21:19:53 +0100 (BST) Message-ID: <42D9658B.7020907@gentoo.org> Date: Sat, 16 Jul 2005 20:52:43 +0100 From: Daniel Drake User-Agent: Mozilla Thunderbird 1.0.5 (X11/20050715) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Manfred Spraul Cc: Linux Kernel Mailing List , Netdev , Ayaz Abdulla Subject: Re: [PATCH] forcedeth: TX handler changes (experimental) References: <42D913D6.5050902@colorfullife.com> In-Reply-To: <42D913D6.5050902@colorfullife.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2740 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dsd@gentoo.org Precedence: bulk X-list: netdev Content-Length: 5091 Lines: 111 Hi, Manfred Spraul wrote: > Attached is a patch that modifies the tx interrupt handling of the > nForce nic. It's part of the attempts to figure out what causes the nic > hangs (see bug 4552). > The change is experimental: It affects all nForce versions. I've tested > it on my nForce 250-Gb. > > Please test it. And especially: If you experience a nic hang, please send > me the debug output. That's the block starting with > > << > NETDEV WATCHDOG: eth1: transmit timed out > eth1: Got tx_timeout. irq: 00000000 > eth1: Ring at ... > << After applying the v0.38 patch, I can't get any network at all. DHCP fails to get an IP. v0.37 works fine. I enabled debugging, and I get this failure for every packet being transmitted (I masked out part of my MAC address with XX): Jul 16 20:06:28 dsd eth0: nv_start_xmit: packet packet 3 queued for transmission. 
Jul 16 20:06:28 dsd Jul 16 20:06:28 dsd 000: ff ff ff ff ff ff 00 50 8d XX XX XX 08 00 45 00 Jul 16 20:06:28 dsd 010: 02 40 75 a0 00 00 40 11 03 0e 00 00 00 00 ff ff Jul 16 20:06:28 dsd 020: ff ff 00 44 00 43 02 2c 13 0a 01 01 06 00 d2 76 Jul 16 20:06:28 dsd 030: bc 10 00 0a 00 00 00 00 00 00 00 00 00 00 00 00 Jul 16 20:06:28 dsd eth0: nv_nic_irq Jul 16 20:06:28 dsd eth0: irq: 00000008 Jul 16 20:06:28 dsd eth0: nv_tx_done: looking at packet 3, Flags 0x6000024d. Jul 16 20:06:28 dsd eth0: received irq with events 0x8. Probably TX fail. Jul 16 20:06:28 dsd eth0: irq: 00000000 Jul 16 20:06:28 dsd eth0: nv_nic_irq completed My hardware: 0000:00:04.0 Class 0200: 10de:0066 (rev a1) 0000:00:04.0 Ethernet controller: nVidia Corporation nForce2 Ethernet Controller (rev a1) Subsystem: ABIT Computer Corp.: Unknown device 1c00 Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 17 Memory at e0087000 (32-bit, non-prefetchable) [size=4K] I/O ports at b000 [size=8] Capabilities: [44] Power Management version 2 Here's the start of the logs: Jul 16 20:05:27 dsd forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.38. Jul 16 20:05:27 dsd ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCH] -> GSI 21 (level, high) -> IRQ 17 Jul 16 20:05:27 dsd PCI: Setting latency timer of device 0000:00:04.0 to 64 Jul 16 20:05:27 dsd 0000:00:04.0: resource 0 start e0087000 len 4096 flags 0x00000200. Jul 16 20:05:27 dsd 0000:00:04.0: MAC Address 00:50:8d:XX:XX:XX Jul 16 20:05:27 dsd 0000:00:04.0: link timer on. Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 2 at PHY 1: 0x0. Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 3 at PHY 1: 0x8201. Jul 16 20:05:27 dsd 0000:00:04.0: open: Found PHY 0000:0020 at address 1. Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 4 at PHY 1: 0x1e1. Jul 16 20:05:27 dsd eth%d: mii_rw wrote 0xde1 to reg 4 at PHY 1 Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 1 at PHY 1: 0x786d. Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3100. 
Jul 16 20:05:27 dsd eth%d: mii_rw wrote 0xb100 to reg 0 at PHY 1 Jul 16 20:05:28 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3000. Jul 16 20:05:28 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3000. Jul 16 20:05:28 dsd eth%d: mii_rw wrote 0x3200 to reg 0 at PHY 1 Jul 16 20:05:28 dsd eth0: forcedeth.c: subsystem: 0147b:1c00 bound to 0000:00:04.0 Jul 16 20:05:28 dsd rc-scripts: Configuration not set for eth0 - assuming dhcp Jul 16 20:05:28 dsd nv_open: begin Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 0 marked as Available Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 1 marked as Available Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 2 marked as Available Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 125 marked as Available Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 126 marked as Available Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 127 marked as Available Jul 16 20:05:28 dsd eth0: nv_txrx_reset Jul 16 20:05:28 dsd startup: got 0x00000010. Jul 16 20:05:28 dsd eth0: mii_rw read from reg 1 at PHY 1: 0x7849. Jul 16 20:05:28 dsd eth0: mii_rw read from reg 1 at PHY 1: 0x7849. Jul 16 20:05:28 dsd eth0: no link detected by phy - falling back to 10HD. Jul 16 20:05:28 dsd eth0: nv_start_rx Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8. Jul 16 20:05:28 dsd eth0: nv_start_tx Jul 16 20:05:28 dsd eth0: no link during initialization. Jul 16 20:05:28 dsd eth0: nv_stop_rx Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists. Jul 16 20:05:28 dsd eth0: nv_start_rx Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8. Jul 16 20:05:28 dsd eth0: nv_stop_rx Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists. Jul 16 20:05:28 dsd eth0: nv_start_rx Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8. Jul 16 20:05:28 dsd eth0: nv_stop_rx Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists. Jul 16 20:05:28 dsd eth0: nv_start_rx Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8. 
Let me know if full logs would be useful (they are big, and it just shows a lot of interrupts, some packets being queued up, and 5 or so TX failures like the ones above). Daniel From tmattox@gmail.com Sat Jul 16 14:10:56 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 14:10:58 -0700 (PDT) Received: from zproxy.gmail.com (zproxy.gmail.com [64.233.162.196]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GLAtH9010861 for ; Sat, 16 Jul 2005 14:10:55 -0700 Received: by zproxy.gmail.com with SMTP id v1so557796nzb for ; Sat, 16 Jul 2005 14:09:09 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=JAyTvwnwwirQAm+tGc/eRk38a8f9HSxHFe5bYsGscptSO1GpEosc9gzXz5TUNizmERhLrngFpDlVSBadycDMK2IQMB9Qb8Xw5Qt29ZHBPn8BucTqTGZx/SJ5VzKbdn6uaMP5qSve85Zf8Liu4o2uNYkYoxFO2G6v57sjgFBfu+M= Received: by 10.36.42.20 with SMTP id p20mr2278672nzp; Sat, 16 Jul 2005 14:09:09 -0700 (PDT) Received: by 10.36.84.11 with HTTP; Sat, 16 Jul 2005 14:09:09 -0700 (PDT) Message-ID: Date: Sat, 16 Jul 2005 17:09:09 -0400 From: Tim Mattox Reply-To: Tim Mattox To: Manfred Spraul Subject: Re: [PATCH] forcedeth: jumbo frame support Cc: Jeff Garzik , Netdev In-Reply-To: <42D94E95.1060303@colorfullife.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <42D94E95.1060303@colorfullife.com> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j6GLAtH9010861 X-archive-position: 2741 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tmattox@gmail.com Precedence: bulk X-list: netdev Content-Length: 1539 Lines: 35 This is just from a simple read of Manfred's patch, but shouldn't one side of the if-else use the NV_PKTLIMIT_2 value? 
If not, why is np->pkt_limit = NV_PKTLIMIT_1; inside the if-else at all?

On 7/16/05, Manfred Spraul wrote:
[snip]
> +/* maximum mtu size */
> +#define NV_PKTLIMIT_1 ETH_DATA_LEN /* hard limit not known */
> +#define NV_PKTLIMIT_2 9100 /* Actual limit according to NVidia: 9202 */
[snip]
> @@ -2007,13 +2093,16 @@
>
>  	/* handle different descriptor versions */
>  	if (pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_1 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 ||
> -		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13)
> +		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 ||
> +		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 ||
> +		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 ||
> +		pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13) {
>  		np->desc_ver = DESC_VER_1;
> -	else
> +		np->pkt_limit = NV_PKTLIMIT_1;
> +	} else {
>  		np->desc_ver = DESC_VER_2;
> +		np->pkt_limit = NV_PKTLIMIT_1;
> +	}

--
Tim Mattox - tmattox@gmail.com
  http://homepage.mac.com/tmattox/
    I'm a bright... http://www.the-brights.net/

From manfred@colorfullife.com Sat Jul 16 14:17:29 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 14:17:31 -0700 (PDT)
Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GLHRH9011550
	for ; Sat, 16 Jul 2005 14:17:28 -0700
Received: from [127.0.0.2] (dbl [127.0.0.1])
	by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6GLIc7I021541;
	Sat, 16 Jul 2005 23:18:38 +0200
Message-ID: <42D978F8.7000107@colorfullife.com>
Date: Sat, 16 Jul 2005 23:15:36 +0200
From: Manfred Spraul
User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Tim Mattox
CC: Jeff Garzik , Netdev
Subject: Re: [PATCH] forcedeth: jumbo frame support
References: <42D94E95.1060303@colorfullife.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 2742
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: manfred@colorfullife.com
Precedence: bulk
X-list: netdev
Content-Length: 434
Lines: 14

Tim Mattox wrote:

>This is just from a simple read of Manfred's patch, but shouldn't one side of
>the if-else use the NV_PKTLIMIT_2 value?
>
It should - and it's already fixed in the following patch for 0.37,
which is already in Jeff's queue. I didn't want to break the patch in
Jeff's queue, thus I submitted the patch as it was. It doesn't hurt
much: 0.36 adds the jumbo frame support, 0.37 actually enables it.
-- Manfred From dsd@gentoo.org Sat Jul 16 14:28:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 14:28:50 -0700 (PDT) Received: from mta09-winn.ispmail.ntl.com (mta09-winn.ispmail.ntl.com [81.103.221.49]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GLSiH9012674 for ; Sat, 16 Jul 2005 14:28:44 -0700 Received: from aamta10-winn.ispmail.ntl.com ([81.103.221.35]) by mta09-winn.ispmail.ntl.com with ESMTP id <20050716212657.JXEW11649.mta09-winn.ispmail.ntl.com@aamta10-winn.ispmail.ntl.com>; Sat, 16 Jul 2005 22:26:57 +0100 Received: from zog.reactivated.net ([81.99.81.161]) by aamta10-winn.ispmail.ntl.com with ESMTP id <20050716212657.VHL14310.aamta10-winn.ispmail.ntl.com@zog.reactivated.net>; Sat, 16 Jul 2005 22:26:57 +0100 Received: from [192.168.0.2] (dsd [192.168.0.2]) by zog.reactivated.net (Postfix) with ESMTP id 884CC7BAE55; Sat, 16 Jul 2005 22:51:33 +0100 (BST) Message-ID: <42D97B29.4050400@gentoo.org> Date: Sat, 16 Jul 2005 22:24:57 +0100 From: Daniel Drake User-Agent: Mozilla Thunderbird 1.0.5 (X11/20050715) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Manfred Spraul Cc: Linux Kernel Mailing List , Netdev , Ayaz Abdulla Subject: Re: [PATCH] forcedeth: TX handler changes (experimental) References: <42D913D6.5050902@colorfullife.com> <42D9658B.7020907@gentoo.org> In-Reply-To: <42D9658B.7020907@gentoo.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2743 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dsd@gentoo.org Precedence: bulk X-list: netdev Content-Length: 850 Lines: 25 Daniel Drake wrote: > After applying the v0.38 patch, I can't get any network at all. DHCP > fails to get an IP. v0.37 works fine. Tracked it down. (sorry for linewraps) +#define DEV_NEED_TIMERIRQ 0x0001 /* set the timer irq flag in the irq mask */ +#define DEV_NEED_LINKTIMER 0x0002 /* poll link settings. 
Relies on the timer irq */
+#define DEV_HAS_LARGEDESC 0x0003 /* device supports jumbo frames and needs packet format 2 */

My hardware is NEED_TIMERIRQ|NEED_LINKTIMER (0x0001|0x0002 = 0x0003), so
by this logic it'll also match DEV_HAS_LARGEDESC, which isn't true.

So, you want this instead:

#define DEV_HAS_LARGEDESC 0x0004

After making that change, all is working fine, but then again, I've
never run into the hangs you are debugging. I'll follow up in a couple
of days' time to confirm I'm not getting any problems with the new code.

Thanks,
Daniel

From manfred@colorfullife.com Sat Jul 16 14:38:13 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 16 Jul 2005 14:38:18 -0700 (PDT)
Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6GLcBH9013669
	for ; Sat, 16 Jul 2005 14:38:12 -0700
Received: from [127.0.0.2] (dbl [127.0.0.1])
	by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6GLdIDQ021604;
	Sat, 16 Jul 2005 23:39:19 +0200
Message-ID: <42D97DD0.6030207@colorfullife.com>
Date: Sat, 16 Jul 2005 23:36:16 +0200
From: Manfred Spraul
User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Daniel Drake
CC: Linux Kernel Mailing List , Netdev , Ayaz Abdulla
Subject: Re: [PATCH] forcedeth: TX handler changes (experimental)
References: <42D913D6.5050902@colorfullife.com> <42D9658B.7020907@gentoo.org> <42D97B29.4050400@gentoo.org>
In-Reply-To: <42D97B29.4050400@gentoo.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 2744
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: manfred@colorfullife.com
Precedence: bulk
X-list: netdev
Content-Length: 224
Lines: 12

Daniel Drake wrote:

> So, you want this instead:
>
> #define DEV_HAS_LARGEDESC 0x0004
>
Autsch. Yes, you are right.
Sorry for that, I should have reread the patch once more. I've fixed it
on my website.

--
    Manfred

From davem@davemloft.net Mon Jul 18 13:45:48 2005
Received: with ECARTIS (v1.0.0; list netdev); Mon, 18 Jul 2005 13:46:01 -0700 (PDT)
Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6IKjlH9014442
	for ; Mon, 18 Jul 2005 13:45:48 -0700
Received: from localhost ([127.0.0.1] ident=davem)
	by sunset.davemloft.net with esmtp (Exim 4.50) id 1DucTT-0000V1-UO;
	Mon, 18 Jul 2005 13:44:27 -0700
Date: Mon, 18 Jul 2005 13:44:27 -0700 (PDT)
Message-Id: <20050718.134427.112314130.davem@davemloft.net>
To: sri@us.ibm.com
Cc: lksctp-developers@lists.sourceforge.net, netdev@oss.sgi.com, cdeboys@stevens.edu
Subject: Re: [PATCH] [SCTP] Fix potential null pointer dereference while handling an icmp error
From: "David S. Miller"
In-Reply-To: <1121302485.5841.30.camel@w-sridhar2.beaverton.ibm.com>
References: <1121302485.5841.30.camel@w-sridhar2.beaverton.ibm.com>
X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 2746
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev
Content-Length: 368
Lines: 10

From: Sridhar Samudrala
Date: Wed, 13 Jul 2005 17:54:45 -0700

> Please apply the following patch which fixes a potential null pointer
> dereference while handling an icmp error in sctp_icmp_* routines. These
> routines assume that we are passing a valid asoc.
>
> Reported by Charles-Henri de Boysson

Applied, thanks Sridhar.
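
A note on the DEV_HAS_LARGEDESC fix in the forcedeth thread above (0x0003 changed to 0x0004): feature flags that are OR-ed into one word each need a bit of their own. The following is only a minimal sketch in plain C, assuming the driver tests these flags with a bitwise AND against the per-device flags word. The helper `has_flag` and the `_BAD` / `_FIXED` suffixes are hypothetical names used for illustration and are not taken from forcedeth.c.

```c
/* Flag values as quoted in the thread. */
#define DEV_NEED_TIMERIRQ        0x0001 /* set the timer irq flag in the irq mask */
#define DEV_NEED_LINKTIMER       0x0002 /* poll link settings */

/* Illustration only: the value from the v0.38 patch and the corrected one. */
#define DEV_HAS_LARGEDESC_BAD    0x0003 /* shares bits with both flags above */
#define DEV_HAS_LARGEDESC_FIXED  0x0004 /* occupies its own bit */

/*
 * Hypothetical helper: non-zero if any bit of `flag` is set in `dev_flags`.
 * This assumes the driver checks flags as (dev_flags & FLAG).
 */
int has_flag(unsigned int dev_flags, unsigned int flag)
{
	return (dev_flags & flag) != 0;
}
```

With `dev_flags = DEV_NEED_TIMERIRQ | DEV_NEED_LINKTIMER`, as on the nForce2 board in the report, `has_flag(dev_flags, DEV_HAS_LARGEDESC_BAD)` is non-zero even though the device has no large descriptors, while `has_flag(dev_flags, DEV_HAS_LARGEDESC_FIXED)` is zero.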
From P@draigBrady.com Tue Jul 19 04:00:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 04:00:05 -0700 (PDT) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JAxuH9006704 for ; Tue, 19 Jul 2005 04:00:00 -0700 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.13.3/8.13.3) with ESMTP id j6JAvpeh092752; Tue, 19 Jul 2005 11:57:51 +0100 (IST) (envelope-from P@draigBrady.com) Message-ID: <42DCDCAF.1060005@draigBrady.com> Date: Tue, 19 Jul 2005 11:57:51 +0100 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" , e1000-devel@lists.sourceforge.net Subject: drop counts Content-Type: multipart/mixed; boundary="------------050106030408090609090809" X-archive-position: 2747 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev Content-Length: 2520 Lines: 82 This is a multi-part message in MIME format. --------------050106030408090609090809 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit I'm confused about the drop count reporting in e1000 nics (and elsewhere). On e1000 nics the on nic rx buffer drop counts are maintained in "mpc" and the in kernel buffer drops are maintained in "rnbc". Note also the e1000 nic does not count mpc in its rx_packets count (gprc). 
In kernel 2.6.11 and before these were mapped to the kernel statistics
like:

    packets = gprc
    dropped = rnbc
    fifo    = mpc
    missed  = mpc

In 2.6.12 this changed to:

    packets = gprc
    dropped = mpc
    fifo    = mpc
    missed  = mpc

Both are wrong I think and I think it should do (see attached patch):

    packets = gprc + mpc
    dropped = mpc + rnbc
    fifo    = rnbc
    missed  = mpc

I tried to correlate this with the tg3 driver, but that confused me
also as it seems to do:

    packets = rx_packets
    dropped = equiv of rnbc (maintained by driver)
    fifo    = ?
    missed  = ?
    errors  = rx_errors + rx_discards

cheers,
Pádraig.

--------------050106030408090609090809
Content-Type: application/x-texinfo; name="e1000-drops.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="e1000-drops.diff"

--- e1000_main.c 2005-05-17 21:31:44.000000000 +0000
+++ e1000_main_drops.c 2005-07-19 10:28:36.000000000 +0000
@@ -2626,7 +2626,7 @@
 
 	/* Fill out the OS statistics structure */
 
-	adapter->net_stats.rx_packets = adapter->stats.gprc;
+	adapter->net_stats.rx_packets = adapter->stats.gprc + adapter->stats.mpc;
 	adapter->net_stats.tx_packets = adapter->stats.gptc;
 	adapter->net_stats.rx_bytes = adapter->stats.gorcl;
 	adapter->net_stats.tx_bytes = adapter->stats.gotcl;
@@ -2638,12 +2638,12 @@
 	adapter->net_stats.rx_errors = adapter->stats.rxerrc +
 		adapter->stats.crcerrs + adapter->stats.algnerrc +
 		adapter->stats.rlec + adapter->stats.mpc +
-		adapter->stats.cexterr;
-	adapter->net_stats.rx_dropped = adapter->stats.mpc;
+		adapter->stats.rnbc + adapter->stats.cexterr;
+	adapter->net_stats.rx_dropped = adapter->stats.mpc + adapter->stats.rnbc;
 	adapter->net_stats.rx_length_errors = adapter->stats.rlec;
 	adapter->net_stats.rx_crc_errors = adapter->stats.crcerrs;
 	adapter->net_stats.rx_frame_errors = adapter->stats.algnerrc;
-	adapter->net_stats.rx_fifo_errors = adapter->stats.mpc;
+	adapter->net_stats.rx_fifo_errors = adapter->stats.rnbc;
 	adapter->net_stats.rx_missed_errors = adapter->stats.mpc;
 
 	/* Tx Errors */
--------------050106030408090609090809-- From chas@cmf.nrl.navy.mil Tue Jul 19 13:47:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:47:45 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKldH9028190 for ; Tue, 19 Jul 2005 13:47:39 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKjgHO027216; Tue, 19 Jul 2005 16:45:42 -0400 (EDT) Message-Id: <200507192045.j6JKjgHO027216@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 5/8][ATM]: [firestream] fix the sparse warning "implicit cast to nocast type" Date: Tue, 19 Jul 2005 16:45:43 -0400 From: "chas williams - CONTRACTOR" X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 2753 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 1461 Lines: 42 please apply to 2.6 -- thanks! 
[ATM]: [firestream] fix the sparse warning "implicit cast to nocast type" Signed-off-by: Victor Fusco Signed-off-by: Domen Puncer Signed-off-by: Chas Williams --- commit 6e59c9c1673a7b31f00cc8dd79f1e11abf91be9a tree 962d15261cc88a8b094461689bcc122896508333 parent 9c893a8cd5716416ee719a57e04e01aeb2c68bd3 author chas williams Tue, 19 Jul 2005 15:07:01 -0400 committer chas williams Tue, 19 Jul 2005 15:07:01 -0400 drivers/atm/firestream.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/atm/firestream.c b/drivers/atm/firestream.c --- a/drivers/atm/firestream.c +++ b/drivers/atm/firestream.c @@ -1374,7 +1374,8 @@ static void reset_chip (struct fs_dev *d } } -static void __devinit *aligned_kmalloc (int size, int flags, int alignment) +static void __devinit *aligned_kmalloc (int size, unsigned int __nocast flags, + int alignment) { void *t; @@ -1464,7 +1465,8 @@ static inline int nr_buffers_in_freepool does. I've seen "receive abort: no buffers" and things started working again after that... 
-- REW */ -static void top_off_fp (struct fs_dev *dev, struct freepool *fp, int gfp_flags) +static void top_off_fp (struct fs_dev *dev, struct freepool *fp, + unsigned int __nocast gfp_flags) { struct FS_BPENTRY *qe, *ne; struct sk_buff *skb; From chas@cmf.nrl.navy.mil Tue Jul 19 13:49:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:49:46 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKndH9029324 for ; Tue, 19 Jul 2005 13:49:39 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKle4w027287; Tue, 19 Jul 2005 16:47:41 -0400 (EDT) Message-Id: <200507192047.j6JKle4w027287@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 8/8][ATM]: [speedtch] cure atm_printk() macro gcc-2.95 compile error Date: Tue, 19 Jul 2005 16:47:42 -0400 From: "chas williams - CONTRACTOR" X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 2756 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 2368 Lines: 65 please apply to 2.6 -- thanks! 
[ATM]: [speedtch] cure atm_printk() macro gcc-2.95 compile error Signed-off-by: Duncan Sands Signed-off-by: Chas Williams --- commit 73500df545c8763d662192d9749fd8d64209c819 tree 0f73d33200e626e8ef0af506dbed40a3850c609b parent 4932248439d20412610ffaade625cbde0e001e37 author chas williams Wed, 06 Jul 2005 13:22:52 -0400 committer chas williams Wed, 06 Jul 2005 13:22:52 -0400 drivers/usb/atm/speedtch.c | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/usb/atm/speedtch.c b/drivers/usb/atm/speedtch.c --- a/drivers/usb/atm/speedtch.c +++ b/drivers/usb/atm/speedtch.c @@ -448,19 +448,19 @@ static void speedtch_check_status(struct case 0: atm_dev->signal = ATM_PHY_SIG_LOST; if (instance->last_status) - atm_info(usbatm, "ADSL line is down\n"); + atm_info(usbatm, "%s\n", "ADSL line is down"); /* It may never resync again unless we ask it to... */ ret = speedtch_start_synchro(instance); break; case 0x08: atm_dev->signal = ATM_PHY_SIG_UNKNOWN; - atm_info(usbatm, "ADSL line is blocked?\n"); + atm_info(usbatm, "%s\n", "ADSL line is blocked?"); break; case 0x10: atm_dev->signal = ATM_PHY_SIG_LOST; - atm_info(usbatm, "ADSL line is synchronising\n"); + atm_info(usbatm, "%s\n", "ADSL line is synchronising"); break; case 0x20: @@ -502,7 +502,7 @@ static void speedtch_status_poll(unsigne if (instance->poll_delay < MAX_POLL_DELAY) mod_timer(&instance->status_checker.timer, jiffies + msecs_to_jiffies(instance->poll_delay)); else - atm_warn(instance->usbatm, "Too many failures - disabling line status polling\n"); + atm_warn(instance->usbatm, "%s\n", "Too many failures - disabling line status polling"); } static void speedtch_resubmit_int(unsigned long data) @@ -545,9 +545,9 @@ static void speedtch_handle_int(struct u if ((count == 6) && !memcmp(up_int, instance->int_data, 6)) { del_timer(&instance->status_checker.timer); - atm_info(usbatm, "DSL line goes up\n"); + atm_info(usbatm, "%s\n", "DSL line goes up"); } else if ((count == 6) && 
!memcmp(down_int, instance->int_data, 6)) { - atm_info(usbatm, "DSL line goes down\n"); + atm_info(usbatm, "%s\n", "DSL line goes down"); } else { int i; From chas@cmf.nrl.navy.mil Tue Jul 19 13:47:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:47:08 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKl4H9027935 for ; Tue, 19 Jul 2005 13:47:04 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKj7kf027191; Tue, 19 Jul 2005 16:45:07 -0400 (EDT) Message-Id: <200507192045.j6JKj7kf027191@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 2/8][ATM]: allow bind() on point-to-multpoint svcs (from Martin Whitaker ) Date: Tue, 19 Jul 2005 16:45:08 -0400 From: "chas williams - CONTRACTOR" X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 2750 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 944 Lines: 31 please apply to 2.6 -- thanks! 
[ATM]: allow bind() on point-to-multpoint svcs (from Martin Whitaker ) Signed-off-by: Chas Williams --- commit c4029a0d6294a1052353346235320a678add4f41 tree e6de8b582d908c7943242e8725d435f0b1fda20a parent 73500df545c8763d662192d9749fd8d64209c819 author chas williams Tue, 19 Jul 2005 14:50:33 -0400 committer chas williams Tue, 19 Jul 2005 14:50:33 -0400 net/atm/svc.c | 4 ---- 1 files changed, 0 insertions(+), 4 deletions(-) diff --git a/net/atm/svc.c b/net/atm/svc.c --- a/net/atm/svc.c +++ b/net/atm/svc.c @@ -118,10 +118,6 @@ static int svc_bind(struct socket *sock, goto out; } vcc = ATM_SD(sock); - if (test_bit(ATM_VF_SESSION, &vcc->flags)) { - error = -EINVAL; - goto out; - } addr = (struct sockaddr_atmsvc *) sockaddr; if (addr->sas_family != AF_ATMSVC) { error = -EAFNOSUPPORT; From chas@cmf.nrl.navy.mil Tue Jul 19 13:46:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:47:01 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKksH9027911 for ; Tue, 19 Jul 2005 13:46:55 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKirFb027183; Tue, 19 Jul 2005 16:44:53 -0400 (EDT) Message-Id: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 1/8][ATM]: [zatm] eliminate kfree warning (from Tobias Hirning ) Date: Tue, 19 Jul 2005 16:44:54 -0400 From: "chas williams - CONTRACTOR" X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 2749 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 918 Lines: 29 please apply to 2.6 -- thanks! 
[ATM]: [zatm] eliminate kfree warning (from Tobias Hirning ) Signed-off-by: Chas Williams --- commit 4932248439d20412610ffaade625cbde0e001e37 tree 35a60e3551f5f1abced8a435238575c488041af6 parent 238921d2cb04eb6dcc2ff5914d555d7dcaa4dfc5 author chas williams Wed, 06 Jul 2005 13:10:18 -0400 committer chas williams Wed, 06 Jul 2005 13:10:18 -0400 drivers/atm/zatm.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/atm/zatm.c b/drivers/atm/zatm.c --- a/drivers/atm/zatm.c +++ b/drivers/atm/zatm.c @@ -1339,7 +1339,7 @@ static int __init zatm_start(struct atm_ return 0; out: for (i = 0; i < NR_MBX; i++) - kfree(zatm_dev->mbx_start[i]); + kfree(&zatm_dev->mbx_start[i]); kfree(zatm_dev->rx_map); kfree(zatm_dev->tx_map); free_irq(zatm_dev->irq, dev); From chas@cmf.nrl.navy.mil Tue Jul 19 13:47:14 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:47:20 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKlEH9027969 for ; Tue, 19 Jul 2005 13:47:14 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKjHTD027197; Tue, 19 Jul 2005 16:45:17 -0400 (EDT) Message-Id: <200507192045.j6JKjHTD027197@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 3/8][ATM]: [idt77252] use time_after() macro Date: Tue, 19 Jul 2005 16:45:18 -0400 From: "chas williams - CONTRACTOR" X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 2751 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 1286 Lines: 39 please apply to 2.6 -- thanks! 
[ATM]: [idt77252] use time_after() macro Signed-off-by: Marcelo Feitoza Parisi Signed-off-by: Domen Puncer Signed-off-by: Chas Williams --- commit ac4755cc8eefb198945e76d4069184454c0819ce tree e0dc319dca5c9e7f98c5d093c739db07a2707ac7 parent c4029a0d6294a1052353346235320a678add4f41 author chas williams Tue, 19 Jul 2005 14:55:40 -0400 committer chas williams Tue, 19 Jul 2005 14:55:40 -0400 drivers/atm/idt77252.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c --- a/drivers/atm/idt77252.c +++ b/drivers/atm/idt77252.c @@ -46,6 +46,7 @@ static char const rcsid[] = #include #include #include +#include #include #include #include @@ -780,7 +781,7 @@ push_on_scq(struct idt77252_dev *card, s return 0; out: - if (jiffies - scq->trans_start > HZ) { + if (time_after(jiffies, scq->trans_start + HZ)) { printk("%s: Error pushing TBD for %d.%d\n", card->name, vc->tx_vcc->vpi, vc->tx_vcc->vci); #ifdef CONFIG_ATM_IDT77252_DEBUG From chas@cmf.nrl.navy.mil Tue Jul 19 13:47:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:47:58 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKlsH9028293 for ; Tue, 19 Jul 2005 13:47:55 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKjwZg027226; Tue, 19 Jul 2005 16:45:58 -0400 (EDT) Message-Id: <200507192045.j6JKjwZg027226@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 6/8][ATM]: [ambassador] Fix the sparse warning "implicit cast to nocast type" Date: Tue, 19 Jul 2005 16:45:59 -0400 From: "chas williams - CONTRACTOR" X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2754 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 1133 Lines: 33 please apply to 2.6 -- thanks! [ATM]: [ambassador] Fix the sparse warning "implicit cast to nocast type" Signed-off-by: Victor Fusco Signed-off-by: Domen Puncer Signed-off-by: Chas Williams --- commit 8e1e58dcc5ec7ea3c29c3ae38156afc2af150cea tree e12e7c184de1170f179a17b4a3adf45986ef5a9e parent 6e59c9c1673a7b31f00cc8dd79f1e11abf91be9a author chas williams Tue, 19 Jul 2005 15:08:43 -0400 committer chas williams Tue, 19 Jul 2005 15:08:43 -0400 drivers/atm/ambassador.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/atm/ambassador.c b/drivers/atm/ambassador.c --- a/drivers/atm/ambassador.c +++ b/drivers/atm/ambassador.c @@ -794,7 +794,9 @@ static void drain_rx_pools (amb_dev * de drain_rx_pool (dev, pool); } -static inline void fill_rx_pool (amb_dev * dev, unsigned char pool, int priority) { +static inline void fill_rx_pool (amb_dev * dev, unsigned char pool, + unsigned int __nocast priority) +{ rx_in rx; amb_rxq * rxq; From chas@cmf.nrl.navy.mil Tue Jul 19 13:47:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:47:33 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKlSH9028106 for ; Tue, 19 Jul 2005 13:47:28 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKjUpV027205; Tue, 19 Jul 2005 16:45:30 -0400 (EDT) Message-Id: <200507192045.j6JKjUpV027205@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 4/8][ATM]: [he] remove linux/version.h include Date: Tue, 19 Jul 2005 16:45:31 -0400 From: "chas williams - CONTRACTOR" 
X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 2752 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 825 Lines: 29 please apply to 2.6 -- thanks! [ATM]: [he] remove linux/version.h include Signed-off-by: Olaf Hering Signed-off-by: Chas Williams --- commit 9c893a8cd5716416ee719a57e04e01aeb2c68bd3 tree ae99b267a7f35cec41df7ccdfb9aa3695a3e2b67 parent eefc05048924978c1e28835bf5085c5de2e653d2 author chas williams Tue, 19 Jul 2005 15:06:23 -0400 committer chas williams Tue, 19 Jul 2005 15:06:23 -0400 drivers/atm/he.c | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/drivers/atm/he.c b/drivers/atm/he.c --- a/drivers/atm/he.c +++ b/drivers/atm/he.c @@ -57,7 +57,6 @@ #include #include -#include #include #include #include From chas@cmf.nrl.navy.mil Tue Jul 19 13:48:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 13:48:16 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JKmAH9028407 for ; Tue, 19 Jul 2005 13:48:10 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6JKkEc2027235; Tue, 19 Jul 2005 16:46:14 -0400 (EDT) Message-Id: <200507192046.j6JKkEc2027235@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@davemloft.net Subject: [PATCH 7/8][ATM]: Trivial spelling fix patch for net/Kconfig Date: Tue, 19 Jul 2005 16:46:15 -0400 From: "chas williams - CONTRACTOR" X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2755 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 1529 Lines: 39 please apply to 2.6 -- thanks! [ATM]: Trivial spelling fix patch for net/Kconfig Signed-off-by: Jesper Juhl Signed-off-by: Adrian Bunk Signed-off-by: Chas Williams --- commit eefc05048924978c1e28835bf5085c5de2e653d2 tree b453e78b3eca05558e5c3ff8d9cf9ab9f2ee9bd9 parent ac4755cc8eefb198945e76d4069184454c0819ce author chas williams Tue, 19 Jul 2005 14:58:08 -0400 committer chas williams Tue, 19 Jul 2005 14:58:08 -0400 net/atm/Kconfig | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/atm/Kconfig b/net/atm/Kconfig --- a/net/atm/Kconfig +++ b/net/atm/Kconfig @@ -60,7 +60,7 @@ config ATM_BR2684 tristate "RFC1483/2684 Bridged protocols" depends on ATM && INET help - ATM PVCs can carry ethernet PDUs according to rfc2684 (formerly 1483) + ATM PVCs can carry ethernet PDUs according to RFC2684 (formerly 1483) This device will act like an ethernet from the kernels point of view, with the traffic being carried by ATM PVCs (currently 1 PVC/device). This is sometimes used over DSL lines. If in doubt, say N. @@ -69,6 +69,6 @@ config ATM_BR2684_IPFILTER bool "Per-VC IP filter kludge" depends on ATM_BR2684 help - This is an experimental mechanism for users who need to terminating a + This is an experimental mechanism for users who need to terminate a large number of IP-only vcc's. Do not enable this unless you are sure you know what you are doing. 
From davem@davemloft.net Tue Jul 19 14:25:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 14:25:14 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JLP9H9001631 for ; Tue, 19 Jul 2005 14:25:09 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DuzYf-0001Bn-1m; Tue, 19 Jul 2005 14:23:21 -0700 Date: Tue, 19 Jul 2005 14:23:20 -0700 (PDT) Message-Id: <20050719.142320.52167011.davem@davemloft.net> To: ravinandan.arakali@neterion.com Cc: jgarzik@pobox.com, netdev@oss.sgi.com, raghavendra.koushik@neterion.com, leonid.grossman@neterion.com, ananda.raju@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12-rc4] IPv4/IPv6: USO v2, Scatter-gather approach From: "David S. Miller" In-Reply-To: <20050603004106.BAB6A7B990@linux.site> References: <20050603004106.BAB6A7B990@linux.site> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2757 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1297 Lines: 39 From: ravinandan.arakali@neterion.com Date: Thu, 2 Jun 2005 17:41:06 -0700 (PDT) > Attached below is version 2 of kernel patch for UDP Large send offload > feature. This patch uses the "Scatter-Gather" approach. > It also incorporates David Miller's comments on the first version. I've reviewed this patches, and I think there is a problem with the sock_append_data() scheme. You disallow the case where there is an SKB on the write queue already. This breaks NFS, and other things using MSG_MORE and UDP_CORK. They do a two step packet building: 1) Send protocol headers, f.e. 
NFS 2) Send file contents via sendfile() 3) Uncork socket so packet gets emitted and due to this check: + if (skb_queue_len(&sk->sk_write_queue)) { + *err = -EOPNOTSUPP; + return NULL; + } you end up not supporting this correctly. So we have two options, either add support for corked socket handling to sock_append_data() or we go with the frag_list patch which doesn't have this problem. I prefer the frag_list patch from a cleanliness perspective, however I remember you saying that the sock_append_data() approach obtained better performance. And that seems clear since there will be less TX descriptors needed to send such frames unless the driver does coalescing as it walks the frag_list chain. From romieu@fr.zoreil.com Tue Jul 19 15:34:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 15:34:29 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JMYOH9006449 for ; Tue, 19 Jul 2005 15:34:24 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.4/8.12.1) with ESMTP id j6JMVZiU027722; Wed, 20 Jul 2005 00:31:35 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.4/8.12.1) id j6JMVYQw027721; Wed, 20 Jul 2005 00:31:34 +0200 Date: Wed, 20 Jul 2005 00:31:34 +0200 From: Francois Romieu To: chas williams - CONTRACTOR Cc: netdev@oss.sgi.com, davem@davemloft.net Subject: Re: [PATCH 1/8][ATM]: [zatm] eliminate kfree warning (from Tobias Hirning ) Message-ID: <20050719223134.GA24535@electric-eye.fr.zoreil.com> References: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> User-Agent: Mutt/1.4.2.1i X-Organisation: Land of Sunshine Inc. 
X-archive-position: 2758 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Content-Length: 837 Lines: 31 chas williams - CONTRACTOR : [...] > diff --git a/drivers/atm/zatm.c b/drivers/atm/zatm.c > --- a/drivers/atm/zatm.c > +++ b/drivers/atm/zatm.c > @@ -1339,7 +1339,7 @@ static int __init zatm_start(struct atm_ > return 0; > out: > for (i = 0; i < NR_MBX; i++) > - kfree(zatm_dev->mbx_start[i]); > + kfree(&zatm_dev->mbx_start[i]); > kfree(zatm_dev->rx_map); > kfree(zatm_dev->tx_map); > free_irq(zatm_dev->irq, dev); Wow... static int __init zatm_start(struct atm_dev *dev) { [...] for (i = 0; i < NR_MBX; i++) [blah blah] here = (unsigned long) kmalloc(2*MBX_SIZE(i), GFP_KERNEL); [blah blah] zatm_dev->mbx_start[i] = here; There is some stuff in my patchbucket for this one, please wait a minute. -- Ueimor From romieu@fr.zoreil.com Tue Jul 19 15:42:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 19 Jul 2005 15:42:29 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6JMgPH9007274 for ; Tue, 19 Jul 2005 15:42:26 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.4/8.12.1) with ESMTP id j6JMatuj027838; Wed, 20 Jul 2005 00:36:55 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.4/8.12.1) id j6JMatS0027837; Wed, 20 Jul 2005 00:36:55 +0200 Date: Wed, 20 Jul 2005 00:36:55 +0200 From: Francois Romieu To: chas williams Cc: netdev@oss.sgi.com, davem@davemloft.net Subject: [patch linux-2.6.13-rc2-gitXX 1/1] zatm: mailbox converted to pci_alloc_consistent() Message-ID: <20050719223655.GA27724@electric-eye.fr.zoreil.com> References: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> <20050719223134.GA24535@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: 
text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050719223134.GA24535@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.4.2.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 2759 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Content-Length: 5460 Lines: 173 mailbox converted to pci_alloc_consistent() - request_region() is not needed: zatm_init_one() issues pci_request_regions(); - the warning related to kfree(zatm_dev->mbx_start) disappears; Compiled with i386 and sparc64 as target. Signed-off-by: Francois Romieu diff -puN drivers/atm/zatm.c~drivers-atm-zatm-mbx drivers/atm/zatm.c --- linux-2.6.13-rc2-gitXX/drivers/atm/zatm.c~drivers-atm-zatm-mbx 2005-07-19 23:40:48.696490366 +0200 +++ linux-2.6.13-rc2-gitXX-fr/drivers/atm/zatm.c 2005-07-20 00:27:48.263131297 +0200 @@ -16,9 +16,9 @@ #include #include #include -#include /* for request_region */ #include #include +#include #include #include #include @@ -1260,22 +1260,22 @@ static int __init zatm_init(struct atm_d static int __init zatm_start(struct atm_dev *dev) { - struct zatm_dev *zatm_dev; + struct zatm_dev *zatm_dev = ZATM_DEV(dev); + struct pci_dev *pdev = zatm_dev->pci_dev; unsigned long curr; int pools,vccs,rx; - int error,i,ld; + int error, i, ld; DPRINTK("zatm_start\n"); - zatm_dev = ZATM_DEV(dev); zatm_dev->rx_map = zatm_dev->tx_map = NULL; - for (i = 0; i < NR_MBX; i++) - zatm_dev->mbx_start[i] = 0; - if (request_irq(zatm_dev->irq,&zatm_int,SA_SHIRQ,DEV_LABEL,dev)) { - printk(KERN_ERR DEV_LABEL "(itf %d): IRQ%d is already in use\n", - dev->number,zatm_dev->irq); - return -EAGAIN; + for (i = 0; i < NR_MBX; i++) + zatm_dev->mbx_start[i] = 0; + error = request_irq(zatm_dev->irq, zatm_int, SA_SHIRQ, DEV_LABEL, dev); + if (error < 0) { + printk(KERN_ERR DEV_LABEL "(itf %d): IRQ%d is already in use\n", + dev->number,zatm_dev->irq); + goto done; } - 
request_region(zatm_dev->base,uPD98401_PORTS,DEV_LABEL); /* define memory regions */ pools = NR_POOLS; if (NR_SHAPERS*SHAPER_SIZE > pools*POOL_SIZE) @@ -1302,51 +1302,66 @@ static int __init zatm_start(struct atm_ "%ld VCs\n",dev->number,NR_SHAPERS,pools,rx, (zatm_dev->mem-curr*4)/VC_SIZE); /* create mailboxes */ - for (i = 0; i < NR_MBX; i++) - if (mbx_entries[i]) { - unsigned long here; - - here = (unsigned long) kmalloc(2*MBX_SIZE(i), - GFP_KERNEL); - if (!here) { - error = -ENOMEM; - goto out; - } - if ((here^(here+MBX_SIZE(i))) & ~0xffffUL)/* paranoia */ - here = (here & ~0xffffUL)+0x10000; - zatm_dev->mbx_start[i] = here; - if ((here^virt_to_bus((void *) here)) & 0xffff) { - printk(KERN_ERR DEV_LABEL "(itf %d): system " - "bus incompatible with driver\n", - dev->number); - error = -ENODEV; - goto out; - } - DPRINTK("mbx@0x%08lx-0x%08lx\n",here,here+MBX_SIZE(i)); - zatm_dev->mbx_end[i] = (here+MBX_SIZE(i)) & 0xffff; - zout(virt_to_bus((void *) here) >> 16,MSH(i)); - zout(virt_to_bus((void *) here),MSL(i)); - zout((here+MBX_SIZE(i)) & 0xffff,MBA(i)); - zout(here & 0xffff,MTA(i)); - zout(here & 0xffff,MWA(i)); - } + for (i = 0; i < NR_MBX; i++) { + void *mbx; + dma_addr_t mbx_dma; + + if (!mbx_entries[i]) + continue; + mbx = pci_alloc_consistent(pdev, 2*MBX_SIZE(i), &mbx_dma); + if (!mbx) { + error = -ENOMEM; + goto out; + } + /* + * Alignment provided by pci_alloc_consistent() isn't enough + * for this device. 
+ */ + if (((unsigned long)mbx ^ mbx_dma) & 0xffff) { + printk(KERN_ERR DEV_LABEL "(itf %d): system " + "bus incompatible with driver\n", dev->number); + pci_free_consistent(pdev, 2*MBX_SIZE(i), mbx, mbx_dma); + error = -ENODEV; + goto out; + } + DPRINTK("mbx@0x%08lx-0x%08lx\n", mbx, mbx + MBX_SIZE(i)); + zatm_dev->mbx_start[i] = (unsigned long)mbx; + zatm_dev->mbx_dma[i] = mbx_dma; + zatm_dev->mbx_end[i] = (zatm_dev->mbx_start[i] + MBX_SIZE(i)) & + 0xffff; + zout(mbx_dma >> 16, MSH(i)); + zout(mbx_dma, MSL(i)); + zout(zatm_dev->mbx_end[i], MBA(i)); + zout((unsigned long)mbx & 0xffff, MTA(i)); + zout((unsigned long)mbx & 0xffff, MWA(i)); + } error = start_tx(dev); - if (error) goto out; + if (error) + goto out; error = start_rx(dev); - if (error) goto out; + if (error) + goto out_tx; error = dev->phy->start(dev); - if (error) goto out; + if (error) + goto out_rx; zout(0xffffffff,IMR); /* enable interrupts */ /* enable TX & RX */ zout(zin(GMR) | uPD98401_GMR_SE | uPD98401_GMR_RE,GMR); - return 0; - out: - for (i = 0; i < NR_MBX; i++) - kfree(zatm_dev->mbx_start[i]); +done: + return error; + +out_rx: kfree(zatm_dev->rx_map); +out_tx: kfree(zatm_dev->tx_map); +out: + while (i-- > 0) { + pci_free_consistent(pdev, 2*MBX_SIZE(i), + (void *)zatm_dev->mbx_start[i], + zatm_dev->mbx_dma[i]); + } free_irq(zatm_dev->irq, dev); - return error; + goto done; } diff -puN drivers/atm/zatm.h~drivers-atm-zatm-mbx drivers/atm/zatm.h --- linux-2.6.13-rc2-gitXX/drivers/atm/zatm.h~drivers-atm-zatm-mbx 2005-07-19 23:40:48.711487946 +0200 +++ linux-2.6.13-rc2-gitXX-fr/drivers/atm/zatm.h 2005-07-19 23:40:48.764479397 +0200 @@ -73,6 +73,7 @@ struct zatm_dev { int chans; /* map size, must be 2^n */ /*-------------------------------- mailboxes */ unsigned long mbx_start[NR_MBX];/* start addresses */ + dma_addr_t mbx_dma[NR_MBX]; u16 mbx_end[NR_MBX]; /* end offset (in bytes) */ /*-------------------------------- other pointers */ u32 pool_base; /* Free buffer pool dsc (word addr) */ _ From 
jakub@sunsite.ms.mff.cuni.cz Wed Jul 20 00:21:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 00:21:37 -0700 (PDT) Received: from sunsite.mff.cuni.cz (sunsite.ms.mff.cuni.cz [195.113.15.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6K7LPH9021647 for ; Wed, 20 Jul 2005 00:21:26 -0700 Received: from sunsite.mff.cuni.cz (sunsite.mff.cuni.cz [127.0.0.1]) by sunsite.mff.cuni.cz (8.13.1/8.13.1) with ESMTP id j6K7JLtE026511; Wed, 20 Jul 2005 09:19:21 +0200 Received: (from jakub@localhost) by sunsite.mff.cuni.cz (8.13.1/8.13.1/Submit) id j6K7JKaV026498; Wed, 20 Jul 2005 09:19:20 +0200 Date: Wed, 20 Jul 2005 09:19:20 +0200 From: Jakub Jelinek To: chas williams - CONTRACTOR Cc: netdev@oss.sgi.com, davem@davemloft.net Subject: Re: [PATCH 1/8][ATM]: [zatm] eliminate kfree warning (from Tobias Hirning ) Message-ID: <20050720071919.GV4740@sunsite.mff.cuni.cz> Reply-To: Jakub Jelinek References: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> User-Agent: Mutt/1.4.1i X-archive-position: 2760 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jakub@redhat.com Precedence: bulk X-list: netdev Content-Length: 1720 Lines: 44 On Tue, Jul 19, 2005 at 04:44:54PM -0400, chas williams - CONTRACTOR wrote: > please apply to 2.6 -- thanks! 
> > [ATM]: [zatm] eliminate kfree warning (from Tobias Hirning ) > > Signed-off-by: Chas Williams > > > --- > commit 4932248439d20412610ffaade625cbde0e001e37 > tree 35a60e3551f5f1abced8a435238575c488041af6 > parent 238921d2cb04eb6dcc2ff5914d555d7dcaa4dfc5 > author chas williams Wed, 06 Jul 2005 13:10:18 -0400 > committer chas williams Wed, 06 Jul 2005 13:10:18 -0400 > > drivers/atm/zatm.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/drivers/atm/zatm.c b/drivers/atm/zatm.c > --- a/drivers/atm/zatm.c > +++ b/drivers/atm/zatm.c > @@ -1339,7 +1339,7 @@ static int __init zatm_start(struct atm_ > return 0; > out: > for (i = 0; i < NR_MBX; i++) > - kfree(zatm_dev->mbx_start[i]); > + kfree(&zatm_dev->mbx_start[i]); This can't be right. zatm_dev->mbx_start[i] is allocated with: 1306 here = (unsigned long) kmalloc(2*MBX_SIZE(i), 1307 GFP_KERNEL); 1308 if (!here) { 1309 error = -ENOMEM; 1310 goto out; 1311 } 1312 if ((here^(here+MBX_SIZE(i))) & ~0xffffUL)/* paranoia */ 1313 here = (here & ~0xffffUL)+0x10000; 1314 zatm_dev->mbx_start[i] = here; so even kfree((void *)zatm_dev->mbx_start[i]); is wrong in case there was an alignment, but kfree(&zatm_dev->mbx_start[i]) is wrong in all cases. 
Jakub From chas@cmf.nrl.navy.mil Wed Jul 20 08:09:17 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 08:09:22 -0700 (PDT) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6KF9EH9007448 for ; Wed, 20 Jul 2005 08:09:16 -0700 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id j6KF7Buk009484; Wed, 20 Jul 2005 11:07:11 -0400 (EDT) Message-Id: <200507201507.j6KF7Buk009484@ginger.cmf.nrl.navy.mil> From: chas@cmf.nrl.navy.mil To: Francois Romieu cc: netdev@oss.sgi.com Reply-To: chas3@users.sourceforge.net Subject: Re: [patch linux-2.6.13-rc2-gitXX 1/1] zatm: mailbox converted to pci_alloc_consistent() In-reply-to: <20050719223655.GA27724@electric-eye.fr.zoreil.com> Date: Wed, 20 Jul 2005 11:07:12 -0400 X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-archive-position: 2761 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 225 Lines: 5 In message <20050719223655.GA27724@electric-eye.fr.zoreil.com>,Francois Romieu writes: >+ mbx = pci_alloc_consistent(pdev, 2*MBX_SIZE(i), &mbx_dma); you might want a pci_set_dma_mask(,DMA_32BIT_MASK) before you call this. 
From ravinandan.arakali@neterion.com Wed Jul 20 10:06:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 10:06:35 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6KH6VH9017884 for ; Wed, 20 Jul 2005 10:06:32 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j6KH4dcx014376 for ; Wed, 20 Jul 2005 13:04:39 -0400 (EDT) Received: from rarakali ([10.16.16.79]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j6KH4bKP027558; Wed, 20 Jul 2005 13:04:37 -0400 (EDT) From: "Ravinandan Arakali" To: Cc: "Leonid. Grossman \(E-mail\)" , "Raghavendra. Koushik \(E-mail\)" Subject: Seeing problem on 2.6.13-rc3 Date: Wed, 20 Jul 2005 10:04:36 -0700 Message-ID: <001201c58d4d$1c1f78f0$4f10100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) Importance: Normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2762 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 564 Lines: 18 Hi, I am facing the following problem with loading S2io 10GbE driver on 2.6.13-rc3. Just wondering if anybody else is having similar problems with S2io driver or other drivers. When trying to load s2io driver, I see the following message: eth%d: 1:Endian settings are wrong, feedback read ffffffffffffffff eth%d:swapper settings are wrong Basically, we try to configure the card for correct access on both endian systems. This seems to fail. But on a somewhat similar kernel(2.6.13-rc2-mm2), with the same system, card and driver, it works fine. 
Thanks, Ravi From davem@davemloft.net Wed Jul 20 11:58:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 11:58:27 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6KIwMH9024274 for ; Wed, 20 Jul 2005 11:58:22 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DvJk6-0005yy-Hj; Wed, 20 Jul 2005 11:56:30 -0700 Date: Wed, 20 Jul 2005 11:56:30 -0700 (PDT) Message-Id: <20050720.115630.48529801.davem@davemloft.net> To: jakub@redhat.com Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [PATCH 1/8][ATM]: [zatm] eliminate kfree warning (from Tobias Hirning ) From: "David S. Miller" In-Reply-To: <20050720071919.GV4740@sunsite.mff.cuni.cz> References: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> <20050720071919.GV4740@sunsite.mff.cuni.cz> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2764 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 220 Lines: 7 From: Jakub Jelinek Date: Wed, 20 Jul 2005 09:19:20 +0200 > This can't be right. zatm_dev->mbx_start[i] is allocated with: Correct, Francois posted a much better fix and I'll put that into my tree. 
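Jakub's objection can be illustrated with a small userspace sketch. It uses malloc()/free() in place of kmalloc()/kfree(), and the struct mbox, mbox_alloc() and mbox_free() names are hypothetical, not taken from zatm.c. The idea it shows: once the start address has been bumped to the next 64 KiB boundary, it is no longer the value the allocator returned, so the raw pointer has to be remembered separately for freeing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MBX_ALIGN 0x10000UL  /* 64 KiB window, matching the 0xffff masks quoted above */

/* Hypothetical bookkeeping: keep both the raw and the adjusted address. */
struct mbox {
    void *raw;        /* what malloc() returned; the only value free() accepts */
    uintptr_t start;  /* possibly bumped so the mailbox stays in one window */
};

/* Allocate 2*size bytes and choose a start so that [start, start+size)
 * does not cross a 64 KiB boundary (same test as the "paranoia" line). */
static int mbox_alloc(struct mbox *m, size_t size)
{
    uintptr_t here;

    assert(size > 0 && size <= MBX_ALIGN);
    m->raw = malloc(2 * size);
    if (!m->raw)
        return -1;
    here = (uintptr_t)m->raw;
    if ((here ^ (here + size)) & ~(uintptr_t)(MBX_ALIGN - 1))
        here = (here & ~(uintptr_t)(MBX_ALIGN - 1)) + MBX_ALIGN;
    m->start = here;
    return 0;
}

static void mbox_free(struct mbox *m)
{
    free(m->raw);  /* not (void *)m->start, and never &m->start */
    m->raw = NULL;
    m->start = 0;
}
```

Freeing (void *)m->start would only be correct when no bump happened, and freeing &m->start would hand the allocator the address of a struct member, which is the case Jakub calls wrong in all cases.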
From davem@davemloft.net Wed Jul 20 11:57:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 11:57:31 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6KIvOH9024165 for ; Wed, 20 Jul 2005 11:57:25 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DvJj9-0005yl-5U; Wed, 20 Jul 2005 11:55:31 -0700 Date: Wed, 20 Jul 2005 11:55:30 -0700 (PDT) Message-Id: <20050720.115530.85417052.davem@davemloft.net> To: chas3@users.sourceforge.net, chas@cmf.nrl.navy.mil Cc: romieu@fr.zoreil.com, netdev@oss.sgi.com Subject: Re: [patch linux-2.6.13-rc2-gitXX 1/1] zatm: mailbox converted to pci_alloc_consistent() From: "David S. Miller" In-Reply-To: <200507201507.j6KF7Buk009484@ginger.cmf.nrl.navy.mil> References: <20050719223655.GA27724@electric-eye.fr.zoreil.com> <200507201507.j6KF7Buk009484@ginger.cmf.nrl.navy.mil> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2763 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 431 Lines: 11 From: chas@cmf.nrl.navy.mil Subject: Re: [patch linux-2.6.13-rc2-gitXX 1/1] zatm: mailbox converted to pci_alloc_consistent() Date: Wed, 20 Jul 2005 11:07:12 -0400 > In message <20050719223655.GA27724@electric-eye.fr.zoreil.com>,Francois Romieu > writes: > >+ mbx = pci_alloc_consistent(pdev, 2*MBX_SIZE(i), &mbx_dma); > > you might want a pci_set_dma_mask(,DMA_32BIT_MASK) before you call this. That's the default, no need. 
From davem@davemloft.net Wed Jul 20 12:04:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 12:04:12 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6KJ48H9025371 for ; Wed, 20 Jul 2005 12:04:08 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DvJpj-00060C-Im; Wed, 20 Jul 2005 12:02:19 -0700 Date: Wed, 20 Jul 2005 12:02:19 -0700 (PDT) Message-Id: <20050720.120219.75429205.davem@davemloft.net> To: romieu@fr.zoreil.com Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com Subject: Re: [patch linux-2.6.13-rc2-gitXX 1/1] zatm: mailbox converted to pci_alloc_consistent() From: "David S. Miller" In-Reply-To: <20050719223655.GA27724@electric-eye.fr.zoreil.com> References: <200507192044.j6JKirFb027183@ginger.cmf.nrl.navy.mil> <20050719223134.GA24535@electric-eye.fr.zoreil.com> <20050719223655.GA27724@electric-eye.fr.zoreil.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2765 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 419 Lines: 14 From: Francois Romieu Date: Wed, 20 Jul 2005 00:36:55 +0200 > mailbox converted to pci_alloc_consistent() > > - request_region() is not needed: zatm_init_one() issues > pci_request_regions(); > - the warning related to kfree(zatm_dev->mbx_start) disappears; > > Compiled with i386 and sparc64 as target. > > Signed-off-by: Francois Romieu This looks great, applied. 
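The error path in Francois's patch unwinds in reverse order of setup, with one label per stage and a `while (i-- > 0)` loop for the mailboxes. A userspace sketch of that shape follows; it is only an illustration, with malloc()/free() standing in for the DMA allocations, and NR_SLOTS, alloc_step(), demo_start(), demo_stop() and the fail_at parameter all hypothetical names introduced here to make the failure paths testable.

```c
#include <stdlib.h>

#define NR_SLOTS 4  /* hypothetical stand-in for NR_MBX */

static void *slots[NR_SLOTS];
static void *tx_map, *rx_map;

/* fail_at simulates an allocation failure at the given step (-1 = none);
 * steps 0..NR_SLOTS-1 are the slots, then tx_map, then rx_map. */
static void *alloc_step(int step, int fail_at)
{
    return step == fail_at ? NULL : malloc(64);
}

static int demo_start(int fail_at)
{
    int error = 0;
    int i;

    for (i = 0; i < NR_SLOTS; i++) {
        slots[i] = alloc_step(i, fail_at);
        if (!slots[i]) {
            error = -1;
            goto out;  /* slots[0..i-1] are live, slots[i] is not */
        }
    }
    tx_map = alloc_step(NR_SLOTS, fail_at);
    if (!tx_map) {
        error = -1;
        goto out;
    }
    rx_map = alloc_step(NR_SLOTS + 1, fail_at);
    if (!rx_map) {
        error = -1;
        goto out_tx;
    }
done:
    return error;

out_tx:
    free(tx_map);
    tx_map = NULL;
out:
    /* i is the number of slots that were set up; release exactly those */
    while (i-- > 0) {
        free(slots[i]);
        slots[i] = NULL;
    }
    goto done;
}

/* Release everything after a successful demo_start(). */
static void demo_stop(void)
{
    int i;

    free(rx_map);
    free(tx_map);
    rx_map = tx_map = NULL;
    for (i = 0; i < NR_SLOTS; i++) {
        free(slots[i]);
        slots[i] = NULL;
    }
}
```

Each label undoes only what was set up before the failing step, so a failure in the middle of the loop releases just the earlier slots, while a later failure (with i equal to NR_SLOTS) releases all of them.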
From dsd@gentoo.org Wed Jul 20 14:14:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 14:14:39 -0700 (PDT) Received: from mta08-winn.ispmail.ntl.com (mta08-winn.ispmail.ntl.com [81.103.221.48]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6KLEUH9005141 for ; Wed, 20 Jul 2005 14:14:34 -0700 Received: from aamta11-winn.ispmail.ntl.com ([81.103.221.35]) by mta08-winn.ispmail.ntl.com with ESMTP id <20050720211240.RFVU889.mta08-winn.ispmail.ntl.com@aamta11-winn.ispmail.ntl.com>; Wed, 20 Jul 2005 22:12:40 +0100 Received: from zog.reactivated.net ([81.99.81.161]) by aamta11-winn.ispmail.ntl.com with ESMTP id <20050720211240.JTKN22522.aamta11-winn.ispmail.ntl.com@zog.reactivated.net>; Wed, 20 Jul 2005 22:12:40 +0100 Received: from [192.168.0.2] (dsd [192.168.0.2]) by zog.reactivated.net (Postfix) with ESMTP id 634E47BB83B; Wed, 20 Jul 2005 22:37:27 +0100 (BST) Message-ID: <42DEBF19.7000207@gentoo.org> Date: Wed, 20 Jul 2005 22:16:09 +0100 From: Daniel Drake User-Agent: Mozilla Thunderbird 1.0.5 (X11/20050715) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Manfred Spraul Cc: Linux Kernel Mailing List , Netdev , Ayaz Abdulla Subject: Re: [PATCH] forcedeth: TX handler changes (experimental) References: <42D913D6.5050902@colorfullife.com> <42D9658B.7020907@gentoo.org> <42D97B29.4050400@gentoo.org> <42D97DD0.6030207@colorfullife.com> In-Reply-To: <42D97DD0.6030207@colorfullife.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2767 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dsd@gentoo.org Precedence: bulk X-list: netdev Content-Length: 250 Lines: 12 Manfred Spraul wrote: > Autsch. > Yes, you are right. Sorry for that, I should have reread the patch once > more.
No problem :) I've been running v0.38 since my last mail. No problems at all. Thanks for your continued work on the driver. Daniel From jesse.brandeburg@intel.com Wed Jul 20 16:25:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 20 Jul 2005 16:26:03 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6KNPxH9016129 for ; Wed, 20 Jul 2005 16:25:59 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j6KNO0FZ001320; Wed, 20 Jul 2005 23:24:00 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j6KNMAnA018200; Wed, 20 Jul 2005 23:24:00 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005072016240004083 ; Wed, 20 Jul 2005 16:24:00 -0700 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Wed, 20 Jul 2005 16:24:00 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: [E1000-devel] Re: drop counts Date: Wed, 20 Jul 2005 16:23:59 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [E1000-devel] Re: drop counts thread-index: AcWNey9LqUd7z49+SOqqfCi34TngCwABElzQ From: "Brandeburg, Jesse" To: Cc: , , X-OriginalArrivalTime: 20 Jul 2005 23:24:00.0323 (UTC) FILETIME=[1BE5B130:01C58D82] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j6KNPxH9016129 X-archive-position: 2768 X-ecartis-version: Ecartis v1.0.0 
Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jesse.brandeburg@intel.com Precedence: bulk X-list: netdev Content-Length: 234 Lines: 10 I apologize for my misconfigured email client, this is my correct address PS machine rebuilds suck. -----Original Message----- From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Jesse Brandeburg From ravinandan.arakali@neterion.com Thu Jul 21 11:44:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 21 Jul 2005 11:44:55 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6LIiYH9020385 for ; Thu, 21 Jul 2005 11:44:39 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j6LIgBcx019070; Thu, 21 Jul 2005 14:42:11 -0400 (EDT) Received: from rarakali ([10.16.16.72]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j6LIg8KP001439; Thu, 21 Jul 2005 14:42:09 -0400 (EDT) From: "Ravinandan Arakali" To: "'David S. Miller'" Cc: , , , , , Subject: RE: [PATCH 2.6.12-rc4] IPv4/IPv6: USO v2, Scatter-gather approach Date: Thu, 21 Jul 2005 11:42:10 -0700 Message-ID: <000c01c58e23$e81ea850$4810100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Importance: Normal In-Reply-To: <20050719.142320.52167011.davem@davemloft.net> X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2769 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 1787 Lines: 58 David, We are working on your comments. Will get back to you before end of next week. 
Thanks,
Ravi

-----Original Message-----
From: David S. Miller [mailto:davem@davemloft.net]
Sent: Tuesday, July 19, 2005 2:23 PM
To: ravinandan.arakali@neterion.com
Cc: jgarzik@pobox.com; netdev@oss.sgi.com; raghavendra.koushik@neterion.com; leonid.grossman@neterion.com; ananda.raju@neterion.com; rapuru.sriram@neterion.com
Subject: Re: [PATCH 2.6.12-rc4] IPv4/IPv6: USO v2, Scatter-gather approach

From: ravinandan.arakali@neterion.com
Date: Thu, 2 Jun 2005 17:41:06 -0700 (PDT)

> Attached below is version 2 of kernel patch for UDP Large send offload
> feature. This patch uses the "Scatter-Gather" approach.
> It also incorporates David Miller's comments on the first version.

I've reviewed these patches, and I think there is a problem with the
sock_append_data() scheme.

You disallow the case where there is an SKB on the write queue already.
This breaks NFS, and other things using MSG_MORE and UDP_CORK. They do
a two step packet building:

1) Send protocol headers, f.e. NFS
2) Send file contents via sendfile()
3) Uncork socket so packet gets emitted

and due to this check:

+ if (skb_queue_len(&sk->sk_write_queue)) {
+ *err = -EOPNOTSUPP;
+ return NULL;
+ }

you end up not supporting this correctly.

So we have two options, either add support for corked socket handling
to sock_append_data() or we go with the frag_list patch which doesn't
have this problem.

I prefer the frag_list patch from a cleanliness perspective, however
I remember you saying that the sock_append_data() approach obtained
better performance. And that seems clear since there will be less TX
descriptors needed to send such frames unless the driver does coalescing
as it walks the frag_list chain.
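For readers unfamiliar with the corked pattern described in the mail above, here is a minimal userspace sketch of it, assuming a Linux host and an already connected UDP socket. The function name send_corked() is hypothetical and does not come from any patch in this thread, and it uses a second plain send() where NFS would use sendfile() for the file contents. The point is only that the header is already on the socket's write queue when the payload arrives, which is the situation the quoted skb_queue_len() check rejects.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/udp.h>   /* UDP_CORK, Linux-specific */

/*
 * Build ONE UDP datagram from two separate writes on a connected
 * socket (hypothetical helper, for illustration only).
 * Returns 0 on success, -1 on error with errno set by the failing call.
 */
int send_corked(int fd, const void *hdr, size_t hdr_len,
                const void *body, size_t body_len)
{
    int on = 1, off = 0;

    /* Step 0: cork the socket so writes accumulate instead of being sent. */
    if (setsockopt(fd, IPPROTO_UDP, UDP_CORK, &on, sizeof(on)) < 0)
        return -1;

    /* Step 1: protocol header; it now sits on the write queue. */
    if (send(fd, hdr, hdr_len, 0) < 0)
        goto fail;

    /* Step 2: payload, appended to the datagram already queued. */
    if (send(fd, body, body_len, 0) < 0)
        goto fail;

    /* Step 3: uncork, which emits header + payload as one datagram. */
    return setsockopt(fd, IPPROTO_UDP, UDP_CORK, &off, sizeof(off));

fail:
    /* Restore the socket state; this flushes whatever was queued. */
    setsockopt(fd, IPPROTO_UDP, UDP_CORK, &off, sizeof(off));
    return -1;
}
```

A receiver on the other end should see a single datagram of hdr_len + body_len bytes. Passing MSG_MORE on the first send() is the per-call way to get the same accumulation without setting the socket option.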
From manfred@colorfullife.com Fri Jul 22 13:02:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 22 Jul 2005 13:03:03 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6MK2sH9008043 for ; Fri, 22 Jul 2005 13:02:57 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6MK47Rv021471; Fri, 22 Jul 2005 22:04:08 +0200 Message-ID: <42E15076.4080907@colorfullife.com> Date: Fri, 22 Jul 2005 22:00:54 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Ayaz Abdulla , Netdev Subject: [PATCH] forcedeth: simplify tx interrupt handling Content-Type: multipart/mixed; boundary="------------080303050106030109080905" X-archive-position: 2770 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 10790 Lines: 260 This is a multi-part message in MIME format. --------------080303050106030109080905 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi Jeff, The nForce MAC has two options for generating tx completion interrupts: Either by setting a per-packet flag, or by setting a bit in the irq mask (which enables interrupts for all packets). forcedeth tried to use one approach for nForce boards and the other approach for nForce 2/3/4. The attached patch removes the special cases and uses the same approach for all nForce versions. The patch also adds extensive debug output that should help to identify the tx hang described in the bugzilla.kernel.org bugreport 4552. 
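As an aside for readers of the archive, the hex masks in the attached patch can be decomposed into the individual interrupt bits. The sketch below is not driver code: the names are shortened, the value 0x0001 for the RX error bit is an assumption (the hunk does not show that define), and the decomposition of the three masks is simply a reading of the hex values in the diff.

```c
#include <stdint.h>

/*
 * Interrupt status bits with the values used in the patch.
 * IRQ_RX_ERROR = 0x0001 is an assumption; the hunk does not show it.
 */
enum {
    IRQ_RX_ERROR = 0x0001,
    IRQ_RX       = 0x0002,
    IRQ_RX_NOBUF = 0x0004,
    IRQ_TX_ERR   = 0x0008,
    IRQ_TX_OK    = 0x0010,  /* named NVREG_IRQ_TX2 before the patch */
    IRQ_TIMER    = 0x0020,
    IRQ_LINK     = 0x0040,
    IRQ_TX_ERROR = 0x0080,  /* added by the patch */
    IRQ_TX1      = 0x0100
};

/* Old mask for the first nForce generation (NVREG_IRQMASK_WANTED_1). */
#define IRQMASK_OLD_1 (IRQ_RX_ERROR | IRQ_RX | IRQ_RX_NOBUF | \
                       IRQ_TX_ERR | IRQ_TX_OK | IRQ_LINK)

/* Old mask for nForce 2/3/4 (NVREG_IRQMASK_WANTED_2): tx completion
 * came from the per-packet flag and was reported through IRQ_TX1. */
#define IRQMASK_OLD_2 (IRQ_RX_ERROR | IRQ_RX | IRQ_RX_NOBUF | \
                       IRQ_LINK | IRQ_TX1)

/* Single mask used for all chips after the patch (NVREG_IRQMASK_WANTED). */
#define IRQMASK_NEW   (IRQ_RX_ERROR | IRQ_RX | IRQ_RX_NOBUF | IRQ_TX_ERR | \
                       IRQ_TX_OK | IRQ_LINK | IRQ_TX_ERROR)

/* Mirrors the condition the patched interrupt handler uses to decide
 * whether to run the tx completion path. */
int tx_work_pending(uint32_t events)
{
    return (events & (IRQ_TX1 | IRQ_TX_OK | IRQ_TX_ERROR | IRQ_TX_ERR)) != 0;
}
```

Under these assumptions the three macros evaluate to 0x005f, 0x0147 and 0x00df, matching the constants in the diff. The timer bit is left out of all three because the driver ORs it in separately for devices flagged DEV_NEED_TIMERIRQ.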
Signed-Off-By: Manfred Spraul --------------080303050106030109080905 Content-Type: text/plain; name="patch-forcedeth-038-txirq" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-038-txirq" --- 2.6/drivers/net/forcedeth.c 2005-07-16 13:10:30.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-07-16 15:58:03.000000000 +0200 @@ -87,6 +87,8 @@ * 0.35: 26 Jun 2005: Support for MCP55 added. * 0.36: 28 Jun 2005: Add jumbo frame support. * 0.37: 10 Jul 2005: Additional ethtool support, cleanup of pci id list + * 0.38: 16 Jul 2005: tx irq rewrite: Use global flags instead of + * per-packet flags. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -98,7 +100,7 @@ * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few * superfluous timer interrupts from the nic. */ -#define FORCEDETH_VERSION "0.36" +#define FORCEDETH_VERSION "0.38" #define DRV_NAME "forcedeth" #include @@ -133,12 +135,9 @@ * Hardware access: */ -#define DEV_NEED_LASTPACKET1 0x0001 /* set LASTPACKET1 in tx flags */ -#define DEV_IRQMASK_1 0x0002 /* use NVREG_IRQMASK_WANTED_1 for irq mask */ -#define DEV_IRQMASK_2 0x0004 /* use NVREG_IRQMASK_WANTED_2 for irq mask */ -#define DEV_NEED_TIMERIRQ 0x0008 /* set the timer irq flag in the irq mask */ -#define DEV_NEED_LINKTIMER 0x0010 /* poll link settings. Relies on the timer irq */ -#define DEV_HAS_LARGEDESC 0x0020 /* device supports jumbo frames and needs packet format 2 */ +#define DEV_NEED_TIMERIRQ 0x0001 /* set the timer irq flag in the irq mask */ +#define DEV_NEED_LINKTIMER 0x0002 /* poll link settings. 
Relies on the timer irq */ +#define DEV_HAS_LARGEDESC 0x0004 /* device supports jumbo frames and needs packet format 2 */ enum { NvRegIrqStatus = 0x000, @@ -149,13 +148,16 @@ #define NVREG_IRQ_RX 0x0002 #define NVREG_IRQ_RX_NOBUF 0x0004 #define NVREG_IRQ_TX_ERR 0x0008 -#define NVREG_IRQ_TX2 0x0010 +#define NVREG_IRQ_TX_OK 0x0010 #define NVREG_IRQ_TIMER 0x0020 #define NVREG_IRQ_LINK 0x0040 +#define NVREG_IRQ_TX_ERROR 0x0080 #define NVREG_IRQ_TX1 0x0100 -#define NVREG_IRQMASK_WANTED_1 0x005f -#define NVREG_IRQMASK_WANTED_2 0x0147 -#define NVREG_IRQ_UNKNOWN (~(NVREG_IRQ_RX_ERROR|NVREG_IRQ_RX|NVREG_IRQ_RX_NOBUF|NVREG_IRQ_TX_ERR|NVREG_IRQ_TX2|NVREG_IRQ_TIMER|NVREG_IRQ_LINK|NVREG_IRQ_TX1)) +#define NVREG_IRQMASK_WANTED 0x00df + +#define NVREG_IRQ_UNKNOWN (~(NVREG_IRQ_RX_ERROR|NVREG_IRQ_RX|NVREG_IRQ_RX_NOBUF|NVREG_IRQ_TX_ERR| \ + NVREG_IRQ_TX_OK|NVREG_IRQ_TIMER|NVREG_IRQ_LINK|NVREG_IRQ_TX_ERROR| \ + NVREG_IRQ_TX1)) NvRegUnknownSetupReg6 = 0x008, #define NVREG_UNKSETUP6_VAL 3 @@ -296,7 +298,7 @@ #define NV_TX_LASTPACKET (1<<16) #define NV_TX_RETRYERROR (1<<19) -#define NV_TX_LASTPACKET1 (1<<24) +#define NV_TX_FORCED_INTERRUPT (1<<24) #define NV_TX_DEFERRED (1<<26) #define NV_TX_CARRIERLOST (1<<27) #define NV_TX_LATECOLLISION (1<<28) @@ -306,7 +308,7 @@ #define NV_TX2_LASTPACKET (1<<29) #define NV_TX2_RETRYERROR (1<<18) -#define NV_TX2_LASTPACKET1 (1<<23) +#define NV_TX2_FORCED_INTERRUPT (1<<30) #define NV_TX2_DEFERRED (1<<25) #define NV_TX2_CARRIERLOST (1<<26) #define NV_TX2_LATECOLLISION (1<<27) @@ -1013,9 +1015,39 @@ struct fe_priv *np = get_nvpriv(dev); u8 __iomem *base = get_hwbase(dev); - dprintk(KERN_DEBUG "%s: Got tx_timeout. irq: %08x\n", dev->name, + printk(KERN_INFO "%s: Got tx_timeout. 
irq: %08x\n", dev->name, readl(base + NvRegIrqStatus) & NVREG_IRQSTAT_MASK); + { + int i; + + printk(KERN_INFO "%s: Ring at %lx: next %d nic %d\n", + dev->name, (unsigned long)np->ring_addr, + np->next_tx, np->nic_tx); + printk(KERN_INFO "%s: Dumping tx registers\n", dev->name); + for (i=0;i<0x400;i+= 32) { + printk(KERN_INFO "%3x: %08x %08x %08x %08x %08x %08x %08x %08x\n", + i, + readl(base + i + 0), readl(base + i + 4), + readl(base + i + 8), readl(base + i + 12), + readl(base + i + 16), readl(base + i + 20), + readl(base + i + 24), readl(base + i + 28)); + } + printk(KERN_INFO "%s: Dumping tx ring\n", dev->name); + for (i=0;i<TX_RING;i+= 4) { + printk(KERN_INFO "%03x: %08x %08x // %08x %08x // %08x %08x // %08x %08x\n", + i, + le32_to_cpu(np->tx_ring[i].PacketBuffer), + le32_to_cpu(np->tx_ring[i].FlagLen), + le32_to_cpu(np->tx_ring[i+1].PacketBuffer), + le32_to_cpu(np->tx_ring[i+1].FlagLen), + le32_to_cpu(np->tx_ring[i+2].PacketBuffer), + le32_to_cpu(np->tx_ring[i+2].FlagLen), + le32_to_cpu(np->tx_ring[i+3].PacketBuffer), + le32_to_cpu(np->tx_ring[i+3].FlagLen)); + } + } + spin_lock_irq(&np->lock); /* 1) stop tx engine */ @@ -1557,7 +1589,7 @@ if (!(events & np->irqmask)) break; - if (events & (NVREG_IRQ_TX1|NVREG_IRQ_TX2|NVREG_IRQ_TX_ERR)) { + if (events & (NVREG_IRQ_TX1|NVREG_IRQ_TX_OK|NVREG_IRQ_TX_ERROR|NVREG_IRQ_TX_ERR)) { spin_lock(&np->lock); nv_tx_done(dev); spin_unlock(&np->lock); @@ -2213,17 +2245,10 @@ if (np->desc_ver == DESC_VER_1) { np->tx_flags = NV_TX_LASTPACKET|NV_TX_VALID; - if (id->driver_data & DEV_NEED_LASTPACKET1) - np->tx_flags |= NV_TX_LASTPACKET1; } else { np->tx_flags = NV_TX2_LASTPACKET|NV_TX2_VALID; - if (id->driver_data & DEV_NEED_LASTPACKET1) - np->tx_flags |= NV_TX2_LASTPACKET1; } - if (id->driver_data & DEV_IRQMASK_1) - np->irqmask = NVREG_IRQMASK_WANTED_1; - if (id->driver_data & DEV_IRQMASK_2) - np->irqmask = NVREG_IRQMASK_WANTED_2; + np->irqmask = NVREG_IRQMASK_WANTED; if (id->driver_data & DEV_NEED_TIMERIRQ) np->irqmask |= NVREG_IRQ_TIMER; if (id->driver_data & DEV_NEED_LINKTIMER) { @@ -2329,73 +2354,63 @@ static struct pci_device_id pci_tbl[]
= { { /* nForce Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_1), - .driver_data = DEV_IRQMASK_1|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce2 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_2), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_3), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_4), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_5), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_6), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* nForce3 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_7), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_8), - .driver_data = 
DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_9), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_10), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_11), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP51 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_12), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP51 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_13), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_14), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_15), - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ| - 
DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, }, {0,}, }; --------------080303050106030109080905-- From manfred@colorfullife.com Fri Jul 22 13:47:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 22 Jul 2005 13:48:33 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6MKlOH9011609 for ; Fri, 22 Jul 2005 13:47:25 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6MKmeDu021693; Fri, 22 Jul 2005 22:48:41 +0200 Message-ID: <42E15AE7.3020304@colorfullife.com> Date: Fri, 22 Jul 2005 22:45:27 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Netdev , Ayaz Abdulla Subject: [PATCH] forcedeth: Add support for 64-bit dma addresses. Content-Type: multipart/mixed; boundary="------------050509060201050106010501" X-archive-position: 2771 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 16106 Lines: 420 This is a multi-part message in MIME format. --------------050509060201050106010501 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi Jeff, Ayaz wrote a patch that enables 64-bit DMA for recent nForce nics: The latest nForce nics support extended ring descriptors with 64-bit dma addresses (actual hardware limit: 40 bits) Could you add the patch to your tree? 
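As an aside for readers of the archive, the extended descriptor mentioned above stores a DMA address as two 32-bit words. The sketch below is a userspace model of that split, not the driver's code: the struct and function names are hypothetical, and to_le32()/from_le32() are stand-ins for the kernel's cpu_to_le32()/le32_to_cpu(). It splits the address first and converts each half afterwards; the attached patch instead shifts and masks the result of cpu_to_le64(), which appears to give the intended words only on little-endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace mirror of the patch's struct ring_desc_ex. */
struct ring_desc_ex_sketch {
    uint32_t buf_high;   /* upper 32 bits of the DMA address, little-endian */
    uint32_t buf_low;    /* lower 32 bits of the DMA address, little-endian */
    uint32_t reserved;
    uint32_t flaglen;
};

/* Stand-in for cpu_to_le32(): store v in little-endian byte order. */
static uint32_t to_le32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v & 0xff),         (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff), (uint8_t)((v >> 24) & 0xff)
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Stand-in for le32_to_cpu(): read a little-endian word back. */
static uint32_t from_le32(uint32_t v)
{
    uint8_t b[4];

    memcpy(b, &v, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Split in CPU byte order first, then convert each half. */
void desc_set_addr(struct ring_desc_ex_sketch *d, uint64_t dma)
{
    d->buf_high = to_le32((uint32_t)(dma >> 32));
    d->buf_low  = to_le32((uint32_t)(dma & 0xffffffffu));
}

/* Reassemble the 64-bit address from the two descriptor words. */
uint64_t desc_get_addr(const struct ring_desc_ex_sketch *d)
{
    return ((uint64_t)from_le32(d->buf_high) << 32) | from_le32(d->buf_low);
}
```

Because the conversion is applied per 32-bit word, the round trip through desc_set_addr() and desc_get_addr() returns the original address on both little-endian and big-endian hosts.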
Signed-Off-By: Manfred Spraul --------------050509060201050106010501 Content-Type: text/plain; name="patch-forcedeth-039-64bit" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-039-64bit" --- 2.6/drivers/net/forcedeth.c 2005-07-22 22:29:49.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-07-22 22:32:53.000000000 +0200 @@ -89,6 +89,7 @@ * 0.37: 10 Jul 2005: Additional ethtool support, cleanup of pci id list * 0.38: 16 Jul 2005: tx irq rewrite: Use global flags instead of * per-packet flags. + * 0.39: 18 Jul 2005: Add 64bit descriptor support. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -100,7 +101,7 @@ * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few * superfluous timer interrupts from the nic. */ -#define FORCEDETH_VERSION "0.38" +#define FORCEDETH_VERSION "0.39" #define DRV_NAME "forcedeth" #include @@ -138,6 +139,7 @@ #define DEV_NEED_TIMERIRQ 0x0001 /* set the timer irq flag in the irq mask */ #define DEV_NEED_LINKTIMER 0x0002 /* poll link settings. Relies on the timer irq */ #define DEV_HAS_LARGEDESC 0x0004 /* device supports jumbo frames and needs packet format 2 */ +#define DEV_HAS_HIGH_DMA 0x0008 /* device supports 64bit dma */ enum { NvRegIrqStatus = 0x000, @@ -291,6 +293,18 @@ u32 FlagLen; }; +struct ring_desc_ex { + u32 PacketBufferHigh; + u32 PacketBufferLow; + u32 Reserved; + u32 FlagLen; +}; + +typedef union _ring_type { + struct ring_desc* orig; + struct ring_desc_ex* ex; +} ring_type; + #define FLAG_MASK_V1 0xffff0000 #define FLAG_MASK_V2 0xffffc000 #define LEN_MASK_V1 (0xffffffff ^ FLAG_MASK_V1) @@ -405,6 +419,7 @@ */ #define DESC_VER_1 0x0 #define DESC_VER_2 (0x02100|NVREG_TXRXCTL_RXCHECK) +#define DESC_VER_3 (0x02200|NVREG_TXRXCTL_RXCHECK) /* PHY defines */ #define PHY_OUI_MARVELL 0x5043 @@ -477,7 +492,7 @@ /* rx specific fields. 
* Locking: Within irq hander or disable_irq+spin_lock(&np->lock); */ - struct ring_desc *rx_ring; + ring_type rx_ring; unsigned int cur_rx, refill_rx; struct sk_buff *rx_skbuff[RX_RING]; dma_addr_t rx_dma[RX_RING]; @@ -494,7 +509,7 @@ /* * tx specific fields. */ - struct ring_desc *tx_ring; + ring_type tx_ring; unsigned int next_tx, nic_tx; struct sk_buff *tx_skbuff[TX_RING]; dma_addr_t tx_dma[TX_RING]; @@ -529,6 +544,11 @@ & ((v == DESC_VER_1) ? LEN_MASK_V1 : LEN_MASK_V2); } +static inline u32 nv_descr_getlength_ex(struct ring_desc_ex *prd, u32 v) +{ + return le32_to_cpu(prd->FlagLen) & LEN_MASK_V2; +} + static int reg_delay(struct net_device *dev, int offset, u32 mask, u32 target, int delay, int delaymax, const char *msg) { @@ -813,9 +833,16 @@ } np->rx_dma[nr] = pci_map_single(np->pci_dev, skb->data, skb->len, PCI_DMA_FROMDEVICE); - np->rx_ring[nr].PacketBuffer = cpu_to_le32(np->rx_dma[nr]); - wmb(); - np->rx_ring[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX_AVAIL); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + np->rx_ring.orig[nr].PacketBuffer = cpu_to_le32(np->rx_dma[nr]); + wmb(); + np->rx_ring.orig[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX_AVAIL); + } else { + np->rx_ring.ex[nr].PacketBufferHigh = cpu_to_le64(np->rx_dma[nr]) >> 32; + np->rx_ring.ex[nr].PacketBufferLow = cpu_to_le64(np->rx_dma[nr]) & 0x0FFFFFFFF; + wmb(); + np->rx_ring.ex[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX2_AVAIL); + } dprintk(KERN_DEBUG "%s: nv_alloc_rx: Packet %d marked as Available\n", dev->name, refill_rx); refill_rx++; @@ -849,7 +876,10 @@ np->cur_rx = RX_RING; np->refill_rx = 0; for (i = 0; i < RX_RING; i++) - np->rx_ring[i].FlagLen = 0; + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->rx_ring.orig[i].FlagLen = 0; + else + np->rx_ring.ex[i].FlagLen = 0; } static void nv_init_tx(struct net_device *dev) @@ -859,7 +889,10 @@ np->next_tx = np->nic_tx = 0; for (i = 0; i < TX_RING; i++) - np->tx_ring[i].FlagLen = 0; + if 
(np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[i].FlagLen = 0; + else + np->tx_ring.ex[i].FlagLen = 0; } static int nv_init_ring(struct net_device *dev) @@ -874,7 +907,10 @@ struct fe_priv *np = get_nvpriv(dev); int i; for (i = 0; i < TX_RING; i++) { - np->tx_ring[i].FlagLen = 0; + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[i].FlagLen = 0; + else + np->tx_ring.ex[i].FlagLen = 0; if (np->tx_skbuff[i]) { pci_unmap_single(np->pci_dev, np->tx_dma[i], np->tx_skbuff[i]->len, @@ -891,7 +927,10 @@ struct fe_priv *np = get_nvpriv(dev); int i; for (i = 0; i < RX_RING; i++) { - np->rx_ring[i].FlagLen = 0; + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->rx_ring.orig[i].FlagLen = 0; + else + np->rx_ring.ex[i].FlagLen = 0; wmb(); if (np->rx_skbuff[i]) { pci_unmap_single(np->pci_dev, np->rx_dma[i], @@ -922,11 +961,19 @@ np->tx_dma[nr] = pci_map_single(np->pci_dev, skb->data,skb->len, PCI_DMA_TODEVICE); - np->tx_ring[nr].PacketBuffer = cpu_to_le32(np->tx_dma[nr]); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[nr].PacketBuffer = cpu_to_le32(np->tx_dma[nr]); + else { + np->tx_ring.ex[nr].PacketBufferHigh = cpu_to_le64(np->tx_dma[nr]) >> 32; + np->tx_ring.ex[nr].PacketBufferLow = cpu_to_le64(np->tx_dma[nr]) & 0x0FFFFFFFF; + } spin_lock_irq(&np->lock); wmb(); - np->tx_ring[nr].FlagLen = cpu_to_le32( (skb->len-1) | np->tx_flags ); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[nr].FlagLen = cpu_to_le32( (skb->len-1) | np->tx_flags ); + else + np->tx_ring.ex[nr].FlagLen = cpu_to_le32( (skb->len-1) | np->tx_flags ); dprintk(KERN_DEBUG "%s: nv_start_xmit: packet packet %d queued for transmission.\n", dev->name, np->next_tx); { @@ -964,7 +1011,10 @@ while (np->nic_tx != np->next_tx) { i = np->nic_tx % TX_RING; - Flags = le32_to_cpu(np->tx_ring[i].FlagLen); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + Flags = 
le32_to_cpu(np->tx_ring.orig[i].FlagLen); + else + Flags = le32_to_cpu(np->tx_ring.ex[i].FlagLen); dprintk(KERN_DEBUG "%s: nv_tx_done: looking at packet %d, Flags 0x%x.\n", dev->name, np->nic_tx, Flags); @@ -1035,16 +1085,33 @@ } printk(KERN_INFO "%s: Dumping tx ring\n", dev->name); for (i=0;i<TX_RING;i+= 4) { - printk(KERN_INFO "%03x: %08x %08x // %08x %08x // %08x %08x // %08x %08x\n", - i, - le32_to_cpu(np->tx_ring[i].PacketBuffer), - le32_to_cpu(np->tx_ring[i].FlagLen), - le32_to_cpu(np->tx_ring[i+1].PacketBuffer), - le32_to_cpu(np->tx_ring[i+1].FlagLen), - le32_to_cpu(np->tx_ring[i+2].PacketBuffer), - le32_to_cpu(np->tx_ring[i+2].FlagLen), - le32_to_cpu(np->tx_ring[i+3].PacketBuffer), - le32_to_cpu(np->tx_ring[i+3].FlagLen)); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + printk(KERN_INFO "%03x: %08x %08x // %08x %08x // %08x %08x // %08x %08x\n", + i, + le32_to_cpu(np->tx_ring.orig[i].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i].FlagLen), + le32_to_cpu(np->tx_ring.orig[i+1].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i+1].FlagLen), + le32_to_cpu(np->tx_ring.orig[i+2].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i+2].FlagLen), + le32_to_cpu(np->tx_ring.orig[i+3].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i+3].FlagLen)); + } else { + printk(KERN_INFO "%03x: %08x %08x %08x // %08x %08x %08x // %08x %08x %08x // %08x %08x %08x\n", + i, + le32_to_cpu(np->tx_ring.ex[i].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i].FlagLen), + le32_to_cpu(np->tx_ring.ex[i+1].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i+1].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i+1].FlagLen), + le32_to_cpu(np->tx_ring.ex[i+2].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i+2].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i+2].FlagLen), + le32_to_cpu(np->tx_ring.ex[i+3].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i+3].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i+3].FlagLen)); + } } } @@ -1061,7 +1128,10 @@ printk(KERN_DEBUG "%s: tx_timeout: dead entries!\n", dev->name); nv_drain_tx(dev); np->next_tx =
np->nic_tx = 0; - writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + else + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc_ex)), base + NvRegTxRingPhysAddr); netif_wake_queue(dev); } @@ -1136,8 +1206,13 @@ break; /* we scanned the whole ring - do not continue */ i = np->cur_rx % RX_RING; - Flags = le32_to_cpu(np->rx_ring[i].FlagLen); - len = nv_descr_getlength(&np->rx_ring[i], np->desc_ver); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + Flags = le32_to_cpu(np->rx_ring.orig[i].FlagLen); + len = nv_descr_getlength(&np->rx_ring.orig[i], np->desc_ver); + } else { + Flags = le32_to_cpu(np->rx_ring.ex[i].FlagLen); + len = nv_descr_getlength_ex(&np->rx_ring.ex[i], np->desc_ver); + } dprintk(KERN_DEBUG "%s: nv_rx_process: looking at packet %d, Flags 0x%x.\n", dev->name, np->cur_rx, Flags); @@ -1321,7 +1396,10 @@ /* reinit nic view of the rx queue */ writel(np->rx_buf_sz, base + NvRegOffloadConfig); writel((u32) np->ring_addr, base + NvRegRxRingPhysAddr); - writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + else + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc_ex)), base + NvRegTxRingPhysAddr); writel( ((RX_RING-1) << NVREG_RINGSZ_RXSHIFT) + ((TX_RING-1) << NVREG_RINGSZ_TXSHIFT), base + NvRegRingSizes); pci_push(base); @@ -1982,7 +2060,10 @@ /* 4) give hw rings */ writel((u32) np->ring_addr, base + NvRegRxRingPhysAddr); - writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + writel((u32) (np->ring_addr + 
RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + else + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc_ex)), base + NvRegTxRingPhysAddr); writel( ((RX_RING-1) << NVREG_RINGSZ_RXSHIFT) + ((TX_RING-1) << NVREG_RINGSZ_TXSHIFT), base + NvRegRingSizes); @@ -2179,18 +2260,38 @@ np->desc_ver = DESC_VER_2; np->pkt_limit = NV_PKTLIMIT_2; } + if (id->driver_data & DEV_HAS_HIGH_DMA) { + np->desc_ver = DESC_VER_3; + dev->features |= NETIF_F_HIGHDMA; + if (pci_set_dma_mask(pci_dev, 0x0000007fffffffffULL)) { + if (pci_set_dma_mask(pci_dev, 0xffffffffULL)) + goto out_relreg; + } + } err = -ENOMEM; np->base = ioremap(addr, NV_PCI_REGSZ); if (!np->base) goto out_relreg; dev->base_addr = (unsigned long)np->base; + dev->irq = pci_dev->irq; - np->rx_ring = pci_alloc_consistent(pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), - &np->ring_addr); - if (!np->rx_ring) - goto out_unmap; - np->tx_ring = &np->rx_ring[RX_RING]; + + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + np->rx_ring.orig = pci_alloc_consistent(pci_dev, + sizeof(struct ring_desc) * (RX_RING + TX_RING), + &np->ring_addr); + if (!np->rx_ring.orig) + goto out_unmap; + np->tx_ring.orig = &np->rx_ring.orig[RX_RING]; + } else { + np->rx_ring.ex = pci_alloc_consistent(pci_dev, + sizeof(struct ring_desc_ex) * (RX_RING + TX_RING), + &np->ring_addr); + if (!np->rx_ring.ex) + goto out_unmap; + np->tx_ring.ex = &np->rx_ring.ex[RX_RING]; + } dev->open = nv_open; dev->stop = nv_close; @@ -2313,8 +2414,12 @@ return 0; out_freering: - pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), - np->rx_ring, np->ring_addr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), + np->rx_ring.orig, np->ring_addr); + else + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc_ex) * (RX_RING + TX_RING), + np->rx_ring.ex, np->ring_addr); pci_set_drvdata(pci_dev, 
NULL); out_unmap: iounmap(get_hwbase(dev)); @@ -2343,7 +2448,10 @@ writel(np->orig_mac[1], base + NvRegMacAddrB); /* free all structures */ - pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), np->rx_ring, np->ring_addr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), np->rx_ring.orig, np->ring_addr); + else + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc_ex) * (RX_RING + TX_RING), np->rx_ring.ex, np->ring_addr); iounmap(get_hwbase(dev)); pci_release_regions(pci_dev); pci_disable_device(pci_dev); @@ -2382,35 +2490,35 @@ }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_8), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_9), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_10), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_11), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP51 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_12), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA, }, { /* MCP51 Ethernet Controller */ 
PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_13), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_14), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_15), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, {0,}, }; --------------050509060201050106010501-- From manfred@colorfullife.com Fri Jul 22 13:49:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 22 Jul 2005 13:50:04 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6MKnwH9011902 for ; Fri, 22 Jul 2005 13:49:58 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6MKpF2X021716; Fri, 22 Jul 2005 22:51:16 +0200 Message-ID: <42E15B82.9020006@colorfullife.com> Date: Fri, 22 Jul 2005 22:48:02 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Netdev , Ayaz Abdulla Subject: [PATCH] forcedeth: add set_mac_address support Content-Type: multipart/mixed; boundary="------------090704050606080709090205" X-archive-position: 2772 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 2978 Lines: 95 This is a multi-part message in MIME format. 
--------------090704050606080709090205 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi Jeff, Ayaz wrote a patch that adds set_mac_address to the forcedeth nic. Could you add it to your tree? set_mac_address is required for some bonding modes. Signed-Off-By: Manfred Spraul --------------090704050606080709090205 Content-Type: text/plain; name="patch-forcedeth-040-set_mac_address" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-040-set_mac_address" --- 2.6/drivers/net/forcedeth.c 2005-07-22 22:47:01.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-07-22 22:47:31.000000000 +0200 @@ -90,6 +90,7 @@ * 0.38: 16 Jul 2005: tx irq rewrite: Use global flags instead of * per-packet flags. * 0.39: 18 Jul 2005: Add 64bit descriptor support. + * 0.40: 19 Jul 2005: Add support for mac address change. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -1417,6 +1418,54 @@ } /* + * nv_set_mac_address: dev->set_mac_address function + * Called with rtnl_lock() held. 
+ */ +static int nv_set_mac_address(struct net_device *dev, void *addr) +{ + struct fe_priv *np = get_nvpriv(dev); + u8 *base = get_hwbase(dev); + struct sockaddr *macaddr = (struct sockaddr*)addr; + u32 mac[2]; + + if(!is_valid_ether_addr(macaddr->sa_data)) + return -EADDRNOTAVAIL; + + /* synchronized against open : rtnl_lock() held by caller */ + if (netif_running(dev)) { + spin_lock_bh(&dev->xmit_lock); + spin_lock_irq(&np->lock); + + /* stop rx engine */ + nv_stop_rx(dev); + + /* set mac address */ + memcpy(dev->dev_addr, macaddr->sa_data, ETH_ALEN); + mac[0] = (dev->dev_addr[0] << 0) + (dev->dev_addr[1] << 8) + + (dev->dev_addr[2] << 16) + (dev->dev_addr[3] << 24); + mac[1] = (dev->dev_addr[4] << 0) + (dev->dev_addr[5] << 8); + + writel(mac[0], base + NvRegMacAddrA); + writel(mac[1], base + NvRegMacAddrB); + + /* restart rx engine */ + nv_start_rx(dev); + spin_unlock_irq(&np->lock); + spin_unlock_bh(&dev->xmit_lock); + } else { + /* set mac address */ + memcpy(dev->dev_addr, macaddr->sa_data, ETH_ALEN); + mac[0] = (dev->dev_addr[0] << 0) + (dev->dev_addr[1] << 8) + + (dev->dev_addr[2] << 16) + (dev->dev_addr[3] << 24); + mac[1] = (dev->dev_addr[4] << 0) + (dev->dev_addr[5] << 8); + + writel(mac[0], base + NvRegMacAddrA); + writel(mac[1], base + NvRegMacAddrB); + } + return 0; +} + +/* * nv_set_multicast: dev->set_multicast function * Called with dev->xmit_lock held. 
*/ @@ -2298,6 +2347,7 @@ dev->hard_start_xmit = nv_start_xmit; dev->get_stats = nv_get_stats; dev->change_mtu = nv_change_mtu; + dev->set_mac_address = nv_set_mac_address; dev->set_multicast_list = nv_set_multicast; #ifdef CONFIG_NET_POLL_CONTROLLER dev->poll_controller = nv_poll_controller; --------------090704050606080709090205-- From jgarzik@pobox.com Fri Jul 22 13:58:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 22 Jul 2005 13:58:32 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6MKwUH9013218 for ; Fri, 22 Jul 2005 13:58:30 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1Dw4ZR-00011r-AI; Fri, 22 Jul 2005 20:56:37 +0000 Message-ID: <42E15D83.7000902@pobox.com> Date: Fri, 22 Jul 2005 16:56:35 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Manfred Spraul CC: Netdev , Ayaz Abdulla Subject: Re: [PATCH] forcedeth: Add support for 64-bit dma addresses. References: <42E15AE7.3020304@colorfullife.com> In-Reply-To: <42E15AE7.3020304@colorfullife.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2773 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 353 Lines: 12 Manfred Spraul wrote: > + if (id->driver_data & DEV_HAS_HIGH_DMA) { > + np->desc_ver = DESC_VER_3; > + dev->features |= NETIF_F_HIGHDMA; > + if (pci_set_dma_mask(pci_dev, 0x0000007fffffffffULL)) { > + if (pci_set_dma_mask(pci_dev, 0xffffffffULL)) > + goto out_relreg; > + } setting of NETIF_F_HIGHDMA is wrong, if 64-bit set-dma-mask fails. 
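A minimal userspace sketch of the ordering this comment asks for, assuming the high-DMA feature bit should be advertised only after the wide mask has been accepted. `fake_dev`, `fake_set_dma_mask()` and `FEATURE_HIGHDMA` are hypothetical stand-ins for the net_device/pci_dev pair, pci_set_dma_mask() and NETIF_F_HIGHDMA; none of them are kernel APIs.

```c
#include <stdint.h>

/* Hypothetical stand-in for NETIF_F_HIGHDMA. */
#define FEATURE_HIGHDMA 0x1u

/* Hypothetical stand-in for the device state touched during probe. */
struct fake_dev {
    uint64_t max_mask;   /* widest DMA mask the simulated platform accepts */
    uint64_t dma_mask;   /* mask that was actually configured */
    unsigned features;   /* feature bits advertised to the stack */
};

/* Follows the pci_set_dma_mask() convention: 0 on success, nonzero on failure. */
static int fake_set_dma_mask(struct fake_dev *d, uint64_t mask)
{
    if (mask > d->max_mask)
        return -1;
    d->dma_mask = mask;
    return 0;
}

/* Try the wide mask first; set the feature bit only if that call succeeded. */
static int setup_dma(struct fake_dev *d)
{
    if (fake_set_dma_mask(d, 0x0000007fffffffffULL) == 0) {
        d->features |= FEATURE_HIGHDMA;
        return 0;
    }
    /* Fall back to 32-bit addressing; the feature bit stays clear. */
    return fake_set_dma_mask(d, 0xffffffffULL);
}
```

With this ordering a device that only accepts the 32-bit mask still probes successfully, but never claims high-DMA support.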
From manfred@colorfullife.com Fri Jul 22 14:11:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 22 Jul 2005 14:11:35 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6MLBUH9014610 for ; Fri, 22 Jul 2005 14:11:31 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j6MLCkjr021822; Fri, 22 Jul 2005 23:12:47 +0200 Message-ID: <42E1608D.80608@colorfullife.com> Date: Fri, 22 Jul 2005 23:09:33 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Netdev , Ayaz Abdulla Subject: Re: [PATCH] forcedeth: Add support for 64-bit dma addresses. References: <42E15AE7.3020304@colorfullife.com> <42E15D83.7000902@pobox.com> In-Reply-To: <42E15D83.7000902@pobox.com> Content-Type: multipart/mixed; boundary="------------060508080608020503060100" X-archive-position: 2774 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 16408 Lines: 433 This is a multi-part message in MIME format. --------------060508080608020503060100 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Jeff Garzik wrote: > Manfred Spraul wrote: > >> + if (id->driver_data & DEV_HAS_HIGH_DMA) { >> + np->desc_ver = DESC_VER_3; >> + dev->features |= NETIF_F_HIGHDMA; >> + if (pci_set_dma_mask(pci_dev, 0x0000007fffffffffULL)) { >> + if (pci_set_dma_mask(pci_dev, 0xffffffffULL)) >> + goto out_relreg; >> + } > > > > setting of NETIF_F_HIGHDMA is wrong, if 64-bit set-dma-mask fails. Ok, I have removed the NETIF_F_HIGHDMA setting: forcedeth doesn't support scatter-gather, thus illegal_highdma() is never called, therefore NETIF_F_HIGHDMA doesn't matter. 
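The extended descriptors in the attached patch carry a 64-bit buffer address as two 32-bit words. A minimal userspace sketch of that split, assuming each half is taken from the CPU-order address first and only then converted to little-endian, so the result does not depend on host byte order. `desc_ex`, `to_le32()` and `desc_set_addr()` are hypothetical names for illustration; in the driver the conversion would be cpu_to_le32().

```c
#include <stdint.h>

/* Hypothetical mirror of the patch's struct ring_desc_ex, with fixed-width types. */
struct desc_ex {
    uint32_t buf_high;
    uint32_t buf_low;
    uint32_t reserved;
    uint32_t flaglen;
};

/* Userspace stand-in for cpu_to_le32(): the returned word is little-endian in memory. */
static uint32_t to_le32(uint32_t v)
{
    const union { uint16_t u; uint8_t b[2]; } probe = { 1 };

    if (probe.b[0] == 1)
        return v;   /* little-endian host: nothing to do */
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Split in CPU order, then convert each 32-bit half separately. */
static void desc_set_addr(struct desc_ex *d, uint64_t dma)
{
    d->buf_high = to_le32((uint32_t)(dma >> 32));
    d->buf_low  = to_le32((uint32_t)(dma & 0xffffffffu));
}
```

On a little-endian host this stores the same values as shifting the converted 64-bit value; the two approaches would differ only on a big-endian host.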
-- Manfred --------------060508080608020503060100 Content-Type: text/plain; name="patch-forcedeth-039-64bit" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-039-64bit" --- 2.6/drivers/net/forcedeth.c 2005-07-22 23:06:21.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-07-22 23:06:10.000000000 +0200 @@ -89,6 +89,7 @@ * 0.37: 10 Jul 2005: Additional ethtool support, cleanup of pci id list * 0.38: 16 Jul 2005: tx irq rewrite: Use global flags instead of * per-packet flags. + * 0.39: 18 Jul 2005: Add 64bit descriptor support. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -100,7 +101,7 @@ * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few * superfluous timer interrupts from the nic. */ -#define FORCEDETH_VERSION "0.38" +#define FORCEDETH_VERSION "0.39" #define DRV_NAME "forcedeth" #include @@ -138,6 +139,7 @@ #define DEV_NEED_TIMERIRQ 0x0001 /* set the timer irq flag in the irq mask */ #define DEV_NEED_LINKTIMER 0x0002 /* poll link settings. Relies on the timer irq */ #define DEV_HAS_LARGEDESC 0x0004 /* device supports jumbo frames and needs packet format 2 */ +#define DEV_HAS_HIGH_DMA 0x0008 /* device supports 64bit dma */ enum { NvRegIrqStatus = 0x000, @@ -291,6 +293,18 @@ u32 FlagLen; }; +struct ring_desc_ex { + u32 PacketBufferHigh; + u32 PacketBufferLow; + u32 Reserved; + u32 FlagLen; +}; + +typedef union _ring_type { + struct ring_desc* orig; + struct ring_desc_ex* ex; +} ring_type; + #define FLAG_MASK_V1 0xffff0000 #define FLAG_MASK_V2 0xffffc000 #define LEN_MASK_V1 (0xffffffff ^ FLAG_MASK_V1) @@ -405,6 +419,7 @@ */ #define DESC_VER_1 0x0 #define DESC_VER_2 (0x02100|NVREG_TXRXCTL_RXCHECK) +#define DESC_VER_3 (0x02200|NVREG_TXRXCTL_RXCHECK) /* PHY defines */ #define PHY_OUI_MARVELL 0x5043 @@ -477,7 +492,7 @@ /* rx specific fields. 
* Locking: Within irq hander or disable_irq+spin_lock(&np->lock); */ - struct ring_desc *rx_ring; + ring_type rx_ring; unsigned int cur_rx, refill_rx; struct sk_buff *rx_skbuff[RX_RING]; dma_addr_t rx_dma[RX_RING]; @@ -494,7 +509,7 @@ /* * tx specific fields. */ - struct ring_desc *tx_ring; + ring_type tx_ring; unsigned int next_tx, nic_tx; struct sk_buff *tx_skbuff[TX_RING]; dma_addr_t tx_dma[TX_RING]; @@ -529,6 +544,11 @@ & ((v == DESC_VER_1) ? LEN_MASK_V1 : LEN_MASK_V2); } +static inline u32 nv_descr_getlength_ex(struct ring_desc_ex *prd, u32 v) +{ + return le32_to_cpu(prd->FlagLen) & LEN_MASK_V2; +} + static int reg_delay(struct net_device *dev, int offset, u32 mask, u32 target, int delay, int delaymax, const char *msg) { @@ -813,9 +833,16 @@ } np->rx_dma[nr] = pci_map_single(np->pci_dev, skb->data, skb->len, PCI_DMA_FROMDEVICE); - np->rx_ring[nr].PacketBuffer = cpu_to_le32(np->rx_dma[nr]); - wmb(); - np->rx_ring[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX_AVAIL); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + np->rx_ring.orig[nr].PacketBuffer = cpu_to_le32(np->rx_dma[nr]); + wmb(); + np->rx_ring.orig[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX_AVAIL); + } else { + np->rx_ring.ex[nr].PacketBufferHigh = cpu_to_le64(np->rx_dma[nr]) >> 32; + np->rx_ring.ex[nr].PacketBufferLow = cpu_to_le64(np->rx_dma[nr]) & 0x0FFFFFFFF; + wmb(); + np->rx_ring.ex[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX2_AVAIL); + } dprintk(KERN_DEBUG "%s: nv_alloc_rx: Packet %d marked as Available\n", dev->name, refill_rx); refill_rx++; @@ -849,7 +876,10 @@ np->cur_rx = RX_RING; np->refill_rx = 0; for (i = 0; i < RX_RING; i++) - np->rx_ring[i].FlagLen = 0; + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->rx_ring.orig[i].FlagLen = 0; + else + np->rx_ring.ex[i].FlagLen = 0; } static void nv_init_tx(struct net_device *dev) @@ -859,7 +889,10 @@ np->next_tx = np->nic_tx = 0; for (i = 0; i < TX_RING; i++) - np->tx_ring[i].FlagLen = 0; + if 
(np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[i].FlagLen = 0; + else + np->tx_ring.ex[i].FlagLen = 0; } static int nv_init_ring(struct net_device *dev) @@ -874,7 +907,10 @@ struct fe_priv *np = get_nvpriv(dev); int i; for (i = 0; i < TX_RING; i++) { - np->tx_ring[i].FlagLen = 0; + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[i].FlagLen = 0; + else + np->tx_ring.ex[i].FlagLen = 0; if (np->tx_skbuff[i]) { pci_unmap_single(np->pci_dev, np->tx_dma[i], np->tx_skbuff[i]->len, @@ -891,7 +927,10 @@ struct fe_priv *np = get_nvpriv(dev); int i; for (i = 0; i < RX_RING; i++) { - np->rx_ring[i].FlagLen = 0; + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->rx_ring.orig[i].FlagLen = 0; + else + np->rx_ring.ex[i].FlagLen = 0; wmb(); if (np->rx_skbuff[i]) { pci_unmap_single(np->pci_dev, np->rx_dma[i], @@ -922,11 +961,19 @@ np->tx_dma[nr] = pci_map_single(np->pci_dev, skb->data,skb->len, PCI_DMA_TODEVICE); - np->tx_ring[nr].PacketBuffer = cpu_to_le32(np->tx_dma[nr]); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[nr].PacketBuffer = cpu_to_le32(np->tx_dma[nr]); + else { + np->tx_ring.ex[nr].PacketBufferHigh = cpu_to_le64(np->tx_dma[nr]) >> 32; + np->tx_ring.ex[nr].PacketBufferLow = cpu_to_le64(np->tx_dma[nr]) & 0x0FFFFFFFF; + } spin_lock_irq(&np->lock); wmb(); - np->tx_ring[nr].FlagLen = cpu_to_le32( (skb->len-1) | np->tx_flags ); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + np->tx_ring.orig[nr].FlagLen = cpu_to_le32( (skb->len-1) | np->tx_flags ); + else + np->tx_ring.ex[nr].FlagLen = cpu_to_le32( (skb->len-1) | np->tx_flags ); dprintk(KERN_DEBUG "%s: nv_start_xmit: packet packet %d queued for transmission.\n", dev->name, np->next_tx); { @@ -964,7 +1011,10 @@ while (np->nic_tx != np->next_tx) { i = np->nic_tx % TX_RING; - Flags = le32_to_cpu(np->tx_ring[i].FlagLen); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + Flags = 
le32_to_cpu(np->tx_ring.orig[i].FlagLen); + else + Flags = le32_to_cpu(np->tx_ring.ex[i].FlagLen); dprintk(KERN_DEBUG "%s: nv_tx_done: looking at packet %d, Flags 0x%x.\n", dev->name, np->nic_tx, Flags); @@ -1035,16 +1085,33 @@ } printk(KERN_INFO "%s: Dumping tx ring\n", dev->name); for (i=0;itx_ring[i].PacketBuffer), - le32_to_cpu(np->tx_ring[i].FlagLen), - le32_to_cpu(np->tx_ring[i+1].PacketBuffer), - le32_to_cpu(np->tx_ring[i+1].FlagLen), - le32_to_cpu(np->tx_ring[i+2].PacketBuffer), - le32_to_cpu(np->tx_ring[i+2].FlagLen), - le32_to_cpu(np->tx_ring[i+3].PacketBuffer), - le32_to_cpu(np->tx_ring[i+3].FlagLen)); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + printk(KERN_INFO "%03x: %08x %08x // %08x %08x // %08x %08x // %08x %08x\n", + i, + le32_to_cpu(np->tx_ring.orig[i].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i].FlagLen), + le32_to_cpu(np->tx_ring.orig[i+1].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i+1].FlagLen), + le32_to_cpu(np->tx_ring.orig[i+2].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i+2].FlagLen), + le32_to_cpu(np->tx_ring.orig[i+3].PacketBuffer), + le32_to_cpu(np->tx_ring.orig[i+3].FlagLen)); + } else { + printk(KERN_INFO "%03x: %08x %08x %08x // %08x %08x %08x // %08x %08x %08x // %08x %08x %08x\n", + i, + le32_to_cpu(np->tx_ring.ex[i].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i].FlagLen), + le32_to_cpu(np->tx_ring.ex[i+1].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i+1].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i+1].FlagLen), + le32_to_cpu(np->tx_ring.ex[i+2].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i+2].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i+2].FlagLen), + le32_to_cpu(np->tx_ring.ex[i+3].PacketBufferHigh), + le32_to_cpu(np->tx_ring.ex[i+3].PacketBufferLow), + le32_to_cpu(np->tx_ring.ex[i+3].FlagLen)); + } } } @@ -1061,7 +1128,10 @@ printk(KERN_DEBUG "%s: tx_timeout: dead entries!\n", dev->name); nv_drain_tx(dev); np->next_tx = 
np->nic_tx = 0; - writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + else + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc_ex)), base + NvRegTxRingPhysAddr); netif_wake_queue(dev); } @@ -1136,8 +1206,13 @@ break; /* we scanned the whole ring - do not continue */ i = np->cur_rx % RX_RING; - Flags = le32_to_cpu(np->rx_ring[i].FlagLen); - len = nv_descr_getlength(&np->rx_ring[i], np->desc_ver); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + Flags = le32_to_cpu(np->rx_ring.orig[i].FlagLen); + len = nv_descr_getlength(&np->rx_ring.orig[i], np->desc_ver); + } else { + Flags = le32_to_cpu(np->rx_ring.ex[i].FlagLen); + len = nv_descr_getlength_ex(&np->rx_ring.ex[i], np->desc_ver); + } dprintk(KERN_DEBUG "%s: nv_rx_process: looking at packet %d, Flags 0x%x.\n", dev->name, np->cur_rx, Flags); @@ -1321,7 +1396,10 @@ /* reinit nic view of the rx queue */ writel(np->rx_buf_sz, base + NvRegOffloadConfig); writel((u32) np->ring_addr, base + NvRegRxRingPhysAddr); - writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + else + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc_ex)), base + NvRegTxRingPhysAddr); writel( ((RX_RING-1) << NVREG_RINGSZ_RXSHIFT) + ((TX_RING-1) << NVREG_RINGSZ_TXSHIFT), base + NvRegRingSizes); pci_push(base); @@ -1982,7 +2060,10 @@ /* 4) give hw rings */ writel((u32) np->ring_addr, base + NvRegRxRingPhysAddr); - writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + writel((u32) (np->ring_addr + 
RX_RING*sizeof(struct ring_desc)), base + NvRegTxRingPhysAddr); + else + writel((u32) (np->ring_addr + RX_RING*sizeof(struct ring_desc_ex)), base + NvRegTxRingPhysAddr); writel( ((RX_RING-1) << NVREG_RINGSZ_RXSHIFT) + ((TX_RING-1) << NVREG_RINGSZ_TXSHIFT), base + NvRegRingSizes); @@ -2179,18 +2260,37 @@ np->desc_ver = DESC_VER_2; np->pkt_limit = NV_PKTLIMIT_2; } + if (id->driver_data & DEV_HAS_HIGH_DMA) { + np->desc_ver = DESC_VER_3; + if (pci_set_dma_mask(pci_dev, 0x0000007fffffffffULL)) { + if (pci_set_dma_mask(pci_dev, 0xffffffffULL)) + goto out_relreg; + } + } err = -ENOMEM; np->base = ioremap(addr, NV_PCI_REGSZ); if (!np->base) goto out_relreg; dev->base_addr = (unsigned long)np->base; + dev->irq = pci_dev->irq; - np->rx_ring = pci_alloc_consistent(pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), - &np->ring_addr); - if (!np->rx_ring) - goto out_unmap; - np->tx_ring = &np->rx_ring[RX_RING]; + + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) { + np->rx_ring.orig = pci_alloc_consistent(pci_dev, + sizeof(struct ring_desc) * (RX_RING + TX_RING), + &np->ring_addr); + if (!np->rx_ring.orig) + goto out_unmap; + np->tx_ring.orig = &np->rx_ring.orig[RX_RING]; + } else { + np->rx_ring.ex = pci_alloc_consistent(pci_dev, + sizeof(struct ring_desc_ex) * (RX_RING + TX_RING), + &np->ring_addr); + if (!np->rx_ring.ex) + goto out_unmap; + np->tx_ring.ex = &np->rx_ring.ex[RX_RING]; + } dev->open = nv_open; dev->stop = nv_close; @@ -2313,8 +2413,12 @@ return 0; out_freering: - pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), - np->rx_ring, np->ring_addr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), + np->rx_ring.orig, np->ring_addr); + else + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc_ex) * (RX_RING + TX_RING), + np->rx_ring.ex, np->ring_addr); pci_set_drvdata(pci_dev, NULL); out_unmap: 
iounmap(get_hwbase(dev)); @@ -2343,7 +2447,10 @@ writel(np->orig_mac[1], base + NvRegMacAddrB); /* free all structures */ - pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), np->rx_ring, np->ring_addr); + if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc) * (RX_RING + TX_RING), np->rx_ring.orig, np->ring_addr); + else + pci_free_consistent(np->pci_dev, sizeof(struct ring_desc_ex) * (RX_RING + TX_RING), np->rx_ring.ex, np->ring_addr); iounmap(get_hwbase(dev)); pci_release_regions(pci_dev); pci_disable_device(pci_dev); @@ -2382,35 +2489,35 @@ }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_8), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* CK804 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_9), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_10), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP04 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_11), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP51 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_12), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA, }, { /* MCP51 Ethernet Controller */ 
PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_13), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_14), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, { /* MCP55 Ethernet Controller */ PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_15), - .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC, + .driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_LARGEDESC|DEV_HAS_HIGH_DMA, }, {0,}, }; --------------060508080608020503060100-- From diego.beltrami@HIIT.FI Mon Jul 25 05:43:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 25 Jul 2005 05:43:58 -0700 (PDT) Received: from pegasus.hiit.fi (pegasus.hiit.fi [212.68.1.186]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6PChjH9027564 for ; Mon, 25 Jul 2005 05:43:46 -0700 Received: from [128.214.113.174] (odysse.hiit.fi [128.214.113.174]) by pegasus.hiit.fi (Postfix) with ESMTP id 1F1F1220013; Mon, 25 Jul 2005 15:41:48 +0300 (EEST) Subject: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux From: Diego Beltrami Reply-To: diego.beltrami@HIIT.FI To: netdev@oss.sgi.com Cc: infrahip@HIIT.FI, gurtov@cs.helsinki.fi, jeffrey.m.ahrenholz@boeing.com, kristian.slavov@nomadiclab.com, hipl-users@freelists.org, hipsec@ietf.org Content-Type: text/plain Organization: HIIT Message-Id: <1122295307.14873.37.camel@odysse> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Mon, 25 Jul 2005 15:41:48 +0300 Content-Transfer-Encoding: 7bit X-archive-position: 2779 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: diego.beltrami@HIIT.FI Precedence: bulk X-list: netdev Content-Length: 57850 Lines: 
1976 Hi folks, we have been working for three months to implement a new IPsec mode, the "BEET" mode, for Linux. Below is a link to the BEET specification and the abstract: http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt Abstract This document specifies a new mode, called Bound End-to-End Tunnel (BEET) mode, for IPsec ESP. The new mode augments the existing ESP tunnel and transport modes. For end-to-end tunnels, the new mode provides limited tunnel mode semantics without the regular tunnel mode overhead. The mode is intended to support new uses of ESP, including mobility and multi-address multi-homing. The BEET mode is required by the Host Identity Protocol (HIP), which provides authenticated Diffie-Hellman for end-hosts, as well as mobility and multihoming support. The BEET mode is also useful for other similar protocols being developed at the IETF. Ericsson has already developed a BEET patch for *BSD. Our patch provides similar functionality, but uses the XFRM architecture. The patch is included at the end of this email and is also available at the following URL: http://hipl.hiit.fi/beet/beet-patch-v1.0-2.6.12.2 We have done some testing to assure the quality of the patch. 
All the tests passed, and below is a list of them: * Does not break transport and tunnel mode (with CONFIG_XFRM_BEET on/off) * All inner-outer combinations with varying test applications: ICMP, ICMPv6, FTP, SSH, nc, nc6 * Works with fragmented packets * Interoperability with HIPL * Real machines, virtual machines (VMware) * Tested with long data streams The BEET development team: * Abhinav Pathak (InfraHIP/HIIT) * Diego Beltrami (InfraHIP/HIIT) * Kristian Slavov (Ericsson) * Miika Komu (InfraHIP/HIIT) * Jeff Ahrenholz (Boeing) On behalf of the BEET development team, Signed-off-by: Diego Beltrami diff -urN linux-2.6.12.2/Documentation/README.BEET linux-beet-2.6.12.2/Documentation/README.BEET --- linux-2.6.12.2/Documentation/README.BEET 1970-01-01 02:00:00.000000000 +0200 +++ linux-beet-2.6.12.2/Documentation/README.BEET 2005-07-25 14:39:36.000000000 +0300 @@ -0,0 +1,465 @@ +Linux BEET-mode ESP patch + +Authors: Miika Komu + Kristian Slavov + Jeff Ahrenholz + Abhinav Pathak + Diego Beltrami + +Changelog: May 25, 2005 this document created + + +Description +----------- +This patch extends the native Linux 2.6 kernel IPsec to support +Bound-End-to-End-Tunnel (BEET) mode for ESP: + +Abstract + + This document specifies a new mode, called Bound End-to-End Tunnel + (BEET) mode, for IPsec ESP. The new mode augments the existing ESP + tunnel and transport modes. For end-to-end tunnels, the new mode + provides limited tunnel mode semantics without the regular tunnel + mode overhead. The mode is intended to support new uses of ESP, + including mobility and multi-address multi-homing. + +http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt + +BEET mode architecture +---------------------- + +Below are some control flow diagrams to illustrate how BEET works. 
+ +Sending (inner IPv4, outer IPv4)(4-4) +===================================== +inet_sendmsg + raw_sendmsg + ip_route_output_flow + __ip_route_output_key + xfrm_lookup + flow_cache_lookup + xfrm_policy_lookup // lookup IPsec policy + xfrm_find_bundle // lookup IPsec SA + __xfrm_selector_match + xfrm_tmpl_resolve // only if bundle was not found! + xfrm_state_find + xfrm_bundle_create // create output (dst) chain if bundle was not found + __xfrm4_bundle_create + ip_push_pending_frames + dst_output(skb) //this calls skb->dst->output(); + xfrm4_output //This finally returns 4 (NET_XMIT_BYPASS) to dst_output(); + xfrm4_encap + esp_output + xfrm_beet_output //change the ip header to outer. + dst_output(skb) + ip_output + ip_finish_output Or ip_fragment //depending on size of packet + // Returns 0 to dst_output(); which makes dst_output to come out of infinite loop. + dev_queue_xmit + + +Receiving (inner IPv4, outer IPv4)(4-4) +=========== + +net_rx_action() +e1000_clean() // dependent on network hardware +e1000_clean_rx_irq() +netif_receive_skb() + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ip_rcv() + nf_hook() + ip_rcv_finish() + ip_route_input() + dst_input()->ip_forward() or ip_input() + ip_input // remove the IPv4 header + ip_input_finish + ret = ipprot->handler(&skb, &nhoff); + xfrm4_rcv() + xfrm4_rcv_encap() + xfrm4_parse_spi() + xfrm_state_lookup() // lookup IPsec SA + xfrm_beet_input(skb, x) //To change to inner IP header. + nexthdr = x->type->input(x, xfrm.decap, skb) // == esp_input + esp_input() // process ESP based on inner address + returns 0 ; + /* beet handling in xfrm_rcv_spi */ + netif_rx() + // ip_input_finish returns 0 + // netif_receive_skb returns 0 +netif_receive_skb //Now we have an IPv4 packet. So the input flow is for v4 packet. + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ip_rcv() + nf_hook() //This calls ip_rcv_finish(skb) + ip_rcv_finish() //Here the skb->dst is NULL and so is filled for the input side. 
+ ip_route_input() + dst_input()->ip_forward() or ip_input() + ip_input // remove the IPv4 header + ip_input_finish + ... + ... + ... + + +Sending (inner IPv6, outer IPv4)(6-4) +===================================== +inet_sendmsg + rawv6_sendmsg + ip6_dst_lookup + ip6_route_output + xfrm_lookup + flow_cache_lookup + xfrm_policy_lookup // lookup IPsec policy + xfrm_find_bundle // lookup IPsec SA + __xfrm_selector_match + xfrm_tmpl_resolve // only if bundle was not found! + xfrm_state_find + xfrm_bundle_create // create output (dst) chain if bundle was not found + __xfrm6_bundle_create + rawv6_push_pending_frames + ip6_push_pending_frames + dst_output(skb) + xfrm6_output + xfrm6_encap + esp6_output //esp calculation is done on inner addresses !! + xfrm_beet_output //Change the ip header to outer IP Header + dst_output(skb) + ip_output + ip_finish_output Or ip_fragment //depending on size of packet + // Returns 0 to dst_output(); which makes dst_output to come out of infinite loop. + dev_queue_xmit + + +Receiving (inner IPv6, outer IPv4)(6-4) +=========== + +net_rx_action() +e1000_clean() // dependent on network hardware +e1000_clean_rx_irq() +netif_receive_skb() + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ip_rcv() // skb len = 140 + nf_hook() + ip_rcv_finish() + ip_route_input() + dst_input()->ip_forward() or ip_input() + ip_input // remove the IPv4 header + ip_input_finish // calls recursively the ->handler = xfrm4_rcv + ret = ipprot->handler(&skb, &nhoff); // handler = xfrm4_rcv + xfrm4_rcv() + xfrm4_rcv_encap() + xfrm4_parse_spi() + xfrm_state_lookup() // lookup IPsec SA + xfrm_beet_input(skb, x) //To change to inner IP header. 
+ nexthdr = x->type->input(x, xfrm.decap, skb) // == esp6_input + esp6_input() // process ESP based on inner address + returns 0 ; + /* beet handling in xfrm_rcv_spi */ + netif_rx() + // ip6_input_finish returns 0 + // netif_receive_skb returns 0 +netif_receive_skb + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ip6_rcv() // skb len = 104 + nf_hook_slow() + ip6_rcv_finish() + ip6_route_input() + dst_input()->ip6_forward() or ip6_input() + ip6_input // remove the IPv6 header + ip6_input_finish + xfrm6_policy_check() + .. + __xfrm_policy_check + ret = ipprot->handler(&skb, &nhoff); // handler = xfrm6_rcv_spi +tcp_v6_rcv() // or icmpv6_rcv(), anyway, deliver to upper layer + + +Sending (inner IPv4, outer IPv6)(4-6) +===================================== + +inet_sendmsg + raw_sendmsg + ip_route_output_flow + __ip_route_output_key + xfrm_lookup + flow_cache_lookup + xfrm_policy_lookup // lookup IPsec policy + xfrm_find_bundle // lookup IPsec SA + __xfrm_selector_match + xfrm_tmpl_resolve // only if bundle was not found! + xfrm_state_find + xfrm_bundle_create // create output (dst) chain if bundle was not found + __xfrm4_bundle_create + ip_push_pending_frames + dst_output(skb) //this calls skb->dst->output(); + xfrm4_output //This finally returns 4 (NET_XMIT_BYPASS) to dst_output(); + xfrm4_encap + esp_output + xfrm_beet_output + dst_output(skb) + ip6_output + ip6_output2 + ip6_output_finish // Returns 0 to dst_output(); which makes dst_output to come out of infinite loop. 
+ dev_queue_xmit + + +Receiving (inner IPv4, outer IPv6)(4-6) +=========== + +net_rx_action() +e1000_clean() // dependent on network hardware +e1000_clean_rx_irq() +netif_receive_skb() + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ipv6_rcv() // skb len = 140 + nf_hook_slow() + ip6_rcv_finish() + ip6_route_input() + dst_input()->ip6_forward() or ip6_input() + ip6_input // remove the IPv6 header + ip6_input_finish // calls recursively the ->handler = xfrm6_rcv + ret = ipprot->handler(&skb, &nhoff); // handler = xfrm6_rcv_spi + xfrm6_rcv() + xfrm6_rcv_spi() + xfrm_parse_spi() + xfrm_state_lookup() // lookup IPsec SA + xfrm_beet_input(skb, x) //To change to inner IP header. + nexthdr = x->type->input(x, xfrm.decap, skb) // == esp_input + esp_input() // process ESP + returns iph->protocol ; + /* beet handling in xfrm_rcv_spi */ + netif_rx() + // ip6_input_finish returns 0 + // netif_receive_skb returns 0 +netif_receive_skb //Now we have an IPv4 packet. So the input flow is for v4 packet. + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ip_rcv() + nf_hook() //This calls ip_rcv_finish(skb) + ip_rcv_finish() //Here the skb->dst is NULL and so is filled for the input side. + ip_route_input() + dst_input()->ip_forward() or ip_input() + ip_input // remove the IPv4 header + ip_input_finish + ... + ... + ... + +Sending (inner IPv6, outer IPv6)(6-6) +============= + +(When sending the first packet!) + +inet_sendmsg + rawv6_sendmsg + ip6_dst_lookup + ip6_route_output + xfrm_lookup + flow_cache_lookup + xfrm_policy_lookup // lookup IPsec policy + xfrm_find_bundle // lookup IPsec SA + __xfrm_selector_match + xfrm_tmpl_resolve // only if bundle was not found! 
+ xfrm_state_find + xfrm_bundle_create // create output (dst) chain if bundle was not found + __xfrm6_bundle_create + rawv6_push_pending_frames + ip6_push_pending_frames + dst_output(skb) + xfrm6_output + xfrm6_encap + esp6_output + xfrm_beet_output + dst_output(skb) + ip6_output + ip6_output2 + ip6_output_finish + dev_queue_xmit + +when are these called? + ip6_xmt() + dst_output() + + +Receiving (inner IPv6, outer IPv6)(6-6) +=========== + +net_rx_action() +e1000_clean() // dependent on network hardware +e1000_clean_rx_irq() +netif_receive_skb() + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ipv6_rcv() // skb len = 140 + nf_hook_slow() + ip6_rcv_finish() + ip6_route_input() + dst_input()->ip6_forward() or ip6_input() + ip6_input // remove the IPv6 header + ip6_input_finish // calls recursively the ->handler = xfrm6_rcv + ret = ipprot->handler(&skb, &nhoff); // handler = xfrm6_rcv_spi + xfrm6_rcv() + xfrm6_rcv_spi() + xfrm_parse_spi() + xfrm_state_lookup() // lookup IPsec SA + xfrm_beet_input(skb, x) //To change to inner IP header. + nexthdr = x->type->input(x, xfrm.decap, skb) // == esp6_input + esp6_input() // process ESP + returns 58 (ICMPv6) //returns the nexthdr in the ipv6 packet. + /* beet handling in xfrm_rcv_spi */ + netif_rx() + // ip6_input_finish returns 0 + // netif_receive_skb returns 0 +netif_receive_skb + deliver_skb() + ret = pt_prev->func(skb, skb->dev, pt_prev); + ipv6_rcv() // skb len = 104 + nf_hook_slow() + ip6_rcv_finish() + ip6_route_input() + dst_input()->ip6_forward() or ip6_input() + ip6_input // remove the IPv6 header + ip6_input_finish + xfrm6_policy_check() + .. 
+ __xfrm_policy_check + ret = ipprot->handler(&skb, &nhoff); // handler = xfrm6_rcv_spi +tcp_v6_rcv() // or icmpv6_rcv(), anyway, deliver to upper layer + + +output path +ip6_datagram_connect() + ip6_dst_lookup() // success + xfrm_lookup() // lookup policy using inner IP, matching selectors in SP and + flow information + xfrm_sk_policy_lookup() // success + flow_cache_lookup() // success + xfrm_find_bundle() // check for a bundle, if found use it, or create new + xfrm_tmpl_resolve() // when creating new, search for SA for each transform + // once valid SA found, use it to create bundle and link + // to SP. modify skbuff's dst-pointer pointing to next + // xfrmX_output(), after encaps/trans dst is consulted + // to route the packet + xfrm_state_find() // + xfrm_selector_match() // + km_query() // + + + + app app + | | + inner inner + \ / + - / + \ / + \--outer outer--/ + \ / + \======/ + + +Files Added +-------------- +This is a list of the included files for the BEET patch + +net/xfrm/xfrm_beet.c +- This file contains the functions xfrm_beet_input() and xfrm_beet_output() + which deal with the incoming and the outgoing BEET packets, respectively. + The purpose of these functions is to interchange the inner addresses with + the outer addresses in the IP header (in case of outgoing packets) and vice versa + (in case of incoming packets). + The file defines two functions: + 1. xfrm_beet_input + Used on the receiving side, changes the ip header to inner ip header + 2. xfrm_beet_output + Used on the sending side, changes the ip header to outer ip header + +Files changed +------------- +This is a list of changes made by the BEET patch. + +include/linux/ipsec.h + - IPSEC_MODE_BEET added + This is the new type of SA that may be created. + XXX note: are we overusing XFRM_MODE_BEET where IPSEC_MODE_BEET should be + used instead? + +include/linux/xfrm.h + - enum XFRM_MODE_{TRANSPORT|TUNNEL|BEET} added + Mode needed to distinguish from tunnel mode in xfrm code. 
+
+include/net/xfrm.h
+ - u16 beet_family_in and beet_family_out added to struct xfrm_state (props)
+ beet_family_out is the family of the outer addresses.
+ beet_family_in is the family of the inner addresses.
+ - unsigned short family added to struct xfrm_tmpl
+ family is required because the family may differ from the one in the selector
+ - possible change to xfrm_selector_match() (commented out)
+
+net/ipv4/xfrm4_input.c
+ - in xfrm4_rcv_encap(), a call is made to xfrm_beet_input() to change the
+ IP header to the inner one before ESP processing.
+ - in xfrm4_rcv_encap(), check x->props.mode for XFRM_MODE_TUNNEL, _BEET;
+ checks address family (x->props.beet_family_in), and makes final adjustments
+ to the packet before requeueing it.
+
+net/ipv4/xfrm4_output.c
+ - xfrm4_encap(), note to fix the BEET case, like xfrm6_encap
+ - xfrm4_output(), added a call to xfrm_beet_output() to change the IP header
+
+net/ipv4/esp4.c
+ - in esp_init_state(), check if x->props.mode == XFRM_MODE_TUNNEL,
+ then x->props.header_len += sizeof(struct iphdr), not if (x->props.mode)
+ - in esp_input(), while returning, if the outer family is AF_INET6, then return
+ iph->protocol, else return 0.
+
+net/ipv6/esp6.c
+ - in esp6_init_state(), check if x->props.mode == XFRM_MODE_TUNNEL,
+ then x->props.header_len += sizeof(struct ipv6hdr), not if (x->props.mode)
+ - in esp6_input(), while returning, if the outer family is AF_INET, then
+ set next header field and return 0, else return ret.
+
+
+net/ipv6/xfrm6_input.c
+ - in xfrm6_rcv_spi(), a call is made to xfrm_beet_input(), which changes to
+ the inner IP header before sending to ESP decapsulation.
+ - in xfrm6_rcv_spi(), handle x->props.mode == XFRM_MODE_BEET;
+ checks address family (x->props.beet_family_in), makes final adjustments to
+ the packet before requeueing it.
+
+net/ipv6/xfrm6_output.c
+ - xfrm6_encap() add ipv4 header vars, check if (x->props.mode==XFRM_MODE_BEET);
+ makes space for the appropriate ESP header and sends to espX_output, where X depends
+ on the inner family of BEET.
+ - xfrm6_output() change if(x->props.mode) to (x->props.mode==XFRM_MODE_TUNNEL)
+ Also a call is made to xfrm_beet_output after ESP calculations, to change the
+ IP header to the outer IP header.
+
+net/ipv6/xfrm6_policy.c
+ (on output...)
+ - in __xfrm6_bundle_create() added remotebeet, localbeet vars,
+ get the addresses from xfrm[i]->id.daddr (remote) and
+ xfrm[i]->props.saddr (local)
+ copy IPv4 or IPv6 addresses from remote/localbeet to fl_tunnel.fl4/6_dst/src
+ then do xfrm_dst_lookup() passing in xfrm[i]->props.beet_family_out
+
+net/key/af_key.c
+ - commented-out code in pfkey_msg2xfrm_state():
+ check x->props.beet_family for x->props.family?
+
+ - parse_ipsecrequest() check if (t->mode==IPSEC_MODE_TUNNEL-1)
+ handle if (t->mode==IPSEC_MODE_BEET-1)
+ populate t->saddr.a4 or t->saddr.a6, t->family, etc
+ This supports adding a new type of BEET mode SA.
+
+net/xfrm/Kconfig
+ - added XFRM_BEET config variable option and text
+ This allows you to compile BEET mode into your kernel.
+ +net/xfrm/xfrm_policy.c + - note from Miika - fns added just for testing, removed for BEET + ipv6_addr_is_hit(), hip_xfrm_handler_notify(), hip_xfrm_handler_acquire(), + hip_xfrm_handler_policy_notify(), hip_register_xfrm_km_handler(), etc diff -urN linux-2.6.12.2/include/linux/ipsec.h linux-beet-2.6.12.2/include/linux/ipsec.h --- linux-2.6.12.2/include/linux/ipsec.h 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/include/linux/ipsec.h 2005-07-25 14:39:01.000000000 +0300 @@ -13,6 +13,9 @@ IPSEC_MODE_ANY = 0, /* We do not support this for SA */ IPSEC_MODE_TRANSPORT = 1, IPSEC_MODE_TUNNEL = 2 +#ifdef CONFIG_XFRM_BEET + ,IPSEC_MODE_BEET = 3 +#endif }; enum { diff -urN linux-2.6.12.2/include/linux/xfrm.h linux-beet-2.6.12.2/include/linux/xfrm.h --- linux-2.6.12.2/include/linux/xfrm.h 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/include/linux/xfrm.h 2005-07-25 14:39:01.000000000 +0300 @@ -102,6 +102,15 @@ XFRM_SHARE_UNIQUE /* Use once */ }; +enum +{ + XFRM_MODE_TRANSPORT = 0, + XFRM_MODE_TUNNEL +#ifdef CONFIG_XFRM_BEET + ,XFRM_MODE_BEET +#endif +}; + /* Netlink configuration messages. */ enum { XFRM_MSG_BASE = 0x10, diff -urN linux-2.6.12.2/include/net/xfrm.h linux-beet-2.6.12.2/include/net/xfrm.h --- linux-2.6.12.2/include/net/xfrm.h 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/include/net/xfrm.h 2005-07-25 15:03:01.000000000 +0300 @@ -113,6 +113,14 @@ xfrm_address_t saddr; int header_len; int trailer_len; +#ifdef CONFIG_XFRM_BEET + /* beet_family_out = family of outer addresses + * beet_family_in = family of inner addresses + */ + u16 beet_family_in; + u16 beet_family_out; + +#endif } props; struct xfrm_lifetime_cfg lft; @@ -241,6 +249,12 @@ /* Source address of tunnel. Ignored, if it is not a tunnel. */ xfrm_address_t saddr; +/* family of the addresses. 
In BEET-mode the family may differ from + the one in selector */ +#ifdef CONFIG_XFRM_BEET + unsigned short family; +#endif + __u32 reqid; /* Mode: transport/tunnel */ @@ -835,6 +849,12 @@ extern void xfrm6_tunnel_free_spi(xfrm_address_t *saddr); extern u32 xfrm6_tunnel_spi_lookup(xfrm_address_t *saddr); extern int xfrm6_output(struct sk_buff *skb); +#ifdef CONFIG_XFRM_BEET +extern struct xfrm_state * xfrm_lookup_bydst(u8 mode, xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family); +extern int xfrm_beet_output(struct sk_buff *skb); +extern int xfrm_beet_input(struct sk_buff *skb, struct xfrm_state *x); + +#endif #ifdef CONFIG_XFRM extern int xfrm4_rcv_encap(struct sk_buff *skb, __u16 encap_type); diff -urN linux-2.6.12.2/net/ipv4/esp4.c linux-beet-2.6.12.2/net/ipv4/esp4.c --- linux-2.6.12.2/net/ipv4/esp4.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv4/esp4.c 2005-07-25 14:39:11.000000000 +0300 @@ -1,3 +1,13 @@ +/* + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * + */ + #include #include #include @@ -23,7 +33,7 @@ struct iphdr *top_iph; struct ip_esp_hdr *esph; struct crypto_tfm *tfm; - struct esp_data *esp; + struct esp_data *esp = x->data; struct sk_buff *trailer; int blksize; int clen; @@ -31,7 +41,15 @@ int nfrags; /* Strip IP+ESP header. 
*/ - __skb_pull(skb, skb->h.raw - skb->data); +#ifdef CONFIG_XFRM_BEET + int hdr_len = skb->h.raw - skb->data + sizeof(*esph) + esp->conf.ivlen; + if (x->props.mode == XFRM_MODE_BEET) + __skb_pull(skb, hdr_len); + else + __skb_pull(skb, skb->h.raw - skb->data); +#else + __skb_pull(skb, skb->h.raw - skb->data); +#endif /* Now skb is pure payload to encrypt */ err = -ENOMEM; @@ -39,7 +57,6 @@ /* Round to block size */ clen = skb->len; - esp = x->data; alen = esp->auth.icv_trunc_len; tfm = esp->conf.tfm; blksize = (crypto_tfm_alg_blocksize(tfm) + 3) & ~3; @@ -59,7 +76,14 @@ *(u8*)(trailer->tail + clen-skb->len - 2) = (clen - skb->len)-2; pskb_put(skb, trailer, clen - skb->len); +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) + __skb_push(skb, hdr_len); + else + __skb_push(skb, skb->data - skb->nh.raw); +#else __skb_push(skb, skb->data - skb->nh.raw); +#endif top_iph = skb->nh.iph; esph = (struct ip_esp_hdr *)(skb->nh.raw + top_iph->ihl*4); top_iph->tot_len = htons(skb->len + alen); @@ -238,7 +262,14 @@ skb->nh.iph->tot_len = htons(skb->len); } +#ifdef CONFIG_XFRM_BEET + if(x->props.mode == XFRM_MODE_BEET && x->props.beet_family_out == AF_INET6) + return iph->protocol ; + else + return 0; +#else return 0; +#endif out: return -EINVAL; @@ -428,7 +459,11 @@ if (crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len)) goto error; x->props.header_len = sizeof(struct ip_esp_hdr) + esp->conf.ivlen; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) +#else if (x->props.mode) +#endif x->props.header_len += sizeof(struct iphdr); if (x->encap) { struct xfrm_encap_tmpl *encap = x->encap; diff -urN linux-2.6.12.2/net/ipv4/xfrm4_input.c linux-beet-2.6.12.2/net/ipv4/xfrm4_input.c --- linux-2.6.12.2/net/ipv4/xfrm4_input.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv4/xfrm4_input.c 2005-07-25 14:39:11.000000000 +0300 @@ -7,6 +7,13 @@ * Derek Atkins * Add Encapsulation support * + * Changes: BEET support + * Abhinav 
Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -78,6 +85,14 @@ goto drop_unlock; xfrm_vec[xfrm_nr].decap.decap_type = encap_type; + +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + /* Change the outer header with the inner data */ + if (xfrm_beet_input(skb, x)) + goto drop_unlock; + } +#endif if (x->type->input(x, &(xfrm_vec[xfrm_nr].decap), skb)) goto drop_unlock; @@ -96,7 +111,11 @@ iph = skb->nh.iph; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { +#endif if (iph->protocol != IPPROTO_IPIP) goto drop; if (!pskb_may_pull(skb, sizeof(struct iphdr))) @@ -115,9 +134,69 @@ decaps = 1; break; } +#ifdef CONFIG_XFRM_BEET + else if (x->props.mode == XFRM_MODE_BEET) { + struct iphdr *iph = skb->nh.iph; + struct ipv6hdr *ip6h = skb->nh.ipv6h; + int size = 0; + + if (skb_cloned(skb) && + pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + goto drop; + + if (x->props.beet_family_in == AF_INET) + size = sizeof(struct iphdr); + else if (x->props.beet_family_in == AF_INET6) + size = sizeof(struct ipv6hdr); + else + BUG_ON(1); + + skb_push(skb, size); + + memmove(skb->data, skb->nh.raw, size); + skb->mac.raw = memmove(skb->data - skb->mac_len, + skb->mac.raw, skb->mac_len); + skb->nh.raw = skb->data; - if ((err = xfrm_parse_spi(skb, skb->nh.iph->protocol, &spi, &seq)) < 0) + switch(x->props.beet_family_in) { + case AF_INET: + + iph->tot_len = htons(skb->len); + iph->check = 0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + skb->protocol = htons(ETH_P_IP); + dst_release(skb->dst); + skb->dst = NULL; + decaps = 1; + + break; + case AF_INET6: + ip6h = skb->nh.ipv6h; + + skb->nh.ipv6h->payload_len = htons(ntohs(skb->nh.ipv6h->payload_len) + size); + skb->protocol = htons(ETH_P_IPV6); + + dst_release(skb->dst); + skb->dst = NULL; + decaps = 1; + break; + default: + BUG_ON(1); + } + break; + } + + if (x->props.mode == XFRM_MODE_BEET && x->props.beet_family_in == 
AF_INET6) { + if ((err = xfrm_parse_spi(skb, skb->nh.ipv6h->nexthdr, &spi, &seq)) < 0) + goto drop; + } else { + if ((err = xfrm_parse_spi(skb, skb->nh.iph->protocol, &spi, &seq)) < 0) + goto drop; + } +#else + if ((err = xfrm_parse_spi(skb, skb->nh.iph->protocol, &spi, &seq)) < 0) goto drop; +#endif } while (!err); /* Allocate new secpath or COW existing one. */ diff -urN linux-2.6.12.2/net/ipv4/xfrm4_output.c linux-beet-2.6.12.2/net/ipv4/xfrm4_output.c --- linux-2.6.12.2/net/ipv4/xfrm4_output.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv4/xfrm4_output.c 2005-07-25 14:39:11.000000000 +0300 @@ -6,6 +6,14 @@ * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later version. + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -26,7 +34,8 @@ * check * * On exit, skb->h will be set to the start of the payload to be processed - * by x->type->output and skb->nh will be set to the top IP header. + * by x->type->output and skb->nh, as well as skb->data, will point to + * the top IP header. 
*/ static void xfrm4_encap(struct sk_buff *skb) { @@ -35,15 +44,36 @@ struct iphdr *iph, *top_iph; iph = skb->nh.iph; - skb->h.ipiph = iph; +#ifdef CONFIG_XFRM_BEET + /* + * This is because otherwise the BEET patch crashes in any case with Inner=4 + */ + if (x->props.mode != XFRM_MODE_BEET) + skb->h.ipiph = iph; +#else + skb->h.ipiph = iph; +#endif skb->nh.raw = skb_push(skb, x->props.header_len); top_iph = skb->nh.iph; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TRANSPORT) { +#else if (!x->props.mode) { +#endif + skb->h.raw += iph->ihl*4; memmove(top_iph, iph, iph->ihl*4); return; +#ifdef CONFIG_XFRM_BEET + } else if (x->props.mode == XFRM_MODE_BEET) { + + skb->h.raw = skb->data + sizeof(struct iphdr); + memmove(top_iph, iph, iph->ihl*4); + return; + +#endif /* CONFIG_XFRM_BEET */ } top_iph->ihl = 5; @@ -103,7 +133,11 @@ goto error_nolock; } +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { +#endif err = xfrm4_tunnel_check_size(skb); if (err) goto error_nolock; @@ -120,6 +154,15 @@ if (err) goto error; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + /* Change the outer header */ + err = xfrm_beet_output(skb); + if (err) + goto error; + } +#endif + x->curlft.bytes += skb->len; x->curlft.packets++; diff -urN linux-2.6.12.2/net/ipv4/xfrm4_policy.c linux-beet-2.6.12.2/net/ipv4/xfrm4_policy.c --- linux-2.6.12.2/net/ipv4/xfrm4_policy.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv4/xfrm4_policy.c 2005-07-25 15:03:01.000000000 +0300 @@ -6,6 +6,14 @@ * YOSHIFUJI Hideaki @USAGI * Split up af-specific portion * + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -66,6 +74,12 @@ } } }; +#ifdef CONFIG_XFRM_BEET + union { + struct in6_addr *in6; + struct in_addr *in; + } remotebeet, localbeet; +#endif int i; int err; int header_len = 0; @@ -78,6 +92,9 @@ struct dst_entry *dst1 = 
dst_alloc(&xfrm4_dst_ops); struct xfrm_dst *xdst; int tunnel = 0; +#ifdef CONFIG_XFRM_BEET + unsigned short beet_family = 0; +#endif if (unlikely(dst1 == NULL)) { err = -ENOBUFS; @@ -98,11 +115,28 @@ dst1->next = dst_prev; dst_prev = dst1; +#ifdef CONFIG_XFRM_BEET + if (xfrm[i]->props.mode == XFRM_MODE_TUNNEL) { +#else if (xfrm[i]->props.mode) { +#endif remote = xfrm[i]->id.daddr.a4; local = xfrm[i]->props.saddr.a4; tunnel = 1; } +#ifdef CONFIG_XFRM_BEET + else if (xfrm[i]->props.mode == XFRM_MODE_BEET) { + + beet_family = xfrm[i]->props.beet_family_out; + if(beet_family == AF_INET6){ + remotebeet.in6 = (struct in6_addr*)&xfrm[i]->id.daddr; + localbeet.in6 = (struct in6_addr*)&xfrm[i]->props.saddr; + } else if(beet_family == AF_INET){ + remotebeet.in = (struct in_addr*)&xfrm[i]->id.daddr; + localbeet.in = (struct in_addr*)&xfrm[i]->props.saddr; + } + } +#endif header_len += xfrm[i]->props.header_len; trailer_len += xfrm[i]->props.trailer_len; @@ -113,6 +147,28 @@ &fl_tunnel, AF_INET); if (err) goto error; +#ifdef CONFIG_XFRM_BEET + } else if (beet_family) { + switch(beet_family) { + case AF_INET: + fl_tunnel.fl4_dst = remotebeet.in->s_addr; + fl_tunnel.fl4_src = localbeet.in->s_addr; + break; + case AF_INET6: + ipv6_addr_copy(&fl_tunnel.fl6_dst, remotebeet.in6); + ipv6_addr_copy(&fl_tunnel.fl6_src, localbeet.in6); + break; + default: + BUG_ON(1); + } + + err = xfrm_dst_lookup((struct xfrm_dst **) &rt, + &fl_tunnel, beet_family); + /* Without this, the BEET mode crashes + indeterministically -Abi */ + rt->peer = NULL; + rt_bind_peer(rt,1); +#endif } else dst_hold(&rt->u.dst); } diff -urN linux-2.6.12.2/net/ipv6/esp6.c linux-beet-2.6.12.2/net/ipv6/esp6.c --- linux-2.6.12.2/net/ipv6/esp6.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv6/esp6.c 2005-07-25 14:39:11.000000000 +0300 @@ -22,6 +22,16 @@ * Kunihiro Ishiguro * * This file is derived from net/ipv4/esp.c + * + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian 
Slavov + * Miika Komu + * Jeff Ahrenholz + * + * */ #include @@ -225,6 +235,13 @@ memcpy(skb->nh.raw, tmp_hdr, hdr_len); skb->nh.ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); ret = nexthdr[1]; +#ifdef CONFIG_XFRM_BEET + if(x->props.mode == XFRM_MODE_BEET && + x->props.beet_family_out == AF_INET) { + skb->nh.ipv6h->nexthdr = nexthdr[1]; + ret = 0;//This is because xfrm4_encap expects 0 if every thing is correct + } +#endif } out: @@ -365,7 +382,11 @@ if (crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len)) goto error; x->props.header_len = sizeof(struct ipv6_esp_hdr) + esp->conf.ivlen; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) +#else if (x->props.mode) +#endif x->props.header_len += sizeof(struct ipv6hdr); x->data = esp; return 0; diff -urN linux-2.6.12.2/net/ipv6/xfrm6_input.c linux-beet-2.6.12.2/net/ipv6/xfrm6_input.c --- linux-2.6.12.2/net/ipv6/xfrm6_input.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv6/xfrm6_input.c 2005-07-25 14:39:11.000000000 +0300 @@ -64,6 +64,12 @@ if (xfrm_state_check_expire(x)) goto drop_unlock; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + if (xfrm_beet_input(skb, x)) + goto drop_unlock; + } +#endif nexthdr = x->type->input(x, &(xfrm_vec[xfrm_nr].decap), skb); if (nexthdr <= 0) goto drop_unlock; @@ -80,7 +86,11 @@ xfrm_vec[xfrm_nr++].xvec = x; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { /* XXX */ +#endif if (nexthdr != IPPROTO_IPV6) goto drop; if (!pskb_may_pull(skb, sizeof(struct ipv6hdr))) @@ -97,6 +107,64 @@ skb->nh.raw = skb->data; decaps = 1; break; +#ifdef CONFIG_XFRM_BEET + } else if (x->props.mode == XFRM_MODE_BEET) { + struct iphdr *iph = skb->nh.iph; // miika: this masks input arg + struct ipv6hdr *ip6h = skb->nh.ipv6h; + int size=0; + __u8 proto=0; + __u8 hops=0; + __u16 total = ntohs(ip6h->payload_len); + + /* is the buffer a clone? 
+ * then create identical copy of header of skb */ + if (skb_cloned(skb) && + pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + goto drop; + if (x->props.beet_family_in == AF_INET) { + size = sizeof(struct iphdr); + proto = ip6h->nexthdr; + hops = ip6h->hop_limit; + } else if (x->props.beet_family_in == AF_INET6) + size = sizeof(struct ipv6hdr); + else + BUG_ON(1); + /* add data to the start of the buffer */ + skb_push(skb, size); + /* move the raw header into new space */ + memmove(skb->data, skb->nh.raw, size); + /* move MAC header */ + skb->mac.raw = memmove(skb->data - skb->mac_len, + skb->mac.raw, skb->mac_len); + skb->nh.raw = skb->data; + + switch(x->props.beet_family_in) { + case AF_INET: + + iph = (struct iphdr *)skb->nh.raw; + + skb->protocol = htons(ETH_P_IP); + iph->tot_len = htons(skb->len); + iph->frag_off = htons(IP_DF); + iph->check=0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + + dst_release(skb->dst); + skb->dst = NULL; + + decaps = 1; + break; + + case AF_INET6: + ip6h->payload_len = htons(total + size); + --ip6h->hop_limit; + decaps = 1; + break; + default: + BUG_ON(1); + } + break; +#endif } if ((err = xfrm_parse_spi(skb, nexthdr, &spi, &seq)) < 0) diff -urN linux-2.6.12.2/net/ipv6/xfrm6_output.c linux-beet-2.6.12.2/net/ipv6/xfrm6_output.c --- linux-2.6.12.2/net/ipv6/xfrm6_output.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv6/xfrm6_output.c 2005-07-25 14:39:11.000000000 +0300 @@ -7,6 +7,14 @@ * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later version. + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -17,6 +25,10 @@ #include #include +#ifdef CONFIG_XFRM_BEET +#include +#endif + /* Add encapsulation header. 
* * In transport mode, the IP header and mutable extension headers will be moved @@ -42,7 +54,12 @@ skb_push(skb, x->props.header_len); iph = skb->nh.ipv6h; + +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TRANSPORT) { +#else if (!x->props.mode) { +#endif u8 *prevhdr; int hdr_len; @@ -51,6 +68,16 @@ skb->h.raw = skb->data + hdr_len; memmove(skb->data, iph, hdr_len); return; + +#ifdef CONFIG_XFRM_BEET + } else if (x->props.mode == XFRM_MODE_BEET) { + + memmove(skb->data, skb->nh.raw, sizeof(struct ipv6hdr)); + skb->nh.raw = &((struct ipv6hdr *)skb->data)->nexthdr; + skb->h.ipv6h = ((struct ipv6hdr *)skb->data) + 1; + return; + +#endif /* CONFIG_XFRM_BEET */ } skb->nh.raw = skb->data; @@ -104,7 +131,11 @@ goto error_nolock; } +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { +#endif err = xfrm6_tunnel_check_size(skb); if (err) goto error_nolock; @@ -121,6 +152,15 @@ if (err) goto error; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + /* Change the outer header */ + err = xfrm_beet_output(skb); + if (err) + goto error; + } +#endif + x->curlft.bytes += skb->len; x->curlft.packets++; diff -urN linux-2.6.12.2/net/ipv6/xfrm6_policy.c linux-beet-2.6.12.2/net/ipv6/xfrm6_policy.c --- linux-2.6.12.2/net/ipv6/xfrm6_policy.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/ipv6/xfrm6_policy.c 2005-07-25 15:03:01.000000000 +0300 @@ -8,7 +8,14 @@ * IPv6 support * YOSHIFUJI Hideaki * Split up af-specific portion - * + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -84,6 +91,12 @@ } } }; +#ifdef CONFIG_XFRM_BEET + union { + struct in6_addr *in6; + struct in_addr *in; + } remotebeet, localbeet; +#endif int i; int err = 0; int header_len = 0; @@ -96,6 +109,9 @@ struct dst_entry *dst1 = dst_alloc(&xfrm6_dst_ops); struct xfrm_dst *xdst; int tunnel = 0; +#ifdef CONFIG_XFRM_BEET + unsigned short beet_family 
= 0; +#endif if (unlikely(dst1 == NULL)) { err = -ENOBUFS; @@ -118,11 +134,22 @@ dst1->next = dst_prev; dst_prev = dst1; +#ifdef CONFIG_XFRM_BEET + if (xfrm[i]->props.mode == XFRM_MODE_TUNNEL) { +#else if (xfrm[i]->props.mode) { +#endif remote = (struct in6_addr*)&xfrm[i]->id.daddr; local = (struct in6_addr*)&xfrm[i]->props.saddr; tunnel = 1; } +#ifdef CONFIG_XFRM_BEET + else if (xfrm[i]->props.mode == XFRM_MODE_BEET) { + beet_family = xfrm[i]->props.beet_family_out; + remotebeet.in6 = (struct in6_addr*)&xfrm[i]->id.daddr; + localbeet.in6 = (struct in6_addr*)&xfrm[i]->props.saddr; + } +#endif header_len += xfrm[i]->props.header_len; trailer_len += xfrm[i]->props.trailer_len; @@ -133,6 +160,23 @@ &fl_tunnel, AF_INET6); if (err) goto error; +#ifdef CONFIG_XFRM_BEET + } else if (beet_family) { + switch(beet_family) { + case AF_INET: + fl_tunnel.fl4_dst = remotebeet.in->s_addr; + fl_tunnel.fl4_src = localbeet.in->s_addr; + break; + case AF_INET6: + ipv6_addr_copy(&fl_tunnel.fl6_dst, remotebeet.in6); + ipv6_addr_copy(&fl_tunnel.fl6_src, localbeet.in6); + break; + default: + BUG_ON(1); + } + err = xfrm_dst_lookup((struct xfrm_dst **) &rt, + &fl_tunnel, beet_family); +#endif } else dst_hold(&rt->u.dst); } diff -urN linux-2.6.12.2/net/key/af_key.c linux-beet-2.6.12.2/net/key/af_key.c --- linux-2.6.12.2/net/key/af_key.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/key/af_key.c 2005-07-25 14:39:12.000000000 +0300 @@ -12,6 +12,14 @@ * Kunihiro Ishiguro * Kazunori MIYAZAWA / USAGI Project * Derek Atkins + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -28,6 +36,10 @@ #include #include +#ifdef CONFIG_XFRM_BEET +#include +#endif + #include #define _X2KEY(x) ((x) == XFRM_INF ? 
0 : (x)) @@ -1584,7 +1596,11 @@ } /* addresses present only in tunnel mode */ +#ifdef CONFIG_XFRM_BEET + if (t->mode == IPSEC_MODE_TUNNEL-1) { +#else if (t->mode) { +#endif switch (xp->family) { case AF_INET: sin = (void*)(rq+1); @@ -1612,6 +1628,40 @@ return -EINVAL; } } +#ifdef CONFIG_XFRM_BEET + else if (t->mode == IPSEC_MODE_BEET-1) { + struct sockaddr *sa; + + sa = (struct sockaddr *)(rq+1); + switch(sa->sa_family) { + case AF_INET: + sin = (struct sockaddr_in *)sa; + t->saddr.a4 = sin->sin_addr.s_addr; + sin++; + if (sin->sin_family != AF_INET) + return -EINVAL; + t->id.daddr.a4 = sin->sin_addr.s_addr; + t->family = AF_INET; + + break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: + sin6 = (struct sockaddr_in6 *)sa; + memcpy(t->saddr.a6, &sin6->sin6_addr, sizeof(struct in6_addr)); + sin6++; + if (sin6->sin6_family != AF_INET6) + return -EINVAL; + memcpy(t->id.daddr.a6, &sin6->sin6_addr, sizeof(struct in6_addr)); + t->family = AF_INET6; + + break; +#endif /* CONFIG_IPV6 */ + default: + return -EINVAL; + } + } +#endif /* CONFIG_XFRM_BEET */ + /* No way to set this via kame pfkey */ t->aalgos = t->ealgos = t->calgos = ~0; xp->xfrm_nr++; @@ -1935,6 +1985,78 @@ (err = parse_ipsecrequests(xp, pol)) < 0) goto out; +#ifdef CONFIG_XFRM_BEET + /* lookup the SA (xfrm_state) and copy the inner addresses from + * the policy (xfrm_policy) to the selector within the state + */ + if (xp->xfrm_vec[0].mode == IPSEC_MODE_BEET-1) { + struct xfrm_state *x; + if (xp->family == AF_INET6) { + if ((x = xfrm_lookup_bydst(XFRM_MODE_BEET, + &xp->xfrm_vec[0].id.daddr, + &xp->xfrm_vec[0].saddr, + AF_INET6))) { + /* Inner = 6, Outer = 6 */ + x->props.beet_family_out = AF_INET6; + x->props.beet_family_in = AF_INET6; + /* insert inner addresses into the selector */ + memcpy( &x->sel.daddr, &xp->selector.daddr, + sizeof(xfrm_address_t)); + memcpy( &x->sel.saddr, &xp->selector.saddr, + sizeof(xfrm_address_t)); + x->type = xfrm_get_type(x->id.proto, 
x->props.beet_family_in); + } + else if ((x = xfrm_lookup_bydst(XFRM_MODE_BEET, + &xp->xfrm_vec[0].id.daddr, + &xp->xfrm_vec[0].saddr, + AF_INET))) { + /* Inner = 6, Outer = 4 */ + x->props.beet_family_out = AF_INET; + x->props.beet_family_in = AF_INET6; + /* insert inner addresses into the selector */ + memcpy( &x->sel.daddr, &xp->selector.daddr, + sizeof(xfrm_address_t)); + memcpy( &x->sel.saddr, &xp->selector.saddr, + sizeof(xfrm_address_t)); + x->type = xfrm_get_type(x->id.proto, x->props.beet_family_in); + } + } else if (xp->family == AF_INET) { + if ((x = xfrm_lookup_bydst(XFRM_MODE_BEET, + &xp->xfrm_vec[0].id.daddr, + &xp->xfrm_vec[0].saddr, + AF_INET))) + { + /* Inner = 4, Outer = 4 */ + x->props.beet_family_out = AF_INET; + x->props.beet_family_in = AF_INET; + /* insert inner addresses into the selector */ + memcpy( &x->sel.daddr, &xp->selector.daddr, + sizeof(xfrm_address_t)); + memcpy( &x->sel.saddr, &xp->selector.saddr, + sizeof(xfrm_address_t)); + x->type = xfrm_get_type(x->id.proto, x->props.beet_family_in); + } + else if ((x = xfrm_lookup_bydst(XFRM_MODE_BEET, + &xp->xfrm_vec[0].id.daddr, + &xp->xfrm_vec[0].saddr, + AF_INET6))) + { + /* Inner = 4, Outer = 6 */ + x->props.beet_family_out = AF_INET6; + x->props.beet_family_in = AF_INET; + /* insert inner addresses into the selector */ + memcpy( &x->sel.daddr, &xp->selector.daddr, + sizeof(xfrm_address_t)); + memcpy( &x->sel.saddr, &xp->selector.saddr, + sizeof(xfrm_address_t)); + x->type = xfrm_get_type(x->id.proto, x->props.beet_family_in); + } + + } else { + BUG_ON(1); + } + } +#endif out_skb = pfkey_xfrm_policy2msg_prep(xp); if (IS_ERR(out_skb)) { err = PTR_ERR(out_skb); diff -urN linux-2.6.12.2/net/xfrm/Kconfig linux-beet-2.6.12.2/net/xfrm/Kconfig --- linux-2.6.12.2/net/xfrm/Kconfig 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/xfrm/Kconfig 2005-07-25 15:04:36.000000000 +0300 @@ -10,3 +10,11 @@ If unsure, say Y. 
+config XFRM_BEET + bool "IPsec BEET mode" + depends on XFRM + ---help--- + IPsec BEET mode is combination of IPsec transport and tunnel mode. + Currently, it is used only by HIP. + + If unsure, say N. diff -urN linux-2.6.12.2/net/xfrm/Kconfig~ linux-beet-2.6.12.2/net/xfrm/Kconfig~ --- linux-2.6.12.2/net/xfrm/Kconfig~ 1970-01-01 02:00:00.000000000 +0200 +++ linux-beet-2.6.12.2/net/xfrm/Kconfig~ 2005-07-25 14:39:13.000000000 +0300 @@ -0,0 +1,28 @@ +# +# XFRM configuration +# +config XFRM_USER + tristate "IPsec user configuration interface" + depends on INET && XFRM + ---help--- + Support for IPsec user configuration interface used + by native Linux tools. + + If unsure, say Y. + +config XFRM_BEET + bool "IPsec BEET mode" + depends on XFRM + ---help--- + IPsec BEET mode is combination of IPsec transport and tunnel mode. + Currently, it is used only by HIP. + + If unsure, say N. + +config XFRM_BEET_DEBUG + bool "IPsec BEET mode debugging" + depends on XFRM_BEET + ---help--- + Enables BEET mode debugging via syslog. + + If unsure, say N. diff -urN linux-2.6.12.2/net/xfrm/Makefile linux-beet-2.6.12.2/net/xfrm/Makefile --- linux-2.6.12.2/net/xfrm/Makefile 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/xfrm/Makefile 2005-07-25 14:39:13.000000000 +0300 @@ -2,6 +2,6 @@ # Makefile for the XFRM subsystem. 
# -obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_input.o xfrm_algo.o +obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_input.o xfrm_algo.o xfrm_beet.o obj-$(CONFIG_XFRM_USER) += xfrm_user.o diff -urN linux-2.6.12.2/net/xfrm/xfrm_beet.c linux-beet-2.6.12.2/net/xfrm/xfrm_beet.c --- linux-2.6.12.2/net/xfrm/xfrm_beet.c 1970-01-01 02:00:00.000000000 +0200 +++ linux-beet-2.6.12.2/net/xfrm/xfrm_beet.c 2005-07-25 15:03:01.000000000 +0300 @@ -0,0 +1,227 @@ +/* + * xfrm_beet.c: allows for receiving and transmitting packet in BEET mode + * + * Authors: + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * + */ + +#include +#include +#include +#include +#include +#include +#include + +#ifdef CONFIG_XFRM_BEET + +/* xfrm_beet_output: deals with the outgoing BEET packets. + * It changes the outer ip header and correctly set + * the header fields + * + * @skb: structure sk_buff which contains the packet to be transmitted + * skb->data points to the ip header +*/ +int xfrm_beet_output(struct sk_buff *skb) +{ + int err = 0; + struct xfrm_state *x = skb->dst->xfrm; + + if (x->props.beet_family_in == AF_INET && x->props.beet_family_out == AF_INET){ + /* Inner = 4, Outer = 4 */ + struct iphdr *iph = (struct iphdr*)skb->data; + + iph->saddr = x->props.saddr.a4; + iph->daddr = x->id.daddr.a4; + + skb->local_df = 1; //I am a bit unsure on how to implement this -Abi + + iph->check = 0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + + } else if (x->props.beet_family_in == AF_INET && x->props.beet_family_out == AF_INET6){ + /* Inner = 4, Outer = 6 */ + struct iphdr *iph = (struct iphdr*)skb->data; + __u8 protocol, ttl; + + protocol = iph->protocol; + ttl = iph->ttl; + + if (skb_headroom(skb) < sizeof(struct ipv6hdr) - sizeof(struct iphdr)){ + if (pskb_expand_head(skb, sizeof(struct ipv6hdr) - sizeof(struct iphdr),0, GFP_ATOMIC)) + return -EINVAL; //Just returning from here. 
+ + skb->len += sizeof(struct ipv6hdr) - sizeof(struct iphdr); + skb->nh.raw = skb->h.raw - sizeof(struct ipv6hdr); + skb->data = skb->nh.raw; + + } else { + skb_push(skb, sizeof(struct ipv6hdr) - sizeof(struct iphdr)); + skb->nh.raw = skb->h.raw - sizeof(struct ipv6hdr); + skb->data = skb->nh.raw; + } + + skb->protocol = htons(ETH_P_IPV6); + + skb->nh.ipv6h = (struct ipv6hdr*)(skb->data); + + skb->nh.ipv6h->version = 6; + skb->nh.ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); + skb->nh.ipv6h->nexthdr = protocol; + skb->nh.ipv6h->hop_limit = ttl; + ipv6_addr_copy(&skb->nh.ipv6h->saddr,(struct in6_addr *)&x->props.saddr); + ipv6_addr_copy(&skb->nh.ipv6h->daddr, (struct in6_addr *)&x->id.daddr); + + skb->nh.ipv6h->priority = 0; + skb->nh.ipv6h->flow_lbl[0] = 0; + skb->nh.ipv6h->flow_lbl[1] = 0; + skb->nh.ipv6h->flow_lbl[2] = 0; + + } else if (x->props.beet_family_in == AF_INET6 && x->props.beet_family_out == AF_INET){ + /* Inner = 6, Outer = 4 */ + struct ipv6hdr *iph = (struct ipv6hdr*)skb->data; + int delta = sizeof(struct ipv6hdr) - sizeof(struct iphdr); + u8 hop, proto; + u16 payload; + struct iphdr *ip4; + hop = iph->hop_limit; + proto = iph->nexthdr; + + payload = ntohs(iph->payload_len) + sizeof(struct iphdr); + + skb_pull(skb, delta); + + skb->protocol = htons(ETH_P_IP); + ip4 = (struct iphdr *)skb->data; + + ip4->ihl = (sizeof(struct iphdr) >> 2); + ip4->version = 4; + ip4->tos = 0; + ip4->tot_len = htons(payload); + ip4->id = 0; + ip4->frag_off = htons(IP_DF); + ip4->ttl = hop; + ip4->protocol = proto; + ip4->check = 0; + ip4->saddr = x->props.saddr.a4; + ip4->daddr = x->id.daddr.a4; + ip4->check = ip_fast_csum((unsigned char *)ip4, ip4->ihl); + /* The esp6_output assumes that skb->data points to outer IP header, + * skb->nh points eventual new ext hdrs and skb->h points to the ESP header + */ + skb->nh.raw = skb->data; // there is no extension header + + } else if (x->props.beet_family_in == AF_INET6 && x->props.beet_family_out == 
AF_INET6){ + /* Inner = 6, Outer = 6 */ + struct ipv6hdr *iph = (struct ipv6hdr*)skb->data; + ipv6_addr_copy(&iph->saddr, (struct in6_addr *)&x->props.saddr); + ipv6_addr_copy(&iph->daddr, (struct in6_addr *)&x->id.daddr); + } + + return err; +} +EXPORT_SYMBOL(xfrm_beet_output); + + +/* xfrm_beet_input: deals with the incoming BEET packets. + * It changes the outer ip header with the corresponding inner ip header and addresses + * + * @skb: structure sk_buff. skb->nh.raw points to the outer ip address + * skb->data and skb->h.raw point to the ESP to be decapsulated + * + * @x : struct xfrm_state containing the state information + * +*/ +int xfrm_beet_input(struct sk_buff *skb, struct xfrm_state *x) +{ + int err = 0; + + if (x->props.beet_family_in == AF_INET && x->props.beet_family_out == AF_INET){ + /* Inner = 4, Outer = 4 */ + struct iphdr *iph = (struct iphdr *)skb->nh.iph; + + iph->daddr = x->sel.daddr.a4; + iph->saddr = x->sel.saddr.a4; + iph->ttl--; + iph->tot_len = htons(skb->len); + iph->frag_off = htons(IP_DF); + iph->check = 0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + + } else if (x->props.beet_family_in == AF_INET && x->props.beet_family_out == AF_INET6){ + /* Inner = 4, Outer = 6 */ + struct iphdr *iph; + __u8 proto = skb->nh.ipv6h->nexthdr; + __u8 hops = skb->nh.ipv6h->hop_limit; + + + skb->h.raw = skb->nh.raw + sizeof(struct ipv6hdr) - sizeof(struct iphdr); + memmove(skb->h.raw, skb->data, skb->len); + skb->data = skb->h.raw; + + + eth_hdr(skb)->h_proto=htons(ETH_P_IP); + + iph = (struct iphdr *)skb->nh.raw; + memset(iph, 0, sizeof(struct iphdr)); + iph->daddr = x->sel.daddr.a4; + iph->saddr = x->sel.saddr.a4; + iph->ttl = hops - 1; + iph->protocol = proto; + iph->tot_len = htons(skb->len); + iph->frag_off = htons(IP_DF); + iph->ihl = 5; + iph->version = 4; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + + skb->protocol = htons(ETH_P_IP); + + } else if (x->props.beet_family_in == AF_INET6 && x->props.beet_family_out 
== AF_INET){ + /* Inner = 6, Outer = 4 */ + struct ipv6hdr *ip6h; + int proto = skb->nh.iph->protocol; + int hops = skb->nh.iph->ttl; + int total = skb->len - sizeof(struct iphdr); + + if (skb_tailroom(skb) < sizeof(struct ipv6hdr) - sizeof(struct iphdr)){ + if (pskb_expand_head(skb, 0, sizeof(struct ipv6hdr) - sizeof(struct iphdr), GFP_ATOMIC)) + return -EINVAL; //Just returning from here. + } + + skb->h.raw = skb->nh.raw + sizeof(struct ipv6hdr); + memmove(skb->h.raw, skb->data, skb->len); + skb->data = skb->h.raw; + skb->tail += sizeof(struct ipv6hdr) - sizeof(struct iphdr); + + eth_hdr(skb)->h_proto=htons(ETH_P_IPV6); + ip6h = skb->nh.ipv6h; + + memset(ip6h, 0, sizeof(struct ipv6hdr)); + ipv6_addr_copy(&ip6h->saddr, (struct in6_addr *)&x->sel.saddr.a6); + ipv6_addr_copy(&ip6h->daddr, (struct in6_addr *)&x->sel.daddr.a6); + ip6h->payload_len = htons(total); + ip6h->hop_limit = hops-1; + ip6h->version = 6; + ip6h->nexthdr = proto; + + skb->protocol = htons(ETH_P_IPV6); + + } else if (x->props.beet_family_in == AF_INET6 && x->props.beet_family_out == AF_INET6){ + /* Inner = 6, Outer = 6 */ + struct ipv6hdr *ip6h = (struct ipv6hdr *)skb->nh.raw; + ipv6_addr_copy(&ip6h->daddr, + (struct in6_addr *) &x->sel.daddr.a6); + ipv6_addr_copy(&ip6h->saddr, + (struct in6_addr *) &x->sel.saddr.a6); + } + + return err; +} +EXPORT_SYMBOL(xfrm_beet_input); + +#endif /* CONFIG_XFRM_BEET */ diff -urN linux-2.6.12.2/net/xfrm/xfrm_policy.c linux-beet-2.6.12.2/net/xfrm/xfrm_policy.c --- linux-2.6.12.2/net/xfrm/xfrm_policy.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/xfrm/xfrm_policy.c 2005-07-25 14:39:13.000000000 +0300 @@ -11,6 +11,13 @@ * Split up af-specific portion * Derek Atkins Add the post_input processor * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -643,6 +650,10 @@ struct xfrm_tmpl *tmpl = &policy->xfrm_vec[i]; if (tmpl->mode) { +#ifdef CONFIG_XFRM_BEET + 
if(tmpl->mode == XFRM_MODE_BEET) + family = tmpl->family; +#endif remote = &tmpl->id.daddr; local = &tmpl->saddr; } diff -urN linux-2.6.12.2/net/xfrm/xfrm_state.c linux-beet-2.6.12.2/net/xfrm/xfrm_state.c --- linux-2.6.12.2/net/xfrm/xfrm_state.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-beet-2.6.12.2/net/xfrm/xfrm_state.c 2005-07-25 14:39:13.000000000 +0300 @@ -1036,3 +1036,31 @@ INIT_WORK(&xfrm_state_gc_work, xfrm_state_gc_task, NULL); } +#ifdef CONFIG_XFRM_BEET + +struct xfrm_state * +xfrm_lookup_bydst(u8 mode, xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family) +{ + struct xfrm_state *x; + unsigned h = xfrm_dst_hash(daddr, family); + + list_for_each_entry(x, xfrm_state_bydst+h, bydst){ + + if (x->props.family == AF_INET6 && + ipv6_addr_equal((struct in6_addr *)daddr, (struct in6_addr *)x->id.daddr.a6) && + mode == x->props.mode && + ipv6_addr_equal((struct in6_addr *)saddr, (struct in6_addr *)x->props.saddr.a6)) { + return(x); + } + + if (x->props.family == AF_INET && + daddr->a4 == x->id.daddr.a4 && + mode == x->props.mode && + saddr->a4 == x->props.saddr.a4) + return(x); + + } + return(NULL); +} + +#endif //CONFIG_XFRM_BEET From diego.beltrami@HIIT.FI Mon Jul 25 06:30:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 25 Jul 2005 06:30:57 -0700 (PDT) Received: from pegasus.hiit.fi (pegasus.hiit.fi [212.68.1.186]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6PDUnH9012347 for ; Mon, 25 Jul 2005 06:30:50 -0700 Received: from [128.214.113.174] (odysse.hiit.fi [128.214.113.174]) by pegasus.hiit.fi (Postfix) with ESMTP id 37B2922008D for ; Mon, 25 Jul 2005 16:28:55 +0300 (EEST) Subject: Re: [Hipsec] [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux From: Diego Beltrami Reply-To: diego.beltrami@HIIT.FI To: netdev@oss.sgi.com In-Reply-To: <1122295307.14873.37.camel@odysse> References: <1122295307.14873.37.camel@odysse> Content-Type: text/plain Organization: HIIT Message-Id: <1122298135.14873.70.camel@odysse> Mime-Version: 
1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Mon, 25 Jul 2005 16:28:55 +0300 Content-Transfer-Encoding: 7bit X-archive-position: 2780 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: diego.beltrami@HIIT.FI Precedence: bulk X-list: netdev Content-Length: 2285 Lines: 65 Folks, I'm sorry but the sent patch included in the email seems to be broken. Please, use the URL http://hipl.hiit.fi/beet/beet-patch-v1.0-2.6.12.2 Sincerely > Hi folks, > > we have been working for three months to implement a new IPsec mode, > the "BEET" mode, for Linux. Below is a link to the BEET specification > and > the abstract: > > http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt > > Abstract > > This document specifies a new mode, called Bound End-to-End Tunnel > (BEET) mode, for IPsec ESP. The new mode augments the existing ESP > tunnel and transport modes. For end-to-end tunnels, the new mode > provides limited tunnel mode semantics without the regular tunnel > mode overhead. The mode is intended to support new uses of ESP, > including mobility and multi-address multi-homing. > > The BEET mode is required by the Host Identity Protocol (HIP), which > provides authenticated Diffie-Hellman for end-hosts, as well as > mobility and multihoming support. The BEET mode is also useful for > other similar protocols being developed at the IETF. > > Ericsson has already developed a BEET patch for *BSD. Our patch > provides the similar functionality, but using the XFRM architecture. > The patch is included at the end of this email and also at the following > URL: > http://hipl.hiit.fi/beet/beet-patch-v1.0-2.6.12.2 > > We have made some testing in order to assure the quality of the > patch. 
All the tests passed, and below is a list of them: > > * Does not break transport and tunnel mode (with CONFIG_XFRM_BEET > on/off) > * All inner-outer combinations with varying test applications: > ICMP, ICMPv6, FTP, SSH, nc, nc6 > * Works with fragmented packets > * Interoperability with HIPL > * Real machines, virtual machines (vmware) > * Tested with long data streams > > The BEET development team: > > * Abhinav Pathak (InfraHIP/HIIT) > * Diego Beltrami (InfraHIP/HIIT) > * Kristian Slavov (Ericsson) > * Miika Komu (InfraHIP/HIIT) > * Jeff Ahrenholz (Boeing) > > On the behalf of the BEET development team, > > Signed-off-by: Diego Beltrami > > > From mcr@sandelman.ottawa.on.ca Mon Jul 25 10:02:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 25 Jul 2005 10:02:17 -0700 (PDT) Received: from pike.sandelman.ca (pike.sandelman.ca [205.150.200.164]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6PH28H9009107 for ; Mon, 25 Jul 2005 10:02:09 -0700 Received: from sandelman.ottawa.on.ca (road.marajade.sandelman.ca [205.150.200.163]) by pike.sandelman.ca (Postfix) with ESMTP id 9AB4764DD9; Mon, 25 Jul 2005 13:00:07 -0400 (EDT) Received: from marajade.sandelman.ottawa.on.ca (marajade [127.0.0.1]) by sandelman.ottawa.on.ca (Postfix) with ESMTP id C0236E9983; Sun, 24 Jul 2005 19:35:50 -0400 (EDT) From: "Michael Richardson" To: user-mode-linux-devel Cc: netdev@oss.sgi.com Subject: multicast on loopback: mcast networking at the cottage X-Mailer: MH-E 7.82; nmh 1.1; XEmacs 21.4 (patch 17) Date: Sun, 24 Jul 2005 19:35:29 -0400 Message-ID: <6048.1122248129@marajade.sandelman.ottawa.on.ca> X-archive-position: 2781 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcr@sandelman.ottawa.on.ca Precedence: bulk X-list: netdev Content-Length: 2387 Lines: 57 -----BEGIN PGP SIGNED MESSAGE----- I got frustrated at the cottage just now while testing code using multiple user mode linux boxes, 
because I hit the annoying: Configuring network interfaces: mcast_open: IP_ADD_MEMBERSHIP failed, error = 19 There appears not to be a multicast-capable network interface on the host. eth0 should be configured in order to use the multicast transport. SIOCSIFFLAGS: Invalid argument mcast_open: IP_ADD_MEMBERSHIP failed, error = 19 I would just use a script that switched me to daemon networking, but I thought I'd look into this more. I was always going on the assumption that this was because drivers/net/loopback.c was missing a set_multicast_list call. Presence of that function is checked in net/core/dev.c. I wrote one (it doesn't need to do anything...) for fun, rebooted, and noticed that it didn't help. What did help was: route add -net 224.0.0.0 netmask 240.0.0.0 dev lo and it worked even on a stock kernel (2.6.11.8 is what I have). [thus there is no patch attached here...] Thinking back, BSD and Solaris do this whenever an interface that has IFF_MULTICAST set is configured up. Normally configuring "eth0" and giving it a default route works --- the check for a route to the multicast address gives you a valid interface, and all is well. I suspect that for people who want to do multicast between hosts on an actual wire won't want to have just a route to lo, they may want a route for 224.0.0.0/4 on all of their interfaces. Maybe that should be automatic? I think that applications that are multicast aware are supposed to figure out which interfaces that they want to bind to anyway, so they should not be confused by the multiple routes. 
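For reference, "error = 19" is ENODEV on Linux, and the route added above (224.0.0.0 with netmask 240.0.0.0) is the prefix 224.0.0.0/4, which covers the whole IPv4 multicast range. A minimal sketch in C of which destinations that single route matches; the helper names are made up for illustration and are not part of UML or the kernel:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Hypothetical helper (not from the original mail): returns 1 if addr,
 * given in network byte order, lies inside 224.0.0.0/4, i.e. would be
 * matched by "route add -net 224.0.0.0 netmask 240.0.0.0 dev lo". */
static int covered_by_mcast_route(struct in_addr addr)
{
    const uint32_t net  = 0xE0000000u; /* 224.0.0.0 */
    const uint32_t mask = 0xF0000000u; /* 240.0.0.0 */

    return (ntohl(addr.s_addr) & mask) == net;
}

/* Hypothetical helper: parse dotted-quad text and apply the check.
 * Returns -1 if the text is not a valid IPv4 address. */
static int text_covered_by_mcast_route(const char *text)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;
    return covered_by_mcast_route(addr);
}
```

Any group address a program passes to IP_ADD_MEMBERSHIP falls inside this prefix, so with the route in place the lookup for a join that names no interface should resolve to lo.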
- -- ] Michael Richardson Xelerance Corporation, Ottawa, ON | firewalls [ ] mcr @ xelerance.com Now doing IPsec training, see |net architect[ ] http://www.sandelman.ca/mcr/ www.xelerance.com/training/ |device driver[ ] I'm a dad: http://www.sandelman.ca/lrmr/ [ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) Comment: Finger me for keys iQCVAwUBQuQlvIqHRg3pndX9AQGi7QP9FajMsWOj2AZUyPrDtj+T02z5T+Zc1oej 8qobkO624XYhgKlRW/FiBTRNrYOS+MclM2pqeKxOSF7frzt2azZUfOLb0OdNiv1B d8MBTA4MgPTLVeSapXrMA0JImy8KtquRWsn2ApumW/Odsor9OQjqhlYVsYqCMnTD xyFcpSMvArc= =jRE9 -----END PGP SIGNATURE----- From stian@nixia.no Mon Jul 25 10:22:57 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 25 Jul 2005 10:23:01 -0700 (PDT) Received: from nepa.nlc.no (nepa.nlc.no [195.159.31.6]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j6PHMtH9012257 for ; Mon, 25 Jul 2005 10:22:56 -0700 Received: (qmail 11055 invoked by uid 48); 25 Jul 2005 17:20:57 -0000 Received: from 85.166.82.153 (SquirrelMail authenticated user stian@nixia.no); by nepa.nlc.no with HTTP; Mon, 25 Jul 2005 19:20:57 +0200 (CEST) Message-ID: <42036.85.166.82.153.1122312057.squirrel@85.166.82.153> In-Reply-To: <6048.1122248129@marajade.sandelman.ottawa.on.ca> References: <6048.1122248129@marajade.sandelman.ottawa.on.ca> Date: Mon, 25 Jul 2005 19:20:57 +0200 (CEST) Subject: Re: [uml-devel] multicast on loopback: mcast networking at the cottage From: stian@nixia.no To: "Michael Richardson" Cc: "user-mode-linux-devel" , netdev@oss.sgi.com User-Agent: SquirrelMail/1.4.3a-0.f1.1 X-Mailer: SquirrelMail/1.4.3a-0.f1.1 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal X-archive-position: 2782 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: stian@nixia.no Precedence: bulk X-list: netdev Content-Length: 794 Lines: 20 > I suspect that for people who want to 
do multicast between hosts on an > actual wire won't want to have just a route to lo, they may want a route > for 224.0.0.0/4 on all of their interfaces. Maybe that should be > automatic? > > I think that applications that are multicast aware are supposed to > figure out which interfaces that they want to bind to anyway, so they > should not be confused by the multiple routes. If I remember correctly (I might be wrong here), IPv6 does this automatically, and IPv4 should have done the same whenever multicast is supported on a device. Multicast-aware programs should always bind to a device, or broadcast on all devices for discovery. This is not UML-specific; it is how the vanilla kernel behaves. Some distros add an IPv4 multicast route in their init scripts. Stian From as@asalmela.iki.fi Tue Jul 26 04:28:07 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 04:28:18 -0700 (PDT) Received: from asalmela.iki.fi (addr-213-139-163-144.suomi.net [213.139.163.144]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QBS4H9016559 for ; Tue, 26 Jul 2005 04:28:05 -0700 Received: by asalmela.iki.fi (Postfix, from userid 1000) id F31D52765C; Tue, 26 Jul 2005 14:26:07 +0300 (EEST) Date: Tue, 26 Jul 2005 14:26:07 +0300 From: Antti Salmela To: paulus@samba.org Cc: netdev@oss.sgi.com, linux-ppp@vger.kernel.org Subject: PPP dropping packets? Message-ID: <20050726112607.GA18219@asalmela.iki.fi> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 2783 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: asalmela@iki.fi Precedence: bulk X-list: netdev Content-Length: 45298 Lines: 1814 I'm running 2.6.13-rc3 now but have been suffering from weird stalls with my PPPoE connection ever since my ISP made it a requirement, about two months ago. 
By listening on the outgoing ethernet interface with tcpdump I found out that when a stall occurs, the interface is not sending anything. Listening on the ppp interface shows packets being sent. I discussed this with PPPoE maintainer Michal Ostrowski and tried some debugging code he sent me, but it didn't really reveal anything. Here are two tcpdump logs taken when a stall occurred, while a ping to the ppp peer was running. My IP is 213.139.163.144, the peer is 10.10.9.1. tcpdump -i ppp0 -n: 11:13:55.710515 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 1, length 64 11:13:56.709827 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 2, length 64 11:13:57.709842 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 3, length 64 11:13:58.709848 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 4, length 64 11:13:58.874515 IP 213.139.169.82.6668 > 213.139.163.144.58747: . 1689088981:1689089019(38) ack 1926745953 win 8499 11:13:58.874610 IP 213.139.163.144.58747 > 213.139.169.82.6668: . ack 38 win 12424 11:13:59.709864 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 5, length 64 11:14:00.709869 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 6, length 64 11:14:01.709884 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 7, length 64 11:14:02.588471 IP 213.139.163.144.48587 > 207.46.0.35.1863: P 3889301623:3889301628(5) ack 519932196 win 2264 11:14:02.709889 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 8, length 64 11:14:03.024512 IP 213.139.163.144.57511 > 207.46.0.63.1863: P 3897420679:3897420684(5) ack 1231695246 win 2264 11:14:03.709905 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 9, length 64 11:14:04.505177 IP 216.239.59.99.80 > 213.139.163.144.44690: F 2225588483:2225588483(0) ack 4125655007 win 8190 11:14:04.544579 IP 213.139.163.144.44690 > 216.239.59.99.80: . 
ack 1 win 14300 11:14:04.709928 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 10, length 64 11:14:05.512052 IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 win 8190 11:14:05.512256 IP 213.139.163.144.44690 > 216.239.59.99.80: . ack 1 win 14300 11:14:05.709934 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 11, length 64 11:14:06.372979 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 win 14300 11:14:06.628684 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 win 14300 11:14:06.709947 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 12, length 64 11:14:07.140722 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 win 14300 11:14:07.522432 IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 win 8190 11:14:07.522663 IP 213.139.163.144.44690 > 216.239.59.99.80: . ack 1 win 14300 11:14:07.710004 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 13, length 64 11:14:08.164751 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 win 14300 11:14:08.709979 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 14, length 64 11:14:08.879946 IP 213.139.163.144.58747 > 213.139.169.82.6668: P 1:19(18) ack 38 win 12424 11:14:09.710183 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 15, length 64 11:14:10.212868 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 win 14300 11:14:10.710024 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 16, length 64 11:14:11.532770 IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 win 8190 11:14:11.532938 IP 213.139.163.144.44690 > 216.239.59.99.80: . 
ack 1 win 14300 11:14:11.710029 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 17, length 64 11:14:12.710065 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 18, length 64 11:14:12.710896 IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 18, length 64 11:14:13.720235 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 19, length 64 11:14:13.721049 IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 19, length 64 11:14:14.309102 IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 win 14300 11:14:14.363621 IP 216.239.59.99.80 > 213.139.163.144.44690: . ack 2 win 8190 11:14:14.730079 IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 20, length 64 11:14:14.730918 IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 20, length 64 tcpdump -i eth0 -n: 11:13:54.765999 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:13:56.145144 PPPoE [ses 0x73ec] IP 216.221.122.20.40199 > 82.128.195.209.445: S 1710730503:1710730503(0) win 64240 11:13:56.769516 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:13:57.164130 PPPoE [ses 0x6c4a] LCP, Echo-Request (0x09), id 100, length 14 11:13:58.767785 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:13:58.874515 PPPoE [ses 0x74c7] IP 213.139.169.82.6668 > 213.139.163.144.58747: . 
1689088981:1689089019(38) ack 1926745953 win 8499 11:14:00.772107 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:14:02.782147 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:14:04.505177 PPPoE [ses 0x74c7] IP 216.239.59.99.80 > 213.139.163.144.44690: F 2225588483:2225588483(0) ack 4125655007 win 8190 11:14:04.767610 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:14:05.512052 PPPoE [ses 0x74c7] IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 win 8190 11:14:06.769225 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:14:07.522432 PPPoE [ses 0x74c7] IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 win 8190 11:14:08.766445 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:14:10.710040 PPPoE [ses 0x74c7] IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 16, length 64 11:14:10.766339 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:14:11.532770 PPPoE [ses 0x74c7] IP 216.239.59.99.80 > 213.139.163.144.44690: F 0:0(0) ack 1 win 8190 11:14:11.532950 PPPoE [ses 0x74c7] IP 213.139.163.144.44690 > 216.239.59.99.80: . 
ack 1 win 14300 11:14:11.710042 PPPoE [ses 0x74c7] IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 17, length 64 11:14:12.710085 PPPoE [ses 0x74c7] IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 18, length 64 11:14:12.710896 PPPoE [ses 0x74c7] IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 18, length 64 11:14:12.767266 802.1d config 8000.00:06:28:3a:8c:83.8020 root 61cf.00:05:5f:50:60:00 pathcost 27 age 3 max 20 hello 2 fdelay 15 11:14:13.720253 PPPoE [ses 0x74c7] IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 19, length 64 11:14:13.721049 PPPoE [ses 0x74c7] IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 19, length 64 11:14:14.309119 PPPoE [ses 0x74c7] IP 213.139.163.144.44690 > 216.239.59.99.80: F 1:1(0) ack 1 win 14300 11:14:14.316805 PPPoE [ses 0x6738] LCP, Echo-Request (0x09), id 73, length 14 11:14:14.363621 PPPoE [ses 0x74c7] IP 216.239.59.99.80 > 213.139.163.144.44690: . ack 2 win 8190 11:14:14.730096 PPPoE [ses 0x74c7] IP 213.139.163.144 > 10.10.9.1: ICMP echo request seq 20, length 64 11:14:14.730918 PPPoE [ses 0x74c7] IP 10.10.9.1 > 213.139.163.144: ICMP echo reply seq 20, length 64 --- linux-2.6.12-rc6/drivers/net/pppoe.c 2005-06-09 17:35:59.000000000 +0300 +++ linux-2.6.13-rc3/drivers/net/pppoe.c 2005-07-15 14:58:30.000000000 +0300 @@ -849,7 +849,9 @@ int headroom = skb_headroom(skb); int data_len = skb->len; struct sk_buff *skb2; + int line = 0; + line = __LINE__; if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED)) goto abort; @@ -859,6 +861,7 @@ hdr.sid = po->num; hdr.length = htons(skb->len); + line = __LINE__; if (!dev) goto abort; @@ -868,6 +871,7 @@ sizeof(struct pppoe_hdr) + dev->hard_header_len); + line = __LINE__; if (skb2 == NULL) goto abort; @@ -897,6 +901,7 @@ * but free it in case of success. 
*/ + line = __LINE__; if (dev_queue_xmit(skb2) < 0) goto abort; @@ -904,6 +909,7 @@ return 1; abort: + printk("pppoe xmit abort: %d\n", line); return 0; } # # Automatically generated make config: don't edit # Linux kernel version: 2.6.13-rc3 # Thu Jul 14 22:10:48 2005 # CONFIG_X86=y CONFIG_MMU=y CONFIG_UID16=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y # # Code maturity level options # CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_BROKEN_ON_SMP=y CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup # CONFIG_LOCALVERSION="" CONFIG_SWAP=y CONFIG_SYSVIPC=y # CONFIG_POSIX_MQUEUE is not set # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y # CONFIG_AUDIT is not set CONFIG_HOTPLUG=y CONFIG_KOBJECT_UEVENT=y # CONFIG_IKCONFIG is not set # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 # # Loadable module support # CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # CONFIG_MODULE_FORCE_UNLOAD is not set CONFIG_OBSOLETE_MODPARM=y CONFIG_MODVERSIONS=y # CONFIG_MODULE_SRCVERSION_ALL is not set CONFIG_KMOD=y # # Processor type and features # CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set # CONFIG_MPENTIUM4 is not set # CONFIG_MK6 is not set CONFIG_MK7=y # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # 
CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_USE_3DNOW=y CONFIG_HPET_TIMER=y CONFIG_HPET_EMULATE_RTC=y # CONFIG_SMP is not set CONFIG_PREEMPT_NONE=y # CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set CONFIG_X86_UP_APIC=y # CONFIG_X86_UP_IOAPIC is not set CONFIG_X86_LOCAL_APIC=y CONFIG_X86_TSC=y CONFIG_X86_MCE=y CONFIG_X86_MCE_NONFATAL=y # CONFIG_X86_MCE_P4THERMAL is not set # CONFIG_TOSHIBA is not set # CONFIG_I8K is not set # CONFIG_X86_REBOOTFIXUPS is not set # CONFIG_MICROCODE is not set # CONFIG_X86_MSR is not set # CONFIG_X86_CPUID is not set # # Firmware Drivers # CONFIG_EDD=y # CONFIG_NOHIGHMEM is not set CONFIG_HIGHMEM4G=y # CONFIG_HIGHMEM64G is not set CONFIG_HIGHMEM=y CONFIG_SELECT_MEMORY_MODEL=y CONFIG_FLATMEM_MANUAL=y # CONFIG_DISCONTIGMEM_MANUAL is not set # CONFIG_SPARSEMEM_MANUAL is not set CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y # CONFIG_HIGHPTE is not set # CONFIG_MATH_EMULATION is not set CONFIG_MTRR=y # CONFIG_EFI is not set CONFIG_REGPARM=y CONFIG_SECCOMP=y CONFIG_HZ_100=y # CONFIG_HZ_250 is not set # CONFIG_HZ_1000 is not set CONFIG_HZ=100 CONFIG_PHYSICAL_START=0x100000 # CONFIG_KEXEC is not set # # Power management options (ACPI, APM) # CONFIG_PM=y # CONFIG_PM_DEBUG is not set # CONFIG_SOFTWARE_SUSPEND is not set # # ACPI (Advanced Configuration and Power Interface) Support # CONFIG_ACPI=y CONFIG_ACPI_BOOT=y CONFIG_ACPI_INTERPRETER=y # CONFIG_ACPI_SLEEP is not set # CONFIG_ACPI_AC is not set # CONFIG_ACPI_BATTERY is not set 
CONFIG_ACPI_BUTTON=y CONFIG_ACPI_VIDEO=m CONFIG_ACPI_HOTKEY=m CONFIG_ACPI_FAN=y CONFIG_ACPI_PROCESSOR=y CONFIG_ACPI_THERMAL=y # CONFIG_ACPI_ASUS is not set CONFIG_ACPI_IBM=m # CONFIG_ACPI_TOSHIBA is not set CONFIG_ACPI_BLACKLIST_YEAR=0 # CONFIG_ACPI_DEBUG is not set CONFIG_ACPI_BUS=y CONFIG_ACPI_EC=y CONFIG_ACPI_POWER=y CONFIG_ACPI_PCI=y CONFIG_ACPI_SYSTEM=y CONFIG_X86_PM_TIMER=y # CONFIG_ACPI_CONTAINER is not set # # APM (Advanced Power Management) BIOS Support # # CONFIG_APM is not set # # CPU Frequency scaling # CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y # CONFIG_CPU_FREQ_DEBUG is not set CONFIG_CPU_FREQ_STAT=y # CONFIG_CPU_FREQ_STAT_DETAILS is not set CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y # CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set CONFIG_CPU_FREQ_GOV_PERFORMANCE=y CONFIG_CPU_FREQ_GOV_POWERSAVE=y # CONFIG_CPU_FREQ_GOV_USERSPACE is not set CONFIG_CPU_FREQ_GOV_ONDEMAND=y CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y # # CPUFreq processor drivers # CONFIG_X86_ACPI_CPUFREQ=y # CONFIG_X86_POWERNOW_K6 is not set CONFIG_X86_POWERNOW_K7=y CONFIG_X86_POWERNOW_K7_ACPI=y # CONFIG_X86_POWERNOW_K8 is not set # CONFIG_X86_GX_SUSPMOD is not set # CONFIG_X86_SPEEDSTEP_CENTRINO is not set # CONFIG_X86_SPEEDSTEP_ICH is not set # CONFIG_X86_SPEEDSTEP_SMI is not set # CONFIG_X86_P4_CLOCKMOD is not set # CONFIG_X86_CPUFREQ_NFORCE2 is not set # CONFIG_X86_LONGRUN is not set # CONFIG_X86_LONGHAUL is not set # # shared options # # CONFIG_X86_ACPI_CPUFREQ_PROC_INTF is not set # CONFIG_X86_SPEEDSTEP_LIB is not set # # Bus options (PCI, PCMCIA, EISA, MCA, ISA) # CONFIG_PCI=y # CONFIG_PCI_GOBIOS is not set # CONFIG_PCI_GOMMCONFIG is not set # CONFIG_PCI_GODIRECT is not set CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_MMCONFIG=y # CONFIG_PCIEPORTBUS is not set # CONFIG_PCI_LEGACY_PROC is not set CONFIG_PCI_NAMES=y # CONFIG_PCI_DEBUG is not set CONFIG_ISA_DMA_API=y CONFIG_ISA=y # CONFIG_EISA is not set # CONFIG_MCA is not set # CONFIG_SCx200 is not set # # PCCARD 
(PCMCIA/CardBus) support # # CONFIG_PCCARD is not set # # PCI Hotplug Support # # CONFIG_HOTPLUG_PCI is not set # # Executable file formats # CONFIG_BINFMT_ELF=y # CONFIG_BINFMT_AOUT is not set CONFIG_BINFMT_MISC=y # # Networking # CONFIG_NET=y # # Networking options # CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_UNIX=y CONFIG_XFRM=y CONFIG_XFRM_USER=y CONFIG_NET_KEY=y CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_IP_FIB_HASH=y # CONFIG_IP_FIB_TRIE is not set CONFIG_IP_MULTIPLE_TABLES=y # CONFIG_IP_ROUTE_FWMARK is not set CONFIG_IP_ROUTE_MULTIPATH=y # CONFIG_IP_ROUTE_MULTIPATH_CACHED is not set # CONFIG_IP_ROUTE_VERBOSE is not set CONFIG_IP_PNP=y CONFIG_IP_PNP_DHCP=y CONFIG_IP_PNP_BOOTP=y # CONFIG_IP_PNP_RARP is not set # CONFIG_NET_IPIP is not set # CONFIG_NET_IPGRE is not set CONFIG_IP_MROUTE=y # CONFIG_IP_PIMSM_V1 is not set CONFIG_IP_PIMSM_V2=y # CONFIG_ARPD is not set CONFIG_SYN_COOKIES=y CONFIG_INET_AH=y CONFIG_INET_ESP=y CONFIG_INET_IPCOMP=y CONFIG_INET_TUNNEL=y CONFIG_IP_TCPDIAG=y CONFIG_IP_TCPDIAG_IPV6=y CONFIG_TCP_CONG_ADVANCED=y # # TCP congestion control # CONFIG_TCP_CONG_BIC=y CONFIG_TCP_CONG_WESTWOOD=y CONFIG_TCP_CONG_HTCP=y # CONFIG_TCP_CONG_HSTCP is not set # CONFIG_TCP_CONG_HYBLA is not set CONFIG_TCP_CONG_VEGAS=y # CONFIG_TCP_CONG_SCALABLE is not set # # IP: Virtual Server Configuration # # CONFIG_IP_VS is not set CONFIG_IPV6=y # CONFIG_IPV6_PRIVACY is not set CONFIG_INET6_AH=m CONFIG_INET6_ESP=m CONFIG_INET6_IPCOMP=m CONFIG_INET6_TUNNEL=m # CONFIG_IPV6_TUNNEL is not set CONFIG_NETFILTER=y # CONFIG_NETFILTER_DEBUG is not set # # IP: Netfilter Configuration # CONFIG_IP_NF_CONNTRACK=y # CONFIG_IP_NF_CT_ACCT is not set # CONFIG_IP_NF_CONNTRACK_MARK is not set # CONFIG_IP_NF_CT_PROTO_SCTP is not set CONFIG_IP_NF_FTP=y CONFIG_IP_NF_IRC=y # CONFIG_IP_NF_TFTP is not set # CONFIG_IP_NF_AMANDA is not set # CONFIG_IP_NF_QUEUE is not set CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_MATCH_LIMIT=y # CONFIG_IP_NF_MATCH_IPRANGE is not set 
CONFIG_IP_NF_MATCH_MAC=y CONFIG_IP_NF_MATCH_PKTTYPE=y CONFIG_IP_NF_MATCH_MARK=y CONFIG_IP_NF_MATCH_MULTIPORT=y CONFIG_IP_NF_MATCH_TOS=y # CONFIG_IP_NF_MATCH_RECENT is not set CONFIG_IP_NF_MATCH_ECN=y CONFIG_IP_NF_MATCH_DSCP=y CONFIG_IP_NF_MATCH_AH_ESP=y CONFIG_IP_NF_MATCH_LENGTH=y CONFIG_IP_NF_MATCH_TTL=y CONFIG_IP_NF_MATCH_TCPMSS=y CONFIG_IP_NF_MATCH_HELPER=y CONFIG_IP_NF_MATCH_STATE=y CONFIG_IP_NF_MATCH_CONNTRACK=y CONFIG_IP_NF_MATCH_OWNER=y # CONFIG_IP_NF_MATCH_ADDRTYPE is not set # CONFIG_IP_NF_MATCH_REALM is not set # CONFIG_IP_NF_MATCH_SCTP is not set # CONFIG_IP_NF_MATCH_COMMENT is not set # CONFIG_IP_NF_MATCH_HASHLIMIT is not set CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_TARGET_REJECT=y CONFIG_IP_NF_TARGET_LOG=y CONFIG_IP_NF_TARGET_ULOG=y CONFIG_IP_NF_TARGET_TCPMSS=y CONFIG_IP_NF_NAT=y CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=y CONFIG_IP_NF_TARGET_REDIRECT=y # CONFIG_IP_NF_TARGET_NETMAP is not set # CONFIG_IP_NF_TARGET_SAME is not set # CONFIG_IP_NF_NAT_SNMP_BASIC is not set CONFIG_IP_NF_NAT_IRC=y CONFIG_IP_NF_NAT_FTP=y CONFIG_IP_NF_MANGLE=y CONFIG_IP_NF_TARGET_TOS=y CONFIG_IP_NF_TARGET_ECN=y CONFIG_IP_NF_TARGET_DSCP=y CONFIG_IP_NF_TARGET_MARK=y # CONFIG_IP_NF_TARGET_CLASSIFY is not set # CONFIG_IP_NF_RAW is not set CONFIG_IP_NF_ARPTABLES=y CONFIG_IP_NF_ARPFILTER=y # CONFIG_IP_NF_ARP_MANGLE is not set # # IPv6: Netfilter Configuration (EXPERIMENTAL) # CONFIG_IP6_NF_QUEUE=m CONFIG_IP6_NF_IPTABLES=m CONFIG_IP6_NF_MATCH_LIMIT=m CONFIG_IP6_NF_MATCH_MAC=m # CONFIG_IP6_NF_MATCH_RT is not set # CONFIG_IP6_NF_MATCH_OPTS is not set # CONFIG_IP6_NF_MATCH_FRAG is not set # CONFIG_IP6_NF_MATCH_HL is not set CONFIG_IP6_NF_MATCH_MULTIPORT=m CONFIG_IP6_NF_MATCH_OWNER=m CONFIG_IP6_NF_MATCH_MARK=m # CONFIG_IP6_NF_MATCH_IPV6HEADER is not set # CONFIG_IP6_NF_MATCH_AHESP is not set CONFIG_IP6_NF_MATCH_LENGTH=m CONFIG_IP6_NF_MATCH_EUI64=m CONFIG_IP6_NF_FILTER=m CONFIG_IP6_NF_TARGET_LOG=m CONFIG_IP6_NF_MANGLE=m CONFIG_IP6_NF_TARGET_MARK=m # CONFIG_IP6_NF_RAW is not set 
# # SCTP Configuration (EXPERIMENTAL) # CONFIG_IP_SCTP=m # CONFIG_SCTP_DBG_MSG is not set # CONFIG_SCTP_DBG_OBJCNT is not set # CONFIG_SCTP_HMAC_NONE is not set CONFIG_SCTP_HMAC_SHA1=y # CONFIG_SCTP_HMAC_MD5 is not set # CONFIG_ATM is not set # CONFIG_BRIDGE is not set CONFIG_VLAN_8021Q=y # CONFIG_DECNET is not set # CONFIG_LLC2 is not set # CONFIG_IPX is not set # CONFIG_ATALK is not set # CONFIG_X25 is not set # CONFIG_LAPB is not set # CONFIG_NET_DIVERT is not set # CONFIG_ECONET is not set # CONFIG_WAN_ROUTER is not set CONFIG_NET_SCHED=y # CONFIG_NET_SCH_CLK_JIFFIES is not set # CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set CONFIG_NET_SCH_CLK_CPU=y CONFIG_NET_SCH_CBQ=y CONFIG_NET_SCH_HTB=y CONFIG_NET_SCH_HFSC=y CONFIG_NET_SCH_PRIO=y CONFIG_NET_SCH_RED=y CONFIG_NET_SCH_SFQ=y CONFIG_NET_SCH_TEQL=y CONFIG_NET_SCH_TBF=y CONFIG_NET_SCH_GRED=y CONFIG_NET_SCH_DSMARK=y # CONFIG_NET_SCH_NETEM is not set CONFIG_NET_SCH_INGRESS=y CONFIG_NET_QOS=y CONFIG_NET_ESTIMATOR=y CONFIG_NET_CLS=y # CONFIG_NET_CLS_BASIC is not set CONFIG_NET_CLS_TCINDEX=y CONFIG_NET_CLS_ROUTE4=y CONFIG_NET_CLS_ROUTE=y CONFIG_NET_CLS_FW=y CONFIG_NET_CLS_U32=y # CONFIG_CLS_U32_PERF is not set # CONFIG_NET_CLS_IND is not set # CONFIG_CLS_U32_MARK is not set CONFIG_NET_CLS_RSVP=y CONFIG_NET_CLS_RSVP6=y # CONFIG_NET_EMATCH is not set # CONFIG_NET_CLS_ACT is not set CONFIG_NET_CLS_POLICE=y # # Network testing # # CONFIG_NET_PKTGEN is not set # CONFIG_NETPOLL is not set # CONFIG_NET_POLL_CONTROLLER is not set # CONFIG_HAMRADIO is not set # CONFIG_IRDA is not set # CONFIG_BT is not set # # Device Drivers # # # Generic Driver Options # CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=m # CONFIG_DEBUG_DRIVER is not set # # Memory Technology Devices (MTD) # # CONFIG_MTD is not set # # Parallel port support # # CONFIG_PARPORT is not set # # Plug and Play support # CONFIG_PNP=y # CONFIG_PNP_DEBUG is not set # # Protocols # CONFIG_ISAPNP=y # CONFIG_PNPBIOS is not set CONFIG_PNPACPI=y # # Block 
devices # CONFIG_BLK_DEV_FD=y # CONFIG_BLK_DEV_XD is not set # CONFIG_BLK_CPQ_DA is not set # CONFIG_BLK_CPQ_CISS_DA is not set # CONFIG_BLK_DEV_DAC960 is not set # CONFIG_BLK_DEV_UMEM is not set # CONFIG_BLK_DEV_COW_COMMON is not set CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_CRYPTOLOOP=m # CONFIG_BLK_DEV_NBD is not set # CONFIG_BLK_DEV_SX8 is not set # CONFIG_BLK_DEV_RAM is not set CONFIG_BLK_DEV_RAM_COUNT=16 CONFIG_INITRAMFS_SOURCE="" # CONFIG_LBD is not set # CONFIG_CDROM_PKTCDVD is not set # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y # CONFIG_ATA_OVER_ETH is not set # # ATA/ATAPI/MFM/RLL support # CONFIG_IDE=y CONFIG_BLK_DEV_IDE=y # # Please see Documentation/ide.txt for help/info on IDE drives # # CONFIG_BLK_DEV_IDE_SATA is not set # CONFIG_BLK_DEV_HD_IDE is not set CONFIG_BLK_DEV_IDEDISK=y CONFIG_IDEDISK_MULTI_MODE=y CONFIG_BLK_DEV_IDECD=y # CONFIG_BLK_DEV_IDETAPE is not set # CONFIG_BLK_DEV_IDEFLOPPY is not set # CONFIG_BLK_DEV_IDESCSI is not set # CONFIG_IDE_TASK_IOCTL is not set # # IDE chipset support/bugfixes # CONFIG_IDE_GENERIC=m # CONFIG_BLK_DEV_CMD640 is not set # CONFIG_BLK_DEV_IDEPNP is not set CONFIG_BLK_DEV_IDEPCI=y CONFIG_IDEPCI_SHARE_IRQ=y # CONFIG_BLK_DEV_OFFBOARD is not set CONFIG_BLK_DEV_GENERIC=y # CONFIG_BLK_DEV_OPTI621 is not set # CONFIG_BLK_DEV_RZ1000 is not set CONFIG_BLK_DEV_IDEDMA_PCI=y # CONFIG_BLK_DEV_IDEDMA_FORCED is not set CONFIG_IDEDMA_PCI_AUTO=y # CONFIG_IDEDMA_ONLYDISK is not set # CONFIG_BLK_DEV_AEC62XX is not set # CONFIG_BLK_DEV_ALI15X3 is not set # CONFIG_BLK_DEV_AMD74XX is not set # CONFIG_BLK_DEV_ATIIXP is not set # CONFIG_BLK_DEV_CMD64X is not set # CONFIG_BLK_DEV_TRIFLEX is not set # CONFIG_BLK_DEV_CY82C693 is not set # CONFIG_BLK_DEV_CS5520 is not set # CONFIG_BLK_DEV_CS5530 is not set # CONFIG_BLK_DEV_HPT34X is not set # CONFIG_BLK_DEV_HPT366 is not set # CONFIG_BLK_DEV_SC1200 is not set # CONFIG_BLK_DEV_PIIX is not set # CONFIG_BLK_DEV_IT821X is not set # 
CONFIG_BLK_DEV_NS87415 is not set CONFIG_BLK_DEV_PDC202XX_OLD=y # CONFIG_PDC202XX_BURST is not set CONFIG_BLK_DEV_PDC202XX_NEW=y # CONFIG_PDC202XX_FORCE is not set # CONFIG_BLK_DEV_SVWKS is not set # CONFIG_BLK_DEV_SIIMAGE is not set # CONFIG_BLK_DEV_SIS5513 is not set # CONFIG_BLK_DEV_SLC90E66 is not set # CONFIG_BLK_DEV_TRM290 is not set CONFIG_BLK_DEV_VIA82CXXX=y # CONFIG_IDE_ARM is not set # CONFIG_IDE_CHIPSETS is not set CONFIG_BLK_DEV_IDEDMA=y # CONFIG_IDEDMA_IVB is not set CONFIG_IDEDMA_AUTO=y # CONFIG_BLK_DEV_HD is not set # # SCSI device support # CONFIG_SCSI=m CONFIG_SCSI_PROC_FS=y # # SCSI support type (disk, tape, CD-ROM) # CONFIG_BLK_DEV_SD=m # CONFIG_CHR_DEV_ST is not set # CONFIG_CHR_DEV_OSST is not set CONFIG_BLK_DEV_SR=m # CONFIG_BLK_DEV_SR_VENDOR is not set CONFIG_CHR_DEV_SG=m # CONFIG_CHR_DEV_SCH is not set # # Some SCSI devices (e.g. CD jukebox) support multiple LUNs # # CONFIG_SCSI_MULTI_LUN is not set # CONFIG_SCSI_CONSTANTS is not set # CONFIG_SCSI_LOGGING is not set # # SCSI Transport Attributes # # CONFIG_SCSI_SPI_ATTRS is not set # CONFIG_SCSI_FC_ATTRS is not set # CONFIG_SCSI_ISCSI_ATTRS is not set # # SCSI low-level drivers # # CONFIG_BLK_DEV_3W_XXXX_RAID is not set # CONFIG_SCSI_3W_9XXX is not set # CONFIG_SCSI_7000FASST is not set # CONFIG_SCSI_ACARD is not set # CONFIG_SCSI_AHA152X is not set # CONFIG_SCSI_AHA1542 is not set # CONFIG_SCSI_AACRAID is not set # CONFIG_SCSI_AIC7XXX is not set # CONFIG_SCSI_AIC7XXX_OLD is not set # CONFIG_SCSI_AIC79XX is not set # CONFIG_SCSI_DPT_I2O is not set # CONFIG_SCSI_IN2000 is not set # CONFIG_MEGARAID_NEWGEN is not set # CONFIG_MEGARAID_LEGACY is not set # CONFIG_SCSI_SATA is not set # CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_DTC3280 is not set # CONFIG_SCSI_EATA is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set # CONFIG_SCSI_GDTH is not set # CONFIG_SCSI_GENERIC_NCR5380 is not set # CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set # CONFIG_SCSI_IPS is not set # 
CONFIG_SCSI_INITIO is not set # CONFIG_SCSI_INIA100 is not set # CONFIG_SCSI_NCR53C406A is not set # CONFIG_SCSI_SYM53C8XX_2 is not set # CONFIG_SCSI_IPR is not set # CONFIG_SCSI_PAS16 is not set # CONFIG_SCSI_PSI240I is not set # CONFIG_SCSI_QLOGIC_FAS is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA2XXX=m # CONFIG_SCSI_QLA21XX is not set # CONFIG_SCSI_QLA22XX is not set # CONFIG_SCSI_QLA2300 is not set # CONFIG_SCSI_QLA2322 is not set # CONFIG_SCSI_QLA6312 is not set # CONFIG_SCSI_LPFC is not set # CONFIG_SCSI_SYM53C416 is not set # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_T128 is not set # CONFIG_SCSI_U14_34F is not set # CONFIG_SCSI_ULTRASTOR is not set # CONFIG_SCSI_NSP32 is not set # CONFIG_SCSI_DEBUG is not set # # Old CD-ROM drivers (not SCSI, not IDE) # # CONFIG_CD_NO_IDESCSI is not set # # Multi-device support (RAID and LVM) # CONFIG_MD=y CONFIG_BLK_DEV_MD=y # CONFIG_MD_LINEAR is not set # CONFIG_MD_RAID0 is not set CONFIG_MD_RAID1=y # CONFIG_MD_RAID10 is not set # CONFIG_MD_RAID5 is not set # CONFIG_MD_RAID6 is not set # CONFIG_MD_MULTIPATH is not set # CONFIG_MD_FAULTY is not set CONFIG_BLK_DEV_DM=y CONFIG_DM_CRYPT=y CONFIG_DM_SNAPSHOT=y CONFIG_DM_MIRROR=y # CONFIG_DM_ZERO is not set # CONFIG_DM_MULTIPATH is not set # # Fusion MPT device support # # CONFIG_FUSION is not set # CONFIG_FUSION_SPI is not set # CONFIG_FUSION_FC is not set # # IEEE 1394 (FireWire) support # # CONFIG_IEEE1394 is not set # # I2O device support # # CONFIG_I2O is not set # # Network device support # CONFIG_NETDEVICES=y CONFIG_DUMMY=m # CONFIG_BONDING is not set # CONFIG_EQUALIZER is not set # CONFIG_TUN is not set # CONFIG_NET_SB1000 is not set # # ARCnet devices # # CONFIG_ARCNET is not set # # Ethernet (10 or 100Mbit) # CONFIG_NET_ETHERNET=y CONFIG_MII=y # CONFIG_HAPPYMEAL is not set # CONFIG_SUNGEM is not set CONFIG_NET_VENDOR_3COM=y # CONFIG_EL1 is not set # CONFIG_EL2 is not set # 
CONFIG_ELPLUS is not set # CONFIG_EL16 is not set # CONFIG_EL3 is not set # CONFIG_3C515 is not set CONFIG_VORTEX=y # CONFIG_TYPHOON is not set # CONFIG_LANCE is not set # CONFIG_NET_VENDOR_SMC is not set # CONFIG_NET_VENDOR_RACAL is not set # # Tulip family network device support # # CONFIG_NET_TULIP is not set # CONFIG_AT1700 is not set # CONFIG_DEPCA is not set # CONFIG_HP100 is not set # CONFIG_NET_ISA is not set # CONFIG_NET_PCI is not set # CONFIG_NET_POCKET is not set # # Ethernet (1000 Mbit) # # CONFIG_ACENIC is not set # CONFIG_DL2K is not set # CONFIG_E1000 is not set # CONFIG_NS83820 is not set # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set # CONFIG_R8169 is not set # CONFIG_SKGE is not set # CONFIG_SK98LIN is not set # CONFIG_TIGON3 is not set # CONFIG_BNX2 is not set # # Ethernet (10000 Mbit) # # CONFIG_IXGB is not set # CONFIG_S2IO is not set # # Token Ring devices # # CONFIG_TR is not set # # Wireless LAN (non-hamradio) # # CONFIG_NET_RADIO is not set # # Wan interfaces # # CONFIG_WAN is not set # CONFIG_FDDI is not set # CONFIG_HIPPI is not set CONFIG_PPP=y # CONFIG_PPP_MULTILINK is not set # CONFIG_PPP_FILTER is not set # CONFIG_PPP_ASYNC is not set # CONFIG_PPP_SYNC_TTY is not set # CONFIG_PPP_DEFLATE is not set # CONFIG_PPP_BSDCOMP is not set CONFIG_PPPOE=y # CONFIG_SLIP is not set # CONFIG_NET_FC is not set # CONFIG_SHAPER is not set # CONFIG_NETCONSOLE is not set # # ISDN subsystem # # CONFIG_ISDN is not set # # Telephony Support # # CONFIG_PHONE is not set # # Input device support # CONFIG_INPUT=y # # Userland interfaces # CONFIG_INPUT_MOUSEDEV=y CONFIG_INPUT_MOUSEDEV_PSAUX=y CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 # CONFIG_INPUT_JOYDEV is not set # CONFIG_INPUT_TSDEV is not set # CONFIG_INPUT_EVDEV is not set # CONFIG_INPUT_EVBUG is not set # # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y # CONFIG_KEYBOARD_SUNKBD is not set # CONFIG_KEYBOARD_LKKBD is not set # 
CONFIG_KEYBOARD_XTKBD is not set # CONFIG_KEYBOARD_NEWTON is not set # CONFIG_INPUT_MOUSE is not set # CONFIG_INPUT_JOYSTICK is not set # CONFIG_INPUT_TOUCHSCREEN is not set # CONFIG_INPUT_MISC is not set # # Hardware I/O ports # CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_SERIO_SERPORT=y # CONFIG_SERIO_CT82C710 is not set # CONFIG_SERIO_PCIPS2 is not set CONFIG_SERIO_LIBPS2=y # CONFIG_SERIO_RAW is not set # CONFIG_GAMEPORT is not set # # Character devices # CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y # CONFIG_SERIAL_NONSTANDARD is not set # # Serial drivers # CONFIG_SERIAL_8250=y # CONFIG_SERIAL_8250_CONSOLE is not set # CONFIG_SERIAL_8250_ACPI is not set CONFIG_SERIAL_8250_NR_UARTS=4 # CONFIG_SERIAL_8250_EXTENDED is not set # # Non-8250 serial port support # CONFIG_SERIAL_CORE=y # CONFIG_SERIAL_JSM is not set CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=16 # # IPMI # # CONFIG_IPMI_HANDLER is not set # # Watchdog Cards # CONFIG_WATCHDOG=y # CONFIG_WATCHDOG_NOWAYOUT is not set # # Watchdog Device Drivers # CONFIG_SOFT_WATCHDOG=y # CONFIG_ACQUIRE_WDT is not set # CONFIG_ADVANTECH_WDT is not set # CONFIG_ALIM1535_WDT is not set # CONFIG_ALIM7101_WDT is not set # CONFIG_SC520_WDT is not set # CONFIG_EUROTECH_WDT is not set # CONFIG_IB700_WDT is not set # CONFIG_WAFER_WDT is not set # CONFIG_I8XX_TCO is not set # CONFIG_SC1200_WDT is not set # CONFIG_60XX_WDT is not set # CONFIG_CPU5_WDT is not set # CONFIG_W83627HF_WDT is not set # CONFIG_W83877F_WDT is not set # CONFIG_MACHZ_WDT is not set # # ISA-based Watchdog Cards # # CONFIG_PCWATCHDOG is not set # CONFIG_MIXCOMWD is not set # CONFIG_WDT is not set # # PCI-based Watchdog Cards # # CONFIG_PCIPCWATCHDOG is not set # CONFIG_WDTPCI is not set # CONFIG_HW_RANDOM is not set # CONFIG_NVRAM is not set CONFIG_RTC=y # CONFIG_DTLK is not set # CONFIG_R3964 is not set # CONFIG_APPLICOM is not set # CONFIG_SONYPI is not set # # Ftape, the floppy tape device driver # # CONFIG_FTAPE is not set 
CONFIG_AGP=y # CONFIG_AGP_ALI is not set # CONFIG_AGP_ATI is not set # CONFIG_AGP_AMD is not set # CONFIG_AGP_AMD64 is not set # CONFIG_AGP_INTEL is not set # CONFIG_AGP_NVIDIA is not set # CONFIG_AGP_SIS is not set # CONFIG_AGP_SWORKS is not set CONFIG_AGP_VIA=y # CONFIG_AGP_EFFICEON is not set # CONFIG_DRM is not set # CONFIG_MWAVE is not set # CONFIG_RAW_DRIVER is not set # CONFIG_HPET is not set # CONFIG_HANGCHECK_TIMER is not set # # TPM devices # # CONFIG_TCG_TPM is not set # # I2C support # CONFIG_I2C=m CONFIG_I2C_CHARDEV=m # # I2C Algorithms # CONFIG_I2C_ALGOBIT=m CONFIG_I2C_ALGOPCF=m # CONFIG_I2C_ALGOPCA is not set # # I2C Hardware Bus support # CONFIG_I2C_ALI1535=m CONFIG_I2C_ALI1563=m CONFIG_I2C_ALI15X3=m CONFIG_I2C_AMD756=m # CONFIG_I2C_AMD756_S4882 is not set CONFIG_I2C_AMD8111=m CONFIG_I2C_ELEKTOR=m CONFIG_I2C_I801=m CONFIG_I2C_I810=m CONFIG_I2C_PIIX4=m CONFIG_I2C_ISA=m CONFIG_I2C_NFORCE2=m CONFIG_I2C_PARPORT_LIGHT=m CONFIG_I2C_PROSAVAGE=m CONFIG_I2C_SAVAGE4=m CONFIG_SCx200_ACB=m CONFIG_I2C_SIS5595=m CONFIG_I2C_SIS630=m CONFIG_I2C_SIS96X=m # CONFIG_I2C_STUB is not set CONFIG_I2C_VIA=m CONFIG_I2C_VIAPRO=m CONFIG_I2C_VOODOO3=m # CONFIG_I2C_PCA_ISA is not set CONFIG_I2C_SENSOR=m # # Miscellaneous I2C Chip support # CONFIG_SENSORS_DS1337=m # CONFIG_SENSORS_DS1374 is not set CONFIG_SENSORS_EEPROM=m CONFIG_SENSORS_PCF8574=m # CONFIG_SENSORS_PCA9539 is not set CONFIG_SENSORS_PCF8591=m CONFIG_SENSORS_RTC8564=m # CONFIG_SENSORS_MAX6875 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set # CONFIG_I2C_DEBUG_CHIP is not set # # Dallas's 1-wire bus # # CONFIG_W1 is not set # # Hardware Monitoring support # CONFIG_HWMON=y CONFIG_SENSORS_ADM1021=m CONFIG_SENSORS_ADM1025=m CONFIG_SENSORS_ADM1026=m CONFIG_SENSORS_ADM1031=m CONFIG_SENSORS_ADM9240=m CONFIG_SENSORS_ASB100=m CONFIG_SENSORS_ATXP1=m CONFIG_SENSORS_DS1621=m CONFIG_SENSORS_FSCHER=m # CONFIG_SENSORS_FSCPOS is not set CONFIG_SENSORS_GL518SM=m 
CONFIG_SENSORS_GL520SM=m CONFIG_SENSORS_IT87=m CONFIG_SENSORS_LM63=m CONFIG_SENSORS_LM75=m CONFIG_SENSORS_LM77=m CONFIG_SENSORS_LM78=m CONFIG_SENSORS_LM80=m CONFIG_SENSORS_LM83=m CONFIG_SENSORS_LM85=m CONFIG_SENSORS_LM87=m CONFIG_SENSORS_LM90=m CONFIG_SENSORS_LM92=m CONFIG_SENSORS_MAX1619=m CONFIG_SENSORS_PC87360=m CONFIG_SENSORS_SIS5595=m # CONFIG_SENSORS_SMSC47M1 is not set # CONFIG_SENSORS_SMSC47B397 is not set CONFIG_SENSORS_VIA686A=m CONFIG_SENSORS_W83781D=m CONFIG_SENSORS_W83L785TS=m CONFIG_SENSORS_W83627HF=m CONFIG_SENSORS_W83627EHF=m # CONFIG_HWMON_DEBUG_CHIP is not set # # Misc devices # # CONFIG_IBM_ASM is not set # # Multimedia devices # CONFIG_VIDEO_DEV=m # # Video For Linux # # # Video Adapters # # CONFIG_VIDEO_BT848 is not set # CONFIG_VIDEO_PMS is not set # CONFIG_VIDEO_CPIA is not set # CONFIG_VIDEO_SAA5246A is not set # CONFIG_VIDEO_SAA5249 is not set # CONFIG_TUNER_3036 is not set # CONFIG_VIDEO_STRADIS is not set # CONFIG_VIDEO_ZORAN is not set # CONFIG_VIDEO_SAA7134 is not set # CONFIG_VIDEO_MXB is not set # CONFIG_VIDEO_DPC is not set # CONFIG_VIDEO_HEXIUM_ORION is not set # CONFIG_VIDEO_HEXIUM_GEMINI is not set # CONFIG_VIDEO_CX88 is not set # CONFIG_VIDEO_OVCAMCHIP is not set # # Radio Adapters # # CONFIG_RADIO_CADET is not set # CONFIG_RADIO_RTRACK is not set # CONFIG_RADIO_RTRACK2 is not set # CONFIG_RADIO_AZTECH is not set # CONFIG_RADIO_GEMTEK is not set # CONFIG_RADIO_GEMTEK_PCI is not set # CONFIG_RADIO_MAXIRADIO is not set # CONFIG_RADIO_MAESTRO is not set # CONFIG_RADIO_SF16FMI is not set # CONFIG_RADIO_SF16FMR2 is not set # CONFIG_RADIO_TERRATEC is not set # CONFIG_RADIO_TRUST is not set # CONFIG_RADIO_TYPHOON is not set # CONFIG_RADIO_ZOLTRIX is not set # # Digital Video Broadcasting Devices # CONFIG_DVB=y CONFIG_DVB_CORE=m # # Supported SAA7146 based PCI Adapters # CONFIG_DVB_AV7110=m CONFIG_DVB_AV7110_OSD=y CONFIG_DVB_BUDGET=m CONFIG_DVB_BUDGET_CI=m # CONFIG_DVB_BUDGET_AV is not set CONFIG_DVB_BUDGET_PATCH=m # # Supported 
FlexCopII (B2C2) Adapters # # CONFIG_DVB_B2C2_FLEXCOP is not set # # Supported BT878 Adapters # # # Supported Pluto2 Adapters # # CONFIG_DVB_PLUTO2 is not set # # Supported DVB Frontends # # # Customise DVB Frontends # # # DVB-S (satellite) frontends # CONFIG_DVB_STV0299=m CONFIG_DVB_CX24110=m CONFIG_DVB_TDA8083=m CONFIG_DVB_TDA80XX=m CONFIG_DVB_MT312=m CONFIG_DVB_VES1X93=m CONFIG_DVB_S5H1420=m # # DVB-T (terrestrial) frontends # CONFIG_DVB_SP8870=m CONFIG_DVB_SP887X=m CONFIG_DVB_CX22700=m CONFIG_DVB_CX22702=m CONFIG_DVB_L64781=m CONFIG_DVB_TDA1004X=m CONFIG_DVB_NXT6000=m CONFIG_DVB_MT352=m CONFIG_DVB_DIB3000MB=m CONFIG_DVB_DIB3000MC=m # # DVB-C (cable) frontends # CONFIG_DVB_ATMEL_AT76C651=m CONFIG_DVB_VES1820=m CONFIG_DVB_TDA10021=m CONFIG_DVB_STV0297=m # # ATSC (North American/Korean Terresterial DTV) frontends # # CONFIG_DVB_NXT2002 is not set # CONFIG_DVB_OR51211 is not set # CONFIG_DVB_OR51132 is not set # CONFIG_DVB_BCM3510 is not set # CONFIG_DVB_LGDT3302 is not set CONFIG_VIDEO_SAA7146=m CONFIG_VIDEO_SAA7146_VV=m CONFIG_VIDEO_VIDEOBUF=m CONFIG_VIDEO_BUF=m # # Graphics support # CONFIG_FB=y CONFIG_FB_CFB_FILLRECT=m CONFIG_FB_CFB_COPYAREA=m CONFIG_FB_CFB_IMAGEBLIT=m CONFIG_FB_SOFT_CURSOR=m # CONFIG_FB_MACMODES is not set CONFIG_FB_MODE_HELPERS=y # CONFIG_FB_TILEBLITTING is not set # CONFIG_FB_CIRRUS is not set # CONFIG_FB_PM2 is not set # CONFIG_FB_CYBER2000 is not set # CONFIG_FB_ARC is not set # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set # CONFIG_FB_VESA is not set CONFIG_VIDEO_SELECT=y # CONFIG_FB_HGA is not set # CONFIG_FB_NVIDIA is not set # CONFIG_FB_RIVA is not set # CONFIG_FB_I810 is not set # CONFIG_FB_INTEL is not set # CONFIG_FB_MATROX is not set # CONFIG_FB_RADEON_OLD is not set CONFIG_FB_RADEON=m CONFIG_FB_RADEON_I2C=y # CONFIG_FB_RADEON_DEBUG is not set CONFIG_FB_ATY128=m CONFIG_FB_ATY=m # CONFIG_FB_ATY_CT is not set # CONFIG_FB_ATY_GX is not set # CONFIG_FB_SAVAGE is not set # CONFIG_FB_SIS is not 
set # CONFIG_FB_NEOMAGIC is not set # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set # CONFIG_FB_TRIDENT is not set # CONFIG_FB_GEODE is not set # CONFIG_FB_S1D13XXX is not set # CONFIG_FB_VIRTUAL is not set # # Console display driver support # CONFIG_VGA_CONSOLE=y # CONFIG_MDA_CONSOLE is not set CONFIG_DUMMY_CONSOLE=y # CONFIG_FRAMEBUFFER_CONSOLE is not set # # Logo configuration # # CONFIG_LOGO is not set # CONFIG_BACKLIGHT_LCD_SUPPORT is not set # # Sound # CONFIG_SOUND=m # # Advanced Linux Sound Architecture # CONFIG_SND=m CONFIG_SND_TIMER=m CONFIG_SND_PCM=m CONFIG_SND_HWDEP=m CONFIG_SND_RAWMIDI=m CONFIG_SND_SEQUENCER=m CONFIG_SND_SEQ_DUMMY=m CONFIG_SND_OSSEMUL=y CONFIG_SND_MIXER_OSS=m CONFIG_SND_PCM_OSS=m # CONFIG_SND_SEQUENCER_OSS is not set CONFIG_SND_RTCTIMER=m # CONFIG_SND_VERBOSE_PRINTK is not set # CONFIG_SND_DEBUG is not set # # Generic devices # CONFIG_SND_MPU401_UART=m CONFIG_SND_OPL3_LIB=m # CONFIG_SND_DUMMY is not set # CONFIG_SND_VIRMIDI is not set # CONFIG_SND_MTPAV is not set # CONFIG_SND_SERIAL_U16550 is not set # CONFIG_SND_MPU401 is not set # # ISA devices # # CONFIG_SND_AD1816A is not set # CONFIG_SND_AD1848 is not set # CONFIG_SND_CS4231 is not set # CONFIG_SND_CS4232 is not set # CONFIG_SND_CS4236 is not set # CONFIG_SND_ES968 is not set # CONFIG_SND_ES1688 is not set # CONFIG_SND_ES18XX is not set # CONFIG_SND_GUSCLASSIC is not set # CONFIG_SND_GUSEXTREME is not set # CONFIG_SND_GUSMAX is not set # CONFIG_SND_INTERWAVE is not set # CONFIG_SND_INTERWAVE_STB is not set # CONFIG_SND_OPTI92X_AD1848 is not set # CONFIG_SND_OPTI92X_CS4231 is not set # CONFIG_SND_OPTI93X is not set # CONFIG_SND_SB8 is not set # CONFIG_SND_SB16 is not set # CONFIG_SND_SBAWE is not set # CONFIG_SND_WAVEFRONT is not set # CONFIG_SND_ALS100 is not set # CONFIG_SND_AZT2320 is not set # CONFIG_SND_CMI8330 is not set # CONFIG_SND_DT019X is not set # CONFIG_SND_OPL3SA2 is not set # CONFIG_SND_SGALAXY is not set # CONFIG_SND_SSCAPE is 
not set # # PCI devices # CONFIG_SND_AC97_CODEC=m # CONFIG_SND_ALI5451 is not set # CONFIG_SND_ATIIXP is not set # CONFIG_SND_ATIIXP_MODEM is not set # CONFIG_SND_AU8810 is not set # CONFIG_SND_AU8820 is not set # CONFIG_SND_AU8830 is not set # CONFIG_SND_AZT3328 is not set # CONFIG_SND_BT87X is not set # CONFIG_SND_CS46XX is not set # CONFIG_SND_CS4281 is not set # CONFIG_SND_EMU10K1 is not set # CONFIG_SND_EMU10K1X is not set # CONFIG_SND_CA0106 is not set # CONFIG_SND_KORG1212 is not set # CONFIG_SND_MIXART is not set # CONFIG_SND_NM256 is not set # CONFIG_SND_RME32 is not set # CONFIG_SND_RME96 is not set # CONFIG_SND_RME9652 is not set # CONFIG_SND_HDSP is not set # CONFIG_SND_HDSPM is not set # CONFIG_SND_TRIDENT is not set CONFIG_SND_YMFPCI=m # CONFIG_SND_ALS4000 is not set # CONFIG_SND_CMIPCI is not set CONFIG_SND_ENS1370=m CONFIG_SND_ENS1371=m # CONFIG_SND_ES1938 is not set # CONFIG_SND_ES1968 is not set # CONFIG_SND_MAESTRO3 is not set # CONFIG_SND_FM801 is not set # CONFIG_SND_ICE1712 is not set # CONFIG_SND_ICE1724 is not set # CONFIG_SND_INTEL8X0 is not set # CONFIG_SND_INTEL8X0M is not set # CONFIG_SND_SONICVIBES is not set # CONFIG_SND_VIA82XX is not set # CONFIG_SND_VIA82XX_MODEM is not set # CONFIG_SND_VX222 is not set # CONFIG_SND_HDA_INTEL is not set # # Open Sound System # # CONFIG_SOUND_PRIME is not set # # USB support # CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y # CONFIG_USB is not set # # USB Gadget Support # # CONFIG_USB_GADGET is not set # # MMC/SD Card support # # CONFIG_MMC is not set # # InfiniBand support # # CONFIG_INFINIBAND is not set # # SN Devices # # # File systems # # CONFIG_EXT2_FS is not set CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y # CONFIG_EXT3_FS_SECURITY is not set CONFIG_JBD=y # CONFIG_JBD_DEBUG is not set CONFIG_FS_MBCACHE=y # CONFIG_REISERFS_FS is not set # CONFIG_JFS_FS is not set CONFIG_FS_POSIX_ACL=y # # XFS support # # CONFIG_XFS_FS is not set # CONFIG_MINIX_FS is not set # 
CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y CONFIG_QUOTA=y # CONFIG_QFMT_V1 is not set # CONFIG_QFMT_V2 is not set CONFIG_QUOTACTL=y CONFIG_DNOTIFY=y # CONFIG_AUTOFS_FS is not set CONFIG_AUTOFS4_FS=y # # CD-ROM/DVD Filesystems # CONFIG_ISO9660_FS=y CONFIG_JOLIET=y # CONFIG_ZISOFS is not set CONFIG_UDF_FS=m CONFIG_UDF_NLS=y # # DOS/FAT/NT Filesystems # CONFIG_FAT_FS=m # CONFIG_MSDOS_FS is not set CONFIG_VFAT_FS=m CONFIG_FAT_DEFAULT_CODEPAGE=437 CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" # CONFIG_NTFS_FS is not set # # Pseudo filesystems # CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y # CONFIG_DEVPTS_FS_XATTR is not set CONFIG_TMPFS=y # CONFIG_TMPFS_XATTR is not set # CONFIG_HUGETLBFS is not set # CONFIG_HUGETLB_PAGE is not set CONFIG_RAMFS=y # # Miscellaneous filesystems # # CONFIG_ADFS_FS is not set # CONFIG_AFFS_FS is not set # CONFIG_HFS_FS is not set # CONFIG_HFSPLUS_FS is not set # CONFIG_BEFS_FS is not set # CONFIG_BFS_FS is not set # CONFIG_EFS_FS is not set # CONFIG_CRAMFS is not set # CONFIG_VXFS_FS is not set # CONFIG_HPFS_FS is not set # CONFIG_QNX4FS_FS is not set # CONFIG_SYSV_FS is not set # CONFIG_UFS_FS is not set # # Network File Systems # CONFIG_NFS_FS=y CONFIG_NFS_V3=y CONFIG_NFS_V3_ACL=y # CONFIG_NFS_V4 is not set # CONFIG_NFS_DIRECTIO is not set CONFIG_NFSD=m CONFIG_NFSD_V2_ACL=y CONFIG_NFSD_V3=y CONFIG_NFSD_V3_ACL=y # CONFIG_NFSD_V4 is not set CONFIG_NFSD_TCP=y CONFIG_ROOT_NFS=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_EXPORTFS=m CONFIG_NFS_ACL_SUPPORT=y CONFIG_NFS_COMMON=y CONFIG_SUNRPC=y CONFIG_SUNRPC_GSS=m CONFIG_RPCSEC_GSS_KRB5=m CONFIG_RPCSEC_GSS_SPKM3=m # CONFIG_SMB_FS is not set # CONFIG_CIFS is not set # CONFIG_NCP_FS is not set # CONFIG_CODA_FS is not set # CONFIG_AFS_FS is not set # # Partition Types # # CONFIG_PARTITION_ADVANCED is not set CONFIG_MSDOS_PARTITION=y # # Native Language Support # CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" CONFIG_NLS_CODEPAGE_437=m CONFIG_NLS_CODEPAGE_737=m CONFIG_NLS_CODEPAGE_775=m 
CONFIG_NLS_CODEPAGE_850=m CONFIG_NLS_CODEPAGE_852=m CONFIG_NLS_CODEPAGE_855=m CONFIG_NLS_CODEPAGE_857=m CONFIG_NLS_CODEPAGE_860=m CONFIG_NLS_CODEPAGE_861=m CONFIG_NLS_CODEPAGE_862=m CONFIG_NLS_CODEPAGE_863=m CONFIG_NLS_CODEPAGE_864=m CONFIG_NLS_CODEPAGE_865=m CONFIG_NLS_CODEPAGE_866=m CONFIG_NLS_CODEPAGE_869=m CONFIG_NLS_CODEPAGE_936=m CONFIG_NLS_CODEPAGE_950=m CONFIG_NLS_CODEPAGE_932=m CONFIG_NLS_CODEPAGE_949=m CONFIG_NLS_CODEPAGE_874=m CONFIG_NLS_ISO8859_8=m CONFIG_NLS_CODEPAGE_1250=m CONFIG_NLS_CODEPAGE_1251=m # CONFIG_NLS_ASCII is not set CONFIG_NLS_ISO8859_1=m CONFIG_NLS_ISO8859_2=m CONFIG_NLS_ISO8859_3=m CONFIG_NLS_ISO8859_4=m CONFIG_NLS_ISO8859_5=m CONFIG_NLS_ISO8859_6=m CONFIG_NLS_ISO8859_7=m CONFIG_NLS_ISO8859_9=m CONFIG_NLS_ISO8859_13=m CONFIG_NLS_ISO8859_14=m CONFIG_NLS_ISO8859_15=m CONFIG_NLS_KOI8_R=m CONFIG_NLS_KOI8_U=m CONFIG_NLS_UTF8=m # # Profiling support # # CONFIG_PROFILING is not set # # Kernel hacking # # CONFIG_PRINTK_TIME is not set CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y CONFIG_LOG_BUF_SHIFT=14 # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_HIGHMEM is not set CONFIG_DEBUG_BUGVERBOSE=y # CONFIG_DEBUG_INFO is not set # CONFIG_DEBUG_FS is not set # CONFIG_FRAME_POINTER is not set CONFIG_EARLY_PRINTK=y # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_KPROBES is not set # CONFIG_DEBUG_STACK_USAGE is not set # CONFIG_DEBUG_PAGEALLOC is not set CONFIG_4KSTACKS=y CONFIG_X86_FIND_SMP_CONFIG=y CONFIG_X86_MPPARSE=y # # Security options # # CONFIG_KEYS is not set # CONFIG_SECURITY is not set # # Cryptographic options # CONFIG_CRYPTO=y CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_NULL=y CONFIG_CRYPTO_MD4=y CONFIG_CRYPTO_MD5=y CONFIG_CRYPTO_SHA1=y CONFIG_CRYPTO_SHA256=y CONFIG_CRYPTO_SHA512=y # CONFIG_CRYPTO_WP512 is not set # CONFIG_CRYPTO_TGR192 is not set CONFIG_CRYPTO_DES=y CONFIG_CRYPTO_BLOWFISH=y 
CONFIG_CRYPTO_TWOFISH=y CONFIG_CRYPTO_SERPENT=y CONFIG_CRYPTO_AES_586=y CONFIG_CRYPTO_CAST5=y CONFIG_CRYPTO_CAST6=y CONFIG_CRYPTO_TEA=y CONFIG_CRYPTO_ARC4=y CONFIG_CRYPTO_KHAZAD=y # CONFIG_CRYPTO_ANUBIS is not set CONFIG_CRYPTO_DEFLATE=y # CONFIG_CRYPTO_MICHAEL_MIC is not set # CONFIG_CRYPTO_CRC32C is not set # CONFIG_CRYPTO_TEST is not set # # Hardware crypto devices # # CONFIG_CRYPTO_DEV_PADLOCK is not set # # Library routines # # CONFIG_CRC_CCITT is not set CONFIG_CRC32=m # CONFIG_LIBCRC32C is not set CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_X86_BIOS_REBOOT=y CONFIG_PC=y -- Antti Salmela From mkomu@twilight.cs.hut.fi Tue Jul 26 06:04:56 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 06:05:01 -0700 (PDT) Received: from twilight.cs.hut.fi (twilight.cs.hut.fi [130.233.40.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QD4tH9027243 for ; Tue, 26 Jul 2005 06:04:56 -0700 Received: by twilight.cs.hut.fi (Postfix, from userid 60001) id 1FD3A2CD0; Tue, 26 Jul 2005 16:03:00 +0300 (EEST) Received: from kekkonen.cs.hut.fi (kekkonen.cs.hut.fi [130.233.41.50]) by twilight.cs.hut.fi (Postfix) with ESMTP id BCC812CBC for ; Tue, 26 Jul 2005 16:02:59 +0300 (EEST) Received: (from mkomu@localhost) by kekkonen.cs.hut.fi (8.11.7p1+Sun/8.10.2) id j6QD2xS02315; Tue, 26 Jul 2005 16:02:59 +0300 (EEST) Date: Tue, 26 Jul 2005 16:02:59 +0300 (EEST) From: Miika Komu X-X-Sender: mkomu@kekkonen.cs.hut.fi To: netdev@oss.sgi.com Subject: Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux In-Reply-To: <1122295307.14873.37.camel@odysse> Message-ID: References: <1122295307.14873.37.camel@odysse> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2784 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: miika@iki.fi Precedence: bulk X-list: netdev Content-Length: 754 Lines: 17 > Ericsson has 
already developed a BEET patch for *BSD. Our patch > provides the similar functionality, but using the XFRM architecture. > The patch is included at the end of this email and also at the following > URL: > http://hipl.hiit.fi/beet/beet-patch-v1.0-2.6.12.2 We added more files to the same directory that are helpful for testing the patch. There is a patch for setkey which enables you to configure security associations manually. Also, there is a simple script that can be used for quick testing of ping over BEET ESP security associations. http://hipl.hiit.fi/beet/ipsec-tools-0.5.1-patch-beet http://hipl.hiit.fi/beet/run-beet.sh Looking forward to feedback, -- Miika Komu miika@iki.fi http://www.iki.fi/miika/ From jeremy.guthrie@berbee.com Tue Jul 26 07:48:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 07:49:00 -0700 (PDT) Received: from ctg-msnexc01.staff.berbee.com (msn-office-flr2.binc.net [64.73.12.254]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QEmpH9001801 for ; Tue, 26 Jul 2005 07:48:54 -0700 Received: from localhost ([172.30.254.220] RDNS failed) by ctg-msnexc01.staff.berbee.com with Microsoft SMTPSVC(6.0.3790.211); Tue, 26 Jul 2005 09:46:56 -0500 From: "Jeremy M. 
Guthrie" Reply-To: jeremy.guthrie@berbee.com Organization: Berbee Information Networks To: netdev@oss.sgi.com Subject: Linux Policy Routing-Based IDS Load Balancer HOWTO Date: Tue, 26 Jul 2005 09:46:36 -0500 User-Agent: KMail/1.8.1 MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1281510.49u6NrilRT"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200507260946.38894.jeremy.guthrie@berbee.com> X-OriginalArrivalTime: 26 Jul 2005 14:46:56.0285 (UTC) FILETIME=[DE9BACD0:01C591F0] X-archive-position: 2785 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeremy.guthrie@berbee.com Precedence: bulk X-list: netdev Content-Length: 56349 Lines: 1476 --nextPart1281510.49u6NrilRT Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

Linux Policy Routing-Based IDS Load Balancer HOWTO

What you say? Well, some of you may remember that in early January of this year I was spamming the list pretty hard trying to get a host to route quite a bit of data. Turns out it was for a Linux-based IDS load balancer. I am 80-90% done and I'd like to get your feedback as to how bad I've horked this. Specifically, are there descriptions for things like gc_interval and the like that are so technically wrong I should be slapped with a wet penguin?

Obvious things I have not completed yet:
1. spell checker, that will come after the list has beaten this up
2. formatting, I gave up on general formatting till the very end
3. the TOC isn't finished yet

If you could, please review the tuning/performance enhancing sections. BTW, my first howto....

ids-load-balancing-HOWTO
Jeremy M. Guthrie jeremy.guthrie@berbee.com
February 2004
Version 1.0 Jeremy M. Guthrie 2-5-2005

-----------------------------------------------------------------------------

Table of Contents

0. 
Credit and Notes About the Author
1. Terms
1. The Example Problem
2. Possible Solutions
3. The Linux Answer
3.1. Further Example Details
3.2. Needed Software Updates & Other Requirements
3.3. Notes About Limitations
3.4. It's All About the 'IP'
3.5. Other assumptions
3.6. L2CC vs EBTables & L2-NAT
3.6.1 L2CC Config
3.6.2 EBTables & L2-NAT
3.7. The Policy Router Config
4. Performance Tuning
5. Copyright
5.1. GNU Free Documentation License
5.2. PREAMBLE
5.3. APPLICABILITY AND DEFINITIONS
5.4. VERBATIM COPYING
5.5. COPYING IN QUANTITY
5.6. MODIFICATIONS
5.7. COMBINING DOCUMENTS
5.8. COLLECTIONS OF DOCUMENTS
5.9. AGGREGATION WITH INDEPENDENT WORKS
5.10. TRANSLATION
5.11. TERMINATION
5.12. FUTURE REVISIONS OF THIS LICENSE
5.13. How to use this License for your documents

-----------------------------------------------------------------------------

0. Credit and Notes About the Author

First and foremost, this document would not be possible were it not for Jim Leu, Robert Olsson, Stephen Hemminger, David Miller, Jesse Brandeburg, Chris Gerg, and Jon Vanderhill. All of these people were instrumental in providing code, feedback, resources, or other critical input to help get this document and system where it is today.

I myself am a network engineer. I have been building or managing networks since 1992. I have been a Linux advocate since late 1995. It is my primary OS for what I do. I am not a kernel developer, though I have written other software packages. I make no claim that I am the final authority on the systems I reference within. I am providing descriptions of the bits and pieces based on existing documentation or discussions I have had with other persons. IOW, if you see a blatant error or a feature omission, it is not on purpose and I would love the feedback to correct this documentation.

-----------------------------------------------------------------------------

1. 
Terms EBTables: A Layer-2 filtering technology in the Linux V2.6 kernel. Linear Search: The process of one-by-one searching through the hash=20 collision buckets. L2CC: Layer Two Cross Connect: A software patch created by Jim Leu that takes all input frames from an ethernet controller and NATs the=20 destination MAC address to be that of some host on an outbound=20 port. =20 L2XC: Another name for the L2CC software. Innovation: "the introduction of something new" as defined by=20 Merriam-Webster Online(www.m-w.com). =20 MAC-Munger: Another name for the L2CC software. Policy-Router: A router which routes because of a dictated policy rather than by normal IPV4 per-hop-behavior. ie. route all port 80 out my DSL link, all SMTP traffic out my cable link vs. default=20 route traffic via my cable. SPAN: Switch port analyzer. Think of it as a port which mirrors traffic off of a VLAN or switchport out another port. ie. mirror all all traffic going in/out of the Firewall's inside interface to the SPAN port. =2D------------------------------------------------------------------------= =2D--- 1. The Example Problem You are a large business with a lot of Internet bandwidth. Your IDS is capable of dealing with some subset of your traffic volume but it cannot handle it by itself. It also would cost too much money to buy a bigger, faster, IDS. You want to buy the same model as you already have. The two IDSes together can handle the volume but now you have to figure out how to divide the traffic up between the two. Sounds simple doesn't it? =20 Example bandwidth, 800 megabit per second. Assume you are using Cisco IDS 4255 where each IDS lists at $25,000 a piece. A 4255 is supposed to be capable of 600mbps. Remember, you want the IDS to see all of the data in a flow, not just one half! So you cannot just break up the flow without some thought.=20 =2D------------------------------------------------------------------------= =2D--- =20 2. 
Possible Solutions F5, Radware, and Top Layer networks all make boxes that will split data appart. All of these are commercial products that help load balance flows for IDSs or traffic management. They are not cheap and in some ca= ses are budget busters. =20 The example network hardware I reference in here will be Cisco platforms as that is what I am most familiar with. =2D------------------------------------------------------------------------= =2D--- =20 3. The Linux Answer The good news is that Linux has an answer with L2CC/EBTables and Policy= =20 Routing. The two main components run in the kernel so they operate securel= y. =20 I will detail scaling the solution upward as it should be able to meet the= =20 needs of most any bandwidth. The Linux solution identified herein has been dubbed by Chris Gerg as th= e=20 (I(DS)^2). However I will to the whole system as the IDS load balancer he= re=20 after. The concept is simple: (or EBTables) +---------------+ +---+ | Catalyst 6509 |-[1000SX-SPAN]->| l | +---------------+ +---------------+ | 2 |-[1000SX]->| Policy Router | +---------------+ | c | +-+-----+-------+ | Catalyst 6509 |-[1000SX-SPAN]->| c | | +---------------+ +---+ |[1000SX] v +-------------------+ | Catalyst 3750 Gig | +---+------------+--+ | | |<--1000TX-->| +---+---+ | | IDS A | | +-------+ | +---+---+ | IDS A | +-------+ There is a highly available network with two enterprise class switches=20 front-ending the network. Each switch hands off a SPAN port to the Layer T= wo=20 Cross Connect(l2cc/l2xc) host AKA EBTables Host. The SPAN data is an=20 identical copy of the data from the ports they mirror. =20 =20 If the destination MAC address of an example packet is 05:05:05:03:03:03= ,=20 then we to NAT this to an address of the policy router's outside interface= =20 MAC address. Why? Normal routers will not route data unless the layer two= =20 destination address matches their own MAC address. 
The l2cc host changes/L2-NATs the destination MAC address to be that of the
Policy Router's 'outside' interface. When the data arrives at the policy
router, it looks in its ip routing ruleset to determine what 'table' to use
to forward the incoming packets. Once a table is found and selected, that
table is then used to define the per-hop-behavior at that point in time for
that packet.

3.1. Further Example Details

Time to add yet another layer of detail to the example. We will assume we
want to run Snort or NTOP against all traffic coming through our network. In
this example we will assume that we have 1.0.0.0/16 assigned to us. We also
know that we have a pretty good distribution of traffic such that 1.0.0.0/17
gets about 350mbps and 1.0.128.0/17 gets about 450mbps.

We also know that, because of how the network is built, traffic to/from both
/17 subnets comes in over both SPANs but is NOT duplicate traffic. This can
happen because of things like per-packet or per-flow routing decisions made
by downstream equipment.

3.2. Needed Software Updates & Other Requirements

You will want to make sure you have the latest tools for the job. I have
only ever worked this solution using Intel Gig NICs. Others should work but
I have never tested them.

Software to get:
1. Latest Intel E1000 drivers: http://sourceforge.net/projects/e1000/
2. Latest IPRoute2 utilities: http://www.policyrouting.org/
3. Latest L2CC software: http://mpls-linux.sf.net/
4. A V2.4 & V2.6 Linux Kernel: http://www.kernel.org/
   L2CC requires V2.4 and you should use V2.6 for your policy router.
5. Schedutils: http://tech9.net/rml/schedutils/
6. EBTables: http://ebtables.sourceforge.net/

As for other requirements....

I HIGHLY recommend a multi-CPU box. Hyperthreading has shown advantages and
works well in our implementation. I typically assign one CPU per NIC on
either the EBTables box and/or Policy Router.
Your mileage will vary but use common sense. If you have a dual 3.2 GHz P4
w/ Hyperthreading then that will have enough horsepower to handle large data
volumes. In some cases you could easily assign more than one NIC to a CPU.

3.3. Notes About Limitations

Every piece of hardware or software has limits. Keep this in mind when
deploying your system. I will show you where counters are instrumented and
generically how to manage them. Like any good system administrator, you will
have to manage your system.

3.4. It's All About the 'IP'

ifconfig, netstat, and other old-style utilities are being phased out
slowly. Familiarize yourself with the 'ip' utility from the iproute2 package
as it replaces the prior listed programs. The iproute2 package includes
other programs that will help in providing access to critical performance
information. 'lnstat' and 'rtstat' can be used to poll routing performance
information from your kernel DEPENDING on which kernel release you are
running.

3.5. Other assumptions

Going forward it will be assumed that you have the appropriate kernels
and/or features installed unless we talk explicitly about a feature.

3.6. L2CC vs EBTables & L2-NAT

L2CC was the only option available when this document was first written but
now EBTables is available. EBTables was beaten up quite a bit at the May
2005 Networld+Interop in Vegas. EBTables proved to be a very reliable bit of
software. L2CC has more burn-in time for my organization but we are in the
process of migrating. This document will provide examples for both L2CC &
EBTables.

3.6.1 L2CC Config

The example L2CC host has three Gig NICs. eth0-1 are gathering SPAN data
while eth2 is the output port. The Policy router's eth0 MAC address is
01:01:01:10:10:10.

The L2CC host will need the L2CC patch applied against the V2.4 kernel.
From there do the following:

    make menuconfig
    Select "Networking Options"
    Compile in "Layer 2 Cross Connect (EXPERIMENTAL)", do not build as a
    module.
    Rebuild your kernel and reboot.

    #add two entries, one for each NIC
    l2cc -a -i eth0 -o eth2 -m 01:01:01:10:10:10
    l2cc -a -i eth1 -o eth2 -m 01:01:01:10:10:10

    #delete two entries, one for each NIC
    l2cc -d -i eth0 -o eth2 -m 01:01:01:10:10:10
    l2cc -d -i eth1 -o eth2 -m 01:01:01:10:10:10

*WARNING* Test that your configuration is working using TCPDump in an
ISOLATED network.

In the above example, I should be able to run three instances of TCPDump
and see that data coming in eth0 & eth1 is having its MAC address NAT'd when
being transmitted out eth2.

3.6.2 EBTables & L2-NAT

The example EBTables host has three Gig NICs. eth0-1 are gathering SPAN
data while eth2 is the output port. The Policy router's eth0 MAC address is
01:01:01:10:10:10.

EBTables will require a Linux host running with a V2.6 kernel.

To build your Linux kernel with EBTables support:

    make menuconfig
    Select "Device Drivers"
    Select "Networking Support"
    Select "Networking Options"
    Compile in "802.1d Ethernet Bridging"
    Select "Network packet filtering"
    Select "Bridge: Netfilter Configuration"
    Select "Ethernet Bridge tables (ebtables) support"
    Select "ebt: nat table support"
    Select "ebt: dnat target support"
    Select "ebt: snat target support"
    Rebuild your kernel and reboot.
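
Before rebooting, it can save a round trip to confirm that the options
selected above actually landed in the kernel .config. The following is only a
minimal sketch in plain sh: the function name check_ebt_config is made up for
illustration, and the CONFIG_ symbol names are an assumption based on how 2.6
kernels name the bridge/ebtables options, so verify them against your own
tree before trusting the result.

```shell
#!/bin/sh
# check_ebt_config FILE
# Report which bridge/ebtables options are built in (y) or modular (m) in a
# kernel .config file. The symbol list below is an assumption; adjust it to
# match your kernel tree. Returns non-zero if any option is missing.
check_ebt_config() {
    cfg=$1
    missing=0
    for opt in CONFIG_BRIDGE CONFIG_BRIDGE_NF_EBTABLES \
               CONFIG_BRIDGE_EBT_T_NAT CONFIG_BRIDGE_EBT_DNAT \
               CONFIG_BRIDGE_EBT_SNAT; do
        # An enabled option appears as "NAME=y" or "NAME=m" on its own line
        if grep -Eq "^${opt}=(y|m)\$" "$cfg"; then
            echo "ok      $opt"
        else
            echo "MISSING $opt"
            missing=1
        fi
    done
    return $missing
}
```

Typical use would be something like 'check_ebt_config /usr/src/linux/.config'
(the path is an example, use wherever your kernel source lives).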

    #prep your interfaces
    ifconfig eth0 up
    ifconfig eth1 up
    ifconfig eth2 up

    #create the br0 interface
    brctl addbr br0

    #turn off spanning-tree
    brctl stp br0 off

    #add interfaces to the br0 broadcast domain
    brctl addif br0 eth0
    brctl addif br0 eth1
    brctl addif br0 eth2

    #Prep ebtables
    ebtables -F INPUT
    ebtables -F OUTPUT
    ebtables -F FORWARD
    ebtables -t nat -F PREROUTING

    #NAT all incoming data on eth0 to 01:01:01:10:10:10
    ebtables -t nat -A PREROUTING -i eth0 -j dnat --to-destination 01:01:01:10:10:10

    #NAT all incoming data on eth1 to 01:01:01:10:10:10
    ebtables -t nat -A PREROUTING -i eth1 -j dnat --to-destination 01:01:01:10:10:10

    #Tell EBTables to route/bridge data destined to 01:01:01:10:10:10 out eth2
    ebtables -A OUTPUT -o eth2 -d 01:01:01:10:10:10 -j ACCEPT

*WARNING* Test that your configuration is working using TCPDump in an
ISOLATED network.

In the above example, I should be able to run three instances of TCPDump
and see that data coming in eth0 & eth1 is having its MAC address NAT'd when
being transmitted out eth2.

3.7. The Policy Router Config

I am going to assume you have the appropriate iproute2 package for your
kernel. You may need a newer version of the iproute2 utilities.

i.e. V2.6.9 kernel

    strace -f rtstat
    ...
    --snip--
    open("/proc/net/rt_cache_stat", O_RDONLY) = -1 ENOENT (No such file or directory)
    --snip--

In this case I would need the latest iproute2 code to use
/proc/net/stat/rt_cache instead. In fact, rtstat may also be called
'lnstat'.

*ASSUMPTION* This policy router config assumes there are no overlapping
subnets. Overlapping subnets mean you must order your policy rules
appropriately for them to work. Overlapping IP examples: 192.168.0.0/16 &
192.168.0.0/24

The policy router is made up of several components. The policy router uses
rules and tables. Rules are used to classify which traffic belongs to which
table.
This is where the policy in policy routing comes from. You define what
policies you want to implement.

Tables are routing tables used to define the per-hop-behavior for the
packet being routed via that table. Shortly you will see how rules and
tables combine to split traffic.

In our example, we want to split traffic of two /17s to the two sensors.
With that in mind we will add rules to do the actual policy mapping.

Eth0 is our input device while eth1 is our output device. Eth0 will have
an IP address of 10.0.0.1/32. An IP address is required, otherwise the Linux
kernel will not policy-route for the interface. Eth1 will have an IP address
of 10.0.1.1/24. Sensor 1 will have an IP address of 10.0.1.10, Sensor 2 will
have an IP address of 10.0.1.11.

Routing at high speed means any hiccup, EVEN SMALL ones, results in lost
packets as receive rings for network cards can be overrun quickly. Thus we
have to minimize any hiccups. One hiccup is ARP. Sensors NEVER transmit and
we will always have data to send to them. You will see several changes we
will make to account for this.

    #First, turn off IP forwarding before we configure routing
    echo 0 > /proc/sys/net/ipv4/ip_forward

    #Add static ARP entries as we should ALWAYS know what MAC
    #address to associate with our Sensor IPs
    arp -s 10.0.1.10 00:02:50:98:DC:1C
    arp -s 10.0.1.11 00:02:50:A1:5D:5A

    #send any traffic to/from 1.0.0.0/17 to table 15
    ip rule add type unicast dev eth0 from 1.0.0.0/17 table 15
    ip rule add type unicast dev eth0 to 1.0.0.0/17 table 15

    #send any traffic to/from 1.0.128.0/17 to table 16
    ip rule add type unicast dev eth0 from 1.0.128.0/17 table 16
    ip rule add type unicast dev eth0 to 1.0.128.0/17 table 16

    #Tell policy routing code that the only path in table 15 is via Sensor 1!
    ip route add default via 10.0.1.10 dev eth1 table 15

    #Tell policy routing code that the only path in table 16 is via Sensor 2!
    ip route add default via 10.0.1.11 dev eth1 table 16

    #When done with our changes, flush the cache
    ip route flush cache

    #Lastly, turn on IP forwarding
    echo 1 > /proc/sys/net/ipv4/ip_forward

This is all you actually 'have to do' to enable a working system. There
are however other changes you SHOULD make to help with performance.

For one, if you left IP forwarding on and blew away the policy routing
table, all traffic would then follow your normal default route on the
box!!!!!!

*WARNING*
Let's complicate our policy router by adding another interface,
eth3->192.168.10.5, with a default gw to a firewall so we can remotely manage
the system. In our example, we had 800mbps heading towards our policy
router. If we turn off policy routing but leave IP forwarding on, the Linux
host will try to forward >>>>800mbps<<<< of traffic towards the firewall on
the 192.168.10.0/24 network, thereby killing it. Okay?! Follow? If not,
re-read till you do.

Hint: If your firewall has a 100mbps interface and you fire 800mbps at it,
the firewall will stop working because you will overload its interface with
traffic.

There are then two ways to protect yourself:
A) Always turn off IP forwarding before making ANY changes to the policy
   router.
B) Turn on iptables filtering. Here is a quick and dirty example of an
   IPtables filter to apply to this example host:

    #The only data that will be allowed in or out of eth3 will be traffic
    #to or from 192.168.10.5. So even if you accidentally leave IP forwarding
    #on, you can trust that IP Tables is stemming the flow from burying your
    #gateway/firewall.
    iptables -F
    iptables -A FORWARD -s 192.168.10.5 -o eth3 -j ACCEPT
    iptables -A FORWARD -o eth3 -j DROP
    iptables -A OUTPUT -s 192.168.10.5 -o eth3 -j ACCEPT
    iptables -A OUTPUT -o eth3 -j DROP

The switches will need some adjustments to make sure that the switch knows
exactly where the sensors are. If the sensors never transmit data on their
ports then the switch never learns their MAC addresses and acts as a hub
for that traffic, which is exactly what we don't want.

    #3750 config:
    interface GigabitEthernet1/0/1
     description Eth0 of Sensor 1
     switchport access vlan 100
     switchport trunk native vlan 100
     switchport trunk allowed vlan none
     switchport mode access
     switchport nonegotiate
     load-interval 30
     no cdp enable
    !
    interface GigabitEthernet1/0/2
     description Eth0 of Sensor 2
     switchport access vlan 100
     switchport trunk native vlan 100
     switchport trunk allowed vlan none
     switchport mode access
     switchport nonegotiate
     load-interval 30
     no cdp enable
    !
    interface GigabitEthernet1/0/28
     description Eth1 of Policy Router
     switchport access vlan 100
     switchport trunk native vlan 100
     switchport trunk allowed vlan none
     switchport mode access
     switchport nonegotiate
     load-interval 30
     no cdp enable
    !
    mac-address-table static 0002.5098.DC1C vlan 100 interface GigabitEthernet1/0/1
    mac-address-table static 0002.50A1.5D5A vlan 100 interface GigabitEthernet1/0/2

That's it? Right? Well, sure. If your system is blazing fast and does
not require any tuning. This assumes your system defaults are adequate.
That may not be the case. The rest of this document will be dedicated to
discussing eliminating the bottlenecks.

-----------------------------------------------------------------------------
4. Performance Tuning

Jumping back to a prior statement, you need to be aware of the limits of
the policy routing system. I will list them out here and we will discuss how
to go about addressing them. Some are easy to update, others are what they
are.

4.1. General Limits

Kernel Limits:
    Max # of Policy Routing Rules:       32768 - some are reserved
    Max # of Policy Routing Tables:      256 - some are reserved
    Max # of Interfaces in V2.6 Kernel:  4096
    Max # of Interfaces in V2.4 Kernel:  256

IP Route Hash Limits:

Card Limits:
    Intel EtherExpress
        RX/TX Buffer Count:      256 packets
        Max RX/TX Buffer Count:  4096 packets

4.2. Instrumentation

Knowing that any system is running well can take a bit to figure out.
We will concentrate on monitoring a few places: memory, CPU utilization,
interrupt distribution, network stack drops, network card drops, # of
routes, garbage collection, routing packets per second, # of hash entries,
and others.

4.2.1. rtstat or lnstat in the land of iproute2

Rtstat and lnstat are two tools used to watch routing activity within the
Linux kernel. Kernels prior to approximately V2.6.9 will use rtstat. After
2.6.9 iproute2 uses lnstat to gather routing detail. You can tell if your
kernel works with rtstat or lnstat by trying the existing install of rtstat.
Both tools pull data from /proc, it is a question of which file.

i.e.
    [plato jguthrie 10:29am]~-> rtstat
    fopen: No such file or directory

4.2.2 Important /proc files for your reference.

    #Data on each route cache entries/hashes
    /proc/net/rt_cache

    #Aggregate statistics on route cache entries/hashes
    /proc/net/stat/rt_cache

    #General information about the network stack on a per-cpu basis
    /proc/net/softnet_stat

We will look at each of the files to see what they can tell us.

4.2.2.1. /proc/net/softnet_stat

    cat /proc/net/softnet_stat
    00000130 00000034 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    00000150 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

The basic idea behind softnet_stat is that we use it as a way to tell us if
the kernel itself is dropping packets. There is one line per CPU. The second
field in the list of nine is the packet drop count.
You can see in this example that we have dropped 0x34 (52 decimal) packets.
Look into using NAPI or other network card features to possibly relieve CPU
overhead.

4.2.3 CPUs, Memory, and Interrupts & the Rules To Follow

As with any rules here, all can be affected by your budget.

1. Assume that each CPU on your box will be handling only one NIC's
   interrupts.
2. Assume that you will be using 'taskset' to keep non-kernel routing
   functions assigned to a specific CPU.
3. EBTables uses little memory, you can skimp here.
4. Policy Routing chews memory, you cannot skimp here - minimum 1 Gigabyte
   of RAM ***HIGHLY*** recommended.

4.2.3.1 CPUs should only follow one NIC.

If you look at the output below you can see that CPU0 is taking the
interrupts for eth3. CPU1 is taking interrupts for eth2 & eth0. Optimising
any system relies on keeping thrashing to a minimum. As a result I highly
recommend disabling IRQ Balancing.

    make menuconfig for your kernel config
    Select "Processor type and features"
    Disable "Enable kernel irq balancing"
    Rebuild your kernel and reboot.

You will have to poke around /proc to set which CPU an interrupt binds to.
Here is what was used to set the interrupt/CPU bindings down below:

    echo 01 > /proc/irq/18/smp_affinity
    echo 02 > /proc/irq/20/smp_affinity

The value is a hexadecimal bitmask with one bit per CPU, so it goes up in
powers of two: CPU0 is 01, CPU1 is 02, and the third CPU (CPU2) would
actually be 04.

    cat /proc/interrupts
               CPU0       CPU1
      0: 3184569581 1789102599    IO-APIC-edge  timer
      1:       1005        218    IO-APIC-edge  i8042
      7:          0          0   IO-APIC-level  ohci_hcd
      8:          1          1    IO-APIC-edge  rtc
     12:        122         74    IO-APIC-edge  i8042
     14:          2          0    IO-APIC-edge  ide0
     18:  995373697       5139   IO-APIC-level  eth3
     20:          2 1378253801   IO-APIC-level  eth2
     27:    7542100    9352305   IO-APIC-level  eth0
     28:    4150402   13187680   IO-APIC-level  aic7xxx
     30:          0          0   IO-APIC-level  acpi
    NMI:          0          0
    LOC:  679927478  679903506
    ERR:          0
    MIS:          0

4.2.3.2 Taskset is your friend.

Taskset allows an administrator to bind a software process to a specific
processor on a box.
By using taskset you help cut down on CPU cache thrashing. If your hosts
will be running SNMP daemons, snort, etc, then you will want to bind them
to the least used CPU. You want to keep your base load balancing system
predictable.

4.3 Tune your routing!

The Linux kernel defaults for routing work great in a lot of situations and
unfortunately this is not one of them. You will find that the Linux kernel
will take some tuning to get the performance you want.

The kernel counts on route hash entries to track existing conversations.
One route hash entry is used per host-host communication.

i.e.
    Entry 1: 192.168.1.1 -> 192.168.2.2
    Entry 2: 192.168.2.4 -> 192.168.7.5

Imagine HOW MANY of these you might have given your volumes of traffic.
Also imagine that the kernel has NO WAY to tell when a conversation is over.
The kernel is not following TCP/UDP conversations, thus entries simply age
out of existence. You will need to conduct performance testing to confirm
how well your environment runs. Just be warned that the more route hash
entries you run, the more RAM the kernel WILL use.

Example 'free' from your Policy router:

    /proc/sys/net/ipv4/route/gc_thresh: 786432
                 total       used       free     shared    buffers     cached
    Mem:       1034088    1007112      26976          0     310840     217220
    -/+ buffers/cache:     479052     555036
    Swap:      1028120          0    1028120

4.3.1 Adjust one kernel parameter and reboot.

You will need to bump up the maximum number of route hash entries the
kernel supports. I recommend setting this rather high and using another
parameter to set your ceiling. Add the following to your boot config kernel
parameters:

    rhash_entries=2400000

4.3.2 Adjust /proc/sys/net/... to tune your routing

The Linux kernel uses six parameters to adjust how it handles managing the
routing hashes, collisions, and aging.

gc_elasticity can best be described as the average bucket depth the kernel
will accept before it starts expiring route hash entries.
This will help maintain the upper limit of active routes.

    echo 8 > /proc/sys/net/ipv4/route/gc_elasticity

I had limited success playing with these next two entries seeing as I could
find little information on the effect of either one.

    echo 60 > /proc/sys/net/ipv4/route/gc_interval
    echo 0 > /proc/sys/net/ipv4/route/gc_min_interval

gc_thresh is another limiting factor in controlling how much RAM your
policy routing will eat up. This number cannot be greater than the
rhash_entries kernel parameter. As a rule of thumb, set your rhash_entries
parameter REALLY high (mine is 2.4 million) and control your running limit
with gc_thresh.

    echo 1048576 > /proc/sys/net/ipv4/route/gc_thresh

This parameter needs better kernel docs.

    echo 300 > /proc/sys/net/ipv4/route/gc_timeout

The secret_interval instructs the kernel how often to blow away ALL route
hash entries regardless of how new/old they are. In our environment this is
generally bad. The CPU will be busy rebuilding thousands of entries per
second every time the cache is cleared. However we set this to run once a
day to keep memory leaks at bay (though we've never had one).

    echo 86400 > /proc/sys/net/ipv4/route/secret_interval

4.3.3 Basic scripts

These aren't perfect but then again, what is....
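
In the same spirit, here is a small helper for keeping track of the six
route-cache tunables from 4.3.2, for example to record them before and after
a change. It is only a sketch: the function name show_route_tunables and its
optional directory argument are hypothetical, added so the function can be
pointed at a copy of the files as well as the live /proc tree. It only reads
values and never writes them.

```shell
#!/bin/sh
# show_route_tunables [DIR]
# Print the six route-cache tunables discussed in 4.3.2, one per line.
# DIR defaults to /proc/sys/net/ipv4/route. A tunable that does not exist on
# the running kernel is reported as "(not present)" rather than failing.
show_route_tunables() {
    dir=${1:-/proc/sys/net/ipv4/route}
    for name in gc_elasticity gc_interval gc_min_interval \
                gc_thresh gc_timeout secret_interval; do
        if [ -r "$dir/$name" ]; then
            printf '%-16s %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%-16s %s\n' "$name" "(not present)"
        fi
    done
}
```

Running 'show_route_tunables' with no argument reads the live values;
redirecting the output to a file gives you something to diff after tuning.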

4.3.3.1 watcherrors

    #!/bin/tcsh
    set interval=15
    set argc=`echo $argv | wc -w | tr -s " " "\t" | cut -f2`
    if ( $argc > 0 ) then
        set interval=$argv[1]
    endif
    set stats=`ifconfig eth3 | egrep 'RX packets:' | tr -s ": " "\t" | cut -f4,6`
    set interrupts=`cat /proc/interrupts | egrep "eth[23]" | tr -s " " "\t" | cut -f 3,4`
    while ( 1 )
        sleep $interval
        set newstats=`ifconfig eth3 | egrep 'RX packets:' | tr -s ": " "\t" | cut -f4,6`
        set packets=`expr $newstats[1] - $stats[1]`
        set errors=`expr $newstats[2] - $stats[2]`
        set percentage=`expr $errors "*" 10000 / $packets`
        set packetspersec=`expr $packets / $interval `
        set date=`date "+%m/%d/%y %H:%M:%S"`
        set entries=`cat /proc/net/stat/rt_cache | tr -s " " "\t" | cut -f 1 | head -n 2 | tail -n 1`
        set newinterrupts=`cat /proc/interrupts | egrep "eth[23]" | tr -s " " "\t" | cut -f 3,4`
        set eth3int=`expr $newinterrupts[1] - $interrupts[1]`
        set eth2int=`expr $newinterrupts[4] - $interrupts[4]`
        echo "$date entries: $entries Pkts: $packets Err: $errors PPS: $packetspersec Drop %: 0.$percentage% Eth3RXInt: $eth3int Eth2TXInt: $eth2int"
        set stats=( $newstats )
        set interrupts=( $newinterrupts )
    end

4.3.3.2 Policy Routing Control Script

This script assumes that you will have two files, a policy route file and
a policy rule file.

The ROUTE file should have the following format:

    [ip of next hop] [outgoing interface] [table #]

Example 'routefile' contents:

    10.0.1.10 eth2 31
    10.0.1.11 eth2 32

The RULE file should have the following format:

    [CIDR BLOCK] [table #]

Example 'rulefile' contents:

    172.30.0.0/24 31
    172.16.0.0/22 32

Data to/from 172.30.0.0/24 would be sent to sensor 10.0.1.10. Data to/from
172.16.0.0/22 would be sent to sensor 10.0.1.11.
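
To make the rule file format concrete, the sketch below prints the 'ip rule'
commands that a rule file expands to, without touching the kernel. It is
written in plain sh rather than tcsh, and it is not part of the control
script: the function name rules_dry_run and the default input device eth3
are assumptions chosen to mirror the examples above.

```shell
#!/bin/sh
# rules_dry_run RULEFILE [DEVICE]
# Print (but do not run) the pair of "ip rule" commands generated for each
# "[CIDR BLOCK] [table #]" line in RULEFILE. DEVICE defaults to eth3.
# Lines that do not start with something like a.b.c.d/len are skipped.
rules_dry_run() {
    rulefile=$1
    dev=${2:-eth3}
    grep -E '^[0-9.]+/[0-9]+[[:space:]]+[0-9]+' "$rulefile" |
    while read -r range table rest; do
        # One rule for traffic from the block, one for traffic to it
        echo "ip rule add type unicast dev $dev from $range table $table"
        echo "ip rule add type unicast dev $dev to $range table $table"
    done
}
```

With the example 'rulefile' above, 'rules_dry_run rulefile' would print four
commands: a from/to pair for 172.30.0.0/24 using table 31 and a from/to pair
for 172.16.0.0/22 using table 32. Reviewing that output before a real run is
a cheap way to catch a typo in a CIDR block or table number.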

    #!/bin/tcsh
    #POLICY V1.0 script
    set policyrulefile=/opt/bin/policyrules
    set policyroutefile=/opt/bin/policyroutes
    set argc=`echo $argv | wc -w | tr -s " " "\t" | cut -f2`
    if ( $argc < 3 ) then
        echo Policy V1.00
        echo "Usage: policy [add|delete|show] [routefile] [rulefile]"
        exit
    endif
    #upper-case the command so add/ADD/Add are all accepted
    set command=`echo $1 | tr "[a-z]" "[A-Z]"`
    if ( ( $command == "ADD" ) || ( $command == "DELETE" ) ) then
        if ( ! -e $argv[2] ) then
            echo Route file $argv[2] does not exist
            exit
        endif
        if ( ! -e $argv[3] ) then
            echo Rule file $argv[3] does not exist
            exit
        endif
    endif
    set policyroutefile=$argv[2]
    set policyrulefile=$argv[3]
    set policyrulecount=`egrep "[0-9]\/[0-9]" $policyrulefile | wc -l | tr -s " " "\t" | cut -f2`
    set policyrules=`egrep "[0-9]\/[0-9]" $policyrulefile | tr -s " " "\t"`
    #route lines are matched on "address, whitespace, ethN"
    set policyroutecount=`egrep "[0-9][[:space:]]+eth" $policyroutefile | wc -w | tr -s " " "\t" | cut -f2`
    set policyroutes=`egrep "[0-9][[:space:]]+eth" $policyroutefile | tr -s " " "\t"`

    if ( $command == "ADD" ) then
        echo -n "Turning up..."
        #policyrules is a flat list: range, table, range, table, ...
        set alternate=0
        foreach policyrule ($policyrules )
            if ( $alternate ) then
                set alternate=0
                set table=$policyrule
                # echo "Adding rule: $range $table"
                echo -n "."
                /sbin/ip rule add type unicast dev eth3 from $range table $table
                /sbin/ip rule add type unicast dev eth3 to $range table $table
            else
                set alternate=1
                set range=$policyrule
            endif
        end
        #policyroutes is a flat list: gw, device, table, gw, device, table, ...
        set loop=0
        while ( $loop != $policyroutecount )
            set loop=`expr $loop + 1`
            set gw=$policyroutes[$loop]
            set loop=`expr $loop + 1`
            set device=$policyroutes[$loop]
            set loop=`expr $loop + 1`
            set table=$policyroutes[$loop]
            #echo "Adding default route: $gw $device $table"
            echo -n "+"
            /sbin/ip route add default via $gw dev $device table $table
        end
        /sbin/ip route flush cache
        echo 1 > /proc/sys/net/ipv4/ip_forward
        echo ""
    endif

    if ( $command == "DELETE" ) then
        echo -n "Turning down..."
        echo 0 > /proc/sys/net/ipv4/ip_forward
        set alternate=0
        foreach policyrule ($policyrules )
            if ( $alternate ) then
                set alternate=0
                set table=$policyrule
                #echo "Deleting rule: $range $table"
                echo -n "."
                /sbin/ip rule delete type unicast dev eth3 from $range table $table
                /sbin/ip rule delete type unicast dev eth3 to $range table $table
            else
                set alternate=1
                set range=$policyrule
            endif
        end
        set loop=0
        while ( $loop != $policyroutecount )
            set loop=`expr $loop + 1`
            set gw=$policyroutes[$loop]
            set loop=`expr $loop + 1`
            set device=$policyroutes[$loop]
            set loop=`expr $loop + 1`
            set table=$policyroutes[$loop]
            #echo "Deleting default route: $gw $device $table"
            echo -n "+"
            /sbin/ip route delete default via $gw dev $device table $table
        end
        /sbin/ip route flush cache
        echo ""
    endif

    if ( $command == "SHOW" ) then
        set tables=`cat $policyroutefile | egrep "[0-9]" | tr -s " " "\t" | cut -f3 | sort -u`
        ip rule list
        foreach table ($tables)
            ip route list table $table
        end
    endif

-----------------------------------------------------------------------------
5. Copyright

Copyright © 2005 Jeremy M. Guthrie

Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.1 or any later
version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts and no Back-Cover Texts. A copy of the
license is included in the section entitled "GNU Free Documentation
License".

-----------------------------------------------------------------------------
5.1. GNU Free Documentation License

Version 1.1, March 2000

Copyright (C) 2000 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.

-----------------------------------------------------------------------------
5.2. PREAMBLE

The purpose of this License is to make a manual, textbook, or other written
document "free" in the sense of freedom: to assure everyone the effective
freedom to copy and redistribute it, with or without modifying it, either
commercially or noncommercially. Secondarily, this License preserves for the
author and publisher a way to get credit for their work, while not being
considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of
the document must themselves be free in the same sense. It complements the
GNU General Public License, which is a copyleft license designed for free
software.

We have designed this License in order to use it for manuals for free
software, because free software needs free documentation: a free program
should come with manuals providing the same freedoms that the software does.
But this License is not limited to software manuals; it can be used for any
textual work, regardless of subject matter or whether it is published as a
printed book. We recommend this License principally for works whose purpose
is instruction or reference.

-----------------------------------------------------------------------------
5.3. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work that contains a notice
placed by the copyright holder saying it can be distributed under the terms
of this License. The "Document", below, refers to any such manual or work.
Any member of the public is a licensee, and is addressed as "you".

A "Modified Version" of the Document means any work containing the Document
or a portion of it, either copied verbatim, or with modifications and/or
translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of the
Document that deals exclusively with the relationship of the publishers or
authors of the Document to the Document's overall subject (or to related
matters) and contains nothing that could fall directly within that overall
subject. (For example, if the Document is in part a textbook of mathematics,
a Secondary Section may not explain any mathematics.) The relationship could
be a matter of historical connection with the subject or with related
matters, or of legal, commercial, philosophical, ethical or political
position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are
designated, as being those of Invariant Sections, in the notice that says
that the Document is released under this License.

The "Cover Texts" are certain short passages of text that are listed, as
Front-Cover Texts or Back-Cover Texts, in the notice that says that the
Document is released under this License.

A "Transparent" copy of the Document means a machine-readable copy,
represented in a format whose specification is available to the general
public, whose contents can be viewed and edited directly and
straightforwardly with generic text editors or (for images composed of
pixels) generic paint programs or (for drawings) some widely available
drawing editor, and that is suitable for input to text formatters or for
automatic translation to a variety of formats suitable for input to text
formatters. A copy made in an otherwise Transparent file format whose markup
has been designed to thwart or discourage subsequent modification by readers
is not Transparent. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII
without markup, Texinfo input format, LaTeX input format, SGML or XML using
a publicly available DTD, and standard-conforming simple HTML designed for
human modification.
Opaque formats include PostScript, PDF, proprietary formats that can be read and edited only by proprietary word processors, SG= ML or XML for which the DTD and/or processing tools are not generally availabl= e, and the machine-generated HTML produced by some word processors for output purposes only. The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text. =2D------------------------------------------------------------------------= =2D--- 5.4. VERBATIM COPYING You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduc= ed in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, y= ou may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3. You may also lend copies, under the same conditions stated above, and you m= ay publicly display copies. =2D------------------------------------------------------------------------= =2D--- 5.5. COPYING IN QUANTITY If you publish printed copies of the Document numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: =46ront-Cover Texts on the front cover, and Back-Cover Texts on the back co= ver. 
Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects. If the required texts for either cover are too voluminous to fit legibly, y= ou should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a publicly-accessible computer-network location containing a complete Transparent copy of the Document, free of added material, which the general network-using public has access to download anonymously at no charge using public-standard network protocols. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessib= le at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public. It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document. =2D------------------------------------------------------------------------= =2D--- 5.6. 
MODIFICATIONS You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version: A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). Y= ou may use the same title as a previous version if the original publisher = of that version gives permission. =20 B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (a= ll of its principal authors, if it has less than five). =20 C. State on the Title page the name of the publisher of the Modified Version, as the publisher. =20 D. Preserve all the copyright notices of the Document. =20 E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices. =20 F. Include, immediately after the copyright notices, a license notice givi= ng the public permission to use the Modified Version under the terms of th= is License, in the form shown in the Addendum below. =20 G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice. =20 H. Include an unaltered copy of this License. =20 I. Preserve the section entitled "History", and its title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. 
If there is no section entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence. =20 J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers = to gives permission. =20 K. In any section entitled "Acknowledgements" or "Dedications", preserve t= he section's title, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein. =20 L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles. =20 M. Delete any section entitled "Endorsements". Such a section may not be included in the Modified Version. =20 N. Do not retitle any existing section as "Endorsements" or to conflict in title with any Invariant Section. =20 If the Modified Version includes new front-matter sections or appendices th= at qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from a= ny other section titles. 
You may add a section entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard. You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text a= nd one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cove= r, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one. The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version. =2D------------------------------------------------------------------------= =2D--- 5.7. COMBINING DOCUMENTS You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections = of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there a= re multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. 
Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work. In the combination, you must combine any sections entitled "History" in the various original documents, forming one section entitled "History"; likewise combine any sections entitled "Acknowledgements", and any sections entitled "Dedications". You must delete all sections entitled "Endorsements." =2D------------------------------------------------------------------------= =2D--- 5.8. COLLECTIONS OF DOCUMENTS You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects. You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document. =2D------------------------------------------------------------------------= =2D--- 5.9. AGGREGATION WITH INDEPENDENT WORKS A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, does not as a whole count as a Modified Version of the Document, provided no compilation copyright is claimed for the compilation. Such a compilation is called an "aggregate", and this License does not apply to the other self-contained works thus compiled with the Document, on accou= nt of their being thus compiled, if they are not themselves derivative works of the Document. 
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one quarter of the entire aggregate, the Document's Cover Texts may be placed on covers that surround only the Document within the aggregate. Otherwise they must appear on covers around the whole aggregate. =2D------------------------------------------------------------------------= =2D--- 5.10. TRANSLATION Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License provided that you also include the original English version of this License. In case of a disagreement between the translation and the original English version of this License, t= he original English version will prevail. =2D------------------------------------------------------------------------= =2D--- 5.11. TERMINATION You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modif= y, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. =2D------------------------------------------------------------------------= =2D--- 5.12. FUTURE REVISIONS OF THIS LICENSE The Free Software Foundation may publish new, revised versions of the GNU =46ree Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to addre= ss new problems or concerns. 
See [http://www.gnu.org/copyleft/] http:// www.gnu.org/copyleft/. Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or a= ny later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. =2D------------------------------------------------------------------------= =2D--- 5.13. How to use this License for your documents To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page: =20 Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentati= on License, Version 1.1 or any later version published by the Free Software Foundation; with the Invariant Sections being LIST THEIR TITLES, with t= he Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.= A copy of the license is included in the section entitled "GNU Free Documentation License". =20 If you have no Invariant Sections, write "with no Invariant Sections" inste= ad of saying which ones are invariant. If you have no Front-Cover Texts, write "no Front-Cover Texts" instead of "Front-Cover Texts being LIST"; likewise for Back-Cover Texts. If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software. =2D-=20 =2D------------------------------------------------- Jeremy M. 
Guthrie jeremy.guthrie@berbee.com Senior Network Engineer Phone: 608-298-1061 Berbee Fax: 608-288-3007 5520 Research Park Drive NOC: 608-298-1102 Madison, WI 53711 --nextPart1281510.49u6NrilRT Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (GNU/Linux) iD8DBQBC5kzOqtjaBHGZBeURArqzAJ9lOTtI67pPPY1cLYlLsyrChzVDvgCfZI9a hR8tLmJkuzSaB07mx6SQSew= =/n6m -----END PGP SIGNATURE----- --nextPart1281510.49u6NrilRT-- From P@draigBrady.com Tue Jul 26 08:11:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 08:11:34 -0700 (PDT) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QFBSH9003766 for ; Tue, 26 Jul 2005 08:11:29 -0700 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.13.3/8.13.3) with ESMTP id j6QF9H4w006141; Tue, 26 Jul 2005 16:09:23 +0100 (IST) (envelope-from P@draigBrady.com) Message-ID: <42E6521D.6020601@draigBrady.com> Date: Tue, 26 Jul 2005 16:09:17 +0100 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 X-Accept-Language: en-us, en MIME-Version: 1.0 To: jeremy.guthrie@berbee.com CC: netdev@oss.sgi.com Subject: Re: Linux Policy Routing-Based IDS Load Balancer HOWTO References: <200507260946.38894.jeremy.guthrie@berbee.com> In-Reply-To: <200507260946.38894.jeremy.guthrie@berbee.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2786 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev Content-Length: 2248 Lines: 70 Jeremy M. Guthrie wrote: > Linux Policy Routing-Based IDS Load Balancer HOWTO great thanks! > 4.2.3.1 CPUs should only following one NIC. > > If you look at the output below you can see that CPU0 is taking the > interrupts for eth3. 
CPU1 is taking interrupts for eth2 & eth0. Optimising
> any system relies on keep thrashing to a minimum.

I recommend you rename these interfaces throughout the doc, so that you can refer to, and write generic scripts against, these known names. For example:

ip link set dev eth2 name ids1
ip link set dev eth3 name ids2

> As a result I highly
> recommend disable IRQ Balancing.
>
> make menuconfig for your kernel config
> Select "Processor type and features"
> Disable "Enable kernel irq balancing"
> Rebuild your kernel and reboot.
>
> You will have to poke around /proc to set which CPU an interrupt binds to.
> Here is what was used to set the interrupt/CPU bindings down below:
> echo 01 > /proc/irq/18/smp_affinity
> echo 02 > /proc/irq/20/smp_affinity
>
> The value used is expressed in powers of two. ie. CPU3 would actually be
> 04.
>
> cat /proc/interrupts
>            CPU0       CPU1
>   0: 3184569581 1789102599  IO-APIC-edge   timer
>   1:       1005        218  IO-APIC-edge   i8042
>   7:          0          0  IO-APIC-level  ohci_hcd
>   8:          1          1  IO-APIC-edge   rtc
>  12:        122         74  IO-APIC-edge   i8042
>  14:          2          0  IO-APIC-edge   ide0
>  18:  995373697       5139  IO-APIC-level  eth3
>  20:          2 1378253801  IO-APIC-level  eth2
>  27:    7542100    9352305  IO-APIC-level  eth0
>  28:    4150402   13187680  IO-APIC-level  aic7xxx
>  30:          0          0  IO-APIC-level  acpi
> NMI:          0          0
> LOC:  679927478  679903506
> ERR:          0
> MIS:          0

As an example of a generic script:

for iface in ids1 ids2; do
    int=`grep $iface\$ /proc/interrupts | cut -d: -f1`
    int=`echo $int` #strip whitespace
    # masks are CPU bitmasks: 01 = CPU0, 02 = CPU1, 04 = CPU2
    [ "$iface" = "ids1" ] && mask=01 || mask=04
    echo $mask > /proc/irq/$int/smp_affinity
done

> 4.3.3 Basic scripts

I find "watch" very useful, for example:

watch -n1 'ethtool -S ids1'

--
Pádraig Brady - http://www.pixelbeat.org
--
From simon@devicescape.com Tue Jul 26 11:05:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 11:05:14 -0700 (PDT) Received: from dhost002-46.dex002.intermedia.net (dhost002-46.dex002.intermedia.net [64.78.21.140]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id
j6QI5CH9019053 for ; Tue, 26 Jul 2005 11:05:12 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C5920C.4C252365" Subject: Why is packet socket checked before bridge in netif_receive_skb? Date: Tue, 26 Jul 2005 11:03:17 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Why is packet socket checked before bridge in netif_receive_skb? Thread-Index: AcWSDEy8jqFDnLBxQUG/o79KeTEBcA== From: "Simon Barber" To: X-archive-position: 2787 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: simon@devicescape.com Precedence: bulk X-list: netdev Content-Length: 2135 Lines: 58

This is a multi-part message in MIME format.

------_=_NextPart_001_01C5920C.4C252365 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

In netif_receive_skb there is code that checks the ptype_all handlers (protocol handlers that want to handle all protocol types) before the bridge hook, and code to check the specific protocol handlers after the bridge hook.

The protocol handlers are also used to implement packet sockets. - Why is the all handler checked before the bridge hook?

(I'm reading code from 2.4.26).

Simon

------_=_NextPart_001_01C5920C.4C252365 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
------_=_NextPart_001_01C5920C.4C252365-- From linville@tuxdriver.com Tue Jul 26 11:20:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 11:21:05 -0700 (PDT) Received: from ra.tuxdriver.com (ra.tuxdriver.com [24.172.12.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QIKwH9020842 for ; Tue, 26 Jul 2005 11:20:59 -0700 Received: from bilbo.tuxdriver.com (azure.tuxdriver.com [24.172.12.5]) by ra.tuxdriver.com (8.13.3/8.13.3) with ESMTP id j6QIBPWB027666 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 26 Jul 2005 14:11:26 -0400 Received: from bilbo.tuxdriver.com (localhost.localdomain [127.0.0.1]) by bilbo.tuxdriver.com (8.13.1/8.13.1) with ESMTP id j6QIIrUH010105; Tue, 26 Jul 2005 14:18:53 -0400 Received: (from linville@localhost) by bilbo.tuxdriver.com (8.13.1/8.13.1/Submit) id j6QIIqaD010104; Tue, 26 Jul 2005 14:18:52 -0400 Date: Tue, 26 Jul 2005 14:18:52 -0400 From: "John W. Linville" To: Simon Barber Cc: netdev@oss.sgi.com Subject: Re: Why is packet socket checked before bridge in netif_receive_skb? Message-ID: <20050726181850.GA9881@tuxdriver.com> Mail-Followup-To: Simon Barber , netdev@oss.sgi.com References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 2788 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linville@tuxdriver.com Precedence: bulk X-list: netdev Content-Length: 347 Lines: 12 On Tue, Jul 26, 2005 at 11:03:17AM -0700, Simon Barber wrote: > The protocol handlers are also used to implement packet sockets. - Why > is the all handler checked before the bridge hook? Perhaps so that one can look at frames entering on a specific interface rather than the bridge as a whole? John -- John W. 
Linville linville@tuxdriver.com From simon@devicescape.com Tue Jul 26 11:27:02 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 11:27:06 -0700 (PDT) Received: from dhost002-46.dex002.intermedia.net (dhost002-46.dex002.intermedia.net [64.78.21.140]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QIR2H9021494 for ; Tue, 26 Jul 2005 11:27:02 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: Why is packet socket checked before bridge in netif_receive_skb? Date: Tue, 26 Jul 2005 11:25:02 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Why is packet socket checked before bridge in netif_receive_skb? Thread-Index: AcWSDoDgjwiaiMSoR2O6TR3lgr81HgAAEKGw From: "Simon Barber" To: "John W. Linville" Cc: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j6QIR2H9021494 X-archive-position: 2789 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: simon@devicescape.com Precedence: bulk X-list: netdev Content-Length: 911 Lines: 31 The odd thing is that the 'all protocols' handlers are checked before the bridge hook, but the protocol type specific handlers are checked after. Hence if you bind your packet socket to 'all' protocols you get packets before bridging, but if you bind to a specific protocol you get packets after the bridge hook. Simon -----Original Message----- From: John W. Linville [mailto:linville@tuxdriver.com] Sent: Tuesday, July 26, 2005 11:19 AM To: Simon Barber Cc: netdev@oss.sgi.com Subject: Re: Why is packet socket checked before bridge in netif_receive_skb? On Tue, Jul 26, 2005 at 11:03:17AM -0700, Simon Barber wrote: > The protocol handlers are also used to implement packet sockets. - Why > is the all handler checked before the bridge hook? 
Perhaps so that one can look at frames entering on a specific interface rather than the bridge as a whole? John -- John W. Linville linville@tuxdriver.com From davem@davemloft.net Tue Jul 26 13:00:14 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 13:00:18 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QK09H9030483 for ; Tue, 26 Jul 2005 13:00:14 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DxVZU-0002Qr-5i; Tue, 26 Jul 2005 12:58:36 -0700 Date: Tue, 26 Jul 2005 12:58:36 -0700 (PDT) Message-Id: <20050726.125836.58466212.davem@davemloft.net> To: simon@devicescape.com Cc: netdev@oss.sgi.com Subject: Re: Why is packet socket checked before bridge in netif_receive_skb? From: "David S. Miller" In-Reply-To: References: X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2790 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 378 Lines: 9 From: "Simon Barber" Subject: Why is packet socket checked before bridge in netif_receive_skb? Date: Tue, 26 Jul 2005 11:03:17 -0700 > The protocol handlers are also used to implement packet sockets. - Why > is the all handler checked before the bridge hook? Because we want packet sniffers to see the packet before the bridging layer decapsulates it. 
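The ordering discussed in this thread can be sketched in userspace. This is a minimal, hypothetical model, not kernel code: the names `stage`, `trace`, `record` and `model_receive` are invented for illustration, and the sketch assumes that the bridge hook takes the frame whenever the receiving port belongs to a bridge.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of the receive-path ordering described
 * in this thread; none of these names are kernel APIs. */
enum stage { TAP_ALL, BRIDGE_HOOK, PROTO_SPECIFIC };

#define MAX_STAGES 3

struct trace {
    enum stage seen[MAX_STAGES];  /* stages visited, in order */
    size_t n;                     /* number of stages recorded */
};

static void record(struct trace *t, enum stage s)
{
    assert(t->n < MAX_STAGES);
    t->seen[t->n++] = s;
}

/* Returns 1 if protocol-specific handlers saw the frame on the
 * receiving device, 0 if the bridge took it first. */
static int model_receive(struct trace *t, int port_is_bridged)
{
    t->n = 0;
    record(t, TAP_ALL);            /* "all protocols" taps run first */
    if (port_is_bridged) {
        record(t, BRIDGE_HOOK);    /* assumption: bridge consumes the frame */
        return 0;
    }
    record(t, PROTO_SPECIFIC);     /* type-specific handlers run last */
    return 1;
}
```

With `port_is_bridged` set, the trace holds only the "all protocols" tap and the bridge hook, which is the case a sniffer on a bridge port relies on; with it clear, the type-specific handlers run after the taps.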
From simon@devicescape.com Tue Jul 26 14:24:49 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 26 Jul 2005 14:24:53 -0700 (PDT) Received: from dhost002-46.dex002.intermedia.net (dhost002-46.dex002.intermedia.net [64.78.21.140]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6QLOnH9003346 for ; Tue, 26 Jul 2005 14:24:49 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: Why is packet socket checked before bridge in netif_receive_skb? Date: Tue, 26 Jul 2005 14:22:54 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Why is packet socket checked before bridge in netif_receive_skb? Thread-Index: AcWSHF0Gj4AD5v6YSY+jjFOmaI5EVQAC1zdA From: "Simon Barber" To: "David S. Miller" Cc: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j6QLOnH9003346 X-archive-position: 2791 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: simon@devicescape.com Precedence: bulk X-list: netdev Content-Length: 1009 Lines: 30 Ok - and I'm guessing that the behavior is different for the protocol specific case (i.e. packet socket bound to a specific protocol type) because no application has needed it to be the same? (IE sniffers normally bind to all protocols, and protocol specific apps like DHCP servers don't need to see frames before the bridge hook - they would normally see them on the bridge itself). Simon -----Original Message----- From: David S. Miller [mailto:davem@davemloft.net] Sent: Tuesday, July 26, 2005 12:59 PM To: Simon Barber Cc: netdev@oss.sgi.com Subject: Re: Why is packet socket checked before bridge in netif_receive_skb? From: "Simon Barber" Subject: Why is packet socket checked before bridge in netif_receive_skb?
Date: Tue, 26 Jul 2005 11:03:17 -0700 > The protocol handlers are also used to implement packet sockets. - Why > is the all handler checked before the bridge hook? Because we want packet sniffers to see the packet before the bridging layer decapsulates it. From mmporter@cox.net Wed Jul 27 10:44:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 27 Jul 2005 10:44:54 -0700 (PDT) Received: from fed1rmmtao12.cox.net (fed1rmmtao12.cox.net [68.230.241.27]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6RHiiH9012548 for ; Wed, 27 Jul 2005 10:44:44 -0700 Received: from liberty.homelinux.org ([70.190.160.125]) by fed1rmmtao12.cox.net (InterMail vM.6.01.04.00 201-2131-118-20041027) with ESMTP id <20050727174238.DREX550.fed1rmmtao12.cox.net@liberty.homelinux.org>; Wed, 27 Jul 2005 13:42:38 -0400 Received: (from mmporter@localhost) by liberty.homelinux.org (8.9.3/8.9.3/Debian 8.9.3-21) id KAA01926; Wed, 27 Jul 2005 10:42:47 -0700 Date: Wed, 27 Jul 2005 10:42:47 -0700 From: Matt Porter To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: wfarnsworth@mvista.com Subject: [PATCH] emac: add bamboo support Message-ID: <20050727104247.C1114@cox.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i X-archive-position: 2793 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mporter@kernel.crashing.org Precedence: bulk X-list: netdev Content-Length: 2976 Lines: 105 Adds support for the Bamboo board phys in the EMAC driver. Please apply. 
Signed-off-by: Wade Farnsworth Signed-off-by: Matt Porter diff -uprN linux-2.6.12/drivers/net/ibm_emac/ibm_emac_phy.c linux-2.6.12-440ep/drivers/net/ibm_emac/ibm_emac_phy.c --- linux-2.6.12/drivers/net/ibm_emac/ibm_emac_phy.c 2005-06-17 12:48:29.000000000 -0700 +++ linux-2.6.12-440ep/drivers/net/ibm_emac/ibm_emac_phy.c 2005-07-25 11:32:38.000000000 -0700 @@ -24,6 +24,7 @@ #include #include #include +#include #include "ibm_emac_phy.h" @@ -78,6 +79,45 @@ static int cis8201_init(struct mii_phy * return 0; } +#ifdef CONFIG_BAMBOO +static int ac104_init(struct mii_phy *phy) +{ + /* + * SW2 on the Bamboo is used for ethernet configuration and is accessed + * via the CONFIG2 register in the FPGA. If the ANEG pin is set, + * overwrite the supported features with the settings in SW2. + */ + u8 *config2_addr, config2_val; + config2_addr = ioremap64(BAMBOO_FPGA_CONFIG2_REG_ADDR, 0x8); + config2_val = * config2_addr; + iounmap(config2_addr); + if (BAMBOO_AUTONEGOTIATE(config2_val)) + return 0; + phy->def->features = SUPPORTED_TP | SUPPORTED_MII; + if (BAMBOO_FORCE_100Mbps(config2_val)) { + phy->speed = SPEED_100; + if (BAMBOO_FULL_DUPLEX_EN(config2_val)) { + phy->def->features |= SUPPORTED_100baseT_Full; + phy->duplex = DUPLEX_FULL; + } else { + phy->def->features |= SUPPORTED_100baseT_Half; + phy->duplex = DUPLEX_HALF; + } + } else { + phy->speed = SPEED_10; + if (BAMBOO_FULL_DUPLEX_EN(config2_val)) { + phy->def->features |= SUPPORTED_10baseT_Full; + phy->duplex = DUPLEX_FULL; + } else { + phy->def->features |= SUPPORTED_10baseT_Half; + phy->duplex = DUPLEX_HALF; + } + } + + return 0; +} +#endif + static int genmii_setup_aneg(struct mii_phy *phy, u32 advertise) { u16 ctl, adv; @@ -226,6 +266,17 @@ static struct mii_phy_ops cis8201_phy_op read_link:cis8201_read_link }; +/* AC104 phy ops */ +static struct mii_phy_ops ac104_phy_ops = { +#ifdef CONFIG_BAMBOO + init:ac104_init, +#endif + setup_aneg:genmii_setup_aneg, + setup_forced:genmii_setup_forced, + 
poll_link:genmii_poll_link, + read_link:genmii_read_link +}; + /* Generic implementation for most 10/100 PHYs */ static struct mii_phy_ops generic_phy_ops = { setup_aneg:genmii_setup_aneg, @@ -234,6 +285,15 @@ static struct mii_phy_ops generic_phy_op read_link:genmii_read_link }; +static struct mii_phy_def ac104_phy_def = { + phy_id:0x00225540, + phy_id_mask:0x00fffff0, + name:"AC104 Ethernet", + features:MII_BASIC_FEATURES, + magic_aneg:0, + ops:&ac104_phy_ops +}; + static struct mii_phy_def cis8201_phy_def = { phy_id:0x000fc410, phy_id_mask:0x000ffff0, @@ -254,6 +314,7 @@ static struct mii_phy_def genmii_phy_def static struct mii_phy_def *mii_phy_table[] = { &cis8201_phy_def, + &ac104_phy_def, &genmii_phy_def, NULL }; From ebs@ebshome.net Thu Jul 28 00:01:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 00:01:47 -0700 (PDT) Received: from gate.ebshome.net (gate.ebshome.net [64.81.67.12]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6S71eH9030323 for ; Thu, 28 Jul 2005 00:01:41 -0700 Received: (qmail 18047 invoked by uid 1000); 27 Jul 2005 23:59:44 -0700 Date: Wed, 27 Jul 2005 23:59:44 -0700 From: Eugene Surovegin To: Matt Porter Cc: jgarzik@pobox.com, netdev@oss.sgi.com, wfarnsworth@mvista.com Subject: Re: [PATCH] emac: add bamboo support Message-ID: <20050728065943.GA16041@gate.ebshome.net> Mail-Followup-To: Matt Porter , jgarzik@pobox.com, netdev@oss.sgi.com, wfarnsworth@mvista.com References: <20050727104247.C1114@cox.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050727104247.C1114@cox.net> X-ICQ-UIN: 1193073 X-Operating-System: Linux i686 X-PGP-Key: http://www.ebshome.net/pubkey.asc User-Agent: Mutt/1.5.5.1i X-archive-position: 2794 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebs@ebshome.net Precedence: bulk X-list: netdev Content-Length: 966 Lines: 30 On Wed, Jul 27, 2005 at 
10:42:47AM -0700, Matt Porter wrote: > Adds support for the Bamboo board phys in the EMAC driver. > Please apply. > > Signed-off-by: Wade Farnsworth > Signed-off-by: Matt Porter > [snip] > +#ifdef CONFIG_BAMBOO > +static int ac104_init(struct mii_phy *phy) > +{ > + /* > + * SW2 on the Bamboo is used for ethernet configuration and is accessed > + * via the CONFIG2 register in the FPGA. If the ANEG pin is set, > + * overwrite the supported features with the settings in SW2. > + */ I wonder, how this SW2 works. Is it just a way to tell software not to use autoneg and force some settings, or it disables autoneg on hw level (I'm kinda doubt that)? If this is just some board specific configuration option which doesn't affect this PHY directly, let's drop this stuff completely and always use autoneg, if user wants to force something - he should ethtool. -- Eugene From dada1@cosmosbay.com Thu Jul 28 08:54:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 08:54:14 -0700 (PDT) Received: from gw1.cosmosbay.com (gw1.cosmosbay.com [62.23.185.226]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SFs7H9004841 for ; Thu, 28 Jul 2005 08:54:08 -0700 Received: from [172.16.0.131] (edumazet-port [172.16.0.131]) by gw1.cosmosbay.com (8.13.3/8.13.3) with ESMTP id j6SFq5sN007017; Thu, 28 Jul 2005 17:52:06 +0200 Message-ID: <42E8FF24.9070009@cosmosbay.com> Date: Thu, 28 Jul 2005 17:52:04 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. 
Miller" CC: netdev@oss.sgi.com Subject: [PATCH] Add prefetches in net/ipv4/route.c References: <20050704.154712.63128211.davem@davemloft.net> <42C9BE69.2070008@cosmosbay.com> <42C9BEF6.4080402@cosmosbay.com> <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> In-Reply-To: <42CA390C.9000801@cosmosbay.com> Content-Type: multipart/mixed; boundary="------------060607040500090607010609" X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [172.16.8.80]); Thu, 28 Jul 2005 17:52:06 +0200 (CEST) X-archive-position: 2795 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 2226 Lines: 64 This is a multi-part message in MIME format. --------------060607040500090607010609 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit [NET] : Adds prefetches in route hash list traversals. The actual code doesnt use a prefetch enabled macro like list_for_each_rcu(), so manually add prefetch() hints. 
Signed-off-by: Eric Dumazet --------------060607040500090607010609 Content-Type: text/plain; name="route.prefetches" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="route.prefetches" diff -Nru linux-2.6.13-rc3/net/ipv4/route.c linux-2.6.13-rc3-ed/net/ipv4/route.c --- linux-2.6.13-rc3/net/ipv4/route.c 2005-07-13 06:46:46.000000000 +0200 +++ linux-2.6.13-rc3-ed/net/ipv4/route.c 2005-07-28 17:20:21.000000000 +0200 @@ -1148,6 +1148,7 @@ while ((rth = rcu_dereference(*rthp)) != NULL) { struct rtable *rt; + prefetch(rth->u.rt_next); if (rth->fl.fl4_dst != daddr || rth->fl.fl4_src != skeys[i] || rth->fl.fl4_tos != tos || @@ -1401,6 +1402,7 @@ rcu_read_lock(); for (rth = rcu_dereference(rt_hash_table[hash].chain); rth; rth = rcu_dereference(rth->u.rt_next)) { + prefetch(rth->u.rt_next); if (rth->fl.fl4_dst == daddr && rth->fl.fl4_src == skeys[i] && rth->rt_dst == daddr && @@ -2094,6 +2096,7 @@ rcu_read_lock(); for (rth = rcu_dereference(rt_hash_table[hash].chain); rth; rth = rcu_dereference(rth->u.rt_next)) { + prefetch(rth->u.rt_next); if (rth->fl.fl4_dst == daddr && rth->fl.fl4_src == saddr && rth->fl.iif == iif && @@ -2565,6 +2568,7 @@ rcu_read_lock_bh(); for (rth = rcu_dereference(rt_hash_table[hash].chain); rth; rth = rcu_dereference(rth->u.rt_next)) { + prefetch(rth->u.rt_next); if (rth->fl.fl4_dst == flp->fl4_dst && rth->fl.fl4_src == flp->fl4_src && rth->fl.iif == 0 && @@ -2819,6 +2823,7 @@ rcu_read_lock_bh(); for (rt = rcu_dereference(rt_hash_table[h].chain), idx = 0; rt; rt = rcu_dereference(rt->u.rt_next), idx++) { + prefetch(rt->u.rt_next); if (idx < s_idx) continue; skb->dst = dst_clone(&rt->u.dst); --------------060607040500090607010609-- From wfarnsworth@mvista.com Thu Jul 28 10:28:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 10:28:17 -0700 (PDT) Received: from av.mvista.com (gateway-1237.mvista.com [12.44.186.158]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SHSBH9009897 for ; Thu, 28 Jul 
2005 10:28:11 -0700 Received: from rhino.az.mvista.com (av [127.0.0.1]) by av.mvista.com (8.9.3/8.9.3) with ESMTP id KAA10388; Thu, 28 Jul 2005 10:25:07 -0700 Subject: Re: [PATCH] emac: add bamboo support From: Wade Farnsworth To: Eugene Surovegin Cc: Matt Porter , jgarzik@pobox.com, netdev@oss.sgi.com In-Reply-To: <20050728065943.GA16041@gate.ebshome.net> References: <20050727104247.C1114@cox.net> <20050728065943.GA16041@gate.ebshome.net> Content-Type: text/plain Message-Id: <1122571506.22059.146.camel@rhino.az.mvista.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.3.92 (Preview Release) Date: 28 Jul 2005 10:25:07 -0700 Content-Transfer-Encoding: 7bit X-archive-position: 2796 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wfarnsworth@mvista.com Precedence: bulk X-list: netdev Content-Length: 2905 Lines: 64 On Wed, 2005-07-27 at 23:59, Eugene Surovegin wrote: > On Wed, Jul 27, 2005 at 10:42:47AM -0700, Matt Porter wrote: > > Adds support for the Bamboo board phys in the EMAC driver. > > Please apply. > > > > Signed-off-by: Wade Farnsworth > > Signed-off-by: Matt Porter > > > > [snip] > > > +#ifdef CONFIG_BAMBOO > > +static int ac104_init(struct mii_phy *phy) > > +{ > > + /* > > + * SW2 on the Bamboo is used for ethernet configuration and is accessed > > + * via the CONFIG2 register in the FPGA. If the ANEG pin is set, > > + * overwrite the supported features with the settings in SW2. > > + */ > > I wonder, how this SW2 works. Is it just a way to tell software not to > use autoneg and force some settings, or it disables autoneg on hw > level (I'm kinda doubt that)? Yes, SW2 is completely ignored by the PHY. > > If this is just some board specific configuration option which doesn't > affect this PHY directly, let's drop this stuff completely and always > use autoneg, if user wants to force something - he should ethtool. 
I guess my comment does not explain the real reason for this function.

The Rev. 0 Bamboo has improperly biased RJ45 sockets. This causes the PHY to work only at 10Mbps. One can remove the inductors L17 and L18 from the board to enable 100Mbps, but this also disables 10Mbps. Attempting to bring up ethernet at the unavailable speed causes ethernet to hang until the board is reset. AMCC has no plans to replace the Rev. 0, so there are users who will need some reliable way to determine which speed is available and select that speed at boot.

A previous version of the patch did this by attempting to determine the board rev. If a rev. 0 was found, it kept the speed determined by the firmware, since we know that works. Rev. 1 boards would be allowed to use both speeds. The board rev was determined by reading the CPU rev from the PVR. This assumes that all rev. 0 boards have rev A CPUs and all rev. 1 boards have rev B CPUs. I believe you had some reservations about this method.

PIBS uses a method similar to what this patch does. In order to tftpboot on a 10Mbps-enabled rev. 0, the ANEG pin and the Force 100Mbps pin must both be disabled. Similarly, the patch reads those same pins and only allows the speed selected.

Additionally, the patch reads the Duplex pin and determines which duplex mode is available. The duplex setting has no bearing on the above bug, so this part can be safely removed. However, since we're already determining PHY speed using SW2, I think users would expect us to determine the duplex mode in the same manner.

I realize that this departs from what the other boards/phys do in this driver, but we do need some way for the appropriate speed to be selected on the rev. 0 boards. If you know of a better way of doing this please let me know.
-Wade Farnsworth From ebs@ebshome.net Thu Jul 28 10:33:06 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 10:33:10 -0700 (PDT) Received: from gate.ebshome.net (gate.ebshome.net [64.81.67.12]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SHX6H9010666 for ; Thu, 28 Jul 2005 10:33:06 -0700 Received: (qmail 27196 invoked by uid 1000); 28 Jul 2005 10:31:09 -0700 Date: Thu, 28 Jul 2005 10:31:09 -0700 From: Eugene Surovegin To: Wade Farnsworth Cc: Matt Porter , jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [PATCH] emac: add bamboo support Message-ID: <20050728173109.GB16041@gate.ebshome.net> Mail-Followup-To: Wade Farnsworth , Matt Porter , jgarzik@pobox.com, netdev@oss.sgi.com References: <20050727104247.C1114@cox.net> <20050728065943.GA16041@gate.ebshome.net> <1122571506.22059.146.camel@rhino.az.mvista.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1122571506.22059.146.camel@rhino.az.mvista.com> X-ICQ-UIN: 1193073 X-Operating-System: Linux i686 X-PGP-Key: http://www.ebshome.net/pubkey.asc User-Agent: Mutt/1.5.5.1i X-archive-position: 2797 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebs@ebshome.net Precedence: bulk X-list: netdev Content-Length: 545 Lines: 15 On Thu, Jul 28, 2005 at 10:25:07AM -0700, Wade Farnsworth wrote: > If you know of a better way of doing this please let me know. Yes, I do and IIRC told you last time. Make it generic - move all board specific stuff where it belongs - board support files. Add additional field(s) to ocp_func_emac_data which will allow EMAC driver to override PHY modes, the same way we specify PHY id now, for example. With this approach, we won't have to add another board specific crap to the driver next time hw vendor fuck ups their hw. 
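The approach suggested above (keep board knowledge in the board support files and let the driver only apply an override) could be sketched roughly as follows. This is an illustration only: the structure, the `phy_feat_exc` field, the `FEAT_*` bits and the helper name are all hypothetical stand-ins, not the actual contents of ocp_func_emac_data or of the EMAC driver.

```c
#include <stdint.h>

/* Hypothetical feature bits, standing in for the SUPPORTED_* flags. */
#define FEAT_10_HALF   (1u << 0)
#define FEAT_10_FULL   (1u << 1)
#define FEAT_100_HALF  (1u << 2)
#define FEAT_100_FULL  (1u << 3)

/*
 * Hypothetical per-EMAC data filled in by the board support file.
 * phy_feat_exc is an assumed field: the set of PHY features this
 * particular board cannot use.
 */
struct emac_board_data {
    int      phy_id;        /* PHY address chosen by the board code */
    uint32_t phy_feat_exc;  /* features to exclude on this board    */
};

/*
 * Generic driver side: mask out whatever the board excluded.
 * No board name or #ifdef appears in the driver.
 */
static uint32_t emac_usable_features(uint32_t phy_features,
                                     const struct emac_board_data *bd)
{
    return phy_features & ~bd->phy_feat_exc;
}
```

With something like this, a board file for an unmodified rev. 0 board would set the exclusion mask to the 100Mbps bits, and the driver itself would stay free of board-specific code.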
-- Eugene From davem@davemloft.net Thu Jul 28 12:41:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 12:41:39 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SJfIH9021111 for ; Thu, 28 Jul 2005 12:41:22 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DyEDy-0000tn-Lh; Thu, 28 Jul 2005 12:39:22 -0700 Date: Thu, 28 Jul 2005 12:39:22 -0700 (PDT) Message-Id: <20050728.123922.126777020.davem@davemloft.net> To: dada1@cosmosbay.com Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c From: "David S. Miller" In-Reply-To: <42E8FF24.9070009@cosmosbay.com> References: <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> <42E8FF24.9070009@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2798 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 301 Lines: 9 From: Eric Dumazet Date: Thu, 28 Jul 2005 17:52:04 +0200 > [NET] : Adds prefetches in route hash list traversals. > > The actual code doesnt use a prefetch enabled macro like > list_for_each_rcu(), so manually add prefetch() hints. and the measured performance improvement is? 
From dada1@cosmosbay.com Thu Jul 28 13:58:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 13:58:38 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SKwYH9024586 for ; Thu, 28 Jul 2005 13:58:35 -0700 Received: from [192.168.30.10] (84-4-81-154.adslgp.cegetel.net [84.4.81.154]) by smtp.cegetel.net (Postfix) with ESMTP id 5621D3182B4; Thu, 28 Jul 2005 22:56:33 +0200 (CEST) Message-ID: <42E94680.8060309@cosmosbay.com> Date: Thu, 28 Jul 2005 22:56:32 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c References: <20050704.160140.21591849.davem@davemloft.net> <42CA390C.9000801@cosmosbay.com> <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> In-Reply-To: <20050728.123922.126777020.davem@davemloft.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2799 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 929 Lines: 27 David S. Miller a écrit : > From: Eric Dumazet > Date: Thu, 28 Jul 2005 17:52:04 +0200 > > >>[NET] : Adds prefetches in route hash list traversals. >> >>The actual code doesnt use a prefetch enabled macro like >>list_for_each_rcu(), so manually add prefetch() hints. > > > and the measured performance improvement is? > > Half the improvement we could get if only fl.fl4_dst, and other fields were not so far away from the u.rt_next field. 
(0xE8 on x86_64)

For good performance, one should of course choose a big route cache hash size, and in that case prefetches are useless and even cost an extra load: prefetch(rth->u.rt_next) implies loading the rt_next pointer at the start of the rth structure, while the fl fields are on a different cache line.

But in case of DDOS, prefetches are a win.

I did not test a solution using two prefetches...

Other patches using prefetches will follow.

Eric

From davem@davemloft.net Thu Jul 28 14:00:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 14:00:30 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SL0PH9024880 for ; Thu, 28 Jul 2005 14:00:25 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DyFSV-0001A2-3V; Thu, 28 Jul 2005 13:58:27 -0700 Date: Thu, 28 Jul 2005 13:58:26 -0700 (PDT) Message-Id: <20050728.135826.63129319.davem@davemloft.net> To: dada1@cosmosbay.com Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c From: "David S. Miller" In-Reply-To: <42E94680.8060309@cosmosbay.com> References: <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> <42E94680.8060309@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2800 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 162 Lines: 6

From: Eric Dumazet Date: Thu, 28 Jul 2005 22:56:32 +0200

> But in case of DDOS, prefetches are a win.

Numbers please, I'm simply curious.
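For readers following the thread, the traversal pattern under discussion can be sketched in plain C roughly as below. This is not the route.c code: `struct entry`, its padding and `chain_lookup()` are hypothetical, and the GCC/Clang builtin `__builtin_prefetch()` is used here in place of the kernel's prefetch() helper. The padding is only there to place the compared keys well away from the link pointer, which is the layout problem described above.

```c
#include <stddef.h>

/*
 * Hypothetical hash-chain node: the link sits at the start of the
 * structure, the compared keys sit much further in (likely on another
 * cache line), loosely modelled on the rt_next / fl layout.
 */
struct entry {
    struct entry *next;
    char          pad[224];
    unsigned int  dst;
    unsigned int  src;
};

static struct entry *chain_lookup(struct entry *head,
                                  unsigned int dst, unsigned int src)
{
    struct entry *e;

    for (e = head; e; e = e->next) {
        /*
         * Hint only: ask for the next node while this one is compared.
         * Reading e->next is itself a load from the start of *e, which
         * is the extra cost on short chains.
         */
        __builtin_prefetch(e->next);
        if (e->dst == dst && e->src == src)
            return e;
    }
    return NULL;
}
```

On a short chain the hint buys nothing, since there is rarely a next node to fetch. On long chains, as in the DDOS case mentioned above, the next node may already be on its way while the current keys are being compared. Whether that is a net win is exactly what would have to be measured.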
From dada1@cosmosbay.com Thu Jul 28 14:26:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 14:26:37 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SLQXH9026949 for ; Thu, 28 Jul 2005 14:26:33 -0700 Received: from [192.168.30.10] (84-4-81-154.adslgp.cegetel.net [84.4.81.154]) by smtp.cegetel.net (Postfix) with ESMTP id CB0D931809D; Thu, 28 Jul 2005 23:24:35 +0200 (CEST) Message-ID: <42E94D11.4090002@cosmosbay.com> Date: Thu, 28 Jul 2005 23:24:33 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c References: <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> <42E94680.8060309@cosmosbay.com> <20050728.135826.63129319.davem@davemloft.net> In-Reply-To: <20050728.135826.63129319.davem@davemloft.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 2801 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 902 Lines: 34 David S. Miller a écrit : > From: Eric Dumazet > Date: Thu, 28 Jul 2005 22:56:32 +0200 > > >>But in case of DDOS, prefetches are a win. > > > Numbers please, I'm simply curious. > > I have no profiling info for this exact patch, I'm sorry David. On a dual opteron machine, this thing from ip_route_input() is very expensive : RT_CACHE_STAT_INC(in_hlist_search); ip_route_input() use a total of 3.4563 % of one cpu, but this 'increment' takes 1.20 % !!! 
0.0047 mov 2123529(%rip),%rax # ffffffff804b4a60 1.1898 not %rax mov %gs:0x34,%edx 0.0042 movslq %edx,%rdx mov (%rax,%rdx,8),%rax incl 0x38(%rax) Sometime I wonder if oprofile can be trusted :( Maybe we should increment a counter on the stack and do a final if (counter != 0) RT_CACHE_STAT_ADD(in_hlist_search, counter); Eric From davem@davemloft.net Thu Jul 28 15:46:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 15:46:28 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SMkNH9031611 for ; Thu, 28 Jul 2005 15:46:23 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1DyH6y-0001Sq-UE; Thu, 28 Jul 2005 15:44:20 -0700 Date: Thu, 28 Jul 2005 15:44:20 -0700 (PDT) Message-Id: <20050728.154420.21594218.davem@davemloft.net> To: dada1@cosmosbay.com Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c From: "David S. Miller" In-Reply-To: <42E94D11.4090002@cosmosbay.com> References: <42E94680.8060309@cosmosbay.com> <20050728.135826.63129319.davem@davemloft.net> <42E94D11.4090002@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2802 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 349 Lines: 11 From: Eric Dumazet Date: Thu, 28 Jul 2005 23:24:33 +0200 > On a dual opteron machine, this thing from ip_route_input() is very expensive : > > RT_CACHE_STAT_INC(in_hlist_search); That's amazing since it's per-cpu. I don't have any suggestions besides your idea to increment it once using an accumulation local variable. 
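The accumulation idea discussed above can be sketched as follows. This is a user-space illustration under stated assumptions, not the kernel code: `struct rt_stats`, `this_cpu_stats`, `STAT_ADD()` and `lookup()` are hypothetical stand-ins for the per-cpu statistics and the hash chain walk, and the RT_CACHE_STAT_ADD() macro named in the thread is only a proposal at this point.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu route cache statistics. */
struct rt_stats {
    unsigned int in_hlist_search;
};

/* Models the current CPU's statistics slot. */
static struct rt_stats this_cpu_stats;

/* Hypothetical "add n at once" helper. */
#define STAT_ADD(field, n) (this_cpu_stats.field += (n))

struct node {
    struct node *next;
    unsigned int key;
};

static struct node *lookup(struct node *head, unsigned int key)
{
    unsigned int misses = 0;   /* lives in a register or on the stack */
    struct node *n;

    for (n = head; n; n = n->next) {
        if (n->key == key)
            break;
        misses++;              /* no per-cpu access inside the loop */
    }

    /* Touch the shared statistics at most once per lookup. */
    if (misses != 0)
        STAT_ADD(in_hlist_search, misses);

    return n;
}
```

The counter ends up with the same total as incrementing it on every miss, but the per-cpu address computation and the memory increment happen once per lookup instead of once per chain entry.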
From herbert@gondor.apana.org.au Thu Jul 28 16:43:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 28 Jul 2005 16:43:18 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6SNh8H9005514 for ; Thu, 28 Jul 2005 16:43:09 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1Dy6hD-0006Hu-00; Thu, 28 Jul 2005 21:37:03 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1Dy6gb-00044G-00; Thu, 28 Jul 2005 21:36:25 +1000 From: Herbert Xu To: diego.beltrami@HIIT.FI Subject: Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux Cc: netdev@oss.sgi.com, infrahip@HIIT.FI, gurtov@cs.helsinki.fi, jeffrey.m.ahrenholz@boeing.com, kristian.slavov@nomadiclab.com, hipl-users@freelists.org, hipsec@ietf.org Organization: Core In-Reply-To: <1122295307.14873.37.camel@odysse> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Thu, 28 Jul 2005 21:36:25 +1000 X-archive-position: 2803 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1182 Lines: 31 Diego Beltrami wrote: > > we have been working for three months to implement a new IPsec mode, > the "BEET" mode, for Linux. Below is a link to the BEET specification > and > the abstract: > > http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt Thanks for the patch guys, this is really interesting. 
> extern int xfrm4_rcv_encap(struct sk_buff *skb, __u16 encap_type); > diff -urN linux-2.6.12.2/net/ipv4/esp4.c > linux-beet-2.6.12.2/net/ipv4/esp4.c > --- linux-2.6.12.2/net/ipv4/esp4.c 2005-06-30 02:00:53.000000000 +0300 > +++ linux-beet-2.6.12.2/net/ipv4/esp4.c 2005-07-25 14:39:11.000000000 Although the document only talks about ESP, as far as I can see the encapsulation can be applied to AH/IPComp just as well. So how about moving this stuff to the generic xfrm_input/xfrm_output functions? Also, if you're going to do cross-family transforms, it should be done for both BEET and plain tunnel-mode. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From Robert.Olsson@data.slu.se Fri Jul 29 08:14:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 08:14:43 -0700 (PDT) Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6TFEaH9004297 for ; Fri, 29 Jul 2005 08:14:39 -0700 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j6TFCcB3005330 for ; Fri, 29 Jul 2005 17:12:38 +0200 Received: by robur.slu.se (Postfix, from userid 1000) id 97270EC3BB; Fri, 29 Jul 2005 16:50:31 +0200 (CEST) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17130.16951.581026.863431@robur.slu.se> Date: Fri, 29 Jul 2005 16:50:31 +0200 To: Eric Dumazet Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c In-Reply-To: <42E94D11.4090002@cosmosbay.com> References: <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> <42E94680.8060309@cosmosbay.com> <20050728.135826.63129319.davem@davemloft.net> <42E94D11.4090002@cosmosbay.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70 X-archive-position: 2804 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 938 Lines: 26 Eric Dumazet writes: > I have no profiling info for this exact patch, I'm sorry David. > On a dual opteron machine, this thing from ip_route_input() is very expensive : > > RT_CACHE_STAT_INC(in_hlist_search); > > ip_route_input() use a total of 3.4563 % of one cpu, but this 'increment' takes 1.20 % !!! Very weird if the statscounter taking a third of ip_route_input. > Sometime I wonder if oprofile can be trusted :( > > Maybe we should increment a counter on the stack and do a final > if (counter != 0) > RT_CACHE_STAT_ADD(in_hlist_search, counter); My experiences from playing with prefetching eth_type_trans in this case. One must look in the total performance not just were the prefetching is done. In this case I was able to get eth_type_trans down in the profile list but other functions increased so performance was the same or lower. This needs to be sorted out... Cheers. 
--ro From diego.beltrami@HIIT.FI Fri Jul 29 08:35:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 08:35:40 -0700 (PDT) Received: from pegasus.hiit.fi (pegasus.hiit.fi [212.68.1.186]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6TFZZH9009522 for ; Fri, 29 Jul 2005 08:35:36 -0700 Received: from [128.214.113.174] (odysse.hiit.fi [128.214.113.174]) by pegasus.hiit.fi (Postfix) with ESMTP id 91B8D220077; Fri, 29 Jul 2005 18:33:36 +0300 (EEST) Subject: Re: [hipl-users] Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux From: Diego Beltrami Reply-To: diego.beltrami@HIIT.FI To: herbert@gondor.apana.org.au Cc: infrahip@HIIT.FI, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: HIIT Message-Id: <1122651216.25842.67.camel@odysse> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Fri, 29 Jul 2005 18:33:36 +0300 Content-Transfer-Encoding: 7bit X-archive-position: 2805 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: diego.beltrami@HIIT.FI Precedence: bulk X-list: netdev Content-Length: 2431 Lines: 66 > Diego Beltrami wrote: > > > > we have been working for three months to implement a new IPsec mode, > > the "BEET" mode, for Linux. Below is a link to the BEET specification > > and > > the abstract: > > > > http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt > > Thanks for the patch guys, this is really interesting. Thanks Herbert for your feedback! > > extern int xfrm4_rcv_encap(struct sk_buff *skb, __u16 encap_type); > > diff -urN linux-2.6.12.2/net/ipv4/esp4.c > > linux-beet-2.6.12.2/net/ipv4/esp4.c > > --- linux-2.6.12.2/net/ipv4/esp4.c 2005-06-30 02:00:53.000000000 +0300 > > +++ linux-beet-2.6.12.2/net/ipv4/esp4.c 2005-07-25 14:39:11.000000000 > > Although the document only talks about ESP, as far as I can see > the encapsulation can be applied to AH/IPComp just as well. 
> So how about moving this stuff to the generic xfrm_input/xfrm_output
> functions?

The BEET code is already present in the xfrm_input/xfrm_output functions, and it applies ESP encapsulation merely because of the SA and SP set by means of setkey. As a consequence, if the SA and SP are correctly set for AH, the flow goes through the AH functions.

The modifications in the ESP functions are due to the hybrid cases where the inner and outer address families are different; in those cases the values returned by the espX functions are not coherent.

I tried to change the SA and SP so that AH is used, and the flow correctly goes through the AH functions, but the problem turned out to be something else. In particular, it seems that the AH functions deal with the pointers contained in the skb (skb->data, skb->nh, skb->h etc.) in a slightly different way than the ESP functions do. (Can anyone say more?)

Surely BEET will work also for AH with minor changes, even though we only tried the ESP encapsulation. This will require some time to inspect and analyze the exact situation. In any case, I would say the code is already generic in itself.

On the other hand, I don't know about IPComp, so I wouldn't say anything. Hence, if you could please give some hints, they would be more than appreciated.

>
> Also, if you're going to do cross-family transforms, it should be
> done for both BEET and plain tunnel-mode.

Potentially it could also be possible for plain tunnel-mode; this will require further analysis.

Further discussion, advice and feedback are welcome.

Thank you very much!
Cheers, --Diego From pekka.nikander@nomadiclab.com Fri Jul 29 08:47:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 08:47:28 -0700 (PDT) Received: from n2.nomadiclab.com (n2.nomadiclab.com [193.234.219.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6TFlOH9010708 for ; Fri, 29 Jul 2005 08:47:25 -0700 Received: from [127.0.0.1] (localhost [127.0.0.1]) by n2.nomadiclab.com (Postfix) with ESMTP id 42320212C86; Fri, 29 Jul 2005 18:45:25 +0300 (EEST) In-Reply-To: <1122651216.25842.67.camel@odysse> References: <1122651216.25842.67.camel@odysse> Mime-Version: 1.0 (Apple Message framework v733) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com, infrahip@hiit.fi Content-Transfer-Encoding: 7bit From: Pekka Nikander Subject: Re: [Infrahip] Re: [hipl-users] Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux Date: Fri, 29 Jul 2005 17:45:24 +0200 To: diego.beltrami@hiit.fi X-Mailer: Apple Mail (2.733) X-archive-position: 2806 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekka.nikander@nomadiclab.com Precedence: bulk X-list: netdev Content-Length: 637 Lines: 14 > Surely BEET will work also for AH with minor changes, even though we > only tried the ESP encapsulation. I wouldn't be so sure. IIRC, tunnel mode is not specified for AH but for ESP only. Consequently, defining BEET mode for AH might be pretty tricky. OTOH, I don't know the linux IPsec implementation so that it might be possible to make BEET to "work" for AH, for some value of "work", but it probably would require some careful thinking to define the exact semantics, like what addresses (inner or outer) are covered by the AH integrity protection, what does the integrity protection really assert, etc. 
--Pekka From ravinandan.arakali@neterion.com Fri Jul 29 09:40:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 09:40:51 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6TGelH9014353 for ; Fri, 29 Jul 2005 09:40:48 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j6TGbtcx027415; Fri, 29 Jul 2005 12:37:55 -0400 (EDT) Received: from rarakali ([10.16.16.72]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j6TGbrKP026884; Fri, 29 Jul 2005 12:37:53 -0400 (EDT) From: "Ravinandan Arakali" To: "'David S. Miller'" Cc: , , , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Fri, 29 Jul 2005 09:37:55 -0700 Message-ID: <001001c5945b$df8afd90$4810100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) Importance: Normal In-Reply-To: <20050712.140411.41107257.davem@davemloft.net> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2807 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 1400 Lines: 41 David, We are trying to use the "default" directive in Kconfig. We tried using an unconditional directive(just to test it out) such as "default y" and a conditional one such as "default y if CONFIG_IA64_SGI_SN2". But when we run "make menuconfig", it does not seem to pickup any of these changes from Kconfig. Any idea what we might be missing ? Once this is fixed, we'll send out a patch to address comments from previous 12 patches as well as couple of issues we found in the meantime. 
Thanks, Ravi -----Original Message----- From: David S. Miller [mailto:davem@davemloft.net] Sent: Tuesday, July 12, 2005 2:04 PM To: ravinandan.arakali@neterion.com Cc: hch@infradead.org; raghavendra.koushik@neterion.com; jgarzik@pobox.com; netdev@oss.sgi.com; leonid.grossman@neterion.com; rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements From: "Ravinandan Arakali" Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 12 Jul 2005 14:00:52 -0700 > The two-buffer mode was added as a configurable option > to Kconfig file several months ago. Hence the macro > is CONFIG_2BUFF_MODE. We're saying that you should choose CONFIG_2BUFF_MODE, when CONFIG_IA64_SGI_SN2 is set, inside the Kconfig file using the "default" Kconfig directive. You should never change the setting of CONFIG_* macros in C source. From rick.jones2@hp.com Fri Jul 29 10:08:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 10:08:12 -0700 (PDT) Received: from palrel13.hp.com (palrel13.hp.com [156.153.255.238]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6TH89H9016216 for ; Fri, 29 Jul 2005 10:08:09 -0700 Received: from tardy.cup.hp.com (tardy.cup.hp.com [16.89.64.81]) by palrel13.hp.com (Postfix) with ESMTP id 43AD91C05763 for ; Fri, 29 Jul 2005 10:06:11 -0700 (PDT) Received: from hp.com (localhost [127.0.0.1]) by tardy.cup.hp.com (8.9.3 (PHNE_28810)/8.9.3 SMKit7.02) with ESMTP id KAA18802 for ; Fri, 29 Jul 2005 10:06:10 -0700 (PDT) Message-ID: <42EA6202.703@hp.com> Date: Fri, 29 Jul 2005 10:06:10 -0700 From: Rick Jones User-Agent: Mozilla/5.0 (X11; U; HP-UX 9000/785; en-US; rv:1.6) Gecko/20040304 X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c References: <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> <42E94680.8060309@cosmosbay.com> <20050728.135826.63129319.davem@davemloft.net> 
<42E94D11.4090002@cosmosbay.com> <17130.16951.581026.863431@robur.slu.se> In-Reply-To: <17130.16951.581026.863431@robur.slu.se> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2808 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rick.jones2@hp.com Precedence: bulk X-list: netdev Content-Length: 1262 Lines: 31 Robert Olsson wrote: > Eric Dumazet writes: > > > I have no profiling info for this exact patch, I'm sorry David. > > On a dual opteron machine, this thing from ip_route_input() is very expensive : > > > > RT_CACHE_STAT_INC(in_hlist_search); > > > > ip_route_input() use a total of 3.4563 % of one cpu, but this 'increment' takes 1.20 % !!! > > Very weird if the statscounter taking a third of ip_route_input. > > > Sometime I wonder if oprofile can be trusted :( > > > > Maybe we should increment a counter on the stack and do a final > > if (counter != 0) > > RT_CACHE_STAT_ADD(in_hlist_search, counter); > > My experiences from playing with prefetching eth_type_trans in this > case. One must look in the total performance not just were the > prefetching is done. In this case I was able to get eth_type_trans > down in the profile list but other functions increased so performance > was the same or lower. This needs to be sorted out... How many of the architectures have PMU's that can give us cache miss statistics? Itanium does, and can go so far as to tell us which addresses and instructions are involved - do the others? That sort of data would seem to be desirable in this sort of situation. 
rick jones From Robert.Olsson@data.slu.se Fri Jul 29 10:46:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 10:46:54 -0700 (PDT) Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6THkfH9018824 for ; Fri, 29 Jul 2005 10:46:42 -0700 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j6THigD4023702 for ; Fri, 29 Jul 2005 19:44:43 +0200 Received: by robur.slu.se (Postfix, from userid 1000) id 89FA2EC3BB; Fri, 29 Jul 2005 19:44:42 +0200 (CEST) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17130.27402.526238.962563@robur.slu.se> Date: Fri, 29 Jul 2005 19:44:42 +0200 To: Rick Jones Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c In-Reply-To: <42EA6202.703@hp.com> References: <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> <42E94680.8060309@cosmosbay.com> <20050728.135826.63129319.davem@davemloft.net> <42E94D11.4090002@cosmosbay.com> <17130.16951.581026.863431@robur.slu.se> <42EA6202.703@hp.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70 X-archive-position: 2809 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 1010 Lines: 30 Rick Jones writes: > > My experiences from playing with prefetching eth_type_trans in this > > case. One must look in the total performance not just were the > > prefetching is done. In this case I was able to get eth_type_trans > > down in the profile list but other functions increased so performance > > was the same or lower. This needs to be sorted out... > > How many of the architectures have PMU's that can give us cache miss statistics? 
> Itanium does, and can go so far as to tell us which addresses and instructions > are involved - do the others? I've seem XEON and Opterons has performance counters for this that can be used with oprofile. Intel has a document (was it an application note?) describing prefetching. Really a lot of things to consider to become successful. > That sort of data would seem to be desirable in this sort of situation. Also what scenario code patch and load we optimizing for Eric mentioned this briefly. Cheers. --ro From dada1@cosmosbay.com Fri Jul 29 10:59:52 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 10:59:58 -0700 (PDT) Received: from gw1.cosmosbay.com (gw1.cosmosbay.com [62.23.185.226]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6THxpH9020045 for ; Fri, 29 Jul 2005 10:59:51 -0700 Received: from [172.16.0.131] (edumazet-port [172.16.0.131]) by gw1.cosmosbay.com (8.13.3/8.13.3) with ESMTP id j6THva0o014802; Fri, 29 Jul 2005 19:57:36 +0200 Message-ID: <42EA6E0F.8060705@cosmosbay.com> Date: Fri, 29 Jul 2005 19:57:35 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Rick Jones CC: netdev@oss.sgi.com, "David S. 
Miller" , Robert Olsson Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c References: <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> <42E94680.8060309@cosmosbay.com> <20050728.135826.63129319.davem@davemloft.net> <42E94D11.4090002@cosmosbay.com> <17130.16951.581026.863431@robur.slu.se> <42EA6202.703@hp.com> In-Reply-To: <42EA6202.703@hp.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [172.16.8.80]); Fri, 29 Jul 2005 19:57:36 +0200 (CEST) X-archive-position: 2810 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 2089 Lines: 60 Rick Jones wrote: > Robert Olsson wrote: > >> Eric Dumazet writes: >> >> > I have no profiling info for this exact patch, I'm sorry David. >> > On a dual opteron machine, this thing from ip_route_input() is very >> expensive : >> > > RT_CACHE_STAT_INC(in_hlist_search); >> > > ip_route_input() use a total of 3.4563 % of one cpu, but this >> 'increment' takes 1.20 % !!! >> >> Very weird if the statscounter taking a third of ip_route_input. >> >> > Sometime I wonder if oprofile can be trusted :( >> > > Maybe we should increment a counter on the stack and do a final >> > if (counter != 0) >> > RT_CACHE_STAT_ADD(in_hlist_search, counter); >> >> My experiences from playing with prefetching eth_type_trans in this >> case. One must look in the total performance not just were the >> prefetching is done. In this case I was able to get eth_type_trans >> down in the profile list but other functions increased so performance >> was the same or lower. This needs to be sorted out... > > > How many of the architectures have PMU's that can give us cache miss > statistics?
Itanium does, and can go so far as to tell us which > addresses and instructions are involved - do the others? > > That sort of data would seem to be desirable in this sort of situation. > > rick jones > > oprofile on AMD64 can gather lots of data, DATA_CACHE_MISSES for example... But I think I know what happens... nm -v /usr/src/linux/vmlinux | grep -5 rt_cache_stat ffffffff804c6a80 b rover.5 ffffffff804c6a88 b last_gc.2 ffffffff804c6a90 b rover.3 ffffffff804c6a94 b equilibrium.4 ffffffff804c6a98 b ip_fallback_id.7 ffffffff804c6aa0 B rt_cache_stat ffffffff804c6aa8 b ip_rt_max_size ffffffff804c6aac b ip_rt_debug ffffffff804c6ab0 b rt_deadline So rt_cache_stat (which is a read only pointer) is in the middle of a hot cache line (some parts of it are written over and over), that probably ping pong between CPUS. Time to provide a patch to carefully place all the static data from net/ipv4/route.c into 2 parts : mostly readonly, and others... :) Eric From rick.jones2@hp.com Fri Jul 29 11:27:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 11:27:23 -0700 (PDT) Received: from palrel12.hp.com (palrel12.hp.com [156.153.255.237]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6TIRKH9021699 for ; Fri, 29 Jul 2005 11:27:21 -0700 Received: from tardy.cup.hp.com (tardy.cup.hp.com [16.89.64.81]) by palrel12.hp.com (Postfix) with ESMTP id BDC2C404C53; Fri, 29 Jul 2005 11:25:22 -0700 (PDT) Received: from hp.com (localhost [127.0.0.1]) by tardy.cup.hp.com (8.9.3 (PHNE_28810)/8.9.3 SMKit7.02) with ESMTP id LAA19450; Fri, 29 Jul 2005 11:25:22 -0700 (PDT) Message-ID: <42EA7491.1010207@hp.com> Date: Fri, 29 Jul 2005 11:25:21 -0700 From: Rick Jones User-Agent: Mozilla/5.0 (X11; U; HP-UX 9000/785; en-US; rv:1.6) Gecko/20040304 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Eric Dumazet Cc: netdev@oss.sgi.com, "David S. 
Miller" , Robert Olsson Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c References: <42E8FF24.9070009@cosmosbay.com> <20050728.123922.126777020.davem@davemloft.net> <42E94680.8060309@cosmosbay.com> <20050728.135826.63129319.davem@davemloft.net> <42E94D11.4090002@cosmosbay.com> <17130.16951.581026.863431@robur.slu.se> <42EA6202.703@hp.com> <42EA6E0F.8060705@cosmosbay.com> In-Reply-To: <42EA6E0F.8060705@cosmosbay.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2811 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rick.jones2@hp.com Precedence: bulk X-list: netdev Content-Length: 948 Lines: 28 > oprofile on AMD64 can gather lots of data, DATA_CACHE_MISSES for example... > > But I think I know what happens... > > nm -v /usr/src/linux/vmlinux | grep -5 rt_cache_stat > > ffffffff804c6a80 b rover.5 > ffffffff804c6a88 b last_gc.2 > ffffffff804c6a90 b rover.3 > ffffffff804c6a94 b equilibrium.4 > ffffffff804c6a98 b ip_fallback_id.7 > ffffffff804c6aa0 B rt_cache_stat > ffffffff804c6aa8 b ip_rt_max_size > ffffffff804c6aac b ip_rt_debug > ffffffff804c6ab0 b rt_deadline > > So rt_cache_stat (which is a read only pointer) is in the middle of a > hot cache line (some parts of it are written over and over), that > probably ping pong between CPUS. > > Time to provide a patch to carefully place all the static data from > net/ipv4/route.c into 2 parts : mostly readonly, and others... :) Which of course begs the question - what cache line size should be ass-u-me-d when blocking these things? I'll put-forth 128 bytes. 
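The nm listing quoted above can be checked against a candidate line size with a little arithmetic. What follows is a minimal sketch, not kernel code: the helper names cache_line_index() and all_share_line() are hypothetical, and the addresses are simply copied from the listing. It reports whether all nine symbols fall in one cache line for a given power-of-two line size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index of the cache line that holds addr,
 * for a power-of-two line size given in bytes. */
static uint64_t cache_line_index(uint64_t addr, uint64_t line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    return addr / line_size;
}

/* Addresses copied from the nm listing quoted above. */
static const uint64_t route_syms[] = {
    0xffffffff804c6a80ULL, /* rover.5          */
    0xffffffff804c6a88ULL, /* last_gc.2        */
    0xffffffff804c6a90ULL, /* rover.3          */
    0xffffffff804c6a94ULL, /* equilibrium.4    */
    0xffffffff804c6a98ULL, /* ip_fallback_id.7 */
    0xffffffff804c6aa0ULL, /* rt_cache_stat    */
    0xffffffff804c6aa8ULL, /* ip_rt_max_size   */
    0xffffffff804c6aacULL, /* ip_rt_debug      */
    0xffffffff804c6ab0ULL, /* rt_deadline      */
};

#define ROUTE_SYM_COUNT (sizeof(route_syms) / sizeof(route_syms[0]))

/* Returns 1 if every listed symbol lies in the same cache line, else 0. */
static int all_share_line(uint64_t line_size)
{
    uint64_t first = cache_line_index(route_syms[0], line_size);

    for (size_t i = 1; i < ROUTE_SYM_COUNT; i++)
        if (cache_line_index(route_syms[i], line_size) != first)
            return 0;
    return 1;
}
```

Under either a 64-byte or a 128-byte assumption the sketch places every listed symbol, rt_cache_stat included, in a single line, which is consistent with the ping-pong explanation quoted above. Only with a 32-byte line would the symbols split across lines.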
rick jones From herbert@gondor.apana.org.au Fri Jul 29 16:51:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 29 Jul 2005 16:51:44 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6TNpbH9012586 for ; Fri, 29 Jul 2005 16:51:38 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DyebD-0005ue-00; Sat, 30 Jul 2005 09:49:07 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1Dyeb5-00077C-00; Sat, 30 Jul 2005 09:48:59 +1000 Date: Sat, 30 Jul 2005 09:48:59 +1000 To: Pekka Nikander Cc: diego.beltrami@HIIT.FI, netdev@oss.sgi.com, infrahip@HIIT.FI Subject: Re: [Infrahip] Re: [hipl-users] Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux Message-ID: <20050729234859.GA27325@gondor.apana.org.au> References: <1122651216.25842.67.camel@odysse> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2812 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 686 Lines: 17 On Fri, Jul 29, 2005 at 05:45:24PM +0200, Pekka Nikander wrote: > >Surely BEET will work also for AH with minor changes, even though we > >only tried the ESP encapsulation. > > I wouldn't be so sure. IIRC, tunnel mode is not specified for AH but > for ESP only. Consequently, defining BEET mode for AH might be Well plain tunnel mode certainly is specified for AH as well as IPComp. But you're right the semantics of BEET mode for AH needs to be thought out. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu 许志壬 Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From diego.beltrami@HIIT.FI Sat Jul 30 04:03:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 04:03:27 -0700 (PDT) Received: from pegasus.hiit.fi (pegasus.hiit.fi [212.68.1.186]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6UB3HH9020852 for ; Sat, 30 Jul 2005 04:03:18 -0700 Received: from [128.214.113.174] (odysse.hiit.fi [128.214.113.174]) by pegasus.hiit.fi (Postfix) with ESMTP id 5E1CA220011; Sat, 30 Jul 2005 14:01:18 +0300 (EEST) Subject: Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux From: Diego Beltrami Reply-To: diego.beltrami@HIIT.FI To: Herbert Xu Cc: Pekka Nikander , netdev@oss.sgi.com, infrahip@HIIT.FI, hipl-users@freelists.org, hipsec@ietf.org In-Reply-To: <20050729234859.GA27325@gondor.apana.org.au> References: <1122651216.25842.67.camel@odysse> <20050729234859.GA27325@gondor.apana.org.au> Content-Type: text/plain Organization: HIIT Message-Id: <1122721278.3696.28.camel@odysse> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Sat, 30 Jul 2005 14:01:18 +0300 Content-Transfer-Encoding: 7bit X-archive-position: 2813 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: diego.beltrami@HIIT.FI Precedence: bulk X-list: netdev Content-Length: 1171 Lines: 30 > On Fri, Jul 29, 2005 at 05:45:24PM +0200, Pekka Nikander wrote: > > >Surely BEET will work also for AH with minor changes, even though we > > >only tried the ESP encapsulation. > > > > I wouldn't be so sure. IIRC, tunnel mode is not specified for AH but > > for ESP only. Consequently, defining BEET mode for AH might be > > Well plain tunnel mode certainly is specified for AH as well as IPComp. > But you're right the semantics of BEET mode for AH needs to be thought > out. 
> The Linux patch which has been presented (see URL: http://infrahip.hiit.fi/beet/beet-patch-v1.0-2.6.12.2 ), has been developed based upon the design given by the draft http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt As a result BEET patch considers the ESP encapsulation as it has been designed. OTOH we believe the implementation is usable more or less as it is now for AH and perhaps IPComp in the future. But, as already mentioned both by Pekka and Herbert, this would need more thinking and designing. The implementation is flexible enough to finetune once the semantics for similar optimizations have been considered for AH and IPComp. --Diego From herbert@gondor.apana.org.au Sat Jul 30 04:17:07 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 04:17:13 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6UBH5H9022439 for ; Sat, 30 Jul 2005 04:17:06 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DypJ0-0001XU-00; Sat, 30 Jul 2005 21:15:02 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DypIy-00037K-00; Sat, 30 Jul 2005 21:15:00 +1000 From: Herbert Xu To: diego.beltrami@HIIT.FI Subject: Re: [hipl-users] Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux Cc: herbert@gondor.apana.org.au, infrahip@HIIT.FI, netdev@oss.sgi.com Organization: Core In-Reply-To: <1122651216.25842.67.camel@odysse> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Sat, 30 Jul 2005 21:15:00 +1000 X-archive-position: 2814 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1276 Lines: 31 
Diego Beltrami wrote: > > The modifications in the ESP functions are due to the hybrid cases when > Inner and Outer address families are different; in those cases the > values returned by espX functions are not coherent. I see. However, this is really a consequence of us not implementing interfamily transforms for plain old tunnel mode. Had we implemented that, it would be a piece of cake to extend this to BEET without touching ESP. >> Also, if you're going to do cross-family transforms, it should be >> done for both BEET and plain tunnel-mode. > > Potentially it could be possible also for plain tunnel-mode: this will > require further analysis. It definitely does need further analysis even for BEET mode. The rcv path for interfamily transforms is straightforward since we pass through netif_rx. However, on the outbound path things aren't that simple. I suggest that you remove the interfamily support for the initial merge of the BEET implementation. We can then readd it for both plain tunnel and BEET mode. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu 许志壬 Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From davem@davemloft.net Sat Jul 30 20:46:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 20:46:54 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6V3kjH9022624 for ; Sat, 30 Jul 2005 20:46:45 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1Dz4kq-00036O-JR; Sat, 30 Jul 2005 20:44:48 -0700 Date: Sat, 30 Jul 2005 20:44:48 -0700 (PDT) Message-Id: <20050730.204448.85713599.davem@davemloft.net> To: Robert.Olsson@data.slu.se Cc: dada1@cosmosbay.com, netdev@oss.sgi.com Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c From: "David S. 
Miller" In-Reply-To: <17130.16951.581026.863431@robur.slu.se> References: <20050728.135826.63129319.davem@davemloft.net> <42E94D11.4090002@cosmosbay.com> <17130.16951.581026.863431@robur.slu.se> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2815 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1118 Lines: 23 From: Robert Olsson Date: Fri, 29 Jul 2005 16:50:31 +0200 > My experiences from playing with prefetching eth_type_trans in this > case. One must look in the total performance not just were the > prefetching is done. In this case I was able to get eth_type_trans > down in the profile list but other functions increased so performance > was the same or lower. This needs to be sorted out... The problem is that if you just barely fit in the cache for a workload, prefetches can hurt if done too early. Let's say that your code path needs to access data items A, B, and C, in that order. If you need to access A in order to know C, and subsequently if you prefetch C before you use B, you might kick out B and end up making performance worse (since you'll thus need to bring in C twice). I really do not want to merge in any prefetch patches until there is hard data showing an improvement, instead of some shamanistic justification :-) When the witch doctor comes to town and starts adding prefetches all over the place without performance metrics, I become rightly concerned. 
From davem@davemloft.net Sat Jul 30 20:53:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 20:53:05 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6V3r0H9023110 for ; Sat, 30 Jul 2005 20:53:00 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1Dz4qv-00036u-LY; Sat, 30 Jul 2005 20:51:05 -0700 Date: Sat, 30 Jul 2005 20:51:05 -0700 (PDT) Message-Id: <20050730.205105.125871032.davem@davemloft.net> To: dada1@cosmosbay.com Cc: rick.jones2@hp.com, netdev@oss.sgi.com, davem@redhat.com, Robert.Olsson@data.slu.se Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c From: "David S. Miller" In-Reply-To: <42EA6E0F.8060705@cosmosbay.com> References: <17130.16951.581026.863431@robur.slu.se> <42EA6202.703@hp.com> <42EA6E0F.8060705@cosmosbay.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2816 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 908 Lines: 24 From: Eric Dumazet Date: Fri, 29 Jul 2005 19:57:35 +0200 > nm -v /usr/src/linux/vmlinux | grep -5 rt_cache_stat > > ffffffff804c6a80 b rover.5 > ffffffff804c6a88 b last_gc.2 > ffffffff804c6a90 b rover.3 > ffffffff804c6a94 b equilibrium.4 > ffffffff804c6a98 b ip_fallback_id.7 > ffffffff804c6aa0 B rt_cache_stat > ffffffff804c6aa8 b ip_rt_max_size > ffffffff804c6aac b ip_rt_debug > ffffffff804c6ab0 b rt_deadline > > So rt_cache_stat (which is a read only pointer) is in the middle of a hot cache line (some parts of it are written over and over), that > probably ping pong between CPUS. One cure for this would be to declare it as "__read_mostly", that should help a lot. 
But it's not the best idea, I think. Instead, we can do away with the pointer and use DECLARE_PER_CPU() and __get_cpu_var() for rt_cache_stat. That would emit the most efficient code on every architecture. From davem@davemloft.net Sat Jul 30 20:54:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 20:54:05 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6V3s1H9023513 for ; Sat, 30 Jul 2005 20:54:01 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1Dz4ry-00037I-6n; Sat, 30 Jul 2005 20:52:10 -0700 Date: Sat, 30 Jul 2005 20:52:09 -0700 (PDT) Message-Id: <20050730.205209.112313042.davem@davemloft.net> To: rick.jones2@hp.com Cc: dada1@cosmosbay.com, netdev@oss.sgi.com, davem@redhat.com, Robert.Olsson@data.slu.se Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c From: "David S. Miller" In-Reply-To: <42EA7491.1010207@hp.com> References: <42EA6202.703@hp.com> <42EA6E0F.8060705@cosmosbay.com> <42EA7491.1010207@hp.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2817 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 321 Lines: 10 From: Rick Jones Date: Fri, 29 Jul 2005 11:25:21 -0700 > Which of course begs the question - what cache line size should be > ass-u-me-d when blocking these things? I'll put-forth 128 bytes. Why guess? There is no need to. Use "__read_mostly" which is an attribute designed explicitly for this.
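Two ideas from this thread, a per-CPU statistics slot and a stack-local count that is added once at the end, can be imitated in plain userspace C. This is a rough sketch under stated assumptions, not the kernel's per-CPU machinery: rt_stat_sketch, stat_add(), search_chain() and stat_total() are hypothetical names, the CPU index is passed in by the caller, and the 128-byte alignment is an assumption borrowed from the figure proposed earlier in the thread (the kernel attributes mentioned above exist precisely so that code does not hard-code such a guess).

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4    /* hypothetical CPU count for the sketch */
#define ASSUMED_LINE   128  /* assumed cache line size in bytes */

/* One counter per CPU, each aligned (and therefore padded) to its own
 * assumed cache line so that writers on different CPUs do not share one. */
struct rt_stat_sketch {
    _Alignas(ASSUMED_LINE) unsigned long in_hlist_search;
};

static struct rt_stat_sketch rt_stats[NR_CPUS_SKETCH];

/* Add n to the slot that belongs to the given CPU index. */
static void stat_add(int cpu, unsigned long n)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    rt_stats[cpu].in_hlist_search += n;
}

/* Walk a chain of len entries, counting steps in a local variable and
 * touching the shared statistics at most once, after the loop. */
static void search_chain(int cpu, int len)
{
    unsigned long counter = 0;

    for (int i = 0; i < len; i++)
        counter++;              /* stands in for one hash-chain step */

    if (counter != 0)
        stat_add(cpu, counter);
}

/* Sum the per-CPU slots, as a reader of the statistics would. */
static unsigned long stat_total(void)
{
    unsigned long total = 0;

    for (size_t i = 0; i < NR_CPUS_SKETCH; i++)
        total += rt_stats[i].in_hlist_search;
    return total;
}
```

The padding costs memory per CPU, which is the usual trade-off for keeping write-hot counters away from data that other CPUs mostly read.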
From jgarzik@pobox.com Sat Jul 30 21:53:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 21:53:16 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6V4rAH9027015 for ; Sat, 30 Jul 2005 21:53:10 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1Dz5n0-0006G3-FT; Sun, 31 Jul 2005 04:51:07 +0000 Message-ID: <42EC58B8.7080307@pobox.com> Date: Sun, 31 Jul 2005 00:51:04 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Manfred Spraul CC: Francois Romieu , Netdev , renaud.lienhart@free.fr Subject: Re: [PATCH] forcedeth: Additional ethtool support References: <42D101EC.6000608@colorfullife.com> <20050710172833.GA1951@electric-eye.fr.zoreil.com> <42D16656.6000207@colorfullife.com> In-Reply-To: <42D16656.6000207@colorfullife.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2818 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 790 Lines: 22 Manfred Spraul wrote: > Not for a nic without complete documentation: What if an arch uses > 64-bit reads to read two registers at the same time? Not all nics like > that, for example IIRC natsemi explicitely mandates 32-bit reads. > x86-64 doesn't, it uses 32-bit reads, but I don't like the idea of using > memcpy to read registers. > > I agree with your other remarks, updated patch attached. Going through my pending folder, I was about to apply all the queued forcedeth patches. 
However, in two cases, you violated Rule #6 of http://linux.yyz.us/patch-format.html Please resend ALL forcedeth patches, with proper descriptions, so I don't have to hunt through previous patch versions, and previous emails, to match up the correct description to the correct patch. Jeff From jgarzik@pobox.com Sat Jul 30 22:13:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 22:13:58 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6V5DtH9028749 for ; Sat, 30 Jul 2005 22:13:55 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1Dz678-0006HH-Km; Sun, 31 Jul 2005 05:11:55 +0000 Message-ID: <42EC5D96.3050304@pobox.com> Date: Sun, 31 Jul 2005 01:11:50 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: raghavendra.koushik@neterion.com CC: netdev@oss.sgi.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 1/12] S2io: Code cleanup References: <20050707220506.A8AC689826@linux.site> In-Reply-To: <20050707220506.A8AC689826@linux.site> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2819 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 549 Lines: 17 patch doesn't seem to apply :( Can you please resend the entire series, taking into account the comments WRT patch #5? Also, I was unable to include your fixes in my 'fixes' branch, whose speed to upstream kernel is accelerated, because patch #1 was not bug fixes. 
If you want your bug fixes to go upstream as rapidly as possible, make sure they are ordered before the code cleanups and new features. This allows me to send the fixes upstream immediately, while allowing further review and testing of the cleanup/feature patches. Jeff From jgarzik@pobox.com Sat Jul 30 22:30:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 30 Jul 2005 22:30:20 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6V5UAH9030504 for ; Sat, 30 Jul 2005 22:30:10 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1Dz6Mq-0006Hb-4q; Sun, 31 Jul 2005 05:28:08 +0000 Message-ID: <42EC6165.2080809@pobox.com> Date: Sun, 31 Jul 2005 01:28:05 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Radheka Godse , mitch.a.williams@intel.com CC: fubar@us.ibm.com, bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [PATCH 2.6.13-rc1 0/17] bonding: Sysfs Support References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2820 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 513 Lines: 14 Radheka Godse wrote: > This patch set is an updated sysfs bonding patch to the one sent > previously in April. The patch is upto date with all recent bonding > and kernel changes and applies cleanly to linux-2.6.13-rc1. It adds > sysfs entry for the "Xmit Hash Policy" for new bonding module param; > incorporates feedback and bug fixes on all known issues and has been > tested on linux-2.6.13-rc1. 
Will we be seeing a resend of this patch series, incorporating the comments from GregKH and others? Jeff From SRS0+1fbf9b496c03fd52096f+707+infradead.org+hch@pentafluge.srs.infradead.org Sun Jul 31 07:07:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 31 Jul 2005 07:07:38 -0700 (PDT) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j6VE7OH9031293 for ; Sun, 31 Jul 2005 07:07:33 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.52 #1 (Red Hat Linux)) id 1DzERH-0001dc-JO; Sun, 31 Jul 2005 15:05:15 +0100 Date: Sun, 31 Jul 2005 15:05:15 +0100 From: Christoph Hellwig To: Ravinandan Arakali Cc: "'David S. Miller'" , hch@infradead.org, raghavendra.koushik@neterion.com, jgarzik@pobox.com, netdev@oss.sgi.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Message-ID: <20050731140515.GA6261@infradead.org> References: <20050712.140411.41107257.davem@davemloft.net> <001001c5945b$df8afd90$4810100a@pc.s2io.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <001001c5945b$df8afd90$4810100a@pc.s2io.com> User-Agent: Mutt/1.4.2.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2821 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 548 Lines: 12 On Fri, Jul 29, 2005 at 09:37:55AM -0700, Ravinandan Arakali wrote: > David, > We are trying to use the "default" directive in Kconfig. We tried > using an unconditional directive(just to test it out) such as > "default y" and a conditional one such as "default y if > CONFIG_IA64_SGI_SN2". 
Again, please make this a module option. CONFIG_IA64_SGI_SN2 does not mean "runs on Altix"; it means the kernel supports only Altix, which is a non-standard configuration that does not cover 90% or more of the actual Altix systems in the field running 2.6.