From davem@redhat.com Mon Aug 1 08:41:44 2005
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Aug 2005 08:41:49 -0700 (PDT)
Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j71FfhH9031172 for ; Mon, 1 Aug 2005 08:41:44 -0700
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id j71FdGv3015245; Mon, 1 Aug 2005 11:39:16 -0400
Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id j71FdGV14396; Mon, 1 Aug 2005 11:39:16 -0400
Received: from localhost (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.11) with ESMTP id j71FdG5v016295; Mon, 1 Aug 2005 11:39:16 -0400
Date: Mon, 01 Aug 2005 11:39:16 -0400 (EDT)
Message-Id: <20050801.113916.74736771.davem@redhat.com>
To: dada1@cosmosbay.com
Cc: davem@davemloft.net, rick.jones2@hp.com, netdev@oss.sgi.com, Robert.Olsson@data.slu.se
Subject: Re: [PATCH] Add prefetches in net/ipv4/route.c
From: "David S. Miller"
In-Reply-To: <42EDDA50.4010405@cosmosbay.com>
References: <42EA7491.1010207@hp.com> <20050730.205209.112313042.davem@davemloft.net> <42EDDA50.4010405@cosmosbay.com>
X-Mailer: Mew version 4.2.52 on Emacs 21.3 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 2823
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev
Content-Length: 533
Lines: 16

From: Eric Dumazet
Date: Mon, 01 Aug 2005 10:16:16 +0200

> Last time I checked, read_mostly had a meaning only if CONFIG_X86 ||
> CONFIG_SPARC64

Meaning that other platforms need to implement this, big deal :)

I do agree that the implementor of __read_mostly should have
just let the build break instead so folks could simply be forced
to fix things up.

It's actually a trivial change, and the author therefore could
also have updated all the platform's vmlinux.lds.S files instead
of ifdef'ing this stuff.

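For readers following the __read_mostly discussion above, here is a minimal
userspace sketch of the ifdef'ed fallback pattern being criticised. It is not
the kernel's source: DEMO_HAVE_READ_MOSTLY, demo_read_mostly, demo_hash_log
and demo_hash_size() are hypothetical names made up for illustration, and the
section name ".data.read_mostly" is an assumption about the convention in use.
On a platform whose linker script does not know the section, the annotation
simply expands to nothing.

```c
/*
 * Userspace sketch of an ifdef'ed "read mostly" annotation.
 * All demo_* names and DEMO_HAVE_READ_MOSTLY are hypothetical; the
 * latter stands in for the CONFIG_X86 || CONFIG_SPARC64 test quoted
 * in the thread.
 */
#ifdef DEMO_HAVE_READ_MOSTLY
/* Group rarely written data in its own section (assumed section name);
 * the platform's linker script has to place that section. */
#define demo_read_mostly __attribute__((__section__(".data.read_mostly")))
#else
/* Fallback: the annotation is a no-op and the variable stays in .data. */
#define demo_read_mostly
#endif

/* Hypothetical tunable: written once at init, read on every lookup. */
static int demo_hash_log demo_read_mostly = 10;

/* Readers behave the same whether or not the annotation is active. */
int demo_hash_size(void)
{
	return 1 << demo_hash_log;
}
```

The alternative argued for in the message is to drop the #else branch, define
the annotation unconditionally, and add the section to every platform's
vmlinux.lds.S, so that a platform lacking it fails at build time instead of
silently losing the optimisation.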
From ravinandan.arakali@neterion.com Mon Aug 1 11:03:56 2005
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Aug 2005 11:04:05 -0700 (PDT)
Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j71I3qH9007322 for ; Mon, 1 Aug 2005 11:03:55 -0700
Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j71I1ocx009815; Mon, 1 Aug 2005 14:01:50 -0400 (EDT)
Received: from rarakali ([10.16.16.72]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j71I1nKP021384; Mon, 1 Aug 2005 14:01:49 -0400 (EDT)
From: "Ravinandan Arakali"
To: "'Jeff Garzik'" ,
Cc: , ,
Subject: RE: [PATCH 2.6.12.1 1/12] S2io: Code cleanup
Date: Mon, 1 Aug 2005 11:01:50 -0700
Message-ID: <001d01c596c3$17f098c0$4810100a@pc.s2io.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0)
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180
In-Reply-To: <42EC5D96.3050304@pobox.com>
Importance: Normal
X-Scanned-By: MIMEDefang 2.34
X-archive-position: 2824
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ravinandan.arakali@neterion.com
Precedence: bulk
X-list: netdev
Content-Length: 1241
Lines: 39

Jeff,
We'll re-verify the patches against latest kernel and resend.

Since the remaining 12 patches are layered on top of the first one,
it will be tough to re-order the patches at this stage. Also, we'll
need to rerun the QA cycle to ensure that nothing got broken in
the process of patch reordering.
We will certainly keep this in mind for our next submission.

Thanks,
Ravi

-----Original Message-----
From: Jeff Garzik [mailto:jgarzik@pobox.com]
Sent: Saturday, July 30, 2005 10:12 PM
To: raghavendra.koushik@neterion.com
Cc: netdev@oss.sgi.com; ravinandan.arakali@neterion.com;
leonid.grossman@neterion.com; rapuru.sriram@neterion.com
Subject: Re: [PATCH 2.6.12.1 1/12] S2io: Code cleanup

patch doesn't seem to apply :(

Can you please resend the entire series, taking into account the
comments WRT patch #5?

Also, I was unable to include your fixes in my 'fixes' branch, whose
speed to upstream kernel is accelerated, because patch #1 was not bug
fixes. If you want your bug fixes to go upstream as rapidly as
possible, make sure they are ordered before the code cleanups and new
features. This allows me to send the fixes upstream immediately, while
allowing further review and testing of the cleanup/feature patches.

	Jeff

From jgarzik@pobox.com Mon Aug 1 11:08:20 2005
Received: with ECARTIS (v1.0.0; list netdev); Mon, 01 Aug 2005 11:08:23 -0700 (PDT)
Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j71I8KH9008196 for ; Mon, 1 Aug 2005 11:08:20 -0700
Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1Dzeg6-0007KO-8S; Mon, 01 Aug 2005 18:06:18 +0000
Message-ID: <42EE6497.8080506@pobox.com>
Date: Mon, 01 Aug 2005 14:06:15 -0400
From: Jeff Garzik
User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Ravinandan Arakali
CC: raghavendra.koushik@neterion.com, netdev@oss.sgi.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com
Subject: Re: [PATCH 2.6.12.1 1/12] S2io: Code cleanup
References: <001d01c596c3$17f098c0$4810100a@pc.s2io.com>
In-Reply-To: <001d01c596c3$17f098c0$4810100a@pc.s2io.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 2825
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev
Content-Length: 878
Lines: 24

Ravinandan Arakali wrote:
> Jeff,
> We'll re-verify the patches against latest kernel and resend.
>
> Since the remaining 12 patches are layered on top of the first one,
> it will be tough to re-order the patches at this stage. Also, we'll
> need to rerun the QA cycle to ensure that nothing got broken in
> the process of patch reordering.
> We will certainly keep this in mind for our next submission.

It's not a problem for me, either way you choose. I just wanted to make
sure that you understood that ordering non-fixes before fixes could delay
the fixes going into the upstream kernel.

Since 2.6.13 is currently in 'release candidate' status, it is only
taking bug fixes right now. If [hypothetically] patches had been
ordered with bug fixes first, I could have sent the fixes into the
2.6.13 release. Without such ordering, we must wait until 2.6.14.

	Jeff

From diego.beltrami@HIIT.FI Tue Aug 2 05:03:46 2005
Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Aug 2005 05:03:56 -0700 (PDT)
Received: from pegasus.hiit.fi (pegasus.hiit.fi [212.68.1.186]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j72C3iH9008039 for ; Tue, 2 Aug 2005 05:03:45 -0700
Received: from [128.214.113.174] (odysse.hiit.fi [128.214.113.174]) by pegasus.hiit.fi (Postfix) with ESMTP id ECA23220057; Tue, 2 Aug 2005 15:01:39 +0300 (EEST)
Subject: Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux
From: Diego Beltrami
Reply-To: diego.beltrami@HIIT.FI
To: Herbert Xu
Cc: netdev@oss.sgi.com, infrahip@HIIT.FI, hipl-users@freelists.org, hipsec@ietf.org
Content-Type: multipart/mixed; boundary="=-cxEtPhsKh15+QDSHjSKR"
Organization: HIIT
Message-Id: <1122984099.1214.142.camel@odysse>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2)
Date: Tue, 02 Aug 2005 15:01:39 +0300
X-archive-position: 2828
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: diego.beltrami@HIIT.FI
Precedence: bulk
X-list: netdev
Content-Length: 36898
Lines: 1237

--=-cxEtPhsKh15+QDSHjSKR
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

Folks,

after sending the first version of the BEET patch, receiving valuable
feedback, and discussing the BEET design, we now send the new BEET patch,
which allows BEET to work without the inter-family transform (i.e. inner
address family different from the outer address family).

The implementation of this patch is based on the draft you can find at
the following URL:
http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt

The patch is attached to this email but, in case it gives problems when
applying it, you may also find it at the following URL:
http://infrahip.hiit.fi/beet/beet-patch-v2.0-2.6.12.2

As originally designed, the BEET patch currently works only for the ESP
protocol.
As Pekka Nikander mentioned in one reply [1]:
"[...] defining BEET mode for AH might be pretty tricky. [...] it
probably would require some careful thinking to define the exact
semantics, like what addresses (inner or outer) are covered by the AH
integrity protection, what does the integrity protection really assert,
etc."

As previously written, the inter-family transform has been left out for
now because the xfrm architecture doesn't support it. Once the xfrm
architecture is enhanced, the inter-family case will be properly
included since, for example, it is useful for supporting HIP over an
IPv4 network. But, as already mentioned, this requires more work on the
design of the xfrm architecture (which we consider necessary in order to
make xfrm as generic as possible).

On behalf of the BEET development team,

Signed-off-by: Diego Beltrami

Reference:
[1] http://marc.theaimsgroup.com/?l=linux-netdev&m=112265207304302&w=2

--=-cxEtPhsKh15+QDSHjSKR
Content-Disposition: attachment; filename=beet-patch-v2.0-2.6.12.2
Content-Type: text/plain; name=beet-patch-v2.0-2.6.12.2; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit

--- linux-2.6.12.2-orig/Documentation/README.BEET
+++ linux-2.6.12.2/Documentation/README.BEET
@@ -0,0 +1,296 @@
+Linux BEET-mode ESP patch
+
+Authors: Miika Komu
+         Kristian Slavov
+         Jeff Ahrenholz
+         Abhinav Pathak
+         Diego Beltrami
+
+Changelog: May 25, 2005 this document created
+
+
+Description
+-----------
+This patch extends the native Linux 2.6 kernel IPsec to support
+Bound-End-to-End-Tunnel (BEET) mode for ESP:
+
+Abstract
+
+ This document specifies a new mode, called Bound End-to-End Tunnel
+ (BEET) mode, for IPsec ESP. The new mode augments the existing ESP
+ tunnel and transport modes. For end-to-end tunnels, the new mode
+ provides limited tunnel mode semantics without the regular tunnel
+ mode overhead. The mode is intended to support new uses of ESP,
+ including mobility and multi-address multi-homing.
+
+http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-03.txt
+
+BEET mode architecture
+----------------------
+
+Below are some control flow diagrams to illustrate how BEET works.
+
+Sending (inner IPv4, outer IPv4)(4-4)
+=====================================
+inet_sendmsg
+ raw_sendmsg
+ ip_route_output_flow
+ __ip_route_output_key
+ xfrm_lookup
+ flow_cache_lookup
+ xfrm_policy_lookup // lookup IPsec policy
+ xfrm_find_bundle // lookup IPsec SA
+ __xfrm_selector_match
+ xfrm_tmpl_resolve // only if bundle was not found!
+ xfrm_state_find
+ xfrm_bundle_create // create output (dst) chain if bundle was not found
+ __xfrm4_bundle_create
+ ip_push_pending_frames
+ dst_output(skb) //this calls skb->dst->output();
+ xfrm4_output //This finally returns 4 (NET_XMIT_BYPASS) to dst_output();
+ xfrm4_encap
+ esp_output
+ xfrm_beet_output //change the ip header to outer.
+ dst_output(skb)
+ ip_output
+ ip_finish_output Or ip_fragment //depending on size of packet
+ // Returns 0 to dst_output(); which makes dst_output to come out of infinite loop.
+ dev_queue_xmit
+
+
+Receiving (inner IPv4, outer IPv4)(4-4)
+===========
+
+net_rx_action()
+e1000_clean() // dependent on network hardware
+e1000_clean_rx_irq()
+netif_receive_skb()
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ip_rcv()
+ nf_hook()
+ ip_rcv_finish()
+ ip_route_input()
+ dst_input()->ip_forward() or ip_input()
+ ip_input // remove the IPv4 header
+ ip_input_finish
+ ret = ipprot->handler(&skb, &nhoff);
+ xfrm4_rcv()
+ xfrm4_rcv_encap()
+ xfrm4_parse_spi()
+ xfrm_state_lookup() // lookup IPsec SA
+ xfrm_beet_input(skb, x) //To change to inner IP header.
+ nexthdr = x->type->input(x, xfrm.decap, skb) // == esp_input
+ esp_input() // process ESP based on inner address
+ returns 0 ;
+ /* beet handling in xfrm_rcv_spi */
+ netif_rx()
+ // ip_input_finish returns 0
+ // netif_receive_skb returns 0
+netif_receive_skb //Now we have an IPv4 packet. So the input flow is for v4 packet.
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ip_rcv()
+ nf_hook() //This calls ip_rcv_finish(skb)
+ ip_rcv_finish() //Here the skb->dst is NULL and so is filled for the input side.
+ ip_route_input()
+ dst_input()->ip_forward() or ip_input()
+ ip_input // remove the IPv4 header
+ ip_input_finish
+ ...
+ ...
+ ...
+
+Sending (inner IPv6, outer IPv6)(6-6)
+=============
+
+(When sending the first packet!)
+
+inet_sendmsg
+ rawv6_sendmsg
+ ip6_dst_lookup
+ ip6_route_output
+ xfrm_lookup
+ flow_cache_lookup
+ xfrm_policy_lookup // lookup IPsec policy
+ xfrm_find_bundle // lookup IPsec SA
+ __xfrm_selector_match
+ xfrm_tmpl_resolve // only if bundle was not found!
+ xfrm_state_find
+ xfrm_bundle_create // create output (dst) chain if bundle was not found
+ __xfrm6_bundle_create
+ rawv6_push_pending_frames
+ ip6_push_pending_frames
+ dst_output(skb)
+ xfrm6_output
+ xfrm6_encap
+ esp6_output
+ xfrm_beet_output
+ dst_output(skb)
+ ip6_output
+ ip6_output2
+ ip6_output_finish
+ dev_queue_xmit
+
+when are these called?
+ ip6_xmit()
+ dst_output()
+
+
+Receiving (inner IPv6, outer IPv6)(6-6)
+===========
+
+net_rx_action()
+e1000_clean() // dependent on network hardware
+e1000_clean_rx_irq()
+netif_receive_skb()
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ipv6_rcv() // skb len = 140
+ nf_hook_slow()
+ ip6_rcv_finish()
+ ip6_route_input()
+ dst_input()->ip6_forward() or ip6_input()
+ ip6_input // remove the IPv6 header
+ ip6_input_finish // calls recursively the ->handler = xfrm6_rcv
+ ret = ipprot->handler(&skb, &nhoff); // handler = xfrm6_rcv_spi
+ xfrm6_rcv()
+ xfrm6_rcv_spi()
+ xfrm_parse_spi()
+ xfrm_state_lookup() // lookup IPsec SA
+ xfrm_beet_input(skb, x) //To change to inner IP header.
+ nexthdr = x->type->input(x, xfrm.decap, skb) // == esp6_input
+ esp6_input() // process ESP
+ returns 58 (ICMPv6) //returns the nexthdr in the ipv6 packet.
+ /* beet handling in xfrm_rcv_spi */
+ netif_rx()
+ // ip6_input_finish returns 0
+ // netif_receive_skb returns 0
+netif_receive_skb
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ipv6_rcv() // skb len = 104
+ nf_hook_slow()
+ ip6_rcv_finish()
+ ip6_route_input()
+ dst_input()->ip6_forward() or ip6_input()
+ ip6_input // remove the IPv6 header
+ ip6_input_finish
+ xfrm6_policy_check()
+ ..
+ __xfrm_policy_check
+ ret = ipprot->handler(&skb, &nhoff); // handler = xfrm6_rcv_spi
+tcp_v6_rcv() // or icmpv6_rcv(), anyway, deliver to upper layer
+
+
+output path
+ip6_datagram_connect()
+ ip6_dst_lookup() // success
+ xfrm_lookup() // lookup policy using inner IP, matching selectors in SP and
+ flow information
+ xfrm_sk_policy_lookup() // success
+ flow_cache_lookup() // success
+ xfrm_find_bundle() // check for a bundle, if found use it, or create new
+ xfrm_tmpl_resolve() // when creating new, search for SA for each transform
+ // once valid SA found, use it to create bundle and link
+ // to SP. modify skbuff's dst-pointer pointing to next
+ // xfrmX_output(), after encaps/trans dst is consulted
+ // to route the packet
+ xfrm_state_find() //
+ xfrm_selector_match() //
+ km_query() //
+
+
+
+    app                              app
+     |                                |
+   inner                            inner
+      \                              /
+       -                            /
+        \                          /
+         \--outer          outer--/
+                 \        /
+                  \======/
+
+
+Files changed
+-------------
+This is a list of changes made by the BEET patch.
+
+include/linux/ipsec.h
+ - IPSEC_MODE_BEET added
+   This is the new type of SA that may be created.
+   XXX note: are we overusing XFRM_MODE_BEET where IPSEC_MODE_BEET should be
+   used instead?
+
+include/linux/xfrm.h
+ - enum XFRM_MODE_{TRANSPORT|TUNNEL|BEET} added
+   Mode needed to distinguish from tunnel mode in xfrm code.
+
+include/net/xfrm.h
+ - u16 beet_family added to struct xfrm_state
+   For the outgoing SA, this is the family of the outer address.
+   For the incoming SA, this is the family of the inner address.
+ - unsigned short family added to struct xfrm_tmpl
+   family is required because the family may differ from the one in the selector
+ - possible change to xfrm_selector_match() (commented out)
+
+net/ipv4/xfrm4_input.c
+ - in xfrm4_rcv_encap() change the
+   ip header to inner before going for esp test.
+ - in xfrm4_rcv_encap() check x->props.mode for XFRM_MODE_TUNNEL, _BEET
+   checks address family (x->props.beet_family), and makes final adjustments
+   to packet before requeuing it.
+
+net/ipv4/xfrm4_output.c
+ - xfrm4_encap(), note to fix the BEET case, like xfrm6_encap
+ - xfrm4_output() changes the ip header
+
+net/ipv4/esp4.c
+ - in esp_init_state(), check if x->props.mode == XFRM_MODE_TUNNEL,
+   then x->props.header_len += sizeof(struct iphdr), not if (x->props.mode)
+
+net/ipv6/esp6.c
+ - in esp6_init_state(), check if x->props.mode == XFRM_MODE_TUNNEL,
+   then x->props.header_len += sizeof(struct ipv6hdr), not if (x->props.mode)
+
+net/ipv6/xfrm6_input.c
+ - xfrm6_rcv_spi(), changes the
+   inner ip header before sending to esp decapsulation.
+ - in xfrm6_rcv_spi(), handle x->props.mode == XFRM_MODE_BEET
+   checks address family (x->props.beet_family), makes final adjustments to
+   packet before requeuing it.
+
+net/ipv6/xfrm6_output.c
+ - xfrm6_encap() add ipv4 header vars, check if (x->props.mode==XFRM_MODE_BEET)
+   makes space for appropriate esp header and sends to espX_output where X depends
+   on inner family of beet.
+ - xfrm6_output() change if(x->props.mode) to (x->props.mode==XFRM_MODE_TUNNEL)
+   After esp calculations the ip header is changed
+   to outer ip header.
+
+net/ipv6/xfrm6_policy.c
+ (on output...)
+ - in __xfrm6_bundle_create() added remotebeet, localbeet vars,
+   get the IPv6 headers from xfrm[i]->id.daddr (remote) and
+   xfrm[i]->props.saddr (local)
+   copy IPv4 or IPv6 addresses from remote/localbeet to fl_tunnel.fl4/6_dst/src
+   then do xfrm_dst_lookup() passing in xfrm[i]->props.beet_family
+
+net/key/af_key.c
+ - commented-out code in pfkey_msg2xfrm_state():
+   check x->props.beet_family for x->props.family?
+
+ - parse_ipsecrequest() check if (t->mode==IPSEC_MODE_TUNNEL-1)
+   handle if (t->mode==IPSEC_MODE_BEET-1)
+   populate t->saddr.a4 or t->saddr.a6, t->family, etc
+   This supports adding a new type of beet mode SA.
+
+net/xfrm/Kconfig
+ - added XFRM_BEET config variable option and text
+   This allows you to compile BEET mode into your kernel.
+ +net/xfrm/xfrm_policy.c + - note from Miika - fns added just for testing, removed for BEET + ipv6_addr_is_hit(), hip_xfrm_handler_notify(), hip_xfrm_handler_acquire(), + hip_xfrm_handler_policy_notify(), hip_register_xfrm_km_handler(), etc --- linux-2.6.12.2-orig/include/linux/ipsec.h +++ linux-2.6.12.2/include/linux/ipsec.h @@ -13,6 +13,9 @@ IPSEC_MODE_ANY = 0, /* We do not support this for SA */ IPSEC_MODE_TRANSPORT = 1, IPSEC_MODE_TUNNEL = 2 +#ifdef CONFIG_XFRM_BEET + ,IPSEC_MODE_BEET = 3 +#endif }; enum { --- linux-2.6.12.2-orig/include/linux/xfrm.h +++ linux-2.6.12.2/include/linux/xfrm.h @@ -102,6 +102,15 @@ XFRM_SHARE_UNIQUE /* Use once */ }; +enum +{ + XFRM_MODE_TRANSPORT = 0, + XFRM_MODE_TUNNEL +#ifdef CONFIG_XFRM_BEET + ,XFRM_MODE_BEET +#endif +}; + /* Netlink configuration messages. */ enum { XFRM_MSG_BASE = 0x10, --- linux-2.6.12.2-orig/include/net/xfrm.h +++ linux-2.6.12.2/include/net/xfrm.h @@ -113,6 +113,14 @@ xfrm_address_t saddr; int header_len; int trailer_len; +#ifdef CONFIG_XFRM_BEET + /* beet_family_out = family of outer addresses + * beet_family_in = family of inner addresses + */ + u16 beet_family_in; + u16 beet_family_out; + +#endif } props; struct xfrm_lifetime_cfg lft; @@ -241,6 +249,12 @@ /* Source address of tunnel. Ignored, if it is not a tunnel. */ xfrm_address_t saddr; +/* family of the addresses. 
In BEET-mode the family may differ from + the one in selector */ +#ifdef CONFIG_XFRM_BEET + unsigned short family; +#endif + __u32 reqid; /* Mode: transport/tunnel */ @@ -835,6 +849,12 @@ extern void xfrm6_tunnel_free_spi(xfrm_address_t *saddr); extern u32 xfrm6_tunnel_spi_lookup(xfrm_address_t *saddr); extern int xfrm6_output(struct sk_buff *skb); +#ifdef CONFIG_XFRM_BEET +extern struct xfrm_state * xfrm_lookup_bydst(u8 mode, xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family); +extern int xfrm_beet_output(struct sk_buff *skb); +extern int xfrm_beet_input(struct sk_buff *skb, struct xfrm_state *x); + +#endif #ifdef CONFIG_XFRM extern int xfrm4_rcv_encap(struct sk_buff *skb, __u16 encap_type); --- linux-2.6.12.2-orig/net/ipv4/esp4.c +++ linux-2.6.12.2/net/ipv4/esp4.c @@ -1,3 +1,13 @@ +/* + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * + */ + #include #include #include @@ -23,7 +33,7 @@ struct iphdr *top_iph; struct ip_esp_hdr *esph; struct crypto_tfm *tfm; - struct esp_data *esp; + struct esp_data *esp = x->data; struct sk_buff *trailer; int blksize; int clen; @@ -31,7 +41,15 @@ int nfrags; /* Strip IP+ESP header. 
*/ - __skb_pull(skb, skb->h.raw - skb->data); +#ifdef CONFIG_XFRM_BEET + int hdr_len = skb->h.raw - skb->data + sizeof(*esph) + esp->conf.ivlen; + if (x->props.mode == XFRM_MODE_BEET) + __skb_pull(skb, hdr_len); + else + __skb_pull(skb, skb->h.raw - skb->data); +#else + __skb_pull(skb, skb->h.raw - skb->data); +#endif /* Now skb is pure payload to encrypt */ err = -ENOMEM; @@ -39,7 +57,6 @@ /* Round to block size */ clen = skb->len; - esp = x->data; alen = esp->auth.icv_trunc_len; tfm = esp->conf.tfm; blksize = (crypto_tfm_alg_blocksize(tfm) + 3) & ~3; @@ -59,7 +76,14 @@ *(u8*)(trailer->tail + clen-skb->len - 2) = (clen - skb->len)-2; pskb_put(skb, trailer, clen - skb->len); +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) + __skb_push(skb, hdr_len); + else + __skb_push(skb, skb->data - skb->nh.raw); +#else __skb_push(skb, skb->data - skb->nh.raw); +#endif top_iph = skb->nh.iph; esph = (struct ip_esp_hdr *)(skb->nh.raw + top_iph->ihl*4); top_iph->tot_len = htons(skb->len + alen); @@ -428,7 +452,11 @@ if (crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len)) goto error; x->props.header_len = sizeof(struct ip_esp_hdr) + esp->conf.ivlen; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) +#else if (x->props.mode) +#endif x->props.header_len += sizeof(struct iphdr); if (x->encap) { struct xfrm_encap_tmpl *encap = x->encap; --- linux-2.6.12.2-orig/net/ipv4/xfrm4_input.c +++ linux-2.6.12.2/net/ipv4/xfrm4_input.c @@ -7,6 +7,13 @@ * Derek Atkins * Add Encapsulation support * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -78,6 +85,25 @@ goto drop_unlock; xfrm_vec[xfrm_nr].decap.decap_type = encap_type; + +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + /* Change the outer header with the inner data */ + if (x->props.beet_family_in == AF_INET && x->props.beet_family_out == AF_INET){ + /* Inner = 4, Outer = 4 */ + struct 
iphdr *iph = (struct iphdr *)skb->nh.iph; + iph->daddr = x->sel.daddr.a4; + iph->saddr = x->sel.saddr.a4; + iph->ttl--; + iph->tot_len = htons(skb->len); + iph->frag_off = htons(IP_DF); + iph->check = 0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + + } else + BUG_ON(1); + } +#endif if (x->type->input(x, &(xfrm_vec[xfrm_nr].decap), skb)) goto drop_unlock; @@ -96,7 +122,11 @@ iph = skb->nh.iph; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { +#endif if (iph->protocol != IPPROTO_IPIP) goto drop; if (!pskb_may_pull(skb, sizeof(struct iphdr))) @@ -115,9 +145,35 @@ decaps = 1; break; } +#ifdef CONFIG_XFRM_BEET + else if (x->props.mode == XFRM_MODE_BEET) { + struct iphdr *iph = skb->nh.iph; + int size = sizeof(struct iphdr); + + if (skb_cloned(skb) && + pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + goto drop; - if ((err = xfrm_parse_spi(skb, skb->nh.iph->protocol, &spi, &seq)) < 0) + skb_push(skb, size); + + memmove(skb->data, skb->nh.raw, size); + skb->mac.raw = memmove(skb->data - skb->mac_len, + skb->mac.raw, skb->mac_len); + skb->nh.raw = skb->data; + iph->tot_len = htons(skb->len); + iph->check = 0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + skb->protocol = htons(ETH_P_IP); + dst_release(skb->dst); + skb->dst = NULL; + decaps = 1; + + break; + } +#endif + if ((err = xfrm_parse_spi(skb, skb->nh.iph->protocol, &spi, &seq)) < 0) goto drop; + } while (!err); /* Allocate new secpath or COW existing one. */ --- linux-2.6.12.2-orig/net/ipv4/xfrm4_output.c +++ linux-2.6.12.2/net/ipv4/xfrm4_output.c @@ -6,6 +6,14 @@ * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later version. 
+ * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -26,7 +34,8 @@ * check * * On exit, skb->h will be set to the start of the payload to be processed - * by x->type->output and skb->nh will be set to the top IP header. + * by x->type->output and skb->nh, as well as skb->data, will point to + * the top IP header. */ static void xfrm4_encap(struct sk_buff *skb) { @@ -35,15 +44,36 @@ struct iphdr *iph, *top_iph; iph = skb->nh.iph; - skb->h.ipiph = iph; +#ifdef CONFIG_XFRM_BEET + /* + * This is because otherwise the BEET patch crashes in any case with Inner=4 + */ + if (x->props.mode != XFRM_MODE_BEET) + skb->h.ipiph = iph; +#else + skb->h.ipiph = iph; +#endif skb->nh.raw = skb_push(skb, x->props.header_len); top_iph = skb->nh.iph; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TRANSPORT) { +#else if (!x->props.mode) { +#endif + skb->h.raw += iph->ihl*4; memmove(top_iph, iph, iph->ihl*4); return; +#ifdef CONFIG_XFRM_BEET + } else if (x->props.mode == XFRM_MODE_BEET) { + + skb->h.raw = skb->data + sizeof(struct iphdr); + memmove(top_iph, iph, iph->ihl*4); + return; + +#endif /* CONFIG_XFRM_BEET */ } top_iph->ihl = 5; @@ -103,7 +133,11 @@ goto error_nolock; } +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { +#endif err = xfrm4_tunnel_check_size(skb); if (err) goto error_nolock; @@ -120,6 +154,21 @@ if (err) goto error; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + /* Change the outer header */ + if (x->props.beet_family_in == AF_INET && x->props.beet_family_out == AF_INET){ + struct iphdr *iph = (struct iphdr*)skb->data; + iph->saddr = x->props.saddr.a4; + iph->daddr = x->id.daddr.a4; + skb->local_df = 1; //I am a bit unsure on how to implement this -Abi + iph->check = 0; + iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); + } else + BUG_ON(1); + } +#endif + x->curlft.bytes += skb->len; 
x->curlft.packets++; --- linux-2.6.12.2-orig/net/ipv4/xfrm4_policy.c 2005-06-30 02:00:53.000000000 +0300 +++ linux-2.6.12.2/net/ipv4/xfrm4_policy.c 2005-08-01 15:05:26.000000000 +0300 @@ -6,6 +6,14 @@ * YOSHIFUJI Hideaki @USAGI * Split up af-specific portion * + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -66,6 +74,12 @@ } } }; +#ifdef CONFIG_XFRM_BEET + union { + struct in6_addr *in6; + struct in_addr *in; + } remotebeet, localbeet; +#endif int i; int err; int header_len = 0; @@ -78,6 +92,9 @@ struct dst_entry *dst1 = dst_alloc(&xfrm4_dst_ops); struct xfrm_dst *xdst; int tunnel = 0; +#ifdef CONFIG_XFRM_BEET + unsigned short beet_family = 0; +#endif if (unlikely(dst1 == NULL)) { err = -ENOBUFS; @@ -98,11 +115,26 @@ dst1->next = dst_prev; dst_prev = dst1; +#ifdef CONFIG_XFRM_BEET + if (xfrm[i]->props.mode == XFRM_MODE_TUNNEL) { +#else if (xfrm[i]->props.mode) { +#endif remote = xfrm[i]->id.daddr.a4; local = xfrm[i]->props.saddr.a4; tunnel = 1; } +#ifdef CONFIG_XFRM_BEET + else if (xfrm[i]->props.mode == XFRM_MODE_BEET) { + + if(xfrm[i]->props.beet_family_out == AF_INET){ + remotebeet.in = (struct in_addr*)&xfrm[i]->id.daddr; + localbeet.in = (struct in_addr*)&xfrm[i]->props.saddr; + beet_family = xfrm[i]->props.beet_family_out; + } else + BUG_ON(1); + } +#endif header_len += xfrm[i]->props.header_len; trailer_len += xfrm[i]->props.trailer_len; @@ -113,6 +145,18 @@ &fl_tunnel, AF_INET); if (err) goto error; +#ifdef CONFIG_XFRM_BEET + } else if (beet_family) { + fl_tunnel.fl4_dst = remotebeet.in->s_addr; + fl_tunnel.fl4_src = localbeet.in->s_addr; + + err = xfrm_dst_lookup((struct xfrm_dst **) &rt, + &fl_tunnel, beet_family); + /* Without this, the BEET mode crashes + indeterministically -Abi */ + rt->peer = NULL; + rt_bind_peer(rt,1); +#endif } else dst_hold(&rt->u.dst); } --- linux-2.6.12.2-orig/net/ipv6/esp6.c +++ linux-2.6.12.2/net/ipv6/esp6.c @@ -22,6 +22,16 @@ * 
Kunihiro Ishiguro * * This file is derived from net/ipv4/esp.c + * + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * + * */ #include @@ -365,7 +375,11 @@ if (crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len)) goto error; x->props.header_len = sizeof(struct ipv6_esp_hdr) + esp->conf.ivlen; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) +#else if (x->props.mode) +#endif x->props.header_len += sizeof(struct ipv6hdr); x->data = esp; return 0; --- linux-2.6.12.2-orig/net/ipv6/xfrm6_input.c +++ linux-2.6.12.2/net/ipv6/xfrm6_input.c @@ -64,6 +64,18 @@ if (xfrm_state_check_expire(x)) goto drop_unlock; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + if (x->props.beet_family_in == AF_INET6 && x->props.beet_family_out == AF_INET6){ + struct ipv6hdr *ip6h = (struct ipv6hdr *)skb->nh.raw; + ipv6_addr_copy(&ip6h->daddr, + (struct in6_addr *) &x->sel.daddr.a6); + ipv6_addr_copy(&ip6h->saddr, + (struct in6_addr *) &x->sel.saddr.a6); + } else + BUG_ON(1); + } +#endif nexthdr = x->type->input(x, &(xfrm_vec[xfrm_nr].decap), skb); if (nexthdr <= 0) goto drop_unlock; @@ -80,7 +92,11 @@ xfrm_vec[xfrm_nr++].xvec = x; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { /* XXX */ +#endif if (nexthdr != IPPROTO_IPV6) goto drop; if (!pskb_may_pull(skb, sizeof(struct ipv6hdr))) @@ -97,6 +113,33 @@ skb->nh.raw = skb->data; decaps = 1; break; +#ifdef CONFIG_XFRM_BEET + } else if (x->props.mode == XFRM_MODE_BEET) { + struct ipv6hdr *ip6h = skb->nh.ipv6h; + int size = sizeof(struct ipv6hdr); + __u16 total = ntohs(ip6h->payload_len); + + /* is the buffer a clone? 
+ * then create identical copy of header of skb */ + if (skb_cloned(skb) && + pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + goto drop; + + /* add data to the start of the buffer */ + skb_push(skb, size); + /* move the raw header into new space */ + memmove(skb->data, skb->nh.raw, size); + /* move MAC header */ + skb->mac.raw = memmove(skb->data - skb->mac_len, + skb->mac.raw, skb->mac_len); + skb->nh.raw = skb->data; + + ip6h->payload_len = htons(total + size); + --ip6h->hop_limit; + decaps = 1; + + break; +#endif } if ((err = xfrm_parse_spi(skb, nexthdr, &spi, &seq)) < 0) --- linux-2.6.12.2-orig/net/ipv6/xfrm6_output.c +++ linux-2.6.12.2/net/ipv6/xfrm6_output.c @@ -7,6 +7,14 @@ * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later version. + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -17,6 +25,10 @@ #include #include +#ifdef CONFIG_XFRM_BEET +#include +#endif + /* Add encapsulation header. 
* * In transport mode, the IP header and mutable extension headers will be moved @@ -42,7 +54,12 @@ skb_push(skb, x->props.header_len); iph = skb->nh.ipv6h; + +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TRANSPORT) { +#else if (!x->props.mode) { +#endif u8 *prevhdr; int hdr_len; @@ -51,6 +68,16 @@ skb->h.raw = skb->data + hdr_len; memmove(skb->data, iph, hdr_len); return; + +#ifdef CONFIG_XFRM_BEET + } else if (x->props.mode == XFRM_MODE_BEET) { + + memmove(skb->data, skb->nh.raw, sizeof(struct ipv6hdr)); + skb->nh.raw = &((struct ipv6hdr *)skb->data)->nexthdr; + skb->h.ipv6h = ((struct ipv6hdr *)skb->data) + 1; + return; + +#endif /* CONFIG_XFRM_BEET */ } skb->nh.raw = skb->data; @@ -104,7 +131,11 @@ goto error_nolock; } +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_TUNNEL) { +#else if (x->props.mode) { +#endif err = xfrm6_tunnel_check_size(skb); if (err) goto error_nolock; @@ -121,6 +152,19 @@ if (err) goto error; +#ifdef CONFIG_XFRM_BEET + if (x->props.mode == XFRM_MODE_BEET) { + /* Change the outer header */ + if (x->props.beet_family_in == AF_INET6 && x->props.beet_family_out == AF_INET6){ + /* Inner = 6, Outer = 6 */ + struct ipv6hdr *iph = (struct ipv6hdr*)skb->data; + ipv6_addr_copy(&iph->saddr, (struct in6_addr *)&x->props.saddr); + ipv6_addr_copy(&iph->daddr, (struct in6_addr *)&x->id.daddr); + } else + BUG_ON(1); + } +#endif + x->curlft.bytes += skb->len; x->curlft.packets++; --- linux-2.6.12.2-orig/net/ipv6/xfrm6_policy.c +++ linux-2.6.12.2/net/ipv6/xfrm6_policy.c @@ -8,7 +8,14 @@ * IPv6 support * YOSHIFUJI Hideaki * Split up af-specific portion - * + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -84,6 +91,12 @@ } } }; +#ifdef CONFIG_XFRM_BEET + union { + struct in6_addr *in6; + struct in_addr *in; + } remotebeet, localbeet; +#endif int i; int err = 0; int header_len = 0; @@ -96,6 +109,9 @@ struct dst_entry *dst1 = 
dst_alloc(&xfrm6_dst_ops); struct xfrm_dst *xdst; int tunnel = 0; +#ifdef CONFIG_XFRM_BEET + unsigned short beet_family = 0; +#endif if (unlikely(dst1 == NULL)) { err = -ENOBUFS; @@ -118,11 +134,25 @@ dst1->next = dst_prev; dst_prev = dst1; +#ifdef CONFIG_XFRM_BEET + if (xfrm[i]->props.mode == XFRM_MODE_TUNNEL) { +#else if (xfrm[i]->props.mode) { +#endif remote = (struct in6_addr*)&xfrm[i]->id.daddr; local = (struct in6_addr*)&xfrm[i]->props.saddr; tunnel = 1; } +#ifdef CONFIG_XFRM_BEET + else if (xfrm[i]->props.mode == XFRM_MODE_BEET) { + if (xfrm[i]->props.beet_family_out == AF_INET6) { + beet_family = xfrm[i]->props.beet_family_out; + remotebeet.in6 = (struct in6_addr*)&xfrm[i]->id.daddr; + localbeet.in6 = (struct in6_addr*)&xfrm[i]->props.saddr; + } else + BUG_ON(1); + } +#endif header_len += xfrm[i]->props.header_len; trailer_len += xfrm[i]->props.trailer_len; @@ -133,6 +163,13 @@ &fl_tunnel, AF_INET6); if (err) goto error; +#ifdef CONFIG_XFRM_BEET + } else if (beet_family) { + ipv6_addr_copy(&fl_tunnel.fl6_dst, remotebeet.in6); + ipv6_addr_copy(&fl_tunnel.fl6_src, localbeet.in6); + err = xfrm_dst_lookup((struct xfrm_dst **) &rt, + &fl_tunnel, beet_family); +#endif } else dst_hold(&rt->u.dst); } --- linux-2.6.12.2-orig/net/key/af_key.c +++ linux-2.6.12.2/net/key/af_key.c @@ -12,6 +12,14 @@ * Kunihiro Ishiguro * Kazunori MIYAZAWA / USAGI Project * Derek Atkins + * + * Changes: BEET support + * Abhinav Pathak + * Diego Beltrami + * Kristian Slavov + * Miika Komu + * Jeff Ahrenholz + * */ #include @@ -28,6 +36,10 @@ #include #include +#ifdef CONFIG_XFRM_BEET +#include +#endif + #include #define _X2KEY(x) ((x) == XFRM_INF ? 
0 : (x)) @@ -1584,7 +1596,11 @@ } /* addresses present only in tunnel mode */ +#ifdef CONFIG_XFRM_BEET + if (t->mode == IPSEC_MODE_TUNNEL-1) { +#else if (t->mode) { +#endif switch (xp->family) { case AF_INET: sin = (void*)(rq+1); @@ -1612,6 +1628,40 @@ return -EINVAL; } } +#ifdef CONFIG_XFRM_BEET + else if (t->mode == IPSEC_MODE_BEET-1) { + struct sockaddr *sa; + + sa = (struct sockaddr *)(rq+1); + switch(sa->sa_family) { + case AF_INET: + sin = (struct sockaddr_in *)sa; + t->saddr.a4 = sin->sin_addr.s_addr; + sin++; + if (sin->sin_family != AF_INET) + return -EINVAL; + t->id.daddr.a4 = sin->sin_addr.s_addr; + t->family = AF_INET; + + break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: + sin6 = (struct sockaddr_in6 *)sa; + memcpy(t->saddr.a6, &sin6->sin6_addr, sizeof(struct in6_addr)); + sin6++; + if (sin6->sin6_family != AF_INET6) + return -EINVAL; + memcpy(t->id.daddr.a6, &sin6->sin6_addr, sizeof(struct in6_addr)); + t->family = AF_INET6; + + break; +#endif /* CONFIG_IPV6 */ + default: + return -EINVAL; + } + } +#endif /* CONFIG_XFRM_BEET */ + /* No way to set this via kame pfkey */ t->aalgos = t->ealgos = t->calgos = ~0; xp->xfrm_nr++; @@ -1935,6 +1985,48 @@ (err = parse_ipsecrequests(xp, pol)) < 0) goto out; +#ifdef CONFIG_XFRM_BEET + /* lookup the SA (xfrm_state) and copy the inner addresses from + * the policy (xfrm_policy) to the selector within the state + */ + if (xp->xfrm_vec[0].mode == IPSEC_MODE_BEET-1) { + struct xfrm_state *x; + if (xp->family == AF_INET6) { + if ((x = xfrm_lookup_bydst(XFRM_MODE_BEET, + &xp->xfrm_vec[0].id.daddr, + &xp->xfrm_vec[0].saddr, + AF_INET6))) { + /* Inner = 6, Outer = 6 */ + x->props.beet_family_out = AF_INET6; + x->props.beet_family_in = AF_INET6; + /* insert inner addresses into the selector */ + memcpy( &x->sel.daddr, &xp->selector.daddr, + sizeof(xfrm_address_t)); + memcpy( &x->sel.saddr, &xp->selector.saddr, + sizeof(xfrm_address_t)); + x->type = xfrm_get_type(x->id.proto, 
x->props.beet_family_in); + } + } else if (xp->family == AF_INET) { + if ((x = xfrm_lookup_bydst(XFRM_MODE_BEET, + &xp->xfrm_vec[0].id.daddr, + &xp->xfrm_vec[0].saddr, + AF_INET))) + { + /* Inner = 4, Outer = 4 */ + x->props.beet_family_out = AF_INET; + x->props.beet_family_in = AF_INET; + /* insert inner addresses into the selector */ + memcpy( &x->sel.daddr, &xp->selector.daddr, + sizeof(xfrm_address_t)); + memcpy( &x->sel.saddr, &xp->selector.saddr, + sizeof(xfrm_address_t)); + x->type = xfrm_get_type(x->id.proto, x->props.beet_family_in); + } + } else { + BUG_ON(1); + } + } +#endif out_skb = pfkey_xfrm_policy2msg_prep(xp); if (IS_ERR(out_skb)) { err = PTR_ERR(out_skb); --- linux-2.6.12.2-orig/net/xfrm/Kconfig +++ linux-2.6.12.2/net/xfrm/Kconfig @@ -10,3 +10,19 @@ If unsure, say Y. +config XFRM_BEET + bool "IPsec BEET mode" + depends on XFRM + ---help--- + IPsec BEET mode is combination of IPsec transport and tunnel mode. + Currently, it is used only by HIP. + + If unsure, say N. + +config XFRM_BEET_DEBUG + bool "IPsec BEET mode debugging" + depends on XFRM_BEET + ---help--- + Enables BEET mode debugging via syslog. + + If unsure, say N. 
--- linux-2.6.12.2-orig/net/xfrm/xfrm_state.c +++ linux-2.6.12.2/net/xfrm/xfrm_state.c @@ -1036,3 +1036,31 @@ INIT_WORK(&xfrm_state_gc_work, xfrm_state_gc_task, NULL); } +#ifdef CONFIG_XFRM_BEET + +struct xfrm_state * +xfrm_lookup_bydst(u8 mode, xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family) +{ + struct xfrm_state *x; + unsigned h = xfrm_dst_hash(daddr, family); + + list_for_each_entry(x, xfrm_state_bydst+h, bydst){ + + if (x->props.family == AF_INET6 && + ipv6_addr_equal((struct in6_addr *)daddr, (struct in6_addr *)x->id.daddr.a6) && + mode == x->props.mode && + ipv6_addr_equal((struct in6_addr *)saddr, (struct in6_addr *)x->props.saddr.a6)) { + return(x); + } + + if (x->props.family == AF_INET && + daddr->a4 == x->id.daddr.a4 && + mode == x->props.mode && + saddr->a4 == x->props.saddr.a4) + return(x); + + } + return(NULL); +} + +#endif //CONFIG_XFRM_BEET --=-cxEtPhsKh15+QDSHjSKR-- From simon@devicescape.com Tue Aug 2 11:32:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Aug 2005 11:32:49 -0700 (PDT) Received: from dhost002-46.dex002.intermedia.net (dhost002-46.dex002.intermedia.net [64.78.21.140]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j72IWiH9004361 for ; Tue, 2 Aug 2005 11:32:44 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: Why is packet socket checked before bridge in netif_receive_skb? Date: Tue, 2 Aug 2005 11:30:40 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Why is packet socket checked before bridge in netif_receive_skb? Thread-Index: AcWSHF0Gj4AD5v6YSY+jjFOmaI5EVQFc7Jbw From: "Simon Barber" To: "David S. 
Miller" Cc: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j72IWiH9004361 X-archive-position: 2829 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: simon@devicescape.com Precedence: bulk X-list: netdev Content-Length: 834 Lines: 26 Ah - OK, so I guess it is an artifact of the reuse of the protocol handler structures that means that some packet sockets see frames before the bridge and others after. I guess this doesn't cause any problems. Simon -----Original Message----- From: David S. Miller [mailto:davem@davemloft.net] Sent: Tuesday, July 26, 2005 12:59 PM To: Simon Barber Cc: netdev@oss.sgi.com Subject: Re: Why is packet socket checked before bridge in netif_receive_skb? From: "Simon Barber" Subject: Why is packet socket checked before bridge in netif_receive_skb? Date: Tue, 26 Jul 2005 11:03:17 -0700 > The protocol handlers are also used to implement packet sockets. - Why > is the all handler checked before the bridge hook? Because we want packet sniffers to see the packet before the bridging layer decapsulates it. 
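The ordering described above can be sketched as a toy user-space model. This is only an illustration under stated assumptions: the names toy_receive, sniffer_tap, bridge_hook and protocol_handler are hypothetical and are not kernel functions, and the real netif_receive_skb does considerably more. The point it shows is that a tap registered for all protocols runs first, so it sees the frame even when the bridge later takes it, while a protocol-specific handler only runs if the bridge does not consume the frame.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy user-space model of the receive ordering; none of these names
 * are kernel APIs, they exist only for this sketch. */
enum { MAX_LOG = 8 };

struct frame {
	int consumed_by_bridge;	/* non-zero: the bridge takes the frame */
};

static const char *call_log[MAX_LOG];
static size_t call_count;

static void record(const char *who)
{
	if (call_count < MAX_LOG)
		call_log[call_count++] = who;
}

/* Stands in for a packet socket listening on all protocols. */
static void sniffer_tap(struct frame *f)
{
	(void)f;
	record("sniffer");
}

/* Stands in for the bridge hook; returns non-zero if it consumed the frame. */
static int bridge_hook(struct frame *f)
{
	record("bridge");
	return f->consumed_by_bridge;
}

/* Stands in for a protocol-specific handler (or protocol-bound packet socket). */
static void protocol_handler(struct frame *f)
{
	(void)f;
	record("protocol");
}

static void toy_receive(struct frame *f)
{
	call_count = 0;
	sniffer_tap(f);		/* taps run first, before bridging */
	if (bridge_hook(f))
		return;		/* bridged frames never reach protocol handlers */
	protocol_handler(f);
}
```

In this model a frame with consumed_by_bridge set is logged as "sniffer" then "bridge", and a locally delivered frame additionally reaches "protocol", which matches the observation that some packet sockets see frames before the bridge and others after.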
From mateuszb@gmail.com Tue Aug 2 15:14:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Aug 2005 15:14:14 -0700 (PDT) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.204]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j72ME0H9019955 for ; Tue, 2 Aug 2005 15:14:09 -0700 Received: by rproxy.gmail.com with SMTP id z35so1507231rne for ; Tue, 02 Aug 2005 15:11:58 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=i188FGyYMb3MJ3yXIMqXK+wqy9oCJvXcmFCStSYnIHPhXFBQVwZnaVMjxih55U9AvdtamzMDnt2oLAXBiWfXOPEDnvwMU8cALLG4kwz1oQdA4IvEJUwmSaVup0K3mNdb6EWdbX33pa3b0Um03xt6FweXFioP5Il+zarz6tMyCVQ= Received: by 10.38.11.37 with SMTP id 37mr2800rnk; Tue, 02 Aug 2005 15:11:58 -0700 (PDT) Received: by 10.38.89.50 with HTTP; Tue, 2 Aug 2005 15:11:58 -0700 (PDT) Message-ID: Date: Wed, 3 Aug 2005 00:11:58 +0200 From: Mateusz Berezecki Reply-To: Mateusz Berezecki To: netdev@oss.sgi.com Subject: Fwd: opensource atheros driver almost done In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j72ME0H9019955 X-archive-position: 2830 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mateuszb@gmail.com Precedence: bulk X-list: netdev Content-Length: 1041 Lines: 40 Hi everyone, I just thought I could have CC'ed that e-mail to netdev list as well so I am forwarding the original here. 
And yes, I am looking for some kind of assistance ;-) all the best, Mateusz ---------- Forwarded message ---------- From: Mateusz Berezecki Date: Aug 3, 2005 12:08 AM Subject: opensource atheros driver almost done To: kernel-mentors@selenic.com Hi list members, I am about 85 percent done with an opensource Atheros driver (for 5212 combo chips at the moment) and I am seeking some help in order to connect the driver with kernel routines and expose entrypoints so the driver could be used by applications and I could finally begin testing. The driver source code is located at http://mateusz.agrest.org/atheros/ I am developing that driver using the ieee80211 branch of the netdev kernel tree, and I am not so confident I do things right using the new ieee80211 API. I would appreciate any suggestions and review of the source code. Thanks a lot, Mateusz -- Mateusz Berezecki http://mateusz.agrest.org From ravinandan.arakali@neterion.com Tue Aug 2 16:16:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Aug 2005 16:16:54 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j72NGlH9023694 for ; Tue, 2 Aug 2005 16:16:48 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j72NDvcx015589; Tue, 2 Aug 2005 19:13:57 -0400 (EDT) Received: from rarakali ([10.16.16.72]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j72NDtKP020621; Tue, 2 Aug 2005 19:13:55 -0400 (EDT) From: "Ravinandan Arakali" To: "'Christoph Hellwig'" Cc: "'David S. 
Miller'" , , , , , Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 2 Aug 2005 16:13:57 -0700 Message-ID: <000901c597b7$dc316080$4810100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Importance: Normal In-Reply-To: <20050731140515.GA6261@infradead.org> X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2831 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 1317 Lines: 35 Hi Christoph, Following is SGI's stand on this issue: SGI recommends that customers use the -sn2 kernel. This is the kernel that is installed by our factory when we ship systems. The -sn2 kernel is also the kernel that must be run if the Altix has more than 128 CPUs. So I'd be surprised if the majority of the Altix systems in the field are not running the -sn2 kernel. Thanks, Ravi -----Original Message----- From: Christoph Hellwig [mailto:hch@infradead.org] Sent: Sunday, July 31, 2005 7:05 AM To: Ravinandan Arakali Cc: 'David S. Miller'; hch@infradead.org; raghavendra.koushik@neterion.com; jgarzik@pobox.com; netdev@oss.sgi.com; leonid.grossman@neterion.com; rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements On Fri, Jul 29, 2005 at 09:37:55AM -0700, Ravinandan Arakali wrote: > David, > We are trying to use the "default" directive in Kconfig. We tried > using an unconditional directive (just to test it out) such as > "default y" and a conditional one such as "default y if > CONFIG_IA64_SGI_SN2". 
Again, please make this a module option, CONFIG_IA64_SGI_SN2 does not mean runs on Altix but that this is a kernel that only supports Altix, which is a non-standard case that doesn't cover 90% or more of actual Altix systems in the field and running 2.6. From SRS0+3ce4cc44608fa7c9dc2f+709+infradead.org+hch@pentafluge.srs.infradead.org Tue Aug 2 16:28:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 02 Aug 2005 16:29:03 -0700 (PDT) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j72NSrH9028398 for ; Tue, 2 Aug 2005 16:28:54 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.52 #1 (Red Hat Linux)) id 1E069m-0005Ia-7E; Wed, 03 Aug 2005 00:26:46 +0100 Date: Wed, 3 Aug 2005 00:26:45 +0100 From: "'Christoph Hellwig'" To: Ravinandan Arakali Cc: "'Christoph Hellwig'" , "'David S. Miller'" , raghavendra.koushik@neterion.com, jgarzik@pobox.com, netdev@oss.sgi.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Message-ID: <20050802232645.GA20313@infradead.org> References: <20050731140515.GA6261@infradead.org> <000901c597b7$dc316080$4810100a@pc.s2io.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <000901c597b7$dc316080$4810100a@pc.s2io.com> User-Agent: Mutt/1.4.2.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2832 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 615 Lines: 14 On Tue, Aug 02, 2005 at 04:13:57PM -0700, Ravinandan Arakali wrote: > Hi Christoph, > Following is SGI's stand on this issue: > > SGI recommends that customers use the -sn2 kernel. 
This is the kernel > that is installed by our factory when we ship systems. The -sn2 > kernel is also the kernel that must be run if the Altix has more than > 128 CPUs. So I'd be surprised if the majority of the Altix systems in > the field are not running the -sn2 kernel. Then my argument is wrong for SuSE ;-) This still needs to be a runtime switch though, not just for this reason. Platform ifdefs are not the way to go. From prarit@sgi.com Wed Aug 3 05:51:24 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 05:51:34 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73CpOH9025002 for ; Wed, 3 Aug 2005 05:51:24 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id j73Cn340001022; Wed, 3 Aug 2005 08:49:04 -0400 Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id j73CmtV27503; Wed, 3 Aug 2005 08:48:59 -0400 Received: from [172.16.80.158] (prarit.boston.redhat.com [172.16.80.158]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id j73CmsD9012835; Wed, 3 Aug 2005 08:48:54 -0400 Message-ID: <42F0BD1E.7070006@sgi.com> Date: Wed, 03 Aug 2005 08:48:30 -0400 From: Prarit Bhargava User-Agent: Mozilla Thunderbird 1.0 (X11/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: ravinandan.arakali@neterion.com CC: "'Christoph Hellwig'" , "'David S. 
Miller'" , raghavendra.koushik@neterion.com, jgarzik@pobox.com, netdev@oss.sgi.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2833 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: prarit@sgi.com Precedence: bulk X-list: netdev Content-Length: 748 Lines: 20 > On Tue, Aug 02, 2005 at 04:13:57PM -0700, Ravinandan Arakali wrote: >> Hi Christoph, >> Following is SGI's stand on this issue: >> >> SGI recommends that customers use the -sn2 kernel. This is the kernel >> that is installed by our factory when we ship systems. The -sn2 >> kernel is also the kernel that must be run if the Altix has more than >> 128 CPUs. So I'd be surprised if the majority of the Altix systems in >> the field are not running the -sn2 kernel. > > Then my argument is wrong for SuSE ;-) This still needs to be a runtime > switch though, not just for this reason. Platform ifdefs are not the > way to go. Hi Ravinandan, Ditto for SGI Altix on Red Hat -- platform-specific ifdefs are not the proper way to do this. P. 
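The "runtime switch instead of platform ifdef" suggestion can be sketched roughly as follows. This is a minimal user-space sketch, not the s2io driver's code: the variable use_platform_tuning, the function pick_burst_size and both numeric values are made up for illustration. In an actual driver the flag would presumably be exposed as a module parameter via module_param(); a plain variable stands in for it here so the example can be compiled and run outside the kernel.

```c
#include <assert.h>

/* Hypothetical tunable standing in for a driver module parameter.
 * In kernel code this would be declared with module_param(); it is a
 * plain variable here so the sketch runs in user space. */
static unsigned int use_platform_tuning;	/* 0 = generic defaults */

/* Made-up values, chosen only to make the two paths distinguishable. */
enum { GENERIC_BURST = 256, TUNED_BURST = 1024 };

/* One binary serves every platform; the choice is made at run time
 * rather than being compiled in with a platform-specific #ifdef. */
static unsigned int pick_burst_size(void)
{
	return use_platform_tuning ? TUNED_BURST : GENERIC_BURST;
}
```

The design point is that a distribution kernel built without the platform-specific config symbol can still opt in to the tuned path on hardware that benefits from it, simply by setting the parameter at load time.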
From tmdubui@tycho.ncsc.mil Wed Aug 3 09:48:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 09:48:40 -0700 (PDT) Received: from jazzhorn.ncsc.mil (mummy.ncsc.mil [144.51.88.129]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73GmaH9012205 for ; Wed, 3 Aug 2005 09:48:36 -0700 Received: from deliverance.tycho.ncsc.mil (jazzhorn.ncsc.mil [144.51.5.9]) by jazzhorn.ncsc.mil (8.12.10/8.12.10) with ESMTP id j73Gf2EZ006494 for ; Wed, 3 Aug 2005 16:44:45 GMT Received: by deliverance.tycho.ncsc.mil with Internet Mail Service (5.5.2650.21) id ; Wed, 3 Aug 2005 11:47:46 -0400 Message-ID: <0D2F5426C26FD511A1C400B0D0D059680156922E@deliverance.tycho.ncsc.mil> From: "DuBuisson, Thomas" To: "'netdev@oss.sgi.com'" Subject: PF_KEY not RFC2367 compliant Date: Wed, 3 Aug 2005 11:47:44 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="iso-8859-1" X-archive-position: 2834 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tmdubui@tycho.ncsc.mil Precedence: bulk X-list: netdev Content-Length: 1314 Lines: 36 Section 3.1.6 of RFC 2367 clearly indicates there are two cases in which user-space programs can send the kernel SADB_ACQUIRE messages. The first case is just the 'struct sadb_msg' header that should specify an error relating to a previous acquire message. I don't think the other case is implemented in the Linux kernel - I have reprinted the relevant portion of the RFC below: ------------------ The third is where an application-layer consumer of security associations (e.g. an OSPFv2 or RIPv2 daemon) needs a security association. Send an SADB_ACQUIRE message from a user process to the kernel. The kernel returns an SADB_ACQUIRE message to registered sockets. The user-level consumer waits for an SADB_UPDATE or SADB_ADD message for its particular type, and then can use that association by using SADB_GET messages. 
---------- Now for the barrage of questions: Was this omitted for a reason? Are we aware this was omitted? Does someone already have a patch? Would a patch be accepted for 2.6.13 if it is sent in time? This is a bug, after all. Cheers, Thomas From raghavendra.koushik@neterion.com Wed Aug 3 12:43:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:43:35 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JhTH9027166 for ; Wed, 3 Aug 2005 12:43:29 -0700 Received: by linux.site (Postfix, from userid 0) id BF46A98336; Wed, 3 Aug 2005 12:27:09 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 2/13] S2io: Hardware fixes Message-Id: <20050803192709.BF46A98336@linux.site> Date: Wed, 3 Aug 2005 12:27:09 -0700 (PDT) X-archive-position: 2836 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 23254 Lines: 666 Hi, The patch below addresses a few h/w-specific issues. 1. Check for additional ownership bit on Rx path before starting Rx processing. 2. Enable only 4 PCCs (Per Context Controllers) for Xframe I revisions less than 4. 3. Program Rx and Tx round robin registers depending on no. of rings/FIFOs. 4. Tx continuous interrupts are now controlled by a loadable parameter. 5. Reset the card if we get double-bit ECC errors. 6. A soft reset of XGXS being done to force a link state change has been eliminated. 7. After a reset, clear "parity error detected" bit, PCI-X ECC status register, and PCI_STATUS bit in tx_pic_int register. 8. The error in the disabling allmulticast implementation has been rectified. 9. 
Leave the PCI-X parameters MMRBC, OST etc. at their BIOS/system defaults. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io-regs.h linux-2.6.13-rc4/drivers/net/s2io-regs.h --- vanilla_linux/drivers/net/s2io-regs.h 2005-08-02 02:14:25.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io-regs.h 2005-08-02 02:14:02.000000000 -0700 @@ -62,6 +62,7 @@ typedef struct _XENA_dev_config { #define ADAPTER_STATUS_RMAC_REMOTE_FAULT BIT(6) #define ADAPTER_STATUS_RMAC_LOCAL_FAULT BIT(7) #define ADAPTER_STATUS_RMAC_PCC_IDLE vBIT(0xFF,8,8) +#define ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE vBIT(0x0F,8,8) #define ADAPTER_STATUS_RC_PRC_QUIESCENT vBIT(0xFF,16,8) #define ADAPTER_STATUS_MC_DRAM_READY BIT(24) #define ADAPTER_STATUS_MC_QUEUES_READY BIT(25) @@ -245,6 +246,7 @@ typedef struct _XENA_dev_config { #define STAT_TRSF_PER(n) TBD #define PER_SEC 0x208d5 #define SET_UPDT_PERIOD(n) vBIT((PER_SEC*n),32,32) +#define SET_UPDT_CLICKS(val) vBIT(val, 32, 32) u64 stat_addr; @@ -289,6 +291,7 @@ typedef struct _XENA_dev_config { u64 pcc_err_reg; #define PCC_FB_ECC_DB_ERR vBIT(0xFF, 16, 8) +#define PCC_ENABLE_FOUR vBIT(0x0F,0,8) u64 pcc_err_mask; u64 pcc_err_alarm; @@ -690,6 +693,10 @@ typedef struct _XENA_dev_config { #define MC_ERR_REG_MIRI_CRI_ERR_0 BIT(22) #define MC_ERR_REG_MIRI_CRI_ERR_1 BIT(23) #define MC_ERR_REG_SM_ERR BIT(31) +#define MC_ERR_REG_ECC_ALL_SNG (BIT(6) | \ + BIT(7) | BIT(17) | BIT(19)) +#define MC_ERR_REG_ECC_ALL_DBL (BIT(14) | \ + BIT(15) | BIT(18) | BIT(20)) u64 mc_err_mask; u64 mc_err_alarm; diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 02:14:17.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 02:23:37.000000000 -0700 @@ -68,6 +68,16 @@ static char s2io_driver_name[] = "Neterion"; static char s2io_driver_version[] = "Version 1.7.7"; +static inline int RXD_IS_UP2DT(RxD_t *rxdp) +{ + int ret; + + ret = 
((!(rxdp->Control_1 & RXD_OWN_XENA)) && + (GET_RXD_MARKER(rxdp->Control_2) != THE_RXD_MARK)); + + return ret; +} + /* * Cards with following subsystem_id have a link state indication * problem, 600B, 600C, 600D, 640B, 640C and 640D. @@ -230,6 +240,7 @@ static unsigned int rx_ring_sz[MAX_RX_RI static unsigned int Stats_refresh_time = 4; static unsigned int rts_frm_len[MAX_RX_RINGS] = {[0 ...(MAX_RX_RINGS - 1)] = 0 }; +static unsigned int use_continuous_tx_intrs = 1; static unsigned int rmac_pause_time = 65535; static unsigned int mc_pause_threshold_q0q3 = 187; static unsigned int mc_pause_threshold_q4q7 = 187; @@ -638,7 +649,7 @@ static int init_nic(struct s2io_nic *nic mac_control = &nic->mac_control; config = &nic->config; - /* to set the swapper control on the card */ + /* to set the swapper controle on the card */ if(s2io_set_swapper(nic)) { DBG_PRINT(ERR_DBG,"ERROR: Setting Swapper failed\n"); return -1; @@ -756,6 +767,13 @@ static int init_nic(struct s2io_nic *nic val64 |= BIT(0); /* To enable the FIFO partition. */ writeq(val64, &bar0->tx_fifo_partition_0); + /* + * Disable 4 PCCs for Xena1, 2 and 3 as per H/W bug + * SXE-008 TRANSMIT DMA ARBITRATION ISSUE. + */ + if (get_xena_rev_id(nic->pdev) < 4) + writeq(PCC_ENABLE_FOUR, &bar0->pcc_enable); + val64 = readq(&bar0->tx_fifo_partition_0); DBG_PRINT(INIT_DBG, "Fifo partition at: 0x%p is: 0x%llx\n", &bar0->tx_fifo_partition_0, (unsigned long long) val64); @@ -823,37 +841,250 @@ static int init_nic(struct s2io_nic *nic } writeq(val64, &bar0->rx_queue_cfg); - /* Initializing the Tx round robin registers to 0 - * filling tx and rx round robin registers as per - * the number of FIFOs and Rings is still TODO - */ - writeq(0, &bar0->tx_w_round_robin_0); - writeq(0, &bar0->tx_w_round_robin_1); - writeq(0, &bar0->tx_w_round_robin_2); - writeq(0, &bar0->tx_w_round_robin_3); - writeq(0, &bar0->tx_w_round_robin_4); - - /* - * TODO - * Disable Rx steering. Hard coding all packets to be steered to - * Queue 0 for now. 
+ /* + * Filling Tx round robin registers + * as per the number of FIFOs */ - val64 = 0x8080808080808080ULL; - writeq(val64, &bar0->rts_qos_steering); + switch (config->tx_fifo_num) { + case 1: + val64 = 0x0000000000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + writeq(val64, &bar0->tx_w_round_robin_1); + writeq(val64, &bar0->tx_w_round_robin_2); + writeq(val64, &bar0->tx_w_round_robin_3); + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 2: + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0100000100000100ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0001000001000001ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0100000000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 3: + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0001020000010001ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0200000100010200ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0001020000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 4: + val64 = 0x0001020300010200ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0100000102030001ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0200010000010203ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0001020001000001ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0203000100000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 5: + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 
0x0001000000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 6: + val64 = 0x0001020304000102ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0304050001020001ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0203000100000102ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0304000102030405ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0001000200000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 7: + val64 = 0x0001020001020300ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0102030400010203ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0405060001020001ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0304050000010200ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0102030000000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + case 8: + val64 = 0x0001020300040105ULL; + writeq(val64, &bar0->tx_w_round_robin_0); + val64 = 0x0200030106000204ULL; + writeq(val64, &bar0->tx_w_round_robin_1); + val64 = 0x0103000502010007ULL; + writeq(val64, &bar0->tx_w_round_robin_2); + val64 = 0x0304010002060500ULL; + writeq(val64, &bar0->tx_w_round_robin_3); + val64 = 0x0103020400000000ULL; + writeq(val64, &bar0->tx_w_round_robin_4); + break; + } + + /* Filling the Rx round robin registers as per the + * number of Rings and steering based on QoS. 
+ */ + switch (config->rx_ring_num) { + case 1: + val64 = 0x8080808080808080ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 2: + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0100000100000100ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0001000001000001ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0000010000010000ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0100000000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080808040404040ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 3: + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0001020000010001ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0200000100010200ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0001000102000001ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0001020000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080804040402020ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 4: + val64 = 0x0001020300010200ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0100000102030001ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0200010000010203ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0001020001000001ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0203000100000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080404020201010ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 5: + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0001000203000102ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0001020001030004ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0001000000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 
0x8080404020201008ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 6: + val64 = 0x0001020304000102ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0304050001020001ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0203000100000102ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0304000102030405ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0001000200000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080404020100804ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 7: + val64 = 0x0001020001020300ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0102030400010203ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0405060001020001ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0304050000010200ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0102030000000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8080402010080402ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + case 8: + val64 = 0x0001020300040105ULL; + writeq(val64, &bar0->rx_w_round_robin_0); + val64 = 0x0200030106000204ULL; + writeq(val64, &bar0->rx_w_round_robin_1); + val64 = 0x0103000502010007ULL; + writeq(val64, &bar0->rx_w_round_robin_2); + val64 = 0x0304010002060500ULL; + writeq(val64, &bar0->rx_w_round_robin_3); + val64 = 0x0103020400000000ULL; + writeq(val64, &bar0->rx_w_round_robin_4); + + val64 = 0x8040201008040201ULL; + writeq(val64, &bar0->rts_qos_steering); + break; + } /* UDP Fix */ val64 = 0; for (i = 0; i < 8; i++) writeq(val64, &bar0->rts_frm_len_n[i]); - /* Set the default rts frame length for ring0 */ - writeq(MAC_RTS_FRM_LEN_SET(dev->mtu+22), - &bar0->rts_frm_len_n[0]); + /* Set the default rts frame length for the rings configured */ + val64 = MAC_RTS_FRM_LEN_SET(dev->mtu+22); + for (i = 0 ; i < config->rx_ring_num ; i++) + writeq(val64, &bar0->rts_frm_len_n[i]); + + /* Set the frame length for the configured rings + * desired by 
the user + */ + for (i = 0; i < config->rx_ring_num; i++) { + /* If rts_frm_len[i] == 0 then it is assumed that user not + * specified frame length steering. + * If the user provides the frame length then program + * the rts_frm_len register for those values or else + * leave it as it is. + */ + if (rts_frm_len[i] != 0) { + writeq(MAC_RTS_FRM_LEN_SET(rts_frm_len[i]), + &bar0->rts_frm_len_n[i]); + } + } /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); val64 = SET_UPDT_PERIOD(Stats_refresh_time) | - STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; + STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; writeq(val64, &bar0->stat_cfg); /* @@ -877,13 +1108,14 @@ static int init_nic(struct s2io_nic *nic val64 = TTI_DATA1_MEM_TX_TIMER_VAL(0x2078) | TTI_DATA1_MEM_TX_URNG_A(0xA) | TTI_DATA1_MEM_TX_URNG_B(0x10) | - TTI_DATA1_MEM_TX_URNG_C(0x30) | TTI_DATA1_MEM_TX_TIMER_AC_EN | - TTI_DATA1_MEM_TX_TIMER_CI_EN; + TTI_DATA1_MEM_TX_URNG_C(0x30) | TTI_DATA1_MEM_TX_TIMER_AC_EN; + if (use_continuous_tx_intrs) + val64 |= TTI_DATA1_MEM_TX_TIMER_CI_EN; writeq(val64, &bar0->tti_data1_mem); val64 = TTI_DATA2_MEM_TX_UFC_A(0x10) | TTI_DATA2_MEM_TX_UFC_B(0x20) | - TTI_DATA2_MEM_TX_UFC_C(0x40) | TTI_DATA2_MEM_TX_UFC_D(0x80); + TTI_DATA2_MEM_TX_UFC_C(0x70) | TTI_DATA2_MEM_TX_UFC_D(0x80); writeq(val64, &bar0->tti_data2_mem); val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD; @@ -927,10 +1159,11 @@ static int init_nic(struct s2io_nic *nic writeq(val64, &bar0->rti_command_mem); /* - * Once the operation completes, the Strobe bit of the command - * register will be reset. We poll for this particular condition - * We wait for a maximum of 500ms for the operation to complete, - * if it's not complete by then we return error. + * Once the operation completes, the Strobe bit of the + * command register will be reset. We poll for this + * particular condition. We wait for a maximum of 500ms + * for the operation to complete, if it's not complete + * by then we return error. 
*/ time = 0; while (TRUE) { @@ -1185,10 +1418,10 @@ static void en_dis_able_nic_intrs(struct temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); /* - * All MC block error interrupts are disabled for now. - * TODO + * Enable all MC Intrs. */ - writeq(DISABLE_ALL_INTRS, &bar0->mc_int_mask); + writeq(0x0, &bar0->mc_int_mask); + writeq(0x0, &bar0->mc_err_mask); } else if (flag == DISABLE_INTRS) { /* * Disable MC Intrs in the general intr mask register @@ -1247,23 +1480,41 @@ static void en_dis_able_nic_intrs(struct } } -static int check_prc_pcc_state(u64 val64, int flag) +static int check_prc_pcc_state(u64 val64, int flag, int rev_id) { int ret = 0; if (flag == FALSE) { - if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT)) { - ret = 1; + if (rev_id >= 4) { + if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT)) { + ret = 1; + } + } else { + if (!(val64 & ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) && + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT)) { + ret = 1; + } } } else { - if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == - ADAPTER_STATUS_RMAC_PCC_IDLE) && - (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT))) { - ret = 1; + if (rev_id >= 4) { + if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == + ADAPTER_STATUS_RMAC_PCC_IDLE) && + (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT))) { + ret = 1; + } + } else { + if (((val64 & ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) == + ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) && + (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || + ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == + ADAPTER_STATUS_RC_PRC_QUIESCENT))) { + ret = 1; + } } } @@ -1286,6 +1537,7 @@ static int verify_xena_quiescence(nic_t { int 
ret = 0; u64 tmp64 = ~((u64) val64); + int rev_id = get_xena_rev_id(sp->pdev); if (! (tmp64 & @@ -1294,7 +1546,7 @@ static int verify_xena_quiescence(nic_t ADAPTER_STATUS_PIC_QUIESCENT | ADAPTER_STATUS_MC_DRAM_READY | ADAPTER_STATUS_MC_QUEUES_READY | ADAPTER_STATUS_M_PLL_LOCK | ADAPTER_STATUS_P_PLL_LOCK))) { - ret = check_prc_pcc_state(val64, flag); + ret = check_prc_pcc_state(val64, flag, rev_id); } return ret; @@ -1407,7 +1659,7 @@ static int start_nic(struct s2io_nic *ni /* Enable select interrupts */ interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR; + RX_MAC_INTR | MC_INTR; en_dis_able_nic_intrs(nic, interruptible, ENABLE_INTRS); /* @@ -1439,21 +1691,6 @@ static int start_nic(struct s2io_nic *ni */ schedule_work(&nic->set_link_task); - /* - * Here we are performing soft reset on XGXS to - * force link down. Since link is already up, we will get - * link state change interrupt after this reset - */ - SPECIAL_REG_WRITE(0x80010515001E0000ULL, &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); - udelay(50); - SPECIAL_REG_WRITE(0x80010515001E00E0ULL, &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); - udelay(50); - SPECIAL_REG_WRITE(0x80070515001F00E4ULL, &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); - udelay(50); - return SUCCESS; } @@ -1524,7 +1761,7 @@ static void stop_nic(struct s2io_nic *ni /* Disable all interrupts */ interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR; + RX_MAC_INTR | MC_INTR; en_dis_able_nic_intrs(nic, interruptible, DISABLE_INTRS); /* Disable PRCs */ @@ -1737,6 +1974,7 @@ int fill_rx_buffers(struct s2io_nic *nic off++; mac_control->rings[ring_no].rx_curr_put_info.offset = off; #endif + rxdp->Control_2 |= SET_RXD_MARKER; atomic_inc(&nic->rx_bufs_left[ring_no]); alloc_tab++; @@ -1965,11 +2203,8 @@ static void rx_intr_handler(ring_info_t put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + put_info.offset; #endif - while ((!(rxdp->Control_1 & 
RXD_OWN_XENA)) && -#ifdef CONFIG_2BUFF_MODE - (!rxdp->Control_2 & BIT(0)) && -#endif - (((get_offset + 1) % ring_bufs) != put_offset)) { + while (RXD_IS_UP2DT(rxdp) && + (((get_offset + 1) % ring_bufs) != put_offset)) { skb = (struct sk_buff *) ((unsigned long)rxdp->Host_Control); if (skb == NULL) { DBG_PRINT(ERR_DBG, "%s: The skb is ", @@ -2153,6 +2388,21 @@ static void alarm_intr_handler(struct s2 schedule_work(&nic->set_link_task); } + /* Handling Ecc errors */ + val64 = readq(&bar0->mc_err_reg); + writeq(val64, &bar0->mc_err_reg); + if (val64 & (MC_ERR_REG_ECC_ALL_SNG | MC_ERR_REG_ECC_ALL_DBL)) { + if (val64 & MC_ERR_REG_ECC_ALL_DBL) { + DBG_PRINT(ERR_DBG, "%s: Device indicates ", + dev->name); + DBG_PRINT(ERR_DBG, "double ECC error!!\n"); + netif_stop_queue(dev); + schedule_work(&nic->rst_timer_task); + } else { + /* Device can recover from Single ECC errors */ + } + } + /* In case of a serious error, the device will be Reset. */ val64 = readq(&bar0->serr_source); if (val64 & SERR_SOURCE_ANY) { @@ -2226,7 +2476,7 @@ void s2io_reset(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64; - u16 subid; + u16 subid, pci_cmd; val64 = SW_RESET_ALL; writeq(val64, &bar0->sw_reset); @@ -2255,6 +2505,18 @@ void s2io_reset(nic_t * sp) /* Set swapper to enable I/O register access */ s2io_set_swapper(sp); + /* Clear certain PCI/PCI-X fields after reset */ + pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); + pci_cmd &= 0x7FFF; /* Clear parity err detect bit */ + pci_write_config_word(sp->pdev, PCI_COMMAND, pci_cmd); + + val64 = readq(&bar0->txpic_int_reg); + val64 &= ~BIT(62); /* Clearing PCI_STATUS error reflected here */ + writeq(val64, &bar0->txpic_int_reg); + + /* Clearing PCIX Ecc status register */ + pci_write_config_dword(sp->pdev, 0x68, 0); + /* Reset device statistics maintained by OS */ memset(&sp->stats, 0, sizeof (struct net_device_stats)); @@ -2797,6 +3059,8 @@ static void s2io_set_multicast(struct ne /* Disable all Multicast addresses */ 
writeq(RMAC_ADDR_DATA0_MEM_ADDR(dis_addr), &bar0->rmac_addr_data0_mem); + writeq(RMAC_ADDR_DATA1_MEM_MASK(0x0), + &bar0->rmac_addr_data1_mem); val64 = RMAC_ADDR_CMD_MEM_WE | RMAC_ADDR_CMD_MEM_STROBE_NEW_CMD | RMAC_ADDR_CMD_MEM_OFFSET(sp->all_multi_pos); @@ -4369,21 +4633,6 @@ static void s2io_init_pci(nic_t * sp) (pci_cmd | PCI_COMMAND_PARITY)); pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); - /* Set MMRB count to 1024 in PCI-X Command register. */ - pcix_cmd &= 0xFFF3; - pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - (pcix_cmd | (0x1 << 2))); /* MMRBC 1K */ - pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(pcix_cmd)); - - /* Setting Maximum outstanding splits based on system type. */ - pcix_cmd &= 0xFF8F; - pcix_cmd |= XENA_MAX_OUTSTANDING_SPLITS(0x1); /* 2 splits. */ - pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - pcix_cmd); - pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(pcix_cmd)); - /* Forcibly disabling relaxed ordering capability of the card. 
*/ pcix_cmd &= 0xfffd; pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, @@ -4400,6 +4649,7 @@ module_param_array(tx_fifo_len, uint, NU module_param_array(rx_ring_sz, uint, NULL, 0); module_param(Stats_refresh_time, int, 0); module_param_array(rts_frm_len, uint, NULL, 0); +module_param(use_continuous_tx_intrs, int, 1); module_param(rmac_pause_time, int, 0); module_param(mc_pause_threshold_q0q3, int, 0); module_param(mc_pause_threshold_q4q7, int, 0); diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-02 02:14:21.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 02:13:58.000000000 -0700 @@ -372,6 +372,10 @@ typedef struct _RxD_t { #define RXD_GET_L4_CKSUM(val) ((u16)(val) & 0xFFFF) u64 Control_2; +#define THE_RXD_MARK 0x3 +#define SET_RXD_MARKER vBIT(THE_RXD_MARK, 0, 2) +#define GET_RXD_MARKER(ctrl) ((ctrl & SET_RXD_MARKER) >> 62) + #ifndef CONFIG_2BUFF_MODE #define MASK_BUFFER0_SIZE vBIT(0x3FFF,2,14) #define SET_BUFFER0_SIZE(val) vBIT(val,2,14) From raghavendra.koushik@neterion.com Wed Aug 3 12:41:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:41:11 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JesH9026890 for ; Wed, 3 Aug 2005 12:40:55 -0700 Received: by linux.site (Postfix, from userid 0) id BBFB498336; Wed, 3 Aug 2005 12:24:33 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 1/13] S2io: Code cleanup Message-Id: <20050803192433.BBFB498336@linux.site> Date: Wed, 3 Aug 2005 12:24:33 -0700 (PDT) X-archive-position: 2835 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 142310 Lines: 4202 Hi, We are submitting a series of 13 patches to support our Xframe I and Xframe II line of products. The patches can be categorized as follows: Patches 1-8 : Changes applicable to both Xframe I and II Patches 9-11: Xframe II specific features Patch 12: Addresses issues found during testing cycle. Patch 13: Incorporates mostly the review comments from the community and some last moment bug fixes. Please review the patches and let us know your comments. Starting with patch 1 below. This patch involves cosmetic changes (tabs and indentation, regrouping of transmit and receive data structures, typecasting, code cleanup). Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io-regs.h linux-2.6.13-rc4/drivers/net/s2io-regs.h --- vanilla_linux/drivers/net/s2io-regs.h 2005-08-01 15:51:44.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io-regs.h 2005-08-02 02:00:54.000000000 -0700 @@ -77,19 +77,18 @@ typedef struct _XENA_dev_config { #define ADAPTER_ECC_EN BIT(55) u64 serr_source; -#define SERR_SOURCE_PIC BIT(0) -#define SERR_SOURCE_TXDMA BIT(1) -#define SERR_SOURCE_RXDMA BIT(2) +#define SERR_SOURCE_PIC BIT(0) +#define SERR_SOURCE_TXDMA BIT(1) +#define SERR_SOURCE_RXDMA BIT(2) #define SERR_SOURCE_MAC BIT(3) #define SERR_SOURCE_MC BIT(4) #define SERR_SOURCE_XGXS BIT(5) -#define SERR_SOURCE_ANY (SERR_SOURCE_PIC | \ - SERR_SOURCE_TXDMA | \ - SERR_SOURCE_RXDMA | \ - SERR_SOURCE_MAC | \ - SERR_SOURCE_MC | \ - SERR_SOURCE_XGXS) - +#define SERR_SOURCE_ANY (SERR_SOURCE_PIC | \ + SERR_SOURCE_TXDMA | \ + SERR_SOURCE_RXDMA | \ + SERR_SOURCE_MAC | \ + SERR_SOURCE_MC | \ + SERR_SOURCE_XGXS) u8 unused_0[0x800 - 0x120]; diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-01 15:51:42.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c
2005-08-02 02:00:43.000000000 -0700 @@ -11,29 +11,28 @@ * See the file COPYING in this distribution for more information. * * Credits: - * Jeff Garzik : For pointing out the improper error condition - * check in the s2io_xmit routine and also some - * issues in the Tx watch dog function. Also for - * patiently answering all those innumerable + * Jeff Garzik : For pointing out the improper error condition + * check in the s2io_xmit routine and also some + * issues in the Tx watch dog function. Also for + * patiently answering all those innumerable * questions regaring the 2.6 porting issues. * Stephen Hemminger : Providing proper 2.6 porting mechanism for some * macros available only in 2.6 Kernel. - * Francois Romieu : For pointing out all code part that were + * Francois Romieu : For pointing out all code part that were * deprecated and also styling related comments. - * Grant Grundler : For helping me get rid of some Architecture + * Grant Grundler : For helping me get rid of some Architecture * dependent code. * Christopher Hellwig : Some more 2.6 specific issues in the driver. - * + * * The module loadable parameters that are supported by the driver and a brief * explaination of all the variables. - * rx_ring_num : This can be used to program the number of receive rings used - * in the driver. - * rx_ring_len: This defines the number of descriptors each ring can have. This + * rx_ring_num : This can be used to program the number of receive rings used + * in the driver. + * rx_ring_len: This defines the number of descriptors each ring can have. This * is also an array of size 8. * tx_fifo_num: This defines the number of Tx FIFOs thats used int the driver. - * tx_fifo_len: This too is an array of 8. Each element defines the number of + * tx_fifo_len: This too is an array of 8. Each element defines the number of * Tx descriptors that can be associated with each corresponding FIFO. - * in PCI Configuration space. 
************************************************************************/ #include @@ -57,19 +56,19 @@ #include #include -#include #include #include +#include /* local include */ #include "s2io.h" #include "s2io-regs.h" /* S2io Driver name & version. */ -static char s2io_driver_name[] = "s2io"; -static char s2io_driver_version[] = "Version 1.7.7.1"; +static char s2io_driver_name[] = "Neterion"; +static char s2io_driver_version[] = "Version 1.7.7"; -/* +/* * Cards with following subsystem_id have a link state indication * problem, 600B, 600C, 600D, 640B, 640C and 640D. * macro below identifies these cards given the subsystem_id. @@ -86,9 +85,13 @@ static char s2io_driver_version[] = "Ver static inline int rx_buffer_level(nic_t * sp, int rxb_size, int ring) { int level = 0; - if ((sp->pkt_cnt[ring] - rxb_size) > 16) { + mac_info_t *mac_control; + + mac_control = &sp->mac_control; + if ((mac_control->rings[ring].pkt_cnt - rxb_size) > 16) { level = LOW; - if ((sp->pkt_cnt[ring] - rxb_size) < MAX_RXDS_PER_BLOCK) { + if ((mac_control->rings[ring].pkt_cnt - rxb_size) < + MAX_RXDS_PER_BLOCK) { level = PANIC; } } @@ -153,8 +156,7 @@ static char ethtool_stats_keys[][ETH_GST #define S2IO_TEST_LEN sizeof(s2io_gstrings) / ETH_GSTRING_LEN #define S2IO_STRINGS_LEN S2IO_TEST_LEN * ETH_GSTRING_LEN - -/* +/* * Constants to be programmed into the Xena's registers, to configure * the XAUI. */ @@ -196,8 +198,7 @@ static u64 default_dtx_cfg[] = { END_SIGN }; - -/* +/* * Constants for Fixing the MacAddress problem seen mostly on * Alpha machines. 
*/ @@ -227,6 +228,8 @@ static unsigned int rx_ring_num = 1; static unsigned int rx_ring_sz[MAX_RX_RINGS] = {[0 ...(MAX_RX_RINGS - 1)] = 0 }; static unsigned int Stats_refresh_time = 4; +static unsigned int rts_frm_len[MAX_RX_RINGS] = + {[0 ...(MAX_RX_RINGS - 1)] = 0 }; static unsigned int rmac_pause_time = 65535; static unsigned int mc_pause_threshold_q0q3 = 187; static unsigned int mc_pause_threshold_q4q7 = 187; @@ -237,9 +240,9 @@ static unsigned int rmac_util_period = 5 static unsigned int indicate_max_pkts; #endif -/* +/* * S2IO device table. - * This table lists all the devices that this driver supports. + * This table lists all the devices that this driver supports. */ static struct pci_device_id s2io_tbl[] __devinitdata = { {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_S2IO_WIN, @@ -247,9 +250,9 @@ static struct pci_device_id s2io_tbl[] _ {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_S2IO_UNI, PCI_ANY_ID, PCI_ANY_ID}, {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_HERC_WIN, - PCI_ANY_ID, PCI_ANY_ID}, - {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_HERC_UNI, - PCI_ANY_ID, PCI_ANY_ID}, + PCI_ANY_ID, PCI_ANY_ID}, + {PCI_VENDOR_ID_S2IO, PCI_DEVICE_ID_HERC_UNI, + PCI_ANY_ID, PCI_ANY_ID}, {0,} }; @@ -268,8 +271,8 @@ static struct pci_driver s2io_driver = { /** * init_shared_mem - Allocation and Initialization of Memory * @nic: Device private variable. - * Description: The function allocates all the memory areas shared - * between the NIC and the driver. This includes Tx descriptors, + * Description: The function allocates all the memory areas shared + * between the NIC and the driver. This includes Tx descriptors, * Rx descriptors and the statistics block. 
*/ @@ -279,11 +282,11 @@ static int init_shared_mem(struct s2io_n void *tmp_v_addr, *tmp_v_addr_next; dma_addr_t tmp_p_addr, tmp_p_addr_next; RxD_block_t *pre_rxd_blk = NULL; - int i, j, blk_cnt; + int i, j, blk_cnt, rx_sz, tx_sz; int lst_size, lst_per_page; struct net_device *dev = nic->dev; #ifdef CONFIG_2BUFF_MODE - unsigned long tmp; + u64 tmp; buffAdd_t *ba; #endif @@ -308,28 +311,34 @@ static int init_shared_mem(struct s2io_n } lst_size = (sizeof(TxD_t) * config->max_txds); + tx_sz = lst_size * size; lst_per_page = PAGE_SIZE / lst_size; for (i = 0; i < config->tx_fifo_num; i++) { int fifo_len = config->tx_cfg[i].fifo_len; int list_holder_size = fifo_len * sizeof(list_info_hold_t); - nic->list_info[i] = kmalloc(list_holder_size, GFP_KERNEL); - if (!nic->list_info[i]) { + mac_control->fifos[i].list_info = kmalloc(list_holder_size, + GFP_KERNEL); + if (!mac_control->fifos[i].list_info) { DBG_PRINT(ERR_DBG, "Malloc failed for list_info\n"); return -ENOMEM; } - memset(nic->list_info[i], 0, list_holder_size); + memset(mac_control->fifos[i].list_info, 0, list_holder_size); } for (i = 0; i < config->tx_fifo_num; i++) { int page_num = TXD_MEM_PAGE_CNT(config->tx_cfg[i].fifo_len, lst_per_page); - mac_control->tx_curr_put_info[i].offset = 0; - mac_control->tx_curr_put_info[i].fifo_len = + mac_control->fifos[i].tx_curr_put_info.offset = 0; + mac_control->fifos[i].tx_curr_put_info.fifo_len = config->tx_cfg[i].fifo_len - 1; - mac_control->tx_curr_get_info[i].offset = 0; - mac_control->tx_curr_get_info[i].fifo_len = + mac_control->fifos[i].tx_curr_get_info.offset = 0; + mac_control->fifos[i].tx_curr_get_info.fifo_len = config->tx_cfg[i].fifo_len - 1; + mac_control->fifos[i].fifo_no = i; + mac_control->fifos[i].nic = nic; + mac_control->fifos[i].max_txds = MAX_SKB_FRAGS; + for (j = 0; j < page_num; j++) { int k = 0; dma_addr_t tmp_p; @@ -345,16 +354,15 @@ static int init_shared_mem(struct s2io_n while (k < lst_per_page) { int l = (j * lst_per_page) + k; if (l == 
config->tx_cfg[i].fifo_len) - goto end_txd_alloc; - nic->list_info[i][l].list_virt_addr = + break; + mac_control->fifos[i].list_info[l].list_virt_addr = tmp_v + (k * lst_size); - nic->list_info[i][l].list_phy_addr = + mac_control->fifos[i].list_info[l].list_phy_addr = tmp_p + (k * lst_size); k++; } } } - end_txd_alloc: /* Allocation and initialization of RXDs in Rings */ size = 0; @@ -367,21 +375,26 @@ static int init_shared_mem(struct s2io_n return FAILURE; } size += config->rx_cfg[i].num_rxd; - nic->block_count[i] = + mac_control->rings[i].block_count = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); - nic->pkt_cnt[i] = - config->rx_cfg[i].num_rxd - nic->block_count[i]; + mac_control->rings[i].pkt_cnt = + config->rx_cfg[i].num_rxd - mac_control->rings[i].block_count; } + size = (size * (sizeof(RxD_t))); + rx_sz = size; for (i = 0; i < config->rx_ring_num; i++) { - mac_control->rx_curr_get_info[i].block_index = 0; - mac_control->rx_curr_get_info[i].offset = 0; - mac_control->rx_curr_get_info[i].ring_len = + mac_control->rings[i].rx_curr_get_info.block_index = 0; + mac_control->rings[i].rx_curr_get_info.offset = 0; + mac_control->rings[i].rx_curr_get_info.ring_len = config->rx_cfg[i].num_rxd - 1; - mac_control->rx_curr_put_info[i].block_index = 0; - mac_control->rx_curr_put_info[i].offset = 0; - mac_control->rx_curr_put_info[i].ring_len = + mac_control->rings[i].rx_curr_put_info.block_index = 0; + mac_control->rings[i].rx_curr_put_info.offset = 0; + mac_control->rings[i].rx_curr_put_info.ring_len = config->rx_cfg[i].num_rxd - 1; + mac_control->rings[i].nic = nic; + mac_control->rings[i].ring_no = i; + blk_cnt = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); /* Allocating all the Rx blocks */ @@ -395,32 +408,36 @@ static int init_shared_mem(struct s2io_n &tmp_p_addr); if (tmp_v_addr == NULL) { /* - * In case of failure, free_shared_mem() - * is called, which should free any - * memory that was alloced till the + * In case of failure, free_shared_mem() + 
* is called, which should free any + * memory that was alloced till the * failure happened. */ - nic->rx_blocks[i][j].block_virt_addr = + mac_control->rings[i].rx_blocks[j].block_virt_addr = tmp_v_addr; return -ENOMEM; } memset(tmp_v_addr, 0, size); - nic->rx_blocks[i][j].block_virt_addr = tmp_v_addr; - nic->rx_blocks[i][j].block_dma_addr = tmp_p_addr; + mac_control->rings[i].rx_blocks[j].block_virt_addr = + tmp_v_addr; + mac_control->rings[i].rx_blocks[j].block_dma_addr = + tmp_p_addr; } /* Interlinking all Rx Blocks */ for (j = 0; j < blk_cnt; j++) { - tmp_v_addr = nic->rx_blocks[i][j].block_virt_addr; + tmp_v_addr = + mac_control->rings[i].rx_blocks[j].block_virt_addr; tmp_v_addr_next = - nic->rx_blocks[i][(j + 1) % + mac_control->rings[i].rx_blocks[(j + 1) % blk_cnt].block_virt_addr; - tmp_p_addr = nic->rx_blocks[i][j].block_dma_addr; + tmp_p_addr = + mac_control->rings[i].rx_blocks[j].block_dma_addr; tmp_p_addr_next = - nic->rx_blocks[i][(j + 1) % + mac_control->rings[i].rx_blocks[(j + 1) % blk_cnt].block_dma_addr; pre_rxd_blk = (RxD_block_t *) tmp_v_addr; - pre_rxd_blk->reserved_1 = END_OF_BLOCK; /* last RxD + pre_rxd_blk->reserved_1 = END_OF_BLOCK; /* last RxD * marker. */ #ifndef CONFIG_2BUFF_MODE @@ -433,43 +450,43 @@ static int init_shared_mem(struct s2io_n } #ifdef CONFIG_2BUFF_MODE - /* + /* * Allocation of Storages for buffer addresses in 2BUFF mode * and the buffers as well. 
*/ for (i = 0; i < config->rx_ring_num; i++) { blk_cnt = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); - nic->ba[i] = kmalloc((sizeof(buffAdd_t *) * blk_cnt), + mac_control->rings[i].ba = kmalloc((sizeof(buffAdd_t *) * blk_cnt), GFP_KERNEL); - if (!nic->ba[i]) + if (!mac_control->rings[i].ba) return -ENOMEM; for (j = 0; j < blk_cnt; j++) { int k = 0; - nic->ba[i][j] = kmalloc((sizeof(buffAdd_t) * + mac_control->rings[i].ba[j] = kmalloc((sizeof(buffAdd_t) * (MAX_RXDS_PER_BLOCK + 1)), GFP_KERNEL); - if (!nic->ba[i][j]) + if (!mac_control->rings[i].ba[j]) return -ENOMEM; while (k != MAX_RXDS_PER_BLOCK) { - ba = &nic->ba[i][j][k]; + ba = &mac_control->rings[i].ba[j][k]; - ba->ba_0_org = kmalloc + ba->ba_0_org = (void *) kmalloc (BUF0_LEN + ALIGN_SIZE, GFP_KERNEL); if (!ba->ba_0_org) return -ENOMEM; - tmp = (unsigned long) ba->ba_0_org; + tmp = (u64) ba->ba_0_org; tmp += ALIGN_SIZE; - tmp &= ~((unsigned long) ALIGN_SIZE); + tmp &= ~((u64) ALIGN_SIZE); ba->ba_0 = (void *) tmp; - ba->ba_1_org = kmalloc + ba->ba_1_org = (void *) kmalloc (BUF1_LEN + ALIGN_SIZE, GFP_KERNEL); if (!ba->ba_1_org) return -ENOMEM; - tmp = (unsigned long) ba->ba_1_org; + tmp = (u64) ba->ba_1_org; tmp += ALIGN_SIZE; - tmp &= ~((unsigned long) ALIGN_SIZE); + tmp &= ~((u64) ALIGN_SIZE); ba->ba_1 = (void *) tmp; k++; } @@ -483,9 +500,9 @@ static int init_shared_mem(struct s2io_n (nic->pdev, size, &mac_control->stats_mem_phy); if (!mac_control->stats_mem) { - /* - * In case of failure, free_shared_mem() is called, which - * should free any memory that was alloced till the + /* + * In case of failure, free_shared_mem() is called, which + * should free any memory that was alloced till the * failure happened. 
*/ return -ENOMEM; @@ -495,15 +512,14 @@ static int init_shared_mem(struct s2io_n tmp_v_addr = mac_control->stats_mem; mac_control->stats_info = (StatInfo_t *) tmp_v_addr; memset(tmp_v_addr, 0, size); - DBG_PRINT(INIT_DBG, "%s:Ring Mem PHY: 0x%llx\n", dev->name, (unsigned long long) tmp_p_addr); return SUCCESS; } -/** - * free_shared_mem - Free the allocated Memory +/** + * free_shared_mem - Free the allocated Memory * @nic: Device private variable. * Description: This function is to free all memory locations allocated by * the init_shared_mem() function and return it to the kernel. @@ -533,15 +549,18 @@ static void free_shared_mem(struct s2io_ lst_per_page); for (j = 0; j < page_num; j++) { int mem_blks = (j * lst_per_page); - if (!nic->list_info[i][mem_blks].list_virt_addr) + if (!mac_control->fifos[i].list_info[mem_blks]. + list_virt_addr) break; pci_free_consistent(nic->pdev, PAGE_SIZE, - nic->list_info[i][mem_blks]. + mac_control->fifos[i]. + list_info[mem_blks]. list_virt_addr, - nic->list_info[i][mem_blks]. + mac_control->fifos[i]. + list_info[mem_blks]. list_phy_addr); } - kfree(nic->list_info[i]); + kfree(mac_control->fifos[i].list_info); } #ifndef CONFIG_2BUFF_MODE @@ -550,10 +569,12 @@ static void free_shared_mem(struct s2io_ size = SIZE_OF_BLOCK; #endif for (i = 0; i < config->rx_ring_num; i++) { - blk_cnt = nic->block_count[i]; + blk_cnt = mac_control->rings[i].block_count; for (j = 0; j < blk_cnt; j++) { - tmp_v_addr = nic->rx_blocks[i][j].block_virt_addr; - tmp_p_addr = nic->rx_blocks[i][j].block_dma_addr; + tmp_v_addr = mac_control->rings[i].rx_blocks[j]. + block_virt_addr; + tmp_p_addr = mac_control->rings[i].rx_blocks[j]. 
+ block_dma_addr; if (tmp_v_addr == NULL) break; pci_free_consistent(nic->pdev, size, @@ -566,35 +587,21 @@ static void free_shared_mem(struct s2io_ for (i = 0; i < config->rx_ring_num; i++) { blk_cnt = config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1); - if (!nic->ba[i]) - goto end_free; for (j = 0; j < blk_cnt; j++) { int k = 0; - if (!nic->ba[i][j]) { - kfree(nic->ba[i]); - goto end_free; - } + if (!mac_control->rings[i].ba[j]) + continue; while (k != MAX_RXDS_PER_BLOCK) { - buffAdd_t *ba = &nic->ba[i][j][k]; - if (!ba || !ba->ba_0_org || !ba->ba_1_org) - { - kfree(nic->ba[i]); - kfree(nic->ba[i][j]); - if(ba->ba_0_org) - kfree(ba->ba_0_org); - if(ba->ba_1_org) - kfree(ba->ba_1_org); - goto end_free; - } + buffAdd_t *ba = &mac_control->rings[i].ba[j][k]; kfree(ba->ba_0_org); kfree(ba->ba_1_org); k++; } - kfree(nic->ba[i][j]); + kfree(mac_control->rings[i].ba[j]); } - kfree(nic->ba[i]); + if (mac_control->rings[i].ba) + kfree(mac_control->rings[i].ba); } -end_free: #endif if (mac_control->stats_mem) { @@ -605,12 +612,12 @@ end_free: } } -/** - * init_nic - Initialization of hardware +/** + * init_nic - Initialization of hardware * @nic: device peivate variable - * Description: The function sequentially configures every block - * of the H/W from their reset values. - * Return Value: SUCCESS on success and + * Description: The function sequentially configures every block + * of the H/W from their reset values. + * Return Value: SUCCESS on success and * '-1' on failure (endian settings incorrect). 
*/ @@ -626,12 +633,13 @@ static int init_nic(struct s2io_nic *nic struct config_param *config; int mdio_cnt = 0, dtx_cnt = 0; unsigned long long mem_share; + int mem_size; mac_control = &nic->mac_control; config = &nic->config; - /* Initialize swapper control register */ - if (s2io_set_swapper(nic)) { + /* to set the swapper control on the card */ + if(s2io_set_swapper(nic)) { DBG_PRINT(ERR_DBG,"ERROR: Setting Swapper failed\n"); return -1; } @@ -639,8 +647,8 @@ static int init_nic(struct s2io_nic *nic /* Remove XGXS from reset state */ val64 = 0; writeq(val64, &bar0->sw_reset); - val64 = readq(&bar0->sw_reset); msleep(500); + val64 = readq(&bar0->sw_reset); /* Enable Receiving broadcasts */ add = &bar0->mac_cfg; @@ -660,18 +668,18 @@ static int init_nic(struct s2io_nic *nic val64 = dev->mtu; writeq(vBIT(val64, 2, 14), &bar0->rmac_max_pyld_len); - /* - * Configuring the XAUI Interface of Xena. + /* + * Configuring the XAUI Interface of Xena. * *************************************** - * To Configure the Xena's XAUI, one has to write a series - * of 64 bit values into two registers in a particular - * sequence. Hence a macro 'SWITCH_SIGN' has been defined - * which will be defined in the array of configuration values - * (default_dtx_cfg & default_mdio_cfg) at appropriate places - * to switch writing from one regsiter to another. We continue + * To Configure the Xena's XAUI, one has to write a series + * of 64 bit values into two registers in a particular + * sequence. Hence a macro 'SWITCH_SIGN' has been defined + * which will be defined in the array of configuration values + * (default_dtx_cfg & default_mdio_cfg) at appropriate places + * to switch writing from one regsiter to another. We continue * writing these values until we encounter the 'END_SIGN' macro. 
- * For example, After making a series of 21 writes into - * dtx_control register the 'SWITCH_SIGN' appears and hence we + * For example, After making a series of 21 writes into + * dtx_control register the 'SWITCH_SIGN' appears and hence we * start writing into mdio_control until we encounter END_SIGN. */ while (1) { @@ -752,8 +760,8 @@ static int init_nic(struct s2io_nic *nic DBG_PRINT(INIT_DBG, "Fifo partition at: 0x%p is: 0x%llx\n", &bar0->tx_fifo_partition_0, (unsigned long long) val64); - /* - * Initialization of Tx_PA_CONFIG register to ignore packet + /* + * Initialization of Tx_PA_CONFIG register to ignore packet * integrity checking. */ val64 = readq(&bar0->tx_pa_cfg); @@ -770,54 +778,54 @@ static int init_nic(struct s2io_nic *nic } writeq(val64, &bar0->rx_queue_priority); - /* - * Allocating equal share of memory to all the + /* + * Allocating equal share of memory to all the * configured Rings. */ val64 = 0; + mem_size = 64; for (i = 0; i < config->rx_ring_num; i++) { switch (i) { case 0: - mem_share = (64 / config->rx_ring_num + - 64 % config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num + + mem_size % config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q0_SZ(mem_share); continue; case 1: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q1_SZ(mem_share); continue; case 2: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q2_SZ(mem_share); continue; case 3: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q3_SZ(mem_share); continue; case 4: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q4_SZ(mem_share); continue; case 5: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q5_SZ(mem_share); continue; case 6: - mem_share = (64 / 
config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q6_SZ(mem_share); continue; case 7: - mem_share = (64 / config->rx_ring_num); + mem_share = (mem_size / config->rx_ring_num); val64 |= RX_QUEUE_CFG_Q7_SZ(mem_share); continue; } } writeq(val64, &bar0->rx_queue_cfg); - /* - * Initializing the Tx round robin registers to 0. - * Filling Tx and Rx round robin registers as per the - * number of FIFOs and Rings is still TODO. + /* Initializing the Tx round robin registers to 0 + * filling tx and rx round robin registers as per + * the number of FIFOs and Rings is still TODO */ writeq(0, &bar0->tx_w_round_robin_0); writeq(0, &bar0->tx_w_round_robin_1); @@ -825,30 +833,30 @@ static int init_nic(struct s2io_nic *nic writeq(0, &bar0->tx_w_round_robin_3); writeq(0, &bar0->tx_w_round_robin_4); - /* + /* * TODO - * Disable Rx steering. Hard coding all packets be steered to - * Queue 0 for now. + * Disable Rx steering. Hard coding all packets to be steered to + * Queue 0 for now. */ val64 = 0x8080808080808080ULL; writeq(val64, &bar0->rts_qos_steering); /* UDP Fix */ val64 = 0; - for (i = 1; i < 8; i++) + for (i = 0; i < 8; i++) writeq(val64, &bar0->rts_frm_len_n[i]); - /* Set rts_frm_len register for fifo 0 */ - writeq(MAC_RTS_FRM_LEN_SET(dev->mtu + 22), - &bar0->rts_frm_len_n[0]); + /* Set the default rts frame length for ring0 */ + writeq(MAC_RTS_FRM_LEN_SET(dev->mtu+22), + &bar0->rts_frm_len_n[0]); - /* Enable statistics */ + /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); val64 = SET_UPDT_PERIOD(Stats_refresh_time) | STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; writeq(val64, &bar0->stat_cfg); - /* + /* * Initializing the sampling rate for the device to calculate the * bandwidth utilization. 
 	 */
@@ -857,11 +865,12 @@ static int init_nic(struct s2io_nic *nic
 	writeq(val64, &bar0->mac_link_util);
 
 
-	/*
-	 * Initializing the Transmit and Receive Traffic Interrupt
+	/*
+	 * Initializing the Transmit and Receive Traffic Interrupt
 	 * Scheme.
 	 */
-	/* TTI Initialization. Default Tx timer gets us about
+	/*
+	 * TTI Initialization. Default Tx timer gets us about
 	 * 250 interrupts per sec. Continuous interrupts are enabled
 	 * by default.
 	 */
@@ -880,7 +889,7 @@ static int init_nic(struct s2io_nic *nic
 	val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD;
 	writeq(val64, &bar0->tti_command_mem);
 
-	/*
+	/*
 	 * Once the operation completes, the Strobe bit of the command
 	 * register will be reset. We poll for this particular condition
 	 * We wait for a maximum of 500ms for the operation to complete,
@@ -917,7 +926,7 @@ static int init_nic(struct s2io_nic *nic
 	val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD;
 	writeq(val64, &bar0->rti_command_mem);
 
-	/*
+	/*
 	 * Once the operation completes, the Strobe bit of the command
 	 * register will be reset. We poll for this particular condition
 	 * We wait for a maximum of 500ms for the operation to complete,
@@ -926,7 +935,7 @@ static int init_nic(struct s2io_nic *nic
 	time = 0;
 	while (TRUE) {
 		val64 = readq(&bar0->rti_command_mem);
-		if (!(val64 & TTI_CMD_MEM_STROBE_NEW_CMD)) {
+		if (!(val64 & RTI_CMD_MEM_STROBE_NEW_CMD)) {
 			break;
 		}
 		if (time > 10) {
@@ -938,15 +947,15 @@ static int init_nic(struct s2io_nic *nic
 		msleep(50);
 	}
 
-	/*
-	 * Initializing proper values as Pause threshold into all
+	/*
+	 * Initializing proper values as Pause threshold into all
 	 * the 8 Queues on Rx side.
*/ writeq(0xffbbffbbffbbffbbULL, &bar0->mc_pause_thresh_q0q3); writeq(0xffbbffbbffbbffbbULL, &bar0->mc_pause_thresh_q4q7); /* Disable RMAC PAD STRIPPING */ - add = &bar0->mac_cfg; + add = (void *) &bar0->mac_cfg; val64 = readq(&bar0->mac_cfg); val64 &= ~(MAC_CFG_RMAC_STRIP_PAD); writeq(RMAC_CFG_KEY(0x4C0D), &bar0->rmac_cfg_key); @@ -955,8 +964,8 @@ static int init_nic(struct s2io_nic *nic writel((u32) (val64 >> 32), (add + 4)); val64 = readq(&bar0->mac_cfg); - /* - * Set the time value to be inserted in the pause frame + /* + * Set the time value to be inserted in the pause frame * generated by xena. */ val64 = readq(&bar0->rmac_pause_cfg); @@ -964,7 +973,7 @@ static int init_nic(struct s2io_nic *nic val64 |= RMAC_PAUSE_HG_PTIME(nic->mac_control.rmac_pause_time); writeq(val64, &bar0->rmac_pause_cfg); - /* + /* * Set the Threshold Limit for Generating the pause frame * If the amount of data in any Queue exceeds ratio of * (mac_control.mc_pause_threshold_q0q3 or q4q7)/256 @@ -988,8 +997,8 @@ static int init_nic(struct s2io_nic *nic } writeq(val64, &bar0->mc_pause_thresh_q4q7); - /* - * TxDMA will stop Read request if the number of read split has + /* + * TxDMA will stop Read request if the number of read split has * exceeded the limit pointed by shared_splits */ val64 = readq(&bar0->pic_control); @@ -999,14 +1008,14 @@ static int init_nic(struct s2io_nic *nic return SUCCESS; } -/** - * en_dis_able_nic_intrs - Enable or Disable the interrupts +/** + * en_dis_able_nic_intrs - Enable or Disable the interrupts * @nic: device private variable, * @mask: A mask indicating which Intr block must be modified and, * @flag: A flag indicating whether to enable or disable the Intrs. * Description: This function will either disable or enable the interrupts - * depending on the flag argument. The mask argument can be used to - * enable/disable any Intr block. + * depending on the flag argument. The mask argument can be used to + * enable/disable any Intr block. * Return Value: NONE. 
*/ @@ -1024,20 +1033,20 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* + /* * Disabled all PCIX, Flash, MDIO, IIC and GPIO - * interrupts for now. - * TODO + * interrupts for now. + * TODO */ writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); - /* + /* * No MSI Support is available presently, so TTI and * RTI interrupts are also disabled. */ } else if (flag == DISABLE_INTRS) { - /* - * Disable PIC Intrs in the general - * intr mask register + /* + * Disable PIC Intrs in the general + * intr mask register */ writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); temp64 = readq(&bar0->general_int_mask); @@ -1055,27 +1064,27 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* - * Keep all interrupts other than PFC interrupt + /* + * Keep all interrupts other than PFC interrupt * and PCC interrupt disabled in DMA level. 
*/ val64 = DISABLE_ALL_INTRS & ~(TXDMA_PFC_INT_M | TXDMA_PCC_INT_M); writeq(val64, &bar0->txdma_int_mask); - /* - * Enable only the MISC error 1 interrupt in PFC block + /* + * Enable only the MISC error 1 interrupt in PFC block */ val64 = DISABLE_ALL_INTRS & (~PFC_MISC_ERR_1); writeq(val64, &bar0->pfc_err_mask); - /* - * Enable only the FB_ECC error interrupt in PCC block + /* + * Enable only the FB_ECC error interrupt in PCC block */ val64 = DISABLE_ALL_INTRS & (~PCC_FB_ECC_ERR); writeq(val64, &bar0->pcc_err_mask); } else if (flag == DISABLE_INTRS) { - /* - * Disable TxDMA Intrs in the general intr mask - * register + /* + * Disable TxDMA Intrs in the general intr mask + * register */ writeq(DISABLE_ALL_INTRS, &bar0->txdma_int_mask); writeq(DISABLE_ALL_INTRS, &bar0->pfc_err_mask); @@ -1093,15 +1102,15 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* - * All RxDMA block interrupts are disabled for now - * TODO + /* + * All RxDMA block interrupts are disabled for now + * TODO */ writeq(DISABLE_ALL_INTRS, &bar0->rxdma_int_mask); } else if (flag == DISABLE_INTRS) { - /* - * Disable RxDMA Intrs in the general intr mask - * register + /* + * Disable RxDMA Intrs in the general intr mask + * register */ writeq(DISABLE_ALL_INTRS, &bar0->rxdma_int_mask); temp64 = readq(&bar0->general_int_mask); @@ -1118,8 +1127,8 @@ static void en_dis_able_nic_intrs(struct temp64 = readq(&bar0->general_int_mask); temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); - /* - * All MAC block error interrupts are disabled for now + /* + * All MAC block error interrupts are disabled for now * except the link status change interrupt. 
 			 * TODO
 			 */
@@ -1132,8 +1141,8 @@ static void en_dis_able_nic_intrs(struct
 			val64 &= ~((u64) RMAC_LINK_STATE_CHANGE_INT);
 			writeq(val64, &bar0->mac_rmac_err_mask);
 		} else if (flag == DISABLE_INTRS) {
-			/*
-			 * Disable MAC Intrs in the general intr mask register
+			/*
+			 * Disable MAC Intrs in the general intr mask register
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->mac_int_mask);
 			writeq(DISABLE_ALL_INTRS,
@@ -1152,14 +1161,14 @@ static void en_dis_able_nic_intrs(struct
 			temp64 = readq(&bar0->general_int_mask);
 			temp64 &= ~((u64) val64);
 			writeq(temp64, &bar0->general_int_mask);
-			/*
+			/*
 			 * All XGXS block error interrupts are disabled for now
-			 * TODO
+			 * TODO
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->xgxs_int_mask);
 		} else if (flag == DISABLE_INTRS) {
-			/*
-			 * Disable MC Intrs in the general intr mask register
+			/*
+			 * Disable MC Intrs in the general intr mask register
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->xgxs_int_mask);
 			temp64 = readq(&bar0->general_int_mask);
@@ -1175,9 +1184,9 @@ static void en_dis_able_nic_intrs(struct
 			temp64 = readq(&bar0->general_int_mask);
 			temp64 &= ~((u64) val64);
 			writeq(temp64, &bar0->general_int_mask);
-			/*
-			 * All MC block error interrupts are disabled for now
-			 * TODO
+			/*
+			 * All MC block error interrupts are disabled for now.
+			 * TODO
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->mc_int_mask);
 		} else if (flag == DISABLE_INTRS) {
@@ -1199,14 +1208,14 @@ static void en_dis_able_nic_intrs(struct
 			temp64 = readq(&bar0->general_int_mask);
 			temp64 &= ~((u64) val64);
 			writeq(temp64, &bar0->general_int_mask);
-			/*
+			/*
 			 * Enable all the Tx side interrupts
-			 * writing 0 Enables all 64 TX interrupt levels
+			 * writing 0 Enables all 64 TX interrupt levels
 			 */
 			writeq(0x0, &bar0->tx_traffic_mask);
 		} else if (flag == DISABLE_INTRS) {
-			/*
-			 * Disable Tx Traffic Intrs in the general intr mask
+			/*
+			 * Disable Tx Traffic Intrs in the general intr mask
 			 * register.
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->tx_traffic_mask);
@@ -1226,8 +1235,8 @@ static void en_dis_able_nic_intrs(struct
 			/* writing 0 Enables all 8 RX interrupt levels */
 			writeq(0x0, &bar0->rx_traffic_mask);
 		} else if (flag == DISABLE_INTRS) {
-			/*
-			 * Disable Rx Traffic Intrs in the general intr mask
+			/*
+			 * Disable Rx Traffic Intrs in the general intr mask
 			 * register.
 			 */
 			writeq(DISABLE_ALL_INTRS, &bar0->rx_traffic_mask);
@@ -1238,20 +1247,42 @@ static void en_dis_able_nic_intrs(struct
 	}
 }
 
-/**
- * verify_xena_quiescence - Checks whether the H/W is ready
+static int check_prc_pcc_state(u64 val64, int flag)
+{
+	int ret = 0;
+
+	if (flag == FALSE) {
+		if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) &&
+		    ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) ==
+		     ADAPTER_STATUS_RC_PRC_QUIESCENT)) {
+			ret = 1;
+		}
+	} else {
+		if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) ==
+		     ADAPTER_STATUS_RMAC_PCC_IDLE) &&
+		    (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) ||
+		     ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) ==
+		      ADAPTER_STATUS_RC_PRC_QUIESCENT))) {
+			ret = 1;
+		}
+	}
+
+	return ret;
+}
+/**
+ * verify_xena_quiescence - Checks whether the H/W is ready
  * @val64 : Value read from adapter status register.
  * @flag : indicates if the adapter enable bit was ever written once
  * before.
  * Description: Returns whether the H/W is ready to go or not. Depending
- * on whether adapter enable bit was written or not the comparison
+ * on whether adapter enable bit was written or not the comparison
  * differs and the calling function passes the input argument flag to
  * indicate this.
- * Return: 1 If xena is quiescence + * Return: 1 If xena is quiescence * 0 If Xena is not quiescence */ -static int verify_xena_quiescence(u64 val64, int flag) +static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag) { int ret = 0; u64 tmp64 = ~((u64) val64); @@ -1263,25 +1294,7 @@ static int verify_xena_quiescence(u64 va ADAPTER_STATUS_PIC_QUIESCENT | ADAPTER_STATUS_MC_DRAM_READY | ADAPTER_STATUS_MC_QUEUES_READY | ADAPTER_STATUS_M_PLL_LOCK | ADAPTER_STATUS_P_PLL_LOCK))) { - if (flag == FALSE) { - if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT)) { - - ret = 1; - - } - } else { - if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == - ADAPTER_STATUS_RMAC_PCC_IDLE) && - (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || - ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == - ADAPTER_STATUS_RC_PRC_QUIESCENT))) { - - ret = 1; - - } - } + ret = check_prc_pcc_state(val64, flag); } return ret; @@ -1290,12 +1303,12 @@ static int verify_xena_quiescence(u64 va /** * fix_mac_address - Fix for Mac addr problem on Alpha platforms * @sp: Pointer to device specifc structure - * Description : + * Description : * New procedure to clear mac address reading problems on Alpha platforms * */ -static void fix_mac_address(nic_t * sp) +void fix_mac_address(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64; @@ -1303,20 +1316,21 @@ static void fix_mac_address(nic_t * sp) while (fix_mac[i] != END_SIGN) { writeq(fix_mac[i++], &bar0->gpio_control); + udelay(10); val64 = readq(&bar0->gpio_control); } } /** - * start_nic - Turns the device on + * start_nic - Turns the device on * @nic : device private variable. - * Description: - * This function actually turns the device on. Before this function is - * called,all Registers are configured from their reset states - * and shared memory is allocated but the NIC is still quiescent. On + * Description: + * This function actually turns the device on. 
Before this function is + * called,all Registers are configured from their reset states + * and shared memory is allocated but the NIC is still quiescent. On * calling this function, the device interrupts are cleared and the NIC is * literally switched on by writing into the adapter control register. - * Return Value: + * Return Value: * SUCCESS on success and -1 on failure. */ @@ -1325,8 +1339,8 @@ static int start_nic(struct s2io_nic *ni XENA_dev_config_t __iomem *bar0 = nic->bar0; struct net_device *dev = nic->dev; register u64 val64 = 0; - u16 interruptible, i; - u16 subid; + u16 interruptible; + u16 subid, i; mac_info_t *mac_control; struct config_param *config; @@ -1335,7 +1349,7 @@ static int start_nic(struct s2io_nic *ni /* PRC Initialization and configuration */ for (i = 0; i < config->rx_ring_num; i++) { - writeq((u64) nic->rx_blocks[i][0].block_dma_addr, + writeq((u64) mac_control->rings[i].rx_blocks[0].block_dma_addr, &bar0->prc_rxd0_n[i]); val64 = readq(&bar0->prc_ctrl_n[i]); @@ -1354,7 +1368,7 @@ static int start_nic(struct s2io_nic *ni writeq(val64, &bar0->rx_pa_cfg); #endif - /* + /* * Enabling MC-RLDRAM. After enabling the device, we timeout * for around 100ms, which is approximately the time required * for the device to be ready for operation. @@ -1364,27 +1378,27 @@ static int start_nic(struct s2io_nic *ni SPECIAL_REG_WRITE(val64, &bar0->mc_rldram_mrs, UF); val64 = readq(&bar0->mc_rldram_mrs); - msleep(100); /* Delay by around 100 ms. */ + msleep(100); /* Delay by around 100 ms. */ /* Enabling ECC Protection. */ val64 = readq(&bar0->adapter_control); val64 &= ~ADAPTER_ECC_EN; writeq(val64, &bar0->adapter_control); - /* - * Clearing any possible Link state change interrupts that + /* + * Clearing any possible Link state change interrupts that * could have popped up just before Enabling the card. 
*/ val64 = readq(&bar0->mac_rmac_err_reg); if (val64) writeq(val64, &bar0->mac_rmac_err_reg); - /* - * Verify if the device is ready to be enabled, if so enable + /* + * Verify if the device is ready to be enabled, if so enable * it. */ val64 = readq(&bar0->adapter_status); - if (!verify_xena_quiescence(val64, nic->device_enabled_once)) { + if (!verify_xena_quiescence(nic, val64, nic->device_enabled_once)) { DBG_PRINT(ERR_DBG, "%s: device is not ready, ", dev->name); DBG_PRINT(ERR_DBG, "Adapter status reads: 0x%llx\n", (unsigned long long) val64); @@ -1396,12 +1410,12 @@ static int start_nic(struct s2io_nic *ni RX_MAC_INTR; en_dis_able_nic_intrs(nic, interruptible, ENABLE_INTRS); - /* + /* * With some switches, link might be already up at this point. - * Because of this weird behavior, when we enable laser, - * we may not get link. We need to handle this. We cannot - * figure out which switch is misbehaving. So we are forced to - * make a global change. + * Because of this weird behavior, when we enable laser, + * we may not get link. We need to handle this. We cannot + * figure out which switch is misbehaving. So we are forced to + * make a global change. */ /* Enabling Laser. */ @@ -1416,17 +1430,17 @@ static int start_nic(struct s2io_nic *ni val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); val64 = 0x0411040400000000ULL; - writeq(val64, (void __iomem *) bar0 + 0x2700); + writeq(val64, (void __iomem *) ((u8 *) bar0 + 0x2700)); } - /* - * Don't see link state interrupts on certain switches, so + /* + * Don't see link state interrupts on certain switches, so * directly scheduling a link state task from here. */ schedule_work(&nic->set_link_task); - /* - * Here we are performing soft reset on XGXS to + /* + * Here we are performing soft reset on XGXS to * force link down. 
Since link is already up, we will get * link state change interrupt after this reset */ @@ -1443,12 +1457,12 @@ static int start_nic(struct s2io_nic *ni return SUCCESS; } -/** - * free_tx_buffers - Free all queued Tx buffers +/** + * free_tx_buffers - Free all queued Tx buffers * @nic : device private variable. - * Description: + * Description: * Free all queued Tx buffers. - * Return Value: void + * Return Value: void */ static void free_tx_buffers(struct s2io_nic *nic) @@ -1466,7 +1480,7 @@ static void free_tx_buffers(struct s2io_ for (i = 0; i < config->tx_fifo_num; i++) { for (j = 0; j < config->tx_cfg[i].fifo_len - 1; j++) { - txdp = (TxD_t *) nic->list_info[i][j]. + txdp = (TxD_t *) mac_control->fifos[i].list_info[j]. list_virt_addr; skb = (struct sk_buff *) ((unsigned long) txdp-> @@ -1482,16 +1496,16 @@ static void free_tx_buffers(struct s2io_ DBG_PRINT(INTR_DBG, "%s:forcibly freeing %d skbs on FIFO%d\n", dev->name, cnt, i); - mac_control->tx_curr_get_info[i].offset = 0; - mac_control->tx_curr_put_info[i].offset = 0; + mac_control->fifos[i].tx_curr_get_info.offset = 0; + mac_control->fifos[i].tx_curr_put_info.offset = 0; } } -/** - * stop_nic - To stop the nic +/** + * stop_nic - To stop the nic * @nic ; device private variable. - * Description: - * This function does exactly the opposite of what the start_nic() + * Description: + * This function does exactly the opposite of what the start_nic() * function does. This function is called to stop the device. * Return Value: * void. @@ -1521,11 +1535,11 @@ static void stop_nic(struct s2io_nic *ni } } -/** - * fill_rx_buffers - Allocates the Rx side skbs +/** + * fill_rx_buffers - Allocates the Rx side skbs * @nic: device private variable - * @ring_no: ring number - * Description: + * @ring_no: ring number + * Description: * The function allocates Rx side skbs and puts the physical * address of these buffers into the RxD buffer pointers, so that the NIC * can DMA the received frame into these locations. 
@@ -1533,8 +1547,8 @@ static void stop_nic(struct s2io_nic *ni * 1. single buffer, * 2. three buffer and * 3. Five buffer modes. - * Each mode defines how many fragments the received frame will be split - * up into by the NIC. The frame is split into L3 header, L4 Header, + * Each mode defines how many fragments the received frame will be split + * up into by the NIC. The frame is split into L3 header, L4 Header, * L4 payload in three buffer mode and in 5 buffer mode, L4 payload itself * is split into 3 fragments. As of now only single buffer mode is * supported. @@ -1542,7 +1556,7 @@ static void stop_nic(struct s2io_nic *ni * SUCCESS on success or an appropriate -ve value on failure. */ -static int fill_rx_buffers(struct s2io_nic *nic, int ring_no) +int fill_rx_buffers(struct s2io_nic *nic, int ring_no) { struct net_device *dev = nic->dev; struct sk_buff *skb; @@ -1550,14 +1564,13 @@ static int fill_rx_buffers(struct s2io_n int off, off1, size, block_no, block_no1; int offset, offset1; u32 alloc_tab = 0; - u32 alloc_cnt = nic->pkt_cnt[ring_no] - - atomic_read(&nic->rx_bufs_left[ring_no]); + u32 alloc_cnt; mac_info_t *mac_control; struct config_param *config; #ifdef CONFIG_2BUFF_MODE RxD_t *rxdpnext; int nextblk; - unsigned long tmp; + u64 tmp; buffAdd_t *ba; dma_addr_t rxdpphys; #endif @@ -1567,17 +1580,18 @@ static int fill_rx_buffers(struct s2io_n mac_control = &nic->mac_control; config = &nic->config; - + alloc_cnt = mac_control->rings[ring_no].pkt_cnt - + atomic_read(&nic->rx_bufs_left[ring_no]); size = dev->mtu + HEADER_ETHERNET_II_802_3_SIZE + HEADER_802_2_SIZE + HEADER_SNAP_SIZE; while (alloc_tab < alloc_cnt) { - block_no = mac_control->rx_curr_put_info[ring_no]. + block_no = mac_control->rings[ring_no].rx_curr_put_info. block_index; - block_no1 = mac_control->rx_curr_get_info[ring_no]. + block_no1 = mac_control->rings[ring_no].rx_curr_get_info. 
block_index; - off = mac_control->rx_curr_put_info[ring_no].offset; - off1 = mac_control->rx_curr_get_info[ring_no].offset; + off = mac_control->rings[ring_no].rx_curr_put_info.offset; + off1 = mac_control->rings[ring_no].rx_curr_get_info.offset; #ifndef CONFIG_2BUFF_MODE offset = block_no * (MAX_RXDS_PER_BLOCK + 1) + off; offset1 = block_no1 * (MAX_RXDS_PER_BLOCK + 1) + off1; @@ -1586,7 +1600,7 @@ static int fill_rx_buffers(struct s2io_n offset1 = block_no1 * (MAX_RXDS_PER_BLOCK) + off1; #endif - rxdp = nic->rx_blocks[ring_no][block_no]. + rxdp = mac_control->rings[ring_no].rx_blocks[block_no]. block_virt_addr + off; if ((offset == offset1) && (rxdp->Host_Control)) { DBG_PRINT(INTR_DBG, "%s: Get and Put", dev->name); @@ -1595,15 +1609,15 @@ static int fill_rx_buffers(struct s2io_n } #ifndef CONFIG_2BUFF_MODE if (rxdp->Control_1 == END_OF_BLOCK) { - mac_control->rx_curr_put_info[ring_no]. + mac_control->rings[ring_no].rx_curr_put_info. block_index++; - mac_control->rx_curr_put_info[ring_no]. - block_index %= nic->block_count[ring_no]; - block_no = mac_control->rx_curr_put_info - [ring_no].block_index; + mac_control->rings[ring_no].rx_curr_put_info. + block_index %= mac_control->rings[ring_no].block_count; + block_no = mac_control->rings[ring_no].rx_curr_put_info. + block_index; off++; off %= (MAX_RXDS_PER_BLOCK + 1); - mac_control->rx_curr_put_info[ring_no].offset = + mac_control->rings[ring_no].rx_curr_put_info.offset = off; rxdp = (RxD_t *) ((unsigned long) rxdp->Control_2); DBG_PRINT(INTR_DBG, "%s: Next block at: %p\n", @@ -1611,30 +1625,30 @@ static int fill_rx_buffers(struct s2io_n } #ifndef CONFIG_S2IO_NAPI spin_lock_irqsave(&nic->put_lock, flags); - nic->put_pos[ring_no] = + mac_control->rings[ring_no].put_pos = (block_no * (MAX_RXDS_PER_BLOCK + 1)) + off; spin_unlock_irqrestore(&nic->put_lock, flags); #endif #else if (rxdp->Host_Control == END_OF_BLOCK) { - mac_control->rx_curr_put_info[ring_no]. + mac_control->rings[ring_no].rx_curr_put_info. 
block_index++; - mac_control->rx_curr_put_info[ring_no]. - block_index %= nic->block_count[ring_no]; - block_no = mac_control->rx_curr_put_info - [ring_no].block_index; + mac_control->rings[ring_no].rx_curr_put_info.block_index + %= mac_control->rings[ring_no].block_count; + block_no = mac_control->rings[ring_no].rx_curr_put_info + .block_index; off = 0; DBG_PRINT(INTR_DBG, "%s: block%d at: 0x%llx\n", dev->name, block_no, (unsigned long long) rxdp->Control_1); - mac_control->rx_curr_put_info[ring_no].offset = + mac_control->rings[ring_no].rx_curr_put_info.offset = off; - rxdp = nic->rx_blocks[ring_no][block_no]. + rxdp = mac_control->rings[ring_no].rx_blocks[block_no]. block_virt_addr; } #ifndef CONFIG_S2IO_NAPI spin_lock_irqsave(&nic->put_lock, flags); - nic->put_pos[ring_no] = (block_no * + mac_control->rings[ring_no].put_pos = (block_no * (MAX_RXDS_PER_BLOCK + 1)) + off; spin_unlock_irqrestore(&nic->put_lock, flags); #endif @@ -1646,27 +1660,27 @@ static int fill_rx_buffers(struct s2io_n if (rxdp->Control_2 & BIT(0)) #endif { - mac_control->rx_curr_put_info[ring_no]. + mac_control->rings[ring_no].rx_curr_put_info. offset = off; goto end; } #ifdef CONFIG_2BUFF_MODE - /* - * RxDs Spanning cache lines will be replenished only - * if the succeeding RxD is also owned by Host. It - * will always be the ((8*i)+3) and ((8*i)+6) - * descriptors for the 48 byte descriptor. The offending + /* + * RxDs Spanning cache lines will be replenished only + * if the succeeding RxD is also owned by Host. It + * will always be the ((8*i)+3) and ((8*i)+6) + * descriptors for the 48 byte descriptor. The offending * decsriptor is of-course the 3rd descriptor. */ - rxdpphys = nic->rx_blocks[ring_no][block_no]. + rxdpphys = mac_control->rings[ring_no].rx_blocks[block_no]. block_dma_addr + (off * sizeof(RxD_t)); if (((u64) (rxdpphys)) % 128 > 80) { - rxdpnext = nic->rx_blocks[ring_no][block_no]. + rxdpnext = mac_control->rings[ring_no].rx_blocks[block_no]. 
block_virt_addr + (off + 1); if (rxdpnext->Host_Control == END_OF_BLOCK) { nextblk = (block_no + 1) % - (nic->block_count[ring_no]); - rxdpnext = nic->rx_blocks[ring_no] + (mac_control->rings[ring_no].block_count); + rxdpnext = mac_control->rings[ring_no].rx_blocks [nextblk].block_virt_addr; } if (rxdpnext->Control_2 & BIT(0)) @@ -1695,9 +1709,9 @@ static int fill_rx_buffers(struct s2io_n rxdp->Control_1 |= RXD_OWN_XENA; off++; off %= (MAX_RXDS_PER_BLOCK + 1); - mac_control->rx_curr_put_info[ring_no].offset = off; + mac_control->rings[ring_no].rx_curr_put_info.offset = off; #else - ba = &nic->ba[ring_no][block_no][off]; + ba = &mac_control->rings[ring_no].ba[block_no][off]; skb_reserve(skb, BUF0_LEN); tmp = ((unsigned long) skb->data & ALIGN_SIZE); if (tmp) @@ -1721,8 +1735,9 @@ static int fill_rx_buffers(struct s2io_n rxdp->Host_Control = (u64) ((unsigned long) (skb)); rxdp->Control_1 |= RXD_OWN_XENA; off++; - mac_control->rx_curr_put_info[ring_no].offset = off; + mac_control->rings[ring_no].rx_curr_put_info.offset = off; #endif + atomic_inc(&nic->rx_bufs_left[ring_no]); alloc_tab++; } @@ -1732,9 +1747,9 @@ static int fill_rx_buffers(struct s2io_n } /** - * free_rx_buffers - Frees all Rx buffers + * free_rx_buffers - Frees all Rx buffers * @sp: device private variable. - * Description: + * Description: * This function will free all Rx buffers allocated by host. * Return Value: * NONE. @@ -1758,7 +1773,8 @@ static void free_rx_buffers(struct s2io_ for (i = 0; i < config->rx_ring_num; i++) { for (j = 0, blk = 0; j < config->rx_cfg[i].num_rxd; j++) { off = j % (MAX_RXDS_PER_BLOCK + 1); - rxdp = sp->rx_blocks[i][blk].block_virt_addr + off; + rxdp = mac_control->rings[i].rx_blocks[blk]. 
+ block_virt_addr + off; #ifndef CONFIG_2BUFF_MODE if (rxdp->Control_1 == END_OF_BLOCK) { @@ -1793,7 +1809,7 @@ static void free_rx_buffers(struct s2io_ HEADER_SNAP_SIZE, PCI_DMA_FROMDEVICE); #else - ba = &sp->ba[i][blk][off]; + ba = &mac_control->rings[i].ba[blk][off]; pci_unmap_single(sp->pdev, (dma_addr_t) rxdp->Buffer0_ptr, BUF0_LEN, @@ -1813,10 +1829,10 @@ static void free_rx_buffers(struct s2io_ } memset(rxdp, 0, sizeof(RxD_t)); } - mac_control->rx_curr_put_info[i].block_index = 0; - mac_control->rx_curr_get_info[i].block_index = 0; - mac_control->rx_curr_put_info[i].offset = 0; - mac_control->rx_curr_get_info[i].offset = 0; + mac_control->rings[i].rx_curr_put_info.block_index = 0; + mac_control->rings[i].rx_curr_get_info.block_index = 0; + mac_control->rings[i].rx_curr_put_info.offset = 0; + mac_control->rings[i].rx_curr_get_info.offset = 0; atomic_set(&sp->rx_bufs_left[i], 0); DBG_PRINT(INIT_DBG, "%s:Freed 0x%x Rx Buffers on ring%d\n", dev->name, buf_cnt, i); @@ -1826,7 +1842,7 @@ static void free_rx_buffers(struct s2io_ /** * s2io_poll - Rx interrupt handler for NAPI support * @dev : pointer to the device structure. - * @budget : The number of packets that were budgeted to be processed + * @budget : The number of packets that were budgeted to be processed * during one pass through the 'Poll" function. * Description: * Comes into picture only if NAPI support has been incorporated. It does @@ -1836,160 +1852,35 @@ static void free_rx_buffers(struct s2io_ * 0 on success and 1 if there are No Rx packets to be processed. 
*/ -#ifdef CONFIG_S2IO_NAPI +#if defined(CONFIG_S2IO_NAPI) static int s2io_poll(struct net_device *dev, int *budget) { nic_t *nic = dev->priv; - XENA_dev_config_t __iomem *bar0 = nic->bar0; - int pkts_to_process = *budget, pkt_cnt = 0; - register u64 val64 = 0; - rx_curr_get_info_t get_info, put_info; - int i, get_block, put_block, get_offset, put_offset, ring_bufs; -#ifndef CONFIG_2BUFF_MODE - u16 val16, cksum; -#endif - struct sk_buff *skb; - RxD_t *rxdp; + int pkt_cnt = 0, org_pkts_to_process; mac_info_t *mac_control; struct config_param *config; -#ifdef CONFIG_2BUFF_MODE - buffAdd_t *ba; -#endif + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + u64 val64; + int i; mac_control = &nic->mac_control; config = &nic->config; - if (pkts_to_process > dev->quota) - pkts_to_process = dev->quota; + nic->pkts_to_process = *budget; + if (nic->pkts_to_process > dev->quota) + nic->pkts_to_process = dev->quota; + org_pkts_to_process = nic->pkts_to_process; val64 = readq(&bar0->rx_traffic_int); writeq(val64, &bar0->rx_traffic_int); for (i = 0; i < config->rx_ring_num; i++) { - get_info = mac_control->rx_curr_get_info[i]; - get_block = get_info.block_index; - put_info = mac_control->rx_curr_put_info[i]; - put_block = put_info.block_index; - ring_bufs = config->rx_cfg[i].num_rxd; - rxdp = nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; -#ifndef CONFIG_2BUFF_MODE - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + - put_info.offset; - while ((!(rxdp->Control_1 & RXD_OWN_XENA)) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - if (--pkts_to_process < 0) { - goto no_rx; - } - if (rxdp->Control_1 == END_OF_BLOCK) { - rxdp = - (RxD_t *) ((unsigned long) rxdp-> - Control_2); - get_info.offset++; - get_info.offset %= - (MAX_RXDS_PER_BLOCK + 1); - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. 
- offset = get_info.offset; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - continue; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - skb = - (struct sk_buff *) ((unsigned long) rxdp-> - Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - goto no_rx; - } - val64 = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); - val16 = (u16) (val64 >> 48); - cksum = RXD_GET_L4_CKSUM(rxdp->Control_1); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - dev->mtu + - HEADER_ETHERNET_II_802_3_SIZE + - HEADER_802_2_SIZE + - HEADER_SNAP_SIZE, - PCI_DMA_FROMDEVICE); - rx_osm_handler(nic, val16, rxdp, i); - pkt_cnt++; - get_info.offset++; - get_info.offset %= (MAX_RXDS_PER_BLOCK + 1); - rxdp = - nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - } -#else - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + - put_info.offset; - while (((!(rxdp->Control_1 & RXD_OWN_XENA)) && - !(rxdp->Control_2 & BIT(0))) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - if (--pkts_to_process < 0) { - goto no_rx; - } - skb = (struct sk_buff *) ((unsigned long) - rxdp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - goto no_rx; - } - - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - BUF0_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer1_ptr, - BUF1_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer2_ptr, - dev->mtu + BUF0_LEN + 4, - PCI_DMA_FROMDEVICE); - ba = &nic->ba[i][get_block][get_info.offset]; - - rx_osm_handler(nic, rxdp, i, ba); - - get_info.offset++; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - rxdp = - 
nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; - - if (get_info.offset && - (!(get_info.offset % MAX_RXDS_PER_BLOCK))) { - get_info.offset = 0; - mac_control->rx_curr_get_info[i]. - offset = get_info.offset; - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - rxdp = - nic->rx_blocks[i][get_block]. - block_virt_addr; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - pkt_cnt++; + rx_intr_handler(&mac_control->rings[i]); + pkt_cnt = org_pkts_to_process - nic->pkts_to_process; + if (!nic->pkts_to_process) { + /* Quota for the current iteration has been met */ + goto no_rx; } -#endif } if (!pkt_cnt) pkt_cnt = 1; @@ -2009,7 +1900,7 @@ static int s2io_poll(struct net_device * en_dis_able_nic_intrs(nic, RX_TRAFFIC_INTR, ENABLE_INTRS); return 0; - no_rx: +no_rx: dev->quota -= pkt_cnt; *budget -= pkt_cnt; @@ -2022,277 +1913,213 @@ static int s2io_poll(struct net_device * } return 1; } -#else -/** +#endif + +/** * rx_intr_handler - Rx interrupt handler * @nic: device private variable. - * Description: - * If the interrupt is because of a received frame or if the + * Description: + * If the interrupt is because of a received frame or if the * receive ring contains fresh as yet un-processed frames,this function is - * called. It picks out the RxD at which place the last Rx processing had - * stopped and sends the skb to the OSM's Rx handler and then increments + * called. It picks out the RxD at which place the last Rx processing had + * stopped and sends the skb to the OSM's Rx handler and then increments * the offset. * Return Value: * NONE. 
*/ - -static void rx_intr_handler(struct s2io_nic *nic) +static void rx_intr_handler(ring_info_t *ring_data) { + nic_t *nic = ring_data->nic; struct net_device *dev = (struct net_device *) nic->dev; - XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + XENA_dev_config_t __iomem *bar0 = nic->bar0; + int get_block, get_offset, put_block, put_offset, ring_bufs; rx_curr_get_info_t get_info, put_info; RxD_t *rxdp; struct sk_buff *skb; -#ifndef CONFIG_2BUFF_MODE - u16 val16, cksum; -#endif - register u64 val64 = 0; - int get_block, get_offset, put_block, put_offset, ring_bufs; - int i, pkt_cnt = 0; - mac_info_t *mac_control; - struct config_param *config; -#ifdef CONFIG_2BUFF_MODE - buffAdd_t *ba; +#ifndef CONFIG_S2IO_NAPI + int pkt_cnt = 0; #endif + register u64 val64; - mac_control = &nic->mac_control; - config = &nic->config; - - /* - * rx_traffic_int reg is an R1 register, hence we read and write back - * the samevalue in the register to clear it. + /* + * rx_traffic_int reg is an R1 register, hence we read and write + * back the same value in the register to clear it */ - val64 = readq(&bar0->rx_traffic_int); - writeq(val64, &bar0->rx_traffic_int); + val64 = readq(&bar0->rx_traffic_int); + writeq(val64, &bar0->rx_traffic_int); - for (i = 0; i < config->rx_ring_num; i++) { - get_info = mac_control->rx_curr_get_info[i]; - get_block = get_info.block_index; - put_info = mac_control->rx_curr_put_info[i]; - put_block = put_info.block_index; - ring_bufs = config->rx_cfg[i].num_rxd; - rxdp = nic->rx_blocks[i][get_block].block_virt_addr + + get_info = ring_data->rx_curr_get_info; + get_block = get_info.block_index; + put_info = ring_data->rx_curr_put_info; + put_block = put_info.block_index; + ring_bufs = get_info.ring_len+1; + rxdp = ring_data->rx_blocks[get_block].block_virt_addr + get_info.offset; -#ifndef CONFIG_2BUFF_MODE - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - spin_lock(&nic->put_lock); - put_offset = nic->put_pos[i]; -
spin_unlock(&nic->put_lock); - while ((!(rxdp->Control_1 & RXD_OWN_XENA)) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - if (rxdp->Control_1 == END_OF_BLOCK) { - rxdp = (RxD_t *) ((unsigned long) - rxdp->Control_2); - get_info.offset++; - get_info.offset %= - (MAX_RXDS_PER_BLOCK + 1); - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. - offset = get_info.offset; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - continue; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + - get_info.offset; - skb = (struct sk_buff *) ((unsigned long) - rxdp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - return; - } - val64 = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); - val16 = (u16) (val64 >> 48); - cksum = RXD_GET_L4_CKSUM(rxdp->Control_1); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - dev->mtu + - HEADER_ETHERNET_II_802_3_SIZE + - HEADER_802_2_SIZE + - HEADER_SNAP_SIZE, - PCI_DMA_FROMDEVICE); - rx_osm_handler(nic, val16, rxdp, i); - get_info.offset++; - get_info.offset %= (MAX_RXDS_PER_BLOCK + 1); - rxdp = - nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - pkt_cnt++; - if ((indicate_max_pkts) - && (pkt_cnt > indicate_max_pkts)) - break; + get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + + get_info.offset; +#ifndef CONFIG_S2IO_NAPI + spin_lock(&nic->put_lock); + put_offset = ring_data->put_pos; + spin_unlock(&nic->put_lock); +#else + put_offset = (put_block * (MAX_RXDS_PER_BLOCK + 1)) + + put_info.offset; +#endif + while ((!(rxdp->Control_1 & RXD_OWN_XENA)) && +#ifdef CONFIG_2BUFF_MODE + (!(rxdp->Control_2 & BIT(0))) && +#endif + (((get_offset + 1) % ring_bufs) != put_offset)) { + skb = (struct sk_buff *) ((unsigned long)rxdp->Host_Control); + if (skb == NULL) { + DBG_PRINT(ERR_DBG, "%s: The skb is ", + dev->name); +
DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); + return; } +#ifndef CONFIG_2BUFF_MODE + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer0_ptr, + dev->mtu + + HEADER_ETHERNET_II_802_3_SIZE + + HEADER_802_2_SIZE + + HEADER_SNAP_SIZE, + PCI_DMA_FROMDEVICE); #else - get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer0_ptr, + BUF0_LEN, PCI_DMA_FROMDEVICE); + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer1_ptr, + BUF1_LEN, PCI_DMA_FROMDEVICE); + pci_unmap_single(nic->pdev, (dma_addr_t) + rxdp->Buffer2_ptr, + dev->mtu + BUF0_LEN + 4, + PCI_DMA_FROMDEVICE); +#endif + rx_osm_handler(ring_data, rxdp); + get_info.offset++; + ring_data->rx_curr_get_info.offset = get_info.offset; - spin_lock(&nic->put_lock); - put_offset = nic->put_pos[i]; - spin_unlock(&nic->put_lock); - while (((!(rxdp->Control_1 & RXD_OWN_XENA)) && - !(rxdp->Control_2 & BIT(0))) && - (((get_offset + 1) % ring_bufs) != put_offset)) { - skb = (struct sk_buff *) ((unsigned long) - rxdp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: The skb is ", - dev->name); - DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); - return; - } - - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer0_ptr, - BUF0_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer1_ptr, - BUF1_LEN, PCI_DMA_FROMDEVICE); - pci_unmap_single(nic->pdev, (dma_addr_t) - rxdp->Buffer2_ptr, - dev->mtu + BUF0_LEN + 4, - PCI_DMA_FROMDEVICE); - ba = &nic->ba[i][get_block][get_info.offset]; - - rx_osm_handler(nic, rxdp, i, ba); - - get_info.offset++; - mac_control->rx_curr_get_info[i].offset = - get_info.offset; - rxdp = - nic->rx_blocks[i][get_block].block_virt_addr + - get_info.offset; + rxdp = ring_data->rx_blocks[get_block].block_virt_addr + + get_info.offset; + if (get_info.offset && + (!(get_info.offset % MAX_RXDS_PER_BLOCK))) { + get_info.offset = 0; + ring_data->rx_curr_get_info.offset + = get_info.offset; + get_block++; + get_block %= 
ring_data->block_count; + ring_data->rx_curr_get_info.block_index + = get_block; + rxdp = ring_data->rx_blocks[get_block].block_virt_addr; + } - if (get_info.offset && - (!(get_info.offset % MAX_RXDS_PER_BLOCK))) { - get_info.offset = 0; - mac_control->rx_curr_get_info[i]. - offset = get_info.offset; - get_block++; - get_block %= nic->block_count[i]; - mac_control->rx_curr_get_info[i]. - block_index = get_block; - rxdp = - nic->rx_blocks[i][get_block]. - block_virt_addr; - } - get_offset = - (get_block * (MAX_RXDS_PER_BLOCK + 1)) + + get_offset = (get_block * (MAX_RXDS_PER_BLOCK + 1)) + get_info.offset; - pkt_cnt++; - if ((indicate_max_pkts) - && (pkt_cnt > indicate_max_pkts)) - break; - } -#endif +#ifdef CONFIG_S2IO_NAPI + nic->pkts_to_process -= 1; + if (!nic->pkts_to_process) + break; +#else + pkt_cnt++; if ((indicate_max_pkts) && (pkt_cnt > indicate_max_pkts)) break; +#endif } } -#endif -/** + +/** * tx_intr_handler - Transmit interrupt handler * @nic : device private variable - * Description: - * If an interrupt was raised to indicate DMA complete of the - * Tx packet, this function is called. It identifies the last TxD - * whose buffer was freed and frees all skbs whose data have already + * Description: + * If an interrupt was raised to indicate DMA complete of the + * Tx packet, this function is called. It identifies the last TxD + * whose buffer was freed and frees all skbs whose data have already * DMA'ed into the NICs internal memory. 
* Return Value: * NONE */ -static void tx_intr_handler(struct s2io_nic *nic) +static void tx_intr_handler(fifo_info_t *fifo_data) { + nic_t *nic = fifo_data->nic; XENA_dev_config_t __iomem *bar0 = nic->bar0; struct net_device *dev = (struct net_device *) nic->dev; tx_curr_get_info_t get_info, put_info; struct sk_buff *skb; TxD_t *txdlp; - register u64 val64 = 0; - int i; u16 j, frg_cnt; - mac_info_t *mac_control; - struct config_param *config; - - mac_control = &nic->mac_control; - config = &nic->config; + register u64 val64 = 0; - /* - * tx_traffic_int reg is an R1 register, hence we read and write - * back the samevalue in the register to clear it. + /* + * tx_traffic_int reg is an R1 register, hence we read and write + * back the same value in the register to clear it */ val64 = readq(&bar0->tx_traffic_int); writeq(val64, &bar0->tx_traffic_int); - for (i = 0; i < config->tx_fifo_num; i++) { - get_info = mac_control->tx_curr_get_info[i]; - put_info = mac_control->tx_curr_put_info[i]; - txdlp = (TxD_t *) nic->list_info[i][get_info.offset]. - list_virt_addr; - while ((!(txdlp->Control_1 & TXD_LIST_OWN_XENA)) && - (get_info.offset != put_info.offset) && - (txdlp->Host_Control)) { - /* Check for TxD errors */ - if (txdlp->Control_1 & TXD_T_CODE) { - unsigned long long err; - err = txdlp->Control_1 & TXD_T_CODE; - DBG_PRINT(ERR_DBG, "***TxD error %llx\n", - err); - } - - skb = (struct sk_buff *) ((unsigned long) - txdlp->Host_Control); - if (skb == NULL) { - DBG_PRINT(ERR_DBG, "%s: Null skb ", - dev->name); - DBG_PRINT(ERR_DBG, "in Tx Free Intr\n"); - return; - } - nic->tx_pkt_count++; + get_info = fifo_data->tx_curr_get_info; + put_info = fifo_data->tx_curr_put_info; + txdlp = (TxD_t *) fifo_data->list_info[get_info.offset]. 
+ list_virt_addr; + while ((!(txdlp->Control_1 & TXD_LIST_OWN_XENA)) && + (get_info.offset != put_info.offset) && + (txdlp->Host_Control)) { + /* Check for TxD errors */ + if (txdlp->Control_1 & TXD_T_CODE) { + unsigned long long err; + err = txdlp->Control_1 & TXD_T_CODE; + DBG_PRINT(ERR_DBG, "***TxD error %llx\n", + err); + } + + skb = (struct sk_buff *) ((unsigned long) + txdlp->Host_Control); + if (skb == NULL) { + DBG_PRINT(ERR_DBG, "%s: Null skb ", + __FUNCTION__); + DBG_PRINT(ERR_DBG, "in Tx Free Intr\n"); + return; + } - frg_cnt = skb_shinfo(skb)->nr_frags; + frg_cnt = skb_shinfo(skb)->nr_frags; + nic->tx_pkt_count++; - /* For unfragmented skb */ - pci_unmap_single(nic->pdev, (dma_addr_t) - txdlp->Buffer_Pointer, - skb->len - skb->data_len, - PCI_DMA_TODEVICE); - if (frg_cnt) { - TxD_t *temp = txdlp; - txdlp++; - for (j = 0; j < frg_cnt; j++, txdlp++) { - skb_frag_t *frag = - &skb_shinfo(skb)->frags[j]; - pci_unmap_page(nic->pdev, - (dma_addr_t) - txdlp-> - Buffer_Pointer, - frag->size, - PCI_DMA_TODEVICE); - } - txdlp = temp; + pci_unmap_single(nic->pdev, (dma_addr_t) + txdlp->Buffer_Pointer, + skb->len - skb->data_len, + PCI_DMA_TODEVICE); + if (frg_cnt) { + TxD_t *temp; + temp = txdlp; + txdlp++; + for (j = 0; j < frg_cnt; j++, txdlp++) { + skb_frag_t *frag = + &skb_shinfo(skb)->frags[j]; + pci_unmap_page(nic->pdev, + (dma_addr_t) + txdlp-> + Buffer_Pointer, + frag->size, + PCI_DMA_TODEVICE); } - memset(txdlp, 0, - (sizeof(TxD_t) * config->max_txds)); - - /* Updating the statistics block */ - nic->stats.tx_packets++; - nic->stats.tx_bytes += skb->len; - dev_kfree_skb_irq(skb); - - get_info.offset++; - get_info.offset %= get_info.fifo_len + 1; - txdlp = (TxD_t *) nic->list_info[i] - [get_info.offset].list_virt_addr; - mac_control->tx_curr_get_info[i].offset = - get_info.offset; + txdlp = temp; } + memset(txdlp, 0, + (sizeof(TxD_t) * fifo_data->max_txds)); + + /* Updating the statistics block */ + nic->stats.tx_packets++; + nic->stats.tx_bytes += skb->len; 
+ dev_kfree_skb_irq(skb); + + get_info.offset++; + get_info.offset %= get_info.fifo_len + 1; + txdlp = (TxD_t *) fifo_data->list_info + [get_info.offset].list_virt_addr; + fifo_data->tx_curr_get_info.offset = + get_info.offset; } spin_lock(&nic->tx_lock); @@ -2301,13 +2128,13 @@ static void tx_intr_handler(struct s2io_ spin_unlock(&nic->tx_lock); } -/** +/** * alarm_intr_handler - Alarm Interrrupt handler * @nic: device private variable - * Description: If the interrupt was neither because of Rx packet or Tx + * Description: If the interrupt was neither because of Rx packet or Tx * complete, this function is called. If the interrupt was to indicate - * a loss of link, the OSM link status handler is invoked for any other - * alarm interrupt the block that raised the interrupt is displayed + * a loss of link, the OSM link status handler is invoked for any other + * alarm interrupt the block that raised the interrupt is displayed * and a H/W reset is issued. * Return Value: * NONE @@ -2338,7 +2165,7 @@ static void alarm_intr_handler(struct s2 /* * Also as mentioned in the latest Errata sheets if the PCC_FB_ECC * Error occurs, the adapter will be recycled by disabling the - * adapter enable bit and enabling it again after the device + * adapter enable bit and enabling it again after the device * becomes Quiescent. */ val64 = readq(&bar0->pcc_err_reg); @@ -2354,18 +2181,18 @@ static void alarm_intr_handler(struct s2 /* Other type of interrupts are not being handled now, TODO */ } -/** +/** * wait_for_cmd_complete - waits for a command to complete. - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * Description: Function that waits for a command to Write into RMAC - * ADDR DATA registers to be completed and returns either success or - * error depending on whether the command was complete or not. 
+ * Description: Function that waits for a command to Write into RMAC + * ADDR DATA registers to be completed and returns either success or + * error depending on whether the command was complete or not. * Return value: * SUCCESS on success and FAILURE on failure. */ -static int wait_for_cmd_complete(nic_t * sp) +int wait_for_cmd_complete(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; int ret = FAILURE, cnt = 0; @@ -2385,17 +2212,17 @@ static int wait_for_cmd_complete(nic_t * return ret; } -/** - * s2io_reset - Resets the card. +/** + * s2io_reset - Resets the card. * @sp : private member of the device structure. * Description: Function to Reset the card. This function then also - * restores the previously saved PCI configuration space registers as + * restores the previously saved PCI configuration space registers as * the card reset also resets the configuration space. * Return value: * void. */ -static void s2io_reset(nic_t * sp) +void s2io_reset(nic_t * sp) { XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64; @@ -2404,10 +2231,10 @@ static void s2io_reset(nic_t * sp) val64 = SW_RESET_ALL; writeq(val64, &bar0->sw_reset); - /* - * At this stage, if the PCI write is indeed completed, the - * card is reset and so is the PCI Config space of the device. - * So a read cannot be issued at this stage on any of the + /* + * At this stage, if the PCI write is indeed completed, the + * card is reset and so is the PCI Config space of the device. + * So a read cannot be issued at this stage on any of the * registers to ensure the write into "sw_reset" register * has gone through. * Question: Is there any system call that will explicitly force @@ -2420,10 +2247,17 @@ static void s2io_reset(nic_t * sp) /* Restore the PCI state saved during initializarion. 
*/ pci_restore_state(sp->pdev); + s2io_init_pci(sp); msleep(250); + /* Set swapper to enable I/O register access */ + s2io_set_swapper(sp); + + /* Reset device statistics maintained by OS */ + memset(&sp->stats, 0, sizeof (struct net_device_stats)); + /* SXE-002: Configure link and activity LED to turn it off */ subid = sp->pdev->subsystem_device; if ((subid & 0xFF) >= 0x07) { @@ -2431,29 +2265,29 @@ static void s2io_reset(nic_t * sp) val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); val64 = 0x0411040400000000ULL; - writeq(val64, (void __iomem *) bar0 + 0x2700); + writeq(val64, (void __iomem *) ((u8 *) bar0 + 0x2700)); } sp->device_enabled_once = FALSE; } /** - * s2io_set_swapper - to set the swapper controle on the card - * @sp : private member of the device structure, + * s2io_set_swapper - to set the swapper controle on the card + * @sp : private member of the device structure, * pointer to the s2io_nic structure. - * Description: Function to set the swapper control on the card + * Description: Function to set the swapper control on the card * correctly depending on the 'endianness' of the system. * Return value: * SUCCESS on success and FAILURE on failure. */ -static int s2io_set_swapper(nic_t * sp) +int s2io_set_swapper(nic_t * sp) { struct net_device *dev = sp->dev; XENA_dev_config_t __iomem *bar0 = sp->bar0; u64 val64, valt, valr; - /* + /* * Set proper endian settings and verify the same by reading * the PIF Feed-back register. 
*/ @@ -2505,8 +2339,9 @@ static int s2io_set_swapper(nic_t * sp) i++; } if(i == 4) { + unsigned long long x = val64; DBG_PRINT(ERR_DBG, "Write failed, Xmsi_addr "); - DBG_PRINT(ERR_DBG, "reads:0x%llx\n",val64); + DBG_PRINT(ERR_DBG, "reads:0x%llx\n", x); return FAILURE; } } @@ -2514,8 +2349,8 @@ static int s2io_set_swapper(nic_t * sp) val64 &= 0xFFFF000000000000ULL; #ifdef __BIG_ENDIAN - /* - * The device by default set to a big endian format, so a + /* + * The device by default set to a big endian format, so a * big endian driver need not set anything. */ val64 |= (SWAPPER_CTRL_TXP_FE | @@ -2531,9 +2366,9 @@ static int s2io_set_swapper(nic_t * sp) SWAPPER_CTRL_STATS_FE | SWAPPER_CTRL_STATS_SE); writeq(val64, &bar0->swapper_ctrl); #else - /* + /* * Initially we enable all bits to make it accessible by the - * driver, then we selectively enable only those bits that + * driver, then we selectively enable only those bits that * we want to set. */ val64 |= (SWAPPER_CTRL_TXP_FE | @@ -2555,8 +2390,8 @@ static int s2io_set_swapper(nic_t * sp) #endif val64 = readq(&bar0->swapper_ctrl); - /* - * Verifying if endian settings are accurate by reading a + /* + * Verifying if endian settings are accurate by reading a * feedback register. */ val64 = readq(&bar0->pif_rd_swapper_fb); @@ -2576,25 +2411,25 @@ static int s2io_set_swapper(nic_t * sp) * Functions defined below concern the OS part of the driver * * ********************************************************* */ -/** +/** * s2io_open - open entry point of the driver * @dev : pointer to the device structure. * Description: * This function is the open entry point of the driver. It mainly calls a * function to allocate Rx buffers and inserts them into the buffer - * descriptors and then enables the Rx part of the NIC. + * descriptors and then enables the Rx part of the NIC. * Return value: * 0 on success and an appropriate (-)ve integer as defined in errno.h * file on failure. 
*/ -static int s2io_open(struct net_device *dev) +int s2io_open(struct net_device *dev) { nic_t *sp = dev->priv; int err = 0; - /* - * Make sure you have link off by default every time + /* + * Make sure you have link off by default every time * Nic is initialized */ netif_carrier_off(dev); @@ -2604,27 +2439,34 @@ static int s2io_open(struct net_device * if (s2io_card_up(sp)) { DBG_PRINT(ERR_DBG, "%s: H/W initialization failed\n", dev->name); - return -ENODEV; + err = -ENODEV; + goto hw_init_failed; } /* After proper initialization of H/W, register ISR */ - err = request_irq((int) sp->irq, s2io_isr, SA_SHIRQ, + err = request_irq((int) sp->pdev->irq, s2io_isr, SA_SHIRQ, sp->name, dev); if (err) { - s2io_reset(sp); DBG_PRINT(ERR_DBG, "%s: ISR registration failed\n", dev->name); - return err; + goto isr_registration_failed; } if (s2io_set_mac_addr(dev, dev->dev_addr) == FAILURE) { DBG_PRINT(ERR_DBG, "Set Mac Address Failed\n"); - s2io_reset(sp); - return -ENODEV; + err = -ENODEV; + goto setting_mac_address_failed; } netif_start_queue(dev); return 0; + +setting_mac_address_failed: + free_irq(sp->pdev->irq, dev); +isr_registration_failed: + s2io_reset(sp); +hw_init_failed: + return err; } /** @@ -2640,16 +2482,15 @@ static int s2io_open(struct net_device * * file on failure. */ -static int s2io_close(struct net_device *dev) +int s2io_close(struct net_device *dev) { nic_t *sp = dev->priv; - flush_scheduled_work(); netif_stop_queue(dev); /* Reset card, kill tasklet and free Tx and Rx buffers. */ s2io_card_down(sp); - free_irq(dev->irq, dev); + free_irq(sp->pdev->irq, dev); sp->device_close_flag = TRUE; /* Device is shut down. */ return 0; } @@ -2667,7 +2508,7 @@ static int s2io_close(struct net_device * 0 on success & 1 on failure. 
*/ -static int s2io_xmit(struct sk_buff *skb, struct net_device *dev) +int s2io_xmit(struct sk_buff *skb, struct net_device *dev) { nic_t *sp = dev->priv; u16 frg_cnt, frg_len, i, queue, queue_len, put_off, get_off; @@ -2685,22 +2526,24 @@ static int s2io_xmit(struct sk_buff *skb mac_control = &sp->mac_control; config = &sp->config; - DBG_PRINT(TX_DBG, "%s: In S2IO Tx routine\n", dev->name); + DBG_PRINT(TX_DBG, "%s: In Neterion Tx routine\n", dev->name); spin_lock_irqsave(&sp->tx_lock, flags); - if (atomic_read(&sp->card_state) == CARD_DOWN) { - DBG_PRINT(ERR_DBG, "%s: Card going down for reset\n", + DBG_PRINT(TX_DBG, "%s: Card going down for reset\n", dev->name); spin_unlock_irqrestore(&sp->tx_lock, flags); - return 1; + dev_kfree_skb(skb); + return 0; } queue = 0; - put_off = (u16) mac_control->tx_curr_put_info[queue].offset; - get_off = (u16) mac_control->tx_curr_get_info[queue].offset; - txdp = (TxD_t *) sp->list_info[queue][put_off].list_virt_addr; - queue_len = mac_control->tx_curr_put_info[queue].fifo_len + 1; + put_off = (u16) mac_control->fifos[queue].tx_curr_put_info.offset; + get_off = (u16) mac_control->fifos[queue].tx_curr_get_info.offset; + txdp = (TxD_t *) mac_control->fifos[queue].list_info[put_off]. 
+ list_virt_addr; + + queue_len = mac_control->fifos[queue].tx_curr_put_info.fifo_len + 1; /* Avoid "put" pointer going beyond "get" pointer */ if (txdp->Host_Control || (((put_off + 1) % queue_len) == get_off)) { DBG_PRINT(ERR_DBG, "Error in xmit, No free TXDs.\n"); @@ -2720,9 +2563,9 @@ static int s2io_xmit(struct sk_buff *skb frg_cnt = skb_shinfo(skb)->nr_frags; frg_len = skb->len - skb->data_len; - txdp->Host_Control = (unsigned long) skb; txdp->Buffer_Pointer = pci_map_single (sp->pdev, skb->data, frg_len, PCI_DMA_TODEVICE); + txdp->Host_Control = (unsigned long) skb; if (skb->ip_summed == CHECKSUM_HW) { txdp->Control_2 |= (TXD_TX_CKO_IPV4_EN | TXD_TX_CKO_TCP_EN | @@ -2747,11 +2590,12 @@ static int s2io_xmit(struct sk_buff *skb txdp->Control_1 |= TXD_GATHER_CODE_LAST; tx_fifo = mac_control->tx_FIFO_start[queue]; - val64 = sp->list_info[queue][put_off].list_phy_addr; + val64 = mac_control->fifos[queue].list_info[put_off].list_phy_addr; writeq(val64, &tx_fifo->TxDL_Pointer); val64 = (TX_FIFO_LAST_TXD_NUM(frg_cnt) | TX_FIFO_FIRST_LIST | TX_FIFO_LAST_LIST); + #ifdef NETIF_F_TSO if (mss) val64 |= TX_FIFO_SPECIAL_FUNC; @@ -2762,8 +2606,8 @@ static int s2io_xmit(struct sk_buff *skb val64 = readq(&bar0->general_int_status); put_off++; - put_off %= mac_control->tx_curr_put_info[queue].fifo_len + 1; - mac_control->tx_curr_put_info[queue].offset = put_off; + put_off %= mac_control->fifos[queue].tx_curr_put_info.fifo_len + 1; + mac_control->fifos[queue].tx_curr_put_info.offset = put_off; /* Avoid "put" pointer going beyond "get" pointer */ if (((put_off + 1) % queue_len) == get_off) { @@ -2784,13 +2628,13 @@ static int s2io_xmit(struct sk_buff *skb * @irq: the irq of the device. * @dev_id: a void pointer to the dev structure of the NIC. * @pt_regs: pointer to the registers pushed on the stack. - * Description: This function is the ISR handler of the device. It - * identifies the reason for the interrupt and calls the relevant - * service routines. 
As a contongency measure, this ISR allocates the + * Description: This function is the ISR handler of the device. It + * identifies the reason for the interrupt and calls the relevant + * service routines. As a contongency measure, this ISR allocates the * recv buffers, if their numbers are below the panic value which is * presently set to 25% of the original number of rcv buffers allocated. * Return value: - * IRQ_HANDLED: will be returned if IRQ was handled by this routine + * IRQ_HANDLED: will be returned if IRQ was handled by this routine * IRQ_NONE: will be returned if interrupt is not from our device */ static irqreturn_t s2io_isr(int irq, void *dev_id, struct pt_regs *regs) @@ -2798,9 +2642,7 @@ static irqreturn_t s2io_isr(int irq, voi struct net_device *dev = (struct net_device *) dev_id; nic_t *sp = dev->priv; XENA_dev_config_t __iomem *bar0 = sp->bar0; -#ifndef CONFIG_S2IO_NAPI - int i, ret; -#endif + int i; u64 reason = 0; mac_info_t *mac_control; struct config_param *config; @@ -2808,13 +2650,13 @@ static irqreturn_t s2io_isr(int irq, voi mac_control = &sp->mac_control; config = &sp->config; - /* + /* * Identify the cause for interrupt and call the appropriate * interrupt handler. Causes for the interrupt could be; * 1. Rx of packet. * 2. Tx complete. * 3. Link down. - * 4. Error in any functional blocks of the NIC. + * 4. Error in any functional blocks of the NIC. 
*/ reason = readq(&bar0->general_int_status); @@ -2823,12 +2665,6 @@ static irqreturn_t s2io_isr(int irq, voi return IRQ_NONE; } - /* If Intr is because of Tx Traffic */ - if (reason & GEN_INTR_TXTRAFFIC) { - tx_intr_handler(sp); - } - - /* If Intr is because of an error */ if (reason & (GEN_ERROR_INTR)) alarm_intr_handler(sp); @@ -2843,17 +2679,26 @@ static irqreturn_t s2io_isr(int irq, voi #else /* If Intr is because of Rx Traffic */ if (reason & GEN_INTR_RXTRAFFIC) { - rx_intr_handler(sp); + for (i = 0; i < config->rx_ring_num; i++) { + rx_intr_handler(&mac_control->rings[i]); + } } #endif - /* - * If the Rx buffer count is below the panic threshold then - * reallocate the buffers from the interrupt handler itself, + /* If Intr is because of Tx Traffic */ + if (reason & GEN_INTR_TXTRAFFIC) { + for (i = 0; i < config->tx_fifo_num; i++) + tx_intr_handler(&mac_control->fifos[i]); + } + + /* + * If the Rx buffer count is below the panic threshold then + * reallocate the buffers from the interrupt handler itself, * else schedule a tasklet to reallocate the buffers. */ #ifndef CONFIG_S2IO_NAPI for (i = 0; i < config->rx_ring_num; i++) { + int ret; int rxb_size = atomic_read(&sp->rx_bufs_left[i]); int level = rx_buffer_level(sp, rxb_size, i); @@ -2878,29 +2723,33 @@ static irqreturn_t s2io_isr(int irq, voi } /** - * s2io_get_stats - Updates the device statistics structure. + * s2io_get_stats - Updates the device statistics structure. * @dev : pointer to the device structure. * Description: - * This function updates the device statistics structure in the s2io_nic + * This function updates the device statistics structure in the s2io_nic * structure and returns a pointer to the same. * Return value: * pointer to the updated net_device_stats structure. 
*/ -static struct net_device_stats *s2io_get_stats(struct net_device *dev) +struct net_device_stats *s2io_get_stats(struct net_device *dev) { nic_t *sp = dev->priv; mac_info_t *mac_control; struct config_param *config; + mac_control = &sp->mac_control; config = &sp->config; - sp->stats.tx_errors = mac_control->stats_info->tmac_any_err_frms; - sp->stats.rx_errors = mac_control->stats_info->rmac_drop_frms; - sp->stats.multicast = mac_control->stats_info->rmac_vld_mcst_frms; + sp->stats.tx_errors = + le32_to_cpu(mac_control->stats_info->tmac_any_err_frms); + sp->stats.rx_errors = + le32_to_cpu(mac_control->stats_info->rmac_drop_frms); + sp->stats.multicast = + le32_to_cpu(mac_control->stats_info->rmac_vld_mcst_frms); sp->stats.rx_length_errors = - mac_control->stats_info->rmac_long_frms; + le32_to_cpu(mac_control->stats_info->rmac_long_frms); return (&sp->stats); } @@ -2909,8 +2758,8 @@ static struct net_device_stats *s2io_get * s2io_set_multicast - entry point for multicast address enable/disable. * @dev : pointer to the device structure * Description: - * This function is a driver entry point which gets called by the kernel - * whenever multicast addresses must be enabled/disabled. This also gets + * This function is a driver entry point which gets called by the kernel + * whenever multicast addresses must be enabled/disabled. This also gets * called to set/reset promiscuous mode. Depending on the deivce flag, we * determine, if multicast address must be enabled or if promiscuous mode * is to be disabled etc. 
@@ -3010,7 +2859,7 @@ static void s2io_set_multicast(struct ne writeq(RMAC_ADDR_DATA0_MEM_ADDR(dis_addr), &bar0->rmac_addr_data0_mem); writeq(RMAC_ADDR_DATA1_MEM_MASK(0ULL), - &bar0->rmac_addr_data1_mem); + &bar0->rmac_addr_data1_mem); val64 = RMAC_ADDR_CMD_MEM_WE | RMAC_ADDR_CMD_MEM_STROBE_NEW_CMD | RMAC_ADDR_CMD_MEM_OFFSET @@ -3039,8 +2888,7 @@ static void s2io_set_multicast(struct ne writeq(RMAC_ADDR_DATA0_MEM_ADDR(mac_addr), &bar0->rmac_addr_data0_mem); writeq(RMAC_ADDR_DATA1_MEM_MASK(0ULL), - &bar0->rmac_addr_data1_mem); - + &bar0->rmac_addr_data1_mem); val64 = RMAC_ADDR_CMD_MEM_WE | RMAC_ADDR_CMD_MEM_STROBE_NEW_CMD | RMAC_ADDR_CMD_MEM_OFFSET @@ -3059,12 +2907,12 @@ static void s2io_set_multicast(struct ne } /** - * s2io_set_mac_addr - Programs the Xframe mac address + * s2io_set_mac_addr - Programs the Xframe mac address * @dev : pointer to the device structure. * @addr: a uchar pointer to the new mac address which is to be set. - * Description : This procedure will program the Xframe to receive + * Description : This procedure will program the Xframe to receive * frames with new Mac Address - * Return value: SUCCESS on success and an appropriate (-)ve integer + * Return value: SUCCESS on success and an appropriate (-)ve integer * as defined in errno.h file on failure. */ @@ -3075,10 +2923,10 @@ int s2io_set_mac_addr(struct net_device register u64 val64, mac_addr = 0; int i; - /* + /* * Set the new MAC address as the new unicast filter and reflect this * change on the device address registered with the OS. It will be - * at offset 0. + * at offset 0. */ for (i = 0; i < ETH_ALEN; i++) { mac_addr <<= 8; @@ -3102,12 +2950,12 @@ int s2io_set_mac_addr(struct net_device } /** - * s2io_ethtool_sset - Sets different link parameters. + * s2io_ethtool_sset - Sets different link parameters. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. 
* @info: pointer to the structure with parameters given by ethtool to set * link information. * Description: - * The function sets different link parameters provided by the user onto + * The function sets different link parameters provided by the user onto * the NIC. * Return value: * 0 on success. @@ -3129,7 +2977,7 @@ static int s2io_ethtool_sset(struct net_ } /** - * s2io_ethtol_gset - Return link specific information. + * s2io_ethtol_gset - Return link specific information. * @sp : private member of the device structure, pointer to the * s2io_nic structure. * @info : pointer to the structure with parameters given by ethtool @@ -3161,8 +3009,8 @@ static int s2io_ethtool_gset(struct net_ } /** - * s2io_ethtool_gdrvinfo - Returns driver specific information. - * @sp : private member of the device structure, which is a pointer to the + * s2io_ethtool_gdrvinfo - Returns driver specific information. + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @info : pointer to the structure with parameters given by ethtool to * return driver information. @@ -3190,9 +3038,9 @@ static void s2io_ethtool_gdrvinfo(struct /** * s2io_ethtool_gregs - dumps the entire space of Xfame into the buffer. - * @sp: private member of the device structure, which is a pointer to the + * @sp: private member of the device structure, which is a pointer to the * s2io_nic structure. - * @regs : pointer to the structure with parameters given by ethtool for + * @regs : pointer to the structure with parameters given by ethtool for * dumping the registers. * @reg_space: The input argumnet into which all the registers are dumped. * Description: @@ -3221,11 +3069,11 @@ static void s2io_ethtool_gregs(struct ne /** * s2io_phy_id - timer function that alternates adapter LED. 
- * @data : address of the private member of the device structure, which + * @data : address of the private member of the device structure, which * is a pointer to the s2io_nic structure, provided as an u32. - * Description: This is actually the timer function that alternates the - * adapter LED bit of the adapter control bit to set/reset every time on - * invocation. The timer is set for 1/2 a second, hence tha NIC blinks + * Description: This is actually the timer function that alternates the + * adapter LED bit of the adapter control bit to set/reset every time on + * invocation. The timer is set for 1/2 a second, hence tha NIC blinks * once every second. */ static void s2io_phy_id(unsigned long data) @@ -3253,12 +3101,12 @@ static void s2io_phy_id(unsigned long da * s2io_ethtool_idnic - To physically identify the nic on the system. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @id : pointer to the structure with identification parameters given by + * @id : pointer to the structure with identification parameters given by * ethtool. * Description: Used to physically identify the NIC on the system. - * The Link LED will blink for a time specified by the user for + * The Link LED will blink for a time specified by the user for * identification. - * NOTE: The Link has to be Up to be able to blink the LED. Hence + * NOTE: The Link has to be Up to be able to blink the LED. Hence * identification is possible only if it's link is up. 
* Return value: * int , returns 0 on success @@ -3288,9 +3136,9 @@ static int s2io_ethtool_idnic(struct net } mod_timer(&sp->id_timer, jiffies); if (data) - msleep(data * 1000); + msleep_interruptible(data * HZ); else - msleep(0xFFFFFFFF); + msleep_interruptible(MAX_FLICKER_TIME); del_timer_sync(&sp->id_timer); if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { @@ -3303,7 +3151,8 @@ static int s2io_ethtool_idnic(struct net /** * s2io_ethtool_getpause_data -Pause frame frame generation and reception. - * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. + * @sp : private member of the device structure, which is a pointer to the + * s2io_nic structure. * @ep : pointer to the structure with pause parameters given by ethtool. * Description: * Returns the Pause frame generation and reception capability of the NIC. @@ -3327,7 +3176,7 @@ static void s2io_ethtool_getpause_data(s /** * s2io_ethtool_setpause_data - set/reset pause frame generation. - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @ep : pointer to the structure with pause parameters given by ethtool. * Description: @@ -3338,7 +3187,7 @@ static void s2io_ethtool_getpause_data(s */ static int s2io_ethtool_setpause_data(struct net_device *dev, - struct ethtool_pauseparam *ep) + struct ethtool_pauseparam *ep) { u64 val64; nic_t *sp = dev->priv; @@ -3359,13 +3208,13 @@ static int s2io_ethtool_setpause_data(st /** * read_eeprom - reads 4 bytes of data from user given offset. - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @off : offset at which the data must be written * @data : Its an output parameter where the data read at the given - * offset is stored. + * offset is stored. 
* Description: - * Will read 4 bytes of data from the user given offset and return the + * Will read 4 bytes of data from the user given offset and return the * read data. * NOTE: Will allow to read only part of the EEPROM visible through the * I2C bus. @@ -3406,7 +3255,7 @@ static int read_eeprom(nic_t * sp, int o * s2io_nic structure. * @off : offset at which the data must be written * @data : The data that is to be written - * @cnt : Number of bytes of the data that are actually to be written into + * @cnt : Number of bytes of the data that are actually to be written into * the Eeprom. (max of 3) * Description: * Actually writes the relevant part of the data value into the Eeprom @@ -3443,7 +3292,7 @@ static int write_eeprom(nic_t * sp, int /** * s2io_ethtool_geeprom - reads the value stored in the Eeprom. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @eeprom : pointer to the user level structure provided by ethtool, + * @eeprom : pointer to the user level structure provided by ethtool, * containing all relevant information. * @data_buf : user defined value to be written into Eeprom. * Description: Reads the values stored in the Eeprom at given offset @@ -3454,7 +3303,7 @@ static int write_eeprom(nic_t * sp, int */ static int s2io_ethtool_geeprom(struct net_device *dev, - struct ethtool_eeprom *eeprom, u8 * data_buf) + struct ethtool_eeprom *eeprom, u8 * data_buf) { u32 data, i, valid; nic_t *sp = dev->priv; @@ -3479,7 +3328,7 @@ static int s2io_ethtool_geeprom(struct n * s2io_ethtool_seeprom - tries to write the user provided value in Eeprom * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @eeprom : pointer to the user level structure provided by ethtool, + * @eeprom : pointer to the user level structure provided by ethtool, * containing all relevant information. * @data_buf ; user defined value to be written into Eeprom. 
* Description: @@ -3527,8 +3376,8 @@ static int s2io_ethtool_seeprom(struct n } /** - * s2io_register_test - reads and writes into all clock domains. - * @sp : private member of the device structure, which is a pointer to the + * s2io_register_test - reads and writes into all clock domains. + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @data : variable that returns the result of each of the test conducted b * by the driver. @@ -3545,8 +3394,8 @@ static int s2io_register_test(nic_t * sp u64 val64 = 0; int fail = 0; - val64 = readq(&bar0->pcc_enable); - if (val64 != 0xff00000000000000ULL) { + val64 = readq(&bar0->pif_rd_swapper_fb); + if (val64 != 0x123456789abcdefULL) { fail = 1; DBG_PRINT(INFO_DBG, "Read Test level 1 fails\n"); } @@ -3590,13 +3439,13 @@ static int s2io_register_test(nic_t * sp } /** - * s2io_eeprom_test - to verify that EEprom in the xena can be programmed. + * s2io_eeprom_test - to verify that EEprom in the xena can be programmed. * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * @data:variable that returns the result of each of the test conducted by * the driver. * Description: - * Verify that EEPROM in the xena can be programmed using I2C_CONTROL + * Verify that EEPROM in the xena can be programmed using I2C_CONTROL * register. * Return value: * 0 on success. @@ -3661,14 +3510,14 @@ static int s2io_eeprom_test(nic_t * sp, /** * s2io_bist_test - invokes the MemBist test of the card . - * @sp : private member of the device structure, which is a pointer to the + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. - * @data:variable that returns the result of each of the test conducted by + * @data:variable that returns the result of each of the test conducted by * the driver. * Description: * This invokes the MemBist test of the card. We give around * 2 secs time for the Test to complete. 
If it's still not complete - * within this peiod, we consider that the test failed. + * within this peiod, we consider that the test failed. * Return value: * 0 on success and -1 on failure. */ @@ -3697,13 +3546,13 @@ static int s2io_bist_test(nic_t * sp, ui } /** - * s2io-link_test - verifies the link state of the nic - * @sp ; private member of the device structure, which is a pointer to the + * s2io-link_test - verifies the link state of the nic + * @sp ; private member of the device structure, which is a pointer to the * s2io_nic structure. * @data: variable that returns the result of each of the test conducted by * the driver. * Description: - * The function verifies the link state of the NIC and updates the input + * The function verifies the link state of the NIC and updates the input * argument 'data' appropriately. * Return value: * 0 on success. @@ -3722,13 +3571,13 @@ static int s2io_link_test(nic_t * sp, ui } /** - * s2io_rldram_test - offline test for access to the RldRam chip on the NIC - * @sp - private member of the device structure, which is a pointer to the + * s2io_rldram_test - offline test for access to the RldRam chip on the NIC + * @sp - private member of the device structure, which is a pointer to the * s2io_nic structure. - * @data - variable that returns the result of each of the test + * @data - variable that returns the result of each of the test * conducted by the driver. * Description: - * This is one of the offline test that tests the read and write + * This is one of the offline test that tests the read and write * access to the RldRam chip on the NIC. * Return value: * 0 on success. @@ -3833,7 +3682,7 @@ static int s2io_rldram_test(nic_t * sp, * s2io_nic structure. * @ethtest : pointer to a ethtool command specific structure that will be * returned to the user. - * @data : variable that returns the result of each of the test + * @data : variable that returns the result of each of the test * conducted by the driver. 
* Description: * This function conducts 6 tests ( 4 offline and 2 online) to determine @@ -3851,23 +3700,18 @@ static void s2io_ethtool_test(struct net if (ethtest->flags == ETH_TEST_FL_OFFLINE) { /* Offline Tests. */ - if (orig_state) { + if (orig_state) s2io_close(sp->dev); - s2io_set_swapper(sp); - } else - s2io_set_swapper(sp); if (s2io_register_test(sp, &data[0])) ethtest->flags |= ETH_TEST_FL_FAILED; s2io_reset(sp); - s2io_set_swapper(sp); if (s2io_rldram_test(sp, &data[3])) ethtest->flags |= ETH_TEST_FL_FAILED; s2io_reset(sp); - s2io_set_swapper(sp); if (s2io_eeprom_test(sp, &data[1])) ethtest->flags |= ETH_TEST_FL_FAILED; @@ -3951,20 +3795,19 @@ static void s2io_get_ethtool_stats(struc tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_tcp); } -static int s2io_ethtool_get_regs_len(struct net_device *dev) +int s2io_ethtool_get_regs_len(struct net_device *dev) { return (XENA_REG_SPACE); } -static u32 s2io_ethtool_get_rx_csum(struct net_device * dev) +u32 s2io_ethtool_get_rx_csum(struct net_device * dev) { nic_t *sp = dev->priv; return (sp->rx_csum); } - -static int s2io_ethtool_set_rx_csum(struct net_device *dev, u32 data) +int s2io_ethtool_set_rx_csum(struct net_device *dev, u32 data) { nic_t *sp = dev->priv; @@ -3975,19 +3818,17 @@ static int s2io_ethtool_set_rx_csum(stru return 0; } - -static int s2io_get_eeprom_len(struct net_device *dev) +int s2io_get_eeprom_len(struct net_device *dev) { return (XENA_EEPROM_SPACE); } -static int s2io_ethtool_self_test_count(struct net_device *dev) +int s2io_ethtool_self_test_count(struct net_device *dev) { return (S2IO_TEST_LEN); } - -static void s2io_ethtool_get_strings(struct net_device *dev, - u32 stringset, u8 * data) +void s2io_ethtool_get_strings(struct net_device *dev, + u32 stringset, u8 * data) { switch (stringset) { case ETH_SS_TEST: @@ -3998,13 +3839,12 @@ static void s2io_ethtool_get_strings(str sizeof(ethtool_stats_keys)); } } - static int s2io_ethtool_get_stats_count(struct net_device *dev) { return 
(S2IO_STAT_LEN); } -static int s2io_ethtool_op_set_tx_csum(struct net_device *dev, u32 data) +int s2io_ethtool_op_set_tx_csum(struct net_device *dev, u32 data) { if (data) dev->features |= NETIF_F_IP_CSUM; @@ -4046,21 +3886,18 @@ static struct ethtool_ops netdev_ethtool }; /** - * s2io_ioctl - Entry point for the Ioctl + * s2io_ioctl - Entry point for the Ioctl * @dev : Device pointer. * @ifr : An IOCTL specefic structure, that can contain a pointer to * a proprietary structure used to pass information to the driver. * @cmd : This is used to distinguish between the different commands that * can be passed to the IOCTL functions. * Description: - * This function has support for ethtool, adding multiple MAC addresses on - * the NIC and some DBG commands for the util tool. - * Return value: - * Currently the IOCTL supports no operations, hence by default this - * function returns OP NOT SUPPORTED value. + * Currently there are no special functionality supported in IOCTL, hence + * function always return EOPNOTSUPPORTED */ -static int s2io_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) +int s2io_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) { return -EOPNOTSUPP; } @@ -4076,7 +3913,7 @@ static int s2io_ioctl(struct net_device * file on failure. */ -static int s2io_change_mtu(struct net_device *dev, int new_mtu) +int s2io_change_mtu(struct net_device *dev, int new_mtu) { nic_t *sp = dev->priv; XENA_dev_config_t __iomem *bar0 = sp->bar0; @@ -4084,7 +3921,7 @@ static int s2io_change_mtu(struct net_de if (netif_running(dev)) { DBG_PRINT(ERR_DBG, "%s: Must be stopped to ", dev->name); - DBG_PRINT(ERR_DBG, "change its MTU \n"); + DBG_PRINT(ERR_DBG, "change its MTU\n"); return -EBUSY; } @@ -4108,9 +3945,9 @@ static int s2io_change_mtu(struct net_de * @dev_adr : address of the device structure in dma_addr_t format. * Description: * This is the tasklet or the bottom half of the ISR. 
This is - * an extension of the ISR which is scheduled by the scheduler to be run + * an extension of the ISR which is scheduled by the scheduler to be run * when the load on the CPU is low. All low priority tasks of the ISR can - * be pushed into the tasklet. For now the tasklet is used only to + * be pushed into the tasklet. For now the tasklet is used only to * replenish the Rx buffers in the Rx buffer descriptors. * Return value: * void. @@ -4166,14 +4003,14 @@ static void s2io_set_link(unsigned long } subid = nic->pdev->subsystem_device; - /* - * Allow a small delay for the NICs self initiated + /* + * Allow a small delay for the NICs self initiated * cleanup to complete. */ msleep(100); val64 = readq(&bar0->adapter_status); - if (verify_xena_quiescence(val64, nic->device_enabled_once)) { + if (verify_xena_quiescence(nic, val64, nic->device_enabled_once)) { if (LINK_IS_UP(val64)) { val64 = readq(&bar0->adapter_control); val64 |= ADAPTER_CNTL_EN; @@ -4224,8 +4061,9 @@ static void s2io_card_down(nic_t * sp) register u64 val64 = 0; /* If s2io_set_link task is executing, wait till it completes. */ - while (test_and_set_bit(0, &(sp->link_state))) + while (test_and_set_bit(0, &(sp->link_state))) { msleep(50); + } atomic_set(&sp->card_state, CARD_DOWN); /* disable Tx and Rx traffic on the NIC */ @@ -4237,7 +4075,7 @@ static void s2io_card_down(nic_t * sp) /* Check if the device is Quiescent and then Reset the NIC */ do { val64 = readq(&bar0->adapter_status); - if (verify_xena_quiescence(val64, sp->device_enabled_once)) { + if (verify_xena_quiescence(sp, val64, sp->device_enabled_once)) { break; } @@ -4276,8 +4114,8 @@ static int s2io_card_up(nic_t * sp) return -ENODEV; } - /* - * Initializing the Rx buffers. For now we are considering only 1 + /* + * Initializing the Rx buffers. 
For now we are considering only 1 * Rx ring and initializing buffers into 30 Rx blocks */ mac_control = &sp->mac_control; @@ -4315,12 +4153,12 @@ static int s2io_card_up(nic_t * sp) return 0; } -/** +/** * s2io_restart_nic - Resets the NIC. * @data : long pointer to the device private structure * Description: * This function is scheduled to be run by the s2io_tx_watchdog - * function after 0.5 secs to reset the NIC. The idea is to reduce + * function after 0.5 secs to reset the NIC. The idea is to reduce * the run time of the watch dog routine which is run holding a * spin lock. */ @@ -4338,10 +4176,11 @@ static void s2io_restart_nic(unsigned lo netif_wake_queue(dev); DBG_PRINT(ERR_DBG, "%s: was reset by Tx watchdog timer\n", dev->name); + } -/** - * s2io_tx_watchdog - Watchdog for transmit side. +/** + * s2io_tx_watchdog - Watchdog for transmit side. * @dev : Pointer to net device structure * Description: * This function is triggered if the Tx Queue is stopped @@ -4369,7 +4208,7 @@ static void s2io_tx_watchdog(struct net_ * @len : length of the packet * @cksum : FCS checksum of the frame. * @ring_no : the ring from which this RxD was extracted. - * Description: + * Description: * This function is called by the Tx interrupt serivce routine to perform * some OS related operations on the SKB before passing it to the upper * layers. It mainly checks if the checksum is OK, if so adds it to the @@ -4379,35 +4218,63 @@ static void s2io_tx_watchdog(struct net_ * Return value: * SUCCESS on success and -1 on failure. 
*/ -#ifndef CONFIG_2BUFF_MODE -static int rx_osm_handler(nic_t * sp, u16 len, RxD_t * rxdp, int ring_no) -#else -static int rx_osm_handler(nic_t * sp, RxD_t * rxdp, int ring_no, - buffAdd_t * ba) -#endif +static int rx_osm_handler(ring_info_t *ring_data, RxD_t * rxdp) { + nic_t *sp = ring_data->nic; struct net_device *dev = (struct net_device *) sp->dev; - struct sk_buff *skb = - (struct sk_buff *) ((unsigned long) rxdp->Host_Control); + struct sk_buff *skb = (struct sk_buff *) + ((unsigned long) rxdp->Host_Control); + int ring_no = ring_data->ring_no; u16 l3_csum, l4_csum; #ifdef CONFIG_2BUFF_MODE - int buf0_len, buf2_len; + int buf0_len = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); + int buf2_len = RXD_GET_BUFFER2_SIZE(rxdp->Control_2); + int get_block = ring_data->rx_curr_get_info.block_index; + int get_off = ring_data->rx_curr_get_info.offset; + buffAdd_t *ba = &ring_data->ba[get_block][get_off]; unsigned char *buff; +#else + u16 len = (u16) ((RXD_GET_BUFFER0_SIZE(rxdp->Control_2)) >> 48);; #endif + skb->dev = dev; + if (rxdp->Control_1 & RXD_T_CODE) { + unsigned long long err = rxdp->Control_1 & RXD_T_CODE; + DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%llx\n", + dev->name, err); + } - l3_csum = RXD_GET_L3_CKSUM(rxdp->Control_1); - if ((rxdp->Control_1 & TCP_OR_UDP_FRAME) && (sp->rx_csum)) { + /* Updating statistics */ + rxdp->Host_Control = 0; + sp->rx_pkt_count++; + sp->stats.rx_packets++; +#ifndef CONFIG_2BUFF_MODE + sp->stats.rx_bytes += len; +#else + sp->stats.rx_bytes += buf0_len + buf2_len; +#endif + +#ifndef CONFIG_2BUFF_MODE + skb_put(skb, len); +#else + buff = skb_push(skb, buf0_len); + memcpy(buff, ba->ba_0, buf0_len); + skb_put(skb, buf2_len); +#endif + + if ((rxdp->Control_1 & TCP_OR_UDP_FRAME) && + (sp->rx_csum)) { + l3_csum = RXD_GET_L3_CKSUM(rxdp->Control_1); l4_csum = RXD_GET_L4_CKSUM(rxdp->Control_1); if ((l3_csum == L3_CKSUM_OK) && (l4_csum == L4_CKSUM_OK)) { - /* + /* * NIC verifies if the Checksum of the received * frame is Ok or not and 
accordingly returns * a flag in the RxD. */ skb->ip_summed = CHECKSUM_UNNECESSARY; } else { - /* - * Packet with erroneous checksum, let the + /* + * Packet with erroneous checksum, let the * upper layers deal with it. */ skb->ip_summed = CHECKSUM_NONE; @@ -4416,44 +4283,14 @@ static int rx_osm_handler(nic_t * sp, Rx skb->ip_summed = CHECKSUM_NONE; } - if (rxdp->Control_1 & RXD_T_CODE) { - unsigned long long err = rxdp->Control_1 & RXD_T_CODE; - DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%llx\n", - dev->name, err); - } -#ifdef CONFIG_2BUFF_MODE - buf0_len = RXD_GET_BUFFER0_SIZE(rxdp->Control_2); - buf2_len = RXD_GET_BUFFER2_SIZE(rxdp->Control_2); -#endif - - skb->dev = dev; -#ifndef CONFIG_2BUFF_MODE - skb_put(skb, len); - skb->protocol = eth_type_trans(skb, dev); -#else - buff = skb_push(skb, buf0_len); - memcpy(buff, ba->ba_0, buf0_len); - skb_put(skb, buf2_len); skb->protocol = eth_type_trans(skb, dev); -#endif - #ifdef CONFIG_S2IO_NAPI netif_receive_skb(skb); #else netif_rx(skb); #endif - dev->last_rx = jiffies; - sp->rx_pkt_count++; - sp->stats.rx_packets++; -#ifndef CONFIG_2BUFF_MODE - sp->stats.rx_bytes += len; -#else - sp->stats.rx_bytes += buf0_len + buf2_len; -#endif - atomic_dec(&sp->rx_bufs_left[ring_no]); - rxdp->Host_Control = 0; return SUCCESS; } @@ -4464,13 +4301,13 @@ static int rx_osm_handler(nic_t * sp, Rx * @link : inidicates whether link is UP/DOWN. * Description: * This function stops/starts the Tx queue depending on whether the link - * status of the NIC is is down or up. This is called by the Alarm - * interrupt handler whenever a link change interrupt comes up. + * status of the NIC is is down or up. This is called by the Alarm + * interrupt handler whenever a link change interrupt comes up. * Return value: * void. 
*/ -static void s2io_link(nic_t * sp, int link) +void s2io_link(nic_t * sp, int link) { struct net_device *dev = (struct net_device *) sp->dev; @@ -4487,8 +4324,25 @@ static void s2io_link(nic_t * sp, int li } /** - * s2io_init_pci -Initialization of PCI and PCI-X configuration registers . - * @sp : private member of the device structure, which is a pointer to the + * get_xena_rev_id - to identify revision ID of xena. + * @pdev : PCI Dev structure + * Description: + * Function to identify the Revision ID of xena. + * Return value: + * returns the revision ID of the device. + */ + +int get_xena_rev_id(struct pci_dev *pdev) +{ + u8 id = 0; + int ret; + ret = pci_read_config_byte(pdev, PCI_REVISION_ID, (u8 *) & id); + return id; +} + +/** + * s2io_init_pci -Initialization of PCI and PCI-X configuration registers . + * @sp : private member of the device structure, which is a pointer to the * s2io_nic structure. * Description: * This function initializes a few of the PCI and PCI-X configuration registers @@ -4499,15 +4353,15 @@ static void s2io_link(nic_t * sp, int li static void s2io_init_pci(nic_t * sp) { - u16 pci_cmd = 0; + u16 pci_cmd = 0, pcix_cmd = 0; /* Enable Data Parity Error Recovery in PCI-X command register. */ pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - (sp->pcix_cmd | 1)); + (pcix_cmd | 1)); pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); /* Set the PErr Response bit in PCI command register. */ pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); @@ -4516,34 +4370,36 @@ static void s2io_init_pci(nic_t * sp) pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); /* Set MMRB count to 1024 in PCI-X Command register. 
*/ - sp->pcix_cmd &= 0xFFF3; - pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, (sp->pcix_cmd | (0x1 << 2))); /* MMRBC 1K */ + pcix_cmd &= 0xFFF3; + pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, + (pcix_cmd | (0x1 << 2))); /* MMRBC 1K */ pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); /* Setting Maximum outstanding splits based on system type. */ - sp->pcix_cmd &= 0xFF8F; - - sp->pcix_cmd |= XENA_MAX_OUTSTANDING_SPLITS(0x1); /* 2 splits. */ + pcix_cmd &= 0xFF8F; + pcix_cmd |= XENA_MAX_OUTSTANDING_SPLITS(0x1); /* 2 splits. */ pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - sp->pcix_cmd); + pcix_cmd); pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); + /* Forcibly disabling relaxed ordering capability of the card. */ - sp->pcix_cmd &= 0xfffd; + pcix_cmd &= 0xfffd; pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - sp->pcix_cmd); + pcix_cmd); pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, - &(sp->pcix_cmd)); + &(pcix_cmd)); } MODULE_AUTHOR("Raghavendra Koushik "); MODULE_LICENSE("GPL"); module_param(tx_fifo_num, int, 0); -module_param_array(tx_fifo_len, int, NULL, 0); module_param(rx_ring_num, int, 0); -module_param_array(rx_ring_sz, int, NULL, 0); +module_param_array(tx_fifo_len, uint, NULL, 0); +module_param_array(rx_ring_sz, uint, NULL, 0); module_param(Stats_refresh_time, int, 0); +module_param_array(rts_frm_len, uint, NULL, 0); module_param(rmac_pause_time, int, 0); module_param(mc_pause_threshold_q0q3, int, 0); module_param(mc_pause_threshold_q4q7, int, 0); @@ -4553,15 +4409,16 @@ module_param(rmac_util_period, int, 0); #ifndef CONFIG_S2IO_NAPI module_param(indicate_max_pkts, int, 0); #endif + /** - * s2io_init_nic - Initialization of the adapter . + * s2io_init_nic - Initialization of the adapter . * @pdev : structure containing the PCI related information of the device. * @pre: List of PCI devices supported by the driver listed in s2io_tbl. 
* Description: * The function initializes an adapter identified by the pci_dec structure. - * All OS related initialization including memory and device structure and - * initlaization of the device private variable is done. Also the swapper - * control register is initialized to enable read and write into the I/O + * All OS related initialization including memory and device structure and + * initlaization of the device private variable is done. Also the swapper + * control register is initialized to enable read and write into the I/O * registers of the device. * Return value: * returns 0 on success and negative on failure. @@ -4572,7 +4429,6 @@ s2io_init_nic(struct pci_dev *pdev, cons { nic_t *sp; struct net_device *dev; - char *dev_name = "S2IO 10GE NIC"; int i, j, ret; int dma_flag = FALSE; u32 mac_up, mac_down; @@ -4582,9 +4438,9 @@ s2io_init_nic(struct pci_dev *pdev, cons mac_info_t *mac_control; struct config_param *config; - - DBG_PRINT(ERR_DBG, "Loading S2IO driver with %s\n", - s2io_driver_version); +#ifdef CONFIG_S2IO_NAPI + DBG_PRINT(ERR_DBG, "NAPI support has been enabled\n"); +#endif if ((ret = pci_enable_device(pdev))) { DBG_PRINT(ERR_DBG, @@ -4595,7 +4451,6 @@ s2io_init_nic(struct pci_dev *pdev, cons if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) { DBG_PRINT(INIT_DBG, "s2io_init_nic: Using 64bit DMA\n"); dma_flag = TRUE; - if (pci_set_consistent_dma_mask (pdev, DMA_64BIT_MASK)) { DBG_PRINT(ERR_DBG, @@ -4635,21 +4490,17 @@ s2io_init_nic(struct pci_dev *pdev, cons memset(sp, 0, sizeof(nic_t)); sp->dev = dev; sp->pdev = pdev; - sp->vendor_id = pdev->vendor; - sp->device_id = pdev->device; sp->high_dma_flag = dma_flag; - sp->irq = pdev->irq; sp->device_enabled_once = FALSE; - strcpy(sp->name, dev_name); /* Initialize some PCI/PCI-X fields of the NIC. */ s2io_init_pci(sp); - /* + /* * Setting the device configuration parameters. - * Most of these parameters can be specified by the user during - * module insertion as they are module loadable parameters. 
If - * these parameters are not not specified during load time, they + * Most of these parameters can be specified by the user during + * module insertion as they are module loadable parameters. If + * these parameters are not not specified during load time, they * are initialized with default values. */ mac_control = &sp->mac_control; @@ -4663,6 +4514,10 @@ s2io_init_nic(struct pci_dev *pdev, cons config->tx_cfg[i].fifo_priority = i; } + /* mapping the QoS priority to the configured fifos */ + for (i = 0; i < MAX_TX_FIFOS; i++) + config->fifo_mapping[i] = fifo_map[config->tx_fifo_num][i]; + config->tx_intr_type = TXD_INT_TYPE_UTILZ; for (i = 0; i < config->tx_fifo_num; i++) { config->tx_cfg[i].f_no_snoop = @@ -4743,13 +4598,14 @@ s2io_init_nic(struct pci_dev *pdev, cons dev->do_ioctl = &s2io_ioctl; dev->change_mtu = &s2io_change_mtu; SET_ETHTOOL_OPS(dev, &netdev_ethtool_ops); + /* * will use eth_mac_addr() for dev->set_mac_address * mac address will be set every time dev->open() is called */ -#ifdef CONFIG_S2IO_NAPI +#if defined(CONFIG_S2IO_NAPI) dev->poll = s2io_poll; - dev->weight = 90; + dev->weight = 32; #endif dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM; @@ -4776,22 +4632,14 @@ s2io_init_nic(struct pci_dev *pdev, cons goto set_swap_failed; } - /* Fix for all "FFs" MAC address problems observed on Alpha platforms */ + /* + * Fix for all "FFs" MAC address problems observed on + * Alpha platforms + */ fix_mac_address(sp); s2io_reset(sp); /* - * Setting swapper control on the NIC, so the MAC address can be read. - */ - if (s2io_set_swapper(sp)) { - DBG_PRINT(ERR_DBG, - "%s: S2IO: swapper settings are wrong\n", - dev->name); - ret = -EAGAIN; - goto set_swap_failed; - } - - /* * MAC address initialization. * For now only one mac address will be read and used. 
*/ @@ -4828,23 +4676,22 @@ s2io_init_nic(struct pci_dev *pdev, cons memcpy(dev->dev_addr, sp->def_mac_addr, ETH_ALEN); /* - * Initialize the tasklet status and link state flags + * Initialize the tasklet status and link state flags * and the card statte parameter */ atomic_set(&(sp->card_state), 0); sp->tasklet_status = 0; sp->link_state = 0; - /* Initialize spinlocks */ spin_lock_init(&sp->tx_lock); #ifndef CONFIG_S2IO_NAPI spin_lock_init(&sp->put_lock); #endif - /* - * SXE-002: Configure link and activity LED to init state - * on driver load. + /* + * SXE-002: Configure link and activity LED to init state + * on driver load. */ subid = sp->pdev->subsystem_device; if ((subid & 0xFF) >= 0x07) { @@ -4864,9 +4711,9 @@ s2io_init_nic(struct pci_dev *pdev, cons goto register_failed; } - /* - * Make Link state as off at this point, when the Link change - * interrupt comes the state will be automatically changed to + /* + * Make Link state as off at this point, when the Link change + * interrupt comes the state will be automatically changed to * the right state. */ netif_carrier_off(dev); @@ -4891,11 +4738,11 @@ s2io_init_nic(struct pci_dev *pdev, cons } /** - * s2io_rem_nic - Free the PCI device + * s2io_rem_nic - Free the PCI device * @pdev: structure containing the PCI related information of the device. - * Description: This function is called by the Pci subsystem to release a + * Description: This function is called by the Pci subsystem to release a * PCI device and free up all resource held up by the device. This could - * be in response to a Hot plug event or when the driver is to be removed + * be in response to a Hot plug event or when the driver is to be removed * from memory. 
*/ @@ -4919,7 +4766,6 @@ static void __devexit s2io_rem_nic(struc pci_disable_device(pdev); pci_release_regions(pdev); pci_set_drvdata(pdev, NULL); - free_netdev(dev); } @@ -4935,11 +4781,11 @@ int __init s2io_starter(void) } /** - * s2io_closer - Cleanup routine for the driver + * s2io_closer - Cleanup routine for the driver * Description: This function is the cleanup routine for the driver. It unregist * ers the driver. */ -static void s2io_closer(void) +void s2io_closer(void) { pci_unregister_driver(&s2io_driver); DBG_PRINT(INIT_DBG, "cleanup done\n"); diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-01 15:51:42.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 02:00:48.000000000 -0700 @@ -31,6 +31,9 @@ #define SUCCESS 0 #define FAILURE -1 +/* Maximum time to flicker LED when asked to identify NIC using ethtool */ +#define MAX_FLICKER_TIME 60000 /* 60 Secs */ + /* Maximum outstanding splits to be configured into xena. */ typedef enum xena_max_outstanding_splits { XENA_ONE_SPLIT_TRANSACTION = 0, @@ -45,10 +48,10 @@ typedef enum xena_max_outstanding_splits #define XENA_MAX_OUTSTANDING_SPLITS(n) (n << 4) /* OS concerned variables and constants */ -#define WATCH_DOG_TIMEOUT 5*HZ -#define EFILL 0x1234 -#define ALIGN_SIZE 127 -#define PCIX_COMMAND_REGISTER 0x62 +#define WATCH_DOG_TIMEOUT 15*HZ +#define EFILL 0x1234 +#define ALIGN_SIZE 127 +#define PCIX_COMMAND_REGISTER 0x62 /* * Debug related variables. @@ -61,7 +64,7 @@ typedef enum xena_max_outstanding_splits #define INTR_DBG 4 /* Global variable that defines the present debug level of the driver. */ -static int debug_level = ERR_DBG; /* Default level. */ +int debug_level = ERR_DBG; /* Default level. */ /* DEBUG message print. */ #define DBG_PRINT(dbg_level, args...) 
if(!(debug_level> 48) @@ -382,7 +408,7 @@ typedef struct _RxD_t { #endif } RxD_t; -/* Structure that represents the Rx descriptor block which contains +/* Structure that represents the Rx descriptor block which contains * 128 Rx descriptors. */ #ifndef CONFIG_2BUFF_MODE @@ -392,11 +418,11 @@ typedef struct _RxD_block { u64 reserved_0; #define END_OF_BLOCK 0xFEFFFFFFFFFFFFFFULL - u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last + u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last * Rxd in this blk */ u64 reserved_2_pNext_RxD_block; /* Logical ptr to next */ u64 pNext_RxD_Blk_physical; /* Buff0_ptr.In a 32 bit arch - * the upper 32 bits should + * the upper 32 bits should * be 0 */ } RxD_block_t; #else @@ -405,13 +431,13 @@ typedef struct _RxD_block { RxD_t rxd[MAX_RXDS_PER_BLOCK]; #define END_OF_BLOCK 0xFEFFFFFFFFFFFFFFULL - u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last Rxd + u64 reserved_1; /* 0xFEFFFFFFFFFFFFFF to mark last Rxd * in this blk */ u64 pNext_RxD_Blk_physical; /* Phy ponter to next blk. */ } RxD_block_t; #define SIZE_OF_BLOCK 4096 -/* Structure to hold virtual addresses of Buf0 and Buf1 in +/* Structure to hold virtual addresses of Buf0 and Buf1 in * 2buf mode. */ typedef struct bufAdd { void *ba_0_org; @@ -423,8 +449,8 @@ typedef struct bufAdd { /* Structure which stores all the MAC control parameters */ -/* This structure stores the offset of the RxD in the ring - * from which the Rx Interrupt processor can start picking +/* This structure stores the offset of the RxD in the ring + * from which the Rx Interrupt processor can start picking * up the RxDs for processing. */ typedef struct _rx_curr_get_info_t { @@ -436,7 +462,7 @@ typedef struct _rx_curr_get_info_t { typedef rx_curr_get_info_t rx_curr_put_info_t; /* This structure stores the offset of the TxDl in the FIFO - * from which the Tx Interrupt processor can start picking + * from which the Tx Interrupt processor can start picking * up the TxDLs for send complete interrupt processing. 
*/ typedef struct { @@ -446,32 +472,96 @@ typedef struct { typedef tx_curr_get_info_t tx_curr_put_info_t; -/* Infomation related to the Tx and Rx FIFOs and Rings of Xena - * is maintained in this structure. - */ -typedef struct mac_info { -/* rx side stuff */ - /* Put pointer info which indictes which RxD has to be replenished +/* Structure that holds the Phy and virt addresses of the Blocks */ +typedef struct rx_block_info { + RxD_t *block_virt_addr; + dma_addr_t block_dma_addr; +} rx_block_info_t; + +/* pre declaration of the nic structure */ +typedef struct s2io_nic nic_t; + +/* Ring specific structure */ +typedef struct ring_info { + /* The ring number */ + int ring_no; + + /* + * Place holders for the virtual and physical addresses of + * all the Rx Blocks + */ + rx_block_info_t rx_blocks[MAX_RX_BLOCKS_PER_RING]; + int block_count; + int pkt_cnt; + + /* + * Put pointer info which indictes which RxD has to be replenished * with a new buffer. */ - rx_curr_put_info_t rx_curr_put_info[MAX_RX_RINGS]; + rx_curr_put_info_t rx_curr_put_info; - /* Get pointer info which indictes which is the last RxD that was + /* + * Get pointer info which indictes which is the last RxD that was * processed by the driver. */ - rx_curr_get_info_t rx_curr_get_info[MAX_RX_RINGS]; + rx_curr_get_info_t rx_curr_get_info; - u16 rmac_pause_time; - u16 mc_pause_threshold_q0q3; - u16 mc_pause_threshold_q4q7; +#ifndef CONFIG_S2IO_NAPI + /* Index to the absolute position of the put pointer of Rx ring */ + int put_pos; +#endif + +#ifdef CONFIG_2BUFF_MODE + /* Buffer Address store. */ + buffAdd_t **ba; +#endif + nic_t *nic; +} ring_info_t; +/* Fifo specific structure */ +typedef struct fifo_info { + /* FIFO number */ + int fifo_no; + + /* Maximum TxDs per TxDL */ + int max_txds; + + /* Place holder of all the TX List's Phy and Virt addresses. 
*/ + list_info_hold_t *list_info; + + /* + * Current offset within the tx FIFO where driver would write + * new Tx frame + */ + tx_curr_put_info_t tx_curr_put_info; + + /* + * Current offset within tx FIFO from where the driver would start freeing + * the buffers + */ + tx_curr_get_info_t tx_curr_get_info; + + nic_t *nic; +}fifo_info_t; + +/* Infomation related to the Tx and Rx FIFOs and Rings of Xena + * is maintained in this structure. + */ +typedef struct mac_info { /* tx side stuff */ /* logical pointer of start of each Tx FIFO */ TxFIFO_element_t __iomem *tx_FIFO_start[MAX_TX_FIFOS]; -/* Current offset within tx_FIFO_start, where driver would write new Tx frame*/ - tx_curr_put_info_t tx_curr_put_info[MAX_TX_FIFOS]; - tx_curr_get_info_t tx_curr_get_info[MAX_TX_FIFOS]; + /* Fifo specific structure */ + fifo_info_t fifos[MAX_TX_FIFOS]; + +/* rx side stuff */ + /* Ring specific structure */ + ring_info_t rings[MAX_RX_RINGS]; + + u16 rmac_pause_time; + u16 mc_pause_threshold_q0q3; + u16 mc_pause_threshold_q4q7; void *stats_mem; /* orignal pointer to allocated mem */ dma_addr_t stats_mem_phy; /* Physical address of the stat block */ @@ -485,12 +575,6 @@ typedef struct { int usage_cnt; } usr_addr_t; -/* Structure that holds the Phy and virt addresses of the Blocks */ -typedef struct rx_block_info { - RxD_t *block_virt_addr; - dma_addr_t block_dma_addr; -} rx_block_info_t; - /* Default Tunable parameters of the NIC. */ #define DEFAULT_FIFO_LEN 4096 #define SMALL_RXD_CNT 30 * (MAX_RXDS_PER_BLOCK+1) @@ -499,7 +583,20 @@ typedef struct rx_block_info { #define LARGE_BLK_CNT 100 /* Structure representing one instance of the NIC */ -typedef struct s2io_nic { +struct s2io_nic { +#ifdef CONFIG_S2IO_NAPI + /* + * Count of packets to be processed in a given iteration, it will be indicated + * by the quota field of the device structure when NAPI is enabled. 
+ */ + int pkts_to_process; +#endif + struct net_device *dev; + mac_info_t mac_control; + struct config_param config; + struct pci_dev *pdev; + void __iomem *bar0; + void __iomem *bar1; #define MAX_MAC_SUPPORTED 16 #define MAX_SUPPORTED_MULTICASTS MAX_MAC_SUPPORTED @@ -507,33 +604,17 @@ typedef struct s2io_nic { macaddr_t pre_mac_addr[MAX_MAC_SUPPORTED]; struct net_device_stats stats; - void __iomem *bar0; - void __iomem *bar1; - struct config_param config; - mac_info_t mac_control; int high_dma_flag; int device_close_flag; int device_enabled_once; - char name[32]; + char name[50]; struct tasklet_struct task; volatile unsigned long tasklet_status; - struct timer_list timer; - struct net_device *dev; - struct pci_dev *pdev; - u16 vendor_id; - u16 device_id; - u16 ccmd; - u32 cbar0_1; - u32 cbar0_2; - u32 cbar1_1; - u32 cbar1_2; - u32 cirq; - u8 cache_line; - u32 rom_expansion; - u16 pcix_cmd; - u32 irq; + /* Space to back up the PCI config space */ + u32 config_space[256 / sizeof(u32)]; + atomic_t rx_bufs_left[MAX_RX_RINGS]; spinlock_t tx_lock; @@ -558,27 +639,11 @@ typedef struct s2io_nic { u16 tx_err_count; u16 rx_err_count; -#ifndef CONFIG_S2IO_NAPI - /* Index to the absolute position of the put pointer of Rx ring. */ - int put_pos[MAX_RX_RINGS]; -#endif - - /* - * Place holders for the virtual and physical addresses of - * all the Rx Blocks - */ - rx_block_info_t rx_blocks[MAX_RX_RINGS][MAX_RX_BLOCKS_PER_RING]; - int block_count[MAX_RX_RINGS]; - int pkt_cnt[MAX_RX_RINGS]; - - /* Place holder of all the TX List's Phy and Virt addresses. */ - list_info_hold_t *list_info[MAX_TX_FIFOS]; - /* Id timer, used to blink NIC to physically identify NIC. */ struct timer_list id_timer; /* Restart timer, used to restart NIC if the device is stuck and - * a schedule task that will set the correct Link state once the + * a schedule task that will set the correct Link state once the * NIC's PHY has stabilized after a state change. 
*/ #ifdef INIT_TQUEUE @@ -589,12 +654,12 @@ typedef struct s2io_nic { struct work_struct set_link_task; #endif - /* Flag that can be used to turn on or turn off the Rx checksum + /* Flag that can be used to turn on or turn off the Rx checksum * offload feature. */ int rx_csum; - /* after blink, the adapter must be restored with original + /* after blink, the adapter must be restored with original * values. */ u64 adapt_ctrl_org; @@ -604,16 +669,12 @@ typedef struct s2io_nic { #define LINK_DOWN 1 #define LINK_UP 2 -#ifdef CONFIG_2BUFF_MODE - /* Buffer Address store. */ - buffAdd_t **ba[MAX_RX_RINGS]; -#endif int task_flag; #define CARD_DOWN 1 #define CARD_UP 2 atomic_t card_state; volatile unsigned long link_state; -} nic_t; +}; #define RESET_ERROR 1; #define CMD_ERROR 2; @@ -622,9 +683,10 @@ typedef struct s2io_nic { #ifndef readq static inline u64 readq(void __iomem *addr) { - u64 ret = readl(addr + 4); - ret <<= 32; - ret |= readl(addr); + u64 ret = 0; + ret = readl(addr + 4); + (u64) ret <<= 32; + (u64) ret |= readl(addr); return ret; } @@ -637,10 +699,10 @@ static inline void writeq(u64 val, void writel((u32) (val >> 32), (addr + 4)); } -/* In 32 bit modes, some registers have to be written in a +/* In 32 bit modes, some registers have to be written in a * particular order to expect correct hardware operation. The - * macro SPECIAL_REG_WRITE is used to perform such ordered - * writes. Defines UF (Upper First) and LF (Lower First) will + * macro SPECIAL_REG_WRITE is used to perform such ordered + * writes. Defines UF (Upper First) and LF (Lower First) will * be used to specify the required write order. */ #define UF 1 @@ -716,6 +778,7 @@ static inline void SPECIAL_REG_WRITE(u64 #define PCC_FB_ECC_ERR vBIT(0xff, 16, 8) /* Interrupt to indicate PCC_FB_ECC Error. */ +#define RXD_GET_VLAN_TAG(Control_2) (u16)(Control_2 & MASK_VLAN_TAG) /* * Prototype declaration. 
*/ @@ -725,36 +788,29 @@ static void __devexit s2io_rem_nic(struc static int init_shared_mem(struct s2io_nic *sp); static void free_shared_mem(struct s2io_nic *sp); static int init_nic(struct s2io_nic *nic); -#ifndef CONFIG_S2IO_NAPI -static void rx_intr_handler(struct s2io_nic *sp); -#endif -static void tx_intr_handler(struct s2io_nic *sp); +static void rx_intr_handler(ring_info_t *ring_data); +static void tx_intr_handler(fifo_info_t *fifo_data); static void alarm_intr_handler(struct s2io_nic *sp); static int s2io_starter(void); -static void s2io_closer(void); +void s2io_closer(void); static void s2io_tx_watchdog(struct net_device *dev); static void s2io_tasklet(unsigned long dev_addr); static void s2io_set_multicast(struct net_device *dev); -#ifndef CONFIG_2BUFF_MODE -static int rx_osm_handler(nic_t * sp, u16 len, RxD_t * rxdp, int ring_no); -#else -static int rx_osm_handler(nic_t * sp, RxD_t * rxdp, int ring_no, - buffAdd_t * ba); -#endif -static void s2io_link(nic_t * sp, int link); -static void s2io_reset(nic_t * sp); -#ifdef CONFIG_S2IO_NAPI +static int rx_osm_handler(ring_info_t *ring_data, RxD_t * rxdp); +void s2io_link(nic_t * sp, int link); +void s2io_reset(nic_t * sp); +#if defined(CONFIG_S2IO_NAPI) static int s2io_poll(struct net_device *dev, int *budget); #endif static void s2io_init_pci(nic_t * sp); -static int s2io_set_mac_addr(struct net_device *dev, u8 * addr); +int s2io_set_mac_addr(struct net_device *dev, u8 * addr); static irqreturn_t s2io_isr(int irq, void *dev_id, struct pt_regs *regs); -static int verify_xena_quiescence(u64 val64, int flag); +static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag); static struct ethtool_ops netdev_ethtool_ops; static void s2io_set_link(unsigned long data); -static int s2io_set_swapper(nic_t * sp); -static void s2io_card_down(nic_t * nic); -static int s2io_card_up(nic_t * nic); - +int s2io_set_swapper(nic_t * sp); +static void s2io_card_down(nic_t *nic); +static int s2io_card_up(nic_t *nic); +int 
get_xena_rev_id(struct pci_dev *pdev); #endif /* _S2IO_H */
From raghavendra.koushik@neterion.com Wed Aug 3 12:45:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:45:44 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JjeH9027904 for ; Wed, 3 Aug 2005 12:45:40 -0700 Received: by linux.site (Postfix, from userid 0) id 5FB9798336; Wed, 3 Aug 2005 12:29:20 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 3/13] S2io: Software fixes Message-Id: <20050803192920.5FB9798336@linux.site> Date: Wed, 3 Aug 2005 12:29:20 -0700 (PDT) X-archive-position: 2837 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 9729 Lines: 325
Hi,
The patch below fixes a few purely software bugs identified since the last release.
1. Keep track of, and display as part of the ethtool command output, the number of single-bit and double-bit ECC errors.
2. Handle the race condition between the interrupt handler and the "interface down" routine.
3. Modify the initial link state setting so that the link state displayed after "interface up" is correct.
4. Fix "Incorrect Tx packet count when TSO is enabled".
5. Disable periodic DMA of statistics and schedule a one-shot DMA only when required.
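For item 2, the scheme in the diff appears to be: every interrupt path bumps a counter on entry and drops it on exit, and the "interface down" path polls that counter a bounded number of times before freeing buffers. Below is a minimal user-space sketch of that idea using C11 atomics. The names nic_sketch, handler_enter, handler_exit and wait_for_handlers are hypothetical and are not part of the driver, and the optional pause_fn callback merely stands in for the msleep(10) call in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-NIC state; only the counter matters here. */
struct nic_sketch {
	atomic_int isr_cnt;	/* number of handlers currently executing */
};

/* Called first thing in an interrupt or poll handler. */
static void handler_enter(struct nic_sketch *nic)
{
	atomic_fetch_add(&nic->isr_cnt, 1);
}

/* Called on every exit path of the handler. */
static void handler_exit(struct nic_sketch *nic)
{
	atomic_fetch_sub(&nic->isr_cnt, 1);
}

/*
 * Pause, then check the counter, at most max_tries times.
 * Returns true once no handler is running, false if the tries ran out.
 * pause_fn may be NULL, in which case the loop does not pause at all.
 */
static bool wait_for_handlers(struct nic_sketch *nic, int max_tries,
			      void (*pause_fn)(void))
{
	assert(max_tries > 0);

	for (int i = 0; i < max_tries; i++) {
		if (pause_fn != NULL)
			pause_fn();
		if (atomic_load(&nic->isr_cnt) == 0)
			return true;
	}
	return false;
}
```

Note that the patch itself gives up after five 10 ms sleeps and carries on either way; the sketch returns a flag instead so that a caller could decide what to do when handlers are still running.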
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 02:40:31.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 02:31:47.000000000 -0700 @@ -158,6 +158,9 @@ static char ethtool_stats_keys[][ETH_GST {"rmac_pause_cnt"}, {"rmac_accepted_ip"}, {"rmac_err_tcp"}, + {"\n DRIVER STATISTICS"}, + {"single_bit_ecc_errs"}, + {"double_bit_ecc_errs"}, }; #define S2IO_STAT_LEN sizeof(ethtool_stats_keys)/ ETH_GSTRING_LEN @@ -237,7 +240,6 @@ static unsigned int tx_fifo_len[MAX_TX_F static unsigned int rx_ring_num = 1; static unsigned int rx_ring_sz[MAX_RX_RINGS] = {[0 ...(MAX_RX_RINGS - 1)] = 0 }; -static unsigned int Stats_refresh_time = 4; static unsigned int rts_frm_len[MAX_RX_RINGS] = {[0 ...(MAX_RX_RINGS - 1)] = 0 }; static unsigned int use_continuous_tx_intrs = 1; @@ -1083,9 +1085,6 @@ static int init_nic(struct s2io_nic *nic /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); - val64 = SET_UPDT_PERIOD(Stats_refresh_time) | - STAT_CFG_STAT_RO | STAT_CFG_STAT_EN; - writeq(val64, &bar0->stat_cfg); /* * Initializing the sampling rate for the device to calculate the @@ -2101,6 +2100,7 @@ static int s2io_poll(struct net_device * u64 val64; int i; + atomic_inc(&nic->isr_cnt); mac_control = &nic->mac_control; config = &nic->config; @@ -2136,6 +2136,7 @@ static int s2io_poll(struct net_device * } /* Re enable the Rx interrupts. 
*/ en_dis_able_nic_intrs(nic, RX_TRAFFIC_INTR, ENABLE_INTRS); + atomic_dec(&nic->isr_cnt); return 0; no_rx: @@ -2149,6 +2150,7 @@ no_rx: break; } } + atomic_dec(&nic->isr_cnt); return 1; } #endif @@ -2179,6 +2181,13 @@ static void rx_intr_handler(ring_info_t #endif register u64 val64; + spin_lock(&nic->rx_lock); + if (atomic_read(&nic->card_state) == CARD_DOWN) { + DBG_PRINT(ERR_DBG, "%s: %s going down for reset\n", + __FUNCTION__, dev->name); + spin_unlock(&nic->rx_lock); + } + /* * rx_traffic_int reg is an R1 register, hence we read and write * back the same value in the register to clear it @@ -2210,6 +2219,7 @@ static void rx_intr_handler(ring_info_t DBG_PRINT(ERR_DBG, "%s: The skb is ", dev->name); DBG_PRINT(ERR_DBG, "Null in Rx Intr\n"); + spin_unlock(&nic->rx_lock); return; } #ifndef CONFIG_2BUFF_MODE @@ -2262,6 +2272,7 @@ static void rx_intr_handler(ring_info_t break; #endif } + spin_unlock(&nic->rx_lock); } /** @@ -2345,7 +2356,6 @@ static void tx_intr_handler(fifo_info_t (sizeof(TxD_t) * fifo_data->max_txds)); /* Updating the statistics block */ - nic->stats.tx_packets++; nic->stats.tx_bytes += skb->len; dev_kfree_skb_irq(skb); @@ -2393,13 +2403,16 @@ static void alarm_intr_handler(struct s2 writeq(val64, &bar0->mc_err_reg); if (val64 & (MC_ERR_REG_ECC_ALL_SNG | MC_ERR_REG_ECC_ALL_DBL)) { if (val64 & MC_ERR_REG_ECC_ALL_DBL) { + nic->mac_control.stats_info->sw_stat. + double_ecc_errs++; DBG_PRINT(ERR_DBG, "%s: Device indicates ", dev->name); DBG_PRINT(ERR_DBG, "double ECC error!!\n"); netif_stop_queue(dev); schedule_work(&nic->rst_timer_task); } else { - /* Device can recover from Single ECC errors */ + nic->mac_control.stats_info->sw_stat. 
+ single_ecc_errs++; } } @@ -2695,7 +2708,7 @@ int s2io_open(struct net_device *dev) * Nic is initialized */ netif_carrier_off(dev); - sp->last_link_state = LINK_DOWN; + sp->last_link_state = 0; /* Unkown link state */ /* Initialize H/W and enable interrupts */ if (s2io_card_up(sp)) { @@ -2909,6 +2922,7 @@ static irqreturn_t s2io_isr(int irq, voi mac_info_t *mac_control; struct config_param *config; + atomic_inc(&sp->isr_cnt); mac_control = &sp->mac_control; config = &sp->config; @@ -2924,6 +2938,7 @@ static irqreturn_t s2io_isr(int irq, voi if (!reason) { /* The interrupt was not raised by Xena. */ + atomic_dec(&sp->isr_cnt); return IRQ_NONE; } @@ -2972,6 +2987,7 @@ static irqreturn_t s2io_isr(int irq, voi dev->name); DBG_PRINT(ERR_DBG, " in ISR!!\n"); clear_bit(0, (&sp->tasklet_status)); + atomic_dec(&sp->isr_cnt); return IRQ_HANDLED; } clear_bit(0, (&sp->tasklet_status)); @@ -2981,10 +2997,37 @@ static irqreturn_t s2io_isr(int irq, voi } #endif + atomic_dec(&sp->isr_cnt); return IRQ_HANDLED; } /** + * s2io_updt_stats - + */ +static void s2io_updt_stats(nic_t *sp) +{ + XENA_dev_config_t __iomem *bar0 = sp->bar0; + u64 val64; + int cnt = 0; + + if (atomic_read(&sp->card_state) == CARD_UP) { + /* Apprx 30us on a 133 MHz bus */ + val64 = SET_UPDT_CLICKS(10) | + STAT_CFG_ONE_SHOT_EN | STAT_CFG_STAT_EN; + writeq(val64, &bar0->stat_cfg); + do { + udelay(100); + val64 = readq(&bar0->stat_cfg); + if (!(val64 & BIT(0))) + break; + cnt++; + if (cnt == 5) + break; /* Updt failed */ + } while(1); + } +} + +/** * s2io_get_stats - Updates the device statistics structure. * @dev : pointer to the device structure. 
* Description: @@ -3004,6 +3047,11 @@ struct net_device_stats *s2io_get_stats( mac_control = &sp->mac_control; config = &sp->config; + /* Configure Stats for immediate updt */ + s2io_updt_stats(sp); + + sp->stats.tx_packets = + le32_to_cpu(mac_control->stats_info->tmac_frms); sp->stats.tx_errors = le32_to_cpu(mac_control->stats_info->tmac_any_err_frms); sp->stats.rx_errors = @@ -4018,6 +4066,7 @@ static void s2io_get_ethtool_stats(struc nic_t *sp = dev->priv; StatInfo_t *stat_info = sp->mac_control.stats_info; + s2io_updt_stats(sp); tmp_stats[i++] = le32_to_cpu(stat_info->tmac_frms); tmp_stats[i++] = le32_to_cpu(stat_info->tmac_data_octets); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_drop_frms); @@ -4057,6 +4106,9 @@ static void s2io_get_ethtool_stats(struc tmp_stats[i++] = le32_to_cpu(stat_info->rmac_pause_cnt); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_accepted_ip); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_tcp); + tmp_stats[i++] = 0; + tmp_stats[i++] = stat_info->sw_stat.single_ecc_errs; + tmp_stats[i++] = stat_info->sw_stat.double_ecc_errs; } int s2io_ethtool_get_regs_len(struct net_device *dev) @@ -4353,14 +4405,27 @@ static void s2io_card_down(nic_t * sp) break; } } while (1); - spin_lock_irqsave(&sp->tx_lock, flags); s2io_reset(sp); - /* Free all unused Tx and Rx buffers */ + /* Waiting till all Interrupt handlers are complete */ + cnt = 0; + do { + msleep(10); + if (!atomic_read(&sp->isr_cnt)) + break; + cnt++; + } while(cnt < 5); + + spin_lock_irqsave(&sp->tx_lock, flags); + /* Free all Tx buffers */ free_tx_buffers(sp); + spin_unlock_irqrestore(&sp->tx_lock, flags); + + /* Free all Rx buffers */ + spin_lock_irqsave(&sp->rx_lock, flags); free_rx_buffers(sp); + spin_unlock_irqrestore(&sp->rx_lock, flags); - spin_unlock_irqrestore(&sp->tx_lock, flags); clear_bit(0, &(sp->link_state)); } @@ -4647,7 +4712,6 @@ module_param(tx_fifo_num, int, 0); module_param(rx_ring_num, int, 0); module_param_array(tx_fifo_len, uint, NULL, 0); 
module_param_array(rx_ring_sz, uint, NULL, 0); -module_param(Stats_refresh_time, int, 0); module_param_array(rts_frm_len, uint, NULL, 0); module_param(use_continuous_tx_intrs, int, 1); module_param(rmac_pause_time, int, 0); @@ -4804,6 +4868,9 @@ s2io_init_nic(struct pci_dev *pdev, cons for (i = 0; i < config->rx_ring_num; i++) atomic_set(&sp->rx_bufs_left[i], 0); + /* Initialize the number of ISRs currently running */ + atomic_set(&sp->isr_cnt, 0); + /* initialize the shared memory used by the NIC and the host */ if (init_shared_mem(sp)) { DBG_PRINT(ERR_DBG, "%s: Memory allocation failed\n", @@ -4938,6 +5005,7 @@ s2io_init_nic(struct pci_dev *pdev, cons #ifndef CONFIG_S2IO_NAPI spin_lock_init(&sp->put_lock); #endif + spin_lock_init(&sp->rx_lock); /* * SXE-002: Configure link and activity LED to init state @@ -4961,13 +5029,16 @@ s2io_init_nic(struct pci_dev *pdev, cons goto register_failed; } + /* Initialize device name */ + strcpy(sp->name, dev->name); + strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); + /* * Make Link state as off at this point, when the Link change * interrupt comes the state will be automatically changed to * the right state. 
*/ netif_carrier_off(dev); - sp->last_link_state = LINK_DOWN; return 0; diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-02 02:32:11.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 02:31:51.000000000 -0700 @@ -195,6 +195,9 @@ typedef struct stat_block { u32 rxd_rd_cnt; u32 rxf_wr_cnt; u32 txf_rd_cnt; + +/* Software statistics maintained by driver */ + swStat_t sw_stat; } StatInfo_t; /* @@ -678,6 +681,8 @@ struct s2io_nic { #define CARD_UP 2 atomic_t card_state; volatile unsigned long link_state; + spinlock_t rx_lock; + atomic_t isr_cnt; }; #define RESET_ERROR 1; From raghavendra.koushik@neterion.com Wed Aug 3 12:47:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:47:07 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73Jl3H9028645 for ; Wed, 3 Aug 2005 12:47:03 -0700 Received: by linux.site (Postfix, from userid 0) id 3B07898336; Wed, 3 Aug 2005 12:30:43 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 4/13] S2io: Removed memory leaks Message-Id: <20050803193043.3B07898336@linux.site> Date: Wed, 3 Aug 2005 12:30:43 -0700 (PDT) X-archive-position: 2838 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 2061 Lines: 67 Hi, This patch fixes certain memory leaks discovered in free_tx_buffers() and rx_osm_handler() Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- 
vanilla_linux/drivers/net/s2io.c 2005-08-02 02:51:35.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 02:51:26.000000000 -0700 @@ -1709,7 +1709,7 @@ static void free_tx_buffers(struct s2io_ int i, j; mac_info_t *mac_control; struct config_param *config; - int cnt = 0; + int cnt = 0, frg_cnt; mac_control = &nic->mac_control; config = &nic->config; @@ -1722,11 +1722,33 @@ static void free_tx_buffers(struct s2io_ (struct sk_buff *) ((unsigned long) txdp-> Host_Control); if (skb == NULL) { - memset(txdp, 0, sizeof(TxD_t)); + memset(txdp, 0, sizeof(TxD_t) * + config->max_txds); continue; } + frg_cnt = skb_shinfo(skb)->nr_frags; + pci_unmap_single(nic->pdev, (dma_addr_t) + txdp->Buffer_Pointer, + skb->len - skb->data_len, + PCI_DMA_TODEVICE); + if (frg_cnt) { + TxD_t *temp; + temp = txdp; + txdp++; + for (j = 0; j < frg_cnt; j++, txdp++) { + skb_frag_t *frag = + &skb_shinfo(skb)->frags[j]; + pci_unmap_page(nic->pdev, + (dma_addr_t) + txdp-> + Buffer_Pointer, + frag->size, + PCI_DMA_TODEVICE); + } + txdp = temp; + } dev_kfree_skb(skb); - memset(txdp, 0, sizeof(TxD_t)); + memset(txdp, 0, sizeof(TxD_t) * config->max_txds); cnt++; } DBG_PRINT(INTR_DBG, @@ -4570,6 +4592,11 @@ static int rx_osm_handler(ring_info_t *r unsigned long long err = rxdp->Control_1 & RXD_T_CODE; DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%llx\n", dev->name, err); + dev_kfree_skb(skb); + sp->stats.rx_crc_errors++; + atomic_dec(&sp->rx_bufs_left[ring_no]); + rxdp->Host_Control = 0; + return 0; } /* Updating statistics */ From raghavendra.koushik@neterion.com Wed Aug 3 12:48:20 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:48:26 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JmKH9029359 for ; Wed, 3 Aug 2005 12:48:20 -0700 Received: by linux.site (Postfix, from userid 0) id 4F3E598336; Wed, 3 Aug 2005 12:32:00 -0700 (PDT) To: jgarzik@pobox.com, 
netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 5/13] S2io: Performance improvements Message-Id: <20050803193200.4F3E598336@linux.site> Date: Wed, 3 Aug 2005 12:32:00 -0700 (PDT) X-archive-position: 2839 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 5730 Lines: 166
Hi,
This patch consists mostly of performance-related changes.
1. Fixed incorrect computation of the PANIC level in rx_buffer_level().
2. Removed unnecessary PIOs (read/write of tx_traffic_int and rx_traffic_int) from the interrupt handler, and removed the read of the general_int_status register from the xmit routine.
3. Enable two-buffer mode (for the Rx path) automatically on SGI systems. This improves Rx performance dramatically on SGI systems.
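For item 1, the corrected test can be written out as a small standalone function. This is only a sketch of the logic as it reads in the diff: rxb_size is assumed to be the number of Rx buffers still available on the ring, the value 127 for MAX_RXDS_PER_BLOCK is an assumption made for illustration, and the enum names are hypothetical rather than the driver's own constants.

```c
#include <assert.h>

#define MAX_RXDS_PER_BLOCK 127	/* assumed value, for illustration only */

/* Hypothetical level names; the driver uses its own constants. */
enum buf_level { LEVEL_OK = 0, LEVEL_LOW, LEVEL_PANIC };

/*
 * pkt_cnt:  total number of RxDs configured for the ring
 * rxb_size: number of Rx buffers currently left on the ring (assumed meaning)
 */
static enum buf_level rx_buffer_level_sketch(int pkt_cnt, int rxb_size)
{
	enum buf_level level = LEVEL_OK;

	assert(pkt_cnt >= 0 && rxb_size >= 0 && rxb_size <= pkt_cnt);

	/* More than 16 buffers have been consumed: the ring is running low. */
	if (pkt_cnt - rxb_size > 16) {
		level = LEVEL_LOW;
		/* At most one block's worth of buffers remains: panic. */
		if (rxb_size <= MAX_RXDS_PER_BLOCK)
			level = LEVEL_PANIC;
	}
	return level;
}
```

As far as the diff shows, the old condition compared the number of consumed buffers (pkt_cnt - rxb_size) against MAX_RXDS_PER_BLOCK, so it would report PANIC while the ring was still nearly full; the new condition looks at the buffers that remain.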
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 05:57:48.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 05:58:08.000000000 -0700 @@ -100,8 +100,7 @@ static inline int rx_buffer_level(nic_t mac_control = &sp->mac_control; if ((mac_control->rings[ring].pkt_cnt - rxb_size) > 16) { level = LOW; - if ((mac_control->rings[ring].pkt_cnt - rxb_size) < - MAX_RXDS_PER_BLOCK) { + if (rxb_size <= MAX_RXDS_PER_BLOCK) { level = PANIC; } } @@ -2193,7 +2192,6 @@ static void rx_intr_handler(ring_info_t { nic_t *nic = ring_data->nic; struct net_device *dev = (struct net_device *) nic->dev; - XENA_dev_config_t __iomem *bar0 = nic->bar0; int get_block, get_offset, put_block, put_offset, ring_bufs; rx_curr_get_info_t get_info, put_info; RxD_t *rxdp; @@ -2201,8 +2199,6 @@ static void rx_intr_handler(ring_info_t #ifndef CONFIG_S2IO_NAPI int pkt_cnt = 0; #endif - register u64 val64; - spin_lock(&nic->rx_lock); if (atomic_read(&nic->card_state) == CARD_DOWN) { DBG_PRINT(ERR_DBG, "%s: %s going down for reset\n", @@ -2210,13 +2206,6 @@ static void rx_intr_handler(ring_info_t spin_unlock(&nic->rx_lock); } - /* - * rx_traffic_int reg is an R1 register, hence we read and write - * back the same value in the register to clear it - */ - val64 = readq(&bar0->tx_traffic_int); - writeq(val64, &bar0->tx_traffic_int); - get_info = ring_data->rx_curr_get_info; get_block = get_info.block_index; put_info = ring_data->rx_curr_put_info; @@ -2312,20 +2301,11 @@ static void rx_intr_handler(ring_info_t static void tx_intr_handler(fifo_info_t *fifo_data) { nic_t *nic = fifo_data->nic; - XENA_dev_config_t __iomem *bar0 = nic->bar0; struct net_device *dev = (struct net_device *) nic->dev; tx_curr_get_info_t get_info, put_info; struct sk_buff *skb; TxD_t *txdlp; u16 j, frg_cnt; - register u64 val64 = 0; - - /* - * tx_traffic_int reg is an R1 
register, hence we read and write - * back the same value in the register to clear it - */ - val64 = readq(&bar0->tx_traffic_int); - writeq(val64, &bar0->tx_traffic_int); get_info = fifo_data->tx_curr_get_info; put_info = fifo_data->tx_curr_put_info; @@ -2818,7 +2798,6 @@ int s2io_xmit(struct sk_buff *skb, struc #endif mac_info_t *mac_control; struct config_param *config; - XENA_dev_config_t __iomem *bar0 = sp->bar0; mac_control = &sp->mac_control; config = &sp->config; @@ -2870,7 +2849,6 @@ int s2io_xmit(struct sk_buff *skb, struc } txdp->Control_2 |= config->tx_intr_type; - txdp->Control_1 |= (TXD_BUFFER0_SIZE(frg_len) | TXD_GATHER_CODE_FIRST); txdp->Control_1 |= TXD_LIST_OWN_XENA; @@ -2890,6 +2868,8 @@ int s2io_xmit(struct sk_buff *skb, struc val64 = mac_control->fifos[queue].list_info[put_off].list_phy_addr; writeq(val64, &tx_fifo->TxDL_Pointer); + wmb(); + val64 = (TX_FIFO_LAST_TXD_NUM(frg_cnt) | TX_FIFO_FIRST_LIST | TX_FIFO_LAST_LIST); @@ -2899,9 +2879,6 @@ int s2io_xmit(struct sk_buff *skb, struc #endif writeq(val64, &tx_fifo->List_Control); - /* Perform a PCI read to flush previous writes */ - val64 = readq(&bar0->general_int_status); - put_off++; put_off %= mac_control->fifos[queue].tx_curr_put_info.fifo_len + 1; mac_control->fifos[queue].tx_curr_put_info.offset = put_off; @@ -2940,7 +2917,7 @@ static irqreturn_t s2io_isr(int irq, voi nic_t *sp = dev->priv; XENA_dev_config_t __iomem *bar0 = sp->bar0; int i; - u64 reason = 0; + u64 reason = 0, val64; mac_info_t *mac_control; struct config_param *config; @@ -2978,6 +2955,13 @@ static irqreturn_t s2io_isr(int irq, voi #else /* If Intr is because of Rx Traffic */ if (reason & GEN_INTR_RXTRAFFIC) { + /* + * rx_traffic_int reg is an R1 register, writing all 1's + * will ensure that the actual interrupt causing bit get's + * cleared and hence a read can be avoided. 
+ */ + val64 = 0xFFFFFFFFFFFFFFFFULL; + writeq(val64, &bar0->rx_traffic_int); for (i = 0; i < config->rx_ring_num; i++) { rx_intr_handler(&mac_control->rings[i]); } @@ -2986,6 +2970,14 @@ static irqreturn_t s2io_isr(int irq, voi /* If Intr is because of Tx Traffic */ if (reason & GEN_INTR_TXTRAFFIC) { + /* + * tx_traffic_int reg is an R1 register, writing all 1's + * will ensure that the actual interrupt causing bit get's + * cleared and hence a read can be avoided. + */ + val64 = 0xFFFFFFFFFFFFFFFFULL; + writeq(val64, &bar0->tx_traffic_int); + for (i = 0; i < config->tx_fifo_num; i++) tx_intr_handler(&mac_control->fifos[i]); } diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-02 05:57:48.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 05:58:08.000000000 -0700 @@ -13,6 +13,11 @@ #ifndef _S2IO_H #define _S2IO_H +/* Enable 2 buffer mode by default for SGI system */ +#ifdef CONFIG_IA64_SGI_SN2 +#define CONFIG_2BUFF_MODE +#endif + #define TBD 0 #define BIT(loc) (0x8000000000000000ULL >> (loc)) #define vBIT(val, loc, sz) (((u64)val) << (64-loc-sz)) From raghavendra.koushik@neterion.com Wed Aug 3 12:49:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:49:36 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JnWH9029915 for ; Wed, 3 Aug 2005 12:49:32 -0700 Received: by linux.site (Postfix, from userid 0) id 5173C98336; Wed, 3 Aug 2005 12:33:12 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 6/13] S2io: Support for runtime MTU change Message-Id: <20050803193312.5173C98336@linux.site> Date: Wed, 3 Aug 2005 12:33:12 -0700 (PDT) X-archive-position: 
2840 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 1851 Lines: 60 Hi, Patch below supports MTU change on-the-fly(without bringing interface down) Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 06:03:44.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 06:03:51.000000000 -0700 @@ -2849,6 +2849,7 @@ int s2io_xmit(struct sk_buff *skb, struc } txdp->Control_2 |= config->tx_intr_type; + txdp->Control_1 |= (TXD_BUFFER0_SIZE(frg_len) | TXD_GATHER_CODE_FIRST); txdp->Control_1 |= TXD_LIST_OWN_XENA; @@ -4246,14 +4247,6 @@ int s2io_ioctl(struct net_device *dev, s int s2io_change_mtu(struct net_device *dev, int new_mtu) { nic_t *sp = dev->priv; - XENA_dev_config_t __iomem *bar0 = sp->bar0; - register u64 val64; - - if (netif_running(dev)) { - DBG_PRINT(ERR_DBG, "%s: Must be stopped to ", dev->name); - DBG_PRINT(ERR_DBG, "change its MTU\n"); - return -EBUSY; - } if ((new_mtu < MIN_MTU) || (new_mtu > S2IO_JUMBO_SIZE)) { DBG_PRINT(ERR_DBG, "%s: MTU size is invalid.\n", @@ -4261,11 +4254,22 @@ int s2io_change_mtu(struct net_device *d return -EPERM; } - /* Set the new MTU into the PYLD register of the NIC */ - val64 = new_mtu; - writeq(vBIT(val64, 2, 14), &bar0->rmac_max_pyld_len); - dev->mtu = new_mtu; + if (netif_running(dev)) { + s2io_card_down(sp); + netif_stop_queue(dev); + if (s2io_card_up(sp)) { + DBG_PRINT(ERR_DBG, "%s: Device bring up failed\n", + __FUNCTION__); + } + if (netif_queue_stopped(dev)) + netif_wake_queue(dev); + } else { /* Device is down */ + XENA_dev_config_t __iomem *bar0 = sp->bar0; + u64 val64 = new_mtu; + + writeq(vBIT(val64, 2, 14), &bar0->rmac_max_pyld_len); + } return 0; } From raghavendra.koushik@neterion.com Wed Aug 3 12:50:31 
2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:50:37 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JoVH9030222 for ; Wed, 3 Aug 2005 12:50:31 -0700 Received: by linux.site (Postfix, from userid 0) id 54C9498336; Wed, 3 Aug 2005 12:34:11 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 7/13] S2io: Timer based slowpath handling Message-Id: <20050803193411.54C9498336@linux.site> Date: Wed, 3 Aug 2005 12:34:11 -0700 (PDT) X-archive-position: 2841 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 3237 Lines: 97 Hi, This patch implements the slow-path handling functions(link state change, hardware errors) as a timer. It is not handled in interrupt handler as was done previously. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 06:25:21.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 06:25:31.000000000 -0700 @@ -168,6 +168,12 @@ static char ethtool_stats_keys[][ETH_GST #define S2IO_TEST_LEN sizeof(s2io_gstrings) / ETH_GSTRING_LEN #define S2IO_STRINGS_LEN S2IO_TEST_LEN * ETH_GSTRING_LEN +#define S2IO_TIMER_CONF(timer, handle, arg, exp) \ + init_timer(&timer); \ + timer.function = handle; \ + timer.data = (unsigned long) arg; \ + mod_timer(&timer, (jiffies + exp)) \ + /* * Constants to be programmed into the Xena's registers, to configure * the XAUI. 
@@ -2741,6 +2747,7 @@ int s2io_open(struct net_device *dev) setting_mac_address_failed: free_irq(sp->pdev->irq, dev); isr_registration_failed: + del_timer_sync(&sp->alarm_timer); s2io_reset(sp); hw_init_failed: return err; @@ -2898,6 +2905,15 @@ int s2io_xmit(struct sk_buff *skb, struc return 0; } +static void +s2io_alarm_handle(unsigned long data) +{ + nic_t *sp = (nic_t *)data; + + alarm_intr_handler(sp); + mod_timer(&sp->alarm_timer, jiffies + HZ / 2); +} + /** * s2io_isr - ISR handler of the device . * @irq: the irq of the device. @@ -2942,9 +2958,6 @@ static irqreturn_t s2io_isr(int irq, voi return IRQ_NONE; } - if (reason & (GEN_ERROR_INTR)) - alarm_intr_handler(sp); - #ifdef CONFIG_S2IO_NAPI if (reason & GEN_INTR_RXTRAFFIC) { if (netif_rx_schedule_prep(dev)) { @@ -4394,6 +4407,7 @@ static void s2io_card_down(nic_t * sp) unsigned long flags; register u64 val64 = 0; + del_timer_sync(&sp->alarm_timer); /* If s2io_set_link task is executing, wait till it completes. */ while (test_and_set_bit(0, &(sp->link_state))) { msleep(50); @@ -4496,6 +4510,8 @@ static int s2io_card_up(nic_t * sp) return -ENODEV; } + S2IO_TIMER_CONF(sp->alarm_timer, s2io_alarm_handle, sp, (HZ/2)); + atomic_set(&sp->card_state, CARD_UP); return 0; } diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-02 06:25:21.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 06:25:31.000000000 -0700 @@ -624,6 +624,9 @@ struct s2io_nic { struct tasklet_struct task; volatile unsigned long tasklet_status; + /* Timer that handles I/O errors/exceptions */ + struct timer_list alarm_timer; + /* Space to back up the PCI config space */ u32 config_space[256 / sizeof(u32)]; @@ -819,6 +822,7 @@ static int s2io_poll(struct net_device * #endif static void s2io_init_pci(nic_t * sp); int s2io_set_mac_addr(struct net_device *dev, u8 * addr); +static void s2io_alarm_handle(unsigned long data); static irqreturn_t s2io_isr(int irq, void 
*dev_id, struct pt_regs *regs); static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag); static struct ethtool_ops netdev_ethtool_ops; From raghavendra.koushik@neterion.com Wed Aug 3 12:52:16 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:52:19 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JqFH9031035 for ; Wed, 3 Aug 2005 12:52:15 -0700 Received: by linux.site (Postfix, from userid 0) id 9E77A98336; Wed, 3 Aug 2005 12:35:55 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 8/13] S2io: VLAN support Message-Id: <20050803193555.9E77A98336@linux.site> Date: Wed, 3 Aug 2005 12:35:55 -0700 (PDT) X-archive-position: 2842 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 4086 Lines: 132 Hi, Patch below adds VLAN support to the driver. 
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 06:31:23.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 06:31:39.000000000 -0700 @@ -55,6 +55,7 @@ #include #include #include +#include #include #include @@ -174,6 +175,30 @@ static char ethtool_stats_keys[][ETH_GST timer.data = (unsigned long) arg; \ mod_timer(&timer, (jiffies + exp)) \ +/* Add the vlan */ +static void s2io_vlan_rx_register(struct net_device *dev, + struct vlan_group *grp) +{ + nic_t *nic = dev->priv; + unsigned long flags; + + spin_lock_irqsave(&nic->tx_lock, flags); + nic->vlgrp = grp; + spin_unlock_irqrestore(&nic->tx_lock, flags); +} + +/* Unregister the vlan */ +static void s2io_vlan_rx_kill_vid(struct net_device *dev, unsigned long vid) +{ + nic_t *nic = dev->priv; + unsigned long flags; + + spin_lock_irqsave(&nic->tx_lock, flags); + if (nic->vlgrp) + nic->vlgrp->vlan_devices[vid] = NULL; + spin_unlock_irqrestore(&nic->tx_lock, flags); +} + /* * Constants to be programmed into the Xena's registers, to configure * the XAUI. @@ -2803,6 +2828,8 @@ int s2io_xmit(struct sk_buff *skb, struc #ifdef NETIF_F_TSO int mss; #endif + u16 vlan_tag = 0; + int vlan_priority = 0; mac_info_t *mac_control; struct config_param *config; @@ -2821,6 +2848,13 @@ int s2io_xmit(struct sk_buff *skb, struc queue = 0; + /* Get Fifo number to Transmit based on vlan priority */ + if (sp->vlgrp && vlan_tx_tag_present(skb)) { + vlan_tag = vlan_tx_tag_get(skb); + vlan_priority = vlan_tag >> 13; + queue = config->fifo_mapping[vlan_priority]; + } + put_off = (u16) mac_control->fifos[queue].tx_curr_put_info.offset; get_off = (u16) mac_control->fifos[queue].tx_curr_get_info.offset; txdp = (TxD_t *) mac_control->fifos[queue].list_info[put_off]. 
@@ -2857,6 +2891,11 @@ int s2io_xmit(struct sk_buff *skb, struc txdp->Control_2 |= config->tx_intr_type; + if (sp->vlgrp && vlan_tx_tag_present(skb)) { + txdp->Control_2 |= TXD_VLAN_ENABLE; + txdp->Control_2 |= TXD_VLAN_TAG(vlan_tag); + } + txdp->Control_1 |= (TXD_BUFFER0_SIZE(frg_len) | TXD_GATHER_CODE_FIRST); txdp->Control_1 |= TXD_LIST_OWN_XENA; @@ -4653,10 +4692,23 @@ static int rx_osm_handler(ring_info_t *r skb->protocol = eth_type_trans(skb, dev); #ifdef CONFIG_S2IO_NAPI - netif_receive_skb(skb); + if (sp->vlgrp && RXD_GET_VLAN_TAG(rxdp->Control_2)) { + /* Queueing the vlan frame to the upper layer */ + vlan_hwaccel_receive_skb(skb, sp->vlgrp, + RXD_GET_VLAN_TAG(rxdp->Control_2)); + } else { + netif_receive_skb(skb); + } #else - netif_rx(skb); + if (sp->vlgrp && RXD_GET_VLAN_TAG(rxdp->Control_2)) { + /* Queueing the vlan frame to the upper layer */ + vlan_hwaccel_rx(skb, sp->vlgrp, + RXD_GET_VLAN_TAG(rxdp->Control_2)); + } else { + netif_rx(skb); + } #endif + dev->last_rx = jiffies; atomic_dec(&sp->rx_bufs_left[ring_no]); return SUCCESS; @@ -4954,6 +5006,9 @@ s2io_init_nic(struct pci_dev *pdev, cons dev->do_ioctl = &s2io_ioctl; dev->change_mtu = &s2io_change_mtu; SET_ETHTOOL_OPS(dev, &netdev_ethtool_ops); + dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX; + dev->vlan_rx_register = s2io_vlan_rx_register; + dev->vlan_rx_kill_vid = (void *)s2io_vlan_rx_kill_vid; /* * will use eth_mac_addr() for dev->set_mac_address diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-02 06:31:23.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 06:31:39.000000000 -0700 @@ -689,6 +689,8 @@ struct s2io_nic { #define CARD_UP 2 atomic_t card_state; volatile unsigned long link_state; + struct vlan_group *vlgrp; + spinlock_t rx_lock; atomic_t isr_cnt; }; From raghavendra.koushik@neterion.com Wed Aug 3 12:53:16 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:53:19 
-0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JrFH9031333 for ; Wed, 3 Aug 2005 12:53:16 -0700 Received: by linux.site (Postfix, from userid 0) id CB51298355; Wed, 3 Aug 2005 12:36:55 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 9/13] S2io: Support for Xframe II NIC Message-Id: <20050803193655.CB51298355@linux.site> Date: Wed, 3 Aug 2005 12:36:55 -0700 (PDT) X-archive-position: 2843 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 30773 Lines: 936 Hi, This patch provides basic support for the Xframe II adapter. It includes the following changes: 1. New values to program the XAUI interface. 2. Print the PCI/PCI-X mode (bus frequency, width). 3. Remove EOI from reset during initialization. 4. Enable all 8 PCCs if the adapter is an Xframe II. 5. Program the RLDRAM size depending on the device. (Note: RLDRAM size on XFRAME-I is 64 Mb whereas on XFRAME-II it's 32 Mb). 6. Enable extended (64-bit) statistics counters. 7. Program timer interrupt duration based on PCI/PCI-X clock speed. 8. Saving/restoring PCI config space before/after reset is not required on Xframe II. 
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io-regs.h linux-2.6.13-rc4/drivers/net/s2io-regs.h --- vanilla_linux/drivers/net/s2io-regs.h 2005-08-02 06:37:37.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io-regs.h 2005-08-02 06:45:08.000000000 -0700 @@ -91,7 +91,21 @@ typedef struct _XENA_dev_config { SERR_SOURCE_MC | \ SERR_SOURCE_XGXS) - u8 unused_0[0x800 - 0x120]; + u64 pci_mode; +#define GET_PCI_MODE(val) ((val & vBIT(0xF, 0, 4)) >> 60) +#define PCI_MODE_PCI_33 0 +#define PCI_MODE_PCI_66 0x1 +#define PCI_MODE_PCIX_M1_66 0x2 +#define PCI_MODE_PCIX_M1_100 0x3 +#define PCI_MODE_PCIX_M1_133 0x4 +#define PCI_MODE_PCIX_M2_66 0x5 +#define PCI_MODE_PCIX_M2_100 0x6 +#define PCI_MODE_PCIX_M2_133 0x7 +#define PCI_MODE_UNSUPPORTED BIT(0) +#define PCI_MODE_32_BITS BIT(8) +#define PCI_MODE_UNKNOWN_MODE BIT(9) + + u8 unused_0[0x800 - 0x128]; /* PCI-X Controller registers */ u64 pic_int_status; @@ -223,19 +237,16 @@ typedef struct _XENA_dev_config { u64 xmsi_data; u64 rx_mat; +#define RX_MAT_SET(ring, msi) vBIT(msi, (8 * ring), 8) u8 unused6[0x8]; - u64 tx_mat0_7; - u64 tx_mat8_15; - u64 tx_mat16_23; - u64 tx_mat24_31; - u64 tx_mat32_39; - u64 tx_mat40_47; - u64 tx_mat48_55; - u64 tx_mat56_63; + u64 tx_mat0_n[0x8]; +#define TX_MAT_SET(fifo, msi) vBIT(msi, (8 * fifo), 8) - u8 unused_1[0x10]; + u8 unused_1[0x8]; + u64 stat_byte_cnt; +#define STAT_BC(n) vBIT(n,4,12) /* Automated statistics collection */ u64 stat_cfg; @@ -269,7 +280,12 @@ typedef struct _XENA_dev_config { u64 gpio_control; #define GPIO_CTRL_GPIO_0 BIT(8) - u8 unused7[0x600]; + u8 unused7_1[0x240 - 0x200]; + + u64 wreq_split_mask; +#define WREQ_SPLIT_MASK_SET_MASK(val) vBIT(val, 52, 12) + + u8 unused7_2[0x800 - 0x248]; /* TxDMA registers */ u64 txdma_int_status; @@ -470,6 +486,7 @@ typedef struct _XENA_dev_config { #define PRC_CTRL_NO_SNOOP (BIT(22)|BIT(23)) #define PRC_CTRL_NO_SNOOP_DESC BIT(22) #define PRC_CTRL_NO_SNOOP_BUFF BIT(23) +#define 
PRC_CTRL_BIMODAL_INTERRUPT BIT(37) #define PRC_CTRL_RXD_BACKOFF_INTERVAL(val) vBIT(val,40,24) u64 prc_alarm_action; @@ -742,7 +759,19 @@ typedef struct _XENA_dev_config { u64 mc_rldram_test_d1; u8 unused24[0x300 - 0x288]; u64 mc_rldram_test_d2; - u8 unused25[0x700 - 0x308]; + + u8 unused24_1[0x360 - 0x308]; + u64 mc_rldram_ctrl; +#define MC_RLDRAM_ENABLE_ODT BIT(7) + + u8 unused24_2[0x640 - 0x368]; + u64 mc_rldram_ref_per_herc; +#define MC_RLDRAM_SET_REF_PERIOD(val) vBIT(val, 0, 16) + + u8 unused24_3[0x660 - 0x648]; + u64 mc_rldram_mrs_herc; + + u8 unused25[0x700 - 0x668]; u64 mc_debug_ctrl; u8 unused26[0x3000 - 0x2f08]; diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 06:37:37.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 06:45:08.000000000 -0700 @@ -84,9 +84,10 @@ static inline int RXD_IS_UP2DT(RxD_t *rx * problem, 600B, 600C, 600D, 640B, 640C and 640D. * macro below identifies these cards given the subsystem_id. */ -#define CARDS_WITH_FAULTY_LINK_INDICATORS(subid) \ - (((subid >= 0x600B) && (subid <= 0x600D)) || \ - ((subid >= 0x640B) && (subid <= 0x640D))) ? 1 : 0 +#define CARDS_WITH_FAULTY_LINK_INDICATORS(dev_type, subid) \ + (dev_type == XFRAME_I_DEVICE) ? \ + ((((subid >= 0x600B) && (subid <= 0x600D)) || \ + ((subid >= 0x640B) && (subid <= 0x640D))) ? 
1 : 0) : 0 #define LINK_IS_UP(val64) (!(val64 & (ADAPTER_STATUS_RMAC_REMOTE_FAULT | \ ADAPTER_STATUS_RMAC_LOCAL_FAULT))) @@ -207,7 +208,24 @@ static void s2io_vlan_rx_kill_vid(struct #define SWITCH_SIGN 0xA5A5A5A5A5A5A5A5ULL #define END_SIGN 0x0 -static u64 default_mdio_cfg[] = { +static u64 herc_act_dtx_cfg[] = { + /* Set address */ + 0x80000515BA750000ULL, 0x80000515BA7500E0ULL, + /* Write data */ + 0x80000515BA750004ULL, 0x80000515BA7500E4ULL, + /* Set address */ + 0x80010515003F0000ULL, 0x80010515003F00E0ULL, + /* Write data */ + 0x80010515003F0004ULL, 0x80010515003F00E4ULL, + /* Set address */ + 0x80020515F2100000ULL, 0x80020515F21000E0ULL, + /* Write data */ + 0x80020515F2100004ULL, 0x80020515F21000E4ULL, + /* Done */ + END_SIGN +}; + +static u64 xena_mdio_cfg[] = { /* Reset PMA PLL */ 0xC001010000000000ULL, 0xC0010100000000E0ULL, 0xC0010100008000E4ULL, @@ -217,7 +235,7 @@ static u64 default_mdio_cfg[] = { END_SIGN }; -static u64 default_dtx_cfg[] = { +static u64 xena_dtx_cfg[] = { 0x8000051500000000ULL, 0x80000515000000E0ULL, 0x80000515D93500E4ULL, 0x8001051500000000ULL, 0x80010515000000E0ULL, 0x80010515001E00E4ULL, @@ -656,6 +674,87 @@ static void free_shared_mem(struct s2io_ } /** + * s2io_verify_pci_mode - + */ + +static int s2io_verify_pci_mode(nic_t *nic) +{ + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + register u64 val64 = 0; + int mode; + + val64 = readq(&bar0->pci_mode); + mode = (u8)GET_PCI_MODE(val64); + + if ( val64 & PCI_MODE_UNKNOWN_MODE) + return -1; /* Unknown PCI mode */ + return mode; +} + + +/** + * s2io_print_pci_mode - + */ +static int s2io_print_pci_mode(nic_t *nic) +{ + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) nic->bar0; + register u64 val64 = 0; + int mode; + struct config_param *config = &nic->config; + + val64 = readq(&bar0->pci_mode); + mode = (u8)GET_PCI_MODE(val64); + + if ( val64 & PCI_MODE_UNKNOWN_MODE) + return -1; /* Unknown PCI mode */ + + if (val64 & PCI_MODE_32_BITS) { + DBG_PRINT(ERR_DBG, "%s: 
Device is on 32 bit ", nic->dev->name); + } else { + DBG_PRINT(ERR_DBG, "%s: Device is on 64 bit ", nic->dev->name); + } + + switch(mode) { + case PCI_MODE_PCI_33: + DBG_PRINT(ERR_DBG, "33MHz PCI bus\n"); + config->bus_speed = 33; + break; + case PCI_MODE_PCI_66: + DBG_PRINT(ERR_DBG, "66MHz PCI bus\n"); + config->bus_speed = 133; + break; + case PCI_MODE_PCIX_M1_66: + DBG_PRINT(ERR_DBG, "66MHz PCIX(M1) bus\n"); + config->bus_speed = 133; /* Herc doubles the clock rate */ + break; + case PCI_MODE_PCIX_M1_100: + DBG_PRINT(ERR_DBG, "100MHz PCIX(M1) bus\n"); + config->bus_speed = 200; + break; + case PCI_MODE_PCIX_M1_133: + DBG_PRINT(ERR_DBG, "133MHz PCIX(M1) bus\n"); + config->bus_speed = 266; + break; + case PCI_MODE_PCIX_M2_66: + DBG_PRINT(ERR_DBG, "133MHz PCIX(M2) bus\n"); + config->bus_speed = 133; + break; + case PCI_MODE_PCIX_M2_100: + DBG_PRINT(ERR_DBG, "200MHz PCIX(M2) bus\n"); + config->bus_speed = 200; + break; + case PCI_MODE_PCIX_M2_133: + DBG_PRINT(ERR_DBG, "266MHz PCIX(M2) bus\n"); + config->bus_speed = 266; + break; + default: + return -1; /* Unsupported bus speed */ + } + + return mode; +} + +/** * init_nic - Initialization of hardware * @nic: device peivate variable * Description: The function sequentially configures every block @@ -687,6 +786,16 @@ static int init_nic(struct s2io_nic *nic return -1; } + /* + * Herc requires EOI to be removed from reset before XGXS, so.. + */ + if (nic->device_type & XFRAME_II_DEVICE) { + val64 = 0xA500000000ULL; + writeq(val64, &bar0->sw_reset); + msleep(500); + val64 = readq(&bar0->sw_reset); + } + /* Remove XGXS from reset state */ val64 = 0; writeq(val64, &bar0->sw_reset); @@ -718,41 +827,51 @@ static int init_nic(struct s2io_nic *nic * of 64 bit values into two registers in a particular * sequence. 
Hence a macro 'SWITCH_SIGN' has been defined * which will be defined in the array of configuration values - * (default_dtx_cfg & default_mdio_cfg) at appropriate places + * (xena_dtx_cfg & xena_mdio_cfg) at appropriate places * to switch writing from one regsiter to another. We continue * writing these values until we encounter the 'END_SIGN' macro. * For example, After making a series of 21 writes into * dtx_control register the 'SWITCH_SIGN' appears and hence we * start writing into mdio_control until we encounter END_SIGN. */ - while (1) { - dtx_cfg: - while (default_dtx_cfg[dtx_cnt] != END_SIGN) { - if (default_dtx_cfg[dtx_cnt] == SWITCH_SIGN) { - dtx_cnt++; - goto mdio_cfg; - } - SPECIAL_REG_WRITE(default_dtx_cfg[dtx_cnt], + if (nic->device_type & XFRAME_II_DEVICE) { + while (herc_act_dtx_cfg[dtx_cnt] != END_SIGN) { + SPECIAL_REG_WRITE(xena_dtx_cfg[dtx_cnt], &bar0->dtx_control, UF); - val64 = readq(&bar0->dtx_control); + if (dtx_cnt & 0x1) + msleep(1); /* Necessary!! */ dtx_cnt++; } - mdio_cfg: - while (default_mdio_cfg[mdio_cnt] != END_SIGN) { - if (default_mdio_cfg[mdio_cnt] == SWITCH_SIGN) { + } else { + while (1) { + dtx_cfg: + while (xena_dtx_cfg[dtx_cnt] != END_SIGN) { + if (xena_dtx_cfg[dtx_cnt] == SWITCH_SIGN) { + dtx_cnt++; + goto mdio_cfg; + } + SPECIAL_REG_WRITE(xena_dtx_cfg[dtx_cnt], + &bar0->dtx_control, UF); + val64 = readq(&bar0->dtx_control); + dtx_cnt++; + } + mdio_cfg: + while (xena_mdio_cfg[mdio_cnt] != END_SIGN) { + if (xena_mdio_cfg[mdio_cnt] == SWITCH_SIGN) { + mdio_cnt++; + goto dtx_cfg; + } + SPECIAL_REG_WRITE(xena_mdio_cfg[mdio_cnt], + &bar0->mdio_control, UF); + val64 = readq(&bar0->mdio_control); mdio_cnt++; + } + if ((xena_dtx_cfg[dtx_cnt] == END_SIGN) && + (xena_mdio_cfg[mdio_cnt] == END_SIGN)) { + break; + } else { goto dtx_cfg; } - SPECIAL_REG_WRITE(default_mdio_cfg[mdio_cnt], - &bar0->mdio_control, UF); - val64 = readq(&bar0->mdio_control); - mdio_cnt++; - } - if ((default_dtx_cfg[dtx_cnt] == END_SIGN) && - 
(default_mdio_cfg[mdio_cnt] == END_SIGN)) { - break; - } else { - goto dtx_cfg; } } @@ -803,7 +922,8 @@ static int init_nic(struct s2io_nic *nic * Disable 4 PCCs for Xena1, 2 and 3 as per H/W bug * SXE-008 TRANSMIT DMA ARBITRATION ISSUE. */ - if (get_xena_rev_id(nic->pdev) < 4) + if ((nic->device_type == XFRAME_I_DEVICE) && + (get_xena_rev_id(nic->pdev) < 4)) writeq(PCC_ENABLE_FOUR, &bar0->pcc_enable); val64 = readq(&bar0->tx_fifo_partition_0); @@ -833,7 +953,11 @@ static int init_nic(struct s2io_nic *nic * configured Rings. */ val64 = 0; - mem_size = 64; + if (nic->device_type & XFRAME_II_DEVICE) + mem_size = 32; + else + mem_size = 64; + for (i = 0; i < config->rx_ring_num; i++) { switch (i) { case 0: @@ -1116,6 +1240,11 @@ static int init_nic(struct s2io_nic *nic /* Program statistics memory */ writeq(mac_control->stats_mem_phy, &bar0->stat_addr); + if (nic->device_type == XFRAME_II_DEVICE) { + val64 = STAT_BC(0x320); + writeq(val64, &bar0->stat_byte_cnt); + } + /* * Initializing the sampling rate for the device to calculate the * bandwidth utilization. @@ -1134,12 +1263,18 @@ static int init_nic(struct s2io_nic *nic * 250 interrupts per sec. Continuous interrupts are enabled * by default. 
*/ - val64 = TTI_DATA1_MEM_TX_TIMER_VAL(0x2078) | - TTI_DATA1_MEM_TX_URNG_A(0xA) | + if (nic->device_type == XFRAME_II_DEVICE) { + int count = (nic->config.bus_speed * 125)/2; + val64 = TTI_DATA1_MEM_TX_TIMER_VAL(count); + } else { + + val64 = TTI_DATA1_MEM_TX_TIMER_VAL(0x2078); + } + val64 |= TTI_DATA1_MEM_TX_URNG_A(0xA) | TTI_DATA1_MEM_TX_URNG_B(0x10) | TTI_DATA1_MEM_TX_URNG_C(0x30) | TTI_DATA1_MEM_TX_TIMER_AC_EN; - if (use_continuous_tx_intrs) - val64 |= TTI_DATA1_MEM_TX_TIMER_CI_EN; + if (use_continuous_tx_intrs) + val64 |= TTI_DATA1_MEM_TX_TIMER_CI_EN; writeq(val64, &bar0->tti_data1_mem); val64 = TTI_DATA2_MEM_TX_UFC_A(0x10) | @@ -1171,9 +1306,19 @@ static int init_nic(struct s2io_nic *nic time++; } + /* RTI Initialization */ - val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF) | - RTI_DATA1_MEM_RX_URNG_A(0xA) | + if (nic->device_type == XFRAME_II_DEVICE) { + /* + * Programmed to generate Apprx 500 Intrs per + * second + */ + int count = (nic->config.bus_speed * 125)/4; + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count); + } else { + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF); + } + val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) | RTI_DATA1_MEM_RX_URNG_B(0x10) | RTI_DATA1_MEM_RX_URNG_C(0x30) | RTI_DATA1_MEM_RX_TIMER_AC_EN; @@ -1267,6 +1412,15 @@ static int init_nic(struct s2io_nic *nic val64 |= PIC_CNTL_SHARED_SPLITS(shared_splits); writeq(val64, &bar0->pic_control); + /* + * Programming the Herc to split every write transaction + * that does not start on an ADB to reduce disconnects. 
+ */ + if (nic->device_type == XFRAME_II_DEVICE) { + val64 = WREQ_SPLIT_MASK_SET_MASK(255); + writeq(val64, &bar0->wreq_split_mask); + } + return SUCCESS; } @@ -1509,18 +1663,18 @@ static void en_dis_able_nic_intrs(struct } } -static int check_prc_pcc_state(u64 val64, int flag, int rev_id) +static int check_prc_pcc_state(u64 val64, int flag, int rev_id, int herc) { int ret = 0; if (flag == FALSE) { - if (rev_id >= 4) { + if ((!herc && (rev_id >= 4)) || herc) { if (!(val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) && ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == ADAPTER_STATUS_RC_PRC_QUIESCENT)) { ret = 1; } - } else { + }else { if (!(val64 & ADAPTER_STATUS_RMAC_PCC_FOUR_IDLE) && ((val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) == ADAPTER_STATUS_RC_PRC_QUIESCENT)) { @@ -1528,7 +1682,7 @@ static int check_prc_pcc_state(u64 val64 } } } else { - if (rev_id >= 4) { + if ((!herc && (rev_id >= 4)) || herc) { if (((val64 & ADAPTER_STATUS_RMAC_PCC_IDLE) == ADAPTER_STATUS_RMAC_PCC_IDLE) && (!(val64 & ADAPTER_STATUS_RC_PRC_QUIESCENT) || @@ -1564,10 +1718,11 @@ static int check_prc_pcc_state(u64 val64 static int verify_xena_quiescence(nic_t *sp, u64 val64, int flag) { - int ret = 0; + int ret = 0, herc; u64 tmp64 = ~((u64) val64); int rev_id = get_xena_rev_id(sp->pdev); + herc = (sp->device_type == XFRAME_II_DEVICE); if (! 
(tmp64 & (ADAPTER_STATUS_TDMA_READY | ADAPTER_STATUS_RDMA_READY | @@ -1575,7 +1730,7 @@ static int verify_xena_quiescence(nic_t ADAPTER_STATUS_PIC_QUIESCENT | ADAPTER_STATUS_MC_DRAM_READY | ADAPTER_STATUS_MC_QUEUES_READY | ADAPTER_STATUS_M_PLL_LOCK | ADAPTER_STATUS_P_PLL_LOCK))) { - ret = check_prc_pcc_state(val64, flag, rev_id); + ret = check_prc_pcc_state(val64, flag, rev_id, herc); } return ret; @@ -1706,7 +1861,8 @@ static int start_nic(struct s2io_nic *ni /* SXE-002: Initialize link and activity LED */ subid = nic->pdev->subsystem_device; - if ((subid & 0xFF) >= 0x07) { + if (((subid & 0xFF) >= 0x07) && + (nic->device_type == XFRAME_I_DEVICE)) { val64 = readq(&bar0->gpio_control); val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); @@ -2541,9 +2697,12 @@ void s2io_reset(nic_t * sp) */ msleep(250); + if (!(sp->device_type & XFRAME_II_DEVICE)) { /* Restore the PCI state saved during initializarion. */ - pci_restore_state(sp->pdev); - + pci_restore_state(sp->pdev); + } else { + pci_set_master(sp->pdev); + } s2io_init_pci(sp); msleep(250); @@ -2568,7 +2727,8 @@ void s2io_reset(nic_t * sp) /* SXE-002: Configure link and activity LED to turn it off */ subid = sp->pdev->subsystem_device; - if ((subid & 0xFF) >= 0x07) { + if (((subid & 0xFF) >= 0x07) && + (sp->device_type == XFRAME_I_DEVICE)) { val64 = readq(&bar0->gpio_control); val64 |= 0x0000800000000000ULL; writeq(val64, &bar0->gpio_control); @@ -2576,6 +2736,15 @@ void s2io_reset(nic_t * sp) writeq(val64, (void __iomem *) ((u8 *) bar0 + 0x2700)); } + /* + * Clear spurious ECC interrupts that would have occured on + * XFRAME II cards after reset. 
+ */ + if (sp->device_type == XFRAME_II_DEVICE) { + val64 = readq(&bar0->pcc_err_reg); + writeq(val64, &bar0->pcc_err_reg); + } + sp->device_enabled_once = FALSE; } @@ -3463,7 +3632,8 @@ static void s2io_phy_id(unsigned long da u16 subid; subid = sp->pdev->subsystem_device; - if ((subid & 0xFF) >= 0x07) { + if ((sp->device_type == XFRAME_II_DEVICE) || + ((subid & 0xFF) >= 0x07)) { val64 = readq(&bar0->gpio_control); val64 ^= GPIO_CTRL_GPIO_0; writeq(val64, &bar0->gpio_control); @@ -3500,7 +3670,8 @@ static int s2io_ethtool_idnic(struct net subid = sp->pdev->subsystem_device; last_gpio_ctrl_val = readq(&bar0->gpio_control); - if ((subid & 0xFF) < 0x07) { + if ((sp->device_type == XFRAME_I_DEVICE) && + ((subid & 0xFF) < 0x07)) { val64 = readq(&bar0->adapter_control); if (!(val64 & ADAPTER_CNTL_EN)) { printk(KERN_ERR @@ -3520,7 +3691,7 @@ static int s2io_ethtool_idnic(struct net msleep_interruptible(MAX_FLICKER_TIME); del_timer_sync(&sp->id_timer); - if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { + if (CARDS_WITH_FAULTY_LINK_INDICATORS(sp->device_type, subid)) { writeq(last_gpio_ctrl_val, &bar0->gpio_control); last_gpio_ctrl_val = readq(&bar0->gpio_control); } @@ -4134,44 +4305,91 @@ static void s2io_get_ethtool_stats(struc StatInfo_t *stat_info = sp->mac_control.stats_info; s2io_updt_stats(sp); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_data_octets); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_data_octets_oflow) << 32 | + le32_to_cpu(stat_info->tmac_data_octets); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_drop_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_mcst_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_bcst_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_mcst_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_mcst_frms); + tmp_stats[i++] = + 
(u64)le32_to_cpu(stat_info->tmac_bcst_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_bcst_frms); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_pause_ctrl_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_any_err_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_any_err_frms_oflow) << 32 | + le32_to_cpu(stat_info->tmac_any_err_frms); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_vld_ip_octets); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_vld_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_drop_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_icmp); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_rst_tcp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_vld_ip_oflow) << 32 | + le32_to_cpu(stat_info->tmac_vld_ip); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_drop_ip_oflow) << 32 | + le32_to_cpu(stat_info->tmac_drop_ip); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_icmp_oflow) << 32 | + le32_to_cpu(stat_info->tmac_icmp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->tmac_rst_tcp_oflow) << 32 | + le32_to_cpu(stat_info->tmac_rst_tcp); tmp_stats[i++] = le64_to_cpu(stat_info->tmac_tcp); - tmp_stats[i++] = le32_to_cpu(stat_info->tmac_udp); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_vld_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_data_octets); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->tmac_udp_oflow) << 32 | + le32_to_cpu(stat_info->tmac_udp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_vld_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_vld_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_data_octets_oflow) << 32 | + le32_to_cpu(stat_info->rmac_data_octets); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_fcs_err_frms); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_drop_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_vld_mcst_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_vld_bcst_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_vld_mcst_frms_oflow) << 32 | + 
le32_to_cpu(stat_info->rmac_vld_mcst_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_vld_bcst_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_vld_bcst_frms); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_in_rng_len_err_frms); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_long_frms); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_pause_ctrl_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_discarded_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_usized_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_osized_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_frag_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_jabber_frms); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_ip); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_discarded_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_discarded_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_usized_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_usized_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_osized_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_osized_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_frag_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_frag_frms); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_jabber_frms_oflow) << 32 | + le32_to_cpu(stat_info->rmac_jabber_frms); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_ip_oflow) << 32 | + le32_to_cpu(stat_info->rmac_ip); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_ip_octets); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_hdr_err_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_drop_ip); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_icmp); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_drop_ip_oflow) << 32 | + le32_to_cpu(stat_info->rmac_drop_ip); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_icmp_oflow) << 32 | + le32_to_cpu(stat_info->rmac_icmp); tmp_stats[i++] = le64_to_cpu(stat_info->rmac_tcp); - tmp_stats[i++] = 
le32_to_cpu(stat_info->rmac_udp); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_drp_udp); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_pause_cnt); - tmp_stats[i++] = le32_to_cpu(stat_info->rmac_accepted_ip); + tmp_stats[i++] = (u64)le32_to_cpu(stat_info->rmac_udp_oflow) << 32 | + le32_to_cpu(stat_info->rmac_udp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_err_drp_udp_oflow) << 32 | + le32_to_cpu(stat_info->rmac_err_drp_udp); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_pause_cnt_oflow) << 32 | + le32_to_cpu(stat_info->rmac_pause_cnt); + tmp_stats[i++] = + (u64)le32_to_cpu(stat_info->rmac_accepted_ip_oflow) << 32 | + le32_to_cpu(stat_info->rmac_accepted_ip); tmp_stats[i++] = le32_to_cpu(stat_info->rmac_err_tcp); tmp_stats[i++] = 0; tmp_stats[i++] = stat_info->sw_stat.single_ecc_errs; @@ -4401,7 +4619,8 @@ static void s2io_set_link(unsigned long val64 = readq(&bar0->adapter_control); val64 |= ADAPTER_CNTL_EN; writeq(val64, &bar0->adapter_control); - if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { + if (CARDS_WITH_FAULTY_LINK_INDICATORS(nic->device_type, + subid)) { val64 = readq(&bar0->gpio_control); val64 |= GPIO_CTRL_GPIO_0; writeq(val64, &bar0->gpio_control); @@ -4423,7 +4642,8 @@ static void s2io_set_link(unsigned long } s2io_link(nic, LINK_UP); } else { - if (CARDS_WITH_FAULTY_LINK_INDICATORS(subid)) { + if (CARDS_WITH_FAULTY_LINK_INDICATORS(nic->device_type, + subid)) { val64 = readq(&bar0->gpio_control); val64 &= ~GPIO_CTRL_GPIO_0; writeq(val64, &bar0->gpio_control); @@ -4708,7 +4928,6 @@ static int rx_osm_handler(ring_info_t *r netif_rx(skb); } #endif - dev->last_rx = jiffies; atomic_dec(&sp->rx_bufs_left[ring_no]); return SUCCESS; @@ -4842,6 +5061,7 @@ s2io_init_nic(struct pci_dev *pdev, cons u16 subid; mac_info_t *mac_control; struct config_param *config; + int mode; #ifdef CONFIG_S2IO_NAPI DBG_PRINT(ERR_DBG, "NAPI support has been enabled\n"); @@ -4898,6 +5118,12 @@ s2io_init_nic(struct pci_dev *pdev, cons sp->high_dma_flag = 
dma_flag; sp->device_enabled_once = FALSE; + if ((pdev->device == PCI_DEVICE_ID_HERC_WIN) || + (pdev->device == PCI_DEVICE_ID_HERC_UNI)) + sp->device_type = XFRAME_II_DEVICE; + else + sp->device_type = XFRAME_I_DEVICE; + /* Initialize some PCI/PCI-X fields of the NIC. */ s2io_init_pci(sp); @@ -5033,7 +5259,9 @@ s2io_init_nic(struct pci_dev *pdev, cons INIT_WORK(&sp->set_link_task, (void (*)(void *)) s2io_set_link, sp); - pci_save_state(sp->pdev); + if (!(sp->device_type & XFRAME_II_DEVICE)) { + pci_save_state(sp->pdev); + } /* Setting swapper control on the NIC, for proper reset operation */ if (s2io_set_swapper(sp)) { @@ -5043,12 +5271,26 @@ s2io_init_nic(struct pci_dev *pdev, cons goto set_swap_failed; } - /* - * Fix for all "FFs" MAC address problems observed on - * Alpha platforms - */ - fix_mac_address(sp); - s2io_reset(sp); + /* Verify if the Herc works on the slot its placed into */ + if (sp->device_type & XFRAME_II_DEVICE) { + mode = s2io_verify_pci_mode(sp); + if (mode < 0) { + DBG_PRINT(ERR_DBG, "%s: ", __FUNCTION__); + DBG_PRINT(ERR_DBG, " Unsupported PCI bus mode\n"); + ret = -EBADSLT; + goto set_swap_failed; + } + } + + /* Not needed for Herc */ + if (sp->device_type & XFRAME_I_DEVICE) { + /* + * Fix for all "FFs" MAC address problems observed on + * Alpha platforms + */ + fix_mac_address(sp); + s2io_reset(sp); + } /* * MAC address initialization. 
@@ -5073,22 +5315,13 @@ s2io_init_nic(struct pci_dev *pdev, cons sp->def_mac_addr[0].mac_addr[5] = (u8) (mac_down >> 16); sp->def_mac_addr[0].mac_addr[4] = (u8) (mac_down >> 24); - DBG_PRINT(INIT_DBG, - "DEFAULT MAC ADDR:0x%02x-%02x-%02x-%02x-%02x-%02x\n", - sp->def_mac_addr[0].mac_addr[0], - sp->def_mac_addr[0].mac_addr[1], - sp->def_mac_addr[0].mac_addr[2], - sp->def_mac_addr[0].mac_addr[3], - sp->def_mac_addr[0].mac_addr[4], - sp->def_mac_addr[0].mac_addr[5]); - /* Set the factory defined MAC address initially */ dev->addr_len = ETH_ALEN; memcpy(dev->dev_addr, sp->def_mac_addr, ETH_ALEN); /* * Initialize the tasklet status and link state flags - * and the card statte parameter + * and the card state parameter */ atomic_set(&(sp->card_state), 0); sp->tasklet_status = 0; @@ -5123,9 +5356,46 @@ s2io_init_nic(struct pci_dev *pdev, cons goto register_failed; } + if (sp->device_type & XFRAME_II_DEVICE) { + DBG_PRINT(ERR_DBG, "%s: Neterion Xframe II 10GbE adapter ", + dev->name); + DBG_PRINT(ERR_DBG, "(rev %d), Driver %s\n", + get_xena_rev_id(sp->pdev), + s2io_driver_version); + DBG_PRINT(ERR_DBG, "MAC ADDR: %02x:%02x:%02x:%02x:%02x:%02x\n", + sp->def_mac_addr[0].mac_addr[0], + sp->def_mac_addr[0].mac_addr[1], + sp->def_mac_addr[0].mac_addr[2], + sp->def_mac_addr[0].mac_addr[3], + sp->def_mac_addr[0].mac_addr[4], + sp->def_mac_addr[0].mac_addr[5]); + int mode = s2io_print_pci_mode(sp); + if (mode < 0) { + DBG_PRINT(ERR_DBG, " Unsupported PCI bus mode "); + ret = -EBADSLT; + goto set_swap_failed; + } + } else { + DBG_PRINT(ERR_DBG, "%s: Neterion Xframe I 10GbE adapter ", + dev->name); + DBG_PRINT(ERR_DBG, "(rev %d), Driver %s\n", + get_xena_rev_id(sp->pdev), + s2io_driver_version); + DBG_PRINT(ERR_DBG, "MAC ADDR: %02x:%02x:%02x:%02x:%02x:%02x\n", + sp->def_mac_addr[0].mac_addr[0], + sp->def_mac_addr[0].mac_addr[1], + sp->def_mac_addr[0].mac_addr[2], + sp->def_mac_addr[0].mac_addr[3], + sp->def_mac_addr[0].mac_addr[4], + sp->def_mac_addr[0].mac_addr[5]); + } + /* 
Initialize device name */ strcpy(sp->name, dev->name); - strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); + if (sp->device_type & XFRAME_II_DEVICE) + strcat(sp->name, ": Neterion Xframe II 10GbE adapter"); + else + strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); /* * Make Link state as off at this point, when the Link change diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-02 06:37:37.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 06:45:08.000000000 -0700 @@ -201,6 +201,67 @@ typedef struct stat_block { u32 rxf_wr_cnt; u32 txf_rd_cnt; +/* Tx MAC statistics overflow counters. */ + u32 tmac_data_octets_oflow; + u32 tmac_frms_oflow; + u32 tmac_bcst_frms_oflow; + u32 tmac_mcst_frms_oflow; + u32 tmac_ucst_frms_oflow; + u32 tmac_ttl_octets_oflow; + u32 tmac_any_err_frms_oflow; + u32 tmac_nucst_frms_oflow; + u64 tmac_vlan_frms; + u32 tmac_drop_ip_oflow; + u32 tmac_vld_ip_oflow; + u32 tmac_rst_tcp_oflow; + u32 tmac_icmp_oflow; + u32 tpa_unknown_protocol; + u32 tmac_udp_oflow; + u32 reserved_10; + u32 tpa_parse_failure; + +/* Rx MAC Statistics overflow counters. 
*/ + u32 rmac_data_octets_oflow; + u32 rmac_vld_frms_oflow; + u32 rmac_vld_bcst_frms_oflow; + u32 rmac_vld_mcst_frms_oflow; + u32 rmac_accepted_ucst_frms_oflow; + u32 rmac_ttl_octets_oflow; + u32 rmac_discarded_frms_oflow; + u32 rmac_accepted_nucst_frms_oflow; + u32 rmac_usized_frms_oflow; + u32 rmac_drop_events_oflow; + u32 rmac_frag_frms_oflow; + u32 rmac_osized_frms_oflow; + u32 rmac_ip_oflow; + u32 rmac_jabber_frms_oflow; + u32 rmac_icmp_oflow; + u32 rmac_drop_ip_oflow; + u32 rmac_err_drp_udp_oflow; + u32 rmac_udp_oflow; + u32 reserved_11; + u32 rmac_pause_cnt_oflow; + u64 rmac_ttl_1519_4095_frms; + u64 rmac_ttl_4096_8191_frms; + u64 rmac_ttl_8192_max_frms; + u64 rmac_ttl_gt_max_frms; + u64 rmac_osized_alt_frms; + u64 rmac_jabber_alt_frms; + u64 rmac_gt_max_alt_frms; + u64 rmac_vlan_frms; + u32 rmac_len_discard; + u32 rmac_fcs_discard; + u32 rmac_pf_discard; + u32 rmac_da_discard; + u32 rmac_red_discard; + u32 rmac_rts_discard; + u32 reserved_12; + u32 rmac_ingm_full_discard; + u32 reserved_13; + u32 rmac_accepted_ip_oflow; + u32 reserved_14; + u32 link_fault_cnt; + /* Software statistics maintained by driver */ swStat_t sw_stat; } StatInfo_t; @@ -690,6 +751,9 @@ struct s2io_nic { atomic_t card_state; volatile unsigned long link_state; struct vlan_group *vlgrp; +#define XFRAME_I_DEVICE 1 +#define XFRAME_II_DEVICE 2 + u8 device_type; spinlock_t rx_lock; atomic_t isr_cnt; From raghavendra.koushik@neterion.com Wed Aug 3 12:54:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:54:27 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JsLH9031929 for ; Wed, 3 Aug 2005 12:54:21 -0700 Received: by linux.site (Postfix, from userid 0) id 1649298336; Wed, 3 Aug 2005 12:38:01 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, 
rapuru.sriram@neterion.com From: raghavendra.koushik@neterion.com Subject: [PATCH 2.6.13-rc4 10/13] S2io: Support for bimodal interrupts Message-Id: <20050803193801.1649298336@linux.site> Date: Wed, 3 Aug 2005 12:38:01 -0700 (PDT) X-archive-position: 2844 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghavendra.koushik@neterion.com Precedence: bulk X-list: netdev Content-Length: 6372 Lines: 199 Hi, This is a patch to provide bimodal interrupt moderation support for Xframe II adapter. Basically, in this moderation scheme, the adapter raises a traffic interrupt if the no. of packets transmitted and/or received reaches a programmable threshold. Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 06:50:49.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 06:54:23.000000000 -0700 @@ -297,6 +297,7 @@ static unsigned int mc_pause_threshold_q static unsigned int shared_splits; static unsigned int tmac_util_period = 5; static unsigned int rmac_util_period = 5; +static unsigned int bimodal = 0; #ifndef CONFIG_S2IO_NAPI static unsigned int indicate_max_pkts; #endif @@ -1306,52 +1307,86 @@ static int init_nic(struct s2io_nic *nic time++; } + if (nic->config.bimodal) { + int k = 0; + for (k = 0; k < config->rx_ring_num; k++) { + val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD; + val64 |= TTI_CMD_MEM_OFFSET(0x38+k); + writeq(val64, &bar0->tti_command_mem); - /* RTI Initialization */ - if (nic->device_type == XFRAME_II_DEVICE) { /* - * Programmed to generate Apprx 500 Intrs per - * second - */ - int count = (nic->config.bus_speed * 125)/4; - val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count); + * Once the operation completes, the Strobe bit of the command + * register will be reset. 
We poll for this particular condition + * We wait for a maximum of 500ms for the operation to complete, + * if it's not complete by then we return error. + */ + time = 0; + while (TRUE) { + val64 = readq(&bar0->tti_command_mem); + if (!(val64 & TTI_CMD_MEM_STROBE_NEW_CMD)) { + break; + } + if (time > 10) { + DBG_PRINT(ERR_DBG, + "%s: TTI init Failed\n", + dev->name); + return -1; + } + time++; + msleep(50); + } + } } else { - val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF); - } - val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) | - RTI_DATA1_MEM_RX_URNG_B(0x10) | - RTI_DATA1_MEM_RX_URNG_C(0x30) | RTI_DATA1_MEM_RX_TIMER_AC_EN; - - writeq(val64, &bar0->rti_data1_mem); - val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) | - RTI_DATA2_MEM_RX_UFC_B(0x2) | - RTI_DATA2_MEM_RX_UFC_C(0x40) | RTI_DATA2_MEM_RX_UFC_D(0x80); - writeq(val64, &bar0->rti_data2_mem); + /* RTI Initialization */ + if (nic->device_type == XFRAME_II_DEVICE) { + /* + * Programmed to generate Apprx 500 Intrs per + * second + */ + int count = (nic->config.bus_speed * 125)/4; + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count); + } else { + val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF); + } + val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) | + RTI_DATA1_MEM_RX_URNG_B(0x10) | + RTI_DATA1_MEM_RX_URNG_C(0x30) | RTI_DATA1_MEM_RX_TIMER_AC_EN; + + writeq(val64, &bar0->rti_data1_mem); + + val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) | + RTI_DATA2_MEM_RX_UFC_B(0x2) | + RTI_DATA2_MEM_RX_UFC_C(0x40) | RTI_DATA2_MEM_RX_UFC_D(0x80); + writeq(val64, &bar0->rti_data2_mem); - val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD; - writeq(val64, &bar0->rti_command_mem); + for (i = 0; i < config->rx_ring_num; i++) { + val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD + | RTI_CMD_MEM_OFFSET(i); + writeq(val64, &bar0->rti_command_mem); - /* - * Once the operation completes, the Strobe bit of the - * command register will be reset. We poll for this - * particular condition. 
We wait for a maximum of 500ms - * for the operation to complete, if it's not complete - * by then we return error. - */ - time = 0; - while (TRUE) { - val64 = readq(&bar0->rti_command_mem); - if (!(val64 & RTI_CMD_MEM_STROBE_NEW_CMD)) { - break; - } - if (time > 10) { - DBG_PRINT(ERR_DBG, "%s: RTI init Failed\n", - dev->name); - return -1; + /* + * Once the operation completes, the Strobe bit of the + * command register will be reset. We poll for this + * particular condition. We wait for a maximum of 500ms + * for the operation to complete, if it's not complete + * by then we return error. + */ + time = 0; + while (TRUE) { + val64 = readq(&bar0->rti_command_mem); + if (!(val64 & RTI_CMD_MEM_STROBE_NEW_CMD)) { + break; + } + if (time > 10) { + DBG_PRINT(ERR_DBG, "%s: RTI init Failed\n", + dev->name); + return -1; + } + time++; + msleep(50); + } } - time++; - msleep(50); } /* @@ -1789,6 +1824,8 @@ static int start_nic(struct s2io_nic *ni &bar0->prc_rxd0_n[i]); val64 = readq(&bar0->prc_ctrl_n[i]); + if (nic->config.bimodal) + val64 |= PRC_CTRL_BIMODAL_INTERRUPT; #ifndef CONFIG_2BUFF_MODE val64 |= PRC_CTRL_RC_ENABLED; #else @@ -5030,6 +5067,7 @@ module_param(mc_pause_threshold_q4q7, in module_param(shared_splits, int, 0); module_param(tmac_util_period, int, 0); module_param(rmac_util_period, int, 0); +module_param(bimodal, bool, 0); #ifndef CONFIG_S2IO_NAPI module_param(indicate_max_pkts, int, 0); #endif @@ -5397,6 +5435,14 @@ s2io_init_nic(struct pci_dev *pdev, cons else strcat(sp->name, ": Neterion Xframe I 10GbE adapter"); + /* Initialize bimodal Interrupts */ + sp->config.bimodal = bimodal; + if (!(sp->device_type & XFRAME_II_DEVICE) && bimodal) { + sp->config.bimodal = 0; + DBG_PRINT(ERR_DBG,"%s:Bimodal intr not supported by Xframe I\n", + dev->name); + } + /* * Make Link state as off at this point, when the Link change * interrupt comes the state will be automatically changed to diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- 
vanilla_linux/drivers/net/s2io.h	2005-08-02 06:50:49.000000000 -0700
+++ linux-2.6.13-rc4/drivers/net/s2io.h	2005-08-02 06:54:23.000000000 -0700
@@ -261,8 +261,6 @@ typedef struct stat_block {
 	u32 rmac_accepted_ip_oflow;
 	u32 reserved_14;
 	u32 link_fault_cnt;
-
-/* Software statistics maintained by driver */
 	swStat_t sw_stat;
 } StatInfo_t;
 
@@ -349,6 +347,7 @@ struct config_param {
 #define MAX_RX_BLOCKS_PER_RING 150
 
 	rx_ring_config_t rx_cfg[MAX_RX_RINGS];	/*Per-Rx Ring config */
+	u8 bimodal;	/*Flag for setting bimodal interrupts*/
 
 #define HEADER_ETHERNET_II_802_3_SIZE 14
 #define HEADER_802_2_SIZE 3

From raghavendra.koushik@neterion.com Wed Aug 3 12:55:22 2005
Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:55:25 -0700 (PDT)
Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JtJH9032458 for ; Wed, 3 Aug 2005 12:55:22 -0700
Received: by linux.site (Postfix, from userid 0) id C556A98336; Wed, 3 Aug 2005 12:38:59 -0700 (PDT)
To: jgarzik@pobox.com, netdev@oss.sgi.com
Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com
From: raghavendra.koushik@neterion.com
Subject: [PATCH 2.6.13-rc4 11/13] S2io: New link handling scheme for Xframe II
Message-Id: <20050803193859.C556A98336@linux.site>
Date: Wed, 3 Aug 2005 12:38:59 -0700 (PDT)
X-archive-position: 2845
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: raghavendra.koushik@neterion.com
Precedence: bulk
X-list: netdev
Content-Length: 9069
Lines: 276

Hi,

The patch below implements a new "Link state change handling" scheme supported by the Xframe II adapter. It also bumps the driver version to 2.0.2.0.

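For readers skimming the patch, the link handling scheme can be sketched roughly as follows. This is a minimal standalone illustration, not the driver code: the names LINK_UP_BIT, LINK_DOWN_BIT, link_changed() and next_gpio_mask() are hypothetical, and the bit positions are placeholders rather than the adapter's real register layout. The idea, as far as it can be read from the diff, is that only the interrupt for the opposite link transition is left unmasked, and an event that reports both link-up and link-down at once is treated as a glitch and ignored.

```c
#include <stdint.h>

/* Placeholder bit positions, for illustration only (assumption). */
#define LINK_DOWN_BIT (1u << 1)
#define LINK_UP_BIT   (1u << 2)

enum link_state { LINK_IS_DOWN = 0, LINK_IS_UP = 1 };

/*
 * Decide whether the pending GPIO bits indicate a real change from the
 * last known link state. Both bits pending together is treated as a
 * glitch and does not count as a change.
 */
static int link_changed(enum link_state last, uint32_t pending)
{
	int up = (pending & LINK_UP_BIT) != 0;
	int down = (pending & LINK_DOWN_BIT) != 0;

	if (up && down)
		return 0;
	return (last == LINK_IS_UP && down) || (last == LINK_IS_DOWN && up);
}

/*
 * Compute the next interrupt mask (a set bit means "masked").
 * While the link is up only the link-down interrupt stays unmasked,
 * and while it is down only the link-up interrupt stays unmasked.
 */
static uint32_t next_gpio_mask(enum link_state last, uint32_t mask)
{
	if (last == LINK_IS_UP) {
		mask &= ~LINK_DOWN_BIT;	/* unmask link down */
		mask |= LINK_UP_BIT;	/* mask link up */
	} else {
		mask &= ~LINK_UP_BIT;	/* unmask link up */
		mask |= LINK_DOWN_BIT;	/* mask link down */
	}
	return mask;
}
```

In the actual patch the equivalent decisions are made in s2io_txpic_intr_handle() against the gpio_int_reg and gpio_int_mask registers; the sketch only separates the two decisions so they can be read in isolation.
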
Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io-regs.h linux-2.6.13-rc4/drivers/net/s2io-regs.h --- vanilla_linux/drivers/net/s2io-regs.h 2005-08-02 06:59:35.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io-regs.h 2005-08-02 06:59:45.000000000 -0700 @@ -167,7 +167,11 @@ typedef struct _XENA_dev_config { u8 unused4[0x08]; u64 gpio_int_reg; +#define GPIO_INT_REG_LINK_DOWN BIT(1) +#define GPIO_INT_REG_LINK_UP BIT(2) u64 gpio_int_mask; +#define GPIO_INT_MASK_LINK_DOWN BIT(1) +#define GPIO_INT_MASK_LINK_UP BIT(2) u64 gpio_alarms; u8 unused5[0x38]; @@ -279,8 +283,10 @@ typedef struct _XENA_dev_config { u64 gpio_control; #define GPIO_CTRL_GPIO_0 BIT(8) + u64 misc_control; +#define MISC_LINK_STABILITY_PRD(val) vBIT(val,29,3) - u8 unused7_1[0x240 - 0x200]; + u8 unused7_1[0x240 - 0x208]; u64 wreq_split_mask; #define WREQ_SPLIT_MASK_SET_MASK(val) vBIT(val, 52, 12) diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 06:59:35.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 06:59:45.000000000 -0700 @@ -67,7 +67,7 @@ /* S2io Driver name & version. 
*/ static char s2io_driver_name[] = "Neterion"; -static char s2io_driver_version[] = "Version 1.7.7"; +static char s2io_driver_version[] = "Version 2.0.2.0"; static inline int RXD_IS_UP2DT(RxD_t *rxdp) { @@ -1456,8 +1456,28 @@ static int init_nic(struct s2io_nic *nic writeq(val64, &bar0->wreq_split_mask); } + /* Setting Link stability period to 64 ms */ + if (nic->device_type == XFRAME_II_DEVICE) { + val64 = MISC_LINK_STABILITY_PRD(3); + writeq(val64, &bar0->misc_control); + } + return SUCCESS; } +#define LINK_UP_DOWN_INTERRUPT 1 +#define MAC_RMAC_ERR_TIMER 2 + +#if defined(CONFIG_MSI_MODE) || defined(CONFIG_MSIX_MODE) +#define s2io_link_fault_indication(x) MAC_RMAC_ERR_TIMER +#else +int s2io_link_fault_indication(nic_t *nic) +{ + if (nic->device_type == XFRAME_II_DEVICE) + return LINK_UP_DOWN_INTERRUPT; + else + return MAC_RMAC_ERR_TIMER; +} +#endif /** * en_dis_able_nic_intrs - Enable or Disable the interrupts @@ -1485,11 +1505,22 @@ static void en_dis_able_nic_intrs(struct temp64 &= ~((u64) val64); writeq(temp64, &bar0->general_int_mask); /* - * Disabled all PCIX, Flash, MDIO, IIC and GPIO + * If Hercules adapter enable GPIO otherwise + * disabled all PCIX, Flash, MDIO, IIC and GPIO * interrupts for now. * TODO */ - writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); + if (s2io_link_fault_indication(nic) == + LINK_UP_DOWN_INTERRUPT ) { + temp64 = readq(&bar0->pic_int_mask); + temp64 &= ~((u64) PIC_INT_GPIO); + writeq(temp64, &bar0->pic_int_mask); + temp64 = readq(&bar0->gpio_int_mask); + temp64 &= ~((u64) GPIO_INT_MASK_LINK_UP); + writeq(temp64, &bar0->gpio_int_mask); + } else { + writeq(DISABLE_ALL_INTRS, &bar0->pic_int_mask); + } /* * No MSI Support is available presently, so TTI and * RTI interrupts are also disabled. @@ -1580,17 +1611,8 @@ static void en_dis_able_nic_intrs(struct writeq(temp64, &bar0->general_int_mask); /* * All MAC block error interrupts are disabled for now - * except the link status change interrupt. 
* TODO */ - val64 = MAC_INT_STATUS_RMAC_INT; - temp64 = readq(&bar0->mac_int_mask); - temp64 &= ~((u64) val64); - writeq(temp64, &bar0->mac_int_mask); - - val64 = readq(&bar0->mac_rmac_err_mask); - val64 &= ~((u64) RMAC_LINK_STATE_CHANGE_INT); - writeq(val64, &bar0->mac_rmac_err_mask); } else if (flag == DISABLE_INTRS) { /* * Disable MAC Intrs in the general intr mask register @@ -1879,8 +1901,10 @@ static int start_nic(struct s2io_nic *ni } /* Enable select interrupts */ - interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR | MC_INTR; + interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | MC_INTR; + interruptible |= TX_PIC_INTR | RX_PIC_INTR; + interruptible |= TX_MAC_INTR | RX_MAC_INTR; + en_dis_able_nic_intrs(nic, interruptible, ENABLE_INTRS); /* @@ -2004,8 +2028,9 @@ static void stop_nic(struct s2io_nic *ni config = &nic->config; /* Disable all interrupts */ - interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | TX_MAC_INTR | - RX_MAC_INTR | MC_INTR; + interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | MC_INTR; + interruptible |= TX_PIC_INTR | RX_PIC_INTR; + interruptible |= TX_MAC_INTR | RX_MAC_INTR; en_dis_able_nic_intrs(nic, interruptible, DISABLE_INTRS); /* Disable PRCs */ @@ -2618,10 +2643,12 @@ static void alarm_intr_handler(struct s2 register u64 val64 = 0, err_reg = 0; /* Handling link status change error Intr */ - err_reg = readq(&bar0->mac_rmac_err_reg); - writeq(err_reg, &bar0->mac_rmac_err_reg); - if (err_reg & RMAC_LINK_STATE_CHANGE_INT) { - schedule_work(&nic->set_link_task); + if (s2io_link_fault_indication(nic) == MAC_RMAC_ERR_TIMER) { + err_reg = readq(&bar0->mac_rmac_err_reg); + writeq(err_reg, &bar0->mac_rmac_err_reg); + if (err_reg & RMAC_LINK_STATE_CHANGE_INT) { + schedule_work(&nic->set_link_task); + } } /* Handling Ecc errors */ @@ -2947,7 +2974,7 @@ int s2io_open(struct net_device *dev) * Nic is initialized */ netif_carrier_off(dev); - sp->last_link_state = 0; /* Unkown link state */ + sp->last_link_state = 
LINK_DOWN; /* Initialize H/W and enable interrupts */ if (s2io_card_up(sp)) { @@ -3159,6 +3186,53 @@ s2io_alarm_handle(unsigned long data) mod_timer(&sp->alarm_timer, jiffies + HZ / 2); } +static void s2io_txpic_intr_handle(nic_t *sp) +{ + XENA_dev_config_t *bar0 = (XENA_dev_config_t *) sp->bar0; + u64 val64; + + val64 = readq(&bar0->pic_int_status); + if (val64 & PIC_INT_GPIO) { + val64 = readq(&bar0->gpio_int_reg); + if ((val64 & GPIO_INT_REG_LINK_DOWN) && + (val64 & GPIO_INT_REG_LINK_UP)) { + val64 |= GPIO_INT_REG_LINK_DOWN; + val64 |= GPIO_INT_REG_LINK_UP; + writeq(val64, &bar0->gpio_int_reg); + goto masking; + } + + if (((sp->last_link_state == LINK_UP) && + (val64 & GPIO_INT_REG_LINK_DOWN)) || + ((sp->last_link_state == LINK_DOWN) && + (val64 & GPIO_INT_REG_LINK_UP))) { + val64 = readq(&bar0->gpio_int_mask); + val64 |= GPIO_INT_MASK_LINK_DOWN; + val64 |= GPIO_INT_MASK_LINK_UP; + writeq(val64, &bar0->gpio_int_mask); + s2io_set_link((unsigned long)sp); + } +masking: + if (sp->last_link_state == LINK_UP) { + /*enable down interrupt */ + val64 = readq(&bar0->gpio_int_mask); + /* unmasks link down intr */ + val64 &= ~GPIO_INT_MASK_LINK_DOWN; + /* masks link up intr */ + val64 |= GPIO_INT_MASK_LINK_UP; + writeq(val64, &bar0->gpio_int_mask); + } else { + /*enable UP Interrupt */ + val64 = readq(&bar0->gpio_int_mask); + /* unmasks link up interrupt */ + val64 &= ~GPIO_INT_MASK_LINK_UP; + /* masks link down interrupt */ + val64 |= GPIO_INT_MASK_LINK_DOWN; + writeq(val64, &bar0->gpio_int_mask); + } + } +} + /** * s2io_isr - ISR handler of the device . * @irq: the irq of the device. 
@@ -3241,6 +3315,8 @@ static irqreturn_t s2io_isr(int irq, voi tx_intr_handler(&mac_control->fifos[i]); } + if (reason & GEN_INTR_TXPIC) + s2io_txpic_intr_handle(sp); /* * If the Rx buffer count is below the panic threshold then * reallocate the buffers from the interrupt handler itself, @@ -4644,11 +4720,13 @@ static void s2io_set_link(unsigned long } subid = nic->pdev->subsystem_device; - /* - * Allow a small delay for the NICs self initiated - * cleanup to complete. - */ - msleep(100); + if (s2io_link_fault_indication(nic) == MAC_RMAC_ERR_TIMER) { + /* + * Allow a small delay for the NICs self initiated + * cleanup to complete. + */ + msleep(100); + } val64 = readq(&bar0->adapter_status); if (verify_xena_quiescence(nic, val64, nic->device_enabled_once)) { @@ -4666,13 +4744,16 @@ static void s2io_set_link(unsigned long val64 |= ADAPTER_LED_ON; writeq(val64, &bar0->adapter_control); } - val64 = readq(&bar0->adapter_status); - if (!LINK_IS_UP(val64)) { - DBG_PRINT(ERR_DBG, "%s:", dev->name); - DBG_PRINT(ERR_DBG, " Link down"); - DBG_PRINT(ERR_DBG, "after "); - DBG_PRINT(ERR_DBG, "enabling "); - DBG_PRINT(ERR_DBG, "device \n"); + if (s2io_link_fault_indication(nic) == + MAC_RMAC_ERR_TIMER) { + val64 = readq(&bar0->adapter_status); + if (!LINK_IS_UP(val64)) { + DBG_PRINT(ERR_DBG, "%s:", dev->name); + DBG_PRINT(ERR_DBG, " Link down"); + DBG_PRINT(ERR_DBG, "after "); + DBG_PRINT(ERR_DBG, "enabling "); + DBG_PRINT(ERR_DBG, "device \n"); + } } if (nic->device_enabled_once == FALSE) { nic->device_enabled_once = TRUE; From raghavendra.koushik@neterion.com Wed Aug 3 12:56:17 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:56:24 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JuGH9000565 for ; Wed, 3 Aug 2005 12:56:17 -0700 Received: by linux.site (Postfix, from userid 0) id E293198336; Wed, 3 Aug 2005 12:39:56 -0700 (PDT) To: 
jgarzik@pobox.com, netdev@oss.sgi.com
Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com
From: raghavendra.koushik@neterion.com
Subject: [PATCH 2.6.13-rc4 12/13] S2io: Miscellaneous fixes
Message-Id: <20050803193956.E293198336@linux.site>
Date: Wed, 3 Aug 2005 12:39:56 -0700 (PDT)
X-archive-position: 2846
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: raghavendra.koushik@neterion.com
Precedence: bulk
X-list: netdev
Content-Length: 5056
Lines: 144

Hi,

The last patch in this series fixes the following issues found during testing.

1. Ensure we don't pass zero-sized buffers to the card (which can lock up).
2. Restore the PCI-X parameters (in the case of the Xframe I adapter) after a reset.
3. Make sure the total size of all FIFOs does not exceed 8192.

Signed-off-by: Ravinandan Arakali
Signed-off-by: Raghavendra Koushik
---
diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c
--- vanilla_linux/drivers/net/s2io.c	2005-08-02 07:05:12.000000000 -0700
+++ linux-2.6.13-rc4/drivers/net/s2io.c	2005-08-02 07:05:24.000000000 -0700
@@ -365,10 +365,9 @@ static int init_shared_mem(struct s2io_n
 		size += config->tx_cfg[i].fifo_len;
 	}
 	if (size > MAX_AVAILABLE_TXDS) {
-		DBG_PRINT(ERR_DBG, "%s: Total number of Tx FIFOs ",
-			  dev->name);
-		DBG_PRINT(ERR_DBG, "exceeds the maximum value ");
-		DBG_PRINT(ERR_DBG, "that can be used\n");
+		DBG_PRINT(ERR_DBG, "%s: Requested TxDs too high, ",
+			  __FUNCTION__);
+		DBG_PRINT(ERR_DBG, "Requested: %d, max supported: 8192\n", size);
 		return FAILURE;
 	}
 
@@ -611,8 +610,9 @@ static void free_shared_mem(struct s2io_
 						lst_per_page);
 		for (j = 0; j < page_num; j++) {
 			int mem_blks = (j * lst_per_page);
-			if (!mac_control->fifos[i].list_info[mem_blks].
-			    list_virt_addr)
+			if ((!mac_control->fifos[i].list_info) ||
+				(!mac_control->fifos[i].list_info[mem_blks].
+ list_virt_addr)) break; pci_free_consistent(nic->pdev, PAGE_SIZE, mac_control->fifos[i]. @@ -2594,6 +2594,8 @@ static void tx_intr_handler(fifo_info_t for (j = 0; j < frg_cnt; j++, txdlp++) { skb_frag_t *frag = &skb_shinfo(skb)->frags[j]; + if (!txdlp->Buffer_Pointer) + break; pci_unmap_page(nic->pdev, (dma_addr_t) txdlp-> @@ -2744,6 +2746,10 @@ void s2io_reset(nic_t * sp) u64 val64; u16 subid, pci_cmd; + /* Back up the PCI-X CMD reg, dont want to lose MMRBC, OST settings */ + if (sp->device_type == XFRAME_I_DEVICE) + pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, &(pci_cmd)); + val64 = SW_RESET_ALL; writeq(val64, &bar0->sw_reset); @@ -2762,8 +2768,10 @@ void s2io_reset(nic_t * sp) msleep(250); if (!(sp->device_type & XFRAME_II_DEVICE)) { - /* Restore the PCI state saved during initializarion. */ + /* Restore the PCI state saved during initializarion. */ pci_restore_state(sp->pdev); + pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, + pci_cmd); } else { pci_set_master(sp->pdev); } @@ -2974,7 +2982,7 @@ int s2io_open(struct net_device *dev) * Nic is initialized */ netif_carrier_off(dev); - sp->last_link_state = LINK_DOWN; + sp->last_link_state = 0; /* Initialize H/W and enable interrupts */ if (s2io_card_up(sp)) { @@ -3102,6 +3110,15 @@ int s2io_xmit(struct sk_buff *skb, struc spin_unlock_irqrestore(&sp->tx_lock, flags); return 0; } + + /* A buffer with no data will be dropped */ + if (!skb->len) { + DBG_PRINT(TX_DBG, "%s:Buffer has no data..\n", dev->name); + dev_kfree_skb(skb); + spin_unlock_irqrestore(&sp->tx_lock, flags); + return 0; + } + #ifdef NETIF_F_TSO mss = skb_shinfo(skb)->tso_size; if (mss) { @@ -3136,6 +3153,9 @@ int s2io_xmit(struct sk_buff *skb, struc /* For fragmented SKB. 
*/ for (i = 0; i < frg_cnt; i++) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + /* A '0' length fragment will be ignored */ + if (!frag->size) + continue; txdp++; txdp->Buffer_Pointer = (u64) pci_map_page (sp->pdev, frag->page, frag->page_offset, @@ -5257,7 +5277,8 @@ s2io_init_nic(struct pci_dev *pdev, cons config = &sp->config; /* Tx side parameters. */ - tx_fifo_len[0] = DEFAULT_FIFO_LEN; /* Default value. */ + if (tx_fifo_len[0] == 0) + tx_fifo_len[0] = DEFAULT_FIFO_LEN; /* Default value. */ config->tx_fifo_num = tx_fifo_num; for (i = 0; i < MAX_TX_FIFOS; i++) { config->tx_cfg[i].fifo_len = tx_fifo_len[i]; @@ -5280,7 +5301,8 @@ s2io_init_nic(struct pci_dev *pdev, cons config->max_txds = MAX_SKB_FRAGS; /* Rx side parameters. */ - rx_ring_sz[0] = SMALL_BLK_CNT; /* Default value. */ + if (rx_ring_sz[0] == 0) + rx_ring_sz[0] = SMALL_BLK_CNT; /* Default value. */ config->rx_ring_num = rx_ring_num; for (i = 0; i < MAX_RX_RINGS; i++) { config->rx_cfg[i].num_rxd = rx_ring_sz[i] * @@ -5310,7 +5332,7 @@ s2io_init_nic(struct pci_dev *pdev, cons /* initialize the shared memory used by the NIC and the host */ if (init_shared_mem(sp)) { DBG_PRINT(ERR_DBG, "%s: Memory allocation failed\n", - dev->name); + __FUNCTION__); ret = -ENOMEM; goto mem_alloc_failed; } @@ -5488,7 +5510,7 @@ s2io_init_nic(struct pci_dev *pdev, cons sp->def_mac_addr[0].mac_addr[3], sp->def_mac_addr[0].mac_addr[4], sp->def_mac_addr[0].mac_addr[5]); - int mode = s2io_print_pci_mode(sp); + mode = s2io_print_pci_mode(sp); if (mode < 0) { DBG_PRINT(ERR_DBG, " Unsupported PCI bus mode "); ret = -EBADSLT; From raghavendra.koushik@neterion.com Wed Aug 3 12:57:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 12:58:03 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73JvwH9001424 for ; Wed, 3 Aug 2005 12:57:59 -0700 Received: by linux.site (Postfix, from userid 0) id 
C20F898336; Wed, 3 Aug 2005 12:41:38 -0700 (PDT)
To: jgarzik@pobox.com, netdev@oss.sgi.com
Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com
From: raghavendra.koushik@neterion.com
Subject: [PATCH 2.6.13-rc4 13/13] S2io: Errors found during review
Message-Id: <20050803194138.C20F898336@linux.site>
Date: Wed, 3 Aug 2005 12:41:38 -0700 (PDT)
X-archive-position: 2847
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: raghavendra.koushik@neterion.com
Precedence: bulk
X-list: netdev
Content-Length: 6924
Lines: 195

Hi,

This patch incorporates review comments on the earlier 12 patches. It also fixes a few issues we found in the meantime. The changes in this patch are listed below. Item 1 incorporates an earlier comment; the issues addressed in items 2 to 4 were discovered recently.

1. The wmb() call in s2io_xmit() is replaced with mmiowb().
2. The dtx_control register was previously programmed incorrectly for the Xframe II adapter.
3. As suggested by the hardware team, after a reset of an Xframe II adapter we clear certain spurious errors by clearing the PCI-X ECC status register, the "detected parity error" bit in the PCI_STATUS register, and the PCI_STATUS bit in the txpic_int register.
4. On IBM PPC platforms, we found that in the Rx buffer replenish function, two memory writes (one to the descriptor length and another to the ownership bit) were getting reordered. This caused the adapter to see ownership transferred to it before the length was updated. One solution was to add a wmb(), but since this would turn out to be expensive on some platforms if called for every descriptor, we set the ownership bit and other fields of Rx descriptors '2' to 'N', follow that with a wmb(), and then set the ownership of the first descriptor ('1'). The value 'N' is configurable through a module load parameter (rxsync_frequency). (NOTE: This parameter is expressed as a power of 2.)
5. 
Bumped up the driver version no. to 2.0.2.1 Signed-off-by: Ravinandan Arakali Signed-off-by: Raghavendra Koushik --- diff -uprN vanilla_linux/drivers/net/s2io.c linux-2.6.13-rc4/drivers/net/s2io.c --- vanilla_linux/drivers/net/s2io.c 2005-08-02 07:47:21.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.c 2005-08-02 08:47:16.000000000 -0700 @@ -67,7 +67,7 @@ /* S2io Driver name & version. */ static char s2io_driver_name[] = "Neterion"; -static char s2io_driver_version[] = "Version 2.0.2.0"; +static char s2io_driver_version[] = "Version 2.0.2.1"; static inline int RXD_IS_UP2DT(RxD_t *rxdp) { @@ -301,6 +301,8 @@ static unsigned int bimodal = 0; #ifndef CONFIG_S2IO_NAPI static unsigned int indicate_max_pkts; #endif +/* Frequency of Rx desc syncs expressed as power of 2 */ +static unsigned int rxsync_frequency = 3; /* * S2IO device table. @@ -837,7 +839,7 @@ static int init_nic(struct s2io_nic *nic */ if (nic->device_type & XFRAME_II_DEVICE) { while (herc_act_dtx_cfg[dtx_cnt] != END_SIGN) { - SPECIAL_REG_WRITE(xena_dtx_cfg[dtx_cnt], + SPECIAL_REG_WRITE(herc_act_dtx_cfg[dtx_cnt], &bar0->dtx_control, UF); if (dtx_cnt & 0x1) msleep(1); /* Necessary!! 
*/ @@ -2083,6 +2085,7 @@ int fill_rx_buffers(struct s2io_nic *nic #ifndef CONFIG_S2IO_NAPI unsigned long flags; #endif + RxD_t *first_rxdp = NULL; mac_control = &nic->mac_control; config = &nic->config; @@ -2202,6 +2205,10 @@ int fill_rx_buffers(struct s2io_nic *nic if (!skb) { DBG_PRINT(ERR_DBG, "%s: Out of ", dev->name); DBG_PRINT(ERR_DBG, "memory to allocate SKBs\n"); + if (first_rxdp) { + wmb(); + first_rxdp->Control_1 |= RXD_OWN_XENA; + } return -ENOMEM; } #ifndef CONFIG_2BUFF_MODE @@ -2212,7 +2219,8 @@ int fill_rx_buffers(struct s2io_nic *nic rxdp->Control_2 &= (~MASK_BUFFER0_SIZE); rxdp->Control_2 |= SET_BUFFER0_SIZE(size); rxdp->Host_Control = (unsigned long) (skb); - rxdp->Control_1 |= RXD_OWN_XENA; + if (alloc_tab & ((1 << rxsync_frequency) - 1)) + rxdp->Control_1 |= RXD_OWN_XENA; off++; off %= (MAX_RXDS_PER_BLOCK + 1); mac_control->rings[ring_no].rx_curr_put_info.offset = off; @@ -2239,17 +2247,34 @@ int fill_rx_buffers(struct s2io_nic *nic rxdp->Control_2 |= SET_BUFFER1_SIZE(1); /* dummy. */ rxdp->Control_2 |= BIT(0); /* Set Buffer_Empty bit. */ rxdp->Host_Control = (u64) ((unsigned long) (skb)); - rxdp->Control_1 |= RXD_OWN_XENA; + if (alloc_tab & ((1 << rxsync_frequency) - 1)) + rxdp->Control_1 |= RXD_OWN_XENA; off++; mac_control->rings[ring_no].rx_curr_put_info.offset = off; #endif rxdp->Control_2 |= SET_RXD_MARKER; + if (!(alloc_tab & ((1 << rxsync_frequency) - 1))) { + if (first_rxdp) { + wmb(); + first_rxdp->Control_1 |= RXD_OWN_XENA; + } + first_rxdp = rxdp; + } atomic_inc(&nic->rx_bufs_left[ring_no]); alloc_tab++; } end: + /* Transfer ownership of first descriptor to adapter just before + * exiting. Before that, use memory barrier so that ownership + * and other fields are seen by adapter correctly. 
+ */ + if (first_rxdp) { + wmb(); + first_rxdp->Control_1 |= RXD_OWN_XENA; + } + return SUCCESS; } @@ -2783,16 +2808,16 @@ void s2io_reset(nic_t * sp) s2io_set_swapper(sp); /* Clear certain PCI/PCI-X fields after reset */ - pci_read_config_word(sp->pdev, PCI_COMMAND, &pci_cmd); - pci_cmd &= 0x7FFF; /* Clear parity err detect bit */ - pci_write_config_word(sp->pdev, PCI_COMMAND, pci_cmd); + if (sp->device_type == XFRAME_II_DEVICE) { + /* Clear parity err detect bit */ + pci_write_config_word(sp->pdev, PCI_STATUS, 0x8000); - val64 = readq(&bar0->txpic_int_reg); - val64 &= ~BIT(62); /* Clearing PCI_STATUS error reflected here */ - writeq(val64, &bar0->txpic_int_reg); + /* Clearing PCIX Ecc status register */ + pci_write_config_dword(sp->pdev, 0x68, 0x7C); - /* Clearing PCIX Ecc status register */ - pci_write_config_dword(sp->pdev, 0x68, 0); + /* Clearing PCI_STATUS error reflected here */ + writeq(BIT(62), &bar0->txpic_int_reg); + } /* Reset device statistics maintained by OS */ memset(&sp->stats, 0, sizeof (struct net_device_stats)); @@ -3168,8 +3193,6 @@ int s2io_xmit(struct sk_buff *skb, struc val64 = mac_control->fifos[queue].list_info[put_off].list_phy_addr; writeq(val64, &tx_fifo->TxDL_Pointer); - wmb(); - val64 = (TX_FIFO_LAST_TXD_NUM(frg_cnt) | TX_FIFO_FIRST_LIST | TX_FIFO_LAST_LIST); @@ -3179,6 +3202,8 @@ int s2io_xmit(struct sk_buff *skb, struc #endif writeq(val64, &tx_fifo->List_Control); + mmiowb(); + put_off++; put_off %= mac_control->fifos[queue].tx_curr_put_info.fifo_len + 1; mac_control->fifos[queue].tx_curr_put_info.offset = put_off; @@ -5172,6 +5197,7 @@ module_param(bimodal, bool, 0); #ifndef CONFIG_S2IO_NAPI module_param(indicate_max_pkts, int, 0); #endif +module_param(rxsync_frequency, int, 0); /** * s2io_init_nic - Initialization of the adapter . 
diff -uprN vanilla_linux/drivers/net/s2io.h linux-2.6.13-rc4/drivers/net/s2io.h --- vanilla_linux/drivers/net/s2io.h 2005-08-02 07:47:21.000000000 -0700 +++ linux-2.6.13-rc4/drivers/net/s2io.h 2005-08-02 07:45:38.000000000 -0700 @@ -13,11 +13,6 @@ #ifndef _S2IO_H #define _S2IO_H -/* Enable 2 buffer mode by default for SGI system */ -#ifdef CONFIG_IA64_SGI_SN2 -#define CONFIG_2BUFF_MODE -#endif - #define TBD 0 #define BIT(loc) (0x8000000000000000ULL >> (loc)) #define vBIT(val, loc, sz) (((u64)val) << (64-loc-sz)) From mkomu@twilight.cs.hut.fi Wed Aug 3 13:59:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 13:59:36 -0700 (PDT) Received: from twilight.cs.hut.fi (twilight.cs.hut.fi [130.233.40.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73KxTH9005575 for ; Wed, 3 Aug 2005 13:59:30 -0700 Received: by twilight.cs.hut.fi (Postfix, from userid 60001) id 1E2172D3D; Wed, 3 Aug 2005 23:57:26 +0300 (EEST) Received: from kekkonen.cs.hut.fi (kekkonen.cs.hut.fi [130.233.41.50]) by twilight.cs.hut.fi (Postfix) with ESMTP id A88042D16; Wed, 3 Aug 2005 23:57:25 +0300 (EEST) Received: (from mkomu@localhost) by kekkonen.cs.hut.fi (8.11.7p1+Sun/8.10.2) id j73KvOm07087; Wed, 3 Aug 2005 23:57:24 +0300 (EEST) Date: Wed, 3 Aug 2005 23:57:24 +0300 (EEST) From: Miika Komu X-X-Sender: mkomu@kekkonen.cs.hut.fi To: Diego Beltrami Cc: Herbert Xu , netdev@oss.sgi.com, infrahip@HIIT.FI, hipl-users@freelists.org, hipsec@ietf.org Subject: Re: [Hipsec] Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux In-Reply-To: <1122984099.1214.142.camel@odysse> Message-ID: References: <1122984099.1214.142.camel@odysse> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2848 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: miika@iki.fi Precedence: bulk X-list: netdev Content-Length: 2789 Lines: 56 On Tue, 2 Aug 2005, Diego Beltrami wrote: Hi Herbert and others, 
(sorry for the late comments - I am still on a holiday :) > after sending the first version of BEET patch and having received a > valuable feedback and after the discussion based upon the BEET design, > we now send the new BEET patch which allows for BEET to work without the > inter-family transform (i.e. inner address family different than outer > address family). > ... > > As it was originally designed the BEET patch at the moment works for > only ESP protocol. > As Pekka Nikader mentioned in one reply [1]: "[...] defining BEET mode > for AH might be pretty tricky. [...] it probably would require some > careful thinking to define the exact semantics, like what addresses > (inner or outer) are covered by the AH integrity protection, what does > the integrity protection really assert, etc. ". > > As previously written, the inter-family transform has been left out at > the moment since the xfrm architecture doesn't support it. As a result, > as soon as the xfrm architecture will be enhanced, the inter-family case > will be properly included as, for example, it can be useful for > supporting HIP over IPv4 network. But, as already mentioned, this would > require more work in properly designing the xfrm architecture (thing > which we consider necessary in order to make xfrm as generic as > possible). Based on the comments from Pekka Nikander, it seems to me that generalizing XFRM to support AH with different inner and outer families may not be very useful (a). On the other hand, supporting different inner and outer families for BEET is *extremely* useful (b). Excluding this support from BEET restricts HIP implementations and applications quite radically. My own logic tells me that (a) + (b) adds up to supporting different inner and outer families in BEET the way it is currently implemented. (Don't fix it if it ain't broken!) We have tested that BEET with different inner and outer addresses does not break anything.
Further, if you don't need it, you don't have to compile it in :) Later, if AH seems really useful with different inner and outer families, or a new XYZ header is introduced, we can refactor the architecture for greater modularity. Even then, the XFRM/PFKEY APIs should remain the same. So, I'd vote for the original BEET patch, but of course it is up to you to decide. The BEET support is the minimal support required from the kernel in order for a userspace HIP implementation to work, and it would make life much easier for both HIP implementors and users :) Additionally, we would also gain more experience from using mixed inner and outer families within the XFRM architecture. -- Miika Komu miika@iki.fi http://www.iki.fi/miika/ From herbert@gondor.apana.org.au Wed Aug 3 16:43:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 03 Aug 2005 16:43:13 -0700 (PDT) Received: from jay.exetel.com.au (jay.exetel.com.au [220.233.0.8]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j73Nh4H9019472 for ; Wed, 3 Aug 2005 16:43:07 -0700 Received: (qmail 22219 invoked by uid 507); 4 Aug 2005 09:40:58 +1000 Received: from 22.107.233.220.exetel.com.au (HELO arnor.apana.org.au) (220.233.107.22) by jay.exetel.com.au with SMTP; 4 Aug 2005 09:40:58 +1000 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1E0Sr3-0006Vd-00; Thu, 04 Aug 2005 09:40:57 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1E0Sqy-0003Rn-00; Thu, 04 Aug 2005 09:40:52 +1000 Date: Thu, 4 Aug 2005 09:40:52 +1000 To: Diego Beltrami Cc: netdev@oss.sgi.com, infrahip@HIIT.FI, hipl-users@freelists.org, hipsec@ietf.org Subject: Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux Message-ID: <20050803234052.GA13216@gondor.apana.org.au> References: <1122984099.1214.142.camel@odysse> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To:
<1122984099.1214.142.camel@odysse> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2849 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 867 Lines: 23 On Tue, Aug 02, 2005 at 03:01:39PM +0300, Diego Beltrami wrote: > > after sending the first version of BEET patch and having received a > valuable feedback and after the discussion based upon the BEET design, > we now send the new BEET patch which allows for BEET to work without the > inter-family transform (i.e. inner address family different than outer > address family). Thanks for the new patch. Unfortunately it really looks quite similar to the previous patch :) > --- linux-2.6.12.2-orig/net/ipv4/esp4.c > +++ linux-2.6.12.2/net/ipv4/esp4.c I thought getting rid of the interfamily stuff would remove the need to touch this file, no? Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From jb@suse.cz Thu Aug 4 05:11:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Aug 2005 05:11:26 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j74CBLH9014565 for ; Thu, 4 Aug 2005 05:11:22 -0700 Received: from dwarf.suse.cz (dwarf.suse.cz [10.20.1.32]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 4E71D628349; Thu, 4 Aug 2005 14:09:15 +0200 (CEST) Received: by dwarf.suse.cz (Postfix, from userid 10013) id 1B2D812F115; Thu, 4 Aug 2005 14:09:14 +0200 (CEST) Date: Thu, 4 Aug 2005 14:09:14 +0200 From: Jirka Bohac To: Mateusz Berezecki Cc: netdev@oss.sgi.com, jbenc@suse.cz Subject: Re: Fwd: opensource atheros driver almost done Message-ID: <20050804120914.GA1905@dwarf.suse.cz> References: Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6i X-archive-position: 2850 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbohac@suse.cz Precedence: bulk X-list: netdev Content-Length: 725 Lines: 24 Hi, > I am developing that driver using ieee80211 branch of netdev > kernel tree, and I am not so confident I do things right using > new ieee80211 api. I would appreciate any suggestion and review > of the source code. please have a look at http://forge.novell.com/modules/xfmod/cvs/cvsbrowse.php/ieee80211/patches-upstream/ These patches are likely to be merged by Jeff soon (apply using quilt or use the order in the series file) Also, you can have a look at http://forge.novell.com/modules/xfmod/cvs/cvsbrowse.php/ieee80211/patches-netdev/ These patches break a lot of things, but can tell you more about the ongoing development of the ieee80211 layer. Regards, -- Jirka Bohac SUSE Labs, SUSE CR From herbert@gondor.apana.org.au Thu Aug 4 06:17:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Aug 2005 06:17:42 -0700 (PDT) Received: from jay.exetel.com.au (jay.exetel.com.au [220.233.0.8]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j74DHZH9019103 for ; Thu, 4 Aug 2005 06:17:36 -0700 Received: (qmail 26834 invoked by uid 507); 4 Aug 2005 23:15:29 +1000 Received: from 22.107.233.220.exetel.com.au (HELO arnor.apana.org.au) (220.233.107.22) by jay.exetel.com.au with SMTP; 4 Aug 2005 23:15:29 +1000 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1E0fZG-0002A3-00; Thu, 04 Aug 2005 23:15:26 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1E0fZ9-0001Yw-00; Thu, 04 Aug 2005 23:15:19 +1000 Date: Thu, 4 Aug 2005 23:15:19 +1000 To: Miika Komu Cc: Diego Beltrami , netdev@oss.sgi.com, infrahip@HIIT.FI, hipl-users@freelists.org, 
hipsec@ietf.org Subject: Re: [Hipsec] Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux Message-ID: <20050804131519.GB5831@gondor.apana.org.au> References: <1122984099.1214.142.camel@odysse> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2851 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1782 Lines: 42 Hi Miika: On Wed, Aug 03, 2005 at 11:57:24PM +0300, Miika Komu wrote: > > Based on the comments from Pekka Nikander, it seems like to me that > generalizing XFRM to support AH with different inner and outer families > may not very useful (a). On the other hand, the different inner and outer Well to me it's more of an issue of maintainability. BEET mode is more akin to transport/tunnel mode than AH/ESP/IPcomp. As such its implementation would be most at home where the existing encapsulation and decapsulation for transport/tunnel mode is done. That is, in xfrm[46]_input.c and xfrm[46]_output.c. For instance, the reason the current patch has to touch esp4.c at all is really because the patch to xfrm4_output.c isn't right. It should do what the comment says and set skb->h to the start of the payload, not the start of the ESP header. If it did that, then esp_output doesn't have to care about BEET at all. Also, the outer header generation should be done before x->type->output is called, not after. That way, the AH semantics falls out quite naturally. > families for BEET is *extremely* useful (b). Excluding this support from > BEET restricts the HIP implementations and applications quite radically. I agree with you wholeheartedly that this is extremely useful. However, I also see nothing that's BEET-specific about this feature. 
So for the sake of the overall consistency of the IPsec stack please keep the implementation generic instead of BEET-specific. That is, please do it in a way so that it applies to plain tunnel mode as well. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From diego.beltrami@hiit.fi Thu Aug 4 07:40:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Aug 2005 07:40:39 -0700 (PDT) Received: from pegasus.hiit.fi (pegasus.hiit.fi [212.68.1.186]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j74EeZH9024552 for ; Thu, 4 Aug 2005 07:40:36 -0700 Received: from [128.214.113.174] (odysse.hiit.fi [128.214.113.174]) by pegasus.hiit.fi (Postfix) with ESMTP id 5C9A6220082; Thu, 4 Aug 2005 17:38:31 +0300 (EEST) Message-ID: <42F22867.9050804@hiit.fi> Date: Thu, 04 Aug 2005 17:38:31 +0300 From: Diego Beltrami User-Agent: Mozilla Thunderbird 1.0.2-0.fdr.1.2 (X11/20050514) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Herbert Xu Cc: Miika Komu , netdev@oss.sgi.com, infrahip@hiit.fi, hipl-users@freelists.org, hipsec@ietf.org Subject: Re: [PATCH 2.6.12.2] XFRM: BEET IPsec mode for Linux References: <1122984099.1214.142.camel@odysse> <20050804131519.GB5831@gondor.apana.org.au> In-Reply-To: <20050804131519.GB5831@gondor.apana.org.au> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2852 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: diego.beltrami@hiit.fi Precedence: bulk X-list: netdev Content-Length: 1757 Lines: 37 > Well to me it's more of an issue of maintainability. BEET mode is > more akin to transport/tunnel mode than AH/ESP/IPcomp. As such its > implementation would be most at home where the existing encapsulation > and decapsulation for transport/tunnel mode is done. 
That is, in > xfrm[46]_input.c and xfrm[46]_output.c. > > For instance, the reason the current patch has to touch esp4.c at > all is really because the patch to xfrm4_output.c isn't right. > It should do what the comment says and set skb->h to the start > of the payload, not the start of the ESP header. If it did that, > then esp_output doesn't have to care about BEET at all. This is totally true, and I agree with you, but it is somewhat at odds with esp6_output. In fact esp6_output has the same purpose as esp_output, but it requires skb->h to be set at the beginning of the ESP header. > > Also, the outer header generation should be done before > x->type->output is called, not after. That way, the AH > semantics falls out quite naturally. BEET has been designed to be compatible with HIP. This means that the ESP header should be computed with respect to the inner addresses. In the very first implementation of BEET we converted the inner addresses to the outer addresses before x->type->output, but we couldn't make BEET interoperate with HIP. That's the reason why the outer header generation has been placed after x->type->output. This is one of the reasons why AH, as Pekka Nikander said, is a bit trickier than ESP (the AH protocol protects the IP datagram including immutable parts of the IP header such as the IP addresses, whereas for ESP the IP header is not included in the calculation).
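To make the ESP/AH difference above concrete, here is a toy sketch in C. It is not kernel code and not how xfrm computes anything: `toy_pkt`, `toy_ah_icv`, `toy_esp_icv` and `toy_rewrite_addrs` are made-up names, and FNV-1a merely stands in for a real keyed integrity check. It only illustrates which bytes each protocol's check covers, and therefore why rewriting inner addresses to outer addresses after the transform is harmless for ESP but not for AH.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy packet: only the fields that matter for the argument (hypothetical). */
struct toy_pkt {
	uint32_t saddr;
	uint32_t daddr;
	uint8_t payload[16];
};

#define TOY_FNV_OFFSET 2166136261u
#define TOY_FNV_PRIME  16777619u

/* FNV-1a, used here only as a stand-in for a keyed ICV. NOT cryptographic. */
static uint32_t toy_hash(uint32_t h, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--) {
		h ^= *p++;
		h *= TOY_FNV_PRIME;
	}
	return h;
}

/* AH-like check: covers the IP addresses as well as the payload. */
static uint32_t toy_ah_icv(const struct toy_pkt *pkt)
{
	uint32_t h = TOY_FNV_OFFSET;

	h = toy_hash(h, &pkt->saddr, sizeof(pkt->saddr));
	h = toy_hash(h, &pkt->daddr, sizeof(pkt->daddr));
	return toy_hash(h, pkt->payload, sizeof(pkt->payload));
}

/* ESP-like check: the IP header (and so the addresses) is not covered. */
static uint32_t toy_esp_icv(const struct toy_pkt *pkt)
{
	return toy_hash(TOY_FNV_OFFSET, pkt->payload, sizeof(pkt->payload));
}

/* BEET-like step: replace the inner addresses with the outer ones. */
static void toy_rewrite_addrs(struct toy_pkt *pkt, uint32_t outer_saddr,
			      uint32_t outer_daddr)
{
	pkt->saddr = outer_saddr;
	pkt->daddr = outer_daddr;
}
```

In this model, calling `toy_rewrite_addrs()` after the check has been computed leaves `toy_esp_icv()` unchanged, while `toy_ah_icv()` now depends on whether the check was taken over the inner or the outer addresses, which is exactly the semantic question raised for AH.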
--Diego From mateuszb@gmail.com Thu Aug 4 07:46:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Aug 2005 07:46:16 -0700 (PDT) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.199]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j74Ek9H9025405 for ; Thu, 4 Aug 2005 07:46:12 -0700 Received: by rproxy.gmail.com with SMTP id z35so346476rne for ; Thu, 04 Aug 2005 07:44:03 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:user-agent:x-accept-language:mime-version:to:subject:content-type:content-transfer-encoding; b=nXYbhy8e3HtWlYgdNNuGi7DBdGA0oqgUQIx9kZBJv7/0bCN1CEpAVCaCY8Vn6iTem7tdLqpv1qM1YMGvbH7ftJG9yz6E+cTM8blA4KtWI2kfO++5epsInBhJiv23ayvER4uh7Yk/R0j3YChTP24kMaCDku8HCcsPVBKO2BRirNY= Received: by 10.38.97.9 with SMTP id u9mr864228rnb; Thu, 04 Aug 2005 07:44:03 -0700 (PDT) Received: from ?192.168.0.56? ([82.139.13.231]) by mx.gmail.com with ESMTP id 70sm561025rnb.2005.08.04.07.44.02; Thu, 04 Aug 2005 07:44:02 -0700 (PDT) Message-ID: <42F22AD5.3020600@gmail.com> Date: Thu, 04 Aug 2005 16:48:53 +0200 From: Mateusz Berezecki User-Agent: Mozilla Thunderbird 1.0.5 (X11/20050719) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev Subject: latest netdev tree - (broadcom44 bug?) letting you know... Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2853 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mateuszb@gmail.com Precedence: bulk X-list: netdev Content-Length: 329 Lines: 10 Hi list readers I am using the netdev tree (ieee80211 branch), fetched using git 2 days ago, and what worries me is that my b44 card stopped working. I don't know if that's a driver issue or if something has changed inside the network subsystem. I am trying to track down the changes and will hopefully post the result.
regards Mateusz From flamingice@sourmilk.net Thu Aug 4 11:05:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Aug 2005 11:05:19 -0700 (PDT) Received: from server8.totalchoicehosting.com (server8.totalchoicehosting.com [216.180.241.250]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j74I57H9013456 for ; Thu, 4 Aug 2005 11:05:08 -0700 Received: from host-24-225-148-91.patmedia.net ([24.225.148.91] helo=[192.168.0.135]) by server8.totalchoicehosting.com with esmtpsa (TLSv1:RC4-MD5:128) (Exim 4.44) id 1E0k3b-00038l-M1; Thu, 04 Aug 2005 14:03:03 -0400 From: Michael Wu To: Jeff Garzik Subject: [PATCH ieee80211] ieee80211.h: minor changes to header Date: Thu, 4 Aug 2005 14:02:48 -0400 User-Agent: KMail/1.8.2 Cc: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200508041402.48868.flamingice@sourmilk.net> X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server8.totalchoicehosting.com X-AntiAbuse: Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - sourmilk.net X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 2854 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: flamingice@sourmilk.net Precedence: bulk X-list: netdev Content-Length: 3571 Lines: 136 Hi, This patch: - fixes misc. 
whitespace/comments - replaces u16 with __le16/__be16 where appropriate Signed-off-by: Michael Wu diff --git a/include/net/ieee80211.h b/include/net/ieee80211.h --- a/include/net/ieee80211.h +++ b/include/net/ieee80211.h @@ -47,22 +47,22 @@ #define IEEE80211_FRAME_LEN (IEEE80211_DATA_LEN + IEEE80211_HLEN) struct ieee80211_hdr { - u16 frame_ctl; - u16 duration_id; + __le16 frame_ctl; + __le16 duration_id; u8 addr1[ETH_ALEN]; u8 addr2[ETH_ALEN]; u8 addr3[ETH_ALEN]; - u16 seq_ctl; + __le16 seq_ctl; u8 addr4[ETH_ALEN]; } __attribute__ ((packed)); struct ieee80211_hdr_3addr { - u16 frame_ctl; - u16 duration_id; + __le16 frame_ctl; + __le16 duration_id; u8 addr1[ETH_ALEN]; u8 addr2[ETH_ALEN]; u8 addr3[ETH_ALEN]; - u16 seq_ctl; + __le16 seq_ctl; } __attribute__ ((packed)); enum eap_type { @@ -88,10 +88,10 @@ static inline const char *eap_get_type(i struct eapol { u8 snap[6]; - u16 ethertype; + __be16 ethertype; u8 version; u8 type; - u16 length; + __be16 length; } __attribute__ ((packed)); #define IEEE80211_1ADDR_LEN 10 @@ -223,9 +223,9 @@ do { if (ieee80211_debug_level & (level) #include /* ARPHRD_ETHER */ #ifndef WIRELESS_SPY -#define WIRELESS_SPY // enable iwspy support +#define WIRELESS_SPY /* enable iwspy support */ #endif -#include // new driver API +#include /* new driver API */ #ifndef ETH_P_PAE #define ETH_P_PAE 0x888E /* Port Access Entity (IEEE 802.1X) */ @@ -520,9 +520,9 @@ struct ieee80211_info_element { struct ieee80211_authentication { struct ieee80211_hdr_3addr header; - u16 algorithm; - u16 transaction; - u16 status; + __le16 algorithm; + __le16 transaction; + __le16 status; struct ieee80211_info_element info_element; } __attribute__ ((packed)); @@ -530,23 +530,23 @@ struct ieee80211_authentication { struct ieee80211_probe_response { struct ieee80211_hdr_3addr header; u32 time_stamp[2]; - u16 beacon_interval; - u16 capability; + __le16 beacon_interval; + __le16 capability; struct ieee80211_info_element info_element; } __attribute__ ((packed)); struct 
ieee80211_assoc_request_frame { - u16 capability; - u16 listen_interval; + __le16 capability; + __le16 listen_interval; u8 current_ap[ETH_ALEN]; struct ieee80211_info_element info_element; } __attribute__ ((packed)); struct ieee80211_assoc_response_frame { struct ieee80211_hdr_3addr header; - u16 capability; - u16 status; - u16 aid; + __le16 capability; + __le16 status; + __le16 aid; struct ieee80211_info_element info_element; /* supported rates */ } __attribute__ ((packed)); @@ -561,7 +561,7 @@ struct ieee80211_txb { }; -/* SWEEP TABLE ENTRIES NUMBER*/ +/* SWEEP TABLE ENTRIES NUMBER */ #define MAX_SWEEP_TAB_ENTRIES 42 #define MAX_SWEEP_TAB_ENTRIES_PER_PACKET 7 /* MAX_RATES_LENGTH needs to be 12. The spec says 8, and many APs @@ -791,8 +791,6 @@ extern struct net_device *alloc_ieee8021 extern int ieee80211_set_encryption(struct ieee80211_device *ieee); /* ieee80211_tx.c */ - - extern int ieee80211_xmit(struct sk_buff *skb, struct net_device *dev); extern void ieee80211_txb_free(struct ieee80211_txb *); @@ -805,7 +803,7 @@ extern void ieee80211_rx_mgt(struct ieee struct ieee80211_hdr *header, struct ieee80211_rx_stats *stats); -/* iee80211_wx.c */ +/* ieee80211_wx.c */ extern int ieee80211_wx_get_scan(struct ieee80211_device *ieee, struct iw_request_info *info, union iwreq_data *wrqu, char *key); From jgarzik@pobox.com Thu Aug 4 13:33:38 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 04 Aug 2005 13:33:41 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j74KXcH9026265 for ; Thu, 4 Aug 2005 13:33:38 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E0mNI-0001Er-ED; Thu, 04 Aug 2005 20:31:32 +0000 Message-ID: <42F27B22.6050001@pobox.com> Date: Thu, 04 Aug 2005 16:31:30 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) 
X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mateusz Berezecki CC: netdev Subject: Re: latest netdev tree - (broadcom44 bug?) letting you know... References: <42F22AD5.3020600@gmail.com> In-Reply-To: <42F22AD5.3020600@gmail.com> Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2855 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 413 Lines: 15 Mateusz Berezecki wrote: > Hi list readers > > I am using netdev treee (ieee80211 branch) fetched using git 2 days ago > and what worries me is that > my b44 card stopped working. I dont know if that's driver issue or > something has changed inside the network subsystem. > I am trying to track down the changes and hopefully will post the result. Nothing in that tree has changed the b44 driver... Jeff From Amine.Elkchaou@int-evry.fr Fri Aug 5 03:06:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Aug 2005 03:06:07 -0700 (PDT) Received: from massilia.int-evry.fr (massilia.int-evry.fr [157.159.10.13]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j75A61H9023193 for ; Fri, 5 Aug 2005 03:06:02 -0700 Received: from massilia.int-evry.fr (localhost.localdomain [127.0.0.1]) by massilia.int-evry.fr (8.12.11/8.12.11) with ESMTP id j75A3uD2018061 for ; Fri, 5 Aug 2005 12:03:56 +0200 Received: (from apache@localhost) by massilia.int-evry.fr (8.12.11/8.12.11/Submit) id j75A3qg9018053 for netdev@oss.sgi.com; Fri, 5 Aug 2005 12:03:52 +0200 X-Authentication-Warning: massilia.int-evry.fr: apache set sender to Amine.Elkchaou@int-evry.fr using -f Received: from 84.98.246.10 ([84.98.246.10]) by imp.int-evry.fr (IMP) with HTTP for ; Fri, 5 Aug 2005 12:03:52 +0200 Message-ID: <1123236232.42f339885926a@imp.int-evry.fr> Date: Fri, 5 Aug 2005 12:03:52 +0200 From: Amine ELKCHAOU To: netdev@oss.sgi.com Subject: PHP 
netem MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.2.5 X-Originating-IP: 84.98.246.10 X-archive-position: 2856 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Amine.Elkchaou@int-evry.fr Precedence: bulk X-list: netdev Content-Length: 333 Lines: 10 Hello, I heard about a user interface for netem developed in PHP, and I'm interested in it, but I can't find it. If you have a link pointing to it, it would be great if you could send it to me. Thanks. ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. From wfarnsworth@mvista.com Fri Aug 5 16:34:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Aug 2005 16:34:19 -0700 (PDT) Received: from av.mvista.com (gateway-1237.mvista.com [12.44.186.158]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j75NYFH9028016 for ; Fri, 5 Aug 2005 16:34:15 -0700 Received: from rhino.az.mvista.com (av [127.0.0.1]) by av.mvista.com (8.9.3/8.9.3) with ESMTP id QAA29211; Fri, 5 Aug 2005 16:32:05 -0700 Subject: [PATCH] emac: add support for platform-specific unsupported PHY features From: Wade Farnsworth To: jgarzik@pobox.com, netdev@oss.sgi.com, Matt Porter Cc: Eugene Surovegin Content-Type: multipart/mixed; boundary="=-H6kjIo0tyo+58bZb74rl" Message-Id: <1123284725.27880.26.camel@rhino.az.mvista.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.3.92 (Preview Release) Date: 05 Aug 2005 16:32:05 -0700 X-archive-position: 2857 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wfarnsworth@mvista.com Precedence: bulk X-list: netdev Content-Length: 2502 Lines: 78 --=-H6kjIo0tyo+58bZb74rl Content-Type: text/plain Content-Transfer-Encoding: 7bit Hello, This patch adds support to the ibm_emac driver for
platform-specific unsupported PHY features. The patch attempts to determine the highest speed and duplex when autonegotiation is unsupported. -Wade Farnsworth Signed-off-by: Wade Farnsworth --=-H6kjIo0tyo+58bZb74rl Content-Disposition: attachment; filename=ibm-emac-phy-feat-unsupp-core.patch Content-Type: text/x-patch; name=ibm-emac-phy-feat-unsupp-core.patch; charset=us-ascii Content-Transfer-Encoding: 7bit diff -upr linux-2.6/drivers/net/ibm_emac/ibm_emac_core.c linux-2.6-dev/drivers/net/ibm_emac/ibm_emac_core.c --- linux-2.6/drivers/net/ibm_emac/ibm_emac_core.c 2005-08-03 13:33:42.000000000 -0700 +++ linux-2.6-dev/drivers/net/ibm_emac/ibm_emac_core.c 2005-08-02 10:42:59.000000000 -0700 @@ -1876,6 +1876,9 @@ static int emac_init_device(struct ocp_d rc = -ENODEV; goto bail; } + + /* Disable any PHY features not supported by the platform */ + ep->phy_mii.def->features &= ~emacdata->feat_unsupp; /* Setup initial PHY config & startup aneg */ if (ep->phy_mii.def->ops->init) @@ -1883,6 +1886,38 @@ static int emac_init_device(struct ocp_d netif_carrier_off(ndev); if (ep->phy_mii.def->features & SUPPORTED_Autoneg) ep->want_autoneg = 1; + else { + ep->want_autoneg = 0; + + /* Select highest supported speed/duplex */ + if (ep->phy_mii.def->features & SUPPORTED_10000baseT_Full) { + ep->phy_mii.speed = SPEED_10000; + ep->phy_mii.duplex = DUPLEX_FULL; + } else if (ep->phy_mii.def->features & + SUPPORTED_1000baseT_Full) { + ep->phy_mii.speed = SPEED_1000; + ep->phy_mii.duplex = DUPLEX_FULL; + } else if (ep->phy_mii.def->features & + SUPPORTED_1000baseT_Half) { + ep->phy_mii.speed = SPEED_1000; + ep->phy_mii.duplex = DUPLEX_HALF; + } else if (ep->phy_mii.def->features & + SUPPORTED_100baseT_Full) { + ep->phy_mii.speed = SPEED_100; + ep->phy_mii.duplex = DUPLEX_FULL; + } else if (ep->phy_mii.def->features & + SUPPORTED_100baseT_Half) { + ep->phy_mii.speed = SPEED_100; + ep->phy_mii.duplex = DUPLEX_HALF; + } else if (ep->phy_mii.def->features & + SUPPORTED_10baseT_Full) { + 
ep->phy_mii.speed = SPEED_10; + ep->phy_mii.duplex = DUPLEX_FULL; + } else { + ep->phy_mii.speed = SPEED_10; + ep->phy_mii.duplex = DUPLEX_HALF; + } + } emac_start_link(ep, NULL); /* read the MAC Address */ --=-H6kjIo0tyo+58bZb74rl-- From ebs@ebshome.net Fri Aug 5 17:07:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 05 Aug 2005 17:07:25 -0700 (PDT) Received: from gate.ebshome.net (gate.ebshome.net [64.81.67.12]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7607LH9030319 for ; Fri, 5 Aug 2005 17:07:21 -0700 Received: (qmail 19323 invoked by uid 1000); 5 Aug 2005 17:05:16 -0700 Date: Fri, 5 Aug 2005 17:05:16 -0700 From: Eugene Surovegin To: Wade Farnsworth Cc: jgarzik@pobox.com, netdev@oss.sgi.com, Matt Porter Subject: Re: [PATCH] emac: add support for platform-specific unsupported PHY features Message-ID: <20050806000516.GA19218@gate.ebshome.net> Mail-Followup-To: Wade Farnsworth , jgarzik@pobox.com, netdev@oss.sgi.com, Matt Porter References: <1123284725.27880.26.camel@rhino.az.mvista.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1123284725.27880.26.camel@rhino.az.mvista.com> X-ICQ-UIN: 1193073 X-Operating-System: Linux i686 X-PGP-Key: http://www.ebshome.net/pubkey.asc User-Agent: Mutt/1.5.5.1i X-archive-position: 2858 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebs@ebshome.net Precedence: bulk X-list: netdev Content-Length: 1682 Lines: 50 On Fri, Aug 05, 2005 at 04:32:05PM -0700, Wade Farnsworth wrote: > Hello, > > This patch adds support to the ibm_emac driver for platform-specific > unsupported PHY features. > > The patch attempts to determine the highest speed and duplex when > autonegotiation is unsupported. Looks good. 
> > -Wade Farnsworth > > Signed-off-by: Wade Farnsworth > diff -upr linux-2.6/drivers/net/ibm_emac/ibm_emac_core.c linux-2.6-dev/drivers/net/ibm_emac/ibm_emac_core.c > --- linux-2.6/drivers/net/ibm_emac/ibm_emac_core.c 2005-08-03 13:33:42.000000000 -0700 > +++ linux-2.6-dev/drivers/net/ibm_emac/ibm_emac_core.c 2005-08-02 10:42:59.000000000 -0700 > @@ -1876,6 +1876,9 @@ static int emac_init_device(struct ocp_d > rc = -ENODEV; > goto bail; > } > + > + /* Disable any PHY features not supported by the platform */ > + ep->phy_mii.def->features &= ~emacdata->feat_unsupp; > > /* Setup initial PHY config & startup aneg */ > if (ep->phy_mii.def->ops->init) > @@ -1883,6 +1886,38 @@ static int emac_init_device(struct ocp_d > netif_carrier_off(ndev); > if (ep->phy_mii.def->features & SUPPORTED_Autoneg) > ep->want_autoneg = 1; > + else { > + ep->want_autoneg = 0; > + > + /* Select highest supported speed/duplex */ > + if (ep->phy_mii.def->features & SUPPORTED_10000baseT_Full) { > + ep->phy_mii.speed = SPEED_10000; > + ep->phy_mii.duplex = DUPLEX_FULL; I think you are being too optimistic here :). EMAC doesn't support 10G Ethernet and never will (at least sanely) given its brain-damaged design. So I think it's safe to drop the SUPPORTED_10000baseT_Full test. I'll update my NAPI tree with similar code. 
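For readers following the thread, here is a minimal user-space sketch of the "pick the highest supported speed/duplex when autoneg is unavailable" logic, written as a table walk with no 10G entry. It is not the driver code: the EX_* bit values, struct ex_link_mode and ex_pick_forced_mode() are all hypothetical names invented for this illustration, standing in for the SUPPORTED_*/SPEED_*/DUPLEX_* constants and the platform exclusion mask used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits for illustration only; the real driver
 * uses the SUPPORTED_* masks from <linux/ethtool.h>. */
#define EX_10_HALF    (1u << 0)
#define EX_10_FULL    (1u << 1)
#define EX_100_HALF   (1u << 2)
#define EX_100_FULL   (1u << 3)
#define EX_1000_HALF  (1u << 4)
#define EX_1000_FULL  (1u << 5)

#define EX_DUPLEX_HALF 0
#define EX_DUPLEX_FULL 1

struct ex_link_mode {
	uint32_t feature;  /* feature bit that enables this mode */
	int speed;         /* Mb/s */
	int duplex;        /* EX_DUPLEX_HALF or EX_DUPLEX_FULL */
};

/* Ordered from most to least preferred; no 10G entry. */
static const struct ex_link_mode ex_modes[] = {
	{ EX_1000_FULL, 1000, EX_DUPLEX_FULL },
	{ EX_1000_HALF, 1000, EX_DUPLEX_HALF },
	{ EX_100_FULL,   100, EX_DUPLEX_FULL },
	{ EX_100_HALF,   100, EX_DUPLEX_HALF },
	{ EX_10_FULL,     10, EX_DUPLEX_FULL },
};

/* Mask out what the platform cannot do, then take the first match. */
static struct ex_link_mode ex_pick_forced_mode(uint32_t phy_features,
					       uint32_t platform_excluded)
{
	uint32_t features = phy_features & ~platform_excluded;
	size_t i;

	for (i = 0; i < sizeof(ex_modes) / sizeof(ex_modes[0]); i++)
		if (features & ex_modes[i].feature)
			return ex_modes[i];

	/* Same last resort as the patch: 10 Mb/s, half duplex. */
	return (struct ex_link_mode){ EX_10_HALF, 10, EX_DUPLEX_HALF };
}
```

Whether a table or the if/else chain from the patch reads better in the driver is a matter of taste; the table only makes the priority order explicit.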
-- Eugene From manfred@colorfullife.com Sat Aug 6 14:50:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 06 Aug 2005 14:50:14 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j76Lo9H9010206 for ; Sat, 6 Aug 2005 14:50:10 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j76LrUm2002557; Sat, 6 Aug 2005 23:53:31 +0200 Message-ID: <42F5300B.3070909@colorfullife.com> Date: Sat, 06 Aug 2005 23:47:55 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.5.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: Ayaz Abdulla , Netdev , cOzmIc.FI@gmx.net Subject: [PATCH] forcedeth: Initialize link settings in every nv_open() Content-Type: multipart/mixed; boundary="------------070304080809040805050905" X-archive-position: 2860 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 1889 Lines: 49 This is a multi-part message in MIME format. --------------070304080809040805050905 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Rüdiger found a bug in nv_open that explains some of the reports with duplex mismatches: nv_open calls nv_update_linkspeed for initializing the hardware link speed registers. If the current link setting matches the values in np->linkspeed and np->duplex, then the function does nothing. Usually, doing nothing is the right thing, but not in nv_open: During nv_open, the registers must be initialized because the nic was reset. The attached patch fixes that by setting np->linkspeed to an invalid value before calling nv_update_linkspeed from nv_open. 
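The failure mode described above can be reproduced in a small user-space model. This is only a sketch under stated assumptions, not forcedeth code: struct ex_nic, ex_reset(), ex_update_linkspeed() and ex_open() are hypothetical names, and "hardware" here is just a struct field that a reset clears while the driver's cached copy survives.

```c
#include <stdbool.h>

/* Toy model: hw_speed plays the role of the link speed registers,
 * cached_speed plays the role of the driver's cached copy. */
struct ex_nic {
	int hw_speed;      /* what the simulated registers hold */
	int cached_speed;  /* driver's cached copy; 0 means invalid */
	int hw_writes;     /* how often the registers were programmed */
};

/* A reset wipes the registers but not the driver's cached state. */
static void ex_reset(struct ex_nic *nic)
{
	nic->hw_speed = 0;
}

/* Programs the registers only when the cached value differs. */
static bool ex_update_linkspeed(struct ex_nic *nic, int negotiated)
{
	if (nic->cached_speed == negotiated)
		return false;          /* "nothing changed", skip the write */

	nic->cached_speed = negotiated;
	nic->hw_speed = negotiated;
	nic->hw_writes++;
	return true;
}

/* apply_fix mirrors the idea of the patch: invalidate the cache first. */
static void ex_open(struct ex_nic *nic, int negotiated, bool apply_fix)
{
	ex_reset(nic);
	if (apply_fix)
		nic->cached_speed = 0; /* force the next update to write */
	ex_update_linkspeed(nic, negotiated);
}
```

Without the invalidation, the second ex_open() at the same speed leaves hw_speed at 0, which is the stale-register situation behind the duplex mismatch reports; with it, every open reprograms the registers.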
Signed-Off-By: Manfred Spraul --------------070304080809040805050905 Content-Type: text/plain; name="patch-forcedeth-042-forcelinkinit" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-042-forcelinkinit" --- 2.6/drivers/net/forcedeth.c 2005-08-06 19:59:56.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-08-06 19:59:06.000000000 +0200 @@ -93,6 +93,8 @@ * 0.40: 19 Jul 2005: Add support for mac address change. * 0.41: 30 Jul 2005: Write back original MAC in nv_close instead * of nv_remove + * 0.42: 06 Aug 2005: Fix lack of link speed initialization + * in the second (and later) nv_open call * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -2178,6 +2180,9 @@ writel(NVREG_MIISTAT_MASK, base + NvRegMIIStatus); dprintk(KERN_INFO "startup: got 0x%08x.\n", miistat); } + /* set linkspeed to invalid value, thus force nv_update_linkspeed + * to init hw */ + np->linkspeed = 0; ret = nv_update_linkspeed(dev); nv_start_rx(dev); nv_start_tx(dev); --------------070304080809040805050905-- From ravinandan.arakali@neterion.com Sun Aug 7 13:53:06 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 07 Aug 2005 13:53:10 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j77Kr5H9014589 for ; Sun, 7 Aug 2005 13:53:06 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j77Korcx014489; Sun, 7 Aug 2005 16:50:53 -0400 (EDT) Received: from rarakali ([10.16.16.72]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j77KopKP005377; Sun, 7 Aug 2005 16:50:51 -0400 (EDT) From: "Ravinandan Arakali" To: "'Jeff Garzik'" , Cc: , , Subject: RE: [PATCH 2.6.12.1 1/12] S2io: Code cleanup Date: Sun, 7 Aug 2005 13:51:52 -0700 Message-ID: <000001c59b91$d7f07370$4810100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" 
Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) Importance: Normal In-Reply-To: <42EC5D96.3050304@pobox.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2864 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 1028 Lines: 35 Jeff, The entire set of patches have been resent and an additional patch13 to address earlier comments. Pls confirm if these patches apply correctly. Thanks, Ravi -----Original Message----- From: Jeff Garzik [mailto:jgarzik@pobox.com] Sent: Saturday, July 30, 2005 10:12 PM To: raghavendra.koushik@neterion.com Cc: netdev@oss.sgi.com; ravinandan.arakali@neterion.com; leonid.grossman@neterion.com; rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 1/12] S2io: Code cleanup patch doesn't seem to apply :( Can you please resend the entire series, taking into account the comments WRT patch #5? Also, I was unable to include your fixes in my 'fixes' branch, whose speed to upstream kernel is accelerated, because patch #1 was not bug fixes. If you want your bug fixes to go upstream as rapidly as possible, make sure they are ordered before the code cleanups and new features. This allows me to send the fixes upstream immediately, while allowing further review and testing of the cleanup/feature patches. 
Jeff From ravinandan.arakali@neterion.com Mon Aug 8 17:44:02 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Aug 2005 17:44:13 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j790i1H9016929 for ; Mon, 8 Aug 2005 17:44:01 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j790etcx019903; Mon, 8 Aug 2005 20:40:55 -0400 (EDT) Received: from rarakali ([10.16.16.72]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j790eqKP020960; Mon, 8 Aug 2005 20:40:52 -0400 (EDT) From: "Ravinandan Arakali" To: "'David S. Miller'" Cc: , , , , , Subject: default directive in Kconfig(subject modified) Date: Mon, 8 Aug 2005 17:40:45 -0700 Message-ID: <002801c59c7a$fb660ec0$4810100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 In-Reply-To: Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2866 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 2011 Lines: 60 Hi, Can somebody throw light on the below subject ? We have S2io configured as a module and 2buff mode is one of the suboptions under S2io. But a directive such as "default y" does not seem to enable 2buff mode. Thanks, Ravi -----Original Message----- From: Ravinandan Arakali [mailto:ravinandan.arakali@neterion.com] Sent: Friday, July 29, 2005 9:38 AM To: 'David S. 
Miller' Cc: 'hch@infradead.org'; 'raghavendra.koushik@neterion.com'; 'jgarzik@pobox.com'; 'netdev@oss.sgi.com'; 'leonid.grossman@neterion.com'; 'rapuru.sriram@neterion.com' Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements David, We are trying to use the "default" directive in Kconfig. We tried using an unconditional directive(just to test it out) such as "default y" and a conditional one such as "default y if CONFIG_IA64_SGI_SN2". But when we run "make menuconfig", it does not seem to pickup any of these changes from Kconfig. Any idea what we might be missing ? Once this is fixed, we'll send out a patch to address comments from previous 12 patches as well as couple of issues we found in the meantime. Thanks, Ravi -----Original Message----- From: David S. Miller [mailto:davem@davemloft.net] Sent: Tuesday, July 12, 2005 2:04 PM To: ravinandan.arakali@neterion.com Cc: hch@infradead.org; raghavendra.koushik@neterion.com; jgarzik@pobox.com; netdev@oss.sgi.com; leonid.grossman@neterion.com; rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements From: "Ravinandan Arakali" Subject: RE: [PATCH 2.6.12.1 5/12] S2io: Performance improvements Date: Tue, 12 Jul 2005 14:00:52 -0700 > The two-buffer mode was added as a configurable option > to Kconfig file several months ago. Hence the macro > is CONFIG_2BUFF_MODE. We're saying that you should choose CONFIG_2BUFF_MODE, when CONFIG_IA64_SGI_SN2 is set, inside the Kconfig file using the "default" Kconfig directive. You should never change the setting of CONFIG_* macros in C source. 
 From davem@davemloft.net Mon Aug 8 19:40:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 08 Aug 2005 19:40:41 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j792eaH9024500 for ; Mon, 8 Aug 2005 19:40:36 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1E2K0O-0006x8-W9; Mon, 08 Aug 2005 19:38:17 -0700 Date: Mon, 08 Aug 2005 19:38:16 -0700 (PDT) Message-Id: <20050808.193816.70216612.davem@davemloft.net> To: ravinandan.arakali@neterion.com Cc: hch@infradead.org, raghavendra.koushik@neterion.com, jgarzik@pobox.com, netdev@oss.sgi.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: default directive in Kconfig(subject modified) From: "David S. Miller" In-Reply-To: <002801c59c7a$fb660ec0$4810100a@pc.s2io.com> References: <002801c59c7a$fb660ec0$4810100a@pc.s2io.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2867 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 695 Lines: 16 From: "Ravinandan Arakali" Date: Mon, 8 Aug 2005 17:40:45 -0700 > Can somebody throw light on the below subject ? > We have S2io configured as a module and 2buff mode is > one of the suboptions under S2io. But a directive such > as "default y" does not seem to enable 2buff mode. Independent of this issue, can you please acknowledge what many people are trying to show you in that you MUST make this a run-time selectable feature. Yes, that means the driver will have to have two totally separate code paths. 
But that should not be inefficient because you can just hook up different transmit and interrupt handler methods depending upon the mode selected. From ravinandan.arakali@neterion.com Tue Aug 9 14:29:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 09 Aug 2005 14:29:19 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j79LTBH9023755 for ; Tue, 9 Aug 2005 14:29:12 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j79LQMcx023310; Tue, 9 Aug 2005 17:26:22 -0400 (EDT) Received: from rarakali ([10.16.16.72]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j79LQJKP011866; Tue, 9 Aug 2005 17:26:19 -0400 (EDT) From: "Ravinandan Arakali" To: "'David S. Miller'" Cc: , , , , , Subject: RE: default directive in Kconfig(subject modified) Date: Tue, 9 Aug 2005 14:26:05 -0700 Message-ID: <001001c59d28$f41c1160$4810100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) In-Reply-To: <20050808.193816.70216612.davem@davemloft.net> X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-archive-position: 2869 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 1231 Lines: 34 Yes, I got the run-time feature message. It will require quite a bit of code reshuffle and testing to make sure we don't break existing code in any way. Thanks, Ravi -----Original Message----- From: David S. 
Miller [mailto:davem@davemloft.net] Sent: Monday, August 08, 2005 7:38 PM To: ravinandan.arakali@neterion.com Cc: hch@infradead.org; raghavendra.koushik@neterion.com; jgarzik@pobox.com; netdev@oss.sgi.com; leonid.grossman@neterion.com; rapuru.sriram@neterion.com Subject: Re: default directive in Kconfig(subject modified) From: "Ravinandan Arakali" Date: Mon, 8 Aug 2005 17:40:45 -0700 > Can somebody throw light on the below subject ? > We have S2io configured as a module and 2buff mode is > one of the suboptions under S2io. But a directive such > as "default y" does not seem to enable 2buff mode. Independent of this issue, can you please acknowledge what many people are trying to show you in that you MUST make this a run-time selectable feature. Yes, that means the driver will have to have two totally separate code paths. But that should not be inefficient because you can just hook up different transmit and interrupt handler methods depending upon the mode selected. From jgarzik@pobox.com Wed Aug 10 21:06:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Aug 2005 21:06:13 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7B468H9008125 for ; Wed, 10 Aug 2005 21:06:08 -0700 Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E34IJ-0004ct-Od; Thu, 11 Aug 2005 04:03:52 +0000 Message-ID: <42FACE25.5080909@pobox.com> Date: Thu, 11 Aug 2005 00:03:49 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Wade Farnsworth CC: netdev@oss.sgi.com, Matt Porter , Eugene Surovegin Subject: Re: [PATCH] emac: add support for platform-specific unsupported PHY features References: <1123284725.27880.26.camel@rhino.az.mvista.com> In-Reply-To: <1123284725.27880.26.camel@rhino.az.mvista.com> Content-Type: 
text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2873 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 381 Lines: 19 Wade Farnsworth wrote: > Hello, > > This patch adds support to the ibm_emac driver for platform-specific > unsupported PHY features. > > The patch attempts to determine the highest speed and duplex when > autonegotiation is unsupported. > > -Wade Farnsworth > > Signed-off-by: Wade Farnsworth Can you update this patch RE Eugene's comments? Jeff From jgarzik@pobox.com Wed Aug 10 21:13:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 10 Aug 2005 21:13:24 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7B4DLH9008936 for ; Wed, 10 Aug 2005 21:13:21 -0700 Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E34PN-0004dS-BH; Thu, 11 Aug 2005 04:11:10 +0000 Message-ID: <42FACFDB.8010109@pobox.com> Date: Thu, 11 Aug 2005 00:11:07 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: raghavendra.koushik@neterion.com CC: netdev@oss.sgi.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.13-rc4 1/13] S2io: Code cleanup References: <20050803192433.BBFB498336@linux.site> In-Reply-To: <20050803192433.BBFB498336@linux.site> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2874 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 22 Lines: 2 
applied patches 1-13 From jgarzik@pobox.com Thu Aug 11 10:50:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 10:50:41 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7BHoaH9022778 for ; Thu, 11 Aug 2005 10:50:36 -0700 Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E3HAE-0004xv-Of; Thu, 11 Aug 2005 17:48:24 +0000 Message-ID: <42FB8F64.4020001@pobox.com> Date: Thu, 11 Aug 2005 13:48:20 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: alex@neterion.com CC: netdev@oss.sgi.com, davem@davemloft.net, ak@muc.de Subject: Re: [ANNOUNCE] Experimental Driver for Neterion/S2io 10GbE Adapters References: <200503222330.j2MNU2DD028953@guinness.s2io.com> In-Reply-To: <200503222330.j2MNU2DD028953@guinness.s2io.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2875 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 235 Lines: 10 Alex Aizman wrote: > This is our 2nd attempt to submit "xge", the experimental driver for > Neterion, Inc (formerly S2io, Inc) family of 10GbE adapters. Can you send a link to the latest version of this driver, for review? 
Jeff From Leonid.Grossman@neterion.com Thu Aug 11 12:42:51 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 12:42:56 -0700 (PDT) Received: from nekter.pc.s2io.com (sentry.s2io.com [142.46.200.199]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7BJgoH9003921 for ; Thu, 11 Aug 2005 12:42:51 -0700 X-MIMEOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: [ANNOUNCE] Experimental Driver for Neterion/S2io 10GbE Adapters Date: Thu, 11 Aug 2005 15:40:42 -0400 Message-ID: <78C9135A3D2ECE4B8162EBDCE82CAD77057419@nekter> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [ANNOUNCE] Experimental Driver for Neterion/S2io 10GbE Adapters Thread-index: AcWenRD95UmQrb5wQJqXq/iYnDh4dAABqYYg From: "Leonid Grossman" To: "Jeff Garzik" , Cc: , , Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7BJgoH9003921 X-archive-position: 2876 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Leonid.Grossman@neterion.com Precedence: bulk X-list: netdev Content-Length: 1588 Lines: 46 Hi Jeff, Thanks for remembering about the submission. My assumption was that the submission did not get a consensus and died quietly :-), so we stopped maintenance on the experimental driver several weeks ago and pulled it from the ftp site. I guess the original idea behind developing the "experimental" driver was to eventually use it as a replacement for the "s2io" driver in the kernel, mainly for the maintenance reasons - since "experimental" driver shares hardware-oriented code with other Neterion drivers, our team could run mature network test suites in other Operating Systems and indirectly find/fix issues in Linux driver in a very efficient fashion. 
This benefit seemed to be outweighed by other concerns that some of you guys raised on the list. In a long run (as 10GbE Xframe cards get shipped in volumes and the patches start coming in from the community), these concerns may be valid, so we have decided to stay with "s2io" driver as our production code. Leonid > -----Original Message----- > From: netdev-bounce@oss.sgi.com > [mailto:netdev-bounce@oss.sgi.com] On Behalf Of Jeff Garzik > Sent: Thursday, August 11, 2005 10:48 AM > To: alex@neterion.com > Cc: netdev@oss.sgi.com; davem@davemloft.net; ak@muc.de > Subject: Re: [ANNOUNCE] Experimental Driver for Neterion/S2io > 10GbE Adapters > > Alex Aizman wrote: > > This is our 2nd attempt to submit "xge", the experimental > driver for > > Neterion, Inc (formerly S2io, Inc) family of 10GbE adapters. > > Can you send a link to the latest version of this driver, for review? > > Jeff > > > > > From wfarnsworth@mvista.com Thu Aug 11 13:41:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 13:41:25 -0700 (PDT) Received: from av.mvista.com (gateway-1237.mvista.com [12.44.186.158]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7BKfKH9009691 for ; Thu, 11 Aug 2005 13:41:20 -0700 Received: from rhino.az.mvista.com (av [127.0.0.1]) by av.mvista.com (8.9.3/8.9.3) with ESMTP id NAA23658; Thu, 11 Aug 2005 13:39:05 -0700 Subject: Re: [PATCH] emac: add support for platform-specific unsupported PHY features From: Wade Farnsworth To: Jeff Garzik Cc: netdev@oss.sgi.com, Matt Porter , Eugene Surovegin In-Reply-To: <42FACE25.5080909@pobox.com> References: <1123284725.27880.26.camel@rhino.az.mvista.com> <42FACE25.5080909@pobox.com> Content-Type: multipart/mixed; boundary="=-WdeGUu5jcVZj2BqQ6UIE" Message-Id: <1123792744.27734.30.camel@rhino.az.mvista.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.3.92 (Preview Release) Date: 11 Aug 2005 13:39:05 -0700 X-archive-position: 2877 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wfarnsworth@mvista.com Precedence: bulk X-list: netdev Content-Length: 2636 Lines: 89 --=-WdeGUu5jcVZj2BqQ6UIE Content-Type: text/plain Content-Transfer-Encoding: 7bit On Wed, 2005-08-10 at 21:03, Jeff Garzik wrote: > Wade Farnsworth wrote: > > Hello, > > > > This patch adds support to the ibm_emac driver for platform-specific > > unsupported PHY features. > > > > The patch attempts to determine the highest speed and duplex when > > autonegotiation is unsupported. > > > > -Wade Farnsworth > > > > Signed-off-by: Wade Farnsworth > > Can you update this patch RE Eugene's comments? > > Jeff > Sorry for the delay. Here is the updated patch. -Wade Farnsworth Signed-off-by: Wade Farnsworth --=-WdeGUu5jcVZj2BqQ6UIE Content-Disposition: attachment; filename=ibm-emac-phy-feat-exc-core.patch Content-Type: text/x-patch; name=ibm-emac-phy-feat-exc-core.patch Content-Transfer-Encoding: 7bit diff -upr linux-2.6/drivers/net/ibm_emac/ibm_emac_core.c linux-2.6-dev/drivers/net/ibm_emac/ibm_emac_core.c --- linux-2.6/drivers/net/ibm_emac/ibm_emac_core.c 2005-08-11 13:29:43.000000000 -0700 +++ linux-2.6-dev/drivers/net/ibm_emac/ibm_emac_core.c 2005-08-11 13:13:40.000000000 -0700 @@ -1876,6 +1876,9 @@ static int emac_init_device(struct ocp_d rc = -ENODEV; goto bail; } + + /* Disable any PHY features not supported by the platform */ + ep->phy_mii.def->features &= ~emacdata->phy_feat_exc; /* Setup initial PHY config & startup aneg */ if (ep->phy_mii.def->ops->init) @@ -1883,6 +1886,34 @@ static int emac_init_device(struct ocp_d netif_carrier_off(ndev); if (ep->phy_mii.def->features & SUPPORTED_Autoneg) ep->want_autoneg = 1; + else { + ep->want_autoneg = 0; + + /* Select highest supported speed/duplex */ + if (ep->phy_mii.def->features & SUPPORTED_1000baseT_Full) { + ep->phy_mii.speed = SPEED_1000; + ep->phy_mii.duplex = DUPLEX_FULL; + } else if (ep->phy_mii.def->features & + SUPPORTED_1000baseT_Half) { + ep->phy_mii.speed = 
SPEED_1000; + ep->phy_mii.duplex = DUPLEX_HALF; + } else if (ep->phy_mii.def->features & + SUPPORTED_100baseT_Full) { + ep->phy_mii.speed = SPEED_100; + ep->phy_mii.duplex = DUPLEX_FULL; + } else if (ep->phy_mii.def->features & + SUPPORTED_100baseT_Half) { + ep->phy_mii.speed = SPEED_100; + ep->phy_mii.duplex = DUPLEX_HALF; + } else if (ep->phy_mii.def->features & + SUPPORTED_10baseT_Full) { + ep->phy_mii.speed = SPEED_10; + ep->phy_mii.duplex = DUPLEX_FULL; + } else { + ep->phy_mii.speed = SPEED_10; + ep->phy_mii.duplex = DUPLEX_HALF; + } + } emac_start_link(ep, NULL); /* read the MAC Address */ --=-WdeGUu5jcVZj2BqQ6UIE-- From mmporter@cox.net Thu Aug 11 13:57:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 13:57:46 -0700 (PDT) Received: from fed1rmmtao11.cox.net (fed1rmmtao11.cox.net [68.230.241.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7BKvgH9011341 for ; Thu, 11 Aug 2005 13:57:42 -0700 Received: from liberty.homelinux.org ([70.190.160.125]) by fed1rmmtao11.cox.net (InterMail vM.6.01.04.00 201-2131-118-20041027) with ESMTP id <20050811205529.LEPB12158.fed1rmmtao11.cox.net@liberty.homelinux.org>; Thu, 11 Aug 2005 16:55:29 -0400 Received: (from mmporter@localhost) by liberty.homelinux.org (8.9.3/8.9.3/Debian 8.9.3-21) id NAA32424; Thu, 11 Aug 2005 13:55:30 -0700 Date: Thu, 11 Aug 2005 13:55:30 -0700 From: Matt Porter To: Wade Farnsworth Cc: Jeff Garzik , netdev@oss.sgi.com, Eugene Surovegin Subject: Re: [PATCH] emac: add support for platform-specific unsupported PHY features Message-ID: <20050811135530.G30033@cox.net> References: <1123284725.27880.26.camel@rhino.az.mvista.com> <42FACE25.5080909@pobox.com> <1123792744.27734.30.camel@rhino.az.mvista.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <1123792744.27734.30.camel@rhino.az.mvista.com>; from wfarnsworth@mvista.com on Thu, Aug 11, 2005 at 01:39:05PM -0700 X-archive-position: 2878 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mporter@kernel.crashing.org Precedence: bulk X-list: netdev Content-Length: 700 Lines: 26 On Thu, Aug 11, 2005 at 01:39:05PM -0700, Wade Farnsworth wrote: > On Wed, 2005-08-10 at 21:03, Jeff Garzik wrote: > > Wade Farnsworth wrote: > > > Hello, > > > > > > This patch adds support to the ibm_emac driver for platform-specific > > > unsupported PHY features. > > > > > > The patch attempts to determine the highest speed and duplex when > > > autonegotiation is unsupported. > > > > > > -Wade Farnsworth > > > > > > Signed-off-by: Wade Farnsworth > > > > Can you update this patch RE Eugene's comments? > > > > Jeff > > > > Sorry for the delay. Here is the updated patch. FWIW, akpm has the ppc32 relevant bits that go with this patch pending now. -Matt From akepner@sgi.com Thu Aug 11 15:03:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 15:03:07 -0700 (PDT) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7BM32H9018372 for ; Thu, 11 Aug 2005 15:03:03 -0700 Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id j7BNw60I013764 for ; Thu, 11 Aug 2005 16:58:06 -0700 Received: from [192.168.2.20] (mtv-vpn-sw-corp-0-25.corp.sgi.com [134.15.0.25]) by cthulhu.engr.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id j7BM0nSD7204134; Thu, 11 Aug 2005 15:00:50 -0700 (PDT) Date: Thu, 11 Aug 2005 14:55:24 -0700 (PDT) From: Arthur Kepner X-X-Sender: akepner@resonance.WorkGroup To: "David S. 
Miller" cc: netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net Subject: [RESEND] [PATCH] bond inherits zero-copy flags of slaves Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="32512-514941971-1123797007=:14105" Content-ID: X-archive-position: 2879 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akepner@sgi.com Precedence: bulk X-list: netdev Content-Length: 6895 Lines: 127 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --32512-514941971-1123797007=:14105 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII Content-ID: The attached patch allows a bonding device to inherit the "zero-copy" features of its slave devices. It was inspired by a couple of previous postings on this topic: http://marc.theaimsgroup.com/?l=bonding-devel&m=111924607327794&w=2 http://marc.theaimsgroup.com/?l=bonding-devel&m=111925242706297&w=2 and it's largely a combination of the patches that appear in those emails. 
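As a rough illustration of what "inheriting" the zero-copy features means, the sketch below computes a bond's feature mask as the intersection of its slaves' scatter-gather and checksum bits, and drops scatter-gather when no checksum offload survives. It is a user-space approximation, not the attached patch: the EX_F_* values and ex_bond_features() are hypothetical, chosen to stand in for the kernel's NETIF_F_* flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; the kernel defines the
 * real NETIF_F_* bits in <linux/netdevice.h>. */
#define EX_F_SG       (1u << 0)  /* scatter-gather I/O */
#define EX_F_IP_CSUM  (1u << 1)  /* checksum offload, IPv4 TCP/UDP only */
#define EX_F_NO_CSUM  (1u << 2)  /* no checksum needed */
#define EX_F_HW_CSUM  (1u << 3)  /* checksum offload, any protocol */

#define EX_ANY_CSUM   (EX_F_IP_CSUM | EX_F_NO_CSUM | EX_F_HW_CSUM)
#define EX_INTERSECT  (EX_F_SG | EX_ANY_CSUM)

/* base: features the bond has on its own; slaves: per-slave masks. */
static uint32_t ex_bond_features(uint32_t base,
				 const uint32_t *slaves, size_t n)
{
	uint32_t features = base;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i == 0)
			features |= EX_INTERSECT;  /* start optimistic */
		/* clear every inheritable bit this slave lacks */
		features &= ~(~slaves[i] & EX_INTERSECT);
	}

	/* scatter-gather without any checksum offload is not usable */
	if ((features & EX_F_SG) && !(features & EX_ANY_CSUM))
		features &= ~EX_F_SG;

	return features;
}
```

Recomputing the mask whenever a slave is attached, released, or changes its own features keeps the bond from advertising an offload that one of its slaves cannot perform.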
Patch is against net-2.6.git Signed-off-by: Arthur Kepner -- Arthur --32512-514941971-1123797007=:14105 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME="bonding_features.git.patch" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: bonding_features.git.patch Content-Disposition: ATTACHMENT; FILENAME="bonding_features.git.patch" ZGlmZiAtLWdpdCBhL2RyaXZlcnMvbmV0L2JvbmRpbmcvYm9uZF9tYWluLmMg Yi9kcml2ZXJzL25ldC9ib25kaW5nL2JvbmRfbWFpbi5jDQotLS0gYS9kcml2 ZXJzL25ldC9ib25kaW5nL2JvbmRfbWFpbi5jDQorKysgYi9kcml2ZXJzL25l dC9ib25kaW5nL2JvbmRfbWFpbi5jDQpAQCAtMTYwNCw2ICsxNjA0LDQ0IEBA IHN0YXRpYyBpbnQgYm9uZF9zZXRod2FkZHIoc3RydWN0IG5ldF9kZXYNCiAJ cmV0dXJuIDA7DQogfQ0KIA0KKyNkZWZpbmUgQk9ORF9JTlRFUlNFQ1RfRkVB VFVSRVMgXA0KKwkoTkVUSUZfRl9TR3xORVRJRl9GX0lQX0NTVU18TkVUSUZf Rl9OT19DU1VNfE5FVElGX0ZfSFdfQ1NVTSkNCisNCisvKiANCisgKiBDb21w dXRlIHRoZSBmZWF0dXJlcyBhdmFpbGFibGUgdG8gdGhlIGJvbmRpbmcgZGV2 aWNlIGJ5IA0KKyAqIGludGVyc2VjdGlvbiBvZiBhbGwgb2YgdGhlIHNsYXZl IGRldmljZXMnIEJPTkRfSU5URVJTRUNUX0ZFQVRVUkVTLg0KKyAqIENhbGwg dGhpcyBhZnRlciBhdHRhY2hpbmcgb3IgZGV0YWNoaW5nIGEgc2xhdmUgdG8g dXBkYXRlIHRoZSANCisgKiBib25kJ3MgZmVhdHVyZXMuDQorICovDQorc3Rh dGljIGludCBib25kX2NvbXB1dGVfZmVhdHVyZXMoc3RydWN0IGJvbmRpbmcg KmJvbmQpDQorew0KKwlpbnQgaTsNCisJc3RydWN0IHNsYXZlICpzbGF2ZTsN CisJc3RydWN0IG5ldF9kZXZpY2UgKmJvbmRfZGV2ID0gYm9uZC0+ZGV2Ow0K KwlpbnQgZmVhdHVyZXMgPSBib25kLT5ib25kX2ZlYXR1cmVzOw0KKw0KKwli b25kX2Zvcl9lYWNoX3NsYXZlKGJvbmQsIHNsYXZlLCBpKSB7DQorCQlzdHJ1 Y3QgbmV0X2RldmljZSAqIHNsYXZlX2RldiA9IHNsYXZlLT5kZXY7DQorCQlp ZiAoaSA9PSAwKSB7DQorCQkJZmVhdHVyZXMgfD0gQk9ORF9JTlRFUlNFQ1Rf RkVBVFVSRVM7DQorCQl9DQorCQlmZWF0dXJlcyAmPQ0KKwkJCX4ofnNsYXZl X2Rldi0+ZmVhdHVyZXMgJiBCT05EX0lOVEVSU0VDVF9GRUFUVVJFUyk7DQor CX0NCisNCisJLyogdHVybiBvZmYgTkVUSUZfRl9TRyBpZiB3ZSBuZWVkIGEg Y3N1bSBhbmQgaC93IGNhbid0IGRvIGl0ICovDQorCWlmICgoZmVhdHVyZXMg JiBORVRJRl9GX1NHKSAmJiANCisJCSEoZmVhdHVyZXMgJiAoTkVUSUZfRl9J UF9DU1VNIHwNCisJCQkgICAgICBORVRJRl9GX05PX0NTVU0gfA0KKwkJCSAg ICAgIE5FVElGX0ZfSFdfQ1NVTSkpKSB7DQorCQlmZWF0dXJlcyAmPSB+TkVU 
SUZfRl9TRzsNCisJfQ0KKw0KKwlib25kX2Rldi0+ZmVhdHVyZXMgPSBmZWF0 dXJlczsNCisNCisJcmV0dXJuIDA7DQorfQ0KKw0KIC8qIGVuc2xhdmUgZGV2 aWNlIDxzbGF2ZT4gdG8gYm9uZCBkZXZpY2UgPG1hc3Rlcj4gKi8NCiBzdGF0 aWMgaW50IGJvbmRfZW5zbGF2ZShzdHJ1Y3QgbmV0X2RldmljZSAqYm9uZF9k ZXYsIHN0cnVjdCBuZXRfZGV2aWNlICpzbGF2ZV9kZXYpDQogew0KQEAgLTE4 MTEsNiArMTg0OSw4IEBAIHN0YXRpYyBpbnQgYm9uZF9lbnNsYXZlKHN0cnVj dCBuZXRfZGV2aWMNCiAJbmV3X3NsYXZlLT5kZWxheSA9IDA7DQogCW5ld19z bGF2ZS0+bGlua19mYWlsdXJlX2NvdW50ID0gMDsNCiANCisJYm9uZF9jb21w dXRlX2ZlYXR1cmVzKGJvbmQpOw0KKw0KIAlpZiAoYm9uZC0+cGFyYW1zLm1p aW1vbiAmJiAhYm9uZC0+cGFyYW1zLnVzZV9jYXJyaWVyKSB7DQogCQlsaW5r X3JlcG9ydGluZyA9IGJvbmRfY2hlY2tfZGV2X2xpbmsoYm9uZCwgc2xhdmVf ZGV2LCAxKTsNCiANCkBAIC0yMDE1LDcgKzIwNTUsNyBAQCBlcnJfZnJlZToN CiANCiBlcnJfdW5kb19mbGFnczoNCiAJYm9uZF9kZXYtPmZlYXR1cmVzID0g b2xkX2ZlYXR1cmVzOw0KLQ0KKyANCiAJcmV0dXJuIHJlczsNCiB9DQogDQpA QCAtMjEwMCw2ICsyMTQwLDggQEAgc3RhdGljIGludCBib25kX3JlbGVhc2Uo c3RydWN0IG5ldF9kZXZpYw0KIAkvKiByZWxlYXNlIHRoZSBzbGF2ZSBmcm9t IGl0cyBib25kICovDQogCWJvbmRfZGV0YWNoX3NsYXZlKGJvbmQsIHNsYXZl KTsNCiANCisJYm9uZF9jb21wdXRlX2ZlYXR1cmVzKGJvbmQpOw0KKw0KIAlp ZiAoYm9uZC0+cHJpbWFyeV9zbGF2ZSA9PSBzbGF2ZSkgew0KIAkJYm9uZC0+ cHJpbWFyeV9zbGF2ZSA9IE5VTEw7DQogCX0NCkBAIC0yMjQzLDYgKzIyODUs OCBAQCBzdGF0aWMgaW50IGJvbmRfcmVsZWFzZV9hbGwoc3RydWN0IG5ldF9k DQogCQkJYm9uZF9hbGJfZGVpbml0X3NsYXZlKGJvbmQsIHNsYXZlKTsNCiAJ CX0NCiANCisJCWJvbmRfY29tcHV0ZV9mZWF0dXJlcyhib25kKTsNCisNCiAJ CS8qIG5vdyB0aGF0IHRoZSBzbGF2ZSBpcyBkZXRhY2hlZCwgdW5sb2NrIGFu ZCBwZXJmb3JtDQogCQkgKiBhbGwgdGhlIHVuZG8gc3RlcHMgdGhhdCBzaG91 bGQgbm90IGJlIGNhbGxlZCBmcm9tDQogCQkgKiB3aXRoaW4gYSBsb2NrLg0K QEAgLTM1ODgsNiArMzYzMiw3IEBAIHN0YXRpYyBpbnQgYm9uZF9tYXN0ZXJf bmV0ZGV2X2V2ZW50KHVuc2kNCiBzdGF0aWMgaW50IGJvbmRfc2xhdmVfbmV0 ZGV2X2V2ZW50KHVuc2lnbmVkIGxvbmcgZXZlbnQsIHN0cnVjdCBuZXRfZGV2 aWNlICpzbGF2ZV9kZXYpDQogew0KIAlzdHJ1Y3QgbmV0X2RldmljZSAqYm9u ZF9kZXYgPSBzbGF2ZV9kZXYtPm1hc3RlcjsNCisJc3RydWN0IGJvbmRpbmcg KmJvbmQgPSBib25kX2Rldi0+cHJpdjsNCiANCiAJc3dpdGNoIChldmVudCkg 
ew0KIAljYXNlIE5FVERFVl9VTlJFR0lTVEVSOg0KQEAgLTM2MjYsNiArMzY3 MSw5IEBAIHN0YXRpYyBpbnQgYm9uZF9zbGF2ZV9uZXRkZXZfZXZlbnQodW5z aWcNCiAJCSAqIFRPRE86IGhhbmRsZSBjaGFuZ2luZyB0aGUgcHJpbWFyeSdz IG5hbWUNCiAJCSAqLw0KIAkJYnJlYWs7DQorCWNhc2UgTkVUREVWX0ZFQVRf Q0hBTkdFOg0KKwkJYm9uZF9jb21wdXRlX2ZlYXR1cmVzKGJvbmQpOw0KKwkJ YnJlYWs7DQogCWRlZmF1bHQ6DQogCQlicmVhazsNCiAJfQ0KQEAgLTQ1MjYs NiArNDU3NCwxMSBAQCBzdGF0aWMgaW5saW5lIHZvaWQgYm9uZF9zZXRfbW9k ZV9vcHMoc3RyDQogCX0NCiB9DQogDQorc3RhdGljIHN0cnVjdCBldGh0b29s X29wcyBib25kX2V0aHRvb2xfb3BzID0gew0KKwkuZ2V0X3R4X2NzdW0JCT0g ZXRodG9vbF9vcF9nZXRfdHhfY3N1bSwNCisJLmdldF9zZwkJCT0gZXRodG9v bF9vcF9nZXRfc2csDQorfTsNCisNCiAvKg0KICAqIERvZXMgbm90IGFsbG9j YXRlIGJ1dCBjcmVhdGVzIGEgL3Byb2MgZW50cnkuDQogICogQWxsb3dlZCB0 byBmYWlsLg0KQEAgLTQ1NTUsNiArNDYwOCw3IEBAIHN0YXRpYyBpbnQgX19p bml0IGJvbmRfaW5pdChzdHJ1Y3QgbmV0X2QNCiAJYm9uZF9kZXYtPnN0b3Ag PSBib25kX2Nsb3NlOw0KIAlib25kX2Rldi0+Z2V0X3N0YXRzID0gYm9uZF9n ZXRfc3RhdHM7DQogCWJvbmRfZGV2LT5kb19pb2N0bCA9IGJvbmRfZG9faW9j dGw7DQorCWJvbmRfZGV2LT5ldGh0b29sX29wcyA9ICZib25kX2V0aHRvb2xf b3BzOw0KIAlib25kX2Rldi0+c2V0X211bHRpY2FzdF9saXN0ID0gYm9uZF9z ZXRfbXVsdGljYXN0X2xpc3Q7DQogCWJvbmRfZGV2LT5jaGFuZ2VfbXR1ID0g Ym9uZF9jaGFuZ2VfbXR1Ow0KIAlib25kX2Rldi0+c2V0X21hY19hZGRyZXNz ID0gYm9uZF9zZXRfbWFjX2FkZHJlc3M7DQpAQCAtNDU5MSw2ICs0NjQ1LDgg QEAgc3RhdGljIGludCBfX2luaXQgYm9uZF9pbml0KHN0cnVjdCBuZXRfZA0K IAkJCSAgICAgICBORVRJRl9GX0hXX1ZMQU5fUlggfA0KIAkJCSAgICAgICBO RVRJRl9GX0hXX1ZMQU5fRklMVEVSKTsNCiANCisJYm9uZC0+Ym9uZF9mZWF0 dXJlcyA9IGJvbmRfZGV2LT5mZWF0dXJlczsNCisNCiAjaWZkZWYgQ09ORklH X1BST0NfRlMNCiAJYm9uZF9jcmVhdGVfcHJvY19lbnRyeShib25kKTsNCiAj ZW5kaWYNCmRpZmYgLS1naXQgYS9kcml2ZXJzL25ldC9ib25kaW5nL2JvbmRp bmcuaCBiL2RyaXZlcnMvbmV0L2JvbmRpbmcvYm9uZGluZy5oDQotLS0gYS9k cml2ZXJzL25ldC9ib25kaW5nL2JvbmRpbmcuaA0KKysrIGIvZHJpdmVycy9u ZXQvYm9uZGluZy9ib25kaW5nLmgNCkBAIC0yMTEsNiArMjExLDkgQEAgc3Ry dWN0IGJvbmRpbmcgew0KIAlzdHJ1Y3QgICBib25kX3BhcmFtcyBwYXJhbXM7 DQogCXN0cnVjdCAgIGxpc3RfaGVhZCB2bGFuX2xpc3Q7DQogCXN0cnVjdCAg 
IHZsYW5fZ3JvdXAgKnZsZ3JwOw0KKwkvKiB0aGUgZmVhdHVyZXMgdGhlIGJv bmRpbmcgZGV2aWNlIHN1cHBvcnRzLCBpbmRlcGVuZGVudGx5IA0KKwkgKiBv ZiBhbnkgc2xhdmVzICovDQorCWludAkgYm9uZF9mZWF0dXJlczsgDQogfTsN CiANCiAvKioNCg== --32512-514941971-1123797007=:14105-- From mpm@selenic.com Thu Aug 11 19:21:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:19 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026344 for ; Thu, 11 Aug 2005 19:21:40 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISX8020709; Thu, 11 Aug 2005 21:19:11 -0500 Date: Thu, 11 Aug 2005 21:19:11 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <5.502409567@selenic.com> Message-Id: <6.502409567@selenic.com> Subject: [PATCH 5/8] netpoll: add retry timeout X-archive-position: 2903 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 2330 Lines: 79 Add limited retry logic to netpoll_send_skb Each time we attempt to send, decrement our per-device retry counter. On every successful send, we reset the counter. We delay 50us between attempts with up to 20000 retries for a total of 1 second. After we've exhausted our retries, subsequent failed attempts will try only once until reset by success. 
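The retry budget described above (20000 attempts x 50us = 1,000,000us, i.e. 1 second) can be modelled in userspace roughly like this. It is a sketch under stated assumptions: send_with_retry(), struct retry_state and the two stub callbacks are hypothetical names, and the delay is only accumulated in a counter instead of actually busy-waiting.

```c
#define MAX_RETRIES    20000
#define RETRY_DELAY_US 50

struct retry_state {
    int  tries;       /* remaining attempts; reset on every success */
    long delayed_us;  /* total delay so far, kept only for illustration */
};

/* Hypothetical transmit stubs: return 0 on success, nonzero when busy. */
int always_busy(void *arg) { int *calls = arg; (*calls)++; return 1; }
int always_ok(void *arg)   { int *calls = arg; (*calls)++; return 0; }

/* Returns 0 once a send succeeds, -1 when the retry budget is exhausted. */
int send_with_retry(struct retry_state *st, int (*try_send)(void *), void *arg)
{
    do {
        st->tries--;
        if (try_send(arg) == 0) {
            st->tries = MAX_RETRIES;        /* success resets the budget */
            return 0;
        }
        st->delayed_us += RETRY_DELAY_US;   /* stands in for udelay(50) */
    } while (st->tries > 0);

    return -1;
}
```

Because the loop is a do/while, a device whose budget is already exhausted still gets exactly one attempt per call, which is the "try only once until reset by success" behaviour.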
Signed-off-by: Matt Mackall Index: lhg/net/core/netpoll.c =================================================================== --- lhg.orig/net/core/netpoll.c 2005-08-11 16:17:45.000000000 -0700 +++ lhg/net/core/netpoll.c 2005-08-11 16:45:20.000000000 -0700 @@ -33,6 +33,7 @@ #define MAX_UDP_CHUNK 1460 #define MAX_SKBS 32 #define MAX_QUEUE_DEPTH (MAX_SKBS / 2) +#define MAX_RETRIES 20000 static DEFINE_SPINLOCK(skb_list_lock); static int nr_skbs; @@ -265,7 +266,8 @@ static void netpoll_send_skb(struct netp return; } - while (1) { + do { + npinfo->tries--; spin_lock(&np->dev->xmit_lock); np->dev->xmit_lock_owner = smp_processor_id(); @@ -277,6 +279,7 @@ static void netpoll_send_skb(struct netp np->dev->xmit_lock_owner = -1; spin_unlock(&np->dev->xmit_lock); netpoll_poll(np); + udelay(50); continue; } @@ -285,12 +288,15 @@ static void netpoll_send_skb(struct netp spin_unlock(&np->dev->xmit_lock); /* success */ - if(!status) + if(!status) { + npinfo->tries = MAX_RETRIES; /* reset */ return; + } /* transmit busy */ netpoll_poll(np); - } + udelay(50); + } while (npinfo->tries > 0); } void netpoll_send_udp(struct netpoll *np, const char *msg, int len) @@ -642,6 +648,7 @@ int netpoll_setup(struct netpoll *np) npinfo->rx_np = NULL; npinfo->poll_lock = SPIN_LOCK_UNLOCKED; npinfo->poll_owner = -1; + npinfo->tries = MAX_RETRIES; npinfo->rx_lock = SPIN_LOCK_UNLOCKED; } else npinfo = ndev->npinfo; Index: lhg/include/linux/netpoll.h =================================================================== --- lhg.orig/include/linux/netpoll.h 2005-08-11 15:40:27.000000000 -0700 +++ lhg/include/linux/netpoll.h 2005-08-11 16:17:56.000000000 -0700 @@ -26,6 +26,7 @@ struct netpoll { struct netpoll_info { spinlock_t poll_lock; int poll_owner; + int tries; int rx_flags; spinlock_t rx_lock; struct netpoll *rx_np; /* netpoll that registered an rx_hook */ From mpm@selenic.com Thu Aug 11 19:21:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:18 -0700 (PDT) Received: from 
waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026341 for ; Thu, 11 Aug 2005 19:21:40 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISX4020709; Thu, 11 Aug 2005 21:19:10 -0500 Date: Thu, 11 Aug 2005 21:19:10 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <1.502409567@selenic.com> Message-Id: <2.502409567@selenic.com> Subject: [PATCH 1/8] netpoll: rx_flags bugfix X-archive-position: 2902 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 667 Lines: 17 Initialize npinfo->rx_flags. The way it stands now, this will have random garbage, and so will incur a locking penalty even when an rx_hook isn't registered and we are not active in the netpoll polling code. 
Signed-off-by: Jeff Moyer Signed-off-by: Matt Mackall --- linux-2.6.12/net/core/netpoll.c.orig 2005-07-01 14:02:56.039174635 -0400 +++ linux-2.6.12/net/core/netpoll.c 2005-07-01 14:03:16.688739508 -0400 @@ -639,6 +639,7 @@ int netpoll_setup(struct netpoll *np) if (!npinfo) goto release; + npinfo->rx_flags = 0; npinfo->rx_np = NULL; npinfo->poll_lock = SPIN_LOCK_UNLOCKED; npinfo->poll_owner = -1; From mpm@selenic.com Thu Aug 11 19:21:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:13 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LfH9026375 for ; Thu, 11 Aug 2005 19:21:42 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISXA020709; Thu, 11 Aug 2005 21:19:12 -0500 Date: Thu, 11 Aug 2005 21:19:12 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <7.502409567@selenic.com> Message-Id: <8.502409567@selenic.com> Subject: [PATCH 7/8] netpoll: fix initialization/NAPI race X-archive-position: 2899 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 3129 Lines: 108 This fixes a race during initialization with the NAPI softirq processing by using an RCU approach. This race was discovered when refill_skbs() was added to the setup code. 
Signed-off-by: Matt Mackall Index: l/net/core/netpoll.c =================================================================== --- l.orig/net/core/netpoll.c 2005-08-09 00:56:23.000000000 -0500 +++ l/net/core/netpoll.c 2005-08-11 01:50:24.000000000 -0500 @@ -731,6 +731,9 @@ int netpoll_setup(struct netpoll *np) /* last thing to do is link it to the net device structure */ ndev->npinfo = npinfo; + /* avoid racing with NAPI reading npinfo */ + synchronize_rcu(); + return 0; release: Index: l/include/linux/netpoll.h =================================================================== --- l.orig/include/linux/netpoll.h 2005-08-09 00:56:23.000000000 -0500 +++ l/include/linux/netpoll.h 2005-08-11 01:33:42.000000000 -0500 @@ -9,6 +9,7 @@ #include #include +#include #include struct netpoll; @@ -61,25 +62,31 @@ static inline int netpoll_rx(struct sk_b return ret; } -static inline void netpoll_poll_lock(struct net_device *dev) +static inline void *netpoll_poll_lock(struct net_device *dev) { + rcu_read_lock(); /* deal with race on ->npinfo */ if (dev->npinfo) { spin_lock(&dev->npinfo->poll_lock); dev->npinfo->poll_owner = smp_processor_id(); + return dev->npinfo; } + return NULL; } -static inline void netpoll_poll_unlock(struct net_device *dev) +static inline void netpoll_poll_unlock(void *have) { - if (dev->npinfo) { - dev->npinfo->poll_owner = -1; - spin_unlock(&dev->npinfo->poll_lock); + struct netpoll_info *npi = have; + + if (npi) { + npi->poll_owner = -1; + spin_unlock(&npi->poll_lock); } + rcu_read_unlock(); } #else #define netpoll_rx(a) 0 -#define netpoll_poll_lock(a) +#define netpoll_poll_lock(a) 0 #define netpoll_poll_unlock(a) #endif Index: l/net/core/dev.c =================================================================== --- l.orig/net/core/dev.c 2005-08-09 00:56:23.000000000 -0500 +++ l/net/core/dev.c 2005-08-11 01:34:08.000000000 -0500 @@ -1696,7 +1696,8 @@ static void net_rx_action(struct softirq struct softnet_data *queue = &__get_cpu_var(softnet_data); unsigned 
long start_time = jiffies; int budget = netdev_budget; - + void *have; + local_irq_disable(); while (!list_empty(&queue->poll_list)) { @@ -1709,10 +1710,10 @@ static void net_rx_action(struct softirq dev = list_entry(queue->poll_list.next, struct net_device, poll_list); - netpoll_poll_lock(dev); + have = netpoll_poll_lock(dev); if (dev->quota <= 0 || dev->poll(dev, &budget)) { - netpoll_poll_unlock(dev); + netpoll_poll_unlock(have); local_irq_disable(); list_del(&dev->poll_list); list_add_tail(&dev->poll_list, &queue->poll_list); @@ -1721,7 +1722,7 @@ static void net_rx_action(struct softirq else dev->quota = dev->weight; } else { - netpoll_poll_unlock(dev); + netpoll_poll_unlock(have); dev_put(dev); local_irq_disable(); } From mpm@selenic.com Thu Aug 11 19:21:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:12 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026342 for ; Thu, 11 Aug 2005 19:21:40 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISX7020709; Thu, 11 Aug 2005 21:19:11 -0500 Date: Thu, 11 Aug 2005 21:19:11 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <4.502409567@selenic.com> Message-Id: <5.502409567@selenic.com> Subject: [PATCH 4/8] netpoll: netpoll_send_skb simplify X-archive-position: 2898 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 1894 Lines: 78 Minor netpoll_send_skb restructuring Restructure to avoid confusing goto and move some bits out of the retry loop. 
Signed-off-by: Matt Mackall Index: l/net/core/netpoll.c =================================================================== --- l.orig/net/core/netpoll.c 2005-08-07 15:15:48.000000000 -0500 +++ l/net/core/netpoll.c 2005-08-07 16:59:27.000000000 -0500 @@ -248,14 +248,14 @@ static void netpoll_send_skb(struct netp int status; struct netpoll_info *npinfo; -repeat: - if(!np || !np->dev || !netif_running(np->dev)) { + if (!np || !np->dev || !netif_running(np->dev)) { __kfree_skb(skb); return; } - /* avoid recursion */ npinfo = np->dev->npinfo; + + /* avoid recursion */ if (npinfo->poll_owner == smp_processor_id() || np->dev->xmit_lock_owner == smp_processor_id()) { if (np->drop) @@ -265,29 +265,31 @@ repeat: return; } - spin_lock(&np->dev->xmit_lock); - np->dev->xmit_lock_owner = smp_processor_id(); + while (1) { + spin_lock(&np->dev->xmit_lock); + np->dev->xmit_lock_owner = smp_processor_id(); + + /* + * network drivers do not expect to be called if the queue is + * stopped. + */ + if (netif_queue_stopped(np->dev)) { + np->dev->xmit_lock_owner = -1; + spin_unlock(&np->dev->xmit_lock); + netpoll_poll(np); + continue; + } - /* - * network drivers do not expect to be called if the queue is - * stopped. 
- */ - if (netif_queue_stopped(np->dev)) { + status = np->dev->hard_start_xmit(skb, np->dev); np->dev->xmit_lock_owner = -1; spin_unlock(&np->dev->xmit_lock); - netpoll_poll(np); - goto repeat; - } - - status = np->dev->hard_start_xmit(skb, np->dev); - np->dev->xmit_lock_owner = -1; - spin_unlock(&np->dev->xmit_lock); + /* success */ + if(!status) + return; - /* transmit busy */ - if(status) { + /* transmit busy */ netpoll_poll(np); - goto repeat; } } From mpm@selenic.com Thu Aug 11 19:21:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:11 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026345 for ; Thu, 11 Aug 2005 19:21:41 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISX3020709; Thu, 11 Aug 2005 21:19:08 -0500 Date: Thu, 11 Aug 2005 21:18:28 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Message-Id: <1.502409567@selenic.com> Subject: [PATCH 0/8] netpoll: various bugfixes X-archive-position: 2897 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 455 Lines: 12 This patch series cleans up a few outstanding bugs in netpoll: - two bugfixes from Jeff Moyer's netpoll bonding - a tweak to e1000's netpoll stub - timeout handling for e1000 with carrier loss - prefilling SKBs at init - a fix-up for a race discovered in initialization - an unused variable warning This patch set was tested over repeated rebooting with both tg3 and e1000 and random cable disconnection, with and without SMP and preempt. Please apply. 
From mpm@selenic.com Thu Aug 11 19:21:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:20 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026343 for ; Thu, 11 Aug 2005 19:21:41 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISX6020709; Thu, 11 Aug 2005 21:19:10 -0500 Date: Thu, 11 Aug 2005 21:19:10 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <3.502409567@selenic.com> Message-Id: <4.502409567@selenic.com> Subject: [PATCH 3/8] netpoll: e1000 netpoll tweak X-archive-position: 2904 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 649 Lines: 16 Suggested by Steven Rostedt, matches his patch included in e100. 
Signed-off-by: Matt Mackall Index: l/drivers/net/e1000/e1000_main.c =================================================================== --- l.orig/drivers/net/e1000/e1000_main.c 2005-08-06 17:36:32.000000000 -0500 +++ l/drivers/net/e1000/e1000_main.c 2005-08-06 17:55:01.000000000 -0500 @@ -3789,6 +3789,7 @@ e1000_netpoll(struct net_device *netdev) struct e1000_adapter *adapter = netdev_priv(netdev); disable_irq(adapter->pdev->irq); e1000_intr(adapter->pdev->irq, netdev, NULL); + e1000_clean_tx_irq(adapter); enable_irq(adapter->pdev->irq); } #endif From mpm@selenic.com Thu Aug 11 19:21:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:16 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026339 for ; Thu, 11 Aug 2005 19:21:41 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISX9020709; Thu, 11 Aug 2005 21:19:11 -0500 Date: Thu, 11 Aug 2005 21:19:11 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <6.502409567@selenic.com> Message-Id: <7.502409567@selenic.com> Subject: [PATCH 6/8] netpoll: pre-fill skb pool X-archive-position: 2901 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 990 Lines: 29 we could do one thing (see the patch below): i think it would be useful to fill up the netlogging skb queue straight at initialization time. Especially if netpoll is used for dumping alone, the system might not be in a situation to fill up the queue at the point of crash, so better be a bit more prepared and keep the pipeline filled. 
Ingo Signed-off-by: Ingo Molnar I've modified this to be called earlier - mpm Signed-off-by: Matt Mackall Index: l/net/core/netpoll.c =================================================================== --- l.orig/net/core/netpoll.c 2005-08-08 23:00:48.000000000 -0500 +++ l/net/core/netpoll.c 2005-08-11 01:50:31.000000000 -0500 @@ -724,6 +724,10 @@ int netpoll_setup(struct netpoll *np) npinfo->rx_np = np; spin_unlock_irqrestore(&npinfo->rx_lock, flags); } + + /* fill up the skb queue */ + refill_skbs(); + /* last thing to do is link it to the net device structure */ ndev->npinfo = npinfo; From mpm@selenic.com Thu Aug 11 19:21:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:14 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026338 for ; Thu, 11 Aug 2005 19:21:40 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISX5020709; Thu, 11 Aug 2005 21:19:10 -0500 Date: Thu, 11 Aug 2005 21:19:10 -0500 From: Matt Mackall To: Andrew Morton , "David S. Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <2.502409567@selenic.com> Message-Id: <3.502409567@selenic.com> Subject: [PATCH 2/8] netpoll: deadlock bugfix X-archive-position: 2900 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 852 Lines: 23 This fixes an obvious deadlock in the netpoll code. netpoll_rx takes the npinfo->rx_lock. netpoll_rx is also the only caller of arp_reply (through __netpoll_rx). As such, it is not necessary to take this lock. 
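To illustrate why the second acquisition is a problem: a spinlock is not recursive, so a callee that takes a lock its caller already holds waits forever. The toy model below is only a sketch; all names are hypothetical, and a try-lock is used so the failure can be observed instead of hanging.

```c
#include <stdbool.h>

/* Toy non-recursive lock; not a real spinlock. */
struct toy_lock { bool held; };

bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;   /* a real spin_lock() would spin here forever */
    l->held = true;
    return true;
}

void toy_unlock(struct toy_lock *l) { l->held = false; }

/* Callee before the fix: tries to take the lock its caller holds. */
bool reply_taking_lock(struct toy_lock *rx_lock)
{
    if (!toy_trylock(rx_lock))
        return false;   /* this is where the deadlock would occur */
    toy_unlock(rx_lock);
    return true;
}

/* Callee after the fix: relies on the caller already holding the lock. */
bool reply_without_lock(struct toy_lock *rx_lock)
{
    (void)rx_lock;
    return true;
}

/* Caller: holds rx_lock across the call into the reply path. */
bool rx_path(struct toy_lock *rx_lock, bool (*reply)(struct toy_lock *))
{
    bool ok;

    if (!toy_trylock(rx_lock))
        return false;
    ok = reply(rx_lock);
    toy_unlock(rx_lock);
    return ok;
}
```

Dropping the inner lock is only safe because the reply path has a single caller that always holds the lock, which is the argument the commit message makes.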
Signed-off-by: Jeff Moyer Signed-off-by: Matt Mackall Index: l/net/core/netpoll.c =================================================================== --- l.orig/net/core/netpoll.c 2005-08-06 17:47:48.000000000 -0500 +++ l/net/core/netpoll.c 2005-08-06 17:47:49.000000000 -0500 @@ -353,11 +353,8 @@ static void arp_reply(struct sk_buff *sk struct sk_buff *send_skb; struct netpoll *np = NULL; - spin_lock_irqsave(&npinfo->rx_lock, flags); if (npinfo->rx_np && npinfo->rx_np->dev == skb->dev) np = npinfo->rx_np; - spin_unlock_irqrestore(&npinfo->rx_lock, flags); - if (!np) return; From mpm@selenic.com Thu Aug 11 19:21:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:22:22 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2LeH9026340 for ; Thu, 11 Aug 2005 19:21:40 -0700 Received: from cinder.waste.org ([10.0.0.101]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7C2ISXB020709; Thu, 11 Aug 2005 21:19:12 -0500 Date: Thu, 11 Aug 2005 21:19:12 -0500 From: Matt Mackall To: Andrew Morton , "David S. 
Miller" X-PatchBomber: http://selenic.com/scripts/mailpatches Cc: ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <8.502409567@selenic.com> Message-Id: <9.502409567@selenic.com> Subject: [PATCH 8/8] netpoll: remove unused variable X-archive-position: 2905 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 530 Lines: 16 Remove unused variable Signed-off-by: Matt Mackall Index: l/net/core/netpoll.c =================================================================== --- l.orig/net/core/netpoll.c 2005-08-11 01:32:01.000000000 -0500 +++ l/net/core/netpoll.c 2005-08-11 01:49:37.000000000 -0500 @@ -356,7 +356,6 @@ static void arp_reply(struct sk_buff *sk unsigned char *arp_ptr; int size, type = ARPOP_REPLY, ptype = ETH_P_ARP; u32 sip, tip; - unsigned long flags; struct sk_buff *send_skb; struct netpoll *np = NULL; From davem@davemloft.net Thu Aug 11 19:44:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 11 Aug 2005 19:44:47 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7C2iLH9003253 for ; Thu, 11 Aug 2005 19:44:22 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1E3PTs-0005uT-7H; Thu, 11 Aug 2005 19:41:12 -0700 Date: Thu, 11 Aug 2005 19:41:11 -0700 (PDT) Message-Id: <20050811.194111.41635499.davem@davemloft.net> To: mpm@selenic.com Cc: akpm@osdl.com, ak@suse.de, jmoyer@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 0/8] netpoll: various bugfixes From: "David S. 
Miller" In-Reply-To: <1.502409567@selenic.com> References: <1.502409567@selenic.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2910 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 670 Lines: 19 From: Matt Mackall Date: Thu, 11 Aug 2005 21:18:28 -0500 > This patch series cleans up a few outstanding bugs in netpoll: > > - two bugfixes from Jeff Moyer's netpoll bonding > - a tweak to e1000's netpoll stub > - timeout handling for e1000 with carrier loss > - prefilling SKBs at init > - a fix-up for a race discovered in initialization > - an unused variable warning > > This patch set was tested over repeated rebooting with both tg3 and > e1000 and random cable disconnection, with and without SMP and > preempt. Please apply. All applied, thanks a lot for putting this patch set together. I'll push this to Linus after some smoke testing. 
From alan@lxorguk.ukuu.org.uk Fri Aug 12 06:10:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 06:10:34 -0700 (PDT) Received: from localhost.localdomain (clock-tower.bc.nu [81.2.110.250] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CDAOH9028404 for ; Fri, 12 Aug 2005 06:10:27 -0700 Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.13.4/8.13.4) with ESMTP id j7CDZEHn026417 for ; Fri, 12 Aug 2005 14:35:15 +0100 Received: (from alan@localhost) by localhost.localdomain (8.13.4/8.13.4/Submit) id j7CDZEYc026416 for netdev@oss.sgi.com; Fri, 12 Aug 2005 14:35:14 +0100 X-Authentication-Warning: localhost.localdomain: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Strange uses of netif_start_queue From: Alan Cox To: netdev@oss.sgi.com Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Fri, 12 Aug 2005 14:35:14 +0100 Message-Id: <1123853714.22460.39.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.2 (2.2.2-5) X-archive-position: 3003 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev Content-Length: 150 Lines: 4 Something I noticed doing the tty work. the 6pack driver calls netif_start_queue() before it calls register_netdev. I'm curious if this is allowed ? 
From ralf@linux-mips.org Fri Aug 12 06:30:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 06:30:15 -0700 (PDT) Received: from bacchus.net.dhis.org (extgw-uk.mips.com [62.254.210.129]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CDUCH9032193 for ; Fri, 12 Aug 2005 06:30:12 -0700 Received: from dea.linux-mips.net (localhost.localdomain [127.0.0.1]) by bacchus.net.dhis.org (8.13.4/8.13.1) with ESMTP id j7CDRx3c014186; Fri, 12 Aug 2005 14:27:59 +0100 Received: (from ralf@localhost) by dea.linux-mips.net (8.13.4/8.13.4/Submit) id j7CDRxiv014185; Fri, 12 Aug 2005 14:27:59 +0100 Date: Fri, 12 Aug 2005 14:27:59 +0100 From: Ralf Baechle To: Alan Cox Cc: netdev@oss.sgi.com Subject: Re: Strange uses of netif_start_queue Message-ID: <20050812132758.GE2819@linux-mips.org> References: <1123853714.22460.39.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1123853714.22460.39.camel@localhost.localdomain> User-Agent: Mutt/1.4.2.1i X-archive-position: 3007 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev Content-Length: 420 Lines: 11 On Fri, Aug 12, 2005 at 02:35:14PM +0100, Alan Cox wrote: > Something I noticed doing the tty work. the 6pack driver calls > netif_start_queue() before it calls register_netdev. I'm curious if this > is allowed ? As part of adding support for extended 6pack which is required by the PR 430 I've recently fixed that. It was looking suspect enough that I fixed it though I don't see any way this could do harm. 
Ralf From kaber@trash.net Fri Aug 12 06:35:57 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 06:36:00 -0700 (PDT) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CDZuH9001604 for ; Fri, 12 Aug 2005 06:35:56 -0700 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.52) id 1E3ZfJ-0001Ue-Hz; Fri, 12 Aug 2005 15:33:41 +0200 Message-ID: <42FCA535.3090108@trash.net> Date: Fri, 12 Aug 2005 15:33:41 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050803 Debian/1.7.10-1 X-Accept-Language: en MIME-Version: 1.0 To: Alan Cox CC: netdev@oss.sgi.com Subject: Re: Strange uses of netif_start_queue References: <1123853714.22460.39.camel@localhost.localdomain> In-Reply-To: <1123853714.22460.39.camel@localhost.localdomain> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3008 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Content-Length: 277 Lines: 7 Alan Cox wrote: > Something I noticed doing the tty work. the 6pack driver calls > netif_start_queue() before it calls register_netdev. I'm curious if this > is allowed ? All netif_start_queue does is clear_bit(...), so it is a NOP at that point. I guess it could be removed. 
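A small userspace sketch of the point made above: if starting the queue only clears a "stopped" bit, then calling it on a freshly zeroed device changes nothing. The struct, the bit index and the function names are hypothetical, and the kernel's clear_bit() is atomic whereas this model is not.

```c
/* Placeholder bit index standing in for the kernel's queue-stopped bit. */
#define TOY_XOFF_BIT 0

struct toy_dev {
    unsigned long state;   /* zero on a freshly allocated device */
};

/* Model of "start queue": clear the stopped bit. */
void toy_start_queue(struct toy_dev *d)
{
    d->state &= ~(1UL << TOY_XOFF_BIT);
}

/* Model of "stop queue": set the stopped bit. */
void toy_stop_queue(struct toy_dev *d)
{
    d->state |= 1UL << TOY_XOFF_BIT;
}

int toy_queue_stopped(const struct toy_dev *d)
{
    return (int)((d->state >> TOY_XOFF_BIT) & 1UL);
}
```

On a zeroed toy_dev, toy_start_queue() leaves state at 0, so the early call is harmless but also pointless; only a preceding toy_stop_queue() gives it something to undo.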
From ralf@linux-mips.org Fri Aug 12 06:41:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 06:41:23 -0700 (PDT) Received: from bacchus.net.dhis.org (extgw-uk.mips.com [62.254.210.129]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CDfIH9004160 for ; Fri, 12 Aug 2005 06:41:18 -0700 Received: from dea.linux-mips.net (localhost.localdomain [127.0.0.1]) by bacchus.net.dhis.org (8.13.4/8.13.1) with ESMTP id j7CDd55G016144; Fri, 12 Aug 2005 14:39:05 +0100 Received: (from ralf@localhost) by dea.linux-mips.net (8.13.4/8.13.4/Submit) id j7CDd5XG016143; Fri, 12 Aug 2005 14:39:05 +0100 Date: Fri, 12 Aug 2005 14:39:05 +0100 From: Ralf Baechle To: Alan Cox Cc: netdev@oss.sgi.com Subject: Re: Strange uses of netif_start_queue Message-ID: <20050812133905.GF2819@linux-mips.org> References: <1123853714.22460.39.camel@localhost.localdomain> <20050812132758.GE2819@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050812132758.GE2819@linux-mips.org> User-Agent: Mutt/1.4.2.1i X-archive-position: 3010 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev Content-Length: 650 Lines: 15 On Fri, Aug 12, 2005 at 02:27:59PM +0100, Ralf Baechle wrote: > > Something I noticed doing the tty work. the 6pack driver calls > > netif_start_queue() before it calls register_netdev. I'm curious if this > > is allowed ? > > As part of adding support for extended 6pack which is required by the > PR 430 I've recently fixed that. It was looking suspect enough that I > fixed it though I don't see any way this could do harm. To answer the fundamental question, I think netif_start_queue / netif_stop_queue should be allowed in case the driver for some reason wants to stop queueing packets immediately after register_netdev.
Ralf From abonilla@linuxwireless.org Fri Aug 12 08:51:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 08:51:29 -0700 (PDT) Received: from linuxwireless.org.ve.carpathiahost.net (linuxwireless.org.ve.carpathiahost.net [66.117.45.234]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CFpLH9008255 for ; Fri, 12 Aug 2005 08:51:22 -0700 Received: from WCRSJO2KPAB047 ([200.9.49.66]) by linuxwireless.org.ve.carpathiahost.net (8.12.10/8.12.10) with SMTP id j7CFnBqf004929 for ; Fri, 12 Aug 2005 11:49:11 -0400 Reply-To: From: "Alejandro Bonilla" To: Subject: More SPAM in Netdev? Date: Fri, 12 Aug 2005 09:49:09 -0600 Message-ID: <00cc01c59f55$61a5a790$a20cc60a@amer.sykes.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.6604 (9.0.2911.0) X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1506 Importance: Normal X-archive-position: 3034 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: abonilla@linuxwireless.org Precedence: bulk X-list: netdev Content-Length: 94 Lines: 5 Hi, Is it me or are we getting much more spam? (The ones with Chinese letters?) 
.Alejandro From zdzichu@irc.pl Fri Aug 12 09:14:05 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 09:14:12 -0700 (PDT) Received: from pollux.ds.pg.gda.pl (pollux.ds.pg.gda.pl [153.19.208.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CGE4H9012450 for ; Fri, 12 Aug 2005 09:14:05 -0700 Received: from localhost (localhost [127.0.0.1]) by pollux.ds.pg.gda.pl (Postfix) with ESMTP id 11749F597B for ; Fri, 12 Aug 2005 18:11:44 +0200 (CEST) Received: from pollux.ds.pg.gda.pl ([127.0.0.1]) by localhost (pollux [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 16140-08 for ; Fri, 12 Aug 2005 18:11:43 +0200 (CEST) Received: from piorun.ds.pg.gda.pl (piorun.ds.pg.gda.pl [153.19.208.8]) by pollux.ds.pg.gda.pl (Postfix) with ESMTP id CBAC9F5979 for ; Fri, 12 Aug 2005 18:11:43 +0200 (CEST) Received: from matthew.ogrody.nsm.pl (daemon@localhost [127.0.0.1]) by piorun.ds.pg.gda.pl (8.13.3/8.13.1) with SMTP id j7CGBm2R022855 for ; Fri, 12 Aug 2005 18:11:48 +0200 Received: (qmail 20741 invoked by uid 1000); 12 Aug 2005 16:11:48 -0000 Date: Fri, 12 Aug 2005 18:11:48 +0200 From: Tomasz Torcz To: netdev@oss.sgi.com Subject: Re: More SPAM in Netdev? 
Message-ID: <20050812161148.GC4367@irc.pl> References: <00cc01c59f55$61a5a790$a20cc60a@amer.sykes.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="5/uDoXvLw7AC5HRs" Content-Disposition: inline In-Reply-To: <00cc01c59f55$61a5a790$a20cc60a@amer.sykes.com> User-Agent: Mutt/1.5.4i X-archive-position: 3039 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zdzichu@irc.pl Precedence: bulk X-list: netdev Content-Length: 981 Lines: 34 --5/uDoXvLw7AC5HRs Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Aug 12, 2005 at 09:49:09AM -0600, Alejandro Bonilla wrote: > Hi, Is it me or are we getting much more spam? (The ones with Chinese > letters?) Enormous amounts. It looks like the spam filters are down. Or maybe that's a friendly reminder that netdev is dead and all activity has moved to the vger lists. I'm going to take that reminder - unsubscribing now. -- Tomasz Torcz "Funeral in the morning, IDE hacking zdzichu@irc.-nie.spam-.pl in the afternoon and evening." 
- Alan Cox --5/uDoXvLw7AC5HRs Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (GNU/Linux) Comment: gpg --search-keys Tomasz Torcz iD8DBQFC/MpEThhlKowQALQRAp2oAJwK3nUqH9KmC7nhUypcrlm5JnB17QCg5GTh DqPf/k9mEOp2YsgnznULzdo= =8FDy -----END PGP SIGNATURE----- --5/uDoXvLw7AC5HRs-- From olh@suse.de Fri Aug 12 10:25:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 10:25:59 -0700 (PDT) Received: from mx2.suse.de (ns2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CHPWH9029383 for ; Fri, 12 Aug 2005 10:25:33 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 1417D1CC3B; Fri, 12 Aug 2005 19:23:21 +0200 (CEST) Received: from nectarine.suse.de (nectarine.suse.de [10.10.1.156]) by Relay2.suse.de (Postfix) with ESMTP id 68D66126DC; Fri, 12 Aug 2005 19:21:52 +0200 (CEST) Date: Fri, 12 Aug 2005 19:21:51 +0200 From: Olaf Hering To: Matt Mackall Cc: Andrew Morton , "David S. Miller" , ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 0/8] netpoll: various bugfixes Message-ID: <20050812172151.GA11104@suse.de> References: <1.502409567@selenic.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1.502409567@selenic.com> X-DOS: I got your 640K Real Mode Right Here Buddy! X-Homeland-Security: You are not supposed to read this line! You are a terrorist! 
User-Agent: Mutt und vi sind doch schneller als Notes (und GroupWise) X-archive-position: 3052 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: olh@suse.de Precedence: bulk X-list: netdev Content-Length: 509 Lines: 14 On Thu, Aug 11, Matt Mackall wrote: > This patch series cleans up a few outstanding bugs in netpoll: > > - two bugfixes from Jeff Moyer's netpoll bonding > - a tweak to e1000's netpoll stub > - timeout handling for e1000 with carrier loss > - prefilling SKBs at init > - a fix-up for a race discovered in initialization > - an unused variable warning Matt, I have tested them, the sender doesnt lockup anymore. But a task dump doesnt work, I get only the first task. This is on a 3GHz xeon with tg3 card. From ravinandan.arakali@neterion.com Fri Aug 12 10:33:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 10:33:34 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CHXUH9030658 for ; Fri, 12 Aug 2005 10:33:30 -0700 Received: by linux.site (Postfix, from userid 0) id F0C03983D0; Fri, 12 Aug 2005 10:15:59 -0700 (PDT) To: jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com, ananda.raju@neterion.com From: ravinandan.arakali@neterion.com Subject: [PATCH 2.6.13-rc6] S2io: Hardware fixes for Xframe II adapter Message-Id: <20050812171559.F0C03983D0@linux.site> Date: Fri, 12 Aug 2005 10:15:59 -0700 (PDT) X-archive-position: 3054 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 3951 Lines: 121 Hi, Patch Description: This patch incorporates the following hardware fixes required for Xframe II 
adapter. 1. New values to program the dtx_control register. 2. Disable memory controller interrupts(MC_INTR) since these are now monitored thru' a poll routine. 3. Don't reset an XframeII card on an ECC double-bit error(It can recover). 4. Save/restore PCI config space before/after a reset irrespective of Xframe I or II card. 5. Bumped up the driver version no. to 2.0.3.1 Please review the patch and apply the same if it looks ok. Signed-off-by: Ravinandan Arakali --- diff -urN old/drivers/net/s2io.c new/drivers/net/s2io.c --- old/drivers/net/s2io.c 2005-08-12 01:29:56.000000000 -0700 +++ new/drivers/net/s2io.c 2005-08-12 01:30:12.000000000 -0700 @@ -67,7 +67,7 @@ /* S2io Driver name & version. */ static char s2io_driver_name[] = "Neterion"; -static char s2io_driver_version[] = "Version 2.0.2.1"; +static char s2io_driver_version[] = "Version 2.0.3.1"; static inline int RXD_IS_UP2DT(RxD_t *rxdp) { @@ -210,14 +210,18 @@ static u64 herc_act_dtx_cfg[] = { /* Set address */ - 0x80000515BA750000ULL, 0x80000515BA7500E0ULL, + 0x8000051536750000ULL, 0x80000515367500E0ULL, /* Write data */ - 0x80000515BA750004ULL, 0x80000515BA7500E4ULL, + 0x8000051536750004ULL, 0x80000515367500E4ULL, /* Set address */ 0x80010515003F0000ULL, 0x80010515003F00E0ULL, /* Write data */ 0x80010515003F0004ULL, 0x80010515003F00E4ULL, /* Set address */ + 0x801205150D440000ULL, 0x801205150D4400E0ULL, + /* Write data */ + 0x801205150D440004ULL, 0x801205150D4400E4ULL, + /* Set address */ 0x80020515F2100000ULL, 0x80020515F21000E0ULL, /* Write data */ 0x80020515F2100004ULL, 0x80020515F21000E4ULL, @@ -1903,7 +1907,7 @@ } /* Enable select interrupts */ - interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | MC_INTR; + interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR; interruptible |= TX_PIC_INTR | RX_PIC_INTR; interruptible |= TX_MAC_INTR | RX_MAC_INTR; @@ -2030,7 +2034,7 @@ config = &nic->config; /* Disable all interrupts */ - interruptible = TX_TRAFFIC_INTR | RX_TRAFFIC_INTR | MC_INTR; + interruptible = 
TX_TRAFFIC_INTR | RX_TRAFFIC_INTR; interruptible |= TX_PIC_INTR | RX_PIC_INTR; interruptible |= TX_MAC_INTR | RX_MAC_INTR; en_dis_able_nic_intrs(nic, interruptible, DISABLE_INTRS); @@ -2688,8 +2692,10 @@ DBG_PRINT(ERR_DBG, "%s: Device indicates ", dev->name); DBG_PRINT(ERR_DBG, "double ECC error!!\n"); - netif_stop_queue(dev); - schedule_work(&nic->rst_timer_task); + if (nic->device_type != XFRAME_II_DEVICE) { + netif_stop_queue(dev); + schedule_work(&nic->rst_timer_task); + } } else { nic->mac_control.stats_info->sw_stat. single_ecc_errs++; @@ -2772,8 +2778,7 @@ u16 subid, pci_cmd; /* Back up the PCI-X CMD reg, dont want to lose MMRBC, OST settings */ - if (sp->device_type == XFRAME_I_DEVICE) - pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, &(pci_cmd)); + pci_read_config_word(sp->pdev, PCIX_COMMAND_REGISTER, &(pci_cmd)); val64 = SW_RESET_ALL; writeq(val64, &bar0->sw_reset); @@ -2792,14 +2797,10 @@ */ msleep(250); - if (!(sp->device_type & XFRAME_II_DEVICE)) { - /* Restore the PCI state saved during initializarion. */ - pci_restore_state(sp->pdev); - pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, + /* Restore the PCI state saved during initialization. 
*/ + pci_restore_state(sp->pdev); + pci_write_config_word(sp->pdev, PCIX_COMMAND_REGISTER, pci_cmd); - } else { - pci_set_master(sp->pdev); - } s2io_init_pci(sp); msleep(250); @@ -5426,9 +5427,7 @@ INIT_WORK(&sp->set_link_task, (void (*)(void *)) s2io_set_link, sp); - if (!(sp->device_type & XFRAME_II_DEVICE)) { - pci_save_state(sp->pdev); - } + pci_save_state(sp->pdev); /* Setting swapper control on the NIC, for proper reset operation */ if (s2io_set_swapper(sp)) { From davem@davemloft.net Fri Aug 12 11:02:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 11:02:52 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CI2kH9001745 for ; Fri, 12 Aug 2005 11:02:46 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1E3dpa-0001vJ-6r; Fri, 12 Aug 2005 11:00:34 -0700 Date: Fri, 12 Aug 2005 11:00:34 -0700 (PDT) Message-Id: <20050812.110034.63126875.davem@davemloft.net> To: alan@lxorguk.ukuu.org.uk Cc: netdev@oss.sgi.com Subject: Re: Strange uses of netif_start_queue From: "David S. Miller" In-Reply-To: <1123853714.22460.39.camel@localhost.localdomain> References: <1123853714.22460.39.camel@localhost.localdomain> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3058 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 424 Lines: 12 From: Alan Cox Date: Fri, 12 Aug 2005 14:35:14 +0100 > Something I noticed doing the tty work. the 6pack driver calls > netif_start_queue() before it calls register_netdev. I'm curious if this > is allowed ? 
It's definitely a bug, and when register_netdev() happens it will just overwrite that change of state. Since the queue is not initialized yet, this could also cause a crash or hang. :-) From davem@davemloft.net Fri Aug 12 11:05:38 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 11:05:42 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CI5cH9004026 for ; Fri, 12 Aug 2005 11:05:38 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1E3dsG-0001wa-80; Fri, 12 Aug 2005 11:03:20 -0700 Date: Fri, 12 Aug 2005 11:03:20 -0700 (PDT) Message-Id: <20050812.110320.97292810.davem@davemloft.net> To: ralf@linux-mips.org Cc: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: Strange uses of netif_start_queue From: "David S. Miller" In-Reply-To: <20050812133905.GF2819@linux-mips.org> References: <1123853714.22460.39.camel@localhost.localdomain> <20050812132758.GE2819@linux-mips.org> <20050812133905.GF2819@linux-mips.org> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3061 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 484 Lines: 10 From: Ralf Baechle Date: Fri, 12 Aug 2005 14:39:05 +0100 > To answer the fundamental question, I think netif_start_queue / > netif_stop_queue should be allowed in case the driver for some reason has > the desire to stop queueing of packet immediately after register_netdev. I disagree. register_netdev() does not make packets start getting queued; you have to bring the interface up for that to start occurring. And your ->open() routine has full control over that. 
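The ordering argument in the reply above can be sketched with a toy user-space model. Everything here is hypothetical: `struct toy_dev`, `toy_register()`, `toy_up()` and the two `open_*` hooks are invented for illustration and are not kernel APIs. The model encodes only the claim that packets can be queued once the interface is up, and that the driver's open hook decides the queue state at that moment. It deliberately says nothing about whether registration resets the queue bit, which is disputed elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_dev {
    bool registered;
    bool up;
    bool queue_stopped;
    bool (*open)(struct toy_dev *dev);  /* driver hook, true on success */
};

/* Registration makes the device known but does not bring it up. */
void toy_register(struct toy_dev *dev)
{
    dev->registered = true;
}

/* Bringing the interface up runs the driver's open hook first. */
bool toy_up(struct toy_dev *dev)
{
    if (!dev->registered)
        return false;               /* unknown devices cannot be brought up */
    if (dev->open && !dev->open(dev))
        return false;
    dev->up = true;
    return true;
}

/* Packets can be queued only on an up device with a running queue. */
bool toy_can_queue(const struct toy_dev *dev)
{
    return dev->up && !dev->queue_stopped;
}

/* Example hook: a driver that wants traffic right away. */
bool open_running(struct toy_dev *dev)
{
    dev->queue_stopped = false;
    return true;
}

/* Example hook: a driver that wants to hold traffic back at first. */
bool open_stopped(struct toy_dev *dev)
{
    dev->queue_stopped = true;
    return true;
}
```

With this shape, a driver that wants the queue held back from the start expresses that in its open hook rather than before registration.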
From tgraf@suug.ch Fri Aug 12 11:48:17 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 11:48:22 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CImCH9011306 for ; Fri, 12 Aug 2005 11:48:17 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 41BC81C0EB; Fri, 12 Aug 2005 20:46:05 +0200 (CEST) Date: Fri, 12 Aug 2005 20:46:05 +0200 From: Thomas Graf To: "David S. Miller" Cc: alan@lxorguk.ukuu.org.uk, netdev@oss.sgi.com Subject: Re: Strange uses of netif_start_queue Message-ID: <20050812184605.GK10481@postel.suug.ch> References: <1123853714.22460.39.camel@localhost.localdomain> <20050812.110034.63126875.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050812.110034.63126875.davem@davemloft.net> X-archive-position: 3069 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1206 Lines: 26 * David S. Miller <20050812.110034.63126875.davem@davemloft.net> 2005-08-12 11:00 > From: Alan Cox > Date: Fri, 12 Aug 2005 14:35:14 +0100 > > > Something I noticed doing the tty work. the 6pack driver calls > > netif_start_queue() before it calls register_netdev. I'm curious if this > > is allowed ? > > It's definitely a bug, and when register_netdev() happens it will > just overwrite that change of state. > > Since the queue is not initialized yet, this could also cause > a crash or hang. :-) Hmm, maybe I got something wrong but: As you say correctly, there is no way we can queue packets until the device has been put up, a device cannot be put up while it is not present but that doesn't happen before register_netdevice() so the value of the queue bit doesn't matter at all. 
Actually the default state is indeed to have the queue running (bit set to 0) so the mentioned netif_start_queue() is a simple nop as Patrick noted already. It will _not_ be overwritten by a call to register_netdevice() though which is a good thing: It allows the queue to be stopped from the very beginning without a race between register_netdevice() and the call to netif_stop_queue(). From john.ronciak@gmail.com Fri Aug 12 12:04:16 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 12:04:43 -0700 (PDT) Received: from zproxy.gmail.com (zproxy.gmail.com [64.233.162.199]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CJ4FH9013430 for ; Fri, 12 Aug 2005 12:04:16 -0700 Received: by zproxy.gmail.com with SMTP id m22so440991nzf for ; Fri, 12 Aug 2005 12:02:04 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=j9Y4zaH3tuqvsKq8UlqKui8vqA6OzVdpJf2viXg2FEJ5YkeZyyq8X+V5wa2Dync3BjgKiUhbbo1qqCFATHLhKfePYuV8JrSvvpzevpuJP9VwU+1AO3MD+ALGb7oV7PgoM4T7DnAMHLp/OiEoEyy0WGQyKb2qmAR9bgZzDmyM4Tg= Received: by 10.37.15.47 with SMTP id s47mr1622801nzi; Fri, 12 Aug 2005 12:02:04 -0700 (PDT) Received: by 10.36.148.11 with HTTP; Fri, 12 Aug 2005 12:02:03 -0700 (PDT) Message-ID: <56a8daef0508121202172bcd17@mail.gmail.com> Date: Fri, 12 Aug 2005 12:02:03 -0700 From: John Ronciak To: Matt Mackall Subject: Re: [PATCH 3/8] netpoll: e1000 netpoll tweak Cc: Andrew Morton , "David S. 
Miller" , ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org In-Reply-To: <4.502409567@selenic.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <3.502409567@selenic.com> <4.502409567@selenic.com> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7CJ4FH9013430 X-archive-position: 3072 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: john.ronciak@gmail.com Precedence: bulk X-list: netdev Content-Length: 991 Lines: 30 Sorry this reply was to go to the whole list but only made it to Matt. The e1000_intr() routine already calls e1000_clean_tx_irq(). So what's the point of this patch? Am I missing something? On 8/11/05, Matt Mackall wrote: > Suggested by Steven Rostedt, matches his patch included in e100. > > Signed-off-by: Matt Mackall > > Index: l/drivers/net/e1000/e1000_main.c > =================================================================== > --- l.orig/drivers/net/e1000/e1000_main.c 2005-08-06 17:36:32.000000000 -0500 > +++ l/drivers/net/e1000/e1000_main.c 2005-08-06 17:55:01.000000000 -0500 > @@ -3789,6 +3789,7 @@ e1000_netpoll(struct net_device *netdev) > struct e1000_adapter *adapter = netdev_priv(netdev); > disable_irq(adapter->pdev->irq); > e1000_intr(adapter->pdev->irq, netdev, NULL); > + e1000_clean_tx_irq(adapter); > enable_irq(adapter->pdev->irq); > } > #endif > > -- Cheers, John From davem@davemloft.net Fri Aug 12 12:13:43 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 12:14:06 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CJDgH9015875 for ; Fri, 12 Aug 2005 12:13:43 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net 
with esmtp (Exim 4.52) id 1E3evO-00066o-TH; Fri, 12 Aug 2005 12:10:38 -0700 Date: Fri, 12 Aug 2005 12:10:38 -0700 (PDT) Message-Id: <20050812.121038.23012223.davem@davemloft.net> To: john.ronciak@gmail.com Cc: mpm@selenic.com, akpm@osdl.com, ak@suse.de, jmoyer@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 3/8] netpoll: e1000 netpoll tweak From: "David S. Miller" In-Reply-To: <56a8daef0508121202172bcd17@mail.gmail.com> References: <3.502409567@selenic.com> <4.502409567@selenic.com> <56a8daef0508121202172bcd17@mail.gmail.com> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3074 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 410 Lines: 11 From: John Ronciak Subject: Re: [PATCH 3/8] netpoll: e1000 netpoll tweak Date: Fri, 12 Aug 2005 12:02:03 -0700 > Sorry this reply was to go to the whole list but only made it to Matt. > > The e1000_intr() routine already calls e1000_clean_tx_irq(). So > what's the point of this patch? Am I missing something? e1000_intr() does not call e1000_clean_tx_irq() when NAPI is enabled. 
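To make the answer concrete, here is a toy user-space model of why the extra call in the patch matters. None of this is e1000 code: `struct toy_adapter`, `toy_intr()`, `toy_clean_tx()` and `toy_netpoll()` are invented names. The model assumes only what is stated in the thread, namely that with NAPI enabled the interrupt handler does not perform the TX cleanup itself; that it schedules a later poll instead is an additional assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_adapter {
    bool napi_enabled;     /* assumption: NAPI on/off switch */
    bool poll_scheduled;   /* set when cleanup is deferred to a poll */
    int  tx_pending;       /* transmitted descriptors not yet reclaimed */
};

/* Reclaim all completed TX descriptors. */
void toy_clean_tx(struct toy_adapter *a)
{
    a->tx_pending = 0;
}

/* Interrupt handler: cleans TX itself only in the non-NAPI case. */
void toy_intr(struct toy_adapter *a)
{
    if (a->napi_enabled)
        a->poll_scheduled = true;   /* cleanup is left to a later poll */
    else
        toy_clean_tx(a);
}

/* Netpoll hook shaped like the patch: run the handler, then clean TX. */
void toy_netpoll(struct toy_adapter *a)
{
    toy_intr(a);
    toy_clean_tx(a);
}
```

In the non-NAPI case the second cleanup is redundant but harmless in this model; in the NAPI case it is the only thing that reclaims descriptors before the hook returns.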
From oxymoron@waste.org Fri Aug 12 12:20:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 12:20:34 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CJKRH9021075 for ; Fri, 12 Aug 2005 12:20:28 -0700 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7CJHrQe027533 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Fri, 12 Aug 2005 14:17:53 -0500 Received: (from oxymoron@localhost) by waste.org (8.13.4/8.13.4/Submit) id j7CJHqtt027530; Fri, 12 Aug 2005 14:17:52 -0500 Date: Fri, 12 Aug 2005 12:17:52 -0700 From: Matt Mackall To: John Ronciak Cc: Andrew Morton , "David S. Miller" , ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 3/8] netpoll: e1000 netpoll tweak Message-ID: <20050812191752.GI12284@waste.org> References: <3.502409567@selenic.com> <4.502409567@selenic.com> <56a8daef0508121202172bcd17@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56a8daef0508121202172bcd17@mail.gmail.com> User-Agent: Mutt/1.5.9i X-archive-position: 3076 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 1095 Lines: 29 [corrected akpm's address] On Fri, Aug 12, 2005 at 12:02:03PM -0700, John Ronciak wrote: > Sorry this reply was to go to the whole list but only made it to Matt. > > The e1000_intr() routine already calls e1000_clean_tx_irq(). So > what's the point of this patch? Am I missing something? Here is Steven's original analysis: http://lkml.org/lkml/2005/8/5/116 It looked plausible, but I didn't dig much deeper. 
> > Index: l/drivers/net/e1000/e1000_main.c > > =================================================================== > > --- l.orig/drivers/net/e1000/e1000_main.c 2005-08-06 17:36:32.000000000 -0500 > > +++ l/drivers/net/e1000/e1000_main.c 2005-08-06 17:55:01.000000000 -0500 > > @@ -3789,6 +3789,7 @@ e1000_netpoll(struct net_device *netdev) > > struct e1000_adapter *adapter = netdev_priv(netdev); > > disable_irq(adapter->pdev->irq); > > e1000_intr(adapter->pdev->irq, netdev, NULL); > > + e1000_clean_tx_irq(adapter); > > enable_irq(adapter->pdev->irq); > > } > > #endif -- Mathematics is the supreme nostalgia of our time. From oxymoron@waste.org Fri Aug 12 12:24:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 12:24:26 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CJOKH9021894 for ; Fri, 12 Aug 2005 12:24:20 -0700 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7CJLqLt027990 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Fri, 12 Aug 2005 14:21:52 -0500 Received: (from oxymoron@localhost) by waste.org (8.13.4/8.13.4/Submit) id j7CJLqOJ027987; Fri, 12 Aug 2005 14:21:52 -0500 Date: Fri, 12 Aug 2005 12:21:52 -0700 From: Matt Mackall To: Olaf Hering Cc: Andrew Morton , "David S. 
Miller" , ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 0/8] netpoll: various bugfixes Message-ID: <20050812192152.GJ12284@waste.org> References: <1.502409567@selenic.com> <20050812172151.GA11104@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050812172151.GA11104@suse.de> User-Agent: Mutt/1.5.9i X-archive-position: 3077 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 771 Lines: 23 [corrected akpm's address] On Fri, Aug 12, 2005 at 07:21:51PM +0200, Olaf Hering wrote: > On Thu, Aug 11, Matt Mackall wrote: > > > This patch series cleans up a few outstanding bugs in netpoll: > > > > - two bugfixes from Jeff Moyer's netpoll bonding > > - a tweak to e1000's netpoll stub > > - timeout handling for e1000 with carrier loss > > - prefilling SKBs at init > > - a fix-up for a race discovered in initialization > > - an unused variable warning > > Matt, I have tested them, the sender doesnt lockup anymore. But a > task dump doesnt work, I get only the first task. This is on a 3GHz xeon > with tg3 card. Does the task dump work without patch 5/8 (add retry timeout)? I'll try testing it here. -- Mathematics is the supreme nostalgia of our time. 
From olh@suse.de Fri Aug 12 12:33:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 12 Aug 2005 12:33:30 -0700 (PDT) Received: from mx1.suse.de (mail.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7CJXQH9023689 for ; Fri, 12 Aug 2005 12:33:26 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 824E4E8CF; Fri, 12 Aug 2005 21:31:14 +0200 (CEST) Received: from nectarine.suse.de (nectarine.suse.de [10.10.1.156]) by Relay2.suse.de (Postfix) with ESMTP id 0962C16316; Fri, 12 Aug 2005 21:31:09 +0200 (CEST) Date: Fri, 12 Aug 2005 21:31:09 +0200 From: Olaf Hering To: Matt Mackall Cc: Andrew Morton , "David S. Miller" , ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 0/8] netpoll: various bugfixes Message-ID: <20050812193109.GA15434@suse.de> References: <1.502409567@selenic.com> <20050812172151.GA11104@suse.de> <20050812192152.GJ12284@waste.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20050812192152.GJ12284@waste.org> X-DOS: I got your 640K Real Mode Right Here Buddy! X-Homeland-Security: You are not supposed to read this line! You are a terrorist! User-Agent: Mutt und vi sind doch schneller als Notes (und GroupWise) X-archive-position: 3079 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: olh@suse.de Precedence: bulk X-list: netdev Content-Length: 273 Lines: 7 On Fri, Aug 12, Matt Mackall wrote: > Does the task dump work without patch 5/8 (add retry timeout)? I'll > try testing it here. I spoke too soon: it worked once, but after a reboot not anymore. I will try to play with individual patches. Does the task dump work for you, at least? 
From oxymoron@waste.org Sun Aug 14 14:02:51 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 14 Aug 2005 14:02:58 -0700 (PDT) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7EL2oH9013791 for ; Sun, 14 Aug 2005 14:02:50 -0700 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.13.4/8.13.4/Debian-3) with ESMTP id j7EL0DZG007342 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Sun, 14 Aug 2005 16:00:13 -0500 Received: (from oxymoron@localhost) by waste.org (8.13.4/8.13.4/Submit) id j7EL0A3S007338; Sun, 14 Aug 2005 16:00:10 -0500 Date: Sun, 14 Aug 2005 14:00:10 -0700 From: Matt Mackall To: Olaf Hering Cc: Andrew Morton , "David S. Miller" , ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 0/8] netpoll: various bugfixes Message-ID: <20050814210010.GU12284@waste.org> References: <1.502409567@selenic.com> <20050812172151.GA11104@suse.de> <20050812192152.GJ12284@waste.org> <20050812193109.GA15434@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050812193109.GA15434@suse.de> User-Agent: Mutt/1.5.9i X-archive-position: 3329 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev Content-Length: 537 Lines: 14 On Fri, Aug 12, 2005 at 09:31:09PM +0200, Olaf Hering wrote: > On Fri, Aug 12, Matt Mackall wrote: > > > Does the task dump work without patch 5/8 (add retry timeout)? I'll > > try testing it here. > > I spoke to soon, worked once, after reboot not anymore. Will try to play > with individual patches. Does the task dump work for you, at least? Works flawlessly on e1000. Works on tg3 with serial console, but seems to cause trouble without. Haven't had time to dig deeper yet. 
-- Mathematics is the supreme nostalgia of our time. From zwane@arm.linux.org.uk Sun Aug 14 19:12:02 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 14 Aug 2005 19:12:13 -0700 (PDT) Received: from fsmlabs.com (fsmlabs.com [168.103.115.128]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7F2C1H9015289 for ; Sun, 14 Aug 2005 19:12:02 -0700 Received: from montezuma.fsmlabs.com (65-100-26-250.albq.qwest.net [65.100.26.250]) by fsmlabs.com (8.13.1/8.13.1) with ESMTP id j7F29MZ5024939; Sun, 14 Aug 2005 20:09:23 -0600 Date: Sun, 14 Aug 2005 20:15:53 -0600 (MDT) From: Zwane Mwaikambo To: Harald Welte cc: Andrew Morton , LKML , netdev@oss.sgi.com, "Rafael J. Wysocki" Subject: Re: 2.6.13-rc5-mm1: BUG: rwlock recursion on CPU#0 In-Reply-To: <200508141448.36562.rjw@sisk.pl> Message-ID: References: <200508141448.36562.rjw@sisk.pl> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3361 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zwane@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 3698 Lines: 71 On Sun, 14 Aug 2005, Rafael J. 
Wysocki wrote: > I've got the following BUG on Asus L5D (x86-64) with the 2.6.13-rc5-mm1 kernel: > > BUG: rwlock recursion on CPU#0, nscd/3668, ffffffff8817d4a0 > > Call Trace:{add_preempt_count+105} {rwlock_bug+114} > {_raw_write_lock+62} {_write_lock_bh+40} > {:ip_conntrack:destroy_conntrack+196} > {:ip_conntrack:__ip_ct_event_cache_init+165} > {:ip_conntrack:ip_ct_refresh_acct+249} > {:ip_conntrack:udp_packet+47} {:ip_conntrack:ip_conntrack_in+1059} > {:ip_conntrack:ip_conntrack_local+76} > {nf_iterate+92} {dst_output+0} > {nf_hook_slow+142} {dst_output+0} > {ip_push_pending_frames+895} {lock_sock+201} > {udp_push_pending_frames+574} {udp_sendmsg+1703} > {current_fs_time+78} {file_read_actor+60} > {update_atime+76} {do_generic_mapping_read+1194} > {inet_sendmsg+86} {sock_sendmsg+271} > {add_preempt_count+105} {free_hot_cold_page+270} > {free_hot_page+11} {add_preempt_count+105} > {autoremove_wake_function+0} {sockfd_lookup+28} > {sys_sendto+260} {do_sys_poll+851} > {__pollwait+0} {system_call+126} > > --------------------------- > | preempt count: 00000303 ] > | 3 level deep critical section nesting: > ---------------------------------------- > .. [] .... nf_hook_slow+0x35/0x160 > .....[] .. ( <= ip_push_pending_frames+0x37f/0x490) > .. [] .... _write_lock_bh+0x20/0x30 > .....[] .. ( <= ip_ct_refresh_acct+0xb0/0x160 [ip_conntrack]) > .. [] .... _write_lock_bh+0x20/0x30 > .....[] .. ( <= destroy_conntrack+0xc4/0x180 [ip_conntrack]) Is the following patch correct? ip_conntrack_event_cache should never be called with ip_conntrack_lock held and ct_add_counters does not need to be called with ip_conntrack_lock held. 

Index: linux-2.6.13-rc5-mm1/net/ipv4/netfilter/ip_conntrack_core.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.13-rc5-mm1/net/ipv4/netfilter/ip_conntrack_core.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 ip_conntrack_core.c
--- linux-2.6.13-rc5-mm1/net/ipv4/netfilter/ip_conntrack_core.c	7 Aug 2005 21:38:40 -0000	1.1.1.1
+++ linux-2.6.13-rc5-mm1/net/ipv4/netfilter/ip_conntrack_core.c	15 Aug 2005 02:09:23 -0000
@@ -1139,15 +1139,20 @@ void ip_ct_refresh_acct(struct ip_conntr
 		ct->timeout.expires = extra_jiffies;
 		ct_add_counters(ct, ctinfo, skb);
 	} else {
+		int do_event_cache = 0;
+
 		write_lock_bh(&ip_conntrack_lock);
 		/* Need del_timer for race avoidance (may already be dying). */
 		if (del_timer(&ct->timeout)) {
 			ct->timeout.expires = jiffies + extra_jiffies;
 			add_timer(&ct->timeout);
-			ip_conntrack_event_cache(IPCT_REFRESH, skb);
+			do_event_cache = 1;
 		}
-		ct_add_counters(ct, ctinfo, skb);
 		write_unlock_bh(&ip_conntrack_lock);
+
+		if (do_event_cache)
+			ip_conntrack_event_cache(IPCT_REFRESH, skb);
+		ct_add_counters(ct, ctinfo, skb);
 	}
 }

From olh@suse.de Sun Aug 14 23:19:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 14 Aug 2005 23:19:20 -0700 (PDT) Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7F6JAH9008681 for ; Sun, 14 Aug 2005 23:19:10 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id B942B1C7BC; Mon, 15 Aug 2005 08:16:51 +0200 (CEST) Received: from nectarine.suse.de (nectarine.suse.de [10.10.1.156]) by Relay2.suse.de (Postfix) with ESMTP id 24046A7E2; Mon, 15 Aug 2005 08:16:46 +0200 (CEST) Date: Mon, 15 Aug 2005 08:16:46 +0200 From: Olaf Hering To: Matt Mackall Cc: Andrew Morton , "David S.
Miller" , ak@suse.de, Jeff Moyer , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, mingo@elte.hu, john.ronciak@intel.com, rostedt@goodmis.org Subject: Re: [PATCH 0/8] netpoll: various bugfixes Message-ID: <20050815061646.GA20762@suse.de> References: <1.502409567@selenic.com> <20050812172151.GA11104@suse.de> <20050812192152.GJ12284@waste.org> <20050812193109.GA15434@suse.de> <20050814210010.GU12284@waste.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20050814210010.GU12284@waste.org> X-DOS: I got your 640K Real Mode Right Here Buddy! X-Homeland-Security: You are not supposed to read this line! You are a terrorist! User-Agent: Mutt und vi sind doch schneller als Notes (und GroupWise) X-archive-position: 3389 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: olh@suse.de Precedence: bulk X-list: netdev Content-Length: 715 Lines: 19 On Sun, Aug 14, Matt Mackall wrote: > On Fri, Aug 12, 2005 at 09:31:09PM +0200, Olaf Hering wrote: > > On Fri, Aug 12, Matt Mackall wrote: > > > > > Does the task dump work without patch 5/8 (add retry timeout)? I'll > > > try testing it here. > > > > I spoke too soon, worked once, after reboot not anymore. Will try to play > > with individual patches. Does the task dump work for you, at least? > > Works flawlessly on e1000. Works on tg3 with serial console, but seems > to cause trouble without. Haven't had time to dig deeper yet. Can you send me your .config off-list? I'm using ftp.suse.com/pub/projects/kernel/kotd/i386/HEAD/kernel-smp.i586.rpm will check if that nmi_watchdog thing shows anything.
From laforge@netfilter.org Mon Aug 15 00:39:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Aug 2005 00:40:01 -0700 (PDT) Received: from ganesha.gnumonks.org (ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7F7drH9022381 for ; Mon, 15 Aug 2005 00:39:53 -0700 Received: from uucp by ganesha.gnumonks.org with local-bsmtp (Exim 4.50) id 1E4ZXN-0006aT-6D for netdev@oss.sgi.com; Mon, 15 Aug 2005 09:37:37 +0200 Received: from laforge by rama.gnumonks.org with local (Exim 3.36 #1) id 1E4bP8-0001AI-00; Mon, 15 Aug 2005 11:37:14 +0200 Date: Mon, 15 Aug 2005 11:37:14 +0200 From: Harald Welte To: Zwane Mwaikambo Cc: Andrew Morton , LKML , netdev@oss.sgi.com, "Rafael J. Wysocki" Subject: Re: 2.6.13-rc5-mm1: BUG: rwlock recursion on CPU#0 Message-ID: <20050815093714.GB4439@rama.de.gnumonks.org> References: <200508141448.36562.rjw@sisk.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="7AUc2qLy4jB3hD7Z" Content-Disposition: inline In-Reply-To: User-Agent: mutt-ng devel-20050619 (Debian) X-archive-position: 3397 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev Content-Length: 1860 Lines: 53 --7AUc2qLy4jB3hD7Z Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Aug 14, 2005 at 08:15:53PM -0600, Zwane Mwaikambo wrote: > Is the following patch correct? ip_conntrack_event_cache should never be > called with ip_conntrack_lock held and ct_add_counters does not need to be > called with ip_conntrack_lock held. No, it's not correct. ct_add_counters has to be called from within write_lock_bh() on ip_conntrack_lock. So if you keep the ct_add_counters() call where it is and only apply the rest of your patch (i.e.
deferring of ip_conntrack_event_cache() call), then I think your patch would work. However, the whole eventcache needs to be audited, it's called from a number of places. As Patrick wrote he's working on a solution, I'm not going to intervene or replicate his work. As an interim solution I'd suggest disabling CONFIG_IP_NF_CT_ACCT [which can't be vital anyway, since it was only added in net-2.6.14 (and thus -mm)]. Cheers, -- - Harald Welte http://netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --7AUc2qLy4jB3hD7Z Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFDAGJJXaXGVTD0i/8RAgAoAKCXgyYsWyIzw6bKK1OnpnlhTAEvcgCgqOeG B16+kW+DiFqW3wA/tVPX/TA= =fakC -----END PGP SIGNATURE----- --7AUc2qLy4jB3hD7Z-- From zwane@arm.linux.org.uk Mon Aug 15 07:22:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Aug 2005 07:22:17 -0700 (PDT) Received: from fsmlabs.com (fsmlabs.com [168.103.115.128]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7FEM9H9018442 for ; Mon, 15 Aug 2005 07:22:10 -0700 Received: from montezuma.fsmlabs.com (65-100-26-250.albq.qwest.net [65.100.26.250]) by fsmlabs.com (8.13.1/8.13.1) with ESMTP id j7FEJMwF021297; Mon, 15 Aug 2005 08:19:23 -0600 Date: Mon, 15 Aug 2005 08:25:51 -0600 (MDT) From: Zwane Mwaikambo To: Harald Welte cc: Andrew Morton , LKML , netdev@oss.sgi.com, "Rafael J.
Wysocki" Subject: Re: 2.6.13-rc5-mm1: BUG: rwlock recursion on CPU#0 In-Reply-To: <20050815093714.GB4439@rama.de.gnumonks.org> Message-ID: References: <200508141448.36562.rjw@sisk.pl> <20050815093714.GB4439@rama.de.gnumonks.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 3459 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zwane@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 1081 Lines: 29 On Mon, 15 Aug 2005, Harald Welte wrote: > On Sun, Aug 14, 2005 at 08:15:53PM -0600, Zwane Mwaikambo wrote: > > > Is the following patch correct? ip_conntrack_event_cache should never be > > called with ip_conntrack_lock held and ct_add_counters does not need to be > > called with ip_conntrack_lock held. > > No, it's not correct. ct_add_counters has to be called from within > write_lock_bh() on ip_conntrack_lock. > > So if you keep the ct_add_counters() call where it is and only apply the > rest of your patch (i.e. deferring of ip_conntrack_event_cache() call), > then I think your patch would work. > > However, the whole eventcache needs to be audited, it's called from a > number of places. > > As Patrick wrote he's working on a solution, I'm not going to intervene > or replicate his work. As an interim solution I'd suggest disabling > CONFIG_IP_NF_CT_ACCT [which can't be vital anyway, since it was only > added in net-2.6.14 (and thus -mm)]. Thanks for the explanation Harald, I based the ct_add_counters assumption on other callers of it.
Thanks, Zwane From sim@netnation.com Mon Aug 15 14:41:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Aug 2005 14:41:12 -0700 (PDT) Received: from peace.netnation.com (newpeace.netnation.com [204.174.223.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7FLf9H9010895 for ; Mon, 15 Aug 2005 14:41:09 -0700 Received: from sim by peace.netnation.com with local (Exim 4.50) id 1E4mfX-0004xc-GF; Mon, 15 Aug 2005 14:38:55 -0700 Date: Mon, 15 Aug 2005 14:38:55 -0700 From: Simon Kirby To: Robert Olsson , netdev@oss.sgi.com Subject: Route cache performance Message-ID: <20050815213855.GA17832@netnation.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 3497 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sim@netnation.com Precedence: bulk X-list: netdev Content-Length: 2569 Lines: 52 Hi! Well, after a few years of other work :), I have finally got around to setting up some more permanent forwarding / route cache performance test boxes. I noticed the route trie option in the newer 2.6 kernels and figured it would be a good time to revisit things. Test setup: [Xeon w/e1000]---[Opteron w/dual e1000]---[Xeon w/e1000] The Xeons are 2.4 GHz boxes and the Opteron is a 140. At some point I intend to compare the performance of a 32 bit versus 64 bit kernel. I'm only able to get pktgen to spit out about 660 kpps from the test 2.4 Xeon box with onboard e1000 (pause disabled), but already I notice some disappointing results. The old 2.4.27 kernel I last did tests with seems to do a much better job of forwarding small packets (static src/dst) than 2.4.31 and 2.6.12. On the (leftmost) sending box, 2.4.27, 2.4.31, and 2.6.12 all seem to do fairly well at transmission with pktgen. The 2.6 pktgen seems a little better (no transmission errors and a few more Mbps), so I've been using 2.6.12. 
With fixed dst packets and pause disabled via ethtool, about 660 kpps is sent continuously. juno (spoofed source, userspace) seems to do about 360 kpps. The routes and packets are set up to route through the Opteron box to the receiving (rightmost) box. I've noticed that e1000 changes integrated in 2.6.11-bk2 are resulting in the forwarding test box slowing down enough that it seems to be exposing "dst cache overflow", even though under slightly less load the gc seems to be able to keep up. Robert, if I read correctly it seems that the e1000 NAPI changes were some fixes you submitted? Something appears to be different in the rtcache GC or perhaps NAPI or some other interaction, because firing juno at 2.4 does not show any problems while I can't seem to get 2.6.12 to _not_ print "dst cache overflow". 2.6.11 (pre-bk2) seems a little better at start, but any kind of burst seems to make the route cache entries exceed gc and then the slower hash lookups seem to make it get stuck at max_size (and printing "dst cache overflow") until the attack stops, even with gc_min_interval set to 0 (really 0). Anyway, I'm still in early testing stages here but it seems it's still as easy as ever to destroy routers (and hosts?) with a fairly small stream of small packets which create new rtcache entries. These days, 184 Mbps is starting to fall under the "small" DoS attack category, too. I notice the hash table size is still only 4096 buckets for 512 MB, which isn't that wonderful when it hits a max_size of 65536 (w/512 MB)... 
Simon- From dada1@cosmosbay.com Mon Aug 15 19:25:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 15 Aug 2005 19:25:50 -0700 (PDT) Received: from smtp.cegetel.net (mf01.sitadelle.com [212.94.174.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7G2PiH9030846 for ; Mon, 15 Aug 2005 19:25:45 -0700 Received: from [192.168.30.10] (84-4-76-217.adslgp.cegetel.net [84.4.76.217]) by smtp.cegetel.net (Postfix) with ESMTP id 7723B3184C9; Tue, 16 Aug 2005 04:23:27 +0200 (CEST) Message-ID: <43014E27.1070104@cosmosbay.com> Date: Tue, 16 Aug 2005 04:23:35 +0200 From: Eric Dumazet User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: fr, en MIME-Version: 1.0 To: Simon Kirby Cc: Robert Olsson , netdev@oss.sgi.com Subject: Re: Route cache performance References: <20050815213855.GA17832@netnation.com> In-Reply-To: <20050815213855.GA17832@netnation.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 3498 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dada1@cosmosbay.com Precedence: bulk X-list: netdev Content-Length: 3329 Lines: 75 Simon Kirby wrote: > Hi! > > Well, after a few years of other work :), I have finally got around to > setting up some more permanent forwarding / route cache performance > test boxes. I noticed the route trie option in the newer 2.6 kernels > and figured it would be a good time to revisit things. > > Test setup: > > [Xeon w/e1000]---[Opteron w/dual e1000]---[Xeon w/e1000] > > The Xeons are 2.4 GHz boxes and the Opteron is a 140. At some point > I intend to compare the performance of a 32 bit versus 64 bit kernel. > > I'm only able to get pktgen to spit out about 660 kpps from the test 2.4 > Xeon box with onboard e1000 (pause disabled), but already I notice some > disappointing results.
The old 2.4.27 kernel I last did tests with seems > to do a much better job of forwarding small packets (static src/dst) than > 2.4.31 and 2.6.12. > > On the (leftmost) sending box, 2.4.27, 2.4.31, and 2.6.12 all seem to do > fairly well at transmission with pktgen. The 2.6 pktgen seems a little > better (no transmission errors and a few more Mbps), so I've been using > 2.6.12. With fixed dst packets and pause disabled via ethtool, about > 660 kpps is sent continuously. juno (spoofed source, userspace) seems to > do about 360 kpps. The routes and packets are set up to route through > the Opteron box to the receiving (rightmost) box. > > I've noticed that e1000 changes integrated in 2.6.11-bk2 are resulting in > the forwarding test box slowing down enough that it seems to be exposing > "dst cache overflow", even though under slightly less load the gc seems > to be able to keep up. Robert, if I read correctly it seems that the > e1000 NAPI changes were some fixes you submitted? > > Something appears to be different in the rtcache GC or perhaps NAPI or > some other interaction, because firing juno at 2.4 does not show any > problems while I can't seem to get 2.6.12 to _not_ print "dst cache > overflow". 2.6.11 (pre-bk2) seems a little better at start, but any kind > of burst seems to make the route cache entries exceed gc and then the > slower hash lookups seem to make it get stuck at max_size (and printing > "dst cache overflow") until the attack stops, even with gc_min_interval > set to 0 (really 0). > > Anyway, I'm still in early testing stages here but it seems it's still as > easy as ever to destroy routers (and hosts?) with a fairly small stream > of small packets which create new rtcache entries. These days, 184 Mbps > is starting to fall under the "small" DoS attack category, too. > > I notice the hash table size is still only 4096 buckets for 512 MB, which > isn't that wonderful when it hits a max_size of 65536 (w/512 MB)... 
> > Simon- > >

Hi Simon,

I think one of the reasons Linux 2.6 has worse results is that HZ=1000 (instead of HZ=100 for Linux 2.4).

So if rt_garbage_collect() has heavy work to do, it usually breaks out of the loop because of:

	} while (!in_softirq() && time_before_eq(jiffies, now));

Could you please test the latest 2.6.13-rc6 kernel on the Opteron machine, compiled with HZ=100, with the appended kernel argument:

	rhash_entries=8191 (or rhash_entries=16383)

and

	echo 1 >/proc/sys/net/ipv4/route/gc_interval
	echo 2 >/proc/sys/net/ipv4/route/gc_elasticity

Could you also post some data from your router (like: rtstat -c 20 -i 1)?

Eric

From grundler@cup.hp.com Wed Aug 17 23:34:31 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 17 Aug 2005 23:34:37 -0700 (PDT) Received: from palrel10.hp.com (palrel10.hp.com [156.153.255.245]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7I6YRH9013257 for ; Wed, 17 Aug 2005 23:34:31 -0700 Received: from esmail.cup.hp.com (esmail.cup.hp.com [15.0.65.164]) by palrel10.hp.com (Postfix) with ESMTP id F099F854; Wed, 17 Aug 2005 23:32:10 -0700 (PDT) Received: from localhost.localdomain (debian.cup.hp.com [15.244.57.47]) by esmail.cup.hp.com (8.9.3 (PHNE_29774)/8.8.6) with ESMTP id XAA13589; Wed, 17 Aug 2005 23:25:01 -0700 (PDT) Received: by localhost.localdomain (Postfix, from userid 1000) id 08448907E9; Wed, 17 Aug 2005 23:35:45 -0700 (PDT) Date: Wed, 17 Aug 2005 23:35:45 -0700 From: Grant Grundler To: davem@davemloft.net, mchan@broadcom.com Cc: netdev@oss.sgi.com, Grant Grundler Subject: [BUG] tg3 v3.26 patch and "FIBRE" partno(A7109-6) Message-ID: <20050818063545.GI11107@esmail.cup.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 3500 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: iod00d@hp.com Precedence: bulk X-list: netdev Content-Length: 5268 Lines:
114

Dave, Michael,

I was looking at a new problem Matthew Wilcox reported:

	tg3 networking failed on rx8620 IOX Core LAN

He was testing 2.6.13-rc6 on an HP rx8620 (ia64). The NIC gets no link when "ifconfig up" and ethtool says:

	Supported ports: [ FIBRE ]

when it should say "[ MII ]".

I worked backwards and found v3.25 is the last version for which ethtool reports "MII" and gets a link at 100BT (FDx and HDx). We can't use 1000BT because the fix for the brain-damaged "bootcode" on the "IOX Core LAN" wasn't committed until tg3 v3.30. :^(

Console output from modprobe and b57diag output are appended below.

This looks like further brain damage in the bootcode since both 3.25 and 3.26 read "0x25" (0x20 == FIBRE) from NIC_SRAM_DATA_CFG (i.e. nic_cfg == 0x25). But I don't understand why this commit causes ethtool to report "FIBRE" when the previous code didn't:

	"[TG3]: Split tg3_phy_probe into 2 functions"
	http://www.kernel.org/git/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=7d0c41ef89dad9008edf1c3c0022721ebad39999

This commit causes tg3 to read NIC_SRAM_DATA_CFG *before* setting the power state (see tg3_get_invariants()). And tg3_set_power_state() wants to know if the phy is SERDES (or not). The above patch sets TG3_FLG2_PHY_SERDES before calling tg3_set_power_state() since nic_cfg has the 0x20 bit set.

Calling tg3_get_eeprom_hw_cfg() *after* tg3_set_power_state() didn't change anything visible. ethtool still reported FIBRE port.

I need to stare at the code some more tomorrow to understand why the older code happened to work despite the value of nic_cfg. Or is it obvious to one of you?

I'm also hoping someone has a better idea how to fix this than to add a hack based on subsystem id. I expect HP would be willing to roll the "bootcode" for this NIC since it's clearly broken. This would be the easiest solution. But I've not seen a recipe that will let an HP Integrity customer update the bcm5701 chip bootcode.
Broadcom and HP have been looking at this since at least March, 2005 when I raised the previous bootcode problem. I'll poke folks about this again but I'm not optimistic. *sigh*

thanks,
grant

Console output seems the same for both:

temp:/usr/src/linux-2.6.12# modprobe tg3_325-orig
tg3.c:v3.25 (March 24, 2005)
GSI 28 (level, low) -> CPU 2 (0x0800) vector 83
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 28 (level, low) -> IRQ 83
tg3: nic_cfg 0x25
eth0: Tigon3 [partno(A7109-6) rev 0105 PHY(5701)] (PCI:33MHz:64-bit) 10/100/1000BaseT Ethernet 00:30:6e:49:42:ca
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
...
temp:/usr/src/linux-2.6.12# modprobe tg3-326
tg3.c:v3.26 (April 24, 2005)
GSI 28 (level, low) -> CPU 0 (0x0000) vector 83
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 28 (level, low) -> IRQ 83
eth0: Tigon3 [partno(A7109-6) rev 0105 PHY(5701)] (PCI:33MHz:64-bit) 10/100/1000 BaseT Ethernet 00:30:6e:49:42:ca
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
...

b57diag output for the offending card:

C Brd:Rv  Bus     PCI Spd Base Irq EEP  MAC          Fmw         Configuration
- ------- ------- --- --- ---- --  ---- ------------ ----------- --------------
0 5701:A3 00:01:0 64  33  C000 181 64k  00306E4942CA 5701-v2.17  auto
1 5701:A3 00:01:0 64  33  C000 125 64k  00306E49327F 5701-v2.17  auto

0:>secfg
Reading current NVRAM ... OK
Validating content...
** Error: unknow field 00 found
Using Defualt VPD value, press any key to continue... (paused)
 1. MAC Address                                  : 00:30:6e:49:42:ca
 2. Power Dissipated (D3:D2:D1:D0)               : 10:0:0:100
 3. Power Consumed (D3:D2:D1:D0)                 : 10:0:0:100
 4. Vendor ID                                    : 14E4
 5. Vendor Device ID                             : 1645
 6. Subsystem Vendor ID                          : 103C
 7. Subsystem Device ID                          : 1300
...
10. Magic Packet WoL { Enable(1), Disable(2) }   : Disable
11. Product Name                                 : A7109A COREIO10/100/1GBT ethernet controller
12. Part Number                                  : BCM95700A6
13. Engineering Change                           : 106679-15
14. Serial Number                                : 0123456789
15. Manufacturing ID                             : 14e4
16. Asset Tag                                    :
17. Part Revision                                : A3
18. Voltage { 1.3V(0), 1.8V(1) }                 : 1.8V
19. Force PCI Mode { Enable(1),Disable(2) }      : Disable
21. Led Mode { TripleLink(1), Link/Speed(2) }    : Triple Link
...

0:>seread 0x100-0x200
Current Mode: Legacy SEEPROM, Auto
000100: 822f0041 37313039 4120434f 5245494f 31302f31 30302f31 47425420 65746865
000120: 726e6574 20636f6e 74726f6c 6c657220 20209020 00504e07 41373130 392d3630
000140: 30303120 20534e41 35363230 31383734 3139304e 04313465 34525624 4c000000
000160: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000180: 91990059 410b5859 5a303132 33343536 3752576b 00000000 00000000 00000000
0001a0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0001c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0001e0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000078
000200: 0e000003

From grundler@cup.hp.com Wed Aug 17 23:49:43 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 17 Aug 2005 23:49:50 -0700 (PDT) Received: from palrel10.hp.com (palrel10.hp.com [156.153.255.245]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7I6ngH9014782 for ; Wed, 17 Aug 2005 23:49:42 -0700 Received: from esmail.cup.hp.com (esmail.cup.hp.com [15.0.65.164]) by palrel10.hp.com (Postfix) with ESMTP id B35A247F5; Wed, 17 Aug 2005 23:31:17 -0700 (PDT) Received: from localhost.localdomain (debian.cup.hp.com [15.244.57.47]) by esmail.cup.hp.com (8.9.3 (PHNE_29774)/8.8.6) with ESMTP id XAA13554; Wed, 17 Aug 2005 23:24:07 -0700 (PDT) Received: by localhost.localdomain (Postfix, from userid 1000) id A9199907E9; Wed, 17 Aug 2005 23:34:52 -0700 (PDT) Date: Wed, 17 Aug 2005 23:34:52 -0700 From: Grant Grundler To: davem@davemloft.net, mchan@broadcom.hp.com.sgi.com Cc: netdev@oss.sgi.com, Grant Grundler Subject: OK if tg3_get_eeprom_hw_cfg() reads SRAM?
Message-ID: <20050818063452.GA13262@esmail.cup.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 3501 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: iod00d@hp.com Precedence: bulk X-list: netdev Content-Length: 460 Lines: 12 Michael, tg3_get_eeprom_hw_cfg() was added in this patch: http://www.kernel.org/git/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=7d0c41ef89dad9008edf1c3c0022721ebad39999 The comment before tg3_get_eeprom_hw_cfg() suggests tg3 should ONLY be touching PCI cfg space until the NIC gets to D0 power state. But tg3_get_eeprom_hw_cfg() is called before tg3_set_power_state(tp,0). Is it OK for tg3_get_eeprom_hw_cfg() to read NIC_SRAM_DATA_CFG? thanks, grant From mchan@broadcom.com Thu Aug 18 00:43:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 00:43:15 -0700 (PDT) Received: from MMS2.broadcom.com (mms2.broadcom.com [216.31.210.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7I7h9H9025733 for ; Thu, 18 Aug 2005 00:43:10 -0700 Received: from 10.10.64.121 by MMS2.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Thu, 18 Aug 2005 00:40:44 -0700 X-Server-Uuid: 1F20ACF3-9CAF-44F7-AB47-F294E2D5B4EA Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Thu, 18 Aug 2005 00:40:43 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BPM48098; Thu, 18 Aug 2005 00:40:43 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com (nt-irva-0741 [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id AAA13813; Thu, 18 Aug 2005 00:40:43 -0700 (PDT) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: 
urn:content-classes:message MIME-Version: 1.0 Subject: RE: OK if tg3_get_eeprom_hw_cfg() reads SRAM? Date: Thu, 18 Aug 2005 00:40:42 -0700 Message-ID: Thread-Topic: OK if tg3_get_eeprom_hw_cfg() reads SRAM? Thread-Index: AcWjwNCBiVuH99ZnQYOLVzIAyjF/CgABlqZA From: "Michael Chan" To: "Grant Grundler" , davem@davemloft.net, mchan@broadcom.hp.com.sgi.com cc: netdev@oss.sgi.com X-WSS-ID: 6F1AE4F629O4732368-01-01 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7I7h9H9025733 X-archive-position: 3502 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev Content-Length: 655 Lines: 22 Grant Grundler wrote: > The comment before tg3_get_eeprom_hw_cfg() suggests tg3 > should ONLY be touching PCI cfg space until the NIC gets > to D0 power state. The comment is correct. In D3 power state, the chip will not respond to MMIO. When I did that patch, I made very sure that only config. cycles were used before switching to D0. > But tg3_get_eeprom_hw_cfg() is called before > tg3_set_power_state(tp,0). > Is it OK for tg3_get_eeprom_hw_cfg() to read > NIC_SRAM_DATA_CFG? Yes, NIC_SRAM_DATA_CFG is in memory space (as opposed to register space) and we always use config. cycles to read/write memory space. Please see tg3_read/write_mem(). 
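
The windowed access Michael describes (NIC memory reached through PCI config cycles rather than MMIO) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the tg3 driver code: `struct fake_nic`, `cfg_read32()`, `cfg_write32()`, `nic_read_mem()` and the register offsets are all hypothetical names and values chosen for illustration. The idea is that a "base address" config register selects an SRAM offset and a "data" config register then reads the word at that offset, so the read works without any MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical config-space register offsets (illustrative values only). */
#define MEM_WIN_BASE_ADDR 0x7c	/* selects the SRAM offset to access */
#define MEM_WIN_DATA      0x84	/* reads/writes the selected SRAM word */
#define SRAM_WORDS        1024

/* Hypothetical model of a NIC whose SRAM is only reachable via the window. */
struct fake_nic {
	uint32_t win_base;		/* last value written to MEM_WIN_BASE_ADDR */
	uint32_t sram[SRAM_WORDS];
};

/* Model of a 32-bit PCI config-space write. */
static void cfg_write32(struct fake_nic *nic, int reg, uint32_t val)
{
	if (reg == MEM_WIN_BASE_ADDR)
		nic->win_base = val;
	else if (reg == MEM_WIN_DATA && nic->win_base / 4 < SRAM_WORDS)
		nic->sram[nic->win_base / 4] = val;
}

/* Model of a 32-bit PCI config-space read. */
static uint32_t cfg_read32(const struct fake_nic *nic, int reg)
{
	if (reg == MEM_WIN_BASE_ADDR)
		return nic->win_base;
	if (reg == MEM_WIN_DATA && nic->win_base / 4 < SRAM_WORDS)
		return nic->sram[nic->win_base / 4];
	return 0xffffffffu;		/* unknown register or out-of-range window */
}

/* Read one SRAM word using config cycles only: select, read, park window. */
static uint32_t nic_read_mem(struct fake_nic *nic, uint32_t off)
{
	uint32_t val;

	assert((off & 3) == 0);	/* word-aligned offsets only */
	cfg_write32(nic, MEM_WIN_BASE_ADDR, off);
	val = cfg_read32(nic, MEM_WIN_DATA);
	cfg_write32(nic, MEM_WIN_BASE_ADDR, 0);	/* leave the window at zero */
	return val;
}
```

In this model a caller would store 0x25 at some SRAM offset and get it back through `nic_read_mem()` without touching anything but the two config registers, which is why such a read is safe before the chip is switched to D0.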
From mchan@broadcom.com Thu Aug 18 01:25:43 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 01:25:47 -0700 (PDT) Received: from MMS1.broadcom.com (mms1.broadcom.com [216.31.210.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7I8PgH9004133 for ; Thu, 18 Aug 2005 01:25:42 -0700 Received: from 10.10.64.121 by MMS1.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Thu, 18 Aug 2005 01:22:20 -0700 X-Server-Uuid: 146C3151-C1DE-4F71-9D02-C3BE503878DD Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Thu, 18 Aug 2005 01:22:20 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BPM56030; Thu, 18 Aug 2005 01:22:20 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com (nt-irva-0741 [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id BAA22357; Thu, 18 Aug 2005 01:22:20 -0700 (PDT) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Subject: Re: [BUG] tg3 v3.26 patch and "FIBRE" partno(A7109-6) Date: Thu, 18 Aug 2005 01:22:20 -0700 Message-ID: Thread-Topic: [BUG] tg3 v3.26 patch and "FIBRE" partno(A7109-6) Thread-Index: AcWjvpXzSHx0rItCTLWtM3NTbgqUMQADN7eA From: "Michael Chan" To: "Grant Grundler" , davem@davemloft.net cc: netdev@oss.sgi.com X-WSS-ID: 6F1A9AB626C3974380-01-01 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7I8PgH9004133 X-archive-position: 3503 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev Content-Length: 1360 Lines: 40 Grant Grundler wrote: > Dave, Michael, > I was looking at a new 
problem Matthew Wilcox reported: > tg3 networking failed on rx8620 IOX Core LAN > > He was testing 2.6.13-rc6 on an HP rx8620 (ia64). > The NIC gets no link when "ifconfig up" and ethtool says: > Supported ports: [ FIBRE ] > > when it should say "[ MII ]". > > I worked backwards and found v3.25 is the last version that > ethtool reports "MII" and gets a link at 100BT (FDx and HDx). I stared at the old and new tg3 code for a while and I think I know what's going on in your case. The boot code must be incorrectly reporting serdes in NIC_SRAM_DATA_CFG. With older tg3, it always reads the PHY ID if ASF is not enabled and if the PHY ID is valid, it will ignore the serdes bit in NIC_SRAM_DATA_CFG and determine whether it's copper/serdes based on the PHY ID. In the new tg3, because the code is slightly different with the new tg3_get_eeprom_hw_cfg(), it will always set the TG3_FLG2_PHY_SERDES flag if the serdes bit in NIC_SRAM_DATA_CFG is set. So the problem is caused by wrong eeprom serdes configuration and a slight change in the driver code. The old driver works because the PHY ID overrides it. The new driver trusts the eeprom's serdes configuration bit. I will look at the code some more tomorrow to see if there is an easy way to fix it in the new tg3. Thanks. 
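
The difference between the old and new probing order that Michael describes can be sketched as two small decision functions. This is a simplified model under stated assumptions, not the driver's code: `CFG_SERDES_BIT`, `probe_old()` and `probe_new()` are hypothetical names, the 0x20 bit is taken from Grant's report, and the fallback used by the new path when the bit is clear is assumed to match the old path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CFG_SERDES_BIT 0x20u	/* "0x20 == FIBRE" per the report above */

enum port_type { PORT_COPPER, PORT_SERDES };

/*
 * Older behaviour as described: if ASF is off and the PHY ID is valid,
 * the PHY ID decides and the serdes bit in nic_cfg is ignored.
 */
static enum port_type probe_old(uint32_t nic_cfg, bool asf_enabled,
				bool phy_id_valid, bool phy_id_is_serdes)
{
	/* An invalid PHY ID cannot also be classified as serdes. */
	assert(!(phy_id_is_serdes && !phy_id_valid));

	if (!asf_enabled && phy_id_valid)
		return phy_id_is_serdes ? PORT_SERDES : PORT_COPPER;
	return (nic_cfg & CFG_SERDES_BIT) ? PORT_SERDES : PORT_COPPER;
}

/*
 * Newer behaviour as described: the serdes bit in nic_cfg is trusted
 * first. The fallback when the bit is clear is an assumption here.
 */
static enum port_type probe_new(uint32_t nic_cfg, bool asf_enabled,
				bool phy_id_valid, bool phy_id_is_serdes)
{
	if (nic_cfg & CFG_SERDES_BIT)
		return PORT_SERDES;
	return probe_old(nic_cfg, asf_enabled, phy_id_valid, phy_id_is_serdes);
}
```

With nic_cfg == 0x25, ASF disabled and a valid copper PHY ID, `probe_old()` yields copper while `probe_new()` yields serdes, which matches the "MII" versus "FIBRE" symptom reported for v3.25 and v3.26.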
From maca02@atlas.cz Thu Aug 18 04:25:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 04:25:54 -0700 (PDT) Received: from localhost.localdomain (maca.fortech.cz [213.250.192.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IBPiH9020129 for ; Thu, 18 Aug 2005 04:25:46 -0700 Received: from localhost (localhost [127.0.0.1]) by localhost.localdomain (8.12.11/8.12.8) with ESMTP id j7IBNQ1x006473 for ; Thu, 18 Aug 2005 13:23:27 +0200 Date: Thu, 18 Aug 2005 13:23:26 +0200 (CEST) From: Tomáš Macek X-X-Sender: root@localhost.localdomain To: netdev@oss.sgi.com Subject: RTM_F_NOTIFY Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 3504 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: maca02@atlas.cz Precedence: bulk X-list: netdev Content-Length: 450 Lines: 5 The manual page says "RTM_F_NOTIFY if the route changes, notify the user via rtnetlink". Can you give me some information about how this 'notifying' works? How can I be notified? Will I be notified about a change to one particular route, or only in general that the routing table changed? I looked in the source of net-tools 1.60 and libnl and on Google, but without success. Can you give me a link or some info? Any help appreciated!
From grundler@cup.hp.com Thu Aug 18 10:22:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 10:22:58 -0700 (PDT) Received: from palrel10.hp.com (palrel10.hp.com [156.153.255.245]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IHMtH9029211 for ; Thu, 18 Aug 2005 10:22:55 -0700 Received: from esmail.cup.hp.com (esmail.cup.hp.com [15.0.65.164]) by palrel10.hp.com (Postfix) with ESMTP id A0D2A4DB6; Thu, 18 Aug 2005 09:59:30 -0700 (PDT) Received: from localhost.localdomain (debian.cup.hp.com [15.244.57.47]) by esmail.cup.hp.com (8.9.3 (PHNE_29774)/8.8.6) with ESMTP id JAA15908; Thu, 18 Aug 2005 09:52:20 -0700 (PDT) Received: by localhost.localdomain (Postfix, from userid 1000) id 0C6F4907E9; Thu, 18 Aug 2005 10:03:06 -0700 (PDT) Date: Thu, 18 Aug 2005 10:03:05 -0700 From: Grant Grundler To: Michael Chan Cc: davem@davemloft.net, netdev@oss.sgi.com Subject: Re: OK if tg3_get_eeprom_hw_cfg() reads SRAM? Message-ID: <20050818170305.GA15077@esmail.cup.hp.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.9i X-archive-position: 3505 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: iod00d@hp.com Precedence: bulk X-list: netdev Content-Length: 396 Lines: 13 On Thu, Aug 18, 2005 at 12:40:42AM -0700, Michael Chan wrote: > > Is it OK for tg3_get_eeprom_hw_cfg() to read > > NIC_SRAM_DATA_CFG? > > Yes, NIC_SRAM_DATA_CFG is in memory space (as opposed to > register space) and we always use config. cycles to > read/write memory space. Please see tg3_read/write_mem(). doh! of course...I've looked at tg3_read_mem() more than a few times. 
thanks, grant From kaber@trash.net Thu Aug 18 11:45:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 11:45:19 -0700 (PDT) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IIjEH9003698 for ; Thu, 18 Aug 2005 11:45:14 -0700 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.52) id 1E5pLp-00027A-2D; Thu, 18 Aug 2005 20:42:53 +0200 Message-ID: <4304D6AC.4060606@trash.net> Date: Thu, 18 Aug 2005 20:42:52 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050803 Debian/1.7.10-1 X-Accept-Language: en MIME-Version: 1.0 To: Ollie Wild CC: linux-kernel@vger.kernel.org, Maillist netdev Subject: Re: [PATCH] fix dst_entry leak in icmp_push_reply() References: <43039C3F.2000207@rincewind.tv> <4303CEC5.3010502@trash.net> <43042D94.4030303@rincewind.tv> In-Reply-To: <43042D94.4030303@rincewind.tv> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3506 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Content-Length: 718 Lines: 19 Ollie Wild wrote: > Patrick McHardy wrote: > >> Ollie Wild wrote: >> >>> If the ip_append_data() call in icmp_push_reply() fails, >>> ip_flush_pending_frames() needs to be called. Otherwise, ip_rt_put() >>> is never called on inet_sk(icmp_socket->sk)->cork.rt, which prevents >>> the route (and net_device) from ever being freed. >> >> Your patch doesn't fit your description, the else-condition you're >> adding triggers when the queue is empty, so what is the point? > > Since we're only calling ip_append_data() once here, the two conditions > are identical. You're right, I misread your patch. It would be easier to understand if you just checked the return value of ip_append_data, as done in udp.c or raw.c. 
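The point under discussion can be modeled outside the kernel. The sketch below is an illustration under stated assumptions, not kernel code: mock_sock, mock_append_data(), mock_push_pending(), mock_flush_pending() and the two reply_*() helpers are hypothetical stand-ins for the socket's cork state, ip_append_data(), ip_push_pending_frames() and ip_flush_pending_frames(). It assumes, as described earlier in the thread, that a failed append can leave a route reference held until a flush releases it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ICMP socket's cork state and write queue. */
struct mock_sock {
    int route_refs;  /* route references held by the cork state */
    int queued;      /* fragments waiting in the write queue */
};

/* Models an append that takes a route reference and may then fail
 * before anything reaches the queue. */
static int mock_append_data(struct mock_sock *sk, bool fail)
{
    sk->route_refs++;
    if (fail)
        return -1;
    sk->queued++;
    return 0;
}

/* Both cleanup paths drop the reference taken by the append. */
static void mock_release_cork(struct mock_sock *sk)
{
    assert(sk->route_refs > 0);
    sk->route_refs--;
}

/* Success path: send what is queued, then release the cork state. */
static void mock_push_pending(struct mock_sock *sk)
{
    sk->queued = 0;
    mock_release_cork(sk);
}

/* Error path: discard what is queued, then release the cork state. */
static void mock_flush_pending(struct mock_sock *sk)
{
    sk->queued = 0;
    mock_release_cork(sk);
}

/* Looks only at the queue: a failed append with an empty queue
 * reaches neither cleanup path, so the reference leaks. */
static void reply_queue_only(struct mock_sock *sk, bool fail)
{
    (void)mock_append_data(sk, fail);
    if (sk->queued > 0)
        mock_push_pending(sk);
}

/* Checks the return value: exactly one cleanup path always runs. */
static void reply_checked(struct mock_sock *sk, bool fail)
{
    if (mock_append_data(sk, fail) < 0)
        mock_flush_pending(sk);
    else if (sk->queued > 0)
        mock_push_pending(sk);
}
```

With a forced failure, reply_queue_only() leaves route_refs at 1 while reply_checked() returns it to 0. Keying the cleanup on the return value rather than on the state of the queue makes the error path explicit, which is the readability argument made in the message above.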
From kaber@trash.net Thu Aug 18 12:01:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 12:01:58 -0700 (PDT) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IJ1sH9006018 for ; Thu, 18 Aug 2005 12:01:54 -0700 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.52) id 1E5pc1-0002OU-4o; Thu, 18 Aug 2005 20:59:37 +0200 Message-ID: <4304DA99.2080205@trash.net> Date: Thu, 18 Aug 2005 20:59:37 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050803 Debian/1.7.10-1 X-Accept-Language: en MIME-Version: 1.0 To: Ollie Wild CC: linux-kernel@vger.kernel.org, Maillist netdev Subject: Re: [PATCH] fix dst_entry leak in icmp_push_reply() References: <43039C3F.2000207@rincewind.tv> <4303CEC5.3010502@trash.net> <43042D94.4030303@rincewind.tv> <4304D763.4090001@rincewind.tv> In-Reply-To: <4304D763.4090001@rincewind.tv> Content-Type: multipart/mixed; boundary="------------020505050000070804030301" X-archive-position: 3507 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Content-Length: 1832 Lines: 52 This is a multi-part message in MIME format. --------------020505050000070804030301 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Ollie Wild wrote: > That said, I appreciate that the if-else condition doesn't seem quite > right. The problem is, the icmp_push_reply() routine is implicitly > using the queue as a success indicator. I put the > ip_flush_pending_frames() call inside the else block because I wanted to > guarantee that one of ip_push_pending_frames() and > ip_flush_pending_frames() is always called. Both will do proper cleanup. > > I'm open to suggestions if you think there's a cleaner way to implement > this. 
Checking the return value of ip_append_data seems cleaner to me. Patch attached. Signed-off-by: Patrick McHardy --------------020505050000070804030301 Content-Type: text/plain; name="x" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="x" diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -349,12 +349,12 @@ static void icmp_push_reply(struct icmp_ { struct sk_buff *skb; - ip_append_data(icmp_socket->sk, icmp_glue_bits, icmp_param, - icmp_param->data_len+icmp_param->head_len, - icmp_param->head_len, - ipc, rt, MSG_DONTWAIT); - - if ((skb = skb_peek(&icmp_socket->sk->sk_write_queue)) != NULL) { + if (ip_append_data(icmp_socket->sk, icmp_glue_bits, icmp_param, + icmp_param->data_len+icmp_param->head_len, + icmp_param->head_len, + ipc, rt, MSG_DONTWAIT) < 0) + ip_flush_pending_frames(icmp_socket->sk); + else if ((skb = skb_peek(&icmp_socket->sk->sk_write_queue)) != NULL) { struct icmphdr *icmph = skb->h.icmph; unsigned int csum = 0; struct sk_buff *skb1; --------------020505050000070804030301-- From aaw@rincewind.tv Thu Aug 18 12:07:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 12:07:51 -0700 (PDT) Received: from ixca-ex1.ixiacom.com (ixca-out.ixiacom.com [67.133.120.10]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IJ7lH9007029 for ; Thu, 18 Aug 2005 12:07:48 -0700 Received: from [192.168.6.244] (wild2.ixiacom.com [192.168.6.244]) by ixca-ex1.ixiacom.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2657.72) id PB3N9XGZ; Thu, 18 Aug 2005 12:05:31 -0700 Message-ID: <4304DBFB.5010906@rincewind.tv> Date: Thu, 18 Aug 2005 12:05:31 -0700 X-Sybari-Trust: 2be44ef6 453feeff 4098152d 0000010c From: Ollie Wild User-Agent: Mozilla Thunderbird 1.0.6 (X11/20050725) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Patrick McHardy CC: linux-kernel@vger.kernel.org, Maillist netdev Subject: Re: [PATCH] fix dst_entry leak in icmp_push_reply() References: 
<43039C3F.2000207@rincewind.tv> <4303CEC5.3010502@trash.net> <43042D94.4030303@rincewind.tv> <4304D763.4090001@rincewind.tv> <4304DA99.2080205@trash.net> In-Reply-To: <4304DA99.2080205@trash.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3508 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aaw@rincewind.tv Precedence: bulk X-list: netdev Content-Length: 142 Lines: 10 Patrick McHardy wrote: >Checking the return value of ip_append_data seems cleaner to me. >Patch attached. > > Works for me. Thanks, Ollie From davem@davemloft.net Thu Aug 18 14:34:51 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 14:34:56 -0700 (PDT) Received: from outer-richmond.davemloft.net (dsl027-180-204.sfo1.dsl.speakeasy.net [216.27.180.204]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7ILYkH9023863 for ; Thu, 18 Aug 2005 14:34:51 -0700 Received: from localhost (localhost.localdomain [127.0.0.1]) by outer-richmond.davemloft.net (8.13.4/8.13.4) with ESMTP id j7ILWOVm008831; Thu, 18 Aug 2005 14:32:25 -0700 Date: Thu, 18 Aug 2005 14:32:24 -0700 (PDT) Message-Id: <20050818.143224.111875937.davem@davemloft.net> To: aaw@rincewind.tv Cc: kaber@trash.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] fix dst_entry leak in icmp_push_reply() From: "David S. 
Miller" In-Reply-To: <4304DBFB.5010906@rincewind.tv> References: <4304D763.4090001@rincewind.tv> <4304DA99.2080205@trash.net> <4304DBFB.5010906@rincewind.tv> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3509 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 243 Lines: 12 From: Ollie Wild Date: Thu, 18 Aug 2005 12:05:31 -0700 > Patrick McHardy wrote: > > >Checking the return value of ip_append_data seems cleaner to me. > >Patch attached. > > > > > Works for me. Applied, thanks everyone. From ryanh@us.ibm.com Thu Aug 18 14:43:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 14:43:50 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7ILhgH9024840 for ; Thu, 18 Aug 2005 14:43:44 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e35.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j7ILf048547928 for ; Thu, 18 Aug 2005 17:41:04 -0400 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j7ILeaJ7396302 for ; Thu, 18 Aug 2005 15:40:36 -0600 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j7ILebfp000915 for ; Thu, 18 Aug 2005 15:40:37 -0600 Received: from localhost.localdomain (frylock.austin.ibm.com [9.53.91.14]) by d03av04.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j7ILebbk000864; Thu, 18 Aug 2005 15:40:37 -0600 Received: by localhost.localdomain (Postfix, from userid 1000) id A3C3A93764; Thu, 18 Aug 2005 16:40:36 -0500 (CDT) Date: Thu, 18 Aug 2005 16:40:36 -0500 From: Ryan Harper To: 
shemminger@osdl.org
Cc: netdev@oss.sgi.com
Subject: Possible race with br_del_if()
Message-ID: <20050818214036.GH10593@us.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.6+20040907i
X-archive-position: 3510
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ryanh@us.ibm.com
Precedence: bulk
X-list: netdev
Content-Length: 1922
Lines: 58

Hello,

I've encountered several oopses when adding and removing interfaces from
bridges while using Xen. Most of the details are available [1]here.
The short of it is the following sequence:

CPU0                              CPU1
add_del_if()                      unregister_netdevice()
 br_del_if()                       notifier_call_chain(NETDEV_UNREGISTER)
  del_nbp()
   br_stp_disable_port()
   // port->state == BR_STATE_DISABLED
                                    br_device_event()
                                    // dev->br_port != NULL yet
                                    // event is NETDEV_UNREGISTER
                                     br_del_if()
                                      sysfs_remove_dir(p)
                                       kobject_del()
                                        dget(dentry)
                                         BUG_ON(!atomic_read(&dentry->d_count))

This sequence doesn't happen all of the time. In many cases, CPU0 moves
along right into destroy_nbp(), which sets dev->br_port = NULL, so the
br_device_event() check (p == NULL) hits and a second br_del_if() isn't
called.

The attached patch is a workaround for the double call, but I'm not sure
if it is the right way to deal with this issue, or if it is an issue at all.

1. http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=90

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com

diffstat output:
 br_if.c | 2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Signed-off-by: Ryan Harper
---
Simple workaround for double call to br_del_if().
Signed-off-by: Ryan Harper --- linux-2.6.12/net/bridge/br_if.c 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-xen0-smp/net/bridge/br_if.c 2005-08-18 15:17:27.302615846 -0500 @@ -382,7 +382,7 @@ { struct net_bridge_port *p = dev->br_port; - if (!p || p->br != br) + if (!p || p->br != br || p->state == BR_STATE_DISABLED) return -EINVAL; br_sysfs_removeif(p); From shemminger@osdl.org Thu Aug 18 15:13:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 15:13:59 -0700 (PDT) Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IMDnH9027662 for ; Thu, 18 Aug 2005 15:13:50 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j7IMBTjA013484 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 18 Aug 2005 15:11:30 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j7IMBSX7011574; Thu, 18 Aug 2005 15:11:29 -0700 Date: Thu, 18 Aug 2005 15:12:02 -0700 From: Stephen Hemminger To: Ryan Harper Cc: netdev@oss.sgi.com Subject: Re: Possible race with br_del_if() Message-ID: <20050818151202.6fe6ded4@dxpl.pdx.osdl.net> In-Reply-To: <20050818214036.GH10593@us.ibm.com> References: <20050818214036.GH10593@us.ibm.com> X-Mailer: Sylpheed-Claws 1.9.13 (GTK+ 2.6.7; x86_64-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.114 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 3511 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1427 Lines: 35 On Thu, 18 Aug 2005 16:40:36 -0500 Ryan Harper wrote: > Hello, > > I've encountered several oops when adding and removing interfaces from > bridges while using Xen. 
Most of the details are available [1]here. > The short of it is the following sequence: Doesn't the mutex in RTNL work right? or are you calling routines with out asserting it? > CPU0 CPU1 > add_del_if() unregister_netdevice() > br_del_if() notifier_call_chain(NETDEV_UNREGISTER) > del_nbp() > br_stp_disable_port() // port->state == BR_STATE_DISABLED > br_device_event() // dev->br_port != NULL yet > // event is NETDEV_UNREGISTER > br_del_if() > sysfs_remove_dir(p) > kobject_del() > dget(dentry) > BUG_ON(!atomic_read(&dentry->d_count) > > This sequence doesn't happen all of the time. In many cases, CPU0 moves > along right into destroy_nbp() which sets dev->br_port = NULL, and > be_device_event check (p == NULL) hits and a second br_del_if() isn't > called. > > The attached patch is a workaround for the double case, but I'm not sure > if is the right way to deal with this issue, or if it any issue at all. > > 1. http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=90 > From ryanh@us.ibm.com Thu Aug 18 15:25:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 15:25:53 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IMPfH9029227 for ; Thu, 18 Aug 2005 15:25:48 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e34.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j7IMNOxr116608 for ; Thu, 18 Aug 2005 18:23:24 -0400 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j7IMNTmb223690 for ; Thu, 18 Aug 2005 16:23:29 -0600 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j7IMNOxs001459 for ; Thu, 18 Aug 2005 16:23:24 -0600 Received: from localhost.localdomain (frylock.austin.ibm.com [9.53.91.14]) by d03av03.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id 
j7IMNONm001454; Thu, 18 Aug 2005 16:23:24 -0600 Received: by localhost.localdomain (Postfix, from userid 1000) id C723193764; Thu, 18 Aug 2005 17:23:23 -0500 (CDT) Date: Thu, 18 Aug 2005 17:23:23 -0500 From: Ryan Harper To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: Possible race with br_del_if() Message-ID: <20050818222323.GI10593@us.ibm.com> References: <20050818214036.GH10593@us.ibm.com> <20050818151202.6fe6ded4@dxpl.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050818151202.6fe6ded4@dxpl.pdx.osdl.net> User-Agent: Mutt/1.5.6+20040907i X-archive-position: 3512 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ryanh@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 872 Lines: 28 * Stephen Hemminger [2005-08-18 17:11]: > On Thu, 18 Aug 2005 16:40:36 -0500 > Ryan Harper wrote: > > > Hello, > > > > I've encountered several oops when adding and removing interfaces from > > bridges while using Xen. Most of the details are available [1]here. > > The short of it is the following sequence: > > Doesn't the mutex in RTNL work right? or are you calling > routines with out asserting it? unregister_netdevice asserts RTNL, add_del_if() in br_ioctl.c doesn't seem to do so. I don't see it down dev_get_by_index() path either. It looks like any caller of add_del_if() isn't asserting RTNL. 
The two callers I see are: br_dev_ioctl() in br_ioctl.c old_dev_ioctl() in br_ioctl.c -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx (512) 838-9253 T/L: 678-9253 ryanh@us.ibm.com From shemminger@osdl.org Thu Aug 18 15:37:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 15:37:26 -0700 (PDT) Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IMbHH9030484 for ; Thu, 18 Aug 2005 15:37:17 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j7IMYwjA015360 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 18 Aug 2005 15:34:58 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j7IMYvbR012891; Thu, 18 Aug 2005 15:34:58 -0700 Date: Thu, 18 Aug 2005 15:35:31 -0700 From: Stephen Hemminger To: Ryan Harper Cc: netdev@oss.sgi.com Subject: Re: Possible race with br_del_if() Message-ID: <20050818153531.61f62ac0@dxpl.pdx.osdl.net> In-Reply-To: <20050818222323.GI10593@us.ibm.com> References: <20050818214036.GH10593@us.ibm.com> <20050818151202.6fe6ded4@dxpl.pdx.osdl.net> <20050818222323.GI10593@us.ibm.com> X-Mailer: Sylpheed-Claws 1.9.13 (GTK+ 2.6.7; x86_64-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.114 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 3513 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1029 Lines: 34 On Thu, 18 Aug 2005 17:23:23 -0500 Ryan Harper wrote: > * Stephen Hemminger [2005-08-18 17:11]: > > On Thu, 18 Aug 2005 16:40:36 -0500 > > Ryan Harper wrote: > > > > > Hello, > > > > > > I've encountered several oops when adding and 
removing interfaces from
> > > bridges while using Xen. Most of the details are available [1]here.
> > > The short of it is the following sequence:
> >
> > Doesn't the mutex in RTNL work right? or are you calling
> > routines with out asserting it?
>
> unregister_netdevice asserts RTNL, add_del_if() in br_ioctl.c doesn't
> seem to do so. I don't see it down dev_get_by_index() path either. It
> looks like any caller of add_del_if() isn't asserting RTNL. The two
> callers I see are:
>
> br_dev_ioctl() in br_ioctl.c
> old_dev_ioctl() in br_ioctl.c

But the path to br_dev_ioctl() is via the socket ioctl and that
should already have gotten RTNL.

dev_ioctl
  rtnl_lock()
  dev_ifsioc()
    dev->do_ioctl --> br_dev_ioctl

From ryanh@us.ibm.com Thu Aug 18 15:58:35 2005
Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 15:58:42 -0700 (PDT)
Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7IMwTH9000302 for ; Thu, 18 Aug 2005 15:58:35 -0700
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e32.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j7IMu7wh312326 for ; Thu, 18 Aug 2005 18:56:07 -0400
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j7IMthJ7396096 for ; Thu, 18 Aug 2005 16:55:43 -0600
Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j7IMu6Vw004070 for ; Thu, 18 Aug 2005 16:56:06 -0600
Received: from localhost.localdomain (frylock.austin.ibm.com [9.53.91.14]) by d03av01.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j7IMu6Z6003972; Thu, 18 Aug 2005 16:56:06 -0600
Received: by localhost.localdomain (Postfix, from userid 1000) id 2BE4193764; Thu, 18 Aug 2005 17:56:02 -0500 (CDT)
Date: Thu, 18 Aug 2005 17:56:01 -0500
From: Ryan Harper
To: Stephen Hemminger
Cc: netdev@oss.sgi.com
Subject: Re: Possible race with br_del_if() Message-ID: <20050818225601.GJ10593@us.ibm.com> References: <20050818214036.GH10593@us.ibm.com> <20050818151202.6fe6ded4@dxpl.pdx.osdl.net> <20050818222323.GI10593@us.ibm.com> <20050818153531.61f62ac0@dxpl.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050818153531.61f62ac0@dxpl.pdx.osdl.net> User-Agent: Mutt/1.5.6+20040907i X-archive-position: 3514 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ryanh@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1796 Lines: 54 * Stephen Hemminger [2005-08-18 17:36]: > On Thu, 18 Aug 2005 17:23:23 -0500 > Ryan Harper wrote: > > > * Stephen Hemminger [2005-08-18 17:11]: > > > On Thu, 18 Aug 2005 16:40:36 -0500 > > > Ryan Harper wrote: > > > > > > > Hello, > > > > > > > > I've encountered several oops when adding and removing interfaces from > > > > bridges while using Xen. Most of the details are available [1]here. > > > > The short of it is the following sequence: > > > > > > Doesn't the mutex in RTNL work right? or are you calling > > > routines with out asserting it? > > > > unregister_netdevice asserts RTNL, add_del_if() in br_ioctl.c doesn't > > seem to do so. I don't see it down dev_get_by_index() path either. It > > looks like any caller of add_del_if() isn't asserting RTNL. The two > > callers I see are: > > > > br_dev_ioctl() in br_ioctl.c > > old_dev_ioctl() in br_ioctl.c > > But the pat to br_dev_ioctl() is via the socket ioctl and that > should already have gotten RTNL. > > > dev_ioctl > rtnl_lock() > dev_ifsioc() > dev->do_ioctl --> br_dev_ioctl Hrm. OK. 
It sounds like both paths are doing the right thing w.r.t. asserting
RTNL, but br_device_event() still gets called with:

1) dev->br_port != NULL
2) dev->br_port->state == BR_STATE_DISABLED
3) event == NETDEV_UNREGISTER

which results in br_del_if() being called a second time on the same
port. Some of the other cases (NETDEV_FEAT_CHANGE, NETDEV_CHANGE) do a
state check before calling a subsequent function.

Does it make sense for br_del_if() to be called on a port whose state is
BR_STATE_DISABLED?

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com

From jgarzik@pobox.com Thu Aug 18 23:25:10 2005
Received: with ECARTIS (v1.0.0; list netdev); Thu, 18 Aug 2005 23:25:14 -0700 (PDT)
Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7J6P9H9016159 for ; Thu, 18 Aug 2005 23:25:10 -0700
Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E60HB-00027H-HZ; Fri, 19 Aug 2005 06:22:49 +0000
Message-ID: <43057AB7.20602@pobox.com>
Date: Fri, 19 Aug 2005 02:22:47 -0400
From: Jeff Garzik
User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Don Fry
CC: tsbogend@alpha.franken.de, netdev@oss.sgi.com
Subject: Re: [PATCH 2.4.30] pcnet32: fix resource leak with loopback test
References: <20050429214956.GA1074@us.ibm.com>
In-Reply-To: <20050429214956.GA1074@us.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 3516
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev
Content-Length: 9
Lines: 2

applied

From jgarzik@pobox.com Thu Aug 18 23:24:33 2005
Received: with ECARTIS (v1.0.0; list netdev); Thu, 18
Aug 2005 23:24:37 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7J6OVH9016081 for ; Thu, 18 Aug 2005 23:24:33 -0700 Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E60GZ-00027A-9h; Fri, 19 Aug 2005 06:22:13 +0000 Message-ID: <43057A90.6000302@pobox.com> Date: Fri, 19 Aug 2005 02:22:08 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Malli Chilakala CC: netdev Subject: Re: [resend][PATCH net-drivers-2.4 12/16] e1000: Modified e1000_clean:: exit poll References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3515 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 479 Lines: 17 Malli Chilakala wrote: > Modified e1000_clean:: exit poll if no Tx and work_done == 0 > > Signed-off-by: Mallikarjuna R Chilakala > Signed-off-by: Ganesh Venkatesan > Signed-off-by: John Ronciak Applied e1000 patches 1-12 to 2.4.x. Stopped at this patch. Patch 13 was corrupted, so that's where the import stopped. But nonetheless... 2.4.x netdev patches are flowing again! 
Jeff From herbert@gondor.apana.org.au Fri Aug 19 01:10:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 01:10:52 -0700 (PDT) Received: from jay.exetel.com.au (jay.exetel.com.au [220.233.0.8]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7J8AjH9031854 for ; Fri, 19 Aug 2005 01:10:46 -0700 Received: (qmail 471 invoked by uid 507); 19 Aug 2005 18:08:22 +1000 Received: from 22.107.233.220.exetel.com.au (HELO arnor.apana.org.au) (220.233.107.22) by jay.exetel.com.au with SMTP; 19 Aug 2005 18:08:22 +1000 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1E61vJ-0004p1-00; Fri, 19 Aug 2005 18:08:21 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1E61vH-00028n-00; Fri, 19 Aug 2005 18:08:19 +1000 Date: Fri, 19 Aug 2005 18:08:19 +1000 To: Jeff Garzik Cc: Andrew Morton , netdev@oss.sgi.com, mangus@deprecated.it, webvenza@libero.it Subject: Re: Fw: [Bugme-new] [Bug 4223] New: sis900 kernel oop at boot Message-ID: <20050819080819.GA8203@gondor.apana.org.au> References: <20050217134440.44f591e2.akpm@osdl.org> <20050305084537.GA12678@gondor.apana.org.au> <430592A7.30101@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <430592A7.30101@pobox.com> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 3518 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1967 Lines: 49 On Fri, Aug 19, 2005 at 04:04:55AM -0400, Jeff Garzik wrote: > Herbert Xu wrote: > >Hi: > > > >Here is the version that moves the necessary code above register_netdev > >instead of using init. It's against netdev-2.6. 
> > > > > >>Feb 15 18:26:20 saturno kernel: Unable to handle kernel NULL pointer > >>dereference at virtual address 0000000e > >>Feb 15 18:26:20 saturno kernel: printing eip: > >>Feb 15 18:26:20 saturno kernel: e1113417 > >>Feb 15 18:26:20 saturno kernel: *pde = 00000000 > >>Feb 15 18:26:20 saturno kernel: Oops: 0000 [#1] > >>Feb 15 18:26:20 saturno kernel: PREEMPT > >>Feb 15 18:26:20 saturno kernel: Modules linked in: sis900 nvidia 8250_pci > >>8250 > >>serial_core psmouse > >>Feb 15 18:26:20 saturno kernel: CPU: 0 > >>Feb 15 18:26:20 saturno kernel: EIP: 0060:[] Tainted: P > >>VLI > >>Feb 15 18:26:20 saturno kernel: EFLAGS: 00010296 (2.6.10-M7) > >>Feb 15 18:26:20 saturno kernel: EIP is at sis900_check_mode+0x17/0xa0 > >>[sis900] > > > > > >OK, this happened because we got preempted before sis900_mii_probe > >finished setting the sis_priv->mii. Theoretically this can happen > >with SMP as well but I suppose the number of SMP machines with sis900 > >is fairly small. > > > >Anyway, the fix is to make sure that sis900_mii_probe is done before > >the device can be opened. This patch does it by moving the setup > >before register_netdevice. > > > >Since the netdev name is not available before register_netdev, I've > >changed the relevant printk's to use pci_name instead. Note that > >one of those printk's may be called after register_netdev as well. > > > >Signed-off-by: Herbert Xu > > Is this patch still needed? 
Nope, because you applied this months ago :) -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From jgarzik@pobox.com Fri Aug 19 01:07:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 01:07:42 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7J87NH9031288 for ; Fri, 19 Aug 2005 01:07:27 -0700 Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E61s4-00029P-AX; Fri, 19 Aug 2005 08:05:00 +0000 Message-ID: <430592A7.30101@pobox.com> Date: Fri, 19 Aug 2005 04:04:55 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Herbert Xu CC: Andrew Morton , netdev@oss.sgi.com, mangus@deprecated.it, webvenza@libero.it Subject: Re: Fw: [Bugme-new] [Bug 4223] New: sis900 kernel oop at boot References: <20050217134440.44f591e2.akpm@osdl.org> <20050305084537.GA12678@gondor.apana.org.au> In-Reply-To: <20050305084537.GA12678@gondor.apana.org.au> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3517 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 1594 Lines: 44 Herbert Xu wrote: > Hi: > > Here is the version that moves the necessary code above register_netdev > instead of using init. It's against netdev-2.6. 
> > >>Feb 15 18:26:20 saturno kernel: Unable to handle kernel NULL pointer >>dereference at virtual address 0000000e >>Feb 15 18:26:20 saturno kernel: printing eip: >>Feb 15 18:26:20 saturno kernel: e1113417 >>Feb 15 18:26:20 saturno kernel: *pde = 00000000 >>Feb 15 18:26:20 saturno kernel: Oops: 0000 [#1] >>Feb 15 18:26:20 saturno kernel: PREEMPT >>Feb 15 18:26:20 saturno kernel: Modules linked in: sis900 nvidia 8250_pci 8250 >>serial_core psmouse >>Feb 15 18:26:20 saturno kernel: CPU: 0 >>Feb 15 18:26:20 saturno kernel: EIP: 0060:[] Tainted: P >>VLI >>Feb 15 18:26:20 saturno kernel: EFLAGS: 00010296 (2.6.10-M7) >>Feb 15 18:26:20 saturno kernel: EIP is at sis900_check_mode+0x17/0xa0 [sis900] > > > OK, this happened because we got preempted before sis900_mii_probe > finished setting the sis_priv->mii. Theoretically this can happen > with SMP as well but I suppose the number of SMP machines with sis900 > is fairly small. > > Anyway, the fix is to make sure that sis900_mii_probe is done before > the device can be opened. This patch does it by moving the setup > before register_netdevice. > > Since the netdev name is not available before register_netdev, I've > changed the relevant printk's to use pci_name instead. Note that > one of those printk's may be called after register_netdev as well. > > Signed-off-by: Herbert Xu Is this patch still needed? 
Jeff From ryanh@us.ibm.com Fri Aug 19 12:14:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 12:14:09 -0700 (PDT) Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.145]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JJDuH9004651 for ; Fri, 19 Aug 2005 12:14:03 -0700 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e5.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j7JJBJID026197 for ; Fri, 19 Aug 2005 15:11:19 -0400 Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay02.pok.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j7JJBJXY277000 for ; Fri, 19 Aug 2005 15:11:19 -0400 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j7JJB9hR006877 for ; Fri, 19 Aug 2005 15:11:09 -0400 Received: from localhost.localdomain (frylock.austin.ibm.com [9.53.91.14]) by d01av04.pok.ibm.com (8.12.11/8.12.11) with ESMTP id j7JJB8DF006392; Fri, 19 Aug 2005 15:11:08 -0400 Received: by localhost.localdomain (Postfix, from userid 1000) id 255B793792; Fri, 19 Aug 2005 14:10:52 -0500 (CDT) Date: Fri, 19 Aug 2005 14:10:52 -0500 From: Ryan Harper To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: Possible race with br_del_if() Message-ID: <20050819191052.GE5523@us.ibm.com> References: <20050818214036.GH10593@us.ibm.com> <20050818151202.6fe6ded4@dxpl.pdx.osdl.net> <20050818222323.GI10593@us.ibm.com> <20050818153531.61f62ac0@dxpl.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050818153531.61f62ac0@dxpl.pdx.osdl.net> User-Agent: Mutt/1.5.6+20040907i X-archive-position: 3519 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ryanh@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1923 Lines: 55 * Stephen Hemminger [2005-08-18 17:36]: > On Thu, 18 Aug 2005 17:23:23 -0500 > 
Ryan Harper wrote:
>
> > * Stephen Hemminger [2005-08-18 17:11]:
> > > On Thu, 18 Aug 2005 16:40:36 -0500
> > > Ryan Harper wrote:
> > >
> > > > Hello,
> > > >
> > > > I've encountered several oops when adding and removing interfaces from
> > > > bridges while using Xen. Most of the details are available [1]here.
> > > > The short of it is the following sequence:
> > >
> > > Doesn't the mutex in RTNL work right? or are you calling
> > > routines without asserting it?
> >
> > unregister_netdevice asserts RTNL, add_del_if() in br_ioctl.c doesn't
> > seem to do so. I don't see it down dev_get_by_index() path either. It
> > looks like any caller of add_del_if() isn't asserting RTNL. The two
> > callers I see are:
> >
> > br_dev_ioctl() in br_ioctl.c
> > old_dev_ioctl() in br_ioctl.c
>
> But the path to br_dev_ioctl() is via the socket ioctl and that
> should already have gotten RTNL.
>
>
> dev_ioctl
>   rtnl_lock()
>   dev_ifsioc()
>     dev->do_ioctl --> br_dev_ioctl

Just to follow up, the issue was a race between the call_rcu() callback
for destroy_nbp() and an unregister_netdev() call. Sometimes the
br_device_event() routine was triggered and destroy_nbp() had not been
run yet, leaving dev->br_port non-NULL, at which point br_device_event()
correctly calls br_del_if().

We caused this by issuing a brctl delif from userspace scripts and
having an in-kernel handler invoke unregister_netdev().

Our fix is to not bother calling brctl delif because the
unregister_netdev() call will automatically remove the device from the
bridge when the notify_call_chain() kicks in from
unregister_netdevice().
-- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx (512) 838-9253 T/L: 678-9253 ryanh@us.ibm.com From shemminger@osdl.org Fri Aug 19 12:42:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 12:42:37 -0700 (PDT) Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JJgQH9011080 for ; Fri, 19 Aug 2005 12:42:26 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j7JJe7jA005409 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 19 Aug 2005 12:40:07 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j7JJe7JC002080; Fri, 19 Aug 2005 12:40:07 -0700 Date: Fri, 19 Aug 2005 12:40:42 -0700 From: Stephen Hemminger To: Ryan Harper Cc: netdev@oss.sgi.com Subject: Re: Possible race with br_del_if() Message-ID: <20050819124042.0d0ec5c7@dxpl.pdx.osdl.net> In-Reply-To: <20050819191052.GE5523@us.ibm.com> References: <20050818214036.GH10593@us.ibm.com> <20050818151202.6fe6ded4@dxpl.pdx.osdl.net> <20050818222323.GI10593@us.ibm.com> <20050818153531.61f62ac0@dxpl.pdx.osdl.net> <20050819191052.GE5523@us.ibm.com> X-Mailer: Sylpheed-Claws 1.9.13 (GTK+ 2.6.7; x86_64-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.114 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 3520 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 2062 Lines: 54 On Fri, 19 Aug 2005 14:10:52 -0500 Ryan Harper wrote: > * Stephen Hemminger [2005-08-18 17:36]: > > On Thu, 18 Aug 2005 17:23:23 -0500 > > Ryan Harper wrote: > > > > > * Stephen Hemminger [2005-08-18 17:11]: > > > > On Thu, 18 Aug 2005 16:40:36 
-0500 > > > > Ryan Harper wrote: > > > > > > > > > Hello, > > > > > > > > > > I've encountered several oops when adding and removing interfaces from > > > > > bridges while using Xen. Most of the details are available [1]here. > > > > > The short of it is the following sequence: > > > > > > > > Doesn't the mutex in RTNL work right? or are you calling > > > > routines with out asserting it? > > > > > > unregister_netdevice asserts RTNL, add_del_if() in br_ioctl.c doesn't > > > seem to do so. I don't see it down dev_get_by_index() path either. It > > > looks like any caller of add_del_if() isn't asserting RTNL. The two > > > callers I see are: > > > > > > br_dev_ioctl() in br_ioctl.c > > > old_dev_ioctl() in br_ioctl.c > > > > But the pat to br_dev_ioctl() is via the socket ioctl and that > > should already have gotten RTNL. > > > > > > dev_ioctl > > rtnl_lock() > > dev_ifsioc() > > dev->do_ioctl --> br_dev_ioctl > > > Just to follow-up, the issue was a race between the call_rcu() callback > for destroy_nbp() and an unregister_netdev() call. Sometimes the > br_device_event() routine was triggered and destroy_nbp() had not been > run yet leaving dev->br_port non-NULL to which br_device_event then > correctly calls br_del_if(). > > We caused this by issuing a brctl delif from userspace scripts and > having a in kernel handler invoke unregister_netdev() call. > > Our fix is to not bother calling brctl delif because the > unregister_netdev() call will automatically remove the device from the > bridge when the notify_call_chain() kicks in from > unregister_netdevice(). I'll get back to you, this needs some review, I have a bunch of old test suites to dig up for it. 
From rostedt@goodmis.org Fri Aug 19 14:24:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 14:25:02 -0700 (PDT) Received: from ms-smtp-01.nyroc.rr.com (ms-smtp-01.nyroc.rr.com [24.24.2.55]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JLOrH9020459 for ; Fri, 19 Aug 2005 14:24:54 -0700 Received: from [192.168.23.9] (cpe-24-94-57-164.stny.res.rr.com [24.94.57.164]) by ms-smtp-01.nyroc.rr.com (8.12.10/8.12.10) with ESMTP id j7JLMTDX005466; Fri, 19 Aug 2005 17:22:30 -0400 (EDT) Subject: Re: 2.6.13-rc6-rt6 From: Steven Rostedt To: Ingo Molnar Cc: netdev@oss.sgi.com, "Paul E. McKenney" , linux-kernel@vger.kernel.org In-Reply-To: <20050817162324.GA24495@elte.hu> References: <20050816170805.GA12959@elte.hu> <1124214647.5764.40.camel@localhost.localdomain> <1124215631.5764.43.camel@localhost.localdomain> <1124218245.5764.52.camel@localhost.localdomain> <1124252419.5764.83.camel@localhost.localdomain> <1124257580.5764.105.camel@localhost.localdomain> <20050817064750.GA8395@elte.hu> <1124287505.5764.141.camel@localhost.localdomain> <1124288677.5764.154.camel@localhost.localdomain> <1124295214.5764.163.camel@localhost.localdomain> <20050817162324.GA24495@elte.hu> Content-Type: text/plain Organization: Kihon Technologies Date: Fri, 19 Aug 2005 17:22:28 -0400 Message-Id: <1124486548.18408.18.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.3 Content-Transfer-Encoding: 7bit X-archive-position: 3521 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rostedt@goodmis.org Precedence: bulk X-list: netdev Content-Length: 4757 Lines: 163 On Wed, 2005-08-17 at 18:23 +0200, Ingo Molnar wrote: > > > And it goes on and on. This happens everytime. Without netconsole, I > > only get the nonzero lock count error. 
Also, one of my lockups on SMP
> > had to do with the kernel_thread_helper:
> >
> > Using IPI Shortcut mode
> > khelper/794[CPU#0]: BUG in set_new_owner at kernel/rt.c:916

This was with netconsole and showed up after a bunch of other bugs. So
this is a side effect of what happened earlier.

>
> this is a 'must not happen'. Somehow lock->held list got non-empty.
> Maybe some use-after-free thing? Haven't seen it myself.
>

I started debugging netconsole with the RT patch and found this
happening. After seeing what's wrong, I looked at the latest git
branch, and it seems to already have a similar solution that I was going
to make. Here's a description of what's wrong.

In net/core/dev.c the following code is in net_rx_action:

        netpoll_poll_lock(dev);

        if (dev->quota <= 0 || dev->poll(dev, &budget)) {
                netpoll_poll_unlock(dev);
                raw_local_irq_disable();
                list_del(&dev->poll_list);
                list_add_tail(&dev->poll_list, &queue->poll_list);
                if (dev->quota < 0)
                        dev->quota += dev->weight;
                else
                        dev->quota = dev->weight;
        } else {
                netpoll_poll_unlock(dev);

The netpoll_poll_lock and netpoll_poll_unlock look like this (in current RT):

static inline void netpoll_poll_lock(struct net_device *dev)
{
        if (dev->npinfo) {
                spin_lock(&dev->npinfo->poll_lock);
                dev->npinfo->poll_owner = smp_processor_id();
        }
}

static inline void netpoll_poll_unlock(struct net_device *dev)
{
        if (dev->npinfo) {
                dev->npinfo->poll_owner = -1;
                spin_unlock(&dev->npinfo->poll_lock);
        }
}

The problem here is that between netpoll_poll_lock and
netpoll_poll_unlock the dev->npinfo gets assigned. So we unlock the
dev->npinfo->poll_lock without ever locking it.

Here's the port from the latest git to solve this. I've CCed the netdev,
since I'm not sure I got all the places for rcu_lock for the netpoll. At
least to solve this problem. I did boot up the kernel and this patch
did fix my bugs that I was getting using netconsole. (I have one more
patch to send to fix the illegal API messages).
-- Steve

Signed-off-by: Steven Rostedt

Index: linux_realtime_ernie/include/linux/netpoll.h
===================================================================
--- linux_realtime_ernie/include/linux/netpoll.h (revision 296)
+++ linux_realtime_ernie/include/linux/netpoll.h (working copy)
@@ -60,25 +60,31 @@
         return ret;
 }
 
-static inline void netpoll_poll_lock(struct net_device *dev)
+static inline void *netpoll_poll_lock(struct net_device *dev)
 {
+        rcu_read_lock();
         if (dev->npinfo) {
                 spin_lock(&dev->npinfo->poll_lock);
                 dev->npinfo->poll_owner = smp_processor_id();
+                return dev->npinfo;
         }
+        return NULL;
 }
 
-static inline void netpoll_poll_unlock(struct net_device *dev)
+static inline void netpoll_poll_unlock(void *have)
 {
-        if (dev->npinfo) {
-                dev->npinfo->poll_owner = -1;
-                spin_unlock(&dev->npinfo->poll_lock);
+        struct netpoll_info *npi = have;
+
+        if (npi) {
+                npi->poll_owner = -1;
+                spin_unlock(&npi->poll_lock);
         }
+        rcu_read_unlock();
 }
 
 #else
 #define netpoll_rx(a) 0
-#define netpoll_poll_lock(a)
+#define netpoll_poll_lock(a) 0
 #define netpoll_poll_unlock(a)
 #endif
 
Index: linux_realtime_ernie/net/core/netpoll.c
===================================================================
--- linux_realtime_ernie/net/core/netpoll.c (revision 296)
+++ linux_realtime_ernie/net/core/netpoll.c (working copy)
@@ -726,6 +726,9 @@
         /* last thing to do is link it to the net device structure */
         ndev->npinfo = npinfo;
 
+        /* avoid racing with NAPI reading npinfo */
+        synchronize_rcu();
+
         return 0;
 
 release:
Index: linux_realtime_ernie/net/core/dev.c
===================================================================
--- linux_realtime_ernie/net/core/dev.c (revision 296)
+++ linux_realtime_ernie/net/core/dev.c (working copy)
@@ -1723,6 +1723,7 @@
 
         while (!list_empty(&queue->poll_list)) {
                 struct net_device *dev;
+                void *have;
 
                 if (budget <= 0 || jiffies - start_time > 1)
                         goto softnet_break;
@@ -1735,10 +1736,10 @@
 
                 dev = list_entry(queue->poll_list.next,
                                  struct net_device, poll_list);
-                netpoll_poll_lock(dev);
+                have = netpoll_poll_lock(dev);
 
                 if (dev->quota <= 0 || dev->poll(dev, &budget)) {
-                        netpoll_poll_unlock(dev);
+                        netpoll_poll_unlock(have);
                         raw_local_irq_disable();
                         list_del(&dev->poll_list);
                         list_add_tail(&dev->poll_list, &queue->poll_list);
@@ -1747,7 +1748,7 @@
                         else
                                 dev->quota = dev->weight;
                 } else {
-                        netpoll_poll_unlock(dev);
+                        netpoll_poll_unlock(have);
                         dev_put(dev);
                         raw_local_irq_disable();
                 }

From paulmck@us.ibm.com Fri Aug 19 15:49:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 15:49:52 -0700 (PDT) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JMnbH9027819 for ; Fri, 19 Aug 2005 15:49:44 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e33.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j7JMlJf5679590 for ; Fri, 19 Aug 2005 18:47:19 -0400 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j7JMlOwi188218 for ; Fri, 19 Aug 2005 16:47:24 -0600 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j7JMlIgn027526 for ; Fri, 19 Aug 2005 16:47:18 -0600 Received: from linux.local (linux-009047022063.beaverton.ibm.com [9.47.22.63]) by d03av04.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j7JMlHwt027496; Fri, 19 Aug 2005 16:47:18 -0600 Received: by linux.local (Postfix on SuSE Linux 7.3 (i386), from userid 500) id D46ED148B3C; Fri, 19 Aug 2005 15:47:58 -0700 (PDT) Date: Fri, 19 Aug 2005 15:47:58 -0700 From: "Paul E.
McKenney" To: Steven Rostedt Cc: Ingo Molnar , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: 2.6.13-rc6-rt6 Message-ID: <20050819224758.GJ1298@us.ibm.com> Reply-To: paulmck@us.ibm.com References: <1124215631.5764.43.camel@localhost.localdomain> <1124218245.5764.52.camel@localhost.localdomain> <1124252419.5764.83.camel@localhost.localdomain> <1124257580.5764.105.camel@localhost.localdomain> <20050817064750.GA8395@elte.hu> <1124287505.5764.141.camel@localhost.localdomain> <1124288677.5764.154.camel@localhost.localdomain> <1124295214.5764.163.camel@localhost.localdomain> <20050817162324.GA24495@elte.hu> <1124486548.18408.18.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1124486548.18408.18.camel@localhost.localdomain> User-Agent: Mutt/1.4.1i X-archive-position: 3522 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paulmck@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 6850 Lines: 217 On Fri, Aug 19, 2005 at 05:22:28PM -0400, Steven Rostedt wrote: > On Wed, 2005-08-17 at 18:23 +0200, Ingo Molnar wrote: > > > > > > And it goes on and on. This happens everytime. Without netconsole, I > > > only get the nonzero lock count error. Also, one of my lockups on SMP > > > had to do with the kernel_thread_helper: > > > > > > Using IPI Shortcut mode > > > khelper/794[CPU#0]: BUG in set_new_owner at kernel/rt.c:916 > > This was with netconsole and showed up after a bunch of other bugs. So > this is a side effect of what happened earlier. > > > > this is a 'must not happen'. Somehow lock->held list got non-empty. > > Maybe some use-after-free thing? Havent seen it myself. > > I started debugging netconsole with the RT patch and found this > happening. After seeing what's wrong, I looked at the latest git > branch, and it seems to already have a similar solution that I was going > to make. 
Here's a description of what's wrong. > > In net/core/dev.c the following code is in net_rx_action: > > netpoll_poll_lock(dev); > > if (dev->quota <= 0 || dev->poll(dev, &budget)) { > netpoll_poll_unlock(dev); > raw_local_irq_disable(); > list_del(&dev->poll_list); > list_add_tail(&dev->poll_list, &queue->poll_list); > if (dev->quota < 0) > dev->quota += dev->weight; > else > dev->quota = dev->weight; > } else { > netpoll_poll_unlock(dev); > > The netpoll_poll_lock and netpoll_poll_unlock look like this (in current RT): > > static inline netpoll_poll_lock(struct net_device *dev) > { > if (dev->npinfo) { > spin_lock(&dev->npinfo->poll_lock); > dev->npinfo->poll_owner = smp_processor_id(); > } > } > > static inline void netpoll_poll_unlock(struct net_device *dev) > { > if (dev->npinfo) { > dev->npinfo->poll_owner = -1; > spin_unlock(&dev->npinfo->poll_lock); > } > } > > > The problem here is that between netpoll_poll_lock and > netpoll_poll_unlock the dev->npinfo gets assigned. So we unlock the > dev->npinfo->poll_lock without ever locking it. > > Here's the port from the latest git to solve this. I've CCed the netdev, > since I'm not sure I got all the places for rcu_lock for the netpoll. At > least to solve this problem. I did boot up the kernel and this patch > did fix my bugs that I was getting using netconsole. (I have one more > patch to send to fix the illegal API messages). 
>
> -- Steve
>
> Signed-off-by: Steven Rostedt
>
> Index: linux_realtime_ernie/include/linux/netpoll.h
> ===================================================================
> --- linux_realtime_ernie/include/linux/netpoll.h (revision 296)
> +++ linux_realtime_ernie/include/linux/netpoll.h (working copy)
> @@ -60,25 +60,31 @@
>          return ret;
>  }

Good catch -- but a few changes needed to be perfectly safe:

static inline void *netpoll_poll_lock(struct net_device *dev)
{

        struct netpoll_info *npi;

        rcu_read_lock();
        npi = rcu_dereference(dev)->npinfo;
        if (have) {
                spin_lock(&npi->poll_lock);
                npi->poll_owner = smp_processor_id();
                return npi;
        }
        return NULL;
}

The earlier version could get in trouble if dev->npinfo was set
to NULL while this was executing.

I am assuming that the dev pointer is really what is being RCU-protected,
but this example uses mostly static data structures, so it is hard for
me to tell. The npinfo pointer is not being RCU protected, as it appears
to never be changed. The other candidate is the rx_np pointer, which is
set to NULL in netpoll_cleanup. I suggest a modification to
netpoll_cleanup below that handles both possibilities.

Of course, I might well be missing something...
> -static inline void netpoll_poll_unlock(struct net_device *dev)
> +static inline void netpoll_poll_unlock(void *have)
>  {
> -        if (dev->npinfo) {
> -                dev->npinfo->poll_owner = -1;
> -                spin_unlock(&dev->npinfo->poll_lock);
> +        struct netpoll_info *npi = have;
> +
> +        if (npi) {
> +                npi->poll_owner = -1;
> +                spin_unlock(&npi->poll_lock);
>          }
> +        rcu_read_unlock();
>  }
>
>  #else
>  #define netpoll_rx(a) 0
> -#define netpoll_poll_lock(a)
> +#define netpoll_poll_lock(a) 0
>  #define netpoll_poll_unlock(a)
>  #endif
>
> Index: linux_realtime_ernie/net/core/netpoll.c
> ===================================================================
> --- linux_realtime_ernie/net/core/netpoll.c (revision 296)
> +++ linux_realtime_ernie/net/core/netpoll.c (working copy)

If netpoll_setup() is implicitly tearing down an earlier netpoll_setup(),
then something like Steve's change below might be needed.

> @@ -726,6 +726,9 @@
>          /* last thing to do is link it to the net device structure */
>          ndev->npinfo = npinfo;
>
> +        /* avoid racing with NAPI reading npinfo */
> +        synchronize_rcu();
> +
>          return 0;
>
>  release:

Assuming that it is legal to block in netpoll_cleanup(), the following
should work. The idea is to NULL the dev pointer, wait for all RCU
readers to get done, and only then complete the cleanup.

void netpoll_cleanup(struct netpoll *np)
{
        struct netpoll_info *npinfo;
        unsigned long flags;
        struct net_device *dp;

        if (np->dev) {
                dp = np->dev;
                rcu_assign_pointer(np->dev, NULL);
                synchronize_rcu();
                npinfo = dp->npinfo;
                if (npinfo && npinfo->rx_np == np) {
                        spin_lock_irqsave(&npinfo->rx_lock, flags);
                        npinfo->rx_np = NULL;
                        npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
                        spin_unlock_irqrestore(&npinfo->rx_lock, flags);
                }
                dev_put(dp);
        }
}

Again, I do not fully understand this code, so a grain of salt might
come in handy. But there definitely need to be some rcu_dereference()
and rcu_assign_pointer() primitives in there somewhere.
;-) The following changes look good to me, but, as I said earlier, I do not claim to fully understand this code. > Index: linux_realtime_ernie/net/core/dev.c > =================================================================== > --- linux_realtime_ernie/net/core/dev.c (revision 296) > +++ linux_realtime_ernie/net/core/dev.c (working copy) > @@ -1723,6 +1723,7 @@ > > while (!list_empty(&queue->poll_list)) { > struct net_device *dev; > + void *have; > > if (budget <= 0 || jiffies - start_time > 1) > goto softnet_break; > @@ -1735,10 +1736,10 @@ > > dev = list_entry(queue->poll_list.next, > struct net_device, poll_list); > - netpoll_poll_lock(dev); > + have = netpoll_poll_lock(dev); > > if (dev->quota <= 0 || dev->poll(dev, &budget)) { > - netpoll_poll_unlock(dev); > + netpoll_poll_unlock(have); > raw_local_irq_disable(); > list_del(&dev->poll_list); > list_add_tail(&dev->poll_list, &queue->poll_list); > @@ -1747,7 +1748,7 @@ > else > dev->quota = dev->weight; > } else { > - netpoll_poll_unlock(dev); > + netpoll_poll_unlock(have); > dev_put(dev); > raw_local_irq_disable(); > } Thanx, Paul From rostedt@goodmis.org Fri Aug 19 16:05:07 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 16:05:16 -0700 (PDT) Received: from ms-smtp-01.nyroc.rr.com (ms-smtp-01.nyroc.rr.com [24.24.2.55]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JN56H9030109 for ; Fri, 19 Aug 2005 16:05:07 -0700 Received: from [192.168.23.9] (cpe-24-94-57-164.stny.res.rr.com [24.94.57.164]) by ms-smtp-01.nyroc.rr.com (8.12.10/8.12.10) with ESMTP id j7JN2hDX000638; Fri, 19 Aug 2005 19:02:43 -0400 (EDT) Subject: Re: 2.6.13-rc6-rt6 From: Steven Rostedt To: paulmck@us.ibm.com Cc: Ingo Molnar , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20050819224758.GJ1298@us.ibm.com> References: <1124215631.5764.43.camel@localhost.localdomain> <1124218245.5764.52.camel@localhost.localdomain> <1124252419.5764.83.camel@localhost.localdomain> 
<1124257580.5764.105.camel@localhost.localdomain> <20050817064750.GA8395@elte.hu> <1124287505.5764.141.camel@localhost.localdomain> <1124288677.5764.154.camel@localhost.localdomain> <1124295214.5764.163.camel@localhost.localdomain> <20050817162324.GA24495@elte.hu> <1124486548.18408.18.camel@localhost.localdomain> <20050819224758.GJ1298@us.ibm.com> Content-Type: text/plain Organization: Kihon Technologies Date: Fri, 19 Aug 2005 19:02:42 -0400 Message-Id: <1124492562.18408.35.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.3 Content-Transfer-Encoding: 7bit X-archive-position: 3523 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rostedt@goodmis.org Precedence: bulk X-list: netdev Content-Length: 1657 Lines: 52 On Fri, 2005-08-19 at 15:47 -0700, Paul E. McKenney wrote: > Good catch -- but a few changes needed to be perfectly safe: > > static inline void *netpoll_poll_lock(struct net_device *dev) > { > > struct netpoll_info *npi; > > rcu_read_lock(); > npi = rcu_dereference(dev)->npinfo; > if (have) { Here I'm sure you mean "if (npi) {" :-) > spin_lock(&npi->poll_lock); > npi->poll_owner = smp_processor_id(); > return npi; > } > return NULL; > } > > The earlier version could get in trouble if dev->npinfo was set > to NULL while this was executing. Truth be told, I was just fixing the race with getting the npinfo pointer set between netpoll_poll_lock and netpoll_poll_unlock. I wrote a patch that fixed that but nothing with the rcu_locks. Then I looked at the current git tree and saw that they already had my changes, but also included the rcu locks. So I just (blindly) added them. > > Again, I do not fully understand this code, so a grain of salt might > come in handy. But there definitely need to be some rcu_dereference() > and rcu_assign_pointer() primitives in there somewhere. 
;-) > > The following changes look good to me, but, as I said earlier, I do > not claim to fully understand this code. netpoll has changed quite a bit in the last few releases. I've seen lots of fixup code sent in (which usually means there's lots of new broken code ;-) Anyway, I don't quite fully understand RCU. I read a few of the documents on your web site, but I haven't had time to really digest it. Have you taken a look at the latest git tree? The rcu_locks are used for net poll quite a bit more there. -- Steve From paulmck@us.ibm.com Fri Aug 19 16:14:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 16:14:51 -0700 (PDT) Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.145]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JNEjH9031113 for ; Fri, 19 Aug 2005 16:14:45 -0700 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e5.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j7JNCDLP013116 for ; Fri, 19 Aug 2005 19:12:13 -0400 Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay02.pok.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j7JNCDXY267748 for ; Fri, 19 Aug 2005 19:12:13 -0400 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j7JNBxwr011309 for ; Fri, 19 Aug 2005 19:12:00 -0400 Received: from linux.local (linux-009047022063.beaverton.ibm.com [9.47.22.63]) by d01av03.pok.ibm.com (8.12.11/8.12.11) with ESMTP id j7JNBwgD010978; Fri, 19 Aug 2005 19:11:59 -0400 Received: by linux.local (Postfix on SuSE Linux 7.3 (i386), from userid 500) id E7328148B3C; Fri, 19 Aug 2005 16:12:23 -0700 (PDT) Date: Fri, 19 Aug 2005 16:12:23 -0700 From: "Paul E. 
McKenney" To: Steven Rostedt Cc: Ingo Molnar , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: 2.6.13-rc6-rt6 Message-ID: <20050819231223.GN1298@us.ibm.com> Reply-To: paulmck@us.ibm.com References: <1124252419.5764.83.camel@localhost.localdomain> <1124257580.5764.105.camel@localhost.localdomain> <20050817064750.GA8395@elte.hu> <1124287505.5764.141.camel@localhost.localdomain> <1124288677.5764.154.camel@localhost.localdomain> <1124295214.5764.163.camel@localhost.localdomain> <20050817162324.GA24495@elte.hu> <1124486548.18408.18.camel@localhost.localdomain> <20050819224758.GJ1298@us.ibm.com> <1124492562.18408.35.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1124492562.18408.35.camel@localhost.localdomain> User-Agent: Mutt/1.4.1i X-archive-position: 3524 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paulmck@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1936 Lines: 56 On Fri, Aug 19, 2005 at 07:02:42PM -0400, Steven Rostedt wrote: > On Fri, 2005-08-19 at 15:47 -0700, Paul E. McKenney wrote: > > > Good catch -- but a few changes needed to be perfectly safe: > > > > static inline void *netpoll_poll_lock(struct net_device *dev) > > { > > > > struct netpoll_info *npi; > > > > rcu_read_lock(); > > npi = rcu_dereference(dev)->npinfo; > > if (have) { > > Here I'm sure you mean "if (npi) {" :-) Right you are! ;-) > > spin_lock(&npi->poll_lock); > > npi->poll_owner = smp_processor_id(); > > return npi; > > } > > return NULL; > > } > > > > The earlier version could get in trouble if dev->npinfo was set > > to NULL while this was executing. > > Truth be told, I was just fixing the race with getting the npinfo > pointer set between netpoll_poll_lock and netpoll_poll_unlock. I wrote > a patch that fixed that but nothing with the rcu_locks. 
Then I looked > at the current git tree and saw that they already had my changes, but > also included the rcu locks. So I just (blindly) added them. Understood! > > Again, I do not fully understand this code, so a grain of salt might > > come in handy. But there definitely need to be some rcu_dereference() > > and rcu_assign_pointer() primitives in there somewhere. ;-) > > > > The following changes look good to me, but, as I said earlier, I do > > not claim to fully understand this code. > > netpoll has changed quite a bit in the last few releases. I've seen lots > of fixup code sent in (which usually means there's lots of new broken > code ;-) > > Anyway, I don't quite fully understand RCU. I read a few of the > documents on your web site, but I haven't had time to really digest it. > Have you taken a look at the latest git tree? The rcu_locks are used > for net poll quite a bit more there. Hmmm.... Guess it is time for me to stop procrastinating on better understanding git... Thanx, Paul From rostedt@goodmis.org Fri Aug 19 16:22:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 16:22:55 -0700 (PDT) Received: from ms-smtp-02.nyroc.rr.com (ms-smtp-02.nyroc.rr.com [24.24.2.56]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JNMnH9003374 for ; Fri, 19 Aug 2005 16:22:49 -0700 Received: from [192.168.23.9] (cpe-24-94-57-164.stny.res.rr.com [24.94.57.164]) by ms-smtp-02.nyroc.rr.com (8.12.10/8.12.10) with ESMTP id j7JNKQIU015555; Fri, 19 Aug 2005 19:20:26 -0400 (EDT) Subject: Re: 2.6.13-rc6-rt6 From: Steven Rostedt To: paulmck@us.ibm.com Cc: Ingo Molnar , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20050819231223.GN1298@us.ibm.com> References: <1124252419.5764.83.camel@localhost.localdomain> <1124257580.5764.105.camel@localhost.localdomain> <20050817064750.GA8395@elte.hu> <1124287505.5764.141.camel@localhost.localdomain> <1124288677.5764.154.camel@localhost.localdomain> 
<1124295214.5764.163.camel@localhost.localdomain> <20050817162324.GA24495@elte.hu> <1124486548.18408.18.camel@localhost.localdomain> <20050819224758.GJ1298@us.ibm.com> <1124492562.18408.35.camel@localhost.localdomain> <20050819231223.GN1298@us.ibm.com> Content-Type: text/plain Organization: Kihon Technologies Date: Fri, 19 Aug 2005 19:20:20 -0400 Message-Id: <1124493620.18408.40.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.3 Content-Transfer-Encoding: 7bit X-archive-position: 3525 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rostedt@goodmis.org Precedence: bulk X-list: netdev Content-Length: 547 Lines: 16 On Fri, 2005-08-19 at 16:12 -0700, Paul E. McKenney wrote: > > Hmmm.... Guess it is time for me to stop procrastinating on better > understanding git... Why? I still don't. Just go to http://www.kernel.org/ and download the latest git release (as of now it's -git11). Of course you need to know the special combination :-) http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.tar.bz2 http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.13-rc6.bz2 http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.13-rc6-git11.bz2 -- Steve From paulmck@us.ibm.com Fri Aug 19 16:46:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 16:46:40 -0700 (PDT) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7JNkTH9005629 for ; Fri, 19 Aug 2005 16:46:35 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e32.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j7JNiAwh070892 for ; Fri, 19 Aug 2005 19:44:10 -0400 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j7JNiGwi204238 for ; Fri, 19 Aug 2005 17:44:16 -0600 Received: from 
d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j7JNi9fF007455 for ; Fri, 19 Aug 2005 17:44:10 -0600 Received: from linux.local (linux-009047022063.beaverton.ibm.com [9.47.22.63]) by d03av04.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j7JNi93t007430; Fri, 19 Aug 2005 17:44:09 -0600 Received: by linux.local (Postfix on SuSE Linux 7.3 (i386), from userid 500) id 72386148B4A; Fri, 19 Aug 2005 16:44:50 -0700 (PDT) Date: Fri, 19 Aug 2005 16:44:50 -0700 From: "Paul E. McKenney" To: Steven Rostedt Cc: Ingo Molnar , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: 2.6.13-rc6-rt6 Message-ID: <20050819234450.GQ1298@us.ibm.com> Reply-To: paulmck@us.ibm.com References: <20050817064750.GA8395@elte.hu> <1124287505.5764.141.camel@localhost.localdomain> <1124288677.5764.154.camel@localhost.localdomain> <1124295214.5764.163.camel@localhost.localdomain> <20050817162324.GA24495@elte.hu> <1124486548.18408.18.camel@localhost.localdomain> <20050819224758.GJ1298@us.ibm.com> <1124492562.18408.35.camel@localhost.localdomain> <20050819231223.GN1298@us.ibm.com> <1124493620.18408.40.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1124493620.18408.40.camel@localhost.localdomain> User-Agent: Mutt/1.4.1i X-archive-position: 3526 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paulmck@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 706 Lines: 20 On Fri, Aug 19, 2005 at 07:20:20PM -0400, Steven Rostedt wrote: > On Fri, 2005-08-19 at 16:12 -0700, Paul E. McKenney wrote: > > > > Hmmm.... Guess it is time for me to stop procrastinating on better > > understanding git... > > Why? I still don't. Just go to http://www.kernel.org/ and download the > latest git release (as of now it's -git11). 
> > Of course you need to know the special combination :-) > > http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.tar.bz2 > http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.13-rc6.bz2 > http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.13-rc6-git11.bz2 I feel very much enlightened! ;-) Thank you very much!!! Thanx, Paul From jgarzik@pobox.com Fri Aug 19 18:44:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 19 Aug 2005 18:44:41 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7K1iaH9015360 for ; Fri, 19 Aug 2005 18:44:37 -0700 Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E6INE-0002UY-8C; Sat, 20 Aug 2005 01:42:16 +0000 Message-ID: <43068A76.3070707@pobox.com> Date: Fri, 19 Aug 2005 21:42:14 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: ravinandan.arakali@neterion.com CC: netdev@oss.sgi.com, raghavendra.koushik@neterion.com, leonid.grossman@neterion.com, rapuru.sriram@neterion.com, ananda.raju@neterion.com Subject: Re: [PATCH 2.6.13-rc6] S2io: Hardware fixes for Xframe II adapter References: <20050812171559.F0C03983D0@linux.site> In-Reply-To: <20050812171559.F0C03983D0@linux.site> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3527 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 9 Lines: 2 applied From dennismail@gmx.net Mon Aug 22 00:17:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 00:17:50 -0700 (PDT) Received: from mail.gmx.net (mail.gmx.de [213.165.64.20]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id 
j7M7HjH9011015 for ; Mon, 22 Aug 2005 00:17:46 -0700 Received: (qmail invoked by alias); 22 Aug 2005 07:15:25 -0000 Received: from cyclades.de (EHLO [192.168.5.120]) [62.225.173.198] by mail.gmx.net (mp017) with SMTP; 22 Aug 2005 09:15:25 +0200 X-Authenticated: #217599 From: Dennis To: netdev@oss.sgi.com Subject: No Gigabit with r8169 module Date: Mon, 22 Aug 2005 09:21:17 +0200 User-Agent: KMail/1.8 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Message-Id: <200508220921.17956.dennismail@gmx.net> X-Y-GMX-Trusted: 0 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7M7HjH9011015 X-archive-position: 3530 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dennismail@gmx.net Precedence: bulk X-list: netdev Content-Length: 1622 Lines: 38 Hi! I am trying to build up a gigabit network with the following equipment (all from Netgear): - Laptop: GA511 PCMCIA Gigabit Adapter - PC: GA311 PCI Gigabit Card - GS605 5-port Gigabit Switch Both PCMCIA and PCI Cards are running with the r8169 driver. The PC as the laptop are installed with Suse 9.3. The Switch has dual color support on the connection LEDs and both ports (the one connected to the PC and the one connected to the Laptop) are showing that they are connected to gigabit ethernet cards. What I tried to do is to copy a folder from the laptop to the PC, which size is about 6.1 GB. I am using KDE 3.4 and the konqueror with fish-protocol. The copy dialogue says that this wold take about 2.5 hours with a average of 850 kbit/s. This is quite too slow for gigabit. Just to make sure that I didn´t make a hardware connection mistake, i tried the connection with Windows XP and there, a folder which size is about 2.1 GB takes about 10 minutes to copy - seems like gigabit is actually working with all cables I am using. 
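A quick sanity check of the numbers above, as a minimal sketch: it assumes decimal units (1 GB = 10^9 bytes) and takes the quoted sizes and times at face value; the function name avg_mbit_per_s is made up for illustration. Under those assumptions 6.1 GB in 2.5 hours works out to roughly 5.4 Mbit/s, and 2.1 GB in 10 minutes to 28 Mbit/s, so the 850 figure from the copy dialogue may be kB/s rather than kbit/s.

```c
/*
 * Sketch only: average transfer rate from a size and a duration.
 * Assumes decimal units (1 GB = 1e9 bytes, 1 Mbit = 1e6 bits).
 * avg_mbit_per_s is a hypothetical helper, not part of any tool.
 */
double avg_mbit_per_s(double bytes, double seconds)
{
    /* bytes -> bits, then per second, then scale to megabits */
    return bytes * 8.0 / seconds / 1e6;
}
```

Calling avg_mbit_per_s(6.1e9, 2.5 * 3600) gives the Linux figure and avg_mbit_per_s(2.1e9, 600) the Windows XP one; both are far below what a gigabit link can carry.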
Another point was that suse could be the scapegoat, so I used knoppix to test the connection under linux again. I got the same slow speed here. The laptop has three network interfaces, but I switched all of them off except the gigabit card (ifdown ), so no mistake is possible there from my point of view (correct me if I am wrong here). Now I wonder if this is due to the driver or did I do something wrong? I would be happy for any help that gets me further to get my gigabit running. If there are any more details I have to supply, please ask me. Yours, Mike From ralf@linux-mips.org Mon Aug 22 04:05:17 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 04:05:30 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7MB5GH9012268 for ; Mon, 22 Aug 2005 04:05:17 -0700 Received: from extgw-uk.mips.com ([IPv6:::ffff:62.254.210.129]:30996 "EHLO bacchus.net.dhis.org") by linux-mips.org with ESMTP id ; Mon, 22 Aug 2005 11:57:37 +0100 Received: from dea.linux-mips.net (localhost.localdomain [127.0.0.1]) by bacchus.net.dhis.org (8.13.4/8.13.1) with ESMTP id j7MB2I9T007542; Mon, 22 Aug 2005 12:02:18 +0100 Received: (from ralf@localhost) by dea.linux-mips.net (8.13.4/8.13.4/Submit) id j7MB2I7H007541; Mon, 22 Aug 2005 12:02:18 +0100 Date: Mon, 22 Aug 2005 12:02:18 +0100 From: Ralf Baechle To: "David S. 
Miller" , netdev@linux-mips.org Cc: linux-hams@vger.kernel.org Subject: [PATCH] Fix socket bitop damage Message-ID: <20050822110218.GA7514@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 3531 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev The socket flag cleanups that went into 2.6.12-rc1 are basically oring the flags of an old socket into the socket just being created. Unfortunately that one was just initialized by sock_init_data(), so already has SOCK_ZAPPED set. As the result zapped sockets are created and all incoming connection will fail due to this bug which again was carefully replicated to at least AX.25, NET/ROM or ROSE. In order to keep the abstraction alive I've introduced sock_copy_flags() to copy the socket flags from one socket to another and used that instead of the bitwise copy thing. Anyway, the idea here has probably been to copy all flags, so sock_copy_flags() should be the right thing. With this the ham radio protocols are usable again, so I hope this will make it into 2.6.13. 
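The difference between the two patterns can be shown with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code, every model_* name is hypothetical, and model_init() merely mimics the behaviour described above, namely that a freshly initialized socket already has its zapped flag set.

```c
/*
 * Userspace model of the flag handling described above; not kernel code.
 * All model_* names are hypothetical stand-ins for the sock_* helpers.
 */
enum model_flags { MODEL_ZAPPED, MODEL_DBG };

struct model_sock {
    unsigned long flags;
};

int model_test_flag(const struct model_sock *sk, enum model_flags f)
{
    return (int)((sk->flags >> f) & 1UL);
}

void model_set_flag(struct model_sock *sk, enum model_flags f)
{
    sk->flags |= 1UL << f;
}

void model_reset_flag(struct model_sock *sk, enum model_flags f)
{
    sk->flags &= ~(1UL << f);
}

/* Assumption taken from the description: a new socket starts out zapped. */
void model_init(struct model_sock *sk)
{
    sk->flags = 0;
    model_set_flag(sk, MODEL_ZAPPED);
}

/* Old pattern: can only add bits, so a bit set by model_init() survives. */
void model_or_flags(struct model_sock *nsk, const struct model_sock *osk)
{
    if (model_test_flag(osk, MODEL_ZAPPED))
        model_set_flag(nsk, MODEL_ZAPPED);
    if (model_test_flag(osk, MODEL_DBG))
        model_set_flag(nsk, MODEL_DBG);
}

/* New pattern: overwrite, so the new socket mirrors the old one exactly. */
void model_copy_flags(struct model_sock *nsk, const struct model_sock *osk)
{
    nsk->flags = osk->flags;
}
```

With an old socket that is no longer zapped, model_or_flags() leaves the new socket zapped, while model_copy_flags() clears the bit and still carries the debug flag over.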
Signed-off-by: Ralf Baechle DL5RB include/net/sock.h | 5 +++++ net/ax25/af_ax25.c | 7 +------ net/netrom/af_netrom.c | 7 +------ net/rose/af_rose.c | 7 +------ 4 files changed, 8 insertions(+), 18 deletions(-) Index: linux-cvs/net/netrom/af_netrom.c =================================================================== --- linux-cvs.orig/net/netrom/af_netrom.c +++ linux-cvs/net/netrom/af_netrom.c @@ -465,12 +465,7 @@ static struct sock *nr_make_new(struct s sk->sk_sndbuf = osk->sk_sndbuf; sk->sk_state = TCP_ESTABLISHED; sk->sk_sleep = osk->sk_sleep; - - if (sock_flag(osk, SOCK_ZAPPED)) - sock_set_flag(sk, SOCK_ZAPPED); - - if (sock_flag(osk, SOCK_DBG)) - sock_set_flag(sk, SOCK_DBG); + sock_copy_flags(sk, osk); skb_queue_head_init(&nr->ack_queue); skb_queue_head_init(&nr->reseq_queue); Index: linux-cvs/include/net/sock.h =================================================================== --- linux-cvs.orig/include/net/sock.h +++ linux-cvs/include/net/sock.h @@ -384,6 +384,11 @@ enum sock_flags { SOCK_QUEUE_SHRUNK, /* write queue has been shrunk recently */ }; +static inline void sock_copy_flags(struct sock *nsk, struct sock *osk) +{ + nsk->sk_flags = osk->sk_flags; +} + static inline void sock_set_flag(struct sock *sk, enum sock_flags flag) { __set_bit(flag, &sk->sk_flags); Index: linux-cvs/net/ax25/af_ax25.c =================================================================== --- linux-cvs.orig/net/ax25/af_ax25.c +++ linux-cvs/net/ax25/af_ax25.c @@ -884,12 +884,7 @@ struct sock *ax25_make_new(struct sock * sk->sk_sndbuf = osk->sk_sndbuf; sk->sk_state = TCP_ESTABLISHED; sk->sk_sleep = osk->sk_sleep; - - if (sock_flag(osk, SOCK_DBG)) - sock_set_flag(sk, SOCK_DBG); - - if (sock_flag(osk, SOCK_ZAPPED)) - sock_set_flag(sk, SOCK_ZAPPED); + sock_copy_flags(sk, osk); oax25 = ax25_sk(osk); Index: linux-cvs/net/rose/af_rose.c =================================================================== --- linux-cvs.orig/net/rose/af_rose.c +++ linux-cvs/net/rose/af_rose.c @@ -556,12 
+556,7 @@ static struct sock *rose_make_new(struct sk->sk_sndbuf = osk->sk_sndbuf; sk->sk_state = TCP_ESTABLISHED; sk->sk_sleep = osk->sk_sleep; - - if (sock_flag(osk, SOCK_ZAPPED)) - sock_set_flag(sk, SOCK_ZAPPED); - - if (sock_flag(osk, SOCK_DBG)) - sock_set_flag(sk, SOCK_DBG); + sock_copy_flags(sk, osk); init_timer(&rose->timer); init_timer(&rose->idletimer); From ralf@linux-mips.org Mon Aug 22 04:13:38 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 04:13:43 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7MBDbH9013673 for ; Mon, 22 Aug 2005 04:13:37 -0700 Received: from extgw-uk.mips.com ([IPv6:::ffff:62.254.210.129]:21788 "EHLO bacchus.net.dhis.org") by linux-mips.org with ESMTP id ; Mon, 22 Aug 2005 12:05:58 +0100 Received: from dea.linux-mips.net (localhost.localdomain [127.0.0.1]) by bacchus.net.dhis.org (8.13.4/8.13.1) with ESMTP id j7MBAccj007836; Mon, 22 Aug 2005 12:10:38 +0100 Received: (from ralf@localhost) by dea.linux-mips.net (8.13.4/8.13.4/Submit) id j7MBAcg9007835; Mon, 22 Aug 2005 12:10:38 +0100 Date: Mon, 22 Aug 2005 12:10:38 +0100 From: Ralf Baechle To: "David S. Miller" , netdev@linux-mips.org Cc: linux-hams@vger.kernel.org Subject: [PATCH] AX.25 UID fixes Message-ID: <20050822111038.GA7545@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 3532 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev o Brown paperbag bug - ax25_findbyuid() was always returning a NULL pointer as the result. Breaks ROSE completely and AX.25 if the UID policy is set to deny. 
o While the list structure of AX.25's UID to callsign mapping table was properly protected by a spinlock, it's elements were not refcounted resulting in a race between removal and usage of an element. Signed-off-by: Ralf Baechle DL5RB include/net/ax25.h | 18 +++++++++- net/ax25/af_ax25.c | 20 +++++++---- net/ax25/ax25_route.c | 12 ++++--- net/ax25/ax25_uid.c | 83 +++++++++++++++++++++---------------------------- net/netrom/af_netrom.c | 24 +++++++++----- net/rose/af_rose.c | 20 +++++++---- 6 files changed, 100 insertions(+), 77 deletions(-) Index: linux-cvs/include/net/ax25.h =================================================================== --- linux-cvs.orig/include/net/ax25.h +++ linux-cvs/include/net/ax25.h @@ -148,11 +148,25 @@ enum { #define AX25_DEF_DS_TIMEOUT 180000 /* DAMA timeout 3 minutes */ typedef struct ax25_uid_assoc { - struct ax25_uid_assoc *next; + struct hlist_node uid_node; + atomic_t refcount; uid_t uid; ax25_address call; } ax25_uid_assoc; +#define ax25_uid_for_each(__ax25, node, list) \ + hlist_for_each_entry(__ax25, node, list, uid_node) + +#define ax25_uid_hold(ax25) \ + atomic_inc(&((ax25)->refcount)) + +static inline void ax25_uid_put(ax25_uid_assoc *assoc) +{ + if (atomic_dec_and_test(&assoc->refcount)) { + kfree(assoc); + } +} + typedef struct { ax25_address calls[AX25_MAX_DIGIS]; unsigned char repeated[AX25_MAX_DIGIS]; @@ -386,7 +400,7 @@ extern unsigned long ax25_display_timer( /* ax25_uid.c */ extern int ax25_uid_policy; -extern ax25_address *ax25_findbyuid(uid_t); +extern ax25_uid_assoc *ax25_findbyuid(uid_t); extern int ax25_uid_ioctl(int, struct sockaddr_ax25 *); extern struct file_operations ax25_uid_fops; extern void ax25_uid_free(void); Index: linux-cvs/net/ax25/ax25_route.c =================================================================== --- linux-cvs.orig/net/ax25/ax25_route.c +++ linux-cvs/net/ax25/ax25_route.c @@ -422,8 +422,8 @@ static inline void ax25_adjust_path(ax25 */ int ax25_rt_autobind(ax25_cb *ax25, 
ax25_address *addr) { + ax25_uid_assoc *user; ax25_route *ax25_rt; - ax25_address *call; int err; if ((ax25_rt = ax25_get_route(addr, NULL)) == NULL) @@ -434,16 +434,18 @@ int ax25_rt_autobind(ax25_cb *ax25, ax25 goto put; } - if ((call = ax25_findbyuid(current->euid)) == NULL) { + user = ax25_findbyuid(current->euid); + if (user) { + ax25->source_addr = user->call; + ax25_uid_put(user); + } else { if (ax25_uid_policy && !capable(CAP_NET_BIND_SERVICE)) { err = -EPERM; goto put; } - call = (ax25_address *)ax25->ax25_dev->dev->dev_addr; + ax25->source_addr = *(ax25_address *)ax25->ax25_dev->dev->dev_addr; } - ax25->source_addr = *call; - if (ax25_rt->digipeat != NULL) { if ((ax25->digipeat = kmalloc(sizeof(ax25_digi), GFP_ATOMIC)) == NULL) { err = -ENOMEM; Index: linux-cvs/net/ax25/ax25_uid.c =================================================================== --- linux-cvs.orig/net/ax25/ax25_uid.c +++ linux-cvs/net/ax25/ax25_uid.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include #include @@ -41,38 +42,41 @@ * Callsign/UID mapper. This is in kernel space for security on multi-amateur machines. 
*/ -static ax25_uid_assoc *ax25_uid_list; +HLIST_HEAD(ax25_uid_list); static DEFINE_RWLOCK(ax25_uid_lock); int ax25_uid_policy = 0; -ax25_address *ax25_findbyuid(uid_t uid) +ax25_uid_assoc *ax25_findbyuid(uid_t uid) { - ax25_uid_assoc *ax25_uid; - ax25_address *res = NULL; + ax25_uid_assoc *ax25_uid, *res = NULL; + struct hlist_node *node; read_lock(&ax25_uid_lock); - for (ax25_uid = ax25_uid_list; ax25_uid != NULL; ax25_uid = ax25_uid->next) { + ax25_uid_for_each(ax25_uid, node, &ax25_uid_list) { if (ax25_uid->uid == uid) { - res = &ax25_uid->call; + ax25_uid_hold(ax25_uid); + res = ax25_uid; break; } } read_unlock(&ax25_uid_lock); - return NULL; + return res; } int ax25_uid_ioctl(int cmd, struct sockaddr_ax25 *sax) { - ax25_uid_assoc *s, *ax25_uid; + ax25_uid_assoc *ax25_uid; + struct hlist_node *node; + ax25_uid_assoc *user; unsigned long res; switch (cmd) { case SIOCAX25GETUID: res = -ENOENT; read_lock(&ax25_uid_lock); - for (ax25_uid = ax25_uid_list; ax25_uid != NULL; ax25_uid = ax25_uid->next) { + ax25_uid_for_each(ax25_uid, node, &ax25_uid_list) { if (ax25cmp(&sax->sax25_call, &ax25_uid->call) == 0) { res = ax25_uid->uid; break; @@ -85,19 +89,22 @@ int ax25_uid_ioctl(int cmd, struct socka case SIOCAX25ADDUID: if (!capable(CAP_NET_ADMIN)) return -EPERM; - if (ax25_findbyuid(sax->sax25_uid)) + user = ax25_findbyuid(sax->sax25_uid); + if (user) { + ax25_uid_put(user); return -EEXIST; + } if (sax->sax25_uid == 0) return -EINVAL; if ((ax25_uid = kmalloc(sizeof(*ax25_uid), GFP_KERNEL)) == NULL) return -ENOMEM; + atomic_set(&ax25_uid->refcount, 1); ax25_uid->uid = sax->sax25_uid; ax25_uid->call = sax->sax25_call; write_lock(&ax25_uid_lock); - ax25_uid->next = ax25_uid_list; - ax25_uid_list = ax25_uid; + hlist_add_head(&ax25_uid->uid_node, &ax25_uid_list); write_unlock(&ax25_uid_lock); return 0; @@ -106,34 +113,21 @@ int ax25_uid_ioctl(int cmd, struct socka if (!capable(CAP_NET_ADMIN)) return -EPERM; + ax25_uid = NULL; write_lock(&ax25_uid_lock); - for (ax25_uid = 
ax25_uid_list; ax25_uid != NULL; ax25_uid = ax25_uid->next) { - if (ax25cmp(&sax->sax25_call, &ax25_uid->call) == 0) { + ax25_uid_for_each(ax25_uid, node, &ax25_uid_list) { + if (ax25cmp(&sax->sax25_call, &ax25_uid->call) == 0) break; - } } if (ax25_uid == NULL) { write_unlock(&ax25_uid_lock); return -ENOENT; } - if ((s = ax25_uid_list) == ax25_uid) { - ax25_uid_list = s->next; - write_unlock(&ax25_uid_lock); - kfree(ax25_uid); - return 0; - } - while (s != NULL && s->next != NULL) { - if (s->next == ax25_uid) { - s->next = ax25_uid->next; - write_unlock(&ax25_uid_lock); - kfree(ax25_uid); - return 0; - } - s = s->next; - } + hlist_del_init(&ax25_uid->uid_node); + ax25_uid_put(ax25_uid); write_unlock(&ax25_uid_lock); - return -ENOENT; + return 0; default: return -EINVAL; @@ -147,13 +141,11 @@ int ax25_uid_ioctl(int cmd, struct socka static void *ax25_uid_seq_start(struct seq_file *seq, loff_t *pos) { struct ax25_uid_assoc *pt; - int i = 1; + struct hlist_node *node; + int i = 0; read_lock(&ax25_uid_lock); - if (*pos == 0) - return SEQ_START_TOKEN; - - for (pt = ax25_uid_list; pt != NULL; pt = pt->next) { + ax25_uid_for_each(pt, node, &ax25_uid_list) { if (i == *pos) return pt; ++i; @@ -164,8 +156,9 @@ static void *ax25_uid_seq_start(struct s static void *ax25_uid_seq_next(struct seq_file *seq, void *v, loff_t *pos) { ++*pos; - return (v == SEQ_START_TOKEN) ? 
ax25_uid_list : - ((struct ax25_uid_assoc *) v)->next; + + return hlist_entry(((ax25_uid_assoc *)v)->uid_node.next, + ax25_uid_assoc, uid_node); } static void ax25_uid_seq_stop(struct seq_file *seq, void *v) @@ -179,7 +172,6 @@ static int ax25_uid_seq_show(struct seq_ seq_printf(seq, "Policy: %d\n", ax25_uid_policy); else { struct ax25_uid_assoc *pt = v; - seq_printf(seq, "%6d %s\n", pt->uid, ax2asc(&pt->call)); } @@ -213,16 +205,13 @@ struct file_operations ax25_uid_fops = { */ void __exit ax25_uid_free(void) { - ax25_uid_assoc *s, *ax25_uid; + ax25_uid_assoc *ax25_uid; + struct hlist_node *node; write_lock(&ax25_uid_lock); - ax25_uid = ax25_uid_list; - while (ax25_uid != NULL) { - s = ax25_uid; - ax25_uid = ax25_uid->next; - - kfree(s); + ax25_uid_for_each(ax25_uid, node, &ax25_uid_list) { + hlist_del_init(&ax25_uid->uid_node); + ax25_uid_put(ax25_uid); } - ax25_uid_list = NULL; write_unlock(&ax25_uid_lock); } Index: linux-cvs/net/netrom/af_netrom.c =================================================================== --- linux-cvs.orig/net/netrom/af_netrom.c +++ linux-cvs/net/netrom/af_netrom.c @@ -542,7 +542,8 @@ static int nr_bind(struct socket *sock, struct nr_sock *nr = nr_sk(sk); struct full_sockaddr_ax25 *addr = (struct full_sockaddr_ax25 *)uaddr; struct net_device *dev; - ax25_address *user, *source; + ax25_uid_assoc *user; + ax25_address *source; lock_sock(sk); if (!sock_flag(sk, SOCK_ZAPPED)) { @@ -581,16 +582,19 @@ static int nr_bind(struct socket *sock, } else { source = &addr->fsa_ax25.sax25_call; - if ((user = ax25_findbyuid(current->euid)) == NULL) { + user = ax25_findbyuid(current->euid); + if (user) { + nr->user_addr = user->call; + ax25_uid_put(user); + } else { if (ax25_uid_policy && !capable(CAP_NET_BIND_SERVICE)) { release_sock(sk); dev_put(dev); return -EPERM; } - user = source; + nr->user_addr = *source; } - nr->user_addr = *user; nr->source_addr = *source; } @@ -610,7 +614,8 @@ static int nr_connect(struct socket *soc struct sock *sk = 
sock->sk; struct nr_sock *nr = nr_sk(sk); struct sockaddr_ax25 *addr = (struct sockaddr_ax25 *)uaddr; - ax25_address *user, *source = NULL; + ax25_address *source = NULL; + ax25_uid_assoc *user; struct net_device *dev; lock_sock(sk); @@ -651,16 +656,19 @@ static int nr_connect(struct socket *soc } source = (ax25_address *)dev->dev_addr; - if ((user = ax25_findbyuid(current->euid)) == NULL) { + user = ax25_findbyuid(current->euid); + if (user) { + nr->user_addr = user->call; + ax25_uid_put(user); + } else { if (ax25_uid_policy && !capable(CAP_NET_ADMIN)) { dev_put(dev); release_sock(sk); return -EPERM; } - user = source; + nr->user_addr = *source; } - nr->user_addr = *user; nr->source_addr = *source; nr->device = dev; Index: linux-cvs/net/rose/af_rose.c =================================================================== --- linux-cvs.orig/net/rose/af_rose.c +++ linux-cvs/net/rose/af_rose.c @@ -626,7 +626,8 @@ static int rose_bind(struct socket *sock struct rose_sock *rose = rose_sk(sk); struct sockaddr_rose *addr = (struct sockaddr_rose *)uaddr; struct net_device *dev; - ax25_address *user, *source; + ax25_address *source; + ax25_uid_assoc *user; int n; if (!sock_flag(sk, SOCK_ZAPPED)) @@ -651,14 +652,17 @@ static int rose_bind(struct socket *sock source = &addr->srose_call; - if ((user = ax25_findbyuid(current->euid)) == NULL) { + user = ax25_findbyuid(current->euid); + if (user) { + rose->source_call = user->call; + ax25_uid_put(user); + } else { if (ax25_uid_policy && !capable(CAP_NET_BIND_SERVICE)) return -EACCES; - user = source; + rose->source_call = *source; } rose->source_addr = addr->srose_addr; - rose->source_call = *user; rose->device = dev; rose->source_ndigis = addr->srose_ndigis; @@ -685,8 +689,8 @@ static int rose_connect(struct socket *s struct rose_sock *rose = rose_sk(sk); struct sockaddr_rose *addr = (struct sockaddr_rose *)uaddr; unsigned char cause, diagnostic; - ax25_address *user; struct net_device *dev; + ax25_uid_assoc *user; int n; if 
(sk->sk_state == TCP_ESTABLISHED && sock->state == SS_CONNECTING) { @@ -736,12 +740,14 @@ static int rose_connect(struct socket *s if ((dev = rose_dev_first()) == NULL) return -ENETUNREACH; - if ((user = ax25_findbyuid(current->euid)) == NULL) + user = ax25_findbyuid(current->euid); + if (!user) return -EINVAL; memcpy(&rose->source_addr, dev->dev_addr, ROSE_ADDR_LEN); - rose->source_call = *user; + rose->source_call = user->call; rose->device = dev; + ax25_uid_put(user); rose_insert_socket(sk); /* Finish the bind */ } Index: linux-cvs/net/ax25/af_ax25.c =================================================================== --- linux-cvs.orig/net/ax25/af_ax25.c +++ linux-cvs/net/ax25/af_ax25.c @@ -1011,7 +1011,8 @@ static int ax25_bind(struct socket *sock struct sock *sk = sock->sk; struct full_sockaddr_ax25 *addr = (struct full_sockaddr_ax25 *)uaddr; ax25_dev *ax25_dev = NULL; - ax25_address *call; + ax25_uid_assoc *user; + ax25_address call; ax25_cb *ax25; int err = 0; @@ -1030,9 +1031,15 @@ static int ax25_bind(struct socket *sock if (addr->fsa_ax25.sax25_family != AF_AX25) return -EINVAL; - call = ax25_findbyuid(current->euid); - if (call == NULL && ax25_uid_policy && !capable(CAP_NET_ADMIN)) { - return -EACCES; + user = ax25_findbyuid(current->euid); + if (user) { + call = user->call; + ax25_uid_put(user); + } else { + if (ax25_uid_policy && !capable(CAP_NET_ADMIN)) + return -EACCES; + + call = addr->fsa_ax25.sax25_call; } lock_sock(sk); @@ -1043,10 +1050,7 @@ static int ax25_bind(struct socket *sock goto out; } - if (call == NULL) - ax25->source_addr = addr->fsa_ax25.sax25_call; - else - ax25->source_addr = *call; + ax25->source_addr = call; /* * User already set interface with SO_BINDTODEVICE From tgraf@suug.ch Mon Aug 22 04:16:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 04:16:44 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id 
j7MBGdH9014561 for ; Mon, 22 Aug 2005 04:16:39 -0700 Received: from postel.suug.ch ([IPv6:::ffff:195.134.158.23]:33720 "EHLO postel.suug.ch") by linux-mips.org with ESMTP id ; Mon, 22 Aug 2005 12:09:00 +0100 Received: by postel.suug.ch (Postfix, from userid 10001) id 7D7951C0EB; Mon, 22 Aug 2005 13:14:36 +0200 (CEST) Date: Mon, 22 Aug 2005 13:14:36 +0200 From: Thomas Graf To: Ralf Baechle Cc: "David S. Miller" , netdev@linux-mips.org, linux-hams@vger.kernel.org Subject: Re: [PATCH] Fix socket bitop damage Message-ID: <20050822111436.GE17371@postel.suug.ch> References: <20050822110218.GA7514@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050822110218.GA7514@linux-mips.org> X-archive-position: 3533 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Ralf Baechle <20050822110218.GA7514@linux-mips.org> 2005-08-22 12:02 > The socket flag cleanups that went into 2.6.12-rc1 are basically oring > the flags of an old socket into the socket just being created. > Unfortunately that one was just initialized by sock_init_data(), so already > has SOCK_ZAPPED set. As the result zapped sockets are created and all > incoming connection will fail due to this bug which again was carefully > replicated to at least AX.25, NET/ROM or ROSE. I'm probably to one to blame here but I don't get the point yet. 
What I did was to change the bitfield based flags to use sk_flags like this: - sk->sk_zapped = osk->sk_zapped; + + if (sock_flag(osk, SOCK_ZAPPED)) + sock_set_flag(sk, SOCK_ZAPPED); From tgraf@suug.ch Mon Aug 22 04:23:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 04:23:42 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7MBNXH9019470 for ; Mon, 22 Aug 2005 04:23:34 -0700 Received: from postel.suug.ch ([IPv6:::ffff:195.134.158.23]:37048 "EHLO postel.suug.ch") by linux-mips.org with ESMTP id ; Mon, 22 Aug 2005 12:15:54 +0100 Received: by postel.suug.ch (Postfix, from userid 10001) id 8AFA71C0EB; Mon, 22 Aug 2005 13:21:34 +0200 (CEST) Date: Mon, 22 Aug 2005 13:21:34 +0200 From: Thomas Graf To: Ralf Baechle Cc: "David S. Miller" , netdev@linux-mips.org, linux-hams@vger.kernel.org Subject: Re: [PATCH] Fix socket bitop damage Message-ID: <20050822112134.GF17371@postel.suug.ch> References: <20050822110218.GA7514@linux-mips.org> <20050822111436.GE17371@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050822111436.GE17371@postel.suug.ch> X-archive-position: 3534 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Thomas Graf <20050822111436.GE17371@postel.suug.ch> 2005-08-22 13:14 > * Ralf Baechle <20050822110218.GA7514@linux-mips.org> 2005-08-22 12:02 > > The socket flag cleanups that went into 2.6.12-rc1 are basically oring > > the flags of an old socket into the socket just being created. > > Unfortunately that one was just initialized by sock_init_data(), so already > > has SOCK_ZAPPED set. 
As the result zapped sockets are created and all > > incoming connection will fail due to this bug which again was carefully > > replicated to at least AX.25, NET/ROM or ROSE. > > I'm probably to one to blame here but I don't get the point yet. Never mind, I got it, sk->sk_flags may be != 0. From romieu@fr.zoreil.com Mon Aug 22 13:04:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 13:04:37 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7MK4VH9016953 for ; Mon, 22 Aug 2005 13:04:32 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.4/8.12.1) with ESMTP id j7MK1ang029192; Mon, 22 Aug 2005 22:01:36 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.4/8.12.1) id j7MK1Zc8029191; Mon, 22 Aug 2005 22:01:35 +0200 Date: Mon, 22 Aug 2005 22:01:35 +0200 From: Francois Romieu To: Dennis Cc: netdev@oss.sgi.com Subject: Re: No Gigabit with r8169 module Message-ID: <20050822200135.GA28933@electric-eye.fr.zoreil.com> References: <200508220921.17956.dennismail@gmx.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <200508220921.17956.dennismail@gmx.net> User-Agent: Mutt/1.4.2.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 3536 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Content-Length: 1236 Lines: 37 Dennis : [...] > I am trying to build up a gigabit network with the following equipment (all > from Netgear): Same config here. [...] > Both PCMCIA and PCI Cards are running with the r8169 driver. The PC as the > laptop are installed with Suse 9.3. [...] 
> What I tried to do is to copy a folder from the laptop to the PC, which size > is about 6.1 GB. I am using KDE 3.4 and the konqueror with fish-protocol. The > copy dialogue says that this wold take about 2.5 hours with a average of 850 > kbit/s. This is quite too slow for gigabit. - Which kernel are you using ? - Can you try plain old scp -c blowfish $BIG_FILE in a text console (i.e. outside of X) and report the transfer rate ? - If it still sucks, can you try 2.6.13-rc6 ? - Please send: o complete dmesg after boot o lspci -vx o lsmod o cat /proc/interrupts + vmstat 1 during transfer o ethtool ethX Do not hesitate to use bugzilla.kernel.org. > Just to make sure that I didn´t make a hardware connection mistake, i tried > the connection with Windows XP and there, a folder which size is about 2.1 GB > takes about 10 minutes to copy - seems like gigabit is actually working with It means 3~4 Mo/s. Unimpressing. -- Ueimor From dale@farnsworth.org Mon Aug 22 15:55:51 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 15:55:58 -0700 (PDT) Received: from xyzzy.farnsworth.org (h142-az.mvista.com [65.200.49.142] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j7MMtoH9032192 for ; Mon, 22 Aug 2005 15:55:50 -0700 Received: (qmail 27027 invoked by uid 1000); 22 Aug 2005 22:53:29 -0000 From: "Dale Farnsworth" Date: Mon, 22 Aug 2005 15:53:29 -0700 To: netdev@oss.sgi.com Cc: Jeff Garzik Subject: [PATCH] [NET] mv643xx: add workaround for HW checksum generation bug Message-ID: <20050822225329.GA25560@xyzzy.farnsworth.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 3537 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dale@farnsworth.org Precedence: bulk X-list: netdev Content-Length: 3543 Lines: 99 [PATCH] [NET] mv643xx: add workaround for HW checksum generation bug The 
hardware checksum generator on the mv64xxx occasionally generates an incorrect checksum. This patch works around the issue and enables hardware checksum generation. Signed-off-by: Dale Farnsworth --- commit 42b926194c0e88445e654b8f11faf199d1409650 tree 0dfda571d490d3d67ee5fc14f473390efeabef84 parent f6fdd7d9c273bb2a20ab467cb57067494f932fa3 author Dale Farnsworth Mon, 22 Aug 2005 15:43:49 -0700 committer Dale Farnsworth Mon, 22 Aug 2005 15:43:49 -0700 drivers/net/mv643xx_eth.c | 29 ++++++++++++++++++----------- drivers/net/mv643xx_eth.h | 4 +++- 2 files changed, 21 insertions(+), 12 deletions(-) diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c --- a/drivers/net/mv643xx_eth.c +++ b/drivers/net/mv643xx_eth.c @@ -1157,16 +1157,20 @@ static int mv643xx_eth_start_xmit(struct if (!skb_shinfo(skb)->nr_frags) { linear: if (skb->ip_summed != CHECKSUM_HW) { + /* Errata BTS #50, IHL must be 5 if no HW checksum */ pkt_info.cmd_sts = ETH_TX_ENABLE_INTERRUPT | - ETH_TX_FIRST_DESC | ETH_TX_LAST_DESC; + ETH_TX_FIRST_DESC | + ETH_TX_LAST_DESC | + 5 << ETH_TX_IHL_SHIFT; pkt_info.l4i_chk = 0; } else { - u32 ipheader = skb->nh.iph->ihl << 11; pkt_info.cmd_sts = ETH_TX_ENABLE_INTERRUPT | - ETH_TX_FIRST_DESC | ETH_TX_LAST_DESC | - ETH_GEN_TCP_UDP_CHECKSUM | - ETH_GEN_IP_V_4_CHECKSUM | ipheader; + ETH_TX_FIRST_DESC | + ETH_TX_LAST_DESC | + ETH_GEN_TCP_UDP_CHECKSUM | + ETH_GEN_IP_V_4_CHECKSUM | + skb->nh.iph->ihl << ETH_TX_IHL_SHIFT; /* CPU already calculated pseudo header checksum. 
*/ if (skb->nh.iph->protocol == IPPROTO_UDP) { pkt_info.cmd_sts |= ETH_UDP_FRAME; @@ -1193,7 +1197,6 @@ linear: stats->tx_bytes += pkt_info.byte_cnt; } else { unsigned int frag; - u32 ipheader; /* Since hardware can't handle unaligned fragments smaller * than 9 bytes, if we find any, we linearize the skb @@ -1222,12 +1225,16 @@ linear: DMA_TO_DEVICE); pkt_info.l4i_chk = 0; pkt_info.return_info = 0; - pkt_info.cmd_sts = ETH_TX_FIRST_DESC; - if (skb->ip_summed == CHECKSUM_HW) { - ipheader = skb->nh.iph->ihl << 11; - pkt_info.cmd_sts |= ETH_GEN_TCP_UDP_CHECKSUM | - ETH_GEN_IP_V_4_CHECKSUM | ipheader; + if (skb->ip_summed != CHECKSUM_HW) + /* Errata BTS #50, IHL must be 5 if no HW checksum */ + pkt_info.cmd_sts = ETH_TX_FIRST_DESC | + 5 << ETH_TX_IHL_SHIFT; + else { + pkt_info.cmd_sts = ETH_TX_FIRST_DESC | + ETH_GEN_TCP_UDP_CHECKSUM | + ETH_GEN_IP_V_4_CHECKSUM | + skb->nh.iph->ihl << ETH_TX_IHL_SHIFT; /* CPU already calculated pseudo header checksum. */ if (skb->nh.iph->protocol == IPPROTO_UDP) { pkt_info.cmd_sts |= ETH_UDP_FRAME; diff --git a/drivers/net/mv643xx_eth.h b/drivers/net/mv643xx_eth.h --- a/drivers/net/mv643xx_eth.h +++ b/drivers/net/mv643xx_eth.h @@ -49,7 +49,7 @@ /* Checksum offload for Tx works for most packets, but * fails if previous packet sent did not use hw csum */ -#undef MV643XX_CHECKSUM_OFFLOAD_TX +#define MV643XX_CHECKSUM_OFFLOAD_TX #define MV643XX_NAPI #define MV643XX_TX_FAST_REFILL #undef MV643XX_RX_QUEUE_FILL_ON_TASK /* Does not work, yet */ @@ -217,6 +217,8 @@ #define ETH_TX_ENABLE_INTERRUPT (BIT23) #define ETH_AUTO_MODE (BIT30) +#define ETH_TX_IHL_SHIFT 11 + /* typedefs */ typedef enum _eth_func_ret_status { From dale@farnsworth.org Mon Aug 22 16:52:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 16:52:45 -0700 (PDT) Received: from xyzzy.farnsworth.org (h142-az.mvista.com [65.200.49.142] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j7MNqeH9008621 for ; Mon, 22 Aug 2005 16:52:40 -0700 
Received: (qmail 29975 invoked by uid 1000); 22 Aug 2005 23:50:19 -0000 From: "Dale Farnsworth" Date: Mon, 22 Aug 2005 16:50:19 -0700 To: Jeff Garzik , Netdev Subject: [PATCH] [NET] mii: Add test for GigE support Message-ID: <20050822235019.GA29630@xyzzy.farnsworth.org> References: <20050322231746.GA27770@xyzzy> <4240A9F3.5040704@pobox.com> <212e5bf54766a68d2ab8716574225203@freescale.com> <4240CABB.5090701@pobox.com> <20050323061446.GA6943@xyzzy> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050323061446.GA6943@xyzzy> User-Agent: Mutt/1.5.9i X-archive-position: 3538 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dale@farnsworth.org Precedence: bulk X-list: netdev Content-Length: 3483 Lines: 82 [This patch was submitted on 22Mar2005 and jgarzik said, "applied, thanks", but it may have been lost in the git transition. I've updated it to current offsets.] Signed-off-by: Dale Farnsworth Index: netdev-2.6-mv643xx-enet/drivers/net/mii.c =================================================================== --- netdev-2.6-mv643xx-enet.orig/drivers/net/mii.c +++ netdev-2.6-mv643xx-enet/drivers/net/mii.c @@ -207,6 +207,21 @@ return 0; } +int mii_check_gmii_support(struct mii_if_info *mii) +{ + int reg; + + reg = mii->mdio_read(mii->dev, mii->phy_id, MII_BMSR); + if (reg & BMSR_HAS_EXTSTAT1000) { + reg = mii->mdio_read(mii->dev, mii->phy_id, MII_EXTSTAT1000); + if (reg & (ESR_1000_BASE_X_FD | ESR_1000_BASE_T_FD | + ESR_1000_BASE_X_HD | ESR_1000_BASE_T_HD)) + return 1; + } + + return 0; +} + int mii_link_ok (struct mii_if_info *mii) { /* first, a dummy read, needed to latch some MII phys */ @@ -394,5 +409,6 @@ EXPORT_SYMBOL(mii_ethtool_sset); EXPORT_SYMBOL(mii_check_link); EXPORT_SYMBOL(mii_check_media); +EXPORT_SYMBOL(mii_check_gmii_support); EXPORT_SYMBOL(generic_mii_ioctl); Index: netdev-2.6-mv643xx-enet/include/linux/mii.h 
=================================================================== --- netdev-2.6-mv643xx-enet.orig/include/linux/mii.h +++ netdev-2.6-mv643xx-enet/include/linux/mii.h @@ -22,6 +22,7 @@ #define MII_EXPANSION 0x06 /* Expansion register */ #define MII_CTRL1000 0x09 /* 1000BASE-T control */ #define MII_STAT1000 0x0a /* 1000BASE-T status */ +#define MII_EXTSTAT1000 0x0f /* 1000BASE-XX extended status */ #define MII_DCOUNTER 0x12 /* Disconnect counter */ #define MII_FCSCOUNTER 0x13 /* False carrier counter */ #define MII_NWAYTEST 0x14 /* N-way auto-neg test reg */ @@ -54,7 +55,8 @@ #define BMSR_ANEGCAPABLE 0x0008 /* Able to do auto-negotiation */ #define BMSR_RFAULT 0x0010 /* Remote fault detected */ #define BMSR_ANEGCOMPLETE 0x0020 /* Auto-negotiation complete */ -#define BMSR_RESV 0x07c0 /* Unused... */ +#define BMSR_HAS_EXTSTAT1000 0x0100 /* Has 1000BASE extended status*/ +#define BMSR_RESV 0x06c0 /* Unused... */ #define BMSR_10HALF 0x0800 /* Can do 10mbps, half-duplex */ #define BMSR_10FULL 0x1000 /* Can do 10mbps, full-duplex */ #define BMSR_100HALF 0x2000 /* Can do 100mbps, half-duplex */ @@ -129,6 +131,12 @@ #define LPA_1000FULL 0x0800 /* Link partner 1000BASE-T full duplex */ #define LPA_1000HALF 0x0400 /* Link partner 1000BASE-T half duplex */ +/* 1000BASE Ext Status register */ +#define ESR_1000_BASE_X_FD 0x8000 +#define ESR_1000_BASE_X_HD 0x4000 +#define ESR_1000_BASE_T_FD 0x2000 +#define ESR_1000_BASE_T_HD 0x1000 + struct mii_if_info { int phy_id; int advertising; @@ -151,6 +159,7 @@ extern int mii_nway_restart (struct mii_if_info *mii); extern int mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd); extern int mii_ethtool_sset(struct mii_if_info *mii, struct ethtool_cmd *ecmd); +extern int mii_check_gmii_support(struct mii_if_info *mii); extern void mii_check_link (struct mii_if_info *mii); extern unsigned int mii_check_media (struct mii_if_info *mii, unsigned int ok_to_print, From jgarzik@pobox.com Mon Aug 22 22:37:34 2005 Received: with 
ECARTIS (v1.0.0; list netdev); Mon, 22 Aug 2005 22:37:41 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7N5bYH9010956 for ; Mon, 22 Aug 2005 22:37:34 -0700 Received: from cpe-069-134-188-146.nc.res.rr.com ([69.134.188.146] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1E7RRA-0004TN-W2; Tue, 23 Aug 2005 05:35:06 +0000 Message-ID: <430AB587.3030506@pobox.com> Date: Tue, 23 Aug 2005 01:35:03 -0400 From: Jeff Garzik User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Arthur Kepner CC: "David S. Miller" , netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net Subject: Re: [RESEND] [PATCH] bond inherits zero-copy flags of slaves References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3539 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 9 Lines: 2 applied From davem@davemloft.net Tue Aug 23 09:27:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 09:27:06 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7NGR3H9022329 for ; Tue, 23 Aug 2005 09:27:03 -0700 Received: from dsl027-180-168.sfo1.dsl.speakeasy.net ([IPv6:::ffff:216.27.180.168]:65469 "EHLO sunset.davemloft.net") by linux-mips.org with ESMTP id ; Tue, 23 Aug 2005 17:19:15 +0100 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1E7bZr-0006XG-J3; Tue, 23 Aug 2005 09:24:43 -0700 Date: Tue, 23 Aug 2005 09:24:43 -0700 (PDT) Message-Id: <20050823.092443.42164438.davem@davemloft.net> To: ralf@linux-mips.org Cc: netdev@linux-mips.org, 
linux-hams@vger.kernel.org Subject: Re: [PATCH] AX.25 UID fixes From: "David S. Miller" In-Reply-To: <20050822111038.GA7545@linux-mips.org> References: <20050822111038.GA7545@linux-mips.org> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3542 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 540 Lines: 13 From: Ralf Baechle Date: Mon, 22 Aug 2005 12:10:38 +0100 > o Brown paperbag bug - ax25_findbyuid() was always returning a NULL pointer > as the result. Breaks ROSE completly and AX.25 if UID policy set to deny. > > o While the list structure of AX.25's UID to callsign mapping table was > properly protected by a spinlock, it's elements were not refcounted > resulting in a race between removal and usage of an element. > > Signed-off-by: Ralf Baechle DL5RB Applied, thanks Ralf. From davem@davemloft.net Tue Aug 23 09:24:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 09:25:07 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7NGOuH9021949 for ; Tue, 23 Aug 2005 09:24:58 -0700 Received: from dsl027-180-168.sfo1.dsl.speakeasy.net ([IPv6:::ffff:216.27.180.168]:17077 "EHLO sunset.davemloft.net") by linux-mips.org with ESMTP id ; Tue, 23 Aug 2005 17:17:08 +0100 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1E7bXm-0006W0-Ux; Tue, 23 Aug 2005 09:22:35 -0700 Date: Tue, 23 Aug 2005 09:22:34 -0700 (PDT) Message-Id: <20050823.092234.112408202.davem@davemloft.net> To: ralf@linux-mips.org Cc: netdev@linux-mips.org, linux-hams@vger.kernel.org Subject: Re: [PATCH] Fix socket bitop damage From: "David S. 
Miller"
In-Reply-To: <20050822110218.GA7514@linux-mips.org>
References: <20050822110218.GA7514@linux-mips.org>
X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 3541
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev
Content-Length: 968
Lines: 21

From: Ralf Baechle
Date: Mon, 22 Aug 2005 12:02:18 +0100

> The socket flag cleanups that went into 2.6.12-rc1 are basically oring
> the flags of an old socket into the socket just being created.
> Unfortunately that one was just initialized by sock_init_data(), so already
> has SOCK_ZAPPED set. As the result zapped sockets are created and all
> incoming connection will fail due to this bug which again was carefully
> replicated to at least AX.25, NET/ROM or ROSE.
>
> In order to keep the abstraction alive I've introduced sock_copy_flags()
> to copy the socket flags from one sockets to another and used that
> instead of the bitwise copy thing. Anyway, the idea here has probably
> been to copy all flags, so sock_copy_flags() should be the right thing.
> With this the ham radio protocols are usable again, so I hope this will
> make it into 2.6.13.
>
> Signed-off-by: Ralf Baechle DL5RB

Applied, thanks Ralf.
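
The failure mode described in this thread can be modeled in a few lines of user-space C. This is a minimal sketch, not kernel code: struct sock is reduced to its flag word, the bit numbers are illustrative, and sock_init_data_model() and inherit_zapped_buggy() are hypothetical names standing in for sock_init_data() and the 2.6.12-rc1 conversion quoted earlier. sock_copy_flags() is assumed here to be a plain copy of the flag word, which is what the patch description suggests.

```c
/* User-space model of the socket flag inheritance bug; not kernel code. */

/* Illustrative bit numbers; the kernel's enum values differ. */
enum { SOCK_DEAD = 0, SOCK_ZAPPED = 1 };

/* Reduced stand-in for the kernel's struct sock: only the flag word. */
struct sock {
	unsigned long sk_flags;
};

static void sock_set_flag(struct sock *sk, int flag)
{
	sk->sk_flags |= 1UL << flag;
}

static int sock_flag(const struct sock *sk, int flag)
{
	return (int)((sk->sk_flags >> flag) & 1UL);
}

/* Hypothetical model of sock_init_data(): a fresh socket starts out zapped. */
static void sock_init_data_model(struct sock *sk)
{
	sk->sk_flags = 0;
	sock_set_flag(sk, SOCK_ZAPPED);
}

/* The converted code from the thread: it can set the flag but never clear it. */
static void inherit_zapped_buggy(struct sock *sk, const struct sock *osk)
{
	if (sock_flag(osk, SOCK_ZAPPED))
		sock_set_flag(sk, SOCK_ZAPPED);
}

/* Assumed shape of the fix: overwrite the new socket's flags wholesale. */
static void sock_copy_flags(struct sock *nsk, const struct sock *osk)
{
	nsk->sk_flags = osk->sk_flags;
}
```

With a listening socket that is not zapped, the conditional variant leaves the freshly initialized child zapped, while the wholesale copy makes the child match the listener.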
From sim@netnation.com Tue Aug 23 12:11:22 2005
Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 12:11:29 -0700 (PDT)
Received: from peace.netnation.com (newpeace.netnation.com [204.174.223.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7NJBKH9005160 for ; Tue, 23 Aug 2005 12:11:20 -0700
Received: from sim by peace.netnation.com with local (Exim 4.50) id 1E7e8j-0006u9-2P; Tue, 23 Aug 2005 12:08:53 -0700
Date: Tue, 23 Aug 2005 12:08:53 -0700
From: Simon Kirby
To: Eric Dumazet
Cc: Robert Olsson , netdev@oss.sgi.com
Subject: Re: Route cache performance
Message-ID: <20050823190852.GA20794@netnation.com>
References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <43014E27.1070104@cosmosbay.com>
User-Agent: Mutt/1.5.9i
X-archive-position: 3543
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sim@netnation.com
Precedence: bulk
X-list: netdev
Content-Length: 6857
Lines: 127

On Tue, Aug 16, 2005 at 04:23:35AM +0200, Eric Dumazet wrote:

> Hi Simon

Hi there!

> I think one of the reason linux 2.6 has worst results is because HZ=1000
> (instead of HZ=100 for linux 2.4)
> So if rt_garbage_collect() has heavy work to do, it usually break out of
> the loop because of :
>
> } while (!in_softirq() && time_before_eq(jiffies, now));

I was under the impression, however, that the code Alexei added last time
I brought up this problem was intended to always allow gc when the
table is full and another entry is attempting to be created, even when
under gc_min_interval.

I'm actually not even interested (yet) in the gc_interval/timer case
because I'm currently testing with a flow creation rate much larger
than max_size per second (the minimum gc_interval being one second).

> Could you please test latest 2.6.13-rc6 kernel on the Opteron machine,
> compiled with HZ=100, with the appended kernel argument :
>
> rhash_entries=8191 ( or rhash_entries=16383 )
>
> and
>
> echo 1 >/proc/sys/net/ipv4/route/gc_interval
> echo 2 >/proc/sys/net/ipv4/route/gc_elasticity
>
> Could you also post some data from your router (like : rtstat -c 20 -i 1)

Sure. Here are results from 2.6.13-rc6 with HZ=100 and
rhash_entries=8191, which sets the max_size to 131072. I'm using lnstat
because the rtstat version I could find doesn't work on newer kernels:

lnstat -c -1 -i 1 -f rt_cache -k entries,in_hit,in_slow_tot,gc_total,gc_ignored,gc_goal_miss,gc_dst_overflow,in_hlist_search

The sender is running "juno 192.168.1.1 31313 0" (juno-z.101f.c):

pid 18492: ran for 40s, 13595333 packets out, 16241091 bytes/s (~340kpps)

Without tweaks to gc_interval and gc_elasticity:

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
      32|     117|     419|       0|       0|       0|       0|       0|
      32|       6|       0|       0|       0|       0|       0|       0|
      32|       2|       0|       0|       0|       0|       0|       0|
      33|       2|       4|       0|       0|       0|       0|       0|
    9033|       2|    9002|     840|     839|       0|       0|    4962|
  131062|      22|  125633|  125629|  125447|     182|     181|  837163|
  131062|       0|   13511|   13509|     900|   12609|   12609|      10|
  131062|       0|    8772|    8770|     600|    8170|    8170|       7|
  131062|       0|    8709|    8706|     600|    8106|    8106|       8|
  131062|       0|    8771|    8770|     600|    8170|    8170|       6|
  131062|       0|    8770|    8768|     600|    8168|    8168|       6|
  131062|       0|    8706|    8704|     600|    8104|    8104|      10|
  131062|       0|    8770|    8770|     600|    8170|    8170|       5|
  131062|       0|    8708|    8706|     600|    8106|    8106|       5|
  131062|       0|    8770|    8769|     600|    8169|    8169|       6|
  131062|       0|    8770|    8769|     600|    8169|    8169|      10|
  131062|       0|    8713|    8706|     600|    8106|    8106|       7|
  131062|       0|    8786|    8769|     600|    8169|    8169|       9|

With tweaks (and after 60 seconds to wait for timer expiry):

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
      28|     632|  424656|  413834|  145906|  267927|  267926|  842370|
      28|       2|       3|       0|       0|       0|       0|       0|
      28|       3|       2|       0|       0|       0|       0|       0|
      28|       2|       4|       0|       0|       0|       0|       0|
   35129|       3|   35999|   27826|   27825|       0|       0|   61913|
  131062|       6|  102045|  102043|   99432|    2611|    2610|  288926|
  131062|       0|   13446|   13442|     900|   12542|   12542|      11|
  131062|       0|   11914|   11909|     800|   11109|   11109|       5|
  131062|       0|    8772|    8770|     599|    8171|    8170|       5|
  131062|       0|    8708|    8708|     600|    8108|    8108|       7|
  131062|       0|    8774|    8771|     600|    8171|    8171|       2|
  131062|       0|    8769|    8769|     600|    8169|    8169|       9|
  131062|       0|    8706|    8704|     600|    8104|    8104|       4|
  131062|       0|    8769|    8768|     599|    8169|    8168|       5|
  131062|       0|    8707|    8706|     600|    8106|    8106|       7|
  131062|       0|    8771|    8768|     600|    8168|    8168|       6|
  131062|       0|    8770|    8768|     600|    8168|    8168|       8|
  131062|       0|    8705|    8704|     600|    8104|    8104|       6|
  131062|       0|    8771|    8768|     600|    8168|    8168|       5|

No visible difference to me.

On stock 2.4.31 with no alterations to the gc settings (and no
rhash_entries as it doesn't exist), lnstat shows:

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
      21|      85|     160|       0|       0|       0|       0|       0|
      21|       4|       2|       0|       0|       0|       0|       0|
      21|       2|       3|       0|       0|       0|       0|       0|
      22|       2|       2|       0|       0|       0|       0|       0|
   18432|      11|  136187|  134158|  134156|       1|       0| 1133784|
   18432|       5|  195891|  195889|  195887|       2|       0| 1763070|
   18432|       9|  195585|  195568|  195566|       2|       0| 1758397|
   18432|       7|  195290|  195281|  195279|       0|       0| 1751884|
   18432|       8|  195587|  195579|  195577|       0|       0| 1754813|
   18432|      20|  195276|  195275|  195273|       0|       0| 1752216|
   18432|      11|  194983|  194980|  194978|       0|       0| 1749822|
   18432|       7|  195288|  195287|  195285|       0|       0| 1752655|
   18432|      13|  195282|  195281|  195279|       0|       0| 1752869|
   18432|      12|  194984|  194984|  194982|       1|       0| 1749589|
   18432|      17|  194978|  194974|  194972|       0|       0| 1748817|
   18432|      11|  194985|  194981|  194979|       0|       0| 1749182|
   18432|      14|  194981|  194977|  194975|       0|       0| 1749287|
   18432|      14|  194682|  194679|  194677|       0|       0| 1746847|
   18432|      11|  194983|  194980|  194978|       0|       0| 1749679|
...and the machine is perfectly responsive. It's dropping packets
(managing to forward ~210 kpps, a little less than 2.4.27), but it's at
least working. 2.6.13-rc6 dribbles out ~33 kpps.

Simon-

From Robert.Olsson@data.slu.se Tue Aug 23 12:59:23 2005
Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 12:59:27 -0700 (PDT)
Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7NJxLH9013348 for ; Tue, 23 Aug 2005 12:59:22 -0700
Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j7NJurxe027337; Tue, 23 Aug 2005 21:56:53 +0200
Received: by robur.slu.se (Postfix, from userid 1000) id 3B579EC3BB; Tue, 23 Aug 2005 21:56:53 +0200 (CEST)
From: Robert Olsson
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <17163.32645.202453.145416@robur.slu.se>
Date: Tue, 23 Aug 2005 21:56:53 +0200
To: Simon Kirby
Cc: Eric Dumazet , Robert Olsson , netdev@oss.sgi.com
Subject: Re: Route cache performance
In-Reply-To: <20050823190852.GA20794@netnation.com>
References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com>
X-Mailer: VM 7.19 under Emacs 21.4.1
X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70
X-archive-position: 3544
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Robert.Olsson@data.slu.se
Precedence: bulk
X-list: netdev
Content-Length: 1203
Lines: 33

Hello!

Simon Kirby writes:

> I was under the impression, however, that the code Alexei added last time
> I brought up this problem was intended to always allow gc when the
> table is full and another entry is attempting to be created, even when
> under gc_min_interval.

Yes your GC does not work at all in your 2.6 setups...Why?

> rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
>  entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
>         |        |     tot|        |      ed|    miss| verflow| _search|
>   131062|      22|  125633|  125629|  125447|     182|     181|  837163|
>   131062|       0|   13511|   13509|     900|   12609|   12609|      10|
>   131062|       0|    8772|    8770|     600|    8170|    8170|       7|
>   131062|       0|    8709|    8706|     600|    8106|    8106|       8|
>   131062|       0|    8771|    8770|     600|    8170|    8170|       6|
>   131062|       0|    8770|    8768|     600|    8168|    8168|       6|
>   131062|       0|    8706|    8704|     600|    8104|    8104|      10|

Can you try

echo 50 > /proc/sys/net/ipv4/route/gc_min_interval_ms

Cheers.
--ro

From sim@netnation.com Tue Aug 23 17:04:20 2005
Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 17:04:24 -0700 (PDT)
Received: from peace.netnation.com (newpeace.netnation.com [204.174.223.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7O04JH9007526 for ; Tue, 23 Aug 2005 17:04:20 -0700
Received: from sim by peace.netnation.com with local (Exim 4.50) id 1E7iiM-0002BH-ES; Tue, 23 Aug 2005 17:01:58 -0700
Date: Tue, 23 Aug 2005 17:01:58 -0700
From: Simon Kirby
To: Robert Olsson
Cc: Eric Dumazet , netdev@oss.sgi.com
Subject: Re: Route cache performance
Message-ID: <20050824000158.GA8137@netnation.com>
References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com> <17163.32645.202453.145416@robur.slu.se>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <17163.32645.202453.145416@robur.slu.se>
User-Agent: Mutt/1.5.9i
X-archive-position: 3545
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sim@netnation.com
Precedence: bulk
X-list: netdev
Content-Length: 1922
Lines: 37

On Tue, Aug 23, 2005 at 09:56:53PM +0200, Robert Olsson wrote:

> Yes your GC does not work at all in your 2.6 setups...Why?

Good question. :)

> echo 50 > /proc/sys/net/ipv4/route/gc_min_interval_ms

The output looks exactly the same with gc_min_interval_ms set to 50. If
I set it to 0, it does change a little but _still_ overflows:

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
       3|       3|       1|       1|       1|       0|       0|       0|
       4|      11|       5|       0|       0|       0|       0|       0|
       5|       5|       2|       0|       0|       0|       0|       0|
   23615|       1|   24002|   15812|       0|       0|       0|   11470|
   68692|       0|   46780|   46777|       0|    4687|       0|    4492|
   86046|       0|   18763|   18754|       0|   18754|       0|     119|
   94884|       0|    9540|    9538|       0|    9538|       0|      47|
  104901|       0|   10819|   10817|       0|   10817|       0|      61|
  114919|       0|   10817|   10818|       0|   10818|       0|      68|
  127424|       0|   13512|   13505|       0|   13505|       0|      74|
  131062|       0|   15113|   15106|       0|   15106|   10368|      28|
  131062|       0|   12503|   12482|       0|   12482|   11582|       9|
  131062|       0|    8146|    8130|       0|    8130|    7530|       5|
  131062|       0|    8204|    8194|       0|    8194|    7594|       2|
  131062|       0|    8132|    8131|       0|    8131|    7531|       5|
  131062|       0|    8196|    8195|       0|    8195|    7595|       4|
  131062|       0|    8130|    8129|       0|    8129|    7529|       8|

Something is definitely broken here. Are the interrupts (or in this
case, NAPI) able to starve the gc somehow?

Simon-

From mhuth@mvista.com Tue Aug 23 17:30:20 2005
Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 17:30:23 -0700 (PDT)
Received: from zipcode.az.mvista.com (rav-az.mvista.com [65.200.49.157]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7O0UKH9010585 for ; Tue, 23 Aug 2005 17:30:20 -0700
Received: from mvista.com ([10.50.1.182]) by zipcode.az.mvista.com (8.9.3/8.9.3) with ESMTP id RAA22744; Tue, 23 Aug 2005 17:28:00 -0700
Message-ID: <430BC09A.3090401@mvista.com>
Date: Tue, 23 Aug 2005 17:34:34 -0700
From: Mark Huth
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Dale Farnsworth
CC: Netdev , Jeff Garzik , Ralf Baechle , Manish Lachwani , Brian Waite , "Steven J.
Hill" , Benjamin Herrenschmidt , James Chapman Subject: Re: mv643xx(2/20): use MII library for PHY management References: <20050328233807.GA28423@xyzzy> <20050328234225.GB29098@xyzzy> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3546 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mhuth@mvista.com Precedence: bulk X-list: netdev Content-Length: 6369 Lines: 195 It's good to use the abstractions and common code, but in this case there is a significant performance difference. The MDIO read/write on this family does a cpu spin wait for the mdio operation to complete. Last time I measured this (back when fixing up a 2.4.20 implementation) I got around 100 us for the mii_ioctl path, of which a good bit was in the spin loop waiting for the MDIO operation to complete. A quick look seems to indicate the spin loop is still in this version of the driver. Given that the NIC chip gives cheap access to the link status and a couple of other interesting bits, the change to use the mii library has a performance impact. Mark Huth Dale Farnsworth wrote: > Modify link up/down handling to use the functions from the MII > library. Note that I track link state using the MII PHY registers > rather than the mv643xx chip's link state registers because I think > it's cleaner to use the MII library code rather than writing local > driver support code. 
It is also useful to make the actual MII > registers available to the user with maskable kernel printk messages > so the MII registers are being read anyway > > Signed-off-by: James Chapman > Acked-by: Dale Farnsworth > > Index: linux-2.5-enet/drivers/net/mv643xx_eth.h > =================================================================== > --- linux-2.5-enet.orig/drivers/net/mv643xx_eth.h > +++ linux-2.5-enet/drivers/net/mv643xx_eth.h > @@ -6,6 +6,7 @@ > #include > #include > #include > +#include > > #include > > @@ -397,6 +398,9 @@ > > u32 rx_int_coal; > u32 tx_int_coal; > + > + u32 msg_enable; > + struct mii_if_info mii; > }; > > /* ethernet.h API list */ > Index: linux-2.5-enet/drivers/net/mv643xx_eth.c > =================================================================== > --- linux-2.5-enet.orig/drivers/net/mv643xx_eth.c > +++ linux-2.5-enet/drivers/net/mv643xx_eth.c > @@ -74,7 +74,6 @@ > #define PHY_WAIT_MICRO_SECONDS 10 > > /* Static function declarations */ > -static int eth_port_link_is_up(unsigned int eth_port_num); > static void eth_port_uc_addr_get(struct net_device *dev, > unsigned char *MacAddr); > static int mv643xx_eth_real_open(struct net_device *); > @@ -85,8 +84,11 @@ > #ifdef MV643XX_NAPI > static int mv643xx_poll(struct net_device *dev, int *budget); > #endif > +static int ethernet_phy_get(unsigned int eth_port_num); > static void ethernet_phy_set(unsigned int eth_port_num, int phy_addr); > static int ethernet_phy_detect(unsigned int eth_port_num); > +static int mv643xx_mdio_read(struct net_device *dev, int phy_id, int location); > +static void mv643xx_mdio_write(struct net_device *dev, int phy_id, int location, int val); > static struct ethtool_ops mv643xx_ethtool_ops; > > static char mv643xx_driver_name[] = "mv643xx_eth"; > @@ -550,16 +552,38 @@ > } > /* PHY status changed */ > if (eth_int_cause_ext & (BIT16 | BIT20)) { > - if (eth_port_link_is_up(port_num)) { > - netif_carrier_on(dev); > + struct ethtool_cmd cmd; > + > + /* mii library 
handles link maintenance tasks */ > + > + mii_ethtool_gset(&mp->mii, &cmd); > + if (netif_msg_link(mp)) > + printk(KERN_DEBUG "%s: link phy regs: " > + "supported=%x advert=%x " > + "autoneg=%x speed=%d duplex=%d\n", > + dev->name, > + cmd.supported, cmd.advertising, > + cmd.autoneg, cmd.speed, cmd.duplex); > + > + if(mii_link_ok(&mp->mii) && !netif_carrier_ok(dev)) { > + if (netif_msg_ifup(mp)) > + printk(KERN_INFO "%s: link up, %sMbps, %s-duplex\n", > + dev->name, > + cmd.speed == SPEED_1000 ? "1000" : > + cmd.speed == SPEED_100 ? "100" : "10", > + cmd.duplex == DUPLEX_FULL ? "full" : "half"); > + > netif_wake_queue(dev); > /* Start TX queue */ > - mv_write(MV643XX_ETH_TRANSMIT_QUEUE_COMMAND_REG > - (port_num), 1); > - } else { > - netif_carrier_off(dev); > + mv_write(MV643XX_ETH_TRANSMIT_QUEUE_COMMAND_REG(port_num), 1); > + > + } else if(!mii_link_ok(&mp->mii) && netif_carrier_ok(dev)) { > netif_stop_queue(dev); > + if (netif_msg_ifdown(mp)) > + printk(KERN_INFO "%s: link down\n", dev->name); > } > + > + mii_check_link(&mp->mii); > } > > /* > @@ -1379,6 +1403,10 @@ > > mp = netdev_priv(dev); > > + /* By default, log probe, interface up/down and error events */ > + mp->msg_enable = NETIF_MSG_PROBE | NETIF_MSG_IFUP | NETIF_MSG_IFDOWN | > + NETIF_MSG_TX_ERR | NETIF_MSG_RX_ERR; > + > res = platform_get_resource(pdev, IORESOURCE_IRQ, 0); > BUG_ON(!res); > dev->irq = res->start; > @@ -1415,6 +1443,15 @@ > #endif > #endif > > + /* Hook up MII support for ethtool */ > + mp->mii.dev = dev; > + mp->mii.mdio_read = mv643xx_mdio_read; > + mp->mii.mdio_write = mv643xx_mdio_write; > + mp->mii.phy_id = ethernet_phy_get(mp->port_num); > + mp->mii.phy_id_mask = 0x3f; > + mp->mii.reg_num_mask = 0x1f; > + mp->mii.supports_gmii = 1; > + > /* Configure the timeout task */ > INIT_WORK(&mp->tx_timeout_task, > (void (*)(void *))mv643xx_eth_tx_timeout_task, dev); > @@ -2323,21 +2360,6 @@ > return phy_reg_data0 & 0x1000; > } > > -static int eth_port_link_is_up(unsigned int eth_port_num) 
> -{ > - unsigned int phy_reg_data1; > - > - eth_port_read_smi_reg(eth_port_num, 1, &phy_reg_data1); > - > - if (eth_port_autoneg_supported(eth_port_num)) { > - if (phy_reg_data1 & 0x20) /* auto-neg complete */ > - return 1; > - } else if (phy_reg_data1 & 0x4) /* link up */ > - return 1; > - > - return 0; > -} > - > /* > * ethernet_get_config_reg - Get the port configuration register > * > @@ -2468,6 +2490,24 @@ > } > > /* > + * Wrappers for MII support library. > + */ > +static int mv643xx_mdio_read(struct net_device *dev, int phy_id, int location) > +{ > + int val; > + struct mv643xx_private *mp = netdev_priv(dev); > + > + eth_port_read_smi_reg(mp->port_num, location, &val); > + return val; > +} > + > +static void mv643xx_mdio_write(struct net_device *dev, int phy_id, int location, int val) > +{ > + struct mv643xx_private *mp = netdev_priv(dev); > + eth_port_write_smi_reg(mp->port_num, location, val); > +} > + > +/* > * eth_port_send - Send an Ethernet packet > * > * DESCRIPTION: > > From benh@kernel.crashing.org Tue Aug 23 17:37:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 17:37:11 -0700 (PDT) Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7O0b7H9011594 for ; Tue, 23 Aug 2005 17:37:07 -0700 Received: from gaston (localhost [127.0.0.1]) by gate.crashing.org (8.12.8/8.12.8) with ESMTP id j7O0SkIq010631; Tue, 23 Aug 2005 19:28:47 -0500 Subject: Re: mv643xx(2/20): use MII library for PHY management From: Benjamin Herrenschmidt To: Mark Huth Cc: Dale Farnsworth , Netdev , Jeff Garzik , Ralf Baechle , Manish Lachwani , Brian Waite , "Steven J. 
Hill" , James Chapman In-Reply-To: <430BC09A.3090401@mvista.com> References: <20050328233807.GA28423@xyzzy> <20050328234225.GB29098@xyzzy> <430BC09A.3090401@mvista.com> Content-Type: text/plain Date: Wed, 24 Aug 2005 10:33:26 +1000 Message-Id: <1124843606.5158.124.camel@gaston> Mime-Version: 1.0 X-Mailer: Evolution 2.2.3 Content-Transfer-Encoding: 7bit X-archive-position: 3547 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: benh@kernel.crashing.org Precedence: bulk X-list: netdev Content-Length: 912 Lines: 20 On Tue, 2005-08-23 at 17:34 -0700, Mark Huth wrote: > It's good to use the abstractions and common code, but in this case > there is a significant performance difference. The MDIO read/write on > this family does a cpu spin wait for the mdio operation to complete. > Last time I measured this (back when fixing up a 2.4.20 implementation) > I got around 100 us for the mii_ioctl path, of which a good bit was in > the spin loop waiting for the MDIO operation to complete. A quick look > seems to indicate the spin loop is still in this version of the driver. > > Given that the NIC chip gives cheap access to the link status and a > couple of other interesting bits, the change to use the mii library has > a performance impact. Is it possible to implement the mdio functions without a spin loop ? Also, it might be a good idea to use the PHY driver model (a-la sungem) rather than miilib... Ben. 
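As a rough illustration of the question above, the sketch below bounds an MDIO-style busy-wait and calls a "relax" hook between polls instead of spinning flat out. It runs in userspace against a fake device. Every name in it (`smi_wait_ready`, `SMI_BUSY`, the callbacks, `struct fake_smi`) is an assumption made for the example and is not taken from the mv643xx driver; in a real driver the relax step could only sleep when called from process context.

```c
#include <stdint.h>

/* Assumed busy flag; the real SMI register layout is not shown in this thread. */
#define SMI_BUSY (1u << 28)

typedef uint32_t (*smi_read_fn)(void *ctx);
typedef void (*smi_relax_fn)(void *ctx);

/*
 * Poll the SMI register until the busy flag clears, calling relax()
 * between polls so the CPU is not monopolised. Gives up after
 * max_polls reads. Returns 0 when the interface is idle, -1 on timeout.
 */
static int smi_wait_ready(smi_read_fn read_reg, smi_relax_fn relax,
                          void *ctx, unsigned int max_polls)
{
    unsigned int i;

    for (i = 0; i < max_polls; i++) {
        if (!(read_reg(ctx) & SMI_BUSY))
            return 0;
        if (relax)
            relax(ctx);
    }
    return -1;
}

/* Fake device for demonstration: reports busy for a fixed number of reads. */
struct fake_smi {
    unsigned int busy_reads_left;
    unsigned int relax_calls;
};

static uint32_t fake_smi_read(void *ctx)
{
    struct fake_smi *f = ctx;

    if (f->busy_reads_left > 0) {
        f->busy_reads_left--;
        return SMI_BUSY;
    }
    return 0;
}

static void fake_smi_relax(void *ctx)
{
    struct fake_smi *f = ctx;

    f->relax_calls++;
}
```

This does not remove the polling, it only caps it and gives the caller a place to yield; whether that is acceptable depends on which contexts the MDIO accessors are called from.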
From Robert.Olsson@data.slu.se Tue Aug 23 20:52:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 23 Aug 2005 20:52:44 -0700 (PDT) Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7O3qfH9000589 for ; Tue, 23 Aug 2005 20:52:41 -0700 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j7O3oCv8008243; Wed, 24 Aug 2005 05:50:12 +0200 Received: by robur.slu.se (Postfix, from userid 1000) id 02683EC3BB; Wed, 24 Aug 2005 05:50:11 +0200 (CEST) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17163.61043.912047.611487@robur.slu.se> Date: Wed, 24 Aug 2005 05:50:11 +0200 To: Simon Kirby Cc: Robert Olsson , Eric Dumazet , netdev@oss.sgi.com Subject: Re: Route cache performance In-Reply-To: <20050824000158.GA8137@netnation.com> References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com> <17163.32645.202453.145416@robur.slu.se> <20050824000158.GA8137@netnation.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70 X-archive-position: 3548 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 447 Lines: 20 Simon Kirby writes: > Something is definitely broken here. Are the interrupts (or in this > case, NAPI) able to starve the gc somehow? Hmm, no. In 2.6, dst entries are freed via an RCU callback; this had problems but was redesigned. Reading your old email... Didn't you get "dst cache overflow" before 2.6.11-bk2? Otherwise I'd like to have your detailed setup, to see if I get any ideas or can possibly reproduce it. Cheers. 
--ro From sim@netnation.com Wed Aug 24 09:08:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 24 Aug 2005 09:08:35 -0700 (PDT) Received: from peace.netnation.com (newpeace.netnation.com [204.174.223.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7OG8SiL020493 for ; Wed, 24 Aug 2005 09:08:28 -0700 Received: from sim by peace.netnation.com with local (Exim 4.50) id 1E7xlO-0002L7-O2 for netdev@oss.sgi.com; Wed, 24 Aug 2005 09:06:06 -0700 Date: Wed, 24 Aug 2005 09:06:06 -0700 From: Simon Kirby To: netdev@oss.sgi.com Subject: Re: Route cache performance Message-ID: <20050824160606.GC7078@netnation.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 3549 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sim@netnation.com Precedence: bulk X-list: netdev Content-Length: 1236 Lines: 31 On Wed, Aug 24, 2005 at 04:59:32PM +0200, Robert Olsson wrote: > You could probably also try to set hash table very large. > via the boot option rt_hash_entries. I have, and it just grows until it uses up all memory and kills my SSH session. > and gc_thresh to 1/4 of that as an experiment. The threshold appears to make no difference except for where it settles once I stop the DoS traffic. > Also if you find any 2.6 version that work a la 2.4 it's > a good start. It's weird because 2.6.11 is a lot better in that the GC appears to work for some time, but eventually something happens and it also hits max_size and overflows continually. I think I'm going to have to find a version that works consistently as opposed to being "a little better". I was just testing it again and noticed that on 2.6.11 it seems to be almost stable at 71,000 entries (max_size = 131072) but as soon as I type "dmesg" in another SSH window it will hit 131072. It's as if it's at equilibrium with the packet creation. 
It may just be as simple as something that has always been buggy but doesn't show up in 2.4 because the e1000 driver is more efficient there (and/or some other piece of networking, which appears to be more likely). Simon- From ralf@linux-mips.org Wed Aug 24 10:19:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 24 Aug 2005 10:19:39 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7OHJaiL027567 for ; Wed, 24 Aug 2005 10:19:37 -0700 Received: from extgw-uk.mips.com ([IPv6:::ffff:62.254.210.129]:28434 "EHLO bacchus.net.dhis.org") by linux-mips.org with ESMTP id ; Wed, 24 Aug 2005 18:11:38 +0100 Received: from dea.linux-mips.net (localhost.localdomain [127.0.0.1]) by bacchus.net.dhis.org (8.13.4/8.13.1) with ESMTP id j7OHGZAX008395; Wed, 24 Aug 2005 18:16:35 +0100 Received: (from ralf@localhost) by dea.linux-mips.net (8.13.4/8.13.4/Submit) id j7OHGZaK008394; Wed, 24 Aug 2005 18:16:35 +0100 Date: Wed, 24 Aug 2005 18:16:35 +0100 From: Ralf Baechle DL5RB To: "David S. Miller" Cc: netdev@linux-mips.org, linux-hams@vger.kernel.org Subject: [PATCH 2/3] Cleanup direct calls into IP stack Message-ID: <20050824171635.GA8367@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 3551 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev Content-Length: 1916 Lines: 59 All these are claiming to include to get ip_rcv() but in fact don't need the header at all, so away with the inclusion. 
Signed-off-by: Ralf Baechle DL5RB net/ax25/ax25_ds_in.c | 1 - net/ax25/ax25_std_in.c | 1 - net/netrom/nr_in.c | 1 - net/rose/rose_in.c | 1 - 4 files changed, 4 deletions(-) Index: linux-cvs/net/ax25/ax25_ds_in.c =================================================================== --- linux-cvs.orig/net/ax25/ax25_ds_in.c +++ linux-cvs/net/ax25/ax25_ds_in.c @@ -22,7 +22,6 @@ #include #include #include -#include /* For ip_rcv */ #include #include #include Index: linux-cvs/net/ax25/ax25_std_in.c =================================================================== --- linux-cvs.orig/net/ax25/ax25_std_in.c +++ linux-cvs/net/ax25/ax25_std_in.c @@ -29,7 +29,6 @@ #include #include #include -#include /* For ip_rcv */ #include #include #include Index: linux-cvs/net/netrom/nr_in.c =================================================================== --- linux-cvs.orig/net/netrom/nr_in.c +++ linux-cvs/net/netrom/nr_in.c @@ -23,7 +23,6 @@ #include #include #include -#include /* For ip_rcv */ #include #include #include Index: linux-cvs/net/rose/rose_in.c =================================================================== --- linux-cvs.orig/net/rose/rose_in.c +++ linux-cvs/net/rose/rose_in.c @@ -26,7 +26,6 @@ #include #include #include -#include /* For ip_rcv */ #include #include #include From ralf@linux-mips.org Wed Aug 24 10:18:31 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 24 Aug 2005 10:18:47 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7OHISiL027315 for ; Wed, 24 Aug 2005 10:18:30 -0700 Received: from extgw-uk.mips.com ([IPv6:::ffff:62.254.210.129]:2308 "EHLO bacchus.net.dhis.org") by linux-mips.org with ESMTP id ; Wed, 24 Aug 2005 18:10:31 +0100 Received: from dea.linux-mips.net (localhost.localdomain [127.0.0.1]) by bacchus.net.dhis.org (8.13.4/8.13.1) with ESMTP id j7OHFQ45008364; Wed, 24 Aug 2005 18:15:26 +0100 Received: (from ralf@localhost) by 
dea.linux-mips.net (8.13.4/8.13.4/Submit) id j7OHFNJB008363; Wed, 24 Aug 2005 18:15:23 +0100 Date: Wed, 24 Aug 2005 18:15:23 +0100 From: Ralf Baechle To: "David S. Miller" Cc: netdev@linux-mips.org, linux-hams@vger.kernel.org Subject: [PATCH 1/2] Cleanup direct calls into IP stack Message-ID: <20050824171523.GA8260@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 3550 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev Content-Length: 4228 Lines: 134 Get rid of the calls to ip_rcv and arp_rcv which were layering violations anyway. With those being replaced by netif_rx, fewer parts of AX.25 and relatives depend on INET support actually being enabled. This also will make PF_PACKET sockets work for IP and ARP packets received over AX.25 and for IP packets over NET/ROM. Signed-off-by: Ralf Baechle DL5RB net/ax25/ax25_in.c | 13 +++---------- net/netrom/af_netrom.c | 5 ++--- net/netrom/nr_dev.c | 5 ++--- 3 files changed, 7 insertions(+), 16 deletions(-) Index: linux-cvs/net/ax25/ax25_in.c =================================================================== --- linux-cvs.orig/net/ax25/ax25_in.c +++ linux-cvs/net/ax25/ax25_in.c @@ -9,7 +9,6 @@ * Copyright (C) Joerg Reuter DL1BKE (jreuter@yaina.de) * Copyright (C) Hans-Joachim Hetscher DD8NE (dd8ne@bnv-bamberg.de) */ -#include #include #include #include @@ -26,9 +25,7 @@ #include #include #include -#include /* For ip_rcv */ #include -#include /* For arp_rcv */ #include #include #include @@ -114,7 +111,6 @@ int ax25_rx_iframe(ax25_cb *ax25, struct pid = *skb->data; -#ifdef CONFIG_INET if (pid == AX25_P_IP) { /* working around a TCP bug to keep additional listeners * happy. 
TCP re-uses the buffer and destroys the original @@ -132,10 +128,9 @@ int ax25_rx_iframe(ax25_cb *ax25, struct skb->dev = ax25->ax25_dev->dev; skb->pkt_type = PACKET_HOST; skb->protocol = htons(ETH_P_IP); - ip_rcv(skb, skb->dev, NULL); /* Wrong ptype */ + netif_rx(skb); return 1; } -#endif if (pid == AX25_P_SEGMENT) { skb_pull(skb, 1); /* Remove PID */ return ax25_rx_fragment(ax25, skb); @@ -250,7 +245,6 @@ static int ax25_rcv(struct sk_buff *skb, /* Now we are pointing at the pid byte */ switch (skb->data[1]) { -#ifdef CONFIG_INET case AX25_P_IP: skb_pull(skb,2); /* drop PID/CTRL */ skb->h.raw = skb->data; @@ -258,7 +252,7 @@ static int ax25_rcv(struct sk_buff *skb, skb->dev = dev; skb->pkt_type = PACKET_HOST; skb->protocol = htons(ETH_P_IP); - ip_rcv(skb, dev, ptype); /* Note ptype here is the wrong one, fix me later */ + netif_rx(skb); break; case AX25_P_ARP: @@ -268,9 +262,8 @@ static int ax25_rcv(struct sk_buff *skb, skb->dev = dev; skb->pkt_type = PACKET_HOST; skb->protocol = htons(ETH_P_ARP); - arp_rcv(skb, dev, ptype); /* Note ptype here is wrong... */ + netif_rx(skb); break; -#endif case AX25_P_TEXT: /* Now find a suitable dgram socket */ sk = ax25_get_socket(&dest, &src, SOCK_DGRAM); Index: linux-cvs/net/netrom/af_netrom.c =================================================================== --- linux-cvs.orig/net/netrom/af_netrom.c +++ linux-cvs/net/netrom/af_netrom.c @@ -858,17 +858,16 @@ int nr_rx_frame(struct sk_buff *skb, str frametype = skb->data[19] & 0x0F; flags = skb->data[19] & 0xF0; -#ifdef CONFIG_INET /* * Check for an incoming IP over NET/ROM frame. 
*/ - if (frametype == NR_PROTOEXT && circuit_index == NR_PROTO_IP && circuit_id == NR_PROTO_IP) { + if (frametype == NR_PROTOEXT && + circuit_index == NR_PROTO_IP && circuit_id == NR_PROTO_IP) { skb_pull(skb, NR_NETWORK_LEN + NR_TRANSPORT_LEN); skb->h.raw = skb->data; return nr_rx_ip(skb, dev); } -#endif /* * Find an existing socket connection, based on circuit ID, if it's Index: linux-cvs/net/netrom/nr_dev.c =================================================================== --- linux-cvs.orig/net/netrom/nr_dev.c +++ linux-cvs/net/netrom/nr_dev.c @@ -38,8 +38,6 @@ #include #include -#ifdef CONFIG_INET - /* * Only allow IP over NET/ROM frames through if the netrom device is up. */ @@ -64,11 +62,12 @@ int nr_rx_ip(struct sk_buff *skb, struct skb->nh.raw = skb->data; skb->pkt_type = PACKET_HOST; - ip_rcv(skb, skb->dev, NULL); + netif_rx(skb); return 1; } +#ifdef CONFIG_INET static int nr_rebuild_header(struct sk_buff *skb) { From ralf@linux-mips.org Wed Aug 24 10:20:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 24 Aug 2005 10:20:19 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7OHJaiM027567 for ; Wed, 24 Aug 2005 10:20:12 -0700 Received: from extgw-uk.mips.com ([IPv6:::ffff:62.254.210.129]:43277 "EHLO bacchus.net.dhis.org") by linux-mips.org with ESMTP id ; Wed, 24 Aug 2005 18:12:17 +0100 Received: from dea.linux-mips.net (localhost.localdomain [127.0.0.1]) by bacchus.net.dhis.org (8.13.4/8.13.1) with ESMTP id j7OHHCmW008434; Wed, 24 Aug 2005 18:17:12 +0100 Received: (from ralf@localhost) by dea.linux-mips.net (8.13.4/8.13.4/Submit) id j7OHHCYa008433; Wed, 24 Aug 2005 18:17:12 +0100 Date: Wed, 24 Aug 2005 18:17:12 +0100 From: Ralf Baechle DL5RB To: "David S. 
Miller" Cc: netdev@linux-mips.org, linux-hams@vger.kernel.org Subject: [PATCH 3/3] Cleanup direct calls into IP stack Message-ID: <20050824171712.GB8367@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 3552 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev Content-Length: 525 Lines: 18 With ip_rcv no longer used anywhere outside the IP stack, its EXPORT_SYMBOL is not needed any longer either. Signed-off-by: Ralf Baechle DL5RB net/ipv4/ip_input.c | 1 - 1 files changed, 1 deletion(-) Index: linux-cvs/net/ipv4/ip_input.c =================================================================== --- linux-cvs.orig/net/ipv4/ip_input.c +++ linux-cvs/net/ipv4/ip_input.c @@ -428,5 +428,4 @@ out: return NET_RX_DROP; } -EXPORT_SYMBOL(ip_rcv); EXPORT_SYMBOL(ip_statistics); From davem@davemloft.net Wed Aug 24 11:41:31 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 24 Aug 2005 11:41:37 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7OIfViL003222 for ; Wed, 24 Aug 2005 11:41:31 -0700 Received: from dsl027-180-168.sfo1.dsl.speakeasy.net ([IPv6:::ffff:216.27.180.168]:56248 "EHLO sunset.davemloft.net") by linux-mips.org with ESMTP id ; Wed, 24 Aug 2005 19:33:35 +0100 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.52) id 1E809U-0000t4-Ig; Wed, 24 Aug 2005 11:39:08 -0700 Date: Wed, 24 Aug 2005 11:39:08 -0700 (PDT) Message-Id: <20050824.113908.123854644.davem@davemloft.net> To: ralf@linux-mips.org Cc: netdev@linux-mips.org, linux-hams@vger.kernel.org Subject: Re: [PATCH 3/3] Cleanup direct calls into IP stack From: "David S. 
Miller" In-Reply-To: <20050824171712.GB8367@linux-mips.org> References: <20050824171712.GB8367@linux-mips.org> X-Mailer: Mew version 4.2 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3553 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 56 Lines: 2 All 3 patches queued up for 2.6.14, thanks a lot Ralf. From pavel@ucw.cz Thu Aug 25 04:07:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 04:07:23 -0700 (PDT) Received: from amd.ucw.cz (gprs189-60.eurotel.cz [160.218.189.60]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PB78iL015276 for ; Thu, 25 Aug 2005 04:07:14 -0700 Received: by amd.ucw.cz (Postfix, from userid 8) id 464F08B3E8; Thu, 25 Aug 2005 13:04:38 +0200 (CEST) Date: Thu, 25 Aug 2005 13:04:38 +0200 From: Pavel Machek To: Jeff Garzik , Netdev list , kernel list , "James P. Ketrenos" , jbenc@suse.cz, jbo@suse.cz Subject: [patch] ipw2200: remove support for obsolete kernels Message-ID: <20050825110438.GA16944@elf.ucw.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Warning: Reading this can be dangerous to your mental health. User-Agent: Mutt/1.5.9i X-archive-position: 3554 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pavel@suse.cz Precedence: bulk X-list: netdev Content-Length: 1728 Lines: 64 This removes support for old (and non-mainline) kernels from ipw2200. 
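For readers unfamiliar with the version guards this patch deletes, here is a small sketch of how a `KERNEL_VERSION()` comparison works. The macro is redefined locally so the example builds in userspace; to the best of my knowledge it uses the same encoding as the 2.6-era `<linux/version.h>`. `EXAMPLE_VERSION_CODE` and `needs_compat()` are made-up names standing in for `LINUX_VERSION_CODE` and the `#if` test.

```c
/* Local copy of the version encoding: major, minor and patch packed into one int. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical stand-in for LINUX_VERSION_CODE: pretend we build against 2.6.13. */
#define EXAMPLE_VERSION_CODE KERNEL_VERSION(2, 6, 13)

/*
 * Returns 1 when the tree being built is older than the release that
 * introduced an interface, i.e. when a compatibility shim would be needed.
 */
static int needs_compat(int built_against, int introduced_in)
{
    return built_against < introduced_in;
}
```

Inside a mainline tree the left-hand side is always the current version, so guards such as `LINUX_VERSION_CODE < KERNEL_VERSION(2,6,10)` can never be true there, which is why they can be dropped.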
Please apply, Signed-off-by: Pavel Machek --- clean-mm/drivers/net/wireless/ipw2200.c 2005-08-24 20:25:09.000000000 +0200 +++ linux-mm/drivers/net/wireless/ipw2200.c 2005-08-25 12:50:19.000000000 +0200 @@ -6617,11 +6617,7 @@ { int ret = 0; -#ifdef CONFIG_SOFTWARE_SUSPEND2 - priv->workqueue = create_workqueue(DRV_NAME, 0); -#else priv->workqueue = create_workqueue(DRV_NAME); -#endif init_waitqueue_head(&priv->wait_command_queue); INIT_WORK(&priv->adhoc_check, ipw_adhoc_check, priv); @@ -7242,11 +7238,7 @@ /* Remove the PRESENT state of the device */ netif_device_detach(dev); -#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,10) - pci_save_state(pdev, priv->pm_state); -#else pci_save_state(pdev); -#endif pci_disable_device(pdev); pci_set_power_state(pdev, pci_choose_state(pdev, state)); --- clean-mm/drivers/net/wireless/ipw2200.h 2005-08-24 20:25:09.000000000 +0200 +++ linux-mm/drivers/net/wireless/ipw2200.h 2005-08-25 12:42:30.000000000 +0200 @@ -55,26 +55,6 @@ #include -#ifndef IRQ_NONE -typedef void irqreturn_t; -#define IRQ_NONE -#define IRQ_HANDLED -#define IRQ_RETVAL(x) -#endif - -#if ( LINUX_VERSION_CODE < KERNEL_VERSION(2,6,9) ) -#define __iomem -#endif - -#if ( LINUX_VERSION_CODE < KERNEL_VERSION(2,6,5) ) -#define pci_dma_sync_single_for_cpu pci_dma_sync_single -#define pci_dma_sync_single_for_device pci_dma_sync_single -#endif - -#ifndef HAVE_FREE_NETDEV -#define free_netdev(x) kfree(x) -#endif - /* Authentication and Association States */ enum connection_manager_assoc_states { -- if you have sharp zaurus hardware you don't need... 
you know my address From pavel@ucw.cz Thu Aug 25 04:08:29 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 04:08:35 -0700 (PDT) Received: from amd.ucw.cz (gprs189-60.eurotel.cz [160.218.189.60]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PB8NiL015425 for ; Thu, 25 Aug 2005 04:08:25 -0700 Received: by amd.ucw.cz (Postfix, from userid 8) id 28AAB8B3E8; Thu, 25 Aug 2005 13:05:57 +0200 (CEST) Date: Thu, 25 Aug 2005 13:05:57 +0200 From: Pavel Machek To: Jeff Garzik , Netdev list , kernel list , "James P. Ketrenos" , jbenc@suse.cz, jbo@suse.cz Subject: [patch] ipw2200: remove trap and unused stuff Message-ID: <20050825110557.GA16960@elf.ucw.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Warning: Reading this can be dangerous to your mental health. User-Agent: Mutt/1.5.9i X-archive-position: 3555 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pavel@suse.cz Precedence: bulk X-list: netdev Content-Length: 1464 Lines: 52 This removes one trap for a programmer, a few unused macros, and one unused struct. 
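The "trap" is presumably the stray semicolon at the end of the no-op macro definition. The sketch below shows why that matters; the macro and function names (`debug_config_fixed`, `debug_config_trap`, `pick_path`) are invented for the illustration and only modelled on the patch.

```c
/* Hypothetical no-op debug hook in the corrected form: no trailing semicolon. */
#define debug_config_fixed(x) do {} while (0)

/*
 * With the old form,
 *     #define debug_config_trap(x) do {} while (0);
 * the call "debug_config_trap(p);" expands to "do {} while (0);;".
 * The extra empty statement ends the if-body early, so a following
 * "else" has no "if" left to attach to and the build fails.
 */
static int pick_path(int verbose)
{
    if (verbose)
        debug_config_fixed(verbose);   /* expands to exactly one statement */
    else
        return 0;                      /* quiet path */
    return 1;                          /* verbose path */
}
```

Swapping `debug_config_fixed` for the trailing-semicolon variant in `pick_path()` should turn this into a compile error on the `else`, which is the kind of surprise the patch removes.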
Please apply, Signed-off-by: Pavel Machek --- clean-mm/drivers/net/wireless/ipw2200.c 2005-08-24 20:25:09.000000000 +0200 +++ linux-mm/drivers/net/wireless/ipw2200.c 2005-08-25 12:50:19.000000000 +0200 @@ -4485,7 +4485,7 @@ IPW_DEBUG_INFO("RATE MASK: 0x%08X\n", priv->rates_mask); } #else -#define ipw_debug_config(x) do {} while (0); +#define ipw_debug_config(x) do {} while (0) #endif static inline void ipw_set_fixed_rate(struct ipw_priv *priv, --- clean-mm/drivers/net/wireless/ipw2200.h 2005-08-24 20:25:09.000000000 +0200 +++ linux-mm/drivers/net/wireless/ipw2200.h 2005-08-25 12:42:30.000000000 +0200 @@ -95,8 +75,6 @@ }; -#define IPW_NORMAL 0 -#define IPW_NOWAIT 0 #define IPW_WAIT (1<<0) #define IPW_QUIET (1<<1) #define IPW_ROAMING (1<<2) @@ -202,7 +180,7 @@ /* even if MAC WEP set (allows pre-encrypt) */ #define DCT_FLAG_NO_WEP 0x20 -#define IPW_ + /* overwrite TSF field */ #define DCT_FLAG_TSF_REQD 0x40 @@ -535,12 +513,6 @@ u16 status; } __attribute__ ((packed)); -struct temperature -{ - s32 measured; - s32 active; -} __attribute__ ((packed)); - struct notif_calibration { u8 data[104]; } __attribute__ ((packed)); -- if you have sharp zaurus hardware you don't need... 
you know my address From alessandro.suardi@gmail.com Thu Aug 25 06:41:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 06:41:30 -0700 (PDT) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.203]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PDfPiL005732 for ; Thu, 25 Aug 2005 06:41:25 -0700 Received: by rproxy.gmail.com with SMTP id f1so359136rne for ; Thu, 25 Aug 2005 06:39:02 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=inuR47igPGB+jIbCppXVAdX6rphl3KkMP7a4IYUNKIKF5jRpzC26HKmGzpn2mcQ/PNWdqHrxRL8t5tpKqF4T5QO8QtKjpmc/qxWbZN6zL6Tn++Cr1fhbZsyXrcrDS+NwRFR3/U/GKtFeZYmiDImHMYJcJf5JeWjCSLSkE+Kh3k4= Received: by 10.38.12.52 with SMTP id 52mr121442rnl; Thu, 25 Aug 2005 06:39:02 -0700 (PDT) Received: by 10.38.13.14 with HTTP; Thu, 25 Aug 2005 06:39:02 -0700 (PDT) Message-ID: <5a4c581d05082506395fa984ae@mail.gmail.com> Date: Thu, 25 Aug 2005 15:39:02 +0200 From: Alessandro Suardi To: netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org Subject: oops in 2.6.13-rc6-git12 in tcp/netfilter routines Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7PDfPiL005732 X-archive-position: 3556 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alessandro.suardi@gmail.com Precedence: bulk X-list: netdev Content-Length: 2089 Lines: 56 Howdy, and excuse me for crossposting - feel free to zap CC to unrelated, if any, mailing lists. just gave PeerGuardian a spin on my eDonkey home box and said box didn't last half a day before oopsing in netlink/nf/tcp related routines (or so it seems to my untrained eye). 
K7800, 256MB RAM, uptodate FC3 running 2.6.13-rc6-git12, doing nothing but running MetaMachine's eDonkey 1.4.3 QT gui. PeerGuardian is the 1.5 beta version available from methlabs.org. Stack is hand-copied from the dead box's console. [] die+0xe4/0x170 [] do_trap+0x7f/0xc0 [] do_invalid_op+0xa3/0xb0 [] error_code+0x4f/0x54 [] kfree_skbmem+0xb/0x20 [] __kfree_skb+0x5f/0xf0 [] tcp_clean_rtx_queue+0x16a/0x470 [] tcp_ack+0xf6/0x360 [] tcp_rcv_established+0x277/0x7a0 [] tcp_v4_do_rcv+0xf0/0x110 [] tcp_v4_rcv+0x6e0/0x820 [] ip_local_deliver_finish+0x84/0x160 [] nf_reinject+0x13a/0x1c0 [] ipq_issue_verdict+0x28/0x40 [] ipq_set_verdict+0x48/0x70 [] ipq_receive_peer+0x39/0x50 [] ipq_receive_sk+0x172/0x190 [] netlink_data_ready+0x35/0x60 [] netlink_sendskb+0x24/0x60 [] netlink_unicast+0x127/0x160 [] netlink_sendmsg+0x204/0x2b0 [] sock_sendmsg+0xb0/0xe0 [] sys_sendmsg+0x134/0x240 [] sys_socketcall+0x224/0x230 [] sysenter_past_esp+0x54/0x75 Code: 8b 41 0c 85 c0 75 1b 8b 86 94 00 00 00 e8 9e 37 e5 ff 5b 5e c9 c3 89 d0 e8 43 46 e5 ff 8d 76 00 eb d2 89 f0 e8 f7 fe ff ff eb dc <0f> 0b 54 01 16 d2 36 c0 eb b4 8d 74 26 00 8d bc 27 00 00 00 00 <0>Kernel panic - not syncing: Fatal exception in interrupt If there's need for further info I'd be happy to provide it. For now the box is rebooted into the same kernel and running the same PG/eD2k programs, if the issue reproduces I'll follow up on my own message. 
Thanks in advance, ciao, --alessandro "Not every smile means I'm laughing inside" (Wallflowers - "From The Bottom Of My Heart") From laforge@netfilter.org Thu Aug 25 09:58:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 09:58:27 -0700 (PDT) Received: from ganesha.gnumonks.org ([213.95.27.120]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PGwGiL027205 for ; Thu, 25 Aug 2005 09:58:20 -0700 Received: from uucp by ganesha.gnumonks.org with local-bsmtp (Exim 4.50) id 1E8L0x-0005Lj-EO for netdev@oss.sgi.com; Thu, 25 Aug 2005 18:55:43 +0200 Received: from laforge by rama.gnumonks.org with local (Exim 3.36 #1) id 1E8L15-0006E9-00; Thu, 25 Aug 2005 18:55:51 +0200 Date: Thu, 25 Aug 2005 18:55:50 +0200 From: Harald Welte To: Alessandro Suardi Cc: netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org Subject: Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines Message-ID: <20050825165550.GC4442@rama.de.gnumonks.org> Mail-Followup-To: Harald Welte , Alessandro Suardi , netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org References: <5a4c581d05082506395fa984ae@mail.gmail.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="KLINyTCByxgMLuN/" Content-Disposition: inline In-Reply-To: <5a4c581d05082506395fa984ae@mail.gmail.com> User-Agent: mutt-ng devel-20050619 (Debian) X-archive-position: 3557 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev Content-Length: 3786 Lines: 101 --KLINyTCByxgMLuN/ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Aug 25, 2005 at 03:39:02PM +0200, Alessandro Suardi wrote: > Howdy, and excuse me for crossposting - feel free to zap CC to > unrelated, if any, mailing lists. 
> > just gave PeerGuardian a spin on my eDonkey home box and > said box didn't last half a day before oopsing in netlink/nf/tcp > related routines (or so it seems to my untrained eye). Yes, it indeed could be that there is some fishy interaction between the tcp stack and ip_queue causing the oops. > K7800, 256MB RAM, uptodate FC3 running 2.6.13-rc6-git12, > doing nothing but running MetaMachine's eDonkey 1.4.3 QT gui. > PeerGuardian is the 1.5 beta version available from methlabs.org. Is it true that PeerGuardian is a proprietary application? I'm not going to debug this problem using a proprietary ip_queue program, sorry. If you can produce a testcase with open source userspace ip_queue code, I could look into reproducing the problem locally and debugging the problem more thoroughly. While it definitely is a kernel bug (whatever userspace sends should not crash the kernel), it might be something that specifically [only] PeerGuardian does to the packet. Something that ip_queue doesn't check (but should check) on packet reinjection and therefore upsets the TCP stack. Also helpful would be the output of an "strace -f -x -s65535 -e trace=sendmsg" on the PeerGuardian (daemon?) process. > [] die+0xe4/0x170 > [] do_trap+0x7f/0xc0 > [] do_invalid_op+0xa3/0xb0 > [] error_code+0x4f/0x54 > [] kfree_skbmem+0xb/0x20 > [] __kfree_skb+0x5f/0xf0 ok, so something down the chain from kfree_skb() results in an invalid operation? looks more like some compiler problem, bad memory or memory corruption to me. Try to reproduce the problem without PG. 
> [] tcp_clean_rtx_queue+0x16a/0x470 > [] tcp_ack+0xf6/0x360 > [] tcp_rcv_established+0x277/0x7a0 > [] tcp_v4_do_rcv+0xf0/0x110 > [] tcp_v4_rcv+0x6e0/0x820 > [] ip_local_deliver_finish+0x84/0x160 so something in the tcp stack ends up doing tcp_clean_rtx_queue() > [] nf_reinject+0x13a/0x1c0 > [] ipq_issue_verdict+0x28/0x40 > [] ipq_set_verdict+0x48/0x70 ip_queue reinjects a packet via nf_reinject() > [] ipq_receive_peer+0x39/0x50 > [] ipq_receive_sk+0x172/0x190 ip_queue receives an ipq verdict msg packet from netlink > [] netlink_data_ready+0x35/0x60 > [] netlink_sendskb+0x24/0x60 > [] netlink_unicast+0x127/0x160 > [] netlink_sendmsg+0x204/0x2b0 > [] sock_sendmsg+0xb0/0xe0 > [] sys_sendmsg+0x134/0x240 > [] sys_socketcall+0x224/0x230 > [] sysenter_past_esp+0x54/0x75 process sendmsg()s on the netlink socket. -- - Harald Welte http://netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." 
-- Paul Vixie --KLINyTCByxgMLuN/ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFDDfgWXaXGVTD0i/8RAlKWAKCLPpvWL8TMiA/7tYlD1ETKeUQZtACgnfKI 2/nlXN2NSODp8oF33ZBm7pw= =cPuj -----END PGP SIGNATURE----- --KLINyTCByxgMLuN/-- From alessandro.suardi@gmail.com Thu Aug 25 10:29:05 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 10:29:12 -0700 (PDT) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PHT4iL030445 for ; Thu, 25 Aug 2005 10:29:05 -0700 Received: by rproxy.gmail.com with SMTP id f1so404441rne for ; Thu, 25 Aug 2005 10:26:41 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=FJMSvDUPjXWueY2sBkmNianSA5Nc/NQa3st0hUu4kvI5HoD1aweE1ygR/GUFxlsIjACXH0rrcaBnDdH7oFysK8f5TNfPlI9+XmrHn59RvOpROMTBpmmWZgUwJ9HDIQVHurtcX1mDJBJdlyZoDzkd7IfySw7dBmktF5Uwvf6DZyE= Received: by 10.38.10.35 with SMTP id 35mr1214227rnj; Thu, 25 Aug 2005 10:26:41 -0700 (PDT) Received: by 10.38.13.14 with HTTP; Thu, 25 Aug 2005 10:26:41 -0700 (PDT) Message-ID: <5a4c581d050825102678c27b4e@mail.gmail.com> Date: Thu, 25 Aug 2005 19:26:41 +0200 From: Alessandro Suardi To: Harald Welte , Alessandro Suardi , netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org Subject: Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines In-Reply-To: <20050825165550.GC4442@rama.de.gnumonks.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <5a4c581d05082506395fa984ae@mail.gmail.com> <20050825165550.GC4442@rama.de.gnumonks.org> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7PHT4iL030445 X-archive-position: 3558 
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: alessandro.suardi@gmail.com
Precedence: bulk
X-list: netdev
Content-Length: 3980
Lines: 101

On 8/25/05, Harald Welte wrote:
> On Thu, Aug 25, 2005 at 03:39:02PM +0200, Alessandro Suardi wrote:
> > Howdy, and excuse me for crossposting - feel free to zap CC to
> > unrelated, if any, mailing lists.
> >
> > just gave PeerGuardian a spin on my eDonkey home box and
> > said box didn't last half a day before oopsing in netlink/nf/tcp
> > related routines (or so it seems to my untrained eye).
>
> Yes, it indeed could be that there is some fishy interaction between the
> tcp stack and ip_queue causing the oops.
>
> > K7800, 256MB RAM, uptodate FC3 running 2.6.13-rc6-git12,
> > doing nothing but running MetaMachine's eDonkey 1.4.3 QT gui.
> > PeerGuardian is the 1.5 beta version available from methlabs.org.
>
> Is it true that PeerGuardian is a proprietary application? I'm not
> going to debug this problem using a proprietary ip_queue program, sorry.

I'm not sure I understand the issue; I built PG from these sources:

http://prdownloads.sourceforge.net/peerguardian/pglinux-1.5beta.tar.gz?download

and I had to install the iptables-devel FC3 rpm to build. The PG sources
seem to be licensed under GPLv2. But maybe you're referring to the fact
that whatever PG does, it doesn't show up as output from 'iptables -L'?

> If you can produce a testcase with open source userspace ip_queue code,
> I could look into reproducing the problem locally and debugging the
> problem more thoroughly.

So far the box has been running for over four hours; I'll configure my
laptop as a netdump server hoping it might capture something if the
ed2k box crashes again later. I'm afraid I won't be able to set up a
real testcase (and btw, edonkey v1.4.3 from MetaMachine is actually
a proprietary program, though entirely in userspace).

> While it definitely is a kernel bug (whatever userspace sends should not
> crash the kernel), it might be something that specifically [only]
> PeerGuardian does to the packet. Something that ip_queue doesn't check
> (but should check) on packet reinjection and therefore upsets the TCP stack.
>
> Also helpful would be the output of an "strace -f -x -s65535 -e
> trace=sendmsg" on the PeerGuardian (daemon?) process.
>
>
> > [] die+0xe4/0x170
> > [] do_trap+0x7f/0xc0
> > [] do_invalid_op+0xa3/0xb0
> > [] error_code+0x4f/0x54
> > [] kfree_skbmem+0xb/0x20
> > [] __kfree_skb+0x5f/0xf0
>
> ok, so something down the chain from kfree_skb() results in an invalid
> operation? looks more like some compiler problem, bad memory or memory
> corruption to me. Try to reproduce the problem without PG.

compiler is fc3's latest - gcc-3.4.4-2.fc3. I might have a go at
memtest86 in the next weeks if more symptoms point at possible
bad RAM.

> > [] tcp_clean_rtx_queue+0x16a/0x470
> > [] tcp_ack+0xf6/0x360
> > [] tcp_rcv_established+0x277/0x7a0
> > [] tcp_v4_do_rcv+0xf0/0x110
> > [] tcp_v4_rcv+0x6e0/0x820
> > [] ip_local_deliver_finish+0x84/0x160
>
> so something in the tcp stack ends up doing tcp_clean_rtx_queue()
>
> > [] nf_reinject+0x13a/0x1c0
> > [] ipq_issue_verdict+0x28/0x40
> > [] ipq_set_verdict+0x48/0x70
>
> ip_queue reinjects a packet via nf_reinject()
>
> > [] ipq_receive_peer+0x39/0x50
> > [] ipq_receive_sk+0x172/0x190
>
> ip_queue receives an ipq verdict msg packet from netlink
>
> > [] netlink_data_ready+0x35/0x60
> > [] netlink_sendskb+0x24/0x60
> > [] netlink_unicast+0x127/0x160
> > [] netlink_sendmsg+0x204/0x2b0
> > [] sock_sendmsg+0xb0/0xe0
> > [] sys_sendmsg+0x134/0x240
> > [] sys_socketcall+0x224/0x230
> > [] sysenter_past_esp+0x54/0x75
>
> process sendmsg()s on the netlink socket.
Thanks,

--alessandro

 "Not every smile means I'm laughing inside"

 (Wallflowers - "From The Bottom Of My Heart")

From sim@netnation.com Thu Aug 25 11:13:37 2005
Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 11:13:48 -0700 (PDT)
Received: from peace.netnation.com (newpeace.netnation.com [204.174.223.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PIDbiL003468 for ; Thu, 25 Aug 2005 11:13:37 -0700
Received: from sim by peace.netnation.com with local (Exim 4.50) id 1E8MBz-00047G-Rq; Thu, 25 Aug 2005 11:11:11 -0700
Date: Thu, 25 Aug 2005 11:11:11 -0700
From: Simon Kirby
To: Robert Olsson , kuznet@ms2.inr.ac.ru
Cc: Eric Dumazet , netdev@oss.sgi.com
Subject: Re: Route cache performance
Message-ID: <20050825181111.GB14336@netnation.com>
References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com> <17163.32645.202453.145416@robur.slu.se> <20050824000158.GA8137@netnation.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050824000158.GA8137@netnation.com>
User-Agent: Mutt/1.5.9i
X-archive-position: 3560
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sim@netnation.com
Precedence: bulk
X-list: netdev
Content-Length: 4079
Lines: 72

[ Alexei / kuznet, I've included you here as I suspect you'll know what's
going on. :) ]

I've been working with 2.6.13-rc6 to try to figure out why it's breaking.
I've added more rt_cache statistics and flung more DoS traffic at it.

I've determined that when the gc_goal_miss counter is increased, it seems
to be because the refcnt is non-zero in rt_may_expire(). The static
"expire" variable reaches 0 easily in this case and the rover variables and
loop aren't overflowing or anything -- it's just that something is holding
the refcnt > 0 for almost all of the entries that rt_garbage_collect()
walks.
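
As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not the kernel code: `Entry`, `may_expire` and `garbage_collect` are hypothetical stand-ins, and the assumption that each pass simply halves the timeout until it reaches zero is a simplification. It only shows why a collector that refuses to touch referenced entries misses its goal when nearly every entry has refcnt > 0.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    refcnt: int  # references currently held on this cache entry
    age: int     # time since last use, in arbitrary ticks


def may_expire(entry, timeout):
    """Simplified stand-in for rt_may_expire(): referenced entries never expire."""
    if entry.refcnt > 0:
        return False
    return entry.age > timeout


def garbage_collect(cache, goal, expire):
    """Try to free `goal` entries, halving the timeout after each failed pass.

    Returns (freed, goal_missed). The list `cache` is modified in place.
    """
    freed = 0
    while True:
        kept = []
        for entry in cache:
            if freed < goal and may_expire(entry, expire):
                freed += 1
            else:
                kept.append(entry)
        cache[:] = kept
        if freed >= goal:
            return freed, False
        if expire == 0:
            # Even a zero timeout freed too little: the goal is missed.
            return freed, True
        expire //= 2
```

With every entry still referenced, the collector frees nothing and reports a missed goal; with idle entries of the same age it meets the goal on the first pass.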
Here are some statistics I recorded:

rt_cache entries|in_slow|gc_tota|gc_igno|gc_aggr|gc_zero|gc_expi|gc_expi|gc_goa|gc_dst|in_hlis|
                |   _tot|      l|    red| essive|_expire|  re_no| re_yes|l_miss|overfl| t_srch|
              14|      4|      0|      0|      0|      0|      0|      0|     0|     0|      0|
           24012|  24003|  15819|  15818|      0|      0|      0|      0|     0|     0|  35182|
          131062| 112232| 112229| 110515|   1714|   1703|  10309|  79489|  1711|  1711| 767998|
          131062|  14279|  14276|    900|  13376|  13376|  75352|    900| 13376| 13376|      8|
          131062|   9542|   9538|    600|   8938|   8938|  50276|    600|  8938|  8938|      5|
          131062|   9543|   9539|    600|   8939|   8939|  50278|    600|  8939|  8939|      5|
          131062|   9542|   9538|    600|   8938|   8938|  50276|    600|  8938|  8938|     10|
          131062|   9542|   9536|    600|   8936|   8936|  50272|    600|  8936|  8936|      6|
          131062|   9475|   9472|    600|   8872|   8872|  50144|    600|  8872|  8872|      5|
          131062|   9540|   9538|    600|   8938|   8938|  50276|    600|  8938|  8938|      4|

gc_aggressive: Times the "we are in dangerous area" block executes.
gc_zero_expire: Times the loop is broken because expire == 0.
gc_expire_no: Times rt_may_expire() said no.
gc_expire_yes: Times rt_may_expire() said yes.

It seems the code is all the same in 2.4 in rt_may_expire(), so something
outside must have changed. I can't even find anything in route.c that
decrements or zeros the refcnt.

Does anybody know why this is happening?

Simon-

On Tue, Aug 23, 2005 at 05:01:58PM -0700, Simon Kirby wrote:
> On Tue, Aug 23, 2005 at 09:56:53PM +0200, Robert Olsson wrote:
>
> > Yes your GC does not work at all in your 2.6 setups...Why?
>
> Good question. :)
>
> > echo 50 > /proc/sys/net/ipv4/route/gc_min_interval_ms
>
> The output looks exactly the same with gc_min_interval_ms set to 50.
>
> If I set it to 0, it does change a little but _still_ overflows:
>
> rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
>  entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
>         |        |     tot|        |      ed|    miss| verflow| _search|
>        3|       3|       1|       1|       1|       0|       0|       0|
>        4|      11|       5|       0|       0|       0|       0|       0|
>        5|       5|       2|       0|       0|       0|       0|       0|
>    23615|       1|   24002|   15812|       0|       0|       0|   11470|
>    68692|       0|   46780|   46777|       0|    4687|       0|    4492|
>    86046|       0|   18763|   18754|       0|   18754|       0|     119|
>    94884|       0|    9540|    9538|       0|    9538|       0|      47|
>   104901|       0|   10819|   10817|       0|   10817|       0|      61|
>   114919|       0|   10817|   10818|       0|   10818|       0|      68|
>   127424|       0|   13512|   13505|       0|   13505|       0|      74|
>   131062|       0|   15113|   15106|       0|   15106|   10368|      28|
>   131062|       0|   12503|   12482|       0|   12482|   11582|       9|
>   131062|       0|    8146|    8130|       0|    8130|    7530|       5|
>   131062|       0|    8204|    8194|       0|    8194|    7594|       2|
>   131062|       0|    8132|    8131|       0|    8131|    7531|       5|
>   131062|       0|    8196|    8195|       0|    8195|    7595|       4|
>   131062|       0|    8130|    8129|       0|    8129|    7529|       8|

From sim@netnation.com Thu Aug 25 14:24:34 2005
Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 14:24:43 -0700 (PDT)
Received: from peace.netnation.com (newpeace.netnation.com [204.174.223.7]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PLOYiL017838 for ; Thu, 25 Aug 2005 14:24:34 -0700
Received: from sim by peace.netnation.com with local (Exim 4.50) id 1E8PAp-0006QC-JI; Thu, 25 Aug 2005 14:22:11 -0700
Date: Thu, 25 Aug 2005 14:22:11 -0700
From: Simon Kirby
To: Alexey Kuznetsov
Cc: Robert Olsson , Eric Dumazet , netdev@oss.sgi.com
Subject: Re: Route cache performance
Message-ID: <20050825212211.GA23384@netnation.com>
References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com> <17163.32645.202453.145416@robur.slu.se> <20050824000158.GA8137@netnation.com> <20050825181111.GB14336@netnation.com> <20050825200543.GA6612@yakov.inr.ac.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
<20050825200543.GA6612@yakov.inr.ac.ru> User-Agent: Mutt/1.5.9i X-archive-position: 3562 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sim@netnation.com Precedence: bulk X-list: netdev Content-Length: 1223 Lines: 32 On Fri, Aug 26, 2005 at 12:05:43AM +0400, Alexey Kuznetsov wrote: > Hello! > > > something is holding the refcnt > 0 for almost all of the entries that > > rt_garbage_collect() walks. > > Did you try to look at output of "ip -s -s ro ls ca" ? > If it is just a refcnt leakage, leaked routes should appear there > and it is possible to guess, where they leaked. Hi Alexey, It appears to be just the DoS traffic I am routing through the box, as expected, but showing a refcnt for each entry: cache users 1 age 0sec mtu 1500 advmss 1460 hoplimit 64 iif eth3 I can't find in route.c what would ever decrement refcnt, and it seems to start being set to 1. It obviously does at some point or else the table would stay full forever, but when I stop the DoS it falls back down. What part of the code will decrement the count? I can't see it. The DoS in this case is set up to be from a spoofed source per packet and to the address of a remote box behind the box in question. Forwarding is enabled. BTW, I hacked a busy loop into juno-z.101f.c to fine rate control and found that with 2.6.13-rc6, it is unable to keep up with the traffic starting at about 112 kpps (each packet being a new random source). 
Simon- From SRS0=x8Ur=W4=zion.homelinux.com=sven@srs.kundenserver.de Thu Aug 25 14:19:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 14:19:27 -0700 (PDT) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.171]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7PLJHiL016868 for ; Thu, 25 Aug 2005 14:19:18 -0700 Received: from p549303E7.dip0.t-ipconnect.de [84.147.3.231] (helo=zion.homelinux.com) by mrelayeu.kundenserver.de with ESMTP (Nemesis), id 0MKxQS-1E8OrO0ncI-0007fS; Thu, 25 Aug 2005 23:02:06 +0200 Received: from localhost (zion.homelinux.com [127.0.0.1]) by stage2.zion.homelinux.com (Postfix) with ESMTP id 63D7816C4C4; Thu, 25 Aug 2005 23:02:05 +0200 (CEST) Received: from zion.homelinux.com ([127.0.0.1]) by localhost (zion.homelinux.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 10361-07; Thu, 25 Aug 2005 23:02:01 +0200 (CEST) Received: by zion.homelinux.com (Postfix, from userid 1022) id 30C0516C4C6; Thu, 25 Aug 2005 23:02:01 +0200 (CEST) Date: Thu, 25 Aug 2005 23:02:01 +0200 From: Sven Schuster To: Harald Welte , Alessandro Suardi , netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org Subject: Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines Message-ID: <20050825210200.GA10374@zion.homelinux.com> References: <5a4c581d05082506395fa984ae@mail.gmail.com> <20050825165550.GC4442@rama.de.gnumonks.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="4Ckj6UjgE2iN1+kY" Content-Disposition: inline In-Reply-To: <20050825165550.GC4442@rama.de.gnumonks.org> User-Agent: Mutt/1.5.9i X-Provags-ID: kundenserver.de abuse@kundenserver.de login:38b5f051b8cd178556c5593940405c4a X-archive-position: 3561 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: schuster.sven@gmx.de Precedence: bulk X-list: netdev Content-Length: 1067 Lines: 
41 --4Ckj6UjgE2iN1+kY Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Harald, On Thu, Aug 25, 2005 at 06:55:50PM +0200, Harald Welte told us: > Is it true that PeerGuardian is a proprietary application? I'm not > going to debug this problem using a proprietary ip_queue program, sorry. sorry to jump in here, but I took a quick look at PeerGuardian, according to http://methlabs.org/wiki/license_information it's open source. The source code is available at http://methlabs.org/projects/peerguardian-linuxosx/ HTH Sven --=20 Linux zion.homelinux.com 2.6.13-rc6-mm2 #3 Thu Aug 25 14:53:55 CEST 2005 i6= 86 athlon i386 GNU/Linux 22:56:18 up 7:40, 1 user, load average: 0.46, 0.14, 0.04 --4Ckj6UjgE2iN1+kY Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFDDjHIo4FAdB2PneQRAiX1AJ0Q1heEigmg49MCUMdY9EiDCI9LfwCfVYex P4mmlStmsdG54dWJxp3u8Ts= =PEro -----END PGP SIGNATURE----- --4Ckj6UjgE2iN1+kY-- From kuznet@yakov.inr.ac.ru Thu Aug 25 14:32:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 14:32:34 -0700 (PDT) Received: from yakov.inr.ac.ru (yakov.inr.ac.ru [194.67.69.111]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j7PLWSiL019103 for ; Thu, 25 Aug 2005 14:32:29 -0700 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=ms2.inr.ac.ru; b=UicNxEBxTkiG91GNT6GVL8Cn/vukvldDNWPmYGIg4HwQzAuxZ2bedM+/FmMEbWXpR14ULOvyQ8TyaJLJGC5OXsCsU1yTUnCIWmBRx7wImSQRlhv6T8E10ouNlcn9OMV8fpGykw7MMacnN49rw6KhJhEe+I1SW+gEw8P/UcSsJEM=; Received: (from kuznet@localhost) envelope-from=kuznet by yakov.inr.ac.ru (8.6.13/ANK) id AAA06672; Fri, 26 Aug 2005 00:05:43 +0400 Date: Fri, 26 Aug 2005 00:05:43 +0400 From: Alexey Kuznetsov To: Simon Kirby Cc: Robert Olsson , kuznet@ms2.inr.ac.ru, Eric Dumazet , netdev@oss.sgi.com Subject: Re: Route cache performance Message-ID: <20050825200543.GA6612@yakov.inr.ac.ru> 
References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com> <17163.32645.202453.145416@robur.slu.se> <20050824000158.GA8137@netnation.com> <20050825181111.GB14336@netnation.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050825181111.GB14336@netnation.com> User-Agent: Mutt/1.5.6i X-archive-position: 3563 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Content-Length: 290 Lines: 11 Hello! > something is holding the refcnt > 0 for almost all of the entries that > rt_garbage_collect() walks. Did you try to look at output of "ip -s -s ro ls ca" ? If it is just a refcnt leakage, leaked routes should appear there and it is possible to guess, where they leaked. Alexey From flamingice@sourmilk.net Thu Aug 25 17:52:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 25 Aug 2005 17:52:33 -0700 (PDT) Received: from server8.totalchoicehosting.com (server8.totalchoicehosting.com [216.180.241.250]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7Q0qQiL009417 for ; Thu, 25 Aug 2005 17:52:27 -0700 Received: from host-24-225-148-91.patmedia.net ([24.225.148.91] helo=[192.168.0.102]) by server8.totalchoicehosting.com with esmtpsa (TLSv1:RC4-MD5:128) (Exim 4.44) id 1E8SPy-00017A-3t; Thu, 25 Aug 2005 20:50:02 -0400 From: Michael Wu To: netdev@oss.sgi.com Subject: Re: ieee80211 patches Date: Thu, 25 Aug 2005 20:49:52 -0400 User-Agent: KMail/1.8.2 Cc: Jiri Benc , NetDev , Jeff Garzik , jbohac@suse.cz, pavel@suse.cz References: <20050825193112.529d0dc9@griffin.suse.cz> In-Reply-To: <20050825193112.529d0dc9@griffin.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200508252049.52413.flamingice@sourmilk.net> X-AntiAbuse: 
This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server8.totalchoicehosting.com X-AntiAbuse: Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - sourmilk.net X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 3564 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: flamingice@sourmilk.net Precedence: bulk X-list: netdev Content-Length: 651 Lines: 19 On Thursday 25 August 2005 13:31, Jiri Benc wrote: > Our patches against latest ieee80211 branch can be found at > http://kernel.org/pub/linux/kernel/people/jbenc/ > I hope to submit the adm8211 driver for review soon, but there's a bunch of code in the driver which probably belong in the ieee80211 code: - Duplicate frame removal - Definitions for all management payloads - SIOCSIWENCODEEXT and SIOCGIWENCODEEXT - AVS capture header in monitor mode - Software Scanning - Software Authentication & Association Does that sound okay? I will start submitting patches to add those things if they should be in the ieee80211 code. 
Thanks, -Michael Wu From laforge@netfilter.org Fri Aug 26 01:50:24 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 01:50:28 -0700 (PDT) Received: from ganesha.gnumonks.org (ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7Q8oNiL003885 for ; Fri, 26 Aug 2005 01:50:23 -0700 Received: from uucp by ganesha.gnumonks.org with local-bsmtp (Exim 4.50) id 1E8ZsV-0001Ou-6l for netdev@oss.sgi.com; Fri, 26 Aug 2005 10:47:59 +0200 Received: from laforge by rama.gnumonks.org with local (Exim 3.36 #1) id 1E8ZgN-00019n-00; Fri, 26 Aug 2005 10:35:27 +0200 Date: Fri, 26 Aug 2005 10:35:27 +0200 From: Harald Welte To: Sven Schuster Cc: Alessandro Suardi , netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org Subject: Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines Message-ID: <20050826083527.GD4226@rama.de.gnumonks.org> Mail-Followup-To: Harald Welte , Sven Schuster , Alessandro Suardi , netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org References: <5a4c581d05082506395fa984ae@mail.gmail.com> <20050825165550.GC4442@rama.de.gnumonks.org> <20050825210200.GA10374@zion.homelinux.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="W5WqUoFLvi1M7tJE" Content-Disposition: inline In-Reply-To: <20050825210200.GA10374@zion.homelinux.com> User-Agent: mutt-ng devel-20050619 (Debian) X-archive-position: 3565 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev Content-Length: 1636 Lines: 46 --W5WqUoFLvi1M7tJE Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Aug 25, 2005 at 11:02:01PM +0200, Sven Schuster wrote: >=20 > Hi Harald, >=20 > On Thu, Aug 25, 2005 at 06:55:50PM +0200, Harald Welte told 
us: > > Is it true that PeerGuardian is a proprietary application? I'm not > > going to debug this problem using a proprietary ip_queue program, sorry. >=20 > sorry to jump in here, but I took a quick look at PeerGuardian, > according to > http://methlabs.org/wiki/license_information > it's open source. The source code is available at > http://methlabs.org/projects/peerguardian-linuxosx/ ok, thanks. Sorry for the confusion, but the 'official' website is just a blog that didn't really reveal all that much information. --=20 - Harald Welte http://netfilter.org/ =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --W5WqUoFLvi1M7tJE Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFDDtROXaXGVTD0i/8RAry9AJ4gjMslIcm5T+nTvhKXHWHS5bdCVACgswYA EUq2k+lWbO+nrpQO8dzOleQ= =xE+q -----END PGP SIGNATURE----- --W5WqUoFLvi1M7tJE-- From stefan@loplof.de Fri Aug 26 02:04:20 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 02:04:22 -0700 (PDT) Received: from natnoddy.rzone.de (natnoddy.rzone.de [81.169.145.166]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7Q94JiL006039 for ; Fri, 26 Aug 2005 02:04:20 -0700 Received: from dose.hogan.de (p54B0064D.dip0.t-ipconnect.de [84.176.6.77]) by post.webmailer.de (8.13.1/8.13.1) with ESMTP id j7Q91rtS013134; Fri, 26 Aug 2005 11:01:53 +0200 (MEST) From: Stefan Rompf To: Michael Wu Subject: Re: ieee80211 patches Date: Fri, 26 Aug 2005 11:04:15 +0200 User-Agent: KMail/1.8 Cc: netdev@oss.sgi.com, Jiri Benc , NetDev , Jeff Garzik , jbohac@suse.cz, pavel@suse.cz References: 
<20050825193112.529d0dc9@griffin.suse.cz> <200508252049.52413.flamingice@sourmilk.net> In-Reply-To: <200508252049.52413.flamingice@sourmilk.net> MIME-Version: 1.0 Content-Disposition: inline Message-Id: <200508261104.16134.stefan@loplof.de> Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-archive-position: 3566 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: stefan@loplof.de Precedence: bulk X-list: netdev Content-Length: 670 Lines: 17 Hi, Am Freitag 26 August 2005 02:49 schrieb Michael Wu: > I hope to submit the adm8211 driver for review soon, but there's a bunch of > code in the driver which probably belong in the ieee80211 code: [...] > - AVS capture header in monitor mode there has been a discussion about AVS or radiotap header on the ipw list that resulted in the inclusion of radiotap definitions in James' latest patches for ieee80211. Radiotap requires a libpcap upgrade, but can be extended easier. I'd like all linux wireless drivers to move to radiotap for statistic data so that userspace has just one format to parse, and a new driver would be a perfect target to begin. 
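
For readers unfamiliar with the format, the fixed part of a radiotap header can be sketched as follows. The layout (version, pad, little-endian length, little-endian "present" bitmap) follows the published radiotap definition; the function name and the returned tuple are illustrative choices only, and extended bitmaps (bit 31) are deliberately not followed here. The bitmap is presumably what makes the format easier to extend: a new field gets a new bit, and a parser can use the length field to step over the whole header.

```python
import struct

# Bit numbers of a few well-known fields in the "present" bitmap.
KNOWN_FIELDS = {0: "TSFT", 1: "FLAGS", 2: "RATE", 3: "CHANNEL"}


def parse_radiotap_header(frame: bytes):
    """Parse the fixed 8-byte radiotap header at the start of a captured frame.

    Returns (version, header_length, present_field_names, payload), where
    payload is everything after the radiotap header (the 802.11 frame).
    """
    version, _pad, length, present = struct.unpack_from("<BBHI", frame, 0)
    # Bit 31 only signals that another bitmap word follows; it is skipped here.
    fields = [KNOWN_FIELDS.get(bit, f"bit{bit}")
              for bit in range(31) if present & (1 << bit)]
    payload = frame[length:]
    return version, length, fields, payload
```

A userspace tool that does not care about the per-packet statistics can ignore `fields` entirely and use only the returned payload.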
Stefan

From kuznet@yakov.inr.ac.ru Fri Aug 26 04:58:09 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 04:58:16 -0700 (PDT)
Received: from yakov.inr.ac.ru (yakov.inr.ac.ru [194.67.69.111]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j7QBw6iL026734 for ; Fri, 26 Aug 2005 04:58:07 -0700
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=ms2.inr.ac.ru; b=F0vNkIotNGbIzesWuf4wLK+u0Z/GPRS5apKGp6/tnkMKxEKjZF2vy8TSmMPUTBr7K9z1m1r+d5hs6R2TlnzFzzFEEBgF/NWYV/E8zMak/zx12MNhlSKL6M3PD2eYlngZwN7l94eKml0vA1vaYGviXWlrPuBERUoqBRBMm2XBfVs=;
Received: (from kuznet@localhost) envelope-from=kuznet by yakov.inr.ac.ru (8.6.13/ANK) id PAA12574; Fri, 26 Aug 2005 15:55:20 +0400
Date: Fri, 26 Aug 2005 15:55:20 +0400
From: Alexey Kuznetsov
To: Simon Kirby
Cc: Alexey Kuznetsov , Robert Olsson , Eric Dumazet , netdev@oss.sgi.com
Subject: Re: Route cache performance
Message-ID: <20050826115520.GA12351@yakov.inr.ac.ru>
References: <20050815213855.GA17832@netnation.com> <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com> <17163.32645.202453.145416@robur.slu.se> <20050824000158.GA8137@netnation.com> <20050825181111.GB14336@netnation.com> <20050825200543.GA6612@yakov.inr.ac.ru> <20050825212211.GA23384@netnation.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050825212211.GA23384@netnation.com>
User-Agent: Mutt/1.5.6i
X-archive-position: 3567
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kuznet@ms2.inr.ac.ru
Precedence: bulk
X-list: netdev
Content-Length: 289
Lines: 11

Hello!

> What part of the code will decrement the count? I can't see it.

It depends. In the case of forwarding, it is kfree_skb(), happening after
the packet is transmitted by the output device.

Well, it could result in overflow only if the device queue is longer than
route/max_size.
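
The condition can be illustrated with a toy model (a sketch only; `RouteCache`, `forward` and their parameters are hypothetical and not kernel interfaces). It assumes every packet arrives from a new source, so each one creates its own cache entry, and that an entry's reference is dropped only when its packet leaves the transmit queue.

```python
from collections import deque


class RouteCache:
    """Toy cache: an entry can be evicted only when nothing references it."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.refcnt = {}  # route key -> number of references held

    def get(self, key):
        """Take a reference on the entry for key; False means the cache overflowed."""
        if key not in self.refcnt:
            if len(self.refcnt) >= self.max_size and not self._evict_unreferenced():
                return False
            self.refcnt[key] = 0
        self.refcnt[key] += 1
        return True

    def put(self, key):
        """Drop a reference (in the model: the queued packet was freed)."""
        self.refcnt[key] -= 1

    def _evict_unreferenced(self):
        victim = next((k for k, n in self.refcnt.items() if n == 0), None)
        if victim is None:
            return False
        del self.refcnt[victim]
        return True


def forward(n_packets, queue_len, max_size):
    """Forward packets with unique sources; return how many hit a full cache."""
    cache = RouteCache(max_size)
    txq = deque()  # packets waiting on the output device, each pinning one entry
    overflows = 0
    for src in range(n_packets):
        if len(txq) >= queue_len:
            cache.put(txq.popleft())  # oldest packet transmitted and freed
        if cache.get(src):
            txq.append(src)
        else:
            overflows += 1
    return overflows
```

In this model a transmit queue no longer than the cache never overflows, because the oldest entry is always released in time, while a queue even one slot longer pins every entry and every further lookup fails. The model never drains the queue once that happens, so it shows only the onset of the problem, not its recovery.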
Alexey From jbenc@suse.cz Fri Aug 26 05:10:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 05:11:03 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QCAviL028615 for ; Fri, 26 Aug 2005 05:10:58 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 7F01C628374; Fri, 26 Aug 2005 14:08:31 +0200 (CEST) Date: Fri, 26 Aug 2005 14:08:31 +0200 From: Jiri Benc To: Michael Wu Cc: netdev@oss.sgi.com, NetDev , Jeff Garzik , jbohac@suse.cz, pavel@suse.cz Subject: Re: ieee80211 patches Message-ID: <20050826140831.5ec0f301@griffin.suse.cz> In-Reply-To: <200508252049.52413.flamingice@sourmilk.net> References: <20050825193112.529d0dc9@griffin.suse.cz> <200508252049.52413.flamingice@sourmilk.net> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 3568 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 820 Lines: 25 On Thu, 25 Aug 2005 20:49:52 -0400, Michael Wu wrote: > I hope to submit the adm8211 driver for review soon, but there's a bunch of > code in the driver which probably belong in the ieee80211 code: > > - Duplicate frame removal > - Definitions for all management payloads > - SIOCSIWENCODEEXT and SIOCGIWENCODEEXT > - AVS capture header in monitor mode > - Software Scanning > - Software Authentication & Association > > Does that sound okay? I will start submitting patches to add those things if > they should be in the ieee80211 code. Sounds great. Could you post a pointer to the project? 
Btw, Andrea Merello has a code for some of these features in his rtl8180-sa2400 project (http://sourceforge.net/projects/rtl8180-sa2400) too - we are trying to subsequently include them. Thanks, -- Jiri Benc SUSE Labs From flamingice@sourmilk.net Fri Aug 26 07:01:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 07:01:39 -0700 (PDT) Received: from server8.totalchoicehosting.com (server8.totalchoicehosting.com [216.180.241.250]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QE1XiL032617 for ; Fri, 26 Aug 2005 07:01:34 -0700 Received: from host-24-225-148-91.patmedia.net ([24.225.148.91] helo=[192.168.0.102]) by server8.totalchoicehosting.com with esmtpsa (TLSv1:RC4-MD5:128) (Exim 4.44) id 1E8ejb-0003PW-Nw; Fri, 26 Aug 2005 09:59:07 -0400 From: Michael Wu To: Stefan Rompf Subject: Re: ieee80211 patches Date: Fri, 26 Aug 2005 09:59:00 -0400 User-Agent: KMail/1.8.2 Cc: netdev@oss.sgi.com, Jiri Benc , NetDev , Jeff Garzik , jbohac@suse.cz, pavel@suse.cz References: <20050825193112.529d0dc9@griffin.suse.cz> <200508252049.52413.flamingice@sourmilk.net> <200508261104.16134.stefan@loplof.de> In-Reply-To: <200508261104.16134.stefan@loplof.de> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200508260959.01120.flamingice@sourmilk.net> X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server8.totalchoicehosting.com X-AntiAbuse: Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - sourmilk.net X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 3569 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: flamingice@sourmilk.net Precedence: bulk X-list: netdev Content-Length: 730 Lines: 16 On Friday 26 August 2005 05:04, Stefan Rompf 
wrote: > > - AVS capture header in monitor mode > > there has been a discussion about AVS or radiotap header on the ipw list > that resulted in the inclusion of radiotap definitions in James' latest > patches for ieee80211. Radiotap requires a libpcap upgrade, but can be > extended easier. I'd like all linux wireless drivers to move to radiotap > for statistic data so that userspace has just one format to parse, and a > new driver would be a perfect target to begin. > Sounds like everyone agrees that radiotap is a superior header. I will remove AVS support and add radiotap support (if additional code is needed) once radiotap code is merged in ieee80211. Thanks, -Michael Wu From flamingice@sourmilk.net Fri Aug 26 08:09:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 08:09:47 -0700 (PDT) Received: from server8.totalchoicehosting.com (server8.totalchoicehosting.com [216.180.241.250]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QF9fiL007721 for ; Fri, 26 Aug 2005 08:09:42 -0700 Received: from host-24-225-148-91.patmedia.net ([24.225.148.91] helo=[192.168.0.102]) by server8.totalchoicehosting.com with esmtpsa (TLSv1:RC4-MD5:128) (Exim 4.44) id 1E8fnZ-0005K5-WC; Fri, 26 Aug 2005 11:07:18 -0400 From: Michael Wu To: Jiri Benc Subject: Re: ieee80211 patches Date: Fri, 26 Aug 2005 11:07:07 -0400 User-Agent: KMail/1.8.2 Cc: netdev@oss.sgi.com, NetDev , Jeff Garzik , jbohac@suse.cz, pavel@suse.cz References: <20050825193112.529d0dc9@griffin.suse.cz> <200508252049.52413.flamingice@sourmilk.net> <20050826140831.5ec0f301@griffin.suse.cz> In-Reply-To: <20050826140831.5ec0f301@griffin.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200508261107.07501.flamingice@sourmilk.net> X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server8.totalchoicehosting.com X-AntiAbuse: 
Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - sourmilk.net X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 3570 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: flamingice@sourmilk.net Precedence: bulk X-list: netdev Content-Length: 2033 Lines: 46 On Friday 26 August 2005 08:08, Jiri Benc wrote: > On Thu, 25 Aug 2005 20:49:52 -0400, Michael Wu wrote: > > I hope to submit the adm8211 driver for review soon, but there's a bunch > > of code in the driver which probably belong in the ieee80211 code: > > > > - Duplicate frame removal > > - Definitions for all management payloads > > - SIOCSIWENCODEEXT and SIOCGIWENCODEEXT > > - AVS capture header in monitor mode > > - Software Scanning > > - Software Authentication & Association > > > > Does that sound okay? I will start submitting patches to add those things > > if they should be in the ieee80211 code. > > Sounds great. Could you post a pointer to the project? > http://aluminum.sourmilk.net/adm8211/index.php?path=netdev/ > Btw, Andrea Merello has a code for some of these features in his > rtl8180-sa2400 project (http://sourceforge.net/projects/rtl8180-sa2400) > too - we are trying to subsequently include them. > Hmm.. - adm8211 supports shared key authentication, rtl8180-sa2400 only supports open authentication. (which is kinda okay when you consider that 802.11i doesn't use shared key) - Both drivers do not support LEAP authentication. - adm8211 supports WE18/WPA (partly via SIWGENIE, it doesn't generate WPA/RSN IEs itself) - rtl8180-sa2400 supports a really simple master mode. (should something like hostapd be used instead for master mode?) - rtl8180-sa2400 supports sending probe responses. adm8211 lets the hardware do it, though newer adm8211 cards can be told to hand it off to the software. - rtl8180-sa2400 creates a completely random bssid in adhoc. 
adm8211 keeps the first 3 octets from the MAC addr. - adm8211 supports sending & handling deauthentication and disassociation frames. I did not see definitions for those frames, so I assume there's no support there. - adm8211 needs many hooks in scanning/authentication/association to set BSSID/SSID/AID/etc. in hardware. - rtl8180-sa2400 doesn't seem to convert to little endian in a number of places. (not really major..) -Michael Wu From kaber@trash.net Fri Aug 26 10:54:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 10:54:37 -0700 (PDT) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QHsViL024710 for ; Fri, 26 Aug 2005 10:54:32 -0700 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.52) id 1E8iN6-00010I-7X; Fri, 26 Aug 2005 19:52:08 +0200 Message-ID: <430F56C7.8070500@trash.net> Date: Fri, 26 Aug 2005 19:52:07 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050803 Debian/1.7.10-1 X-Accept-Language: en MIME-Version: 1.0 To: Alessandro Suardi CC: netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org Subject: Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines References: <5a4c581d05082506395fa984ae@mail.gmail.com> In-Reply-To: <5a4c581d05082506395fa984ae@mail.gmail.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3571 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Content-Length: 1678 Lines: 40 Alessandro Suardi wrote: > Stack is hand-copied from the dead box's console. 
>
> [] die+0xe4/0x170
> [] do_trap+0x7f/0xc0
> [] do_invalid_op+0xa3/0xb0
> [] error_code+0x4f/0x54
> [] kfree_skbmem+0xb/0x20
> [] __kfree_skb+0x5f/0xf0
> [] tcp_clean_rtx_queue+0x16a/0x470
> [] tcp_ack+0xf6/0x360
> [] tcp_rcv_established+0x277/0x7a0
> [] tcp_v4_do_rcv+0xf0/0x110
> [] tcp_v4_rcv+0x6e0/0x820
> [] ip_local_deliver_finish+0x84/0x160
> [] nf_reinject+0x13a/0x1c0
> [] ipq_issue_verdict+0x28/0x40
> [] ipq_set_verdict+0x48/0x70
> [] ipq_receive_peer+0x39/0x50
> [] ipq_receive_sk+0x172/0x190
> [] netlink_data_ready+0x35/0x60
> [] netlink_sendskb+0x24/0x60
> [] netlink_unicast+0x127/0x160
> [] netlink_sendmsg+0x204/0x2b0
> [] sock_sendmsg+0xb0/0xe0
> [] sys_sendmsg+0x134/0x240
> [] sys_socketcall+0x224/0x230
> [] sysenter_past_esp+0x54/0x75
> Code: 8b 41 0c 85 c0 75 1b 8b 86 94 00 00 00 e8 9e 37 e5 ff 5b 5e c9
> c3 89 d0 e8 43 46 e5 ff 8d 76 00 eb d2 89 f0 e8 f7 fe ff ff eb dc <0f>
> 0b 54 01 16 d2 36 c0 eb b4 8d 74 26 00 8d bc 27 00 00 00 00
> <0>Kernel panic - not syncing: Fatal exception in interrupt
>
> If there's need for further info I'd be happy to provide it. For now
> the box is rebooted into the same kernel and running the same
> PG/eD2k programs, if the issue reproduces I'll follow up on my
> own message.

Any chance you can get the entire Oops including registers etc
using netconsole or serial console?

From chrisw@osdl.org Fri Aug 26 12:25:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 12:26:10 -0700 (PDT) Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QJPqiL002046 for ; Fri, 26 Aug 2005 12:25:54 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j7QJK2jA001561 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 26 Aug 2005 12:20:02 -0700 Received: from localhost.localdomain (shell0.pdx.osdl.net [10.9.0.31]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j7QJK1mq025675; Fri, 26 Aug 2005 12:20:01 -0700 Received: (from chrisw@localhost) by localhost.localdomain (8.13.1/8.13.1/Submit) id j7QJJ27w012652; Fri, 26 Aug 2005 12:19:02 -0700 Message-Id: <20050826191901.965850000@localhost.localdomain> References: <20050826191755.052951000@localhost.localdomain> Date: Fri, 26 Aug 2005 12:17:59 -0700 From: Chris Wright To: linux-kernel@vger.kernel.org, stable@kernel.org, Ollie Wild Cc: Justin Forbes , Zwane Mwaikambo , "Theodore Ts'o" , Randy Dunlap , Chuck Wolber , torvalds@osdl.org, akpm@osdl.org, alan@lxorguk.ukuu.org.uk, Maillist netdev , Patrick McHardy , "David S. Miller" , Chris Wright Subject: [PATCH 4/7] [IPV4]: Fix DST leak in icmp_push_reply() Content-Disposition: inline; filename=fix-dst-leak-in-icmp_push_reply.patch X-MIMEDefang-Filter: osdl$Revision: 1.114 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 3572 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chrisw@osdl.org Precedence: bulk X-list: netdev Content-Length: 1332 Lines: 38 -stable review patch. If anyone has any objections, please let us know. ------------------ Based upon a bug report and initial patch by Ollie Wild. Signed-off-by: Patrick McHardy Signed-off-by: "David S. 
Miller"
Signed-off-by: Chris Wright
---
 net/ipv4/icmp.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

Index: linux-2.6.12.y/net/ipv4/icmp.c
===================================================================
--- linux-2.6.12.y.orig/net/ipv4/icmp.c
+++ linux-2.6.12.y/net/ipv4/icmp.c
@@ -349,12 +349,12 @@ static void icmp_push_reply(struct icmp_
 {
 	struct sk_buff *skb;
 
-	ip_append_data(icmp_socket->sk, icmp_glue_bits, icmp_param,
-		       icmp_param->data_len+icmp_param->head_len,
-		       icmp_param->head_len,
-		       ipc, rt, MSG_DONTWAIT);
-
-	if ((skb = skb_peek(&icmp_socket->sk->sk_write_queue)) != NULL) {
+	if (ip_append_data(icmp_socket->sk, icmp_glue_bits, icmp_param,
+			   icmp_param->data_len+icmp_param->head_len,
+			   icmp_param->head_len,
+			   ipc, rt, MSG_DONTWAIT) < 0)
+		ip_flush_pending_frames(icmp_socket->sk);
+	else if ((skb = skb_peek(&icmp_socket->sk->sk_write_queue)) != NULL) {
 		struct icmphdr *icmph = skb->h.icmph;
 		unsigned int csum = 0;
 		struct sk_buff *skb1;
--

From Robert.Olsson@data.slu.se Fri Aug 26 12:52:15 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 12:52:20 -0700 (PDT)
Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QJqEiL003904 for ; Fri, 26 Aug 2005 12:52:15 -0700
Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j7QJnBjN019050; Fri, 26 Aug 2005 21:49:11 +0200
Received: by robur.slu.se (Postfix, from userid 1000) id 7BCDCEC3BB; Fri, 26 Aug 2005 21:49:11 +0200 (CEST)
From: Robert Olsson
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <17167.29239.469711.847951@robur.slu.se>
Date: Fri, 26 Aug 2005 21:49:11 +0200
To: Alexey Kuznetsov
Cc: Simon Kirby , Robert Olsson , Eric Dumazet , netdev@oss.sgi.com
Subject: Re: Route cache performance
In-Reply-To: <20050826115520.GA12351@yakov.inr.ac.ru>
References: <20050815213855.GA17832@netnation.com>
 <43014E27.1070104@cosmosbay.com> <20050823190852.GA20794@netnation.com> <17163.32645.202453.145416@robur.slu.se> <20050824000158.GA8137@netnation.com> <20050825181111.GB14336@netnation.com> <20050825200543.GA6612@yakov.inr.ac.ru> <20050825212211.GA23384@netnation.com> <20050826115520.GA12351@yakov.inr.ac.ru>
X-Mailer: VM 7.19 under Emacs 21.4.1
X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70
X-archive-position: 3573
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Robert.Olsson@data.slu.se
Precedence: bulk
X-list: netdev
Content-Length: 696
Lines: 30

Hello!

This thread seems familiar :)

I think Simon uses UP, and it could be an idea to check whether the RCU
deferred deletion causes the problem.

Simon, it would be interesting to see if the patch below makes any
difference, assuming the guess about UP is correct.

Cheers.
					--ro

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -485,7 +485,11 @@ static struct file_operations rt_cpu_seq
 static __inline__ void rt_free(struct rtable *rt)
 {
 	multipath_remove(rt);
+#ifdef CONFIG_SMP
 	call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free);
+#else
+	dst_free((struct dst_entry *)rt);
+#endif
 }
 
 static __inline__ void rt_drop(struct rtable *rt)

From alessandro.suardi@gmail.com Fri Aug 26 13:42:32 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 13:42:39 -0700 (PDT)
Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.207]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QKgViL007671 for ; Fri, 26 Aug 2005 13:42:31 -0700
Received: by rproxy.gmail.com with SMTP id f1so652763rne for ; Fri, 26 Aug 2005 13:40:05 -0700 (PDT)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;
b=P1x1FfSgmnkSu8e7d6dsbWd+KjBBO3A70vSUoVzGtmyfEnSh/+ZpPehGB1n4bqE9CHzpA4H2V3M0+AfCmGhD1guwyQFxqGIBPXqa58K8ITMOGDsf2HPzW6Z5TxmzRWwJLqD+Tnx5iHxrteS3duPilRywQi35VV0NIFN87LUyzQQ= Received: by 10.38.76.59 with SMTP id y59mr301696rna; Fri, 26 Aug 2005 13:40:05 -0700 (PDT) Received: by 10.38.13.14 with HTTP; Fri, 26 Aug 2005 13:40:05 -0700 (PDT) Message-ID: <5a4c581d0508261340cc0c9ee@mail.gmail.com> Date: Fri, 26 Aug 2005 22:40:05 +0200 From: Alessandro Suardi To: Patrick McHardy Subject: Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines Cc: netdev@oss.sgi.com, Linux Kernel Mailing List , netfilter-devel@lists.netfilter.org In-Reply-To: <430F56C7.8070500@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <5a4c581d05082506395fa984ae@mail.gmail.com> <430F56C7.8070500@trash.net> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7QKgViL007671 X-archive-position: 3574 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alessandro.suardi@gmail.com Precedence: bulk X-list: netdev Content-Length: 3056 Lines: 74 On 8/26/05, Patrick McHardy wrote: > Alessandro Suardi wrote: > > Stack is hand-copied from the dead box's console. 
> > > > [] die+0xe4/0x170 > > [] do_trap+0x7f/0xc0 > > [] do_invalid_op+0xa3/0xb0 > > [] error_code+0x4f/0x54 > > [] kfree_skbmem+0xb/0x20 > > [] __kfree_skb+0x5f/0xf0 > > [] tcp_clean_rtx_queue+0x16a/0x470 > > [] tcp_ack+0xf6/0x360 > > [] tcp_rcv_established+0x277/0x7a0 > > [] tcp_v4_do_rcv+0xf0/0x110 > > [] tcp_v4_rcv+0x6e0/0x820 > > [] ip_local_deliver_finish+0x84/0x160 > > [] nf_reinject+0x13a/0x1c0 > > [] ipq_issue_verdict+0x28/0x40 > > [] ipq_set_verdict+0x48/0x70 > > [] ipq_receive_peer+0x39/0x50 > > [] ipq_receive_sk+0x172/0x190 > > [] netlink_data_ready+0x35/0x60 > > [] netlink_sendskb+0x24/0x60 > > [] netlink_unicast+0x127/0x160 > > [] netlink_sendmsg+0x204/0x2b0 > > [] sock_sendmsg+0xb0/0xe0 > > [] sys_sendmsg+0x134/0x240 > > [] sys_socketcall+0x224/0x230 > > [] sysenter_past_esp+0x54/0x75 > > Code: 8b 41 0c 85 c0 75 1b 8b 86 94 00 00 00 e8 9e 37 e5 ff 5b 5e c9 > > c3 89 d0 e8 43 46 e5 ff 8d 76 00 eb d2 89 f0 e8 f7 fe ff ff eb dc <0f> > > 0b 54 01 16 d2 36 c0 eb b4 8d 74 26 00 8d bc 27 00 00 00 00 > > <0>Kernel panic - not syncing: Fatal exception in interrupt > > > > If there's need for further info I'd be happy to provide it. For now > > the box is rebooted into the same kernel and running the same > > PG/eD2k programs, if the issue reproduces I'll follow up on my > > own message. > > Any chance you can get the entire Oops including registers etc > using netconsole or serial console? Not right now, as I noticed netconsole requires netpoll and this latter can't be modular; but I'll do so before leaving tomorrow morning, obviously rebuilding with 2.6.13-rc7-git1 or -git2 if the new snapshot comes out. At the moment, the box has been running for 32 hours with no sign of wanting to oops... 
[root@donkey ~]# ps ax | egrep 'peer|edon' 2416 pts/2 Sl 25:37 peerguardnf -d -l /var/log/pg.log -c /etc/PG.conf 25186 pts/0 R+ 76:37 ./edonkey2000 25189 pts/0 S+ 0:06 ./edonkey2000 25191 pts/0 S+ 9:49 ./edonkey2000 7007 pts/0 S+ 0:00 ./edonkey2000 7011 pts/3 R+ 0:00 egrep peer|edon [root@donkey ~]# w 22:37:53 up 1 day, 7:49, 4 users, load average: 0.15, 0.18, 0.25 USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT root pts/0 donkey:2.0 Thu14 20:15m 1:26m 0.00s bash root pts/1 donkey:2.0 Thu14 13:40m 0.41s 1:57 gnome-terminal --sm-config-prefix /gnome-terminal-wBjEOn/ - root pts/2 donkey:2.0 Thu14 4:07 25:37 0.49s bash root pts/3 192.168.1.6 22:37 0.00s 0.06s 0.01s w Thanks, --alessandro "Not every smile means I'm laughing inside" (Wallflowers - "From The Bottom Of My Heart") From flamingice@sourmilk.net Fri Aug 26 16:53:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 16:53:29 -0700 (PDT) Received: from server8.totalchoicehosting.com (server8.totalchoicehosting.com [216.180.241.250]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7QNrLiL024830 for ; Fri, 26 Aug 2005 16:53:21 -0700 Received: from host-24-225-148-91.patmedia.net ([24.225.148.91] helo=[192.168.0.102]) by server8.totalchoicehosting.com with esmtpsa (TLSv1:RC4-MD5:128) (Exim 4.44) id 1E8nyK-0006Aw-Ct; Fri, 26 Aug 2005 19:50:56 -0400 From: Michael Wu To: netdev@oss.sgi.com Subject: Re: ieee80211 patches Date: Fri, 26 Aug 2005 19:50:47 -0400 User-Agent: KMail/1.8.2 Cc: Jiri Benc , NetDev , Jeff Garzik , jbohac@suse.cz, pavel@suse.cz References: <20050825193112.529d0dc9@griffin.suse.cz> <200508261107.07501.flamingice@sourmilk.net> <20050826181329.6dcf94d3@griffin.suse.cz> In-Reply-To: <20050826181329.6dcf94d3@griffin.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200508261950.47716.flamingice@sourmilk.net> X-AntiAbuse: This header was added to track abuse, please include it with 
any abuse report X-AntiAbuse: Primary Hostname - server8.totalchoicehosting.com X-AntiAbuse: Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - sourmilk.net X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 3575 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: flamingice@sourmilk.net Precedence: bulk X-list: netdev Content-Length: 2058 Lines: 56 On Friday 26 August 2005 12:13, Jiri Benc wrote: > You mentioned problem with hooks in scanning/association/etc. Probably > everybody involved in this discussion knows it but I think it is worth > saying it: This is one of the most important things in a new ieee80211 > layer. Unfortunately, it seems that every driver developer concentrates > on the aspects of his device (this include ieee80211 layer from ipw as > well as from rtl8180). But every device (or more precisely firmware) > provides different set of features - some takes care of almost > everything, other can do basic things only. So the amount of work that > needs to be done in the ieee80211 layer is different for different > devices. The layer has to be generic. It has to cope e.g. with > authentication/association performed fully by itself as well as with > auth/assoc performed automatically by the device, etc. > > I'm convinced this is the first goal we have to reach before we start to > implement WPA, QoS and so. 
Right now, the adm8211 driver uses a number of specific hooks in its own
ieee80211 station management code such as:

set_bssid
set_ssid
set_channel
set_interval
set_beacon

Though with enough of these hooks, many devices can be supported, I think
generic hooks like these would work best to support a wide range of hardware
that use softmac:

pre_scan
set_channel
pre_associate
link_change
start_ibss
join_ibss

Then for the devices that do more in the firmware, adding callbacks like:

hw_scan
hw_associate
hw_start_ibss

would indicate to the ieee80211 layer that the hardware is capable of handling
those things itself. The driver should be able to find all the info needed to
configure the hardware in ieee80211_device so only that needs to be passed to
the hooks.

A good portion of WPA can be implemented right now by implementing the
encodeext ioctls and generalizing ieee80211_security to support more than just
WEP. adm8211 doesn't have support for non-wep encryption yet, so WPA works
without ieee80211_security support.

Sound good?
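
A minimal sketch of how the optional hw_* callbacks described above might be
dispatched. This is not code from adm8211 or from the ieee80211 layer: the
struct and function names (wlan_dev, wlan_ops, wlan_scan, demo_*) are
hypothetical, and only the hook names pre_scan, set_channel and hw_scan come
from the list above. It assumes a NULL hook means "not provided by the driver".

```c
#include <stddef.h>

/* Hypothetical per-device state; a stand-in, not struct ieee80211_device. */
struct wlan_dev {
	int channel;	/* last channel programmed into the "hardware" */
	int hw_scans;	/* scans handed off to firmware */
	int sw_scans;	/* scans driven by the software layer */
};

/* Hook table a driver would fill in; any hook may be left NULL. */
struct wlan_ops {
	void (*pre_scan)(struct wlan_dev *dev);
	void (*set_channel)(struct wlan_dev *dev, int channel);
	int (*hw_scan)(struct wlan_dev *dev);	/* set only if firmware scans itself */
};

/*
 * Generic scan entry point: prefer the firmware path when the driver
 * advertises hw_scan, otherwise walk the channel list in software.
 */
int wlan_scan(struct wlan_dev *dev, const struct wlan_ops *ops,
	      const int *channels, size_t n)
{
	size_t i;

	if (ops->hw_scan)
		return ops->hw_scan(dev);
	if (!ops->set_channel)
		return -1;	/* neither path is usable */
	if (ops->pre_scan)
		ops->pre_scan(dev);
	for (i = 0; i < n; i++) {
		/* a real layer would probe or listen for beacons on each channel */
		ops->set_channel(dev, channels[i]);
	}
	dev->sw_scans++;
	return 0;
}

/* Toy softmac-style hook: just records the channel. */
void demo_set_channel(struct wlan_dev *dev, int channel)
{
	dev->channel = channel;
}

/* Toy firmware-style hook: pretends the hardware did the whole scan. */
int demo_hw_scan(struct wlan_dev *dev)
{
	dev->hw_scans++;
	return 0;
}
```

A softmac driver in this sketch would set only set_channel (and optionally
pre_scan), while a firmware-heavy driver would set hw_scan and nothing else;
the generic code does not need to know which kind of device it is driving.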
-Michael Wu From greearb@candelatech.com Fri Aug 26 17:11:17 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 17:11:21 -0700 (PDT) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7R0BGiL026205 for ; Fri, 26 Aug 2005 17:11:16 -0700 Received: from [71.112.205.201] (pool-71-112-205-201.sttlwa.dsl-w.verizon.net [71.112.205.201]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j7R0Cfo6008544 for ; Fri, 26 Aug 2005 17:12:42 -0700 Message-ID: <430FAF0D.5050000@candelatech.com> Date: Fri, 26 Aug 2005 17:08:45 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "'netdev@oss.sgi.com'" Subject: Leaked net-device reference in eql.c Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3576 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 907 Lines: 44 Hello! I think the eql_s_slave_cfg method in eql.c leaks the reference to slave_dev. Am I missing something? 

static int eql_s_slave_cfg(struct net_device *dev, slave_config_t __user *scp)
{
	slave_t *slave;
	equalizer_t *eql;
	struct net_device *slave_dev;
	slave_config_t sc;
	int ret;

	if (copy_from_user(&sc, scp, sizeof (slave_config_t)))
		return -EFAULT;

	slave_dev = dev_get_by_name(sc.slave_name);
	if (!slave_dev)
		return -ENODEV;

	ret = -EINVAL;

	eql = netdev_priv(dev);
	spin_lock_bh(&eql->queue.lock);
	if (eql_is_slave(slave_dev)) {
		slave = __eql_find_slave_dev(&eql->queue, slave_dev);
		if (slave) {
			slave->priority = sc.priority;
			slave->priority_bps = sc.priority;
			slave->priority_Bps = sc.priority / 8;
			ret = 0;
		}
	}
	spin_unlock_bh(&eql->queue.lock);

	return ret;
}

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From kaber@trash.net Fri Aug 26 20:40:37 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 20:40:43 -0700 (PDT)
Received: from localhost.localdomain ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7R3eaiL009730 for ; Fri, 26 Aug 2005 20:40:37 -0700
Received: from localhost ([127.0.0.1]) by localhost.localdomain with esmtp (Exim 4.52) id 1E8rWA-0002JJ-IZ; Sat, 27 Aug 2005 05:38:06 +0200
Message-ID: <430FE01E.9020600@trash.net>
Date: Sat, 27 Aug 2005 05:38:06 +0200
From: Patrick McHardy
User-Agent: Debian Thunderbird 1.0.6 (X11/20050802)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Ben Greear
CC: "'netdev@oss.sgi.com'"
Subject: Re: Leaked net-device reference in eql.c
References: <430FAF0D.5050000@candelatech.com>
In-Reply-To: <430FAF0D.5050000@candelatech.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 3577
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kaber@trash.net
Precedence: bulk
X-list: netdev
Content-Length: 184
Lines: 5

Ben Greear wrote:
> I think the eql_s_slave_cfg method in eql.c leaks
> the reference to slave_dev. Am I missing something?

No, it should also put the device, as in eql_g_slave_cfg. From greearb@candelatech.com Fri Aug 26 23:26:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 26 Aug 2005 23:26:39 -0700 (PDT) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7R6QYiL018963 for ; Fri, 26 Aug 2005 23:26:34 -0700 Received: from [71.112.205.201] (pool-71-112-205-201.sttlwa.dsl-w.verizon.net [71.112.205.201]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j7R6S4o6012040; Fri, 26 Aug 2005 23:28:05 -0700 Message-ID: <43100709.1090800@candelatech.com> Date: Fri, 26 Aug 2005 23:24:09 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Patrick McHardy CC: "'netdev@oss.sgi.com'" Subject: Re: Leaked net-device reference in eql.c References: <430FAF0D.5050000@candelatech.com> <430FE01E.9020600@trash.net> In-Reply-To: <430FE01E.9020600@trash.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 3578 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 1128 Lines: 45 Patrick McHardy wrote: > Ben Greear wrote: > >> I think the eql_s_slave_cfg method in eql.c leaks >> the reference to slave_dev. Am I missing something? > > > No, it should also put the device, as in eql_g_slave_cfg. Ok, I'm making a patch...will add this to it. How about this one. 
It seems like it does a dev_put when it shouldn't
(if some of the if's fail, the dev_get never happened):

net/sched/sch_generic.c

static void dev_watchdog(unsigned long arg)
{
	struct net_device *dev = (struct net_device *)arg;

	spin_lock(&dev->xmit_lock);
	if (dev->qdisc != &noop_qdisc) {
		if (netif_device_present(dev) &&
		    netif_running(dev) &&
		    netif_carrier_ok(dev)) {
			if (netif_queue_stopped(dev) &&
			    (jiffies - dev->trans_start) > dev->watchdog_timeo) {
				printk(KERN_INFO "NETDEV WATCHDOG: %s: transmit timed out\n", dev->name);
				dev->tx_timeout(dev);
			}
			if (!mod_timer(&dev->watchdog_timer, jiffies + dev->watchdog_timeo))
				dev_hold(dev);
		}
	}
	spin_unlock(&dev->xmit_lock);

	dev_put(dev);
}

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From arnaldo.melo@gmail.com Sat Aug 27 02:39:45 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 27 Aug 2005 02:39:50 -0700 (PDT)
Received: from zproxy.gmail.com (zproxy.gmail.com [64.233.162.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7R9diiL002807 for ; Sat, 27 Aug 2005 02:39:45 -0700
Received: by zproxy.gmail.com with SMTP id m22so386035nzf for ; Sat, 27 Aug 2005 02:37:17 -0700 (PDT)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=lP9V7IeAmE3MgqNimUGZSQ/q3AWBVwJ922TelQZeX5kbfy//wnfJwzAirSV263rOQYz94z3a8yRl6+sdQjPrXFlQ6EsZ+DU4l5QCo46cZcXR33+E2jVo+YMO2YIg5K31RJm8BtCkMdEj10JnwgBjPK1Hbk4z/3FP7p76pPVsW9c=
Received: by 10.37.15.27 with SMTP id s27mr437740nzi; Sat, 27 Aug 2005 02:37:17 -0700 (PDT)
Received: by 10.36.56.14 with HTTP; Sat, 27 Aug 2005 02:37:17 -0700 (PDT)
Message-ID: <39e6f6c705082702372dbc902d@mail.gmail.com>
Date: Sat, 27 Aug 2005 06:37:17 -0300
From: Arnaldo Carvalho de Melo
To: Ben Greear
Subject: Re: Leaked net-device reference in eql.c
Cc: Patrick McHardy , "netdev@oss.sgi.com"
In-Reply-To:
<43100709.1090800@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <430FAF0D.5050000@candelatech.com> <430FE01E.9020600@trash.net> <43100709.1090800@candelatech.com> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7R9diiL002807 X-archive-position: 3580 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: arnaldo.melo@gmail.com Precedence: bulk X-list: netdev Content-Length: 1889 Lines: 48 On 8/27/05, Ben Greear wrote: > Patrick McHardy wrote: > > Ben Greear wrote: > > > >> I think the eql_s_slave_cfg method in eql.c leaks > >> the reference to slave_dev. Am I missing something? > > > > > > No, it should also put the device, as in eql_g_slave_cfg. > > Ok, I'm making a patch...will add this to it. > > How about this one. It seems like it does a dev_put when it shouldn't > (if some of the if's fail, the dev_get never happened): > > net/sched/sch_generic.c > > static void dev_watchdog(unsigned long arg) > { > struct net_device *dev = (struct net_device *)arg; > > spin_lock(&dev->xmit_lock); > if (dev->qdisc != &noop_qdisc) { > if (netif_device_present(dev) && > netif_running(dev) && > netif_carrier_ok(dev)) { > if (netif_queue_stopped(dev) && > (jiffies - dev->trans_start) > dev->watchdog_timeo) { > printk(KERN_INFO "NETDEV WATCHDOG: %s: transmit timed out\n", dev->name); > dev->tx_timeout(dev); > } > if (!mod_timer(&dev->watchdog_timer, jiffies + dev->watchdog_timeo)) > dev_hold(dev); > } > } > spin_unlock(&dev->xmit_lock); > > dev_put(dev); > } Doesn't look like its a problem, its the classical case where when you associate some data structure to a timer you grab a refcount, when the timer expires you drop the refcount, and as the code above shows when mod_timer is succesfully called it grabs a reference, so the reference being dropped above is from the previous timer 
firing, now its just a matter if looking for the first mod_timer, that must be at some other place in sched_generic.c, lemme see... From arnaldo.melo@gmail.com Sat Aug 27 02:41:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 27 Aug 2005 02:41:45 -0700 (PDT) Received: from zproxy.gmail.com (zproxy.gmail.com [64.233.162.203]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7R9fciL003103 for ; Sat, 27 Aug 2005 02:41:39 -0700 Received: by zproxy.gmail.com with SMTP id m22so386096nzf for ; Sat, 27 Aug 2005 02:39:14 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=glo52mKe4mmhKIv6k5rtZwZbdLhvDwFdEDunry8lTRvbvoQkazW/4f/x6WET0yEVojg3LVS8lxjMJv1gqQ6Mu7jhyX7AdmM8KH7RhIc30Cz3kraUgSh5/s4Vw2VYy6OxabbhK+80Z7AEd2VRarRU/pf9cDKPjJHYwNg1fTBFZgc= Received: by 10.36.74.20 with SMTP id w20mr440222nza; Sat, 27 Aug 2005 02:39:14 -0700 (PDT) Received: by 10.36.56.14 with HTTP; Sat, 27 Aug 2005 02:39:14 -0700 (PDT) Message-ID: <39e6f6c705082702396591573d@mail.gmail.com> Date: Sat, 27 Aug 2005 06:39:14 -0300 From: Arnaldo Carvalho de Melo To: Ben Greear Subject: Re: Leaked net-device reference in eql.c Cc: Patrick McHardy , "netdev@oss.sgi.com" In-Reply-To: <39e6f6c705082702372dbc902d@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <430FAF0D.5050000@candelatech.com> <430FE01E.9020600@trash.net> <43100709.1090800@candelatech.com> <39e6f6c705082702372dbc902d@mail.gmail.com> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j7R9fciL003103 X-archive-position: 3581 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: arnaldo.melo@gmail.com Precedence: bulk X-list: netdev Content-Length: 2096 Lines: 53 On 
8/27/05, Arnaldo Carvalho de Melo wrote: > On 8/27/05, Ben Greear wrote: > > Patrick McHardy wrote: > > > Ben Greear wrote: > > > > > >> I think the eql_s_slave_cfg method in eql.c leaks > > >> the reference to slave_dev. Am I missing something? > > > > > > > > > No, it should also put the device, as in eql_g_slave_cfg. > > > > Ok, I'm making a patch...will add this to it. > > > > How about this one. It seems like it does a dev_put when it shouldn't > > (if some of the if's fail, the dev_get never happened): > > > > net/sched/sch_generic.c > > > > static void dev_watchdog(unsigned long arg) > > { > > struct net_device *dev = (struct net_device *)arg; > > > > spin_lock(&dev->xmit_lock); > > if (dev->qdisc != &noop_qdisc) { > > if (netif_device_present(dev) && > > netif_running(dev) && > > netif_carrier_ok(dev)) { > > if (netif_queue_stopped(dev) && > > (jiffies - dev->trans_start) > dev->watchdog_timeo) { > > printk(KERN_INFO "NETDEV WATCHDOG: %s: transmit timed out\n", dev->name); > > dev->tx_timeout(dev); > > } > > if (!mod_timer(&dev->watchdog_timer, jiffies + dev->watchdog_timeo)) > > dev_hold(dev); > > } > > } > > spin_unlock(&dev->xmit_lock); > > > > dev_put(dev); > > } > > Doesn't look like its a problem, its the classical case where when you > associate some data structure to a timer you grab a refcount, when the > timer expires you drop the refcount, and as the code above shows when > mod_timer is succesfully called it grabs a reference, so the reference > being dropped above is from the previous timer firing, now its just a matter > if looking for the first mod_timer, that must be at some other place in > sched_generic.c, lemme see... 

Yup, look at __netdev_watchdog_up :-)

- Arnaldo

From greearb@candelatech.com Sat Aug 27 23:45:23 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 27 Aug 2005 23:45:27 -0700 (PDT)
Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7S6jKiL021475 for ; Sat, 27 Aug 2005 23:45:22 -0700
Received: from [71.112.205.201] (pool-71-112-205-201.sttlwa.dsl-w.verizon.net [71.112.205.201]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j7S6kvo6026613 for ; Sat, 27 Aug 2005 23:46:58 -0700
Message-ID: <43115CEE.9030306@candelatech.com>
Date: Sat, 27 Aug 2005 23:42:54 -0700
From: Ben Greear
Organization: Candela Technologies
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050719 Fedora/1.7.10-1.3.1
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "'netdev@oss.sgi.com'"
Subject: Netdevice reference counting issues in net/core/dv.c
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 3582
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: greearb@candelatech.com
Precedence: bulk
X-list: netdev
Content-Length: 713
Lines: 23

dv.c has several issues. First, it uses the check_args method to
find the device. It acquires a hold on the device and then drops
it in the same method. Upon return from this check_args method,
code then continues to use the reference to the device. This could
lead to access-after-free errors.

Also, check_args has an arbitrary device-index check to make sure
it is less than 1000. This is bogus since we can have many more devices
than that...

If there is a maintainer that wants to fix this, please be my guest.
Otherwise, I'll make a stab at fixing it as part of my ref-count debugging
work.

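
A minimal user-space sketch of the hold/use/put ordering at issue here: the
lookup returns a held reference, and the caller drops it only after its last
use of the pointer. Everything below is a toy model, not kernel code. The
toy_* names are hypothetical stand-ins for dev_get_by_name(), dev_hold() and
dev_put(), and a plain int stands in for the real atomic reference count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy device with a plain (non-atomic) reference count. */
struct toy_dev {
	const char *name;
	int refcnt;
};

/* Each entry starts with one reference owned by the "registry". */
struct toy_dev toy_devs[] = {
	{ "eth0", 1 },
	{ "eth1", 1 },
};

void toy_dev_hold(struct toy_dev *d)
{
	d->refcnt++;
}

void toy_dev_put(struct toy_dev *d)
{
	assert(d->refcnt > 0);	/* catch an unbalanced put */
	d->refcnt--;
}

/* Look a device up by name; on success the caller owns one reference. */
struct toy_dev *toy_dev_get_by_name(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_devs) / sizeof(toy_devs[0]); i++) {
		if (strcmp(toy_devs[i].name, name) == 0) {
			toy_dev_hold(&toy_devs[i]);
			return &toy_devs[i];
		}
	}
	return NULL;
}

/*
 * Use the device while the reference is held, then put it on every
 * exit path. Dropping the reference before the strlen() call would
 * be the access-after-free pattern described above.
 */
int toy_use_device(const char *name, size_t *name_len)
{
	struct toy_dev *d = toy_dev_get_by_name(name);

	if (!d)
		return -1;
	*name_len = strlen(d->name);	/* last use of d */
	toy_dev_put(d);			/* balanced with the get above */
	return 0;
}
```

The same shape applies to the eql_s_slave_cfg case earlier in the thread,
where the lookup succeeds but no path ever drops the reference.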
Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From jacques.chion@wanadoo.fr Sun Aug 28 08:15:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 28 Aug 2005 08:16:06 -0700 (PDT) Received: from ftp.linux-mips.org (mail.linux-mips.org [62.254.210.162]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7SFFviL020342 for ; Sun, 28 Aug 2005 08:15:57 -0700 Received: from smtp10.wanadoo.fr ([IPv6:::ffff:193.252.22.21]:54505 "EHLO smtp10.wanadoo.fr") by linux-mips.org with ESMTP id ; Sun, 28 Aug 2005 16:07:33 +0100 Received: from me-wanadoo.net (localhost [127.0.0.1]) by mwinf1007.wanadoo.fr (SMTP Server) with ESMTP id 5EA3F2800148 for ; Sun, 28 Aug 2005 17:13:24 +0200 (CEST) Received: from cwoux2.cerise.fr (Mix-Lyon-302-2-177.w193-248.abo.wanadoo.fr [193.248.229.177]) by mwinf1007.wanadoo.fr (SMTP Server) with ESMTP id BBB1A2800139; Sun, 28 Aug 2005 17:13:23 +0200 (CEST) X-ME-UUID: 20050828151323768.BBB1A2800139@mwinf1007.wanadoo.fr Received: from cwoux2.cerise.fr (cwoux2.cerise.fr [127.0.0.1]) by cwoux2.cerise.fr =?ISO-8859-1?Q?(8.13.1/8.12.11/p=E9p=E8re?=) with ESMTP id j7SFDauh015526; Sun, 28 Aug 2005 17:13:36 +0200 Received: from localhost (jimmy@localhost) by cwoux2.cerise.fr =?ISO-8859-1?Q?(8.13.1/8.12.11/p=E9p=E8re?=) with ESMTP id j7SFCPb2015490; Sun, 28 Aug 2005 17:12:25 +0200 Date: Sun, 28 Aug 2005 17:12:25 +0200 (CEST) From: Jacques Chion To: Ralf Baechle DL5RB Cc: "David S. 
Miller" , , Subject: Re: [PATCH 3/3] Cleanup direct calls into IP stack In-Reply-To: <20050824171712.GB8367@linux-mips.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-15 Content-Transfer-Encoding: 8BIT X-archive-position: 3583 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Jacques.Chion@wanadoo.fr Precedence: bulk X-list: netdev Content-Length: 193 Lines: 11 I applied all these patches to the kernel 2.6.13-rc7 and i recompile ax25-tools. All is running fine. Best 73's -- F6CWO Quelques HOWTO en français: http://perso.wanadoo.fr/jacques.chion/ 
From jeremy.guthrie@berbee.com Tue Aug 30 08:12:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 30 Aug 2005 08:12:35 -0700 (PDT) Received: from ctg-msnexc01.staff.berbee.com (msn-office-flr2.binc.net [64.73.12.254]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7UFCViL023403 for ; Tue, 30 Aug 2005 08:12:32 -0700 Received: from localhost ([172.30.254.220] RDNS failed) by ctg-msnexc01.staff.berbee.com with Microsoft SMTPSVC(6.0.3790.211); Tue, 30 Aug 2005 10:10:04 -0500 From: "Jeremy M. 
Guthrie" Reply-To: jeremy.guthrie@berbee.com Organization: Berbee Information Networks To: netdev@oss.sgi.com Subject: Linux Policy Routing Feature Enhancement Request Date: Tue, 30 Aug 2005 10:10:00 -0500 User-Agent: KMail/1.8.2 MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1908724.MTYqe6KoFq"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200508301010.03420.jeremy.guthrie@berbee.com> X-OriginalArrivalTime: 30 Aug 2005 15:10:04.0279 (UTC) FILETIME=[E6600070:01C5AD74] X-archive-position: 3586 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeremy.guthrie@berbee.com Precedence: bulk X-list: netdev Content-Length: 977 Lines: 34 --nextPart1908724.MTYqe6KoFq Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline If possible, I'd like to put in a feature request to increase the number of available tables from 256 to 65536. The IDS load balancer can actually make use of all of them. My attempts to do this to 2.6.11 and 'ip' failed miserably. 8( -- -------------------------------------------------- Jeremy M. 
Guthrie jeremy.guthrie@berbee.com Senior Network Engineer Phone: 608-298-1061 Berbee Fax: 608-288-3007 5520 Research Park Drive NOC: 608-298-1102 Madison, WI 53711 --nextPart1908724.MTYqe6KoFq Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (GNU/Linux) iD8DBQBDFHbLqtjaBHGZBeURAridAJ9euIXuzH0faSSl8hHxCV9lMIkntwCfZVdw hEwIa6SFCRimrygkM0blIos= =5ocU -----END PGP SIGNATURE----- --nextPart1908724.MTYqe6KoFq-- From kaber@trash.net Tue Aug 30 15:39:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Tue, 30 Aug 2005 15:39:12 -0700 (PDT) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j7UMcxiL032314 for ; Tue, 30 Aug 2005 15:39:00 -0700 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.52) id 1EAEib-00072C-3L; Wed, 31 Aug 2005 00:36:37 +0200 Message-ID: <4314DF74.1030402@trash.net> Date: Wed, 31 Aug 2005 00:36:36 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.10) Gecko/20050803 Debian/1.7.10-1 X-Accept-Language: en MIME-Version: 1.0 To: John McGowan , mike@infonexus.com CC: linux-kernel@vger.kernel.org, Maillist netdev Subject: Re: Kernel 2.6.13: TCP (libnet?) References: <20050830194107.GA11652@localhost.localdomain> In-Reply-To: <20050830194107.GA11652@localhost.localdomain> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 3587 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Content-Length: 2433 Lines: 67 John McGowan wrote: > Kernel 2.6.13: TCP (libnet?) > > Broken libnet? > > KERNEL: linux-kernel@vger.kernel.org > LIBNET 1.1 (c) 1998 - 2004 Mike D. Schiffman > > I don't like spam. I track spamvertized sites. Many only respond to TCP > packets sent to port 80. I need a TCP traceroute (traceroute using TCP/SYN > packets). 
>
> I have four such programmes.
>
> 1: Hping in traceroute mode.
>    Poor. If it hits a router which does not respond, it just sits
>    and waits.
> 2: LFT
>    OK.
>    a: Does not work in Fedora Core2 - without patching.
>       The source code expects a header of zero bytes in the
>       pcap output of zero bytes (hard coded in the source).
>       My captures have a "linux cooked capture" header of sixteen bytes.
>       Changing an offset from zero to sixteen gets it to work.
>    b: Requires traffic on the interface.
>       It seems it gets into a loop and awaits some traffic.
>       It examines it - if it is data it expects it uses it.
>       If it is other data from other programmes accessing the 'net
>       it does nothing with it.
>       In both those cases it moves on and starts over.
>       What if there is no traffic? Unless there is something for it
>       either to use or ignore, it seems to hang. To get it to work
>       I have to, say, read the NY Times online while running it.
>       (I believe the traceproto site mentions doing something to
>       get around the timeout problem)
>    Output is OK - but I don't really like it.
> 3: Tcptraceroute
>    I have used this since kernel 2.2 through 2.4
>    (older version with older version of libnet) and
>    2.6.5, 2.6.7, 2.6.9, 2.6.10, 2.6.11, 2.6.12
>    It was my favourite until I got traceproto.
> 4: Traceproto
>    I have used this in kernels 2.4,
>    2.6.5, 2.6.7, 2.6.9, 2.6.10, 2.6.11, 2.6.12
>    Good.
>
>
> In kernel 2.6.13: [patching 2.1.12 with the patch file]
>
> Standard "traceroute" works.
> LFT works.
> HPING works (also in traceroute mode).
> tcptraceroute fails.
> traceproto (tcp or udp mode) fails.
>
> How do they fail?
>
> A TCPDUMP shows that they do send out the packets.
> I do get back ICMP "time exceeded" error messages.
> They no longer recognize them.
>
> Something that had never changed before has now changed
> and has broken traceproto and tcptraceroute.

[netdev CC'ed]

Could you provide tcpdump dumps and your .config file please?
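
The LFT problem in item 2a of the quoted message (a link-layer header length hard coded as zero, while the capture carries a sixteen-byte "linux cooked capture" header) can be sketched as follows. This is only an illustration, not LFT's code: the LT_* constants mirror libpcap's DLT_NULL, DLT_EN10MB and DLT_LINUX_SLL values but are redefined here so the example stands alone, and link_header_len() and ip_header() are hypothetical helper names. Ethernet frames carrying VLAN tags are ignored for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Link-layer types, using the same numeric values as libpcap's DLT_* macros. */
enum {
    LT_NULL      = 0,   /* BSD loopback: 4-byte header */
    LT_EN10MB    = 1,   /* Ethernet: 14-byte header (no VLAN tag) */
    LT_LINUX_SLL = 113  /* Linux cooked capture: 16-byte header */
};

/* Return the link-layer header length for a link type, or -1 if unknown. */
static int link_header_len(int linktype)
{
    switch (linktype) {
    case LT_NULL:
        return 4;
    case LT_EN10MB:
        return 14;
    case LT_LINUX_SLL:
        return 16;
    default:
        return -1;
    }
}

/*
 * Return a pointer to the start of the network-layer header inside a
 * captured packet, or NULL if the link type is unknown or the capture
 * is too short to contain anything after the link-layer header.
 */
static const unsigned char *ip_header(const unsigned char *pkt,
                                      size_t caplen, int linktype)
{
    int off = link_header_len(linktype);

    assert(pkt != NULL);
    if (off < 0 || caplen <= (size_t)off)
        return NULL;
    return pkt + off;
}
```

Deriving the offset from the capture's link type, rather than fixing it at compile time, lets the same parser work on both Ethernet and cooked captures.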