From akpm@osdl.org Thu Jul 1 00:29:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 00:29:20 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i617THgi028495 for ; Thu, 1 Jul 2004 00:29:17 -0700 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i617TBG29915 for ; Thu, 1 Jul 2004 00:29:11 -0700 Date: Thu, 1 Jul 2004 00:28:14 -0700 From: Andrew Morton To: netdev@oss.sgi.com Subject: Fw: [Bugme-new] [Bug 2991] New: vlan (8021q) is working bad with some sites and some ports Message-Id: <20040701002814.7f3bc516.akpm@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 6487 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Begin forwarded message: Date: Thu, 1 Jul 2004 00:19:15 -0700 From: bugme-daemon@osdl.org To: bugme-new@lists.osdl.org Subject: [Bugme-new] [Bug 2991] New: vlan (8021q) is working bad with some sites and some ports http://bugme.osdl.org/show_bug.cgi?id=2991 Summary: vlan (8021q) is working bad with some sites and some ports Kernel Version: 2.6.7 Status: NEW Severity: high Owner: niv@us.ibm.com Submitter: roxwal@libero.it Distribution: Vlan (8021q) lost ipv4 packet Hardware Environment: HP tc3100 + switch 3com 3c16886A Software Environment: Mandrake 10 official + kernel 2.6.7-2mdk Problem Description: While using a VLAN interface (eth0.2) for the Internet connection, most sites don't work (e.g. www.ziobudda.net fails, punto-informatico.it works). The connection starts but no packets are received.
tcpdump -i eth0.2 -vv -X host www.ziobudda.net tcpdump: listening on eth0.2, link-type EN10MB (Ethernet), capture size 96 bytes 09:03:48.715078 IP (tos 0x0, ttl 64, id 51924, offset 0, flags [DF], length: 52) 81.112.63.189.32933 > 193.254.241.4.http: F [tcp sum ok] 2920000677:2920000677(0) ack 2170843880 win 5840 0x0000 4500 0034 cad4 4000 4006 2bbf 5170 3fbd E..4..@.@.+.Qp?. 0x0010 c1fe f104 80a5 0050 ae0b aca5 8164 72e8 .......P.....dr. 0x0020 8011 16d0 df2c 0000 0101 080a 001c ed47 .....,.........G 0x0030 102b 6f0c .+o. 09:03:48.723397 IP (tos 0x0, ttl 54, id 7661, offset 0, flags [DF], length: 52) 193.254.241.4.http > 81.112.63.189.32933: . [tcp sum ok] 2897:2897(0) ack 1 win 6432 0x0000 4500 0034 1ded 4000 3606 e2a6 c1fe f104 E..4..@.6....... 0x0010 5170 3fbd 0050 80a5 8164 7e38 ae0b aca6 Qp?..P...d~8.... 0x0020 8010 1920 c39b 0000 0101 080a 102b 7cfd .............+|. 0x0030 001c ed47 ...G 09:03:48.826211 IP (tos 0x0, ttl 64, id 20222, offset 0, flags [DF], length: 60) 81.112.63.189.32934 > 193.254.241.4.http: S [tcp sum ok] 2985141064:2985141064(0) win 5840 0x0000 4500 003c 4efe 4000 4006 a78d 5170 3fbd E.. 81.112.63.189.32934: S [tcp sum ok] 2195619523:2195619523 (0) ack 2985141065 win 5792 0x0000 4500 003c 0000 4000 3606 008c c1fe f104 E..<..@.6....... 0x0010 5170 3fbd 0050 80a6 82de 7ec3 b1ed a349 Qp?..P....~....I 0x0020 a012 16a0 9a50 0000 0204 05b4 0402 080a .....P.......... 0x0030 102b 7d08 001c edb6 0103 0300 .+}......... 09:03:48.834228 IP (tos 0x0, ttl 64, id 20223, offset 0, flags [DF], length: 52) 81.112.63.189.32934 > 193.254.241.4.http: . [tcp sum ok] 1:1(0) ack 1 win 5840 0x0000 4500 0034 4eff 4000 4006 a794 5170 3fbd E..4N.@.@...Qp?. 0x0010 c1fe f104 80a6 0050 b1ed a349 82de 7ec4 .......P...I..~. 0x0020 8010 16d0 c8dd 0000 0101 080a 001c edbe ................ 0x0030 102b 7d08 .+}. 
09:03:48.836162 IP (tos 0x0, ttl 64, id 20224, offset 0, flags [DF], length: 451) 81.112.63.189.32934 > 193.254.241.4.http: P 1:400(399) ack 1 win 5840 0x0000 4500 01c3 4f00 4000 4006 a604 5170 3fbd E...O.@.@...Qp?. 0x0010 c1fe f104 80a6 0050 b1ed a349 82de 7ec4 .......P...I..~. 0x0020 8018 16d0 78ce 0000 0101 080a 001c edc0 ....x........... 0x0030 102b 7d08 4745 5420 2f20 4854 5450 2f31 .+}.GET./.HTTP/1 0x0040 2e31 0d0a 436f 6e6e 6563 7469 6f6e 3a20 .1..Connection:. 0x0050 4b65 Ke 09:03:48.846287 IP (tos 0x0, ttl 54, id 973, offset 0, flags [DF], length: 52) 193.254.241.4.http > 81.112.63.189.32934: . [tcp sum ok] 1:1(0) ack 400 win 6432 0x0000 4500 0034 03cd 4000 3606 fcc6 c1fe f104 E..4..@.6....... 0x0010 5170 3fbd 0050 80a6 82de 7ec4 b1ed a4d8 Qp?..P....~..... 0x0020 8010 1920 c4fb 0000 0101 080a 102b 7d09 .............+}. 0x0030 001c edc0 (no more) Ping works, and telnet to other ports (any but 80) works. If I use another network card for a direct connection (eth1), Internet connections work perfectly. Steps to reproduce: hdsl + router cisco + 3com switch layer 3 + vlan; eth0 = local lan, eth0.2 = internet; run: links www.ziobudda.net tks W:-} ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
From laforge@netfilter.org Thu Jul 1 02:10:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 02:10:57 -0700 (PDT) Received: from ganesha.gnumonks.org (Debian-exim@ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i619Aqgi006577 for ; Thu, 1 Jul 2004 02:10:53 -0700 Received: from dsl-082-083-225-178.arcor-ip.net ([82.83.225.178] helo=sunbeam.gnumonks.org) by ganesha.gnumonks.org with asmtp (TLSv1:RC4-SHA:128) (Exim 4.30) id 1Bfxak-0006Lt-Go; Thu, 01 Jul 2004 11:10:50 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.22) id 1Bfxai-0000u5-II; Thu, 01 Jul 2004 11:10:48 +0200 Date: Thu, 1 Jul 2004 11:10:48 +0200 From: Harald Welte To: "David S. Miller" Cc: James Morris , netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com, arjanv@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: Remote DoS vulnerability in Linux kernel 2.6.x (fwd) Message-ID: <20040701091048.GR1410@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , "David S. Miller" , James Morris , netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com, arjanv@redhat.com, kuznet@ms2.inr.ac.ru References: <20040630144230.1d52864b.davem@redhat.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="pWJxWxNlJUNgDlXi" Content-Disposition: inline In-Reply-To: <20040630144230.1d52864b.davem@redhat.com> User-Agent: Mutt/1.5.5.1+cvs20040105i X-archive-position: 6488 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --pWJxWxNlJUNgDlXi Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jun 30, 2004 at 02:42:30PM -0700, David S. 
Miller wrote: > This bug only came up because of the huge change Rusty and Harald did > to make these modules not access the SKB header data directly, and > instead to use local on-stack copies and skb_copy_bits(). A change we had to make in order not to assume a fully linearized packet, including the TCP header. I suppose the trivial fix has already been pushed upstream... Very unfortunate that vendors weren't informed in advance :( -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --pWJxWxNlJUNgDlXi Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFA49UYXaXGVTD0i/8RAq4sAJ9GVk5gzRKJAiGgtIWalh/ydPTYtACeP6gt vNB5hX9KKHHjwltrbeQalVo= =skaj -----END PGP SIGNATURE----- --pWJxWxNlJUNgDlXi-- From herbert@gondor.apana.org.au Thu Jul 1 04:04:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 04:04:48 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61B4cgi013648 for ; Thu, 1 Jul 2004 04:04:39 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1BfzMi-0002Jv-00; Thu, 01 Jul 2004 21:04:28 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1BfzMY-0002oO-00; Thu, 01 Jul 2004 21:04:18 +1000 Date: Thu, 1 Jul 2004 21:04:18 +1000 To: Jeff Garzik , netdev@oss.sgi.com Subject: Resend: [NETDRV] Fix
successive calls to spin_lock_irqsave in sk98lin Message-ID: <20040701110418.GA10797@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="Nq2Wo0NMKNjxTN9z" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040523i From: Herbert Xu X-archive-position: 6489 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --Nq2Wo0NMKNjxTN9z Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi Jeff: This patch fixes the few places in sk98lin where it calls spin_lock_irqsave a second time on the same flags variable, overwriting the saved interrupt state and thus leaving interrupts disabled upon leaving the driver. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --Nq2Wo0NMKNjxTN9z Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p Index: drivers/net/sk98lin/skge.c =================================================================== RCS file: /home/gondolin/herbert/src/CVS/debian/kernel-source-2.5/drivers/net/sk98lin/skge.c,v retrieving revision 1.1.1.17 diff -u -r1.1.1.17 skge.c --- drivers/net/sk98lin/skge.c 10 May 2004 09:47:55 -0000 1.1.1.17 +++ drivers/net/sk98lin/skge.c 22 Jun 2004 10:45:23 -0000 @@ -3093,8 +3093,7 @@ SkEventDispatcher(pAC, pAC->IoBase); for (i=0; i<pAC->GIni.GIMacsFound; i++) { - spin_lock_irqsave( - &pAC->TxPort[i][TX_PRIO_LOW].TxDesRingLock, Flags); + spin_lock(&pAC->TxPort[i][TX_PRIO_LOW].TxDesRingLock); netif_stop_queue(pAC->dev[i]); } @@ -4773,12 +4772,10 @@ spin_lock_irqsave( &pAC->TxPort[FromPort][TX_PRIO_LOW].TxDesRingLock, Flags); - spin_lock_irqsave( - &pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock, Flags); + spin_lock(&pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock); SkGeStopPort(pAC, IoC, FromPort, SK_STOP_ALL, SK_SOFT_RST); SkGeStopPort(pAC, IoC,
ToPort, SK_STOP_ALL, SK_SOFT_RST); - spin_unlock_irqrestore( - &pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock, Flags); + spin_unlock(&pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock); spin_unlock_irqrestore( &pAC->TxPort[FromPort][TX_PRIO_LOW].TxDesRingLock, Flags); @@ -4791,8 +4788,7 @@ spin_lock_irqsave( &pAC->TxPort[FromPort][TX_PRIO_LOW].TxDesRingLock, Flags); - spin_lock_irqsave( - &pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock, Flags); + spin_lock(&pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock); pAC->ActivePort = ToPort; #if 0 SetQueueSizes(pAC); @@ -4807,8 +4803,7 @@ pAC, pAC->ActivePort, DualNet)) { - spin_unlock_irqrestore( - &pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock, Flags); + spin_unlock(&pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock); spin_unlock_irqrestore( &pAC->TxPort[FromPort][TX_PRIO_LOW].TxDesRingLock, Flags); @@ -4834,8 +4829,7 @@ SkGePollTxD(pAC, IoC, ToPort, SK_TRUE); ClearAndStartRx(pAC, FromPort); ClearAndStartRx(pAC, ToPort); - spin_unlock_irqrestore( - &pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock, Flags); + spin_unlock(&pAC->TxPort[ToPort][TX_PRIO_LOW].TxDesRingLock); spin_unlock_irqrestore( &pAC->TxPort[FromPort][TX_PRIO_LOW].TxDesRingLock, Flags); --Nq2Wo0NMKNjxTN9z-- From herbert@gondor.apana.org.au Thu Jul 1 04:19:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 04:19:51 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61BJUgi017548 for ; Thu, 1 Jul 2004 04:19:31 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1Bfzb7-0002P6-00; Thu, 01 Jul 2004 21:19:21 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1Bfzb0-0002td-00; Thu, 01 Jul 2004 21:19:14 +1000 Date: Thu, 1 Jul 2004 21:19:14 +1000 To: Jeff Garzik , netdev@oss.sgi.com Subject: Resend: [NETDRV] Merge register_netdev calls Message-ID: 
<20040701111914.GA11120@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="sdtB3X0nJg68CQEu" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040523i From: Herbert Xu X-archive-position: 6490 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --sdtB3X0nJg68CQEu Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Fri, Jun 11, 2004 at 12:08:55PM +1000, herbert wrote: > > In fact it's really making these ISA/MCA probe() functions more > like the ones we have for PCI. To illustrate this, let's take the first driver touched by 4/x. In 3c503, the function init_module() essentially does for each ioaddr if (do_el2_probe(ioaddr) == 0) return 0 And do_el2_probe() just calls el2_probe1() which is similar to your average PCI probe function except that the first thing it does is to make sure that the device exists at ioaddr. This is not that different from PCI where it would look like for each PCI device matching the vendor/product numbers if (do_el2_probe(device) == 0) return 0 Now before my patch, register_netdev was being called just after do_el2_probe() returns. My patch simply moves it to the end of el2_probe1() which is exactly what would happen if this were a PCI driver. I've rediffed it against your net-drivers-2.6 tree. 
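[Editorial illustration, not part of the original message.] The reordering described above can be sketched in plain userspace C. This is only a rough model under stated assumptions: struct fake_dev, fake_register(), fake_probe1() and fake_init() are hypothetical stand-ins invented for this sketch, not kernel interfaces, and the real drivers call register_netdev() and release real I/O regions and IRQs. The point it tries to show is that once registration is the last step of the probe1-style function, that function unwinds its own resources on failure and the caller is reduced to a simple loop.

```c
/* Userspace sketch only. struct fake_dev, fake_register(), fake_probe1()
 * and fake_init() are hypothetical stand-ins, not kernel APIs. */
#include <errno.h>
#include <stddef.h>

struct fake_dev {
	int ioaddr;          /* address the device was found at */
	int resources_held;  /* stands in for a claimed I/O region or IRQ */
	int registered;      /* stands in for a successful register_netdev() */
};

/* Stand-in for register_netdev(); can be told to fail for testing. */
static int fake_register(struct fake_dev *dev, int should_fail)
{
	if (should_fail)
		return -ENOMEM;
	dev->registered = 1;
	return 0;
}

/* Probe one address: check the device is there, claim resources, and
 * register as the final step. On failure, release what was claimed here. */
static int fake_probe1(struct fake_dev *dev, int ioaddr, int present,
		       int reg_fails)
{
	int err;

	if (!present)
		return -ENODEV;
	dev->ioaddr = ioaddr;
	dev->resources_held = 1;

	err = fake_register(dev, reg_fails);
	if (err)
		goto out;
	return 0;
out:
	dev->resources_held = 0;
	return err;
}

/* The caller no longer registers or cleans up; it only iterates. */
static int fake_init(struct fake_dev *dev, const int *ioaddrs, size_t n,
		     int present_at)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (fake_probe1(dev, ioaddrs[i], ioaddrs[i] == present_at, 0) == 0)
			return 0;
	return -ENODEV;
}
```

Calling fake_init() with a list of candidate addresses returns 0 as soon as one probe succeeds; a registration failure inside fake_probe1() leaves the device with no resources held, so the caller needs no separate cleanup path.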
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --sdtB3X0nJg68CQEu Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p ===== drivers/net/3c503.c 1.20 vs edited ===== --- 1.20/drivers/net/3c503.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/3c503.c 2004-07-01 21:16:27 +10:00 @@ -162,12 +162,7 @@ err = do_el2_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -343,6 +338,10 @@ dev->poll_controller = ei_poll; #endif + retval = register_netdev(dev); + if (retval) + goto out1; + if (dev->mem_start) printk("%s: %s - %dkB RAM, 8kB shared mem window at %#6lx-%#6lx.\n", dev->name, ei_status.name, (wordlength+1)<<3, @@ -702,11 +701,8 @@ dev->base_addr = io[this_dev]; dev->mem_end = xcvr[this_dev]; /* low 4bits = xcvr sel. 
*/ if (do_el2_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_el2[found++] = dev; - continue; - } - cleanup_card(dev); + dev_el2[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "3c503.c: No 3c503 card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/3c515.c 1.29 vs edited ===== --- 1.29/drivers/net/3c515.c 2004-03-30 20:17:59 +10:00 +++ edited/drivers/net/3c515.c 2004-07-01 21:16:28 +10:00 @@ -373,7 +373,7 @@ #endif /* __ISAPNP__ */ static struct net_device *corkscrew_scan(int unit); -static void corkscrew_setup(struct net_device *dev, int ioaddr, +static int corkscrew_setup(struct net_device *dev, int ioaddr, struct pnp_dev *idev, int card_number); static int corkscrew_open(struct net_device *dev); static void corkscrew_timer(unsigned long arg); @@ -537,10 +537,9 @@ printk(KERN_INFO "3c515 Resource configuration register %#4.4x, DCR %4.4x.\n", inl(ioaddr + 0x2002), inw(ioaddr + 0x2000)); /* irq = inw(ioaddr + 0x2002) & 15; */ /* Use the irq from isapnp */ - corkscrew_setup(dev, ioaddr, idev, cards_found++); SET_NETDEV_DEV(dev, &idev->dev); pnp_cards++; - err = register_netdev(dev); + err = corkscrew_setup(dev, ioaddr, idev, cards_found++); if (!err) return dev; cleanup_card(dev); @@ -556,8 +555,7 @@ printk(KERN_INFO "3c515 Resource configuration register %#4.4x, DCR %4.4x.\n", inl(ioaddr + 0x2002), inw(ioaddr + 0x2000)); - corkscrew_setup(dev, ioaddr, NULL, cards_found++); - err = register_netdev(dev); + err = corkscrew_setup(dev, ioaddr, NULL, cards_found++); if (!err) return dev; cleanup_card(dev); @@ -566,7 +564,7 @@ return NULL; } -static void corkscrew_setup(struct net_device *dev, int ioaddr, +static int corkscrew_setup(struct net_device *dev, int ioaddr, struct pnp_dev *idev, int card_number) { struct corkscrew_private *vp = (struct corkscrew_private *) dev->priv; @@ -689,6 +687,8 @@ dev->get_stats = &corkscrew_get_stats; dev->set_multicast_list = &set_rx_mode; dev->ethtool_ops = &netdev_ethtool_ops; + + return 
register_netdev(dev); } ===== drivers/net/3c523.c 1.16 vs edited ===== --- 1.16/drivers/net/3c523.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/3c523.c 2004-07-01 21:16:28 +10:00 @@ -572,6 +572,10 @@ dev->flags&=~IFF_MULTICAST; /* Multicast doesn't work */ #endif + retval = register_netdev(dev); + if (retval) + goto err_out; + return 0; err_out: mca_set_adapter_procfn(slot, NULL, NULL); @@ -600,12 +604,7 @@ err = do_elmc_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -1288,12 +1287,9 @@ dev->irq=irq[this_dev]; dev->base_addr=io[this_dev]; if (do_elmc_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_elmc[this_dev] = dev; - found++; - continue; - } - cleanup_card(dev); + dev_elmc[this_dev] = dev; + found++; + continue; } free_netdev(dev); if (io[this_dev]==0) ===== drivers/net/ac3200.c 1.19 vs edited ===== --- 1.19/drivers/net/ac3200.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/ac3200.c 2004-07-01 21:16:29 +10:00 @@ -147,12 +147,7 @@ err = do_ac3200_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -284,7 +279,14 @@ dev->poll_controller = ei_poll; #endif NS8390_init(dev, 0); + + retval = register_netdev(dev); + if (retval) + goto out2; return 0; +out2: + if (ei_status.reg0) + iounmap((void *)dev->mem_start); out1: free_irq(dev->irq, dev); out: @@ -402,11 +404,8 @@ dev->base_addr = io[this_dev]; dev->mem_start = mem[this_dev]; /* Currently ignored by driver */ if (do_ac3200_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_ac32[found++] = dev; - continue; - } - cleanup_card(dev); + dev_ac32[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "ac3200.c: No ac3200 card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/cs89x0.c 1.24 vs edited ===== --- 
1.24/drivers/net/cs89x0.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/cs89x0.c 2004-07-01 21:16:29 +10:00 @@ -307,13 +307,7 @@ } if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - outw(PP_ChipID, dev->base_addr + ADD_PORT); - release_region(dev->base_addr, NETCARD_IO_EXTENT); out: free_netdev(dev); printk(KERN_WARNING "cs89x0: no cs8900 or cs8920 detected. Be sure to disable PnP with SETUP\n"); @@ -718,7 +712,13 @@ printk("\n"); if (net_debug) printk("cs89x0_probe1() successful\n"); + + retval = register_netdev(dev); + if (retval) + goto out3; return 0; +out3: + outw(PP_ChipID, dev->base_addr + ADD_PORT); out2: release_region(ioaddr & ~3, NETCARD_IO_EXTENT); out1: @@ -1806,13 +1806,6 @@ if (ret) goto out; - if (register_netdev(dev) != 0) { - printk(KERN_ERR "cs89x0.c: No card found at 0x%x\n", io); - ret = -ENXIO; - outw(PP_ChipID, dev->base_addr + ADD_PORT); - release_region(dev->base_addr, NETCARD_IO_EXTENT); - goto out; - } dev_cs89x0 = dev; return 0; out: ===== drivers/net/e2100.c 1.18 vs edited ===== --- 1.18/drivers/net/e2100.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/e2100.c 2004-07-01 21:16:29 +10:00 @@ -161,12 +161,7 @@ err = do_e2100_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -278,6 +273,9 @@ #endif NS8390_init(dev, 0); + retval = register_netdev(dev); + if (retval) + goto out; return 0; out: release_region(ioaddr, E21_IO_EXTENT); @@ -445,11 +443,8 @@ dev->mem_start = mem[this_dev]; dev->mem_end = xcvr[this_dev]; /* low 4bits = xcvr sel. 
*/ if (do_e2100_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_e21[found++] = dev; - continue; - } - cleanup_card(dev); + dev_e21[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "e2100.c: No E2100 card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/eepro.c 1.25 vs edited ===== --- 1.25/drivers/net/eepro.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/eepro.c 2004-07-01 21:16:29 +10:00 @@ -596,12 +596,7 @@ err = do_eepro_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - release_region(dev->base_addr, EEPRO_IO_EXTENT); out: free_netdev(dev); return ERR_PTR(err); @@ -747,6 +742,7 @@ struct eepro_local *lp; enum iftype { AUI=0, BNC=1, TPE=2 }; int ioaddr = dev->base_addr; + int err; /* Grab the region so we can find another board if autoIRQ fails. */ if (!request_region(ioaddr, EEPRO_IO_EXTENT, DRV_NAME)) { @@ -856,10 +852,16 @@ /* reset 82595 */ eepro_reset(ioaddr); + + err = register_netdev(dev); + if (err) + goto err; return 0; exit: + err = -ENODEV; +err: release_region(dev->base_addr, EEPRO_IO_EXTENT); - return -ENODEV; + return err; } /* Open/initialize the board. 
This is called (in the current kernel) @@ -1756,11 +1758,8 @@ dev->irq = irq[i]; if (do_eepro_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_eepro[n_eepro++] = dev; - continue; - } - release_region(dev->base_addr, EEPRO_IO_EXTENT); + dev_eepro[n_eepro++] = dev; + continue; } free_netdev(dev); break; ===== drivers/net/eexpress.c 1.18 vs edited ===== --- 1.18/drivers/net/eexpress.c 2004-05-21 07:16:23 +10:00 +++ edited/drivers/net/eexpress.c 2004-07-01 21:16:30 +10:00 @@ -436,11 +436,8 @@ netdev_boot_setup_check(dev); err = do_express_probe(dev); - if (!err) { - err = register_netdev(dev); - if (!err) - return dev; - } + if (!err) + return dev; free_netdev(dev); return ERR_PTR(err); } @@ -1205,7 +1202,8 @@ dev->set_multicast_list = &eexp_set_multicast; dev->tx_timeout = eexp_timeout; dev->watchdog_timeo = 2*HZ; - return 0; + + return register_netdev(dev); } /* @@ -1716,7 +1714,7 @@ break; printk(KERN_NOTICE "eexpress.c: Module autoprobe not recommended, give io=xx.\n"); } - if (do_express_probe(dev) == 0 && register_netdev(dev) == 0) { + if (do_express_probe(dev) == 0) { dev_eexp[this_dev] = dev; found++; continue; ===== drivers/net/es3210.c 1.13 vs edited ===== --- 1.13/drivers/net/es3210.c 2004-05-21 07:16:23 +10:00 +++ edited/drivers/net/es3210.c 2004-07-01 21:16:30 +10:00 @@ -176,12 +176,7 @@ err = do_es_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -304,6 +299,10 @@ dev->poll_controller = ei_poll; #endif NS8390_init(dev, 0); + + retval = register_netdev(dev); + if (retval) + goto out1; return 0; out1: free_irq(dev->irq, dev); @@ -439,11 +438,8 @@ dev->base_addr = io[this_dev]; dev->mem_start = mem[this_dev]; if (do_es_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_es3210[found++] = dev; - continue; - } - cleanup_card(dev); + dev_es3210[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "es3210.c: 
No es3210 card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/hp.c 1.14 vs edited ===== --- 1.14/drivers/net/hp.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/hp.c 2004-07-01 21:16:30 +10:00 @@ -123,12 +123,7 @@ err = do_hp_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -227,7 +222,12 @@ ei_status.block_output = &hp_block_output; hp_init_card(dev); + retval = register_netdev(dev); + if (retval) + goto out1; return 0; +out1: + free_irq(dev->irq, dev); out: release_region(ioaddr, HP_IO_EXTENT); return retval; @@ -432,11 +432,8 @@ dev->irq = irq[this_dev]; dev->base_addr = io[this_dev]; if (do_hp_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_hp[found++] = dev; - continue; - } - cleanup_card(dev); + dev_hp[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "hp.c: No HP card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/eth16i.c 1.19 vs edited ===== --- 1.19/drivers/net/eth16i.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/eth16i.c 2004-07-01 21:16:30 +10:00 @@ -473,13 +473,7 @@ err = do_eth16i_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - free_irq(dev->irq, dev); - release_region(dev->base_addr, ETH16I_IO_EXTENT); out: free_netdev(dev); return ERR_PTR(err); @@ -569,7 +563,13 @@ dev->tx_timeout = eth16i_timeout; dev->watchdog_timeo = TX_TIMEOUT; spin_lock_init(&lp->lock); + + retval = register_netdev(dev); + if (retval) + goto out1; return 0; +out1: + free_irq(dev->irq, dev); out: release_region(ioaddr, ETH16I_IO_EXTENT); return retval; @@ -1462,12 +1462,8 @@ } if (do_eth16i_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_eth16i[found++] = dev; - continue; - } - free_irq(dev->irq, dev); - release_region(dev->base_addr, ETH16I_IO_EXTENT); + dev_eth16i[found++] = dev; + continue; } printk(KERN_WARNING "eth16i.c No 
Eth16i card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/hp-plus.c 1.16 vs edited ===== --- 1.16/drivers/net/hp-plus.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/hp-plus.c 2004-07-01 21:16:30 +10:00 @@ -159,12 +159,7 @@ err = do_hpp_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -271,6 +266,9 @@ /* Leave the 8390 and HP chip reset. */ outw(inw(ioaddr + HPP_OPTION) & ~EnableIRQ, ioaddr + HPP_OPTION); + retval = register_netdev(dev); + if (retval) + goto out; return 0; out: release_region(ioaddr, HP_IO_EXTENT); @@ -463,11 +461,8 @@ dev->irq = irq[this_dev]; dev->base_addr = io[this_dev]; if (do_hpp_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_hpp[found++] = dev; - continue; - } - cleanup_card(dev); + dev_hpp[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "hp-plus.c: No HP-Plus card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/hp100.c 1.28 vs edited ===== --- 1.28/drivers/net/hp100.c 2004-05-21 07:16:23 +10:00 +++ edited/drivers/net/hp100.c 2004-07-01 21:16:31 +10:00 @@ -411,12 +411,7 @@ if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; - out1: - release_region(dev->base_addr, HP100_REGION_SIZE); out: free_netdev(dev); return ERR_PTR(err); @@ -770,11 +765,22 @@ printk("Warning! 
Link down.\n"); } + err = register_netdev(dev); + if (err) + goto out3; + return 0; +out3: + if (local_mode == 1) + pci_free_consistent(lp->pci_dev, MAX_RINGSIZE + 0x0f, + lp->page_vaddr_algn, + virt_to_whatever(dev, lp->page_vaddr_algn)); + if (mem_ptr_virt) + iounmap(mem_ptr_virt); out2: release_region(ioaddr, HP100_REGION_SIZE); out1: - return -ENODEV; + return err; } /* This procedure puts the card into a stable init state */ @@ -2868,18 +2874,12 @@ if (err) goto out1; - err = register_netdev(dev); - if (err) - goto out2; - #ifdef HP100_DEBUG printk("hp100: %s: EISA adapter found at 0x%x\n", dev->name, dev->base_addr); #endif gendev->driver_data = dev; return 0; - out2: - release_region(dev->base_addr, HP100_REGION_SIZE); out1: free_netdev(dev); return err; @@ -2938,17 +2938,12 @@ err = hp100_probe1(dev, ioaddr, HP100_BUS_PCI, pdev); if (err) goto out1; - err = register_netdev(dev); - if (err) - goto out2; #ifdef HP100_DEBUG printk("hp100: %s: PCI adapter found at 0x%x\n", dev->name, ioaddr); #endif pci_set_drvdata(pdev, dev); return 0; - out2: - release_region(dev->base_addr, HP100_REGION_SIZE); out1: free_netdev(dev); return err; @@ -3016,15 +3011,9 @@ SET_MODULE_OWNER(dev); err = hp100_isa_probe(dev, hp100_port[i]); - if (!err) { - err = register_netdev(dev); - if (!err) - hp100_devlist[cards++] = dev; - else - release_region(dev->base_addr, HP100_REGION_SIZE); - } - - if (err) + if (!err) + hp100_devlist[cards++] = dev; + else free_netdev(dev); } ===== drivers/net/isa-skeleton.c 1.13 vs edited ===== --- 1.13/drivers/net/isa-skeleton.c 2004-05-21 07:16:23 +10:00 +++ edited/drivers/net/isa-skeleton.c 2004-07-01 21:16:31 +10:00 @@ -176,12 +176,7 @@ err = do_netcard_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -316,7 +311,15 @@ dev->tx_timeout = &net_tx_timeout; dev->watchdog_timeo = MY_TX_TIMEOUT; + + err = register_netdev(dev); + if 
(err) + goto out2; return 0; +out2: +#ifdef jumpered_dma + free_dma(dev->dma); +#endif out1: #ifdef jumpered_interrupts free_irq(dev->irq, dev); @@ -691,11 +694,8 @@ dev->dma = dma; dev->mem_start = mem; if (do_netcard_probe(dev) == 0) { - if (register_netdev(dev) == 0) - this_device = dev; - return 0; - } - cleanup_card(dev); + this_device = dev; + return 0; } free_netdev(dev); return -ENXIO; ===== drivers/net/lance.c 1.22 vs edited ===== --- 1.22/drivers/net/lance.c 2004-05-21 07:16:23 +10:00 +++ edited/drivers/net/lance.c 2004-07-01 21:16:31 +10:00 @@ -355,11 +355,8 @@ dev->base_addr = io[this_dev]; dev->dma = dma[this_dev]; if (do_lance_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_lance[found++] = dev; - continue; - } - cleanup_card(dev); + dev_lance[found++] = dev; + continue; } free_netdev(dev); break; @@ -447,12 +444,7 @@ err = do_lance_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -723,6 +715,9 @@ dev->tx_timeout = lance_tx_timeout; dev->watchdog_timeo = TX_TIMEOUT; + err = register_netdev(dev); + if (err) + goto out_dma; return 0; out_dma: if (dev->dma != 4) ===== drivers/net/ne.c 1.22 vs edited ===== --- 1.22/drivers/net/ne.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/ne.c 2004-07-01 21:16:32 +10:00 @@ -220,12 +220,7 @@ err = do_ne_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -506,8 +501,14 @@ dev->poll_controller = ei_poll; #endif NS8390_init(dev, 0); + + ret = register_netdev(dev); + if (ret) + goto out_irq; return 0; +out_irq: + free_irq(dev->irq, dev); err_out: release_region(ioaddr, NE_IO_EXTENT); return ret; @@ -798,11 +799,8 @@ dev->mem_end = bad[this_dev]; dev->base_addr = io[this_dev]; if (do_ne_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_ne[found++] = dev; - 
continue; - } - cleanup_card(dev); + dev_ne[found++] = dev; + continue; } free_netdev(dev); if (found) ===== drivers/net/lne390.c 1.14 vs edited ===== --- 1.14/drivers/net/lne390.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/lne390.c 2004-07-01 21:16:31 +10:00 @@ -168,12 +168,7 @@ err = do_lne390_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -307,7 +302,14 @@ dev->poll_controller = ei_poll; #endif NS8390_init(dev, 0); + + ret = register_netdev(dev); + if (ret) + goto unmap; return 0; +unmap: + if (ei_status.reg0) + iounmap((void *)dev->mem_start); cleanup: free_irq(dev->irq, dev); return ret; @@ -436,11 +438,8 @@ dev->base_addr = io[this_dev]; dev->mem_start = mem[this_dev]; if (do_lne390_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_lne[found++] = dev; - continue; - } - cleanup_card(dev); + dev_lne[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "lne390.c: No LNE390 card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/ne2.c 1.16 vs edited ===== --- 1.16/drivers/net/ne2.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/ne2.c 2004-07-01 21:16:32 +10:00 @@ -301,12 +301,7 @@ err = do_ne2_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -517,7 +512,14 @@ dev->poll_controller = ei_poll; #endif NS8390_init(dev, 0); + + retval = register_netdev(dev); + if (retval) + goto out1; return 0; +out1: + mca_set_adapter_procfn( ei_status.priv, NULL, NULL); + free_irq(dev->irq, dev); out: release_region(base_addr, NE_IO_EXTENT); return retval; @@ -800,11 +802,8 @@ dev->mem_end = bad[this_dev]; dev->base_addr = io[this_dev]; if (do_ne2_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_ne[found++] = dev; - continue; - } - cleanup_card(dev); + dev_ne[found++] = dev; + continue; } 
free_netdev(dev); break; ===== drivers/net/ne-h8300.c 1.4 vs edited ===== --- 1.4/drivers/net/ne-h8300.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/ne-h8300.c 2004-07-01 21:16:32 +10:00 @@ -180,12 +180,7 @@ err = do_ne_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -325,8 +320,13 @@ dev->poll_controller = ei_poll; #endif NS8390_init(dev, 0); - return 0; + ret = register_netdev(dev); + if (ret) + goto out_irq; + return 0; +out_irq: + free_irq(dev->irq, dev); err_out: release_region(ioaddr, NE_IO_EXTENT); return ret; @@ -633,11 +633,8 @@ err = init_reg_offset(dev, dev->base_addr); if (!err) { if (do_ne_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_ne[found++] = dev; - continue; - } - cleanup_card(dev); + dev_ne[found++] = dev; + continue; } } free_netdev(dev); ===== drivers/net/smc-ultra.c 1.22 vs edited ===== --- 1.22/drivers/net/smc-ultra.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/smc-ultra.c 2004-07-01 21:16:32 +10:00 @@ -195,12 +195,7 @@ err = do_ultra_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -321,6 +316,9 @@ #endif NS8390_init(dev, 0); + retval = register_netdev(dev); + if (retval) + goto out; return 0; out: release_region(ioaddr, ULTRA_IO_EXTENT); @@ -579,11 +577,8 @@ dev->irq = irq[this_dev]; dev->base_addr = io[this_dev]; if (do_ultra_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_ultra[found++] = dev; - continue; - } - cleanup_card(dev); + dev_ultra[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "smc-ultra.c: No SMC Ultra card found (i/o = 0x%x).\n", io[this_dev]); ===== drivers/net/wd.c 1.19 vs edited ===== --- 1.19/drivers/net/wd.c 2004-05-23 03:40:55 +10:00 +++ edited/drivers/net/wd.c 2004-07-01 21:16:32 +10:00 @@ -148,12 +148,7 @@ err = 
do_wd_probe(dev); if (err) goto out; - err = register_netdev(dev); - if (err) - goto out1; return dev; -out1: - cleanup_card(dev); out: free_netdev(dev); return ERR_PTR(err); @@ -163,6 +158,7 @@ static int __init wd_probe1(struct net_device *dev, int ioaddr) { int i; + int err; int checksum = 0; int ancient = 0; /* An old card without config registers. */ int word16 = 0; /* 0 = 8 bit, 1 = 16 bit */ @@ -349,7 +345,10 @@ outb(inb(ioaddr+4)|0x80, ioaddr+4); #endif - return 0; + err = register_netdev(dev); + if (err) + free_irq(dev->irq, dev); + return err; } static int @@ -519,11 +518,8 @@ dev->mem_start = mem[this_dev]; dev->mem_end = mem_end[this_dev]; if (do_wd_probe(dev) == 0) { - if (register_netdev(dev) == 0) { - dev_wd[found++] = dev; - continue; - } - cleanup_card(dev); + dev_wd[found++] = dev; + continue; } free_netdev(dev); printk(KERN_WARNING "wd.c: No wd80x3 card found (i/o = 0x%x).\n", io[this_dev]); --sdtB3X0nJg68CQEu-- From herbert@gondor.apana.org.au Thu Jul 1 05:33:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 05:34:03 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61CXsgi019270 for ; Thu, 1 Jul 2004 05:33:55 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1Bg0l9-0002s4-00; Thu, 01 Jul 2004 22:33:47 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1Bg0l6-00032C-00; Thu, 01 Jul 2004 22:33:44 +1000 Date: Thu, 1 Jul 2004 22:33:44 +1000 To: "David S. 
Miller" , netdev@oss.sgi.com Subject: [ESP4] Merge NAT-T code in esp_output Message-ID: <20040701123344.GA11639@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="jI8keyz6grp/JLjh" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040523i From: Herbert Xu X-archive-position: 6491 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --jI8keyz6grp/JLjh Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi Dave: This is another step on the way to distilling the tunnel mode encap code between AH/ESP/IPCOMP. This patch removes the needless duplication of NAT-T code between transport mode and tunnel mode ESP4. Signed-off-by: Herbert Xu Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --jI8keyz6grp/JLjh Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p ===== net/ipv4/esp4.c 1.48 vs edited ===== --- 1.48/net/ipv4/esp4.c 2004-07-01 19:11:29 +10:00 +++ edited/net/ipv4/esp4.c 2004-07-01 22:33:16 +10:00 @@ -28,9 +28,6 @@ struct crypto_tfm *tfm; struct esp_data *esp; struct sk_buff *trailer; - struct udphdr *uh = NULL; - u32 *udpdata32; - struct xfrm_encap_tmpl *encap = NULL; int blksize; int clen; int alen; @@ -88,30 +85,10 @@ *(u8*)(trailer->tail + clen-(*pskb)->len - 2) = (clen - (*pskb)->len)-2; pskb_put(*pskb, trailer, clen - (*pskb)->len); - encap = x->encap; - iph = (*pskb)->nh.iph; if (x->props.mode) { top_iph = (struct iphdr*)skb_push(*pskb, x->props.header_len); esph = (struct ip_esp_hdr*)(top_iph+1); - if (encap) { - switch (encap->encap_type) { - default: - case UDP_ENCAP_ESPINUDP: - uh = (struct udphdr*) esph; - esph = (struct ip_esp_hdr*)(uh+1); - top_iph->protocol = IPPROTO_UDP; - break; - case 
UDP_ENCAP_ESPINUDP_NON_IKE: - uh = (struct udphdr*) esph; - udpdata32 = (u32*)(uh+1); - udpdata32[0] = udpdata32[1] = 0; - esph = (struct ip_esp_hdr*)(udpdata32+2); - top_iph->protocol = IPPROTO_UDP; - break; - } - } else - top_iph->protocol = IPPROTO_ESP; *(u8*)(trailer->tail - 1) = IPPROTO_IPIP; top_iph->ihl = 5; top_iph->version = 4; @@ -131,24 +108,6 @@ esph = (struct ip_esp_hdr*)skb_push(*pskb, x->props.header_len); top_iph = (struct iphdr*)skb_push(*pskb, iph->ihl*4); memcpy(top_iph, &tmp_iph, iph->ihl*4); - if (encap) { - switch (encap->encap_type) { - default: - case UDP_ENCAP_ESPINUDP: - uh = (struct udphdr*) esph; - esph = (struct ip_esp_hdr*)(uh+1); - top_iph->protocol = IPPROTO_UDP; - break; - case UDP_ENCAP_ESPINUDP_NON_IKE: - uh = (struct udphdr*) esph; - udpdata32 = (u32*)(uh+1); - udpdata32[0] = udpdata32[1] = 0; - esph = (struct ip_esp_hdr*)(udpdata32+2); - top_iph->protocol = IPPROTO_UDP; - break; - } - } else - top_iph->protocol = IPPROTO_ESP; iph = &tmp_iph.iph; top_iph->tot_len = htons((*pskb)->len + alen); top_iph->check = 0; @@ -157,12 +116,32 @@ } /* this is non-NULL only with UDP Encapsulation */ - if (encap && uh) { + if (x->encap) { + struct xfrm_encap_tmpl *encap = x->encap; + struct udphdr *uh; + u32 *udpdata32; + + uh = (struct udphdr *)esph; uh->source = encap->encap_sport; uh->dest = encap->encap_dport; uh->len = htons((*pskb)->len + alen - sizeof(struct iphdr)); uh->check = 0; - } + + switch (encap->encap_type) { + default: + case UDP_ENCAP_ESPINUDP: + esph = (struct ip_esp_hdr *)(uh + 1); + break; + case UDP_ENCAP_ESPINUDP_NON_IKE: + udpdata32 = (u32 *)(uh + 1); + udpdata32[0] = udpdata32[1] = 0; + esph = (struct ip_esp_hdr *)(udpdata32 + 2); + break; + } + + top_iph->protocol = IPPROTO_UDP; + } else + top_iph->protocol = IPPROTO_ESP; esph->spi = x->id.spi; esph->seq_no = htonl(++x->replay.oseq); --jI8keyz6grp/JLjh-- From michael.kerrisk@gmx.net Thu Jul 1 05:47:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 
05:47:27 -0700 (PDT) Received: from mail.jambit.com (mail.jambit.com [62.245.207.83]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61ClNgi019843 for ; Thu, 1 Jul 2004 05:47:24 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.jambit.com (Postfix) with ESMTP id EF61C4A44F; Thu, 1 Jul 2004 14:47:16 +0200 (CEST) Received: from mail.jambit.com ([127.0.0.1]) by localhost (mail [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 04463-03; Thu, 1 Jul 2004 14:47:07 +0200 (CEST) Received: from wakatipu (proxy.jambit.com [62.245.207.82]) by mail.jambit.com (Postfix) with ESMTP id 99B674A42D; Thu, 1 Jul 2004 14:47:07 +0200 (CEST) From: "Michael Kerrisk" To: netdev@oss.sgi.com Date: Thu, 01 Jul 2004 14:47:13 +0200 MIME-Version: 1.0 Subject: TCP_CORK 200ms maximum cork time -- expected behaviour? Cc: michael.kerrisk@gmx.net Message-ID: <40E423F1.31890.6CC4104@localhost> Priority: normal X-mailer: Pegasus Mail for Windows (4.21a) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body X-Virus-Scanned: by amavisd-new at jambit.com X-archive-position: 6492 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: michael.kerrisk@gmx.net Precedence: bulk X-list: netdev Gidday, The TCP_CORK socket option allows us to perform multiple write()s (or send()s or sendfile()s) while delaying the transmission of an outgoing TCP segment until the option is disabled (or a segment MSS is filled or the socket is closed). All is fine and good, but there's one point I'm puzzled about: even when TCP_CORK is set, buffered data will still be transmitted after a 200 millisecond delay (the delay counts from the time that the first corked byte was written), even if TCP_CORK is still set. So, I'm wondering: 1. Is this intended behaviour, or simply an outgrowth of the combined implementations of TCP_CORK and TCP_NAGLE_OFF? 2. 
If it's intended behaviour, what is the rationale for the ceiling time on corking? Cheers, Michael PS I first observed this behaviour quite some time back, but I've verified that it is still current (2.4.26 and 2.6.7 kernels). (In passing: of course, similar behaviour occurs with MSG_MORE on TCP sockets.) From shemminger@osdl.org Thu Jul 1 11:33:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 11:33:38 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61IXUgi002463 for ; Thu, 1 Jul 2004 11:33:30 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i61IXCG11856; Thu, 1 Jul 2004 11:33:12 -0700 Date: Thu, 1 Jul 2004 11:33:12 -0700 From: Stephen Hemminger To: "David S. Miller" Cc: Catalin BOIE , netdev@oss.sgi.com, lartc@mailman.ds9a.nl Subject: [PATCH 2.6] update to network emulation QOS scheduler Message-Id: <20040701113312.43cfe6c5@dell_ss3.pdx.osdl.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; i386-redhat-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 6493 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev This patch updates the network emulation packet scheduler. * name changed from delay to netem since it does more than just delay * Catalin's merged code to do packet reordering * uses socket queues directly rather than layering on qdisc(fifo) because this is used in performance tests. * adds placeholder in API for future enhancements (rate and duplicate). 
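For readers skimming the list above, the reordering item can be pictured with a small standalone sketch. It is not the patch code; it only mirrors the counter/gap test that the new dequeue path appears to use: up to `gap` packets are sent straight through, the next one is diverted to the delay queue, and the counter starts over. A gap of 0 therefore delays every packet. The names `reorder_state` and `reorder_should_hold` are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the counter/gap pair kept in the qdisc's
 * private data; the struct and function names are illustrative only. */
struct reorder_state {
	uint32_t counter;	/* packets sent undelayed since the last hold */
	uint32_t gap;		/* 0 means "delay every packet" */
};

/* Returns 1 if the next packet should be placed on the delay queue,
 * 0 if it should be sent immediately. */
static int reorder_should_hold(struct reorder_state *st)
{
	assert(st != NULL);

	if (st->counter < st->gap) {
		st->counter++;
		return 0;
	}
	st->counter = 0;
	return 1;
}
```

With gap set to 3, the sketch lets three packets through and holds the fourth, so the held packet arrives after its successors once the latency has elapsed.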
Signed-off-by: Stephen Hemminger diff -urNp -X dontdiff linux-2.6/include/linux/pkt_sched.h sched-2.6/include/linux/pkt_sched.h --- linux-2.6/include/linux/pkt_sched.h 2004-06-24 08:52:58.000000000 -0700 +++ sched-2.6/include/linux/pkt_sched.h 2004-07-01 03:53:31.185482832 -0700 @@ -439,11 +439,14 @@ enum { #define TCA_ATM_MAX TCA_ATM_STATE -/* Delay section */ -struct tc_dly_qopt +/* Network emulator */ +struct tc_netem_qopt { - __u32 latency; - __u32 limit; - __u32 loss; + __u32 latency; /* added delay (us) */ + __u32 limit; /* fifo limit (packets) */ + __u32 loss; /* random packet loss (0=none ~0=100%) */ + __u32 gap; /* re-ordering gap (0 for delay all) */ + __u32 duplicate; /* random packet dup (0=none ~0=100%) */ + __u32 rate; /* maximum transmit rate (bytes/sec) */ }; #endif diff -urNp -X dontdiff linux-2.6/net/sched/Kconfig sched-2.6/net/sched/Kconfig --- linux-2.6/net/sched/Kconfig 2004-06-25 09:41:00.000000000 -0700 +++ sched-2.6/net/sched/Kconfig 2004-06-28 09:17:19.000000000 -0700 @@ -164,12 +164,12 @@ config NET_SCH_DSMARK To compile this code as a module, choose M here: the module will be called sch_dsmark. -config NET_SCH_DELAY - tristate "Delay simulator" +config NET_SCH_NETEM + tristate "Network emulator" depends on NET_SCHED help - Say Y if you want to delay packets by a fixed amount of - time. This is often useful to simulate network delay when + Say Y if you want to emulate network delay, loss, and packet + re-ordering. This is often useful to simulate networks when testing applications or protocols. 
To compile this driver as a module, choose M here: the module diff -urNp -X dontdiff linux-2.6/net/sched/Makefile sched-2.6/net/sched/Makefile --- linux-2.6/net/sched/Makefile 2004-06-24 08:52:58.000000000 -0700 +++ sched-2.6/net/sched/Makefile 2004-06-28 09:17:49.000000000 -0700 @@ -24,7 +24,7 @@ obj-$(CONFIG_NET_SCH_TBF) += sch_tbf.o obj-$(CONFIG_NET_SCH_TEQL) += sch_teql.o obj-$(CONFIG_NET_SCH_PRIO) += sch_prio.o obj-$(CONFIG_NET_SCH_ATM) += sch_atm.o -obj-$(CONFIG_NET_SCH_DELAY) += sch_delay.o +obj-$(CONFIG_NET_SCH_NETEM) += sch_netem.o obj-$(CONFIG_NET_CLS_U32) += cls_u32.o obj-$(CONFIG_NET_CLS_ROUTE4) += cls_route.o obj-$(CONFIG_NET_CLS_FW) += cls_fw.o diff -urNp -X dontdiff linux-2.6/net/sched/sch_delay.c sched-2.6/net/sched/sch_delay.c --- linux-2.6/net/sched/sch_delay.c 2004-06-21 09:23:15.000000000 -0700 +++ sched-2.6/net/sched/sch_delay.c 1969-12-31 16:00:00.000000000 -0800 @@ -1,281 +0,0 @@ -/* - * net/sched/sch_delay.c Simple constant delay - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. - * - * Authors: Stephen Hemminger - */ - -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* Network delay simulator - This scheduler adds a fixed delay to all packets. - Similar to NISTnet and BSD Dummynet. - - It uses byte fifo underneath similar to TBF */ -struct dly_sched_data { - u32 latency; - u32 limit; - u32 loss; - struct timer_list timer; - struct Qdisc *qdisc; -}; - -/* Time stamp put into socket buffer control block */ -struct dly_skb_cb { - psched_time_t queuetime; -}; - -/* Enqueue packets with underlying discipline (fifo) - * but mark them with current time first. 
- */ -static int dly_enqueue(struct sk_buff *skb, struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - struct dly_skb_cb *cb = (struct dly_skb_cb *)skb->cb; - int ret; - - /* Random packet drop 0 => none, ~0 => all */ - if (q->loss >= net_random()) { - sch->stats.drops++; - return 0; /* lie about loss so TCP doesn't know */ - } - - PSCHED_GET_TIME(cb->queuetime); - - /* Queue to underlying scheduler */ - ret = q->qdisc->enqueue(skb, q->qdisc); - if (ret) - sch->stats.drops++; - else { - sch->q.qlen++; - sch->stats.bytes += skb->len; - sch->stats.packets++; - } - return ret; -} - -/* Requeue packets but don't change time stamp */ -static int dly_requeue(struct sk_buff *skb, struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - int ret; - - ret = q->qdisc->ops->requeue(skb, q->qdisc); - if (ret == 0) - sch->q.qlen++; - return ret; -} - -static unsigned int dly_drop(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - unsigned int len; - - len = q->qdisc->ops->drop(q->qdisc); - if (len) { - sch->q.qlen--; - sch->stats.drops++; - } - return len; -} - -/* Dequeue packet. - * If packet needs to be held up, then stop the - * queue and set timer to wakeup later. 
- */ -static struct sk_buff *dly_dequeue(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - struct sk_buff *skb; - - retry: - skb = q->qdisc->dequeue(q->qdisc); - if (skb) { - struct dly_skb_cb *cb = (struct dly_skb_cb *)skb->cb; - psched_time_t now; - long diff, delay; - - PSCHED_GET_TIME(now); - diff = q->latency - PSCHED_TDIFF(now, cb->queuetime); - - if (diff <= 0) { - sch->q.qlen--; - sch->flags &= ~TCQ_F_THROTTLED; - return skb; - } - - if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) { - sch->q.qlen--; - sch->stats.drops++; - goto retry; - } - - delay = PSCHED_US2JIFFIE(diff); - if (delay <= 0) - delay = 1; - mod_timer(&q->timer, jiffies+delay); - - sch->flags |= TCQ_F_THROTTLED; - } - return NULL; -} - -static void dly_reset(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - - qdisc_reset(q->qdisc); - sch->q.qlen = 0; - sch->flags &= ~TCQ_F_THROTTLED; - del_timer(&q->timer); -} - -static void dly_timer(unsigned long arg) -{ - struct Qdisc *sch = (struct Qdisc *)arg; - - sch->flags &= ~TCQ_F_THROTTLED; - netif_schedule(sch->dev); -} - -/* Tell Fifo the new limit. 
*/ -static int change_limit(struct Qdisc *q, u32 limit) -{ - struct rtattr *rta; - int ret; - - rta = kmalloc(RTA_LENGTH(sizeof(struct tc_fifo_qopt)), GFP_KERNEL); - if (!rta) - return -ENOMEM; - - rta->rta_type = RTM_NEWQDISC; - rta->rta_len = RTA_LENGTH(sizeof(struct tc_fifo_qopt)); - ((struct tc_fifo_qopt *)RTA_DATA(rta))->limit = limit; - ret = q->ops->change(q, rta); - kfree(rta); - - return ret; -} - -/* Setup underlying FIFO discipline */ -static int dly_change(struct Qdisc *sch, struct rtattr *opt) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - struct tc_dly_qopt *qopt = RTA_DATA(opt); - int err; - - if (q->qdisc == &noop_qdisc) { - struct Qdisc *child - = qdisc_create_dflt(sch->dev, &bfifo_qdisc_ops); - if (!child) - return -EINVAL; - q->qdisc = child; - } - - err = change_limit(q->qdisc, qopt->limit); - if (err) { - qdisc_destroy(q->qdisc); - q->qdisc = &noop_qdisc; - } else { - q->latency = qopt->latency; - q->limit = qopt->limit; - q->loss = qopt->loss; - } - return err; -} - -static int dly_init(struct Qdisc *sch, struct rtattr *opt) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - - if (!opt) - return -EINVAL; - - init_timer(&q->timer); - q->timer.function = dly_timer; - q->timer.data = (unsigned long) sch; - q->qdisc = &noop_qdisc; - - return dly_change(sch, opt); -} - -static void dly_destroy(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - - del_timer(&q->timer); - qdisc_destroy(q->qdisc); - q->qdisc = &noop_qdisc; -} - -static int dly_dump(struct Qdisc *sch, struct sk_buff *skb) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - unsigned char *b = skb->tail; - struct tc_dly_qopt qopt; - - qopt.latency = q->latency; - qopt.limit = q->limit; - qopt.loss = q->loss; - - RTA_PUT(skb, TCA_OPTIONS, sizeof(qopt), &qopt); - - return skb->len; - -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; -} - -static struct Qdisc_ops dly_qdisc_ops = { - 
.id = "delay", - .priv_size = sizeof(struct dly_sched_data), - .enqueue = dly_enqueue, - .dequeue = dly_dequeue, - .requeue = dly_requeue, - .drop = dly_drop, - .init = dly_init, - .reset = dly_reset, - .destroy = dly_destroy, - .change = dly_change, - .dump = dly_dump, - .owner = THIS_MODULE, -}; - - -static int __init dly_module_init(void) -{ - return register_qdisc(&dly_qdisc_ops); -} -static void __exit dly_module_exit(void) -{ - unregister_qdisc(&dly_qdisc_ops); -} -module_init(dly_module_init) -module_exit(dly_module_exit) -MODULE_LICENSE("GPL"); diff -urNp -X dontdiff linux-2.6/net/sched/sch_netem.c sched-2.6/net/sched/sch_netem.c --- linux-2.6/net/sched/sch_netem.c 1969-12-31 16:00:00.000000000 -0800 +++ sched-2.6/net/sched/sch_netem.c 2004-06-30 14:05:13.000000000 -0700 @@ -0,0 +1,255 @@ +/* + * net/sched/sch_netem.c Network emulator + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Authors: Stephen Hemminger + * Catalin(ux aka Dino) BOIE + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +/* Network emulator + * + * This scheduler can alters spacing and order + * Similar to NISTnet and BSD Dummynet. + */ + +struct netem_sched_data { + struct sk_buff_head qnormal; + struct sk_buff_head qdelay; + struct timer_list timer; + + u32 latency; + u32 loss; + u32 counter; + u32 gap; +}; + +/* Time stamp put into socket buffer control block */ +struct netem_skb_cb { + psched_time_t time_to_send; +}; + +/* Enqueue packets with underlying discipline (fifo) + * but mark them with current time first. 
+ */ +static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + struct netem_skb_cb *cb = (struct netem_skb_cb *)skb->cb; + + pr_debug("netem_enqueue skb=%p @%lu\n", skb, jiffies); + + /* Random packet drop 0 => none, ~0 => all */ + if (q->loss >= net_random()) { + sch->stats.drops++; + return 0; /* lie about loss so TCP doesn't know */ + } + + if (q->qnormal.qlen < sch->dev->tx_queue_len) { + PSCHED_GET_TIME(cb->time_to_send); + PSCHED_TADD(cb->time_to_send, q->latency); + + __skb_queue_tail(&q->qnormal, skb); + sch->q.qlen++; + sch->stats.bytes += skb->len; + sch->stats.packets++; + return 0; + } + + sch->stats.drops++; + kfree_skb(skb); + return NET_XMIT_DROP; +} + +/* Requeue packets but don't change time stamp */ +static int netem_requeue(struct sk_buff *skb, struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + __skb_queue_head(&q->qnormal, skb); + sch->q.qlen++; + return 0; +} + +/* + * Check the look aside buffer list, and see if any freshly baked buffers. + * If head of queue is not baked, set timer. + */ +static struct sk_buff *netem_get_delayed(struct netem_sched_data *q) +{ + struct sk_buff *skb; + psched_time_t now; + long delay; + + skb = skb_peek(&q->qdelay); + if (skb) { + const struct netem_skb_cb *cb + = (const struct netem_skb_cb *)skb->cb; + + PSCHED_GET_TIME(now); + delay = PSCHED_US2JIFFIE(PSCHED_TDIFF(cb->time_to_send, now)); + pr_debug("netem_dequeue: delay queue %p@%lu %ld\n", + skb, jiffies, delay); + + /* it's baked enough */ + if (delay <= 0) { + __skb_unlink(skb, &q->qdelay); + del_timer(&q->timer); + return skb; + } + + if (!timer_pending(&q->timer)) { + q->timer.expires = jiffies + delay; + add_timer(&q->timer); + } + } + return NULL; +} + +/* Dequeue packet. + * If packet needs to be held up, then put in the delay + * queue and set timer to wakeup later. 
+ */ +static struct sk_buff *netem_dequeue(struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + struct sk_buff *skb; + + skb = netem_get_delayed(q); + if (!skb && (skb = __skb_dequeue(&q->qnormal))) { + /* are we doing out of order packet skip? */ + if (q->counter < q->gap) { + pr_debug("netem_dequeue: send %p normally\n", skb); + q->counter++; + } else { + /* don't send now hold for later */ + pr_debug("netem_dequeue: hold [%p]@%lu\n", skb, jiffies); + __skb_queue_tail(&q->qdelay, skb); + q->counter = 0; + skb = netem_get_delayed(q); + } + } + + if (skb) + sch->q.qlen--; + return skb; +} + +static void netem_timer(unsigned long arg) +{ + struct Qdisc *sch = (struct Qdisc *)arg; + + pr_debug("netem_timer: fired @%lu\n", jiffies); + netif_schedule(sch->dev); +} + +static void netem_reset(struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + skb_queue_purge(&q->qnormal); + skb_queue_purge(&q->qdelay); + + sch->q.qlen = 0; + del_timer_sync(&q->timer); +} + +static int netem_change(struct Qdisc *sch, struct rtattr *opt) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + struct tc_netem_qopt *qopt = RTA_DATA(opt); + + if (qopt->limit) + sch->dev->tx_queue_len = qopt->limit; + + q->gap = qopt->gap; + q->loss = qopt->loss; + q->latency = qopt->latency; + + return 0; +} + +static int netem_init(struct Qdisc *sch, struct rtattr *opt) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + if (!opt) + return -EINVAL; + + skb_queue_head_init(&q->qnormal); + skb_queue_head_init(&q->qdelay); + init_timer(&q->timer); + q->timer.function = netem_timer; + q->timer.data = (unsigned long) sch; + q->counter = 0; + + return netem_change(sch, opt); +} + +static void netem_destroy(struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + del_timer_sync(&q->timer); +} + +static int netem_dump(struct Qdisc *sch, struct sk_buff 
*skb) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + unsigned char *b = skb->tail; + struct tc_netem_qopt qopt; + + qopt.latency = q->latency; + qopt.limit = sch->dev->tx_queue_len; + qopt.loss = q->loss; + qopt.gap = q->gap; + + RTA_PUT(skb, TCA_OPTIONS, sizeof(qopt), &qopt); + + return skb->len; + +rtattr_failure: + skb_trim(skb, b - skb->data); + return -1; +} + +static struct Qdisc_ops netem_qdisc_ops = { + .id = "netem", + .priv_size = sizeof(struct netem_sched_data), + .enqueue = netem_enqueue, + .dequeue = netem_dequeue, + .requeue = netem_requeue, + .init = netem_init, + .reset = netem_reset, + .destroy = netem_destroy, + .change = netem_change, + .dump = netem_dump, + .owner = THIS_MODULE, +}; + + +static int __init netem_module_init(void) +{ + return register_qdisc(&netem_qdisc_ops); +} +static void __exit netem_module_exit(void) +{ + unregister_qdisc(&netem_qdisc_ops); +} +module_init(netem_module_init) +module_exit(netem_module_exit) +MODULE_LICENSE("GPL"); From oxymoron@waste.org Thu Jul 1 11:44:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 11:44:09 -0700 (PDT) Received: from waste.org (waste.org [209.173.204.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61Ii3gi002994 for ; Thu, 1 Jul 2004 11:44:03 -0700 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.12.3/8.12.3/Debian-6.6) with ESMTP id i61IhtDj022121; Thu, 1 Jul 2004 13:43:55 -0500 Received: (from oxymoron@localhost) by waste.org (8.12.3/8.12.3/Debian-6.6) id i61Ihtv6022119; Thu, 1 Jul 2004 13:43:55 -0500 Date: Thu, 1 Jul 2004 13:43:55 -0500 From: Matt Mackall To: John Sage Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Parentage of BPF code in Linux Message-ID: <20040701184354.GJ5414@waste.org> References: <20040701181002.GG6445@sparky.finchhaven.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040701181002.GG6445@sparky.finchhaven.net> User-Agent: 
Mutt/1.3.28i X-Virus-Scanned: by amavisd-new X-archive-position: 6494 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev On Thu, Jul 01, 2004 at 11:10:02AM -0700, John Sage wrote: > [Non-subscriber: please cc on replies] > > WRT to the SCO/IBM/Linux imbroglio, there was an interesting assertion > made on the Yahoo! Finance message board for SCOX, and I wondered if > anyone could shed some light. > > The assertion is this: > > "...among other things, the Berkeley Packet Filter code, which was > written by an independent developer for the Missouri School District, > licensed under the BSD license terms that never was part of SysV at > any time..." There's a from-scratch reimplementation of BPF in Linux (called Linux Socket Filter) by Jay Schulist in net/core/filter.c. And he appears to have worked for the _Wisconsin_ school district at the time. A Google search on "schulist filter wisconsin" reveals: Jay Schulist, a senior software engineer with Pleasanton, California's Bivio Networks says he wrote the 500 lines of code in 1997 as part of a volunteer project for the Stevens Point Area Catholic Schools in Wisconsin. "I used it for helping a local school district in my home town to connect their old Apple Macintosh machines to the Internet," he said. -- Mathematics is the supreme nostalgia of our time. 
From P@draigBrady.com Thu Jul 1 12:51:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 12:51:37 -0700 (PDT) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61JpWgi008400 for ; Thu, 1 Jul 2004 12:51:33 -0700 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id i61JpEOC095576; Thu, 1 Jul 2004 20:51:15 +0100 (IST) (envelope-from P@draigBrady.com) Message-ID: <40E46B32.1080605@draigBrady.com> Date: Thu, 01 Jul 2004 20:51:14 +0100 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Chris Leech CC: e1000-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: [PATCH] Re: [E1000-devel] e1000 jumbo problems References: <40D883C2.7010106@draigBrady.com> <40D9BF6B.4050807@draigBrady.com> <41b516cb040623114825a9c555@mail.gmail.com> In-Reply-To: <41b516cb040623114825a9c555@mail.gmail.com> Content-Type: multipart/mixed; boundary="------------000106050202080508060107" X-archive-position: 6495 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------000106050202080508060107 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by corvil.com id i61JpEOC095576 This patch is not for applying, just for discussion. comments below... Chris Leech wrote: >>>Another related issue, is that the driver uses 4KiB buffers >>>for MTUs in the 1500 -> 2000 range which seems a bit silly. >>>Any particular reason for that? > > > It is wasteful, but does anyone actually use an MTU in the range of > 1501 - 2030? 
It seems silly to me to go with a non-standard frame > size, but not go up to something that might give you a performance > benefit (9k). > > >>I changed the driver to use 2KiB buffers for frames in the >>1518 -> 2048 range (BSEX=0, LPE=1). This breaks however >>as packets are not dropped that are larger than the max specified? >>Instead they're scribbled into memory causing a lockup after a while. > > > That sounds right, if you actually got the RCTL register set > correctly. In e1000_setup_rctl the adapter->rx_buffer_len is used to > set that register, and it's currently written to only set LPE if the > buffer size is bigger than 2k (thus, why 4k buffers are used even when > the MTU is in the 1501 - 2030 range). To use 2k buffers for slightly > large frames, you'd want some new flag in the adapter for LPE (or > check netdev->mtu I guess) and do something like: rctl |= > E1000_RCTL_SZ_2048 | E1000_RCTL_LPE > > e1000 devices don't have a programmable MTU for receive filtering, > they drop anything larger than 1518 unless LPE (long packet enable) is > set. If LPE is set they accept anything that fits in the FIFO and has > a valid FCS. More accurately e1000s accept anything (even greater than a FIFO). When a large packet is written into multiple FIFOs, only the last rx descriptor has the EOP (end of packet) flag set. The driver doesn't handle this at all currently and will drop the initial buffers (because they don't have the EOP set) which is fine, but it will accept the last buffer (part of the packet). I've attached a patch that fixes this. Also the patch drops packets that fit within a buffer but are larger than MTU. So in summary the patch will stop packets > MTU being accepted by the driver. Note also this patch changes to using 2KiB buffers (from 4KiB) for MTUs between 1500 and 2030, and also it enables large frame reception (LPE) always, but ignore these as they're just for debugging. 
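The receive-side rule described above can be sketched in isolation. This is a minimal illustration and not the attached patch: it assumes a per-ring flag named `multi_buf_pkt` (the name the patch adds to the ring structure) and reduces each descriptor to an EOP bit and a length. The function `rx_classify`, the `rx_verdict` enum and the `max_frame` parameter are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative verdicts; these are not driver constants. */
enum rx_verdict { RX_ACCEPT, RX_DROP };

/* Hypothetical per-ring state: nonzero while the ring is in the middle
 * of a packet that spilled over more than one receive buffer. */
struct rx_ring_state {
	unsigned int multi_buf_pkt;
};

/* Decide what to do with one completed receive descriptor.
 * eop:       nonzero if the descriptor has the end-of-packet flag set
 * length:    bytes the hardware wrote into this buffer
 * max_frame: largest frame the current MTU allows (assumed parameter) */
static enum rx_verdict rx_classify(struct rx_ring_state *ring, int eop,
				   unsigned int length, unsigned int max_frame)
{
	assert(ring != NULL);

	if (!eop) {
		/* First or middle buffer of an oversized packet. */
		ring->multi_buf_pkt = 1;
		return RX_DROP;
	}
	if (ring->multi_buf_pkt) {
		/* Tail of a packet whose earlier buffers were already dropped. */
		ring->multi_buf_pkt = 0;
		return RX_DROP;
	}
	/* Single-buffer packet: still reject it if it exceeds the limit. */
	return length > max_frame ? RX_DROP : RX_ACCEPT;
}
```

The point of keeping the flag in the ring rather than in the descriptor is that the tail buffer looks like an ordinary complete packet on its own; only the history of the preceding buffers shows that it is a fragment.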
The patch makes my system completely stable now for MTUs <= 2500.
However, I can still get the system to freeze repeatedly by sending
packets larger than this.

cheers,
Pádraig.

--------------000106050202080508060107
Content-Type: application/x-texinfo; name="e1000-smallMTU.diff"
Content-Disposition: inline; filename="e1000-smallMTU.diff"
Content-Transfer-Encoding: 7bit

diff -Naru e1000-5.2.52-k3/src/e1000.h e1000-pb/src/e1000.h --- e1000-5.2.52-k3/src/e1000.h 2004-05-17 23:59:53.000000000 +0100 +++ e1000-pb/src/e1000.h 2004-06-28 14:45:04.000000000 +0100 @@ -177,6 +177,8 @@ unsigned int next_to_use; /* next descriptor to check for DD status bit */ unsigned int next_to_clean; + /* whether next buffer is partial packet */ + unsigned int multi_buf_pkt; /* array of buffer information structs */ struct e1000_buffer *buffer_info; }; diff -Naru e1000-5.2.52-k3/src/e1000_hw.h e1000-pb/src/e1000_hw.h --- e1000-5.2.52-k3/src/e1000_hw.h 2004-05-17 23:59:53.000000000 +0100 +++ e1000-pb/src/e1000_hw.h 2004-06-29 14:30:45.000000000 +0100 @@ -922,6 +922,7 @@ uint64_t sec; uint64_t cexterr; uint64_t rlec; + uint64_t multibuf; uint64_t xonrxc; uint64_t xontxc; uint64_t xoffrxc; diff -Naru e1000-5.2.52-k3/src/e1000_main.c e1000-pb/src/e1000_main.c --- e1000-5.2.52-k3/src/e1000_main.c 2004-05-17 23:59:53.000000000 +0100 +++ e1000-pb/src/e1000_main.c 2004-07-01 19:24:41.000000000 +0100 @@ -817,6 +817,8 @@ txdr->next_to_use = 0; txdr->next_to_clean = 0; + txdr->multi_buf_pkt = 0; + return 0; } @@ -935,6 +937,8 @@ rxdr->next_to_clean = 0; rxdr->next_to_use = 0; + rxdr->multi_buf_pkt = 0; + return 0; } @@ -962,11 +966,12 @@ rctl &= ~E1000_RCTL_SBP; rctl &= ~(E1000_RCTL_SZ_4096); + rctl |= E1000_RCTL_LPE; switch (adapter->rx_buffer_len) { case E1000_RXBUFFER_2048: default: rctl |= E1000_RCTL_SZ_2048; - rctl &= ~(E1000_RCTL_BSEX | E1000_RCTL_LPE); + rctl &= ~(E1000_RCTL_BSEX); break; case E1000_RXBUFFER_4096: rctl |= E1000_RCTL_SZ_4096 | E1000_RCTL_BSEX | E1000_RCTL_LPE; @@ -1101,6
+1106,8 @@ tx_ring->next_to_use = 0; tx_ring->next_to_clean = 0; + tx_ring->multi_buf_pkt = 0; + E1000_WRITE_REG(&adapter->hw, TDH, 0); E1000_WRITE_REG(&adapter->hw, TDT, 0); } @@ -1169,6 +1176,8 @@ rx_ring->next_to_clean = 0; rx_ring->next_to_use = 0; + rx_ring->multi_buf_pkt = 0; + E1000_WRITE_REG(&adapter->hw, RDH, 0); E1000_WRITE_REG(&adapter->hw, RDT, 0); } @@ -1904,14 +1913,16 @@ DPRINTK(PROBE, ERR, "Invalid MTU setting\n"); return -EINVAL; } + if(max_frame > MAXIMUM_ETHERNET_FRAME_SIZE) { + if(adapter->hw.mac_type < e1000_82543) { + DPRINTK(PROBE, ERR, "Jumbo Frames not supported on 82542\n"); + return -EINVAL; + } + } - if(max_frame <= MAXIMUM_ETHERNET_FRAME_SIZE) { + if(max_frame <= E1000_RXBUFFER_2048) { adapter->rx_buffer_len = E1000_RXBUFFER_2048; - } else if(adapter->hw.mac_type < e1000_82543) { - DPRINTK(PROBE, ERR, "Jumbo Frames not supported on 82542\n"); - return -EINVAL; - } else if(max_frame <= E1000_RXBUFFER_4096) { adapter->rx_buffer_len = E1000_RXBUFFER_4096; @@ -1961,7 +1972,7 @@ adapter->stats.gorch += E1000_READ_REG(hw, GORCH); adapter->stats.bprc += E1000_READ_REG(hw, BPRC); adapter->stats.mprc += E1000_READ_REG(hw, MPRC); - adapter->stats.roc += E1000_READ_REG(hw, ROC); + adapter->stats.roc += E1000_READ_REG(hw, ROC); adapter->stats.roc += adapter->stats.multibuf; adapter->stats.prc64 += E1000_READ_REG(hw, PRC64); adapter->stats.prc127 += E1000_READ_REG(hw, PRC127); adapter->stats.prc255 += E1000_READ_REG(hw, PRC255); @@ -1979,7 +1990,7 @@ adapter->stats.latecol += E1000_READ_REG(hw, LATECOL); adapter->stats.dc += E1000_READ_REG(hw, DC); adapter->stats.sec += E1000_READ_REG(hw, SEC); - adapter->stats.rlec += E1000_READ_REG(hw, RLEC); + adapter->stats.rlec += E1000_READ_REG(hw, RLEC); adapter->stats.rlec += adapter->stats.multibuf; adapter->stats.xonrxc += E1000_READ_REG(hw, XONRXC); adapter->stats.xontxc += E1000_READ_REG(hw, XONTXC); adapter->stats.xoffrxc += E1000_READ_REG(hw, XOFFRXC); @@ -2069,6 +2080,8 @@ 
adapter->phy_stats.receive_errors += phy_tmp; } + adapter->stats.multibuf=0; + spin_unlock_irqrestore(&adapter->stats_lock, flags); } @@ -2190,6 +2203,7 @@ if(work_done < work_to_do || !netif_running(netdev)) { netif_rx_complete(netdev); e1000_irq_enable(adapter); + return 0; } return (work_done >= work_to_do); @@ -2312,7 +2326,18 @@ skb = buffer_info->skb; length = le16_to_cpu(rx_desc->length); - if(!(rx_desc->status & E1000_RXD_STAT_EOP)) { + if((!(rx_desc->status & E1000_RXD_STAT_EOP)) || rx_ring->multi_buf_pkt) { + + if(!(rx_desc->status & E1000_RXD_STAT_EOP)) { + rx_ring->multi_buf_pkt=1; /* Next buffer also to be dropped */ + } else { + spin_lock_irqsave(&adapter->stats_lock, flags); + adapter->stats.multibuf++; /* counted as "frame too large" error */ + /* TODO: decrement byte and packet counts */ + spin_unlock_irqrestore(&adapter->stats_lock, flags); + + rx_ring->multi_buf_pkt=0; + } /* All receives must fit into a single buffer */ @@ -2358,6 +2383,27 @@ } } + if(length > adapter->hw.max_frame_size + + (rx_desc->status & E1000_RXD_STAT_VP ? 
VLAN_TAG_SIZE : 0)) { + + spin_lock_irqsave(&adapter->stats_lock, flags); + adapter->stats.multibuf++; /* counted as "frame too large" error */ + /* TODO: decrement byte and packet counts */ + spin_unlock_irqrestore(&adapter->stats_lock, flags); + + E1000_DBG("%s: Packet length %d too big\n", + netdev->name, length); + + dev_kfree_skb_irq(skb); + rx_desc->status = 0; + buffer_info->skb = NULL; + + if(++i == rx_ring->count) i = 0; + + rx_desc = E1000_RX_DESC(*rx_ring, i); + continue; + } + /* Good Receive */ skb_put(skb, length - ETHERNET_FCS_SIZE); --------------000106050202080508060107-- From vkondra@mail.ru Thu Jul 1 13:01:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 13:01:22 -0700 (PDT) Received: from mx2.mail.ru (mx2.mail.ru [194.67.23.122]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61K1Kgi008857 for ; Thu, 1 Jul 2004 13:01:21 -0700 Received: from [212.179.236.79] (port=57836 helo=[192.168.10.2]) by mx2.mail.ru with esmtp id 1Bg7k6-0009MO-00 for netdev@oss.sgi.com; Fri, 02 Jul 2004 00:01:18 +0400 From: Vladimir Kondratiev To: netdev@oss.sgi.com Subject: skb->sb on driver xmit Date: Thu, 1 Jul 2004 22:59:56 +0300 User-Agent: KMail/1.6.2 MIME-Version: 1.0 Content-Disposition: inline Content-Type: Text/Plain; charset="us-ascii" Message-Id: <200407012300.04290.vkondra@mail.ru> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id i61K1Kgi008857 X-archive-position: 6496 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vkondra@mail.ru Precedence: bulk X-list: netdev -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I see (kernel 2.4.26) that, sometimes, driver's xmit function gets packet with skb->cb containing something from higher level protocols, and non-NULL skb->destructor. Is it documented? If no, it is worth to do so. 
Like "In case driver want to do some non-trivial manipulations with skb, that involves skb->cb and destructor, one should call skb_orphan()." Also, "When passing skb from driver to stack, driver should clear skb->cb. Otherwise, stack will get confused." Vladimir. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD4DBQFA5G1Cqxdj7mhC6o0RAkGnAKCi9xH1qP8Y6lCg2QcTsm2ywWPEoQCYqGml 6SMQsIlyKTBmejQY9yh6HA== =G3x9 -----END PGP SIGNATURE----- From shemminger@osdl.org Thu Jul 1 13:11:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 13:11:25 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61KBJgi009530 for ; Thu, 1 Jul 2004 13:11:19 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i61KB1G29825; Thu, 1 Jul 2004 13:11:01 -0700 Date: Thu, 1 Jul 2004 13:11:01 -0700 From: Stephen Hemminger To: "David S. Miller" Cc: Catalin BOIE , netdev@oss.sgi.com, lartc@mailman.ds9a.nl Subject: [PATCH 2.4] update to network emulation QOS scheduler Message-Id: <20040701131101.184f7840@dell_ss3.pdx.osdl.net> In-Reply-To: <20040701113312.43cfe6c5@dell_ss3.pdx.osdl.net> References: <20040701113312.43cfe6c5@dell_ss3.pdx.osdl.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; i386-redhat-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 6497 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev This is the 2.4 version of the conversion of simple network delay scheduler to network emulator. 
Signed-off-by: Stephen Hemminger diff -Nru a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h --- a/include/linux/pkt_sched.h 2004-07-01 13:06:36 -07:00 +++ b/include/linux/pkt_sched.h 2004-07-01 13:06:36 -07:00 @@ -432,12 +432,15 @@ #define TCA_ATM_MAX TCA_ATM_STATE -/* Delay section */ -struct tc_dly_qopt +/* Network emulator */ +struct tc_netem_qopt { - __u32 latency; - __u32 limit; - __u32 loss; + __u32 latency; /* added delay (us) */ + __u32 limit; /* fifo limit (packets) */ + __u32 loss; /* random packet loss (0=none ~0=100%) */ + __u32 gap; /* re-ordering gap (0 for delay all) */ + __u32 duplicate; /* random packet dup (0=none ~0=100%) */ + __u32 rate; /* maximum transmit rate (bytes/sec) */ }; #endif diff -Nru a/net/sched/Config.in b/net/sched/Config.in --- a/net/sched/Config.in 2004-07-01 13:06:36 -07:00 +++ b/net/sched/Config.in 2004-07-01 13:06:36 -07:00 @@ -15,7 +15,7 @@ tristate ' TEQL queue' CONFIG_NET_SCH_TEQL tristate ' TBF queue' CONFIG_NET_SCH_TBF tristate ' GRED queue' CONFIG_NET_SCH_GRED -tristate ' Network delay simulator' CONFIG_NET_SCH_DELAY +tristate ' Network emulator' CONFIG_NET_SCH_NETEM tristate ' Diffserv field marker' CONFIG_NET_SCH_DSMARK if [ "$CONFIG_NETFILTER" = "y" ]; then tristate ' Ingress Qdisc' CONFIG_NET_SCH_INGRESS diff -Nru a/net/sched/Makefile b/net/sched/Makefile --- a/net/sched/Makefile 2004-07-01 13:06:36 -07:00 +++ b/net/sched/Makefile 2004-07-01 13:06:36 -07:00 @@ -14,7 +14,7 @@ obj-$(CONFIG_NET_SCH_INGRESS) += sch_ingress.o obj-$(CONFIG_NET_SCH_CBQ) += sch_cbq.o obj-$(CONFIG_NET_SCH_CSZ) += sch_csz.o -obj-$(CONFIG_NET_SCH_DELAY) += sch_delay.o +obj-$(CONFIG_NET_SCH_NETEM) += sch_netem.o obj-$(CONFIG_NET_SCH_HPFQ) += sch_hpfq.o obj-$(CONFIG_NET_SCH_HFSC) += sch_hfsc.o obj-$(CONFIG_NET_SCH_HTB) += sch_htb.o diff -Nru a/net/sched/sch_delay.c b/net/sched/sch_delay.c --- a/net/sched/sch_delay.c 2004-07-01 13:06:36 -07:00 +++ /dev/null Wed Dec 31 16:00:00 196900 @@ -1,289 +0,0 @@ -/* - * net/sched/sch_delay.c Simple 
constant delay - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. - * - * Authors: Stephen Hemminger - */ - -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -/* Network delay simulator - This scheduler adds a fixed delay to all packets. - Similar to NISTnet and BSD Dummynet. - - It uses byte fifo underneath similar to TBF */ -struct dly_sched_data { - u32 latency; - u32 limit; - u32 loss; - struct timer_list timer; - struct Qdisc *qdisc; -}; - -/* Time stamp put into socket buffer control block */ -struct dly_skb_cb { - psched_time_t queuetime; -}; - -/* Enqueue packets with underlying discipline (fifo) - * but mark them with current time first. 
- */ -static int dly_enqueue(struct sk_buff *skb, struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - struct dly_skb_cb *cb = (struct dly_skb_cb *)skb->cb; - int ret; - - /* Random packet drop 0 => none, ~0 => all */ - if (q->loss >= net_random()) { - sch->stats.drops++; - return 0; /* lie about loss so TCP doesn't know */ - } - - PSCHED_GET_TIME(cb->queuetime); - - /* Queue to underlying scheduler */ - ret = q->qdisc->enqueue(skb, q->qdisc); - if (ret) - sch->stats.drops++; - else { - sch->q.qlen++; - sch->stats.bytes += skb->len; - sch->stats.packets++; - } - return ret; -} - -/* Requeue packets but don't change time stamp */ -static int dly_requeue(struct sk_buff *skb, struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - int ret; - - ret = q->qdisc->ops->requeue(skb, q->qdisc); - if (ret == 0) - sch->q.qlen++; - return ret; -} - -static unsigned int dly_drop(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - unsigned int len; - - len = q->qdisc->ops->drop(q->qdisc); - if (len) { - sch->q.qlen--; - sch->stats.drops++; - } - return len; -} - -/* Dequeue packet. - * If packet needs to be held up, then stop the - * queue and set timer to wakeup later. 
- */ -static struct sk_buff *dly_dequeue(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - struct sk_buff *skb; - - retry: - skb = q->qdisc->dequeue(q->qdisc); - if (skb) { - struct dly_skb_cb *cb = (struct dly_skb_cb *)skb->cb; - psched_time_t now; - long diff, delay; - - PSCHED_GET_TIME(now); - diff = q->latency - PSCHED_TDIFF(now, cb->queuetime); - - if (diff <= 0) { - sch->q.qlen--; - sch->flags &= ~TCQ_F_THROTTLED; - return skb; - } - - if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) { - sch->q.qlen--; - sch->stats.drops++; - goto retry; - } - - delay = PSCHED_US2JIFFIE(diff); - if (delay <= 0) - delay = 1; - mod_timer(&q->timer, jiffies+delay); - - sch->flags |= TCQ_F_THROTTLED; - } - return NULL; -} - -static void dly_reset(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - - qdisc_reset(q->qdisc); - sch->q.qlen = 0; - sch->flags &= ~TCQ_F_THROTTLED; - del_timer(&q->timer); -} - -static void dly_timer(unsigned long arg) -{ - struct Qdisc *sch = (struct Qdisc *)arg; - - sch->flags &= ~TCQ_F_THROTTLED; - netif_schedule(sch->dev); -} - -/* Tell Fifo the new limit. 
*/ -static int change_limit(struct Qdisc *q, u32 limit) -{ - struct rtattr *rta; - int ret; - - rta = kmalloc(RTA_LENGTH(sizeof(struct tc_fifo_qopt)), GFP_KERNEL); - if (!rta) - return -ENOMEM; - - rta->rta_type = RTM_NEWQDISC; - rta->rta_len = RTA_LENGTH(sizeof(struct tc_fifo_qopt)); - ((struct tc_fifo_qopt *)RTA_DATA(rta))->limit = limit; - ret = q->ops->change(q, rta); - kfree(rta); - - return ret; -} - -/* Setup underlying FIFO discipline */ -static int dly_change(struct Qdisc *sch, struct rtattr *opt) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - struct tc_dly_qopt *qopt = RTA_DATA(opt); - int err; - - if (q->qdisc == &noop_qdisc) { - struct Qdisc *child - = qdisc_create_dflt(sch->dev, &bfifo_qdisc_ops); - if (!child) - return -EINVAL; - q->qdisc = child; - } - - err = change_limit(q->qdisc, qopt->limit); - if (err) { - qdisc_destroy(q->qdisc); - q->qdisc = &noop_qdisc; - } else { - q->latency = qopt->latency; - q->limit = qopt->limit; - q->loss = qopt->loss; - } - return err; -} - -static int dly_init(struct Qdisc *sch, struct rtattr *opt) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - int err; - - if (!opt) - return -EINVAL; - - MOD_INC_USE_COUNT; - - init_timer(&q->timer); - q->timer.function = dly_timer; - q->timer.data = (unsigned long) sch; - q->qdisc = &noop_qdisc; - - err = dly_change(sch, opt); - if (err) - MOD_DEC_USE_COUNT; - - return err; -} - -static void dly_destroy(struct Qdisc *sch) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - - del_timer(&q->timer); - qdisc_destroy(q->qdisc); - q->qdisc = &noop_qdisc; - - MOD_DEC_USE_COUNT; -} - -static int dly_dump(struct Qdisc *sch, struct sk_buff *skb) -{ - struct dly_sched_data *q = (struct dly_sched_data *)sch->data; - unsigned char *b = skb->tail; - struct tc_dly_qopt qopt; - - qopt.latency = q->latency; - qopt.limit = q->limit; - qopt.loss = q->loss; - - RTA_PUT(skb, TCA_OPTIONS, sizeof(qopt), &qopt); - - return skb->len; - 
-rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; -} - -struct Qdisc_ops dly_qdisc_ops = { - .id = "delay", - .priv_size = sizeof(struct dly_sched_data), - .enqueue = dly_enqueue, - .dequeue = dly_dequeue, - .requeue = dly_requeue, - .drop = dly_drop, - .init = dly_init, - .reset = dly_reset, - .destroy = dly_destroy, - .change = dly_change, - .dump = dly_dump, -}; - -#ifdef MODULE -int init_module(void) -{ - return register_qdisc(&dly_qdisc_ops); -} - -void cleanup_module(void) -{ - unregister_qdisc(&dly_qdisc_ops); -} -#endif -MODULE_LICENSE("GPL"); diff -Nru a/net/sched/sch_netem.c b/net/sched/sch_netem.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/net/sched/sch_netem.c 2004-07-01 13:06:36 -07:00 @@ -0,0 +1,255 @@ +/* + * net/sched/sch_netem.c Network emulator + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Authors: Stephen Hemminger + * Catalin(ux aka Dino) BOIE + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +/* Network emulator + * + * This scheduler can alters spacing and order + * Similar to NISTnet and BSD Dummynet. + */ + +struct netem_sched_data { + struct sk_buff_head qnormal; + struct sk_buff_head qdelay; + struct timer_list timer; + + u32 latency; + u32 loss; + u32 counter; + u32 gap; +}; + +/* Time stamp put into socket buffer control block */ +struct netem_skb_cb { + psched_time_t time_to_send; +}; + +/* Enqueue packets with underlying discipline (fifo) + * but mark them with current time first. 
+ */ +static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + struct netem_skb_cb *cb = (struct netem_skb_cb *)skb->cb; + + pr_debug("netem_enqueue skb=%p @%lu\n", skb, jiffies); + + /* Random packet drop 0 => none, ~0 => all */ + if (q->loss >= net_random()) { + sch->stats.drops++; + return 0; /* lie about loss so TCP doesn't know */ + } + + if (q->qnormal.qlen < sch->dev->tx_queue_len) { + PSCHED_GET_TIME(cb->time_to_send); + PSCHED_TADD(cb->time_to_send, q->latency); + + __skb_queue_tail(&q->qnormal, skb); + sch->q.qlen++; + sch->stats.bytes += skb->len; + sch->stats.packets++; + return 0; + } + + sch->stats.drops++; + kfree_skb(skb); + return NET_XMIT_DROP; +} + +/* Requeue packets but don't change time stamp */ +static int netem_requeue(struct sk_buff *skb, struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + __skb_queue_head(&q->qnormal, skb); + sch->q.qlen++; + return 0; +} + +/* + * Check the look aside buffer list, and see if any freshly baked buffers. + * If head of queue is not baked, set timer. + */ +static struct sk_buff *netem_get_delayed(struct netem_sched_data *q) +{ + struct sk_buff *skb; + psched_time_t now; + long delay; + + skb = skb_peek(&q->qdelay); + if (skb) { + const struct netem_skb_cb *cb + = (const struct netem_skb_cb *)skb->cb; + + PSCHED_GET_TIME(now); + delay = PSCHED_US2JIFFIE(PSCHED_TDIFF(cb->time_to_send, now)); + pr_debug("netem_dequeue: delay queue %p@%lu %ld\n", + skb, jiffies, delay); + + /* it's baked enough */ + if (delay <= 0) { + __skb_unlink(skb, &q->qdelay); + del_timer(&q->timer); + return skb; + } + + if (!timer_pending(&q->timer)) { + q->timer.expires = jiffies + delay; + add_timer(&q->timer); + } + } + return NULL; +} + +/* Dequeue packet. + * If packet needs to be held up, then put in the delay + * queue and set timer to wakeup later. 
+ */ +static struct sk_buff *netem_dequeue(struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + struct sk_buff *skb; + + skb = netem_get_delayed(q); + if (!skb && (skb = __skb_dequeue(&q->qnormal))) { + /* are we doing out of order packet skip? */ + if (q->counter < q->gap) { + pr_debug("netem_dequeue: send %p normally\n", skb); + q->counter++; + } else { + /* don't send now hold for later */ + pr_debug("netem_dequeue: hold [%p]@%lu\n", skb, jiffies); + __skb_queue_tail(&q->qdelay, skb); + q->counter = 0; + skb = netem_get_delayed(q); + } + } + + if (skb) + sch->q.qlen--; + return skb; +} + +static void netem_timer(unsigned long arg) +{ + struct Qdisc *sch = (struct Qdisc *)arg; + + pr_debug("netem_timer: fired @%lu\n", jiffies); + netif_schedule(sch->dev); +} + +static void netem_reset(struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + skb_queue_purge(&q->qnormal); + skb_queue_purge(&q->qdelay); + + sch->q.qlen = 0; + del_timer_sync(&q->timer); +} + +static int netem_change(struct Qdisc *sch, struct rtattr *opt) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + struct tc_netem_qopt *qopt = RTA_DATA(opt); + + if (qopt->limit) + sch->dev->tx_queue_len = qopt->limit; + + q->gap = qopt->gap; + q->loss = qopt->loss; + q->latency = qopt->latency; + + return 0; +} + +static int netem_init(struct Qdisc *sch, struct rtattr *opt) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + if (!opt) + return -EINVAL; + + skb_queue_head_init(&q->qnormal); + skb_queue_head_init(&q->qdelay); + init_timer(&q->timer); + q->timer.function = netem_timer; + q->timer.data = (unsigned long) sch; + q->counter = 0; + + return netem_change(sch, opt); +} + +static void netem_destroy(struct Qdisc *sch) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + + del_timer_sync(&q->timer); +} + +static int netem_dump(struct Qdisc *sch, struct sk_buff 
*skb) +{ + struct netem_sched_data *q = (struct netem_sched_data *)sch->data; + unsigned char *b = skb->tail; + struct tc_netem_qopt qopt; + + qopt.latency = q->latency; + qopt.limit = sch->dev->tx_queue_len; + qopt.loss = q->loss; + qopt.gap = q->gap; + + RTA_PUT(skb, TCA_OPTIONS, sizeof(qopt), &qopt); + + return skb->len; + +rtattr_failure: + skb_trim(skb, b - skb->data); + return -1; +} + +static struct Qdisc_ops netem_qdisc_ops = { + .id = "netem", + .priv_size = sizeof(struct netem_sched_data), + .enqueue = netem_enqueue, + .dequeue = netem_dequeue, + .requeue = netem_requeue, + .init = netem_init, + .reset = netem_reset, + .destroy = netem_destroy, + .change = netem_change, + .dump = netem_dump, +}; + + +static int __init netem_module_init(void) +{ + return register_qdisc(&netem_qdisc_ops); +} +static void __exit netem_module_exit(void) +{ + unregister_qdisc(&netem_qdisc_ops); +} +module_init(netem_module_init) +module_exit(netem_module_exit) +MODULE_LICENSE("GPL"); From shemminger@osdl.org Thu Jul 1 13:37:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 13:38:31 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61Kbsgi010239 for ; Thu, 1 Jul 2004 13:37:54 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i61KbcG02617; Thu, 1 Jul 2004 13:37:38 -0700 Date: Thu, 1 Jul 2004 13:37:38 -0700 From: Stephen Hemminger To: "David S. Miller" Cc: Arnaldo Carvalho de Melo , netdev@oss.sgi.com Subject: [PATCH] TCP acts like it is always out of memory. 
Message-Id: <20040701133738.301b9e46@dell_ss3.pdx.osdl.net> In-Reply-To: <20040630153049.3ca25b76.davem@redhat.com> References: <32886.63.170.215.71.1088564087.squirrel@www.osdl.org> <20040629222751.392f0a82.davem@redhat.com> <20040630152750.2d01ca51@dell_ss3.pdx.osdl.net> <20040630153049.3ca25b76.davem@redhat.com> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; i386-redhat-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 6498 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Current 2.6.7 tree acts as if it is alway under memory pressure because a recent change did a s/tcp_memory_pressure/tcp_prot.memory_pressure/. The problem is tcp_prot.memory_pressure is a pointer, so it is always non-zero! Rather than using *tcp_prot.memory_pressure, just go back to looking at tcp_memory_pressure. Signed-off-by: Stephen Hemminger diff -Nru a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c --- a/net/ipv4/tcp_input.c 2004-07-01 13:36:58 -07:00 +++ b/net/ipv4/tcp_input.c 2004-07-01 13:36:58 -07:00 @@ -259,7 +259,7 @@ /* Check #1 */ if (tp->rcv_ssthresh < tp->window_clamp && (int)tp->rcv_ssthresh < tcp_space(sk) && - !tcp_prot.memory_pressure) { + !tcp_memory_pressure) { int incr; /* Check #2. 
Increase window, if skb with such overhead @@ -349,7 +349,7 @@ if (ofo_win) { if (sk->sk_rcvbuf < sysctl_tcp_rmem[2] && !(sk->sk_userlocks & SOCK_RCVBUF_LOCK) && - !tcp_prot.memory_pressure && + !tcp_memory_pressure && atomic_read(&tcp_memory_allocated) < sysctl_tcp_mem[0]) sk->sk_rcvbuf = min(atomic_read(&sk->sk_rmem_alloc), sysctl_tcp_rmem[2]); @@ -3764,7 +3764,7 @@ if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf) tcp_clamp_window(sk, tp); - else if (tcp_prot.memory_pressure) + else if (tcp_memory_pressure) tp->rcv_ssthresh = min(tp->rcv_ssthresh, 4U * tp->advmss); tcp_collapse_ofo_queue(sk); @@ -3844,7 +3844,7 @@ if (tp->packets_out < tp->snd_cwnd && !(sk->sk_userlocks & SOCK_SNDBUF_LOCK) && - !tcp_prot.memory_pressure && + !tcp_memory_pressure && atomic_read(&tcp_memory_allocated) < sysctl_tcp_mem[0]) { int sndmem = max_t(u32, tp->mss_clamp, tp->mss_cache) + MAX_TCP_HEADER + 16 + sizeof(struct sk_buff), diff -Nru a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c --- a/net/ipv4/tcp_output.c 2004-07-01 13:36:58 -07:00 +++ b/net/ipv4/tcp_output.c 2004-07-01 13:36:58 -07:00 @@ -672,7 +672,7 @@ if (free_space < full_space/2) { tp->ack.quick = 0; - if (tcp_prot.memory_pressure) + if (tcp_memory_pressure) tp->rcv_ssthresh = min(tp->rcv_ssthresh, 4U*tp->advmss); if (free_space < mss) diff -Nru a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c --- a/net/ipv4/tcp_timer.c 2004-07-01 13:36:58 -07:00 +++ b/net/ipv4/tcp_timer.c 2004-07-01 13:36:58 -07:00 @@ -257,7 +257,7 @@ TCP_CHECK_TIMER(sk); out: - if (tcp_prot.memory_pressure) + if (tcp_memory_pressure) sk_stream_mem_reclaim(sk); out_unlock: bh_unlock_sock(sk); From davem@redhat.com Thu Jul 1 14:05:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 14:05:19 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61L5Fgi011087 for ; Thu, 1 Jul 2004 14:05:16 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com 
[172.16.52.254]) by mx1.redhat.com (8.12.10/8.12.10) with ESMTP id i61L57e1018065; Thu, 1 Jul 2004 17:05:07 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id i61L57013458; Thu, 1 Jul 2004 17:05:07 -0400 Received: from cheetah.davemloft.net (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.10) with SMTP id i61L4fOA004142; Thu, 1 Jul 2004 17:04:41 -0400 Date: Thu, 1 Jul 2004 14:04:06 -0700 From: "David S. Miller" To: Stephen Hemminger Cc: acme@conectiva.com.br, netdev@oss.sgi.com Subject: Re: [PATCH] TCP acts like it is always out of memory. Message-Id: <20040701140406.62dfbc2a.davem@redhat.com> In-Reply-To: <20040701133738.301b9e46@dell_ss3.pdx.osdl.net> References: <32886.63.170.215.71.1088564087.squirrel@www.osdl.org> <20040629222751.392f0a82.davem@redhat.com> <20040630152750.2d01ca51@dell_ss3.pdx.osdl.net> <20040630153049.3ca25b76.davem@redhat.com> <20040701133738.301b9e46@dell_ss3.pdx.osdl.net> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 6499 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 1 Jul 2004 13:37:38 -0700 Stephen Hemminger wrote: > Current 2.6.7 tree acts as if it is alway under memory pressure because > a recent change did a s/tcp_memory_pressure/tcp_prot.memory_pressure/. > The problem is tcp_prot.memory_pressure is a pointer, so it is always non-zero! > > Rather than using *tcp_prot.memory_pressure, just go back to looking at > tcp_memory_pressure. Hehe, applied thanks Stephen. 
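
The bug discussed in this thread can be reproduced in miniature outside the kernel. The sketch below is illustrative only: struct proto_sketch, memory_pressure_flag and the two helper functions are hypothetical stand-ins, not kernel code. It assumes only what the thread states, namely that the memory_pressure member is a pointer to the real flag.

```c
#include <assert.h>
#include <stddef.h>

/* Plays the role of the global tcp_memory_pressure flag (hypothetical). */
static int memory_pressure_flag;

/* Stand-in for a protocol structure whose member points at the flag. */
struct proto_sketch {
	int *memory_pressure;
};

static struct proto_sketch prot = { .memory_pressure = &memory_pressure_flag };

/*
 * Buggy form: tests the pointer itself. Because the pointer is never
 * NULL, this reports pressure regardless of the flag's value.
 */
static int under_pressure_buggy(const struct proto_sketch *p)
{
	return p->memory_pressure ? 1 : 0;
}

/* Intended form: dereference the pointer and test the flag's value. */
static int under_pressure_fixed(const struct proto_sketch *p)
{
	assert(p->memory_pressure != NULL);
	return *p->memory_pressure ? 1 : 0;
}
```

With memory_pressure_flag left at 0, under_pressure_buggy(&prot) still returns 1 while under_pressure_fixed(&prot) returns 0, which matches the "always under memory pressure" symptom. The patch in this thread avoids the dereference altogether by testing the global variable directly.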
From romieu@fr.zoreil.com Thu Jul 1 14:50:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 14:51:01 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i61Lougi012324 for ; Thu, 1 Jul 2004 14:50:57 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id i61Lloon007357; Thu, 1 Jul 2004 23:47:50 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id i61LloHr007356; Thu, 1 Jul 2004 23:47:50 +0200 Date: Thu, 1 Jul 2004 23:47:50 +0200 From: Francois Romieu To: Jeff Garzik Cc: netdev@oss.sgi.com, alan@redhat.com, akpm@osdl.org Subject: Re: [PATCH 2.6.7-mm3 1/1] via-velocity: use common crc16 code for WOL Message-ID: <20040701234750.A7109@electric-eye.fr.zoreil.com> References: <20040630223346.A23520@electric-eye.fr.zoreil.com> <40E384F9.7090108@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <40E384F9.7090108@pobox.com>; from jgarzik@pobox.com on Wed, Jun 30, 2004 at 11:28:57PM -0400 X-Organisation: Land of Sunshine Inc. X-archive-position: 6500 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Jeff Garzik : > patch OK but rejected against mainline I assume mainline means bk-linus here. 
I have checked and the patch applies fine to 2.6.7-mm5 which includes both bk-linus and bk-netdev from yesterday (aka: bk://gkernel.bkbits.net/netdev-2.6 jgarzik@pobox.com|ChangeSet|20040622043820|36094 jgarzik) -- Ueimor From acme@conectiva.com.br Thu Jul 1 18:45:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 18:45:45 -0700 (PDT) Received: from perninha.conectiva.com.br (perninha.conectiva.com.br [200.140.247.100]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i621jegi020192 for ; Thu, 1 Jul 2004 18:45:41 -0700 Received: by perninha.conectiva.com.br (Postfix, from userid 563) id 760864754E; Thu, 1 Jul 2004 22:45:34 -0300 (BRT) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by perninha.conectiva.com.br (Postfix) with SMTP id 1F3F747B4A for ; Thu, 1 Jul 2004 22:45:30 -0300 (BRT) Received: (qmail 20664 invoked by uid 0); 2 Jul 2004 02:43:39 -0000 Received: from mapi8.distro.conectiva (HELO oops.kerneljanitors.org) (10.0.16.10) by burns.conectiva with SMTP; 2 Jul 2004 02:43:39 -0000 Received: by oops.kerneljanitors.org (Postfix, from userid 500) id 48BD04C822; Thu, 1 Jul 2004 22:32:25 -0300 (BRT) Date: Thu, 1 Jul 2004 22:32:25 -0300 From: Arnaldo Carvalho de Melo To: "David S. Miller" Cc: Stephen Hemminger , netdev@oss.sgi.com Subject: Re: [PATCH] TCP acts like it is always out of memory. 
Message-ID: <20040702013225.GA24707@conectiva.com.br>
References: <32886.63.170.215.71.1088564087.squirrel@www.osdl.org>
 <20040629222751.392f0a82.davem@redhat.com>
 <20040630152750.2d01ca51@dell_ss3.pdx.osdl.net>
 <20040630153049.3ca25b76.davem@redhat.com>
 <20040701133738.301b9e46@dell_ss3.pdx.osdl.net>
 <20040701140406.62dfbc2a.davem@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20040701140406.62dfbc2a.davem@redhat.com>
X-Url: http://advogato.org/person/acme
User-Agent: Mutt/1.5.5.1i
X-Bogosity: No, tests=bogofilter, spamicity=0.499998, version=0.16.3
X-archive-position: 6501
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: acme@conectiva.com.br
Precedence: bulk
X-list: netdev

On Thu, Jul 01, 2004 at 02:04:06PM -0700, David S. Miller wrote:
> On Thu, 1 Jul 2004 13:37:38 -0700
> Stephen Hemminger wrote:
>
> > Current 2.6.7 tree acts as if it is always under memory pressure because
> > a recent change did a s/tcp_memory_pressure/tcp_prot.memory_pressure/.
> > The problem is tcp_prot.memory_pressure is a pointer, so it is always non-zero!
> >
> > Rather than using *tcp_prot.memory_pressure, just go back to looking at
> > tcp_memory_pressure.
>
> Hehe, applied thanks Stephen. :-)

Thanks, Stephen, for the fix. This was a leftover from converting the
memory pressure members in struct proto to pointers, which was done to
cover the case David pointed out regarding the ipv6_mapped functionality
in the 1.1722.122.23 changeset (i.e. tcp_prot and tcpv6_prot having to
share the same accounting variables). I forgot to convert all the places
where tcp_prot.memory_pressure is used; the fix is exactly what I should
have done.

Due to family health problems I was unable to promptly fix this thinko,
so, again, thank you very much.
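
The bug described above can be reproduced in a few lines of user-space C. This is only a sketch, not the kernel code: `struct proto_sketch` and the two helper functions are hypothetical stand-ins, and only the names `tcp_prot`, `memory_pressure` and `tcp_memory_pressure` come from the thread. It shows why testing the pointer member always reports pressure, while testing the flag it points to does not.

```c
#include <stddef.h>

/* Hypothetical stand-in for the shared accounting flag: 0 means no pressure. */
static int tcp_memory_pressure;

/* Hypothetical, cut-down stand-in for struct proto after the conversion:
 * the member is now a pointer to the shared flag, not the flag itself. */
struct proto_sketch {
    int *memory_pressure;
};

static struct proto_sketch tcp_prot = {
    .memory_pressure = &tcp_memory_pressure,
};

/* Buggy form: tests the pointer, which is never NULL, so this is always 1. */
static int under_pressure_buggy(void)
{
    return tcp_prot.memory_pressure != NULL;
}

/* Fixed form: tests the flag itself, as the thread says the fix does. */
static int under_pressure_fixed(void)
{
    return tcp_memory_pressure != 0;
}
```

With the flag at its initial value of 0, `under_pressure_buggy()` still reports pressure, which matches the symptom in the subject line; `under_pressure_fixed()` follows the flag.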
Best Regards, - Arnaldo From jhaller@lucent.com Thu Jul 1 21:40:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jul 2004 21:40:09 -0700 (PDT) Received: from ihemail1.lucent.com (ihemail1.lucent.com [192.11.222.161]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i624e5gi026764 for ; Thu, 1 Jul 2004 21:40:06 -0700 Received: from nwsgpa.ih.lucent.com (h135-1-121-22.lucent.com [135.1.121.22]) by ihemail1.lucent.com (8.12.11/8.12.11) with ESMTP id i624dwJh011365 for ; Thu, 1 Jul 2004 23:39:59 -0500 (CDT) Received: from lucent.com by nwsgpa.ih.lucent.com (8.11.7p1+Sun/EMS-1.5 sol2) id i624dwL16471; Thu, 1 Jul 2004 23:39:58 -0500 (CDT) Message-ID: <40E4E719.6000508@lucent.com> Date: Thu, 01 Jul 2004 23:39:53 -0500 From: John Haller User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) Gecko/20040113 X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: SO_REUSEADDR, restarting servers, and security patches Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 6502 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jhaller@lucent.com Precedence: bulk X-list: netdev In October 2002, Yoshifuji Hideaki introduced a patch that prevents completely any duplication of , even when SO_REUSEADDR is set, preventing port stealing denial-of-service attacks. This also has the side effect of not allowing a server to be immediately restarted after being stopped, because of the sockets that remain in the TCP_TIME_WAIT state. Would security be negatively impacted by relaxing the restrictions introduced by the above patch to allow a bind to a TCP port only if all existing references to that TCP port were in the TCP_TIME_WAIT state, and both the listening port and all of the TCP_TIME_WAIT sockets had the SO_REUSEADDR flag set? 
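
For readers unfamiliar with the option under discussion, the following is a minimal user-space sketch of how a server typically sets SO_REUSEADDR before binding its listener. The helper names `open_reusable_listener` and `reuseaddr_enabled` are hypothetical, chosen for illustration. The sketch shows only the socket calls; whether a bind succeeds while sockets on the port remain in TCP_TIME_WAIT depends on the kernel's bind-conflict rules, which are the subject of the question above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP listener on the given port with
 * SO_REUSEADDR set. Returns the socket descriptor, or -1 on failure. */
static int open_reusable_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* The option has to be set before bind() to influence the bind check. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);    /* port 0 lets the kernel pick a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;
    if (listen(fd, 16) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}

/* Returns 1 if SO_REUSEADDR is enabled on fd, 0 if not, -1 on error. */
static int reuseaddr_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A restarted server would call `open_reusable_listener()` with its fixed port number; a return value of -1 with errno set to EADDRINUSE is the failure mode the message describes.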
This relaxation would only help in the case of servers where the listener
and connected sockets are all stopped at the same time, and not loosely
connected servers where the connected sockets are handled in a separate
process from the listener.

I don't want to use SO_REUSEPORT for two reasons. The first is that
SO_REUSEPORT allows binding the same address twice for active sockets.
The second is that SO_REUSEPORT is not commonly enabled.

The top message regarding the patch is located here:
http://oss.sgi.com/projects/netdev/archive/2002-10/msg00035.html

--
John Haller

From akpm@osdl.org Fri Jul 2 00:51:01 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 00:51:08 -0700 (PDT)
Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i627p1gi002305 for ; Fri, 2 Jul 2004 00:51:01 -0700
Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i627otG14795 for ; Fri, 2 Jul 2004 00:50:55 -0700
Date: Fri, 2 Jul 2004 00:49:56 -0700
From: Andrew Morton
To: netdev@oss.sgi.com
Subject: Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database
Message-Id: <20040702004956.448c95d9.akpm@osdl.org>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 6503
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: akpm@osdl.org
Precedence: bulk
X-list: netdev

Could someone please take a look at the locking in
net/ipv4/ipmr.c:ipmr_mfc_seq_next? It seems rather broken.

Begin forwarded message:

Date: Thu, 1 Jul 2004 18:01:00 -0700 (PDT)
From: Yichen Xie
To: linux-kernel@vger.kernel.org
Subject: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database

Hi all,

We are a group of researchers at Stanford working on program analysis algorithms.
We have been building a precision enhanced program analysis engine at Stanford, and our first application was to derive mutex/lock behavior in the linux kernel. In the process, we found 99 likely synchronization errors in linux kernel version 2.6.5: http://glide.stanford.edu/linux-lock/err1.html (69 errors) http://glide.stanford.edu/linux-lock/err2.html (30 errors) err1.html consists of potential double locks/unlocks. Acquiring a lock twice in a row may result in a system hang, and releasing a lock more than once with certain mutex functions (e.g. up) may cause critical section violations. err2.html consists of reports on inconsistent output lock states on function exit. These errors usually correspond to missed lock operations on error paths. (filenames in this report correspond to where a function is declared, so CTAGS may come in handy to find the actual implementation of the function). In the error reports, functions are hyperlinked to their derived summaries, and those of their callees (since the analysis spans function calls, the error condition of a particular function usually depend on the locking behavior of its callees). For example, in function "radeon_pm_program_v2clk" (first error report in err1.html), the tool flagged an error at line 323 of "drivers/video/aty/radeon_pm.c". Line 323 invokes two macros, OUTPLL, and INPLL. OUTPLL acquires "rinfo->reg_lock", and then evaluates "addr", which is calculated, in this case, by calling _INPLL. By clicking on the link "drivers/video/aty/radeon_pm.c:radeon_pm_program_v2clk", we can see that _INPLL requires "rinfo->reg_lock" be unheld on entry (confirmed by looking at its definition), which is not satisfied in this example. So this is a double lock error and could potentially lead to a deadlock on MP systems. 
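
The double-lock pattern in this report (a macro that takes a lock and then evaluates an argument which itself takes the same lock) can be modelled in user space. The sketch below is not the radeon driver code: `toy_lock`, `locked_read`, `LOCKED_WRITE` and the two update functions are hypothetical names, and the toy lock counts the second acquisition instead of hanging so that the effect is observable.

```c
#include <stdbool.h>

/* Toy lock that records misuse instead of spinning forever.
 * Hypothetical stand-in for a spinlock such as rinfo->reg_lock. */
struct toy_lock {
    bool held;
    int double_acquires;    /* times the lock was taken while already held */
};

static void toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        l->double_acquires++;   /* a real spinlock could deadlock here */
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
}

static struct toy_lock reg_lock;
static unsigned int regs[4];

/* Reads a register under the lock; requires the lock to be unheld on entry. */
static unsigned int locked_read(unsigned int idx)
{
    unsigned int v;

    toy_lock_acquire(&reg_lock);
    v = regs[idx];
    toy_lock_release(&reg_lock);
    return v;
}

/* Writes a register under the lock. Note that `val` is evaluated after the
 * lock has been taken, so it must not take the lock itself. */
#define LOCKED_WRITE(idx, val) do {     \
    toy_lock_acquire(&reg_lock);        \
    regs[(idx)] = (val);                \
    toy_lock_release(&reg_lock);        \
} while (0)

/* Buggy: the argument calls locked_read() inside the critical section. */
static int buggy_update(void)
{
    reg_lock.double_acquires = 0;
    LOCKED_WRITE(1, locked_read(0) | 0x1u);
    return reg_lock.double_acquires;
}

/* Fixed: read first, then write, so the lock is never taken twice. */
static int fixed_update(void)
{
    unsigned int v;

    reg_lock.double_acquires = 0;
    v = locked_read(0);
    LOCKED_WRITE(1, v | 0x1u);
    return reg_lock.double_acquires;
}
```

`buggy_update()` returns the number of nested acquisitions it triggered, while `fixed_update()` hoists the read out of the macro argument and triggers none. Hoisting is one possible repair for this toy model; the thread does not say how the driver was actually fixed.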
We also have a separate web interface to the summary database at:

  http://glide.stanford.edu/linux-lock/

For example, typing "fh_put" in the input box gives

========= SUMMARY =========
FUNCTION SUMMARY: 'include/linux/nfsd/nfsfh.h:fh_put' {
  dcache_lock(global): [unlocked -> unlocked]
  fhp(param#0)->fh_dentry->d_lock: [unlocked -> unlocked]
  fhp(param#0)->fh_dentry->d_inode->i_sem: [locked -> unlocked]
}

Each line in the function summary corresponds to the requirements and
effects on one particular lock. For example, fh_put requires that the
global lock variable dcache_lock be unheld on entry, and it will remain
unheld on exit. It also requires that fhp->fh_dentry->d_inode->i_sem be
held on entry, and it releases it on exit.

(Note: please ignore summaries for lock primitives like spin_lock or
down_interruptible; models for these functions are built into the checker
and the derived summaries are not used.)

We have found that some modules in the kernel have functions with
complicated synchronization behavior (especially in filesystems), and we
hope the summaries generated by this tool will be useful not only for bug
finding but also for documentation.

The analysis is intraprocedurally "path sensitive", so it won't be fooled
by cases like

    if (flag & BLOCKING) spin_lock(&l);
    ...
    if (flag & BLOCKING) spin_unlock(&l);

or

    if (!spin_trylock(&l)) return -EAGAIN;
    ...
    spin_unlock(&l);

The analysis algorithm models values (down to individual bits) and
pointers in the program with high precision using a boolean
satisfiability solver, and we're actively looking for other properties
involving (heavy) data dependencies where naive analysis would fail.
Suggestions and insights from the Linux kernel community are more than
welcome!

As always, feedback and confirmations will be greatly appreciated!
Best regards, Yichen Xie - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ From akpm@osdl.org Fri Jul 2 01:12:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 01:13:11 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i628CSgi003110 for ; Fri, 2 Jul 2004 01:12:28 -0700 Received: from localhost.localdomain (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id i628CFG17588; Fri, 2 Jul 2004 01:12:15 -0700 Message-Id: <200407020812.i628CFG17588@mail.osdl.org> Subject: [patch 1/1] err1-28: rose_route locking fix To: davem@redhat.com Cc: netdev@oss.sgi.com, akpm@osdl.org From: akpm@osdl.org Date: Fri, 02 Jul 2004 01:11:17 -0700 X-archive-position: 6504 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Fix deadlock in rose_del_loopback_node(). Found by the Stanford locking checker. 
Signed-off-by: Andrew Morton --- 25-akpm/net/rose/rose_route.c | 1 - 1 files changed, 1 deletion(-) diff -puN net/rose/rose_route.c~err1-28-rose_route-locking-fix net/rose/rose_route.c --- 25/net/rose/rose_route.c~err1-28-rose_route-locking-fix 2004-07-02 01:09:27.377403248 -0700 +++ 25-akpm/net/rose/rose_route.c 2004-07-02 01:09:33.617454616 -0700 @@ -206,7 +206,6 @@ static void rose_remove_node(struct rose { struct rose_node *s; - spin_lock_bh(&rose_node_list_lock); if ((s = rose_node_list) == rose_node) { rose_node_list = rose_node->next; kfree(rose_node); _ From akpm@osdl.org Fri Jul 2 01:29:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 01:30:46 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i628TZgi003788 for ; Fri, 2 Jul 2004 01:29:37 -0700 Received: from localhost.localdomain (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id i628TNG20391; Fri, 2 Jul 2004 01:29:23 -0700 Message-Id: <200407020829.i628TNG20391@mail.osdl.org> Subject: [patch 1/1] err1-62: ax25_ds_idletimer_expiry() locking fix To: davem@redhat.com Cc: netdev@oss.sgi.com, akpm@osdl.org From: akpm@osdl.org Date: Fri, 02 Jul 2004 01:28:24 -0700 X-archive-position: 6505 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Fix deadlock identified by the Stanford locking checker. 
Signed-off-by: Andrew Morton --- 25-akpm/net/ax25/ax25_ds_timer.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) diff -puN net/ax25/ax25_ds_timer.c~err1-62-ax25_ds_idletimer_expiry-locking-fix net/ax25/ax25_ds_timer.c --- 25/net/ax25/ax25_ds_timer.c~err1-62-ax25_ds_idletimer_expiry-locking-fix 2004-07-02 01:27:07.504239472 -0700 +++ 25-akpm/net/ax25/ax25_ds_timer.c 2004-07-02 01:27:11.824582680 -0700 @@ -180,7 +180,7 @@ void ax25_ds_idletimer_expiry(ax25_cb *a ax25->sk->sk_state_change(ax25->sk); sock_set_flag(ax25->sk, SOCK_DEAD); } - bh_lock_sock(ax25->sk); + bh_unlock_sock(ax25->sk); } } _ From akpm@osdl.org Fri Jul 2 01:32:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 01:32:28 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i628Vxgi004053 for ; Fri, 2 Jul 2004 01:32:01 -0700 Received: from localhost.localdomain (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id i628VmG20814; Fri, 2 Jul 2004 01:31:48 -0700 Message-Id: <200407020831.i628VmG20814@mail.osdl.org> Subject: [patch 1/1] err1-67: lapb_unregister() locking fix To: davem@redhat.com Cc: netdev@oss.sgi.com, akpm@osdl.org From: akpm@osdl.org Date: Fri, 02 Jul 2004 01:30:49 -0700 X-archive-position: 6506 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Fix deadlock identified by the Stanford locking checker. 
Signed-off-by: Andrew Morton --- 25-akpm/net/lapb/lapb_iface.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) diff -puN net/lapb/lapb_iface.c~err1-67-lapb_unregister-locking-fix net/lapb/lapb_iface.c --- 25/net/lapb/lapb_iface.c~err1-67-lapb_unregister-locking-fix 2004-07-02 01:29:50.645438240 -0700 +++ 25-akpm/net/lapb/lapb_iface.c 2004-07-02 01:29:55.717667144 -0700 @@ -176,7 +176,7 @@ int lapb_unregister(struct net_device *d struct lapb_cb *lapb; int rc = LAPB_BADTOKEN; - write_unlock_bh(&lapb_list_lock); + write_lock_bh(&lapb_list_lock); lapb = __lapb_devtostruct(dev); if (!lapb) goto out; _ From akpm@osdl.org Fri Jul 2 01:47:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 01:47:36 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i628lXgi004814 for ; Fri, 2 Jul 2004 01:47:33 -0700 Received: from localhost.localdomain (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id i628lLG23044; Fri, 2 Jul 2004 01:47:22 -0700 Message-Id: <200407020847.i628lLG23044@mail.osdl.org> Subject: [patch 1/1] err2-15: ax25_rt_add() locking fix To: davem@redhat.com Cc: netdev@oss.sgi.com, akpm@osdl.org From: akpm@osdl.org Date: Fri, 02 Jul 2004 01:46:23 -0700 X-archive-position: 6508 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev It can return with the lock held. 
Found by the Stanford locking checker Signed-off-by: Andrew Morton --- 25-akpm/net/ax25/ax25_route.c | 1 + 1 files changed, 1 insertion(+) diff -puN net/ax25/ax25_route.c~err2-15-ax25_rt_add-locking-fix net/ax25/ax25_route.c --- 25/net/ax25/ax25_route.c~err2-15-ax25_rt_add-locking-fix 2004-07-02 01:45:20.234119280 -0700 +++ 25-akpm/net/ax25/ax25_route.c 2004-07-02 01:45:37.960424472 -0700 @@ -122,6 +122,7 @@ static int ax25_rt_add(struct ax25_route ax25_rt->digipeat->calls[i] = route->digi_addr[i]; } } + write_unlock(&ax25_route_lock); return 0; } ax25_rt = ax25_rt->next; _ From yoshfuji@linux-ipv6.org Fri Jul 2 01:46:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 01:47:02 -0700 (PDT) Received: from yue.st-paulia.net ([203.178.140.15]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i628kggi004691 for ; Fri, 2 Jul 2004 01:46:42 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 32CC033CE5; Fri, 2 Jul 2004 17:47:54 +0900 (JST) Date: Fri, 02 Jul 2004 17:47:53 +0900 (JST) Message-Id: <20040702.174753.93406678.yoshfuji@linux-ipv6.org> To: davem@redhat.com, yxie@cs.stanford.edu Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 6507 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org
Precedence: bulk
X-list: netdev

Hello.

In article (at Thu, 1 Jul 2004 18:01:00 -0700 (PDT)), Yichen Xie says:

> http://glide.stanford.edu/linux-lock/err1.html (69 errors)
:
> err1.html consists of potential double locks/unlocks. Acquiring a lock
> twice in a row may result in a system hang, and releasing a lock more than
> once with certain mutex functions (e.g. up) may cause critical section
> violations.

Well, lapb_iface.c:lapb_unregister() has a typo, so we fail to take
lapb_list_lock. The caller of rose_route.c:rose_remove_node() already
holds rose_node_list_lock, so taking it again is a deadlock.

Here's the fix.

===== net/lapb/lapb_iface.c 1.14 vs edited =====
--- 1.14/net/lapb/lapb_iface.c	2004-01-11 08:39:04 +09:00
+++ edited/net/lapb/lapb_iface.c	2004-07-02 17:23:27 +09:00
@@ -176,7 +176,7 @@
 	struct lapb_cb *lapb;
 	int rc = LAPB_BADTOKEN;
 
-	write_unlock_bh(&lapb_list_lock);
+	write_lock_bh(&lapb_list_lock);
 	lapb = __lapb_devtostruct(dev);
 	if (!lapb)
 		goto out;
===== net/rose/rose_route.c 1.12 vs edited =====
--- 1.12/net/rose/rose_route.c	2004-06-04 09:11:24 +09:00
+++ edited/net/rose/rose_route.c	2004-07-02 17:26:08 +09:00
@@ -206,7 +206,6 @@
 {
 	struct rose_node *s;
 
-	spin_lock_bh(&rose_node_list_lock);
 	if ((s = rose_node_list) == rose_node) {
 		rose_node_list = rose_node->next;
 		kfree(rose_node);

Thanks.
-- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From herbert@gondor.apana.org.au Fri Jul 2 04:05:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 04:05:23 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i62B58gi010685 for ; Fri, 2 Jul 2004 04:05:12 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1BgLqo-00039p-00; Fri, 02 Jul 2004 21:05:02 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1BgLql-00055X-00; Fri, 02 Jul 2004 21:04:59 +1000 From: Herbert Xu To: akpm@osdl.org (Andrew Morton) Subject: Re: Fw: [BUGS] [CHECKER] 99 synchronization bugs and a lock summary database Cc: netdev@oss.sgi.com Organization: Core In-Reply-To: <20040702004956.448c95d9.akpm@osdl.org> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.26-1-686-smp (i686)) Message-Id: Date: Fri, 02 Jul 2004 21:04:59 +1000 X-archive-position: 6509 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Andrew Morton wrote: > > Could someone please take a look at the locking in > net/ipv4/ipmr.c:ipmr_mfc_seq_next? It seems rather broken. Obfuscated yes, broken no. Unfortunately the seq interface tends to produce code like this. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From herbert@gondor.apana.org.au Fri Jul 2 06:07:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 06:07:15 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i62D77gi016628 for ; Fri, 2 Jul 2004 06:07:08 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1BgNkm-0003uk-00; Fri, 02 Jul 2004 23:06:56 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1BgNki-0005Ix-00; Fri, 02 Jul 2004 23:06:52 +1000 Date: Fri, 2 Jul 2004 23:06:52 +1000 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [AH4] Harmonisation of output function Message-ID: <20040702130652.GA20334@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="opJtzjQTFsWo+cga" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040523i From: Herbert Xu X-archive-position: 6510 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --opJtzjQTFsWo+cga Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi Dave: This is another step towards the union of the tunnel mode encapsulation between transforms. As there are significant differences between the tunnel encapsulation of IPv4 and IPv6, I'll be dealing with IPv4 only for now. This particular patch rearranges the code in ah_output to isolate the tunnel mode encapsulation. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --opJtzjQTFsWo+cga Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p ===== net/ipv4/ah4.c 1.33 vs edited ===== --- 1.33/net/ipv4/ah4.c 2004-06-24 11:19:28 +10:00 +++ edited/net/ipv4/ah4.c 2004-07-02 22:53:20 +10:00 @@ -86,29 +86,23 @@ top_iph = (struct iphdr*)skb_push(*pskb, x->props.header_len); top_iph->ihl = 5; top_iph->version = 4; - top_iph->tos = 0; - top_iph->tot_len = htons((*pskb)->len); - top_iph->frag_off = 0; + top_iph->tos = iph->tos; + if (x->props.flags & XFRM_STATE_NOECN) + IP_ECN_clear(top_iph); + top_iph->frag_off = iph->frag_off & ~htons(IP_MF|IP_OFFSET); if (!(iph->frag_off&htons(IP_DF))) __ip_select_ident(top_iph, dst, 0); - top_iph->ttl = 0; - top_iph->protocol = IPPROTO_AH; - top_iph->check = 0; + top_iph->ttl = iph->ttl; top_iph->saddr = x->props.saddr.a4; top_iph->daddr = x->id.daddr.a4; + memcpy(&tmp_iph, top_iph, 20); + memset(&(IPCB(*pskb)->opt), 0, sizeof(struct ip_options)); ah = (struct ip_auth_hdr*)(top_iph+1); ah->nexthdr = IPPROTO_IPIP; } else { memcpy(&tmp_iph, (*pskb)->data, iph->ihl*4); top_iph = (struct iphdr*)skb_push(*pskb, x->props.header_len); memcpy(top_iph, &tmp_iph, iph->ihl*4); - iph = &tmp_iph.iph; - top_iph->tos = 0; - top_iph->tot_len = htons((*pskb)->len); - top_iph->frag_off = 0; - top_iph->ttl = 0; - top_iph->protocol = IPPROTO_AH; - top_iph->check = 0; if (top_iph->ihl != 5) { err = ip_clear_mutable_options(top_iph, &top_iph->daddr); if (err) @@ -117,6 +111,15 @@ ah = (struct ip_auth_hdr*)((char*)top_iph+iph->ihl*4); ah->nexthdr = iph->protocol; } + + iph = &tmp_iph.iph; + top_iph->tos = 0; + top_iph->tot_len = htons((*pskb)->len); + top_iph->frag_off = 0; + top_iph->ttl = 0; + top_iph->protocol = IPPROTO_AH; + top_iph->check = 0; + ahp = x->data; ah->hdrlen = (XFRM_ALIGN8(sizeof(struct 
ip_auth_hdr) + ahp->icv_trunc_len) >> 2) - 2; @@ -127,13 +130,8 @@ ahp->icv(ahp, *pskb, ah->auth_data); top_iph->tos = iph->tos; top_iph->ttl = iph->ttl; - if (x->props.mode) { - if (x->props.flags & XFRM_STATE_NOECN) - IP_ECN_clear(top_iph); - top_iph->frag_off = iph->frag_off&~htons(IP_MF|IP_OFFSET); - memset(&(IPCB(*pskb)->opt), 0, sizeof(struct ip_options)); - } else { - top_iph->frag_off = iph->frag_off; + top_iph->frag_off = iph->frag_off; + if (!x->props.mode) { top_iph->daddr = iph->daddr; if (iph->ihl != 5) memcpy(top_iph+1, iph+1, iph->ihl*4 - sizeof(struct iphdr)); --opJtzjQTFsWo+cga-- From jgarzik@pobox.com Fri Jul 2 08:51:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jul 2004 08:52:07 -0700 (PDT) Received: from www.linux.or