From owner-netdev@oss.sgi.com Wed Dec 1 03:09:26 1999 Received: by oss.sgi.com id ; Wed, 1 Dec 1999 03:09:16 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:22572 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Wed, 1 Dec 1999 03:09:01 -0800 Received: from mta1.263.net ([202.96.44.43]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id FAA17250 for ; Wed, 1 Dec 1999 05:15:35 -0600 From: zam_ustc@263.net Received: by mta1.263.net (Postfix, from userid 60001) id 521191C5A3AF0; Tue, 23 Nov 1999 08:27:51 +0800 (CST) MIME-Version: 1.0 Message-Id: <3839DF87.06000@mta1> Date: Tue, 23 Nov 1999 08:27:51 +0800 (CST) To: netdev@nuclecu.unam.mx Subject: Re:What's the problem? Help me! X-Priority: 3 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing On Sun, 21 Nov 1999 20:51:25 +0100 venaas@nvg.ntnu.no wrote: >You're not supposed to use IP_HDRINCL with IPv6, see RFC 2292. It >might help to look at some source code, see for instance the ping >code in the inet6-apps package at http://www.inner.net/pub/ipv6/ Thank you! Venaas. But what I want to do is to fill the Traffic Class field and the Flow Label field to do some test. Rfc2292 does not state about it. Then I used IPPROTO_RAW as the third parameter in call to socket and set the IP_HDRINCL option. I found I could fill the flow label! That is: sockfd=socket(AF_INET6,SOCK_RAW,IPPROTO_RAW); setsockopt(sockfd,IPPROTO_IPV6,IP_HDRINCL,&on,sizeof(on)); I have still another question here. I have read rfc2292 several times. 
Now I want to send an IPv6 packet with some options (such as hop-by-hop and others), so I use sendmsg() to do it. But it seems that some of the functions mentioned in the RFC do not work on my system. I wrote a very simple program to test the inet6_option_xxxx() functions:

#include
#include
main ()
{
        int len;
        struct cmsghdr hdr;
        len = inet6_option_space(sizeof(struct cmsghdr));
        printf ("len=%d\n", len);
}

# cc -o hopbyhop hopbyhop.c
/tmp/ccHGMtzn.o: In function `main':
/tmp/ccHGMtzn.o(.text+0x9): undefined reference to `inet6_option_space'
collect2: ld returned 1 exit status

What's the problem? Does some library need to be linked in, or does my system not support these functions at all? I tried to find these functions in the source code, but in vain. What can I do? BTW: The version of my Linux system is 2.2.13.
From owner-netdev@oss.sgi.com Sun Dec 5 14:10:53 1999 Received: by oss.sgi.com id ; Sun, 5 Dec 1999 14:10:33 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:23105 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Sun, 5 Dec 1999 14:10:12 -0800 Received: from grok.cyberhighway.net (root@cx97923-a.phnx3.az.home.com [24.9.112.194]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id QAA21805 for ; Sun, 5 Dec 1999 16:17:22 -0600 Received: from cyberhighway.net (IDENT:greear@localhost [127.0.0.1]) by grok.cyberhighway.net (8.9.3/8.9.3) with ESMTP id PAA15391; Sun, 5 Dec 1999 15:31:03 -0700 Message-ID: <384AE7A7.DB66A7DB@cyberhighway.net> Date: Sun, 05 Dec 1999 15:31:03 -0700 From: Ben Greear Organization: ScrySOFT X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.12-20 i586) X-Accept-Language: en MIME-Version: 1.0 To: vlan@Scry.WANfear.com, "netdev@nuclecu.unam.mx" Subject: Re: [VLAN] CVS VLAN Commit Info References: <199912052200.OAA18693@ns1.wanfear.com> Content-Type: text/plain; 
charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1485 Lines: 39 Ben Greear wrote: > > uid=1059(greear) gid=1059(greear) groups=1059(greear),152(scrydev),155(vlandev) > vlan CHANGELOG,1.2,1.3 MakeInclude,1.2,1.3 Makefile,1.1.1.1,1.2 vconfig.cc,1.1.1.1,1.2 vlan.html,1.4,1.5 vlan.patch,1.4,1.5 > Sun Dec 5 13:59:26 PST 1999 > Update of /home/cvs/vlan/vlan > In directory ns1.wanfear.com:/tmp/cvs-serv18630 > > Modified Files: > CHANGELOG MakeInclude Makefile vconfig.cc vlan.html vlan.patch > Log Message: > Changes: > Re-wrote the /proc code to never go above 4k buffers. This means > that each port now has it's own file entry. Fixed crash bug with > removing VLAN devices. Byte and pkt counters are now updated correctly, > and are found in the /proc/net/vlan/ file. > > Known problems remaining: > Some ppl still having ARP problems, and MTU must be set to 1496 for > certain ethernet drivers. > > _______________________________________________ > VLAN mailing list - VLAN@Scry.WANfear.com > http://www.WANfear.com/mailman/listinfo/vlan The new code is now available on my web page. Please let me know how it goes. In my very limited tests it seems to work fine. I haven't figured out a nice way to provide diffs of the diffs, as someone suggested. 
However, I'll happily put them on my page if someone sends them to me :) Enjoy, Ben -- Ben Greear (greear@cyberhighway.net) http://scry.wanfear.com/~greear Author of ScryMUD: scry.wanfear.com 4444 (Released under GPL) http://scry.wanfear.com From owner-netdev@oss.sgi.com Sun Dec 5 23:11:27 1999 Received: by oss.sgi.com id ; Sun, 5 Dec 1999 23:11:07 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:34900 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Sun, 5 Dec 1999 23:11:00 -0800 Received: from nkmz.ins.dn.ua (nkmz-dntel.dntel.donetsk.ua [193.220.71.254]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id BAA11739 for ; Mon, 6 Dec 1999 01:18:08 -0600 Received: from lsto.nkmz.donetsk.ua (IDENT:sans@lsto.nkmz.donetsk.ua [172.16.24.3]) by nkmz.ins.dn.ua (8.9.2/8.9.2) with ESMTP id JAA02594; Mon, 6 Dec 1999 09:17:30 +0200 (EET) Received: (from sans@localhost) by lsto.nkmz.donetsk.ua (8.8.7/8.8.7) id JAA28316; Mon, 6 Dec 1999 09:17:59 +0200 Date: Mon, 6 Dec 1999 09:17:59 +0200 From: "Alexander I. 
Chernous" Message-Id: <199912060717.JAA28316@lsto.nkmz.donetsk.ua> To: greear@cyberhighway.net, netdev@nuclecu.unam.mx, vlan@Scry.WANfear.com Subject: Re: [VLAN] CVS VLAN Commit Info Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 8 Lines: 1 SIGNOFF From owner-netdev@oss.sgi.com Thu Dec 9 02:42:13 1999 Received: by oss.sgi.com id ; Thu, 9 Dec 1999 02:41:52 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:17971 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Thu, 9 Dec 1999 02:41:30 -0800 Received: from kempelen.iit.bme.hu (kempelen.iit.bme.hu [152.66.241.120]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id EAA13575 for ; Thu, 9 Dec 1999 04:48:56 -0600 Received: (from dhanak@localhost) by kempelen.iit.bme.hu (8.9.3+Sun/8.9.1) id LAA28178; Thu, 9 Dec 1999 11:45:10 +0100 (MET) X-Authentication-Warning: kempelen.iit.bme.hu: dhanak set sender to dhanak@inf.bme.hu using -f To: Alexey Kuznetsov Cc: IP Networking Subject: [HELP] Packet scheduler From: Hanak David Date: 09 Dec 1999 11:45:10 +0100 In-Reply-To: Hanak David's message of "08 Dec 1999 17:24:30 +0100" Message-ID: User-Agent: Gnus/5.0802 (Gnus v5.8.2) Emacs/20.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 970 Lines: 25 Hi! I'm writing a WFQ packet scheduling algorithm for the Linux kernel. Based on scheduler sources shipped with the kernel, I managed to get the hang of it and it works fine. Only I get warnings saying: kmem_grow: Called nonatomically from int - size-64 This most probably happens in the enqueue or the dequeue function, since this happens only when there is heavy traffic, and also stops after a while. (When enough memory is already allocated, I figure.) I don't use any other sub-schedulers (sub-classes, if you like), so everything inside these two functions is my own code. 
(Apart from a few kfree_skb calls.) My question is: what should I add to avoid this? Is a pair of start_bh_atomic() end_bh_atomic() calls missing? What should be protected with these functions at all? (And what needn't?) If you need part of the code, I will gladly send it... Could you please answer by private mail, because I'm not subscribed to the list. Thanks a lot, David Hanak
From owner-netdev@oss.sgi.com Thu Dec 9 06:39:13 1999 Received: by oss.sgi.com id ; Thu, 9 Dec 1999 06:39:03 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:21554 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Thu, 9 Dec 1999 06:38:48 -0800 Received: from ms2.inr.ac.ru (minus.inr.ac.ru [193.233.7.97]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with SMTP id IAA17526 for ; Thu, 9 Dec 1999 08:46:16 -0600 From: kuznet@ms2.inr.ac.ru Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id RAA12928; Thu, 9 Dec 1999 17:45:32 +0300 Message-Id: <199912091445.RAA12928@ms2.inr.ac.ru> Subject: Re: [HELP] Packet scheduler To: dhanak@inf.bme.hu (Hanak David) Date: Thu, 9 Dec 1999 17:45:32 +0300 (MSK) Cc: netdev@roxanne.nuclecu.unam.mx In-Reply-To: from "Hanak David" at Dec 9, 99 11:45:10 am X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 730 Lines: 31

Hello!

> the hang of it and it works fine. Only I get warnings saying:
>
> kmem_grow: Called nonatomically from int - size-64

It is not a warning, it is a fatal error.

> My question is: what should I add to avoid this?

Do not use non-atomic operations in atomic context.

> Is a pair of start_bh_atomic() end_bh_atomic() calls missing?

enqueue/dequeue are already BH protected.

> What should be
> protected with these functions at all? (And what needn't?)

They are already serialized, so enqueue/dequeue do not need any additional protection.

> If you need a part of code, I gladly send it...

Please.
Though, check first the module for GFP_KERNEL and for balanced start_bh_atomic and end_bh_atomic. Alexey From owner-netdev@oss.sgi.com Thu Dec 9 07:56:13 1999 Received: by oss.sgi.com id ; Thu, 9 Dec 1999 07:56:03 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:47954 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Thu, 9 Dec 1999 07:55:51 -0800 Received: from laurin.munich.netsurf.de (laurin.munich.netsurf.de [194.64.166.1]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id KAA18438 for ; Thu, 9 Dec 1999 10:03:14 -0600 Received: from fred.muc.de (none@ns1140.munich.netsurf.de [195.180.235.140]) by laurin.munich.netsurf.de (8.9.3/8.9.3) with ESMTP id RAA05435; Thu, 9 Dec 1999 17:01:33 +0100 (MET) Received: from andi by fred.muc.de with local (Exim 2.05 #1) id 11w4Zz-0000Iy-00; Thu, 9 Dec 1999 15:29:59 +0100 Date: Thu, 9 Dec 1999 15:29:58 +0100 From: Andi Kleen To: Hanak David Cc: Alexey Kuznetsov , IP Networking Subject: Re: [HELP] Packet scheduler Message-ID: <19991209152958.A1170@fred.muc.de> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.95.4us In-Reply-To: ; from Hanak David on Thu, Dec 09, 1999 at 11:45:10AM +0100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 311 Lines: 9 On Thu, Dec 09, 1999 at 11:45:10AM +0100, Hanak David wrote: > My question is: what should I add to avoid this? Is a pair of > start_bh_atomic() end_bh_atomic() calls missing? What should be > protected with these functions at all? (And what needn't?) Allocate memory only with the GFP_ATOMIC flag. 
-Andi From owner-netdev@oss.sgi.com Thu Dec 9 08:07:33 1999 Received: by oss.sgi.com id ; Thu, 9 Dec 1999 08:07:24 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:15967 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Thu, 9 Dec 1999 08:07:09 -0800 Received: from servidor.unam.mx (servidor.unam.mx [132.248.10.5]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id KAA18583 for ; Thu, 9 Dec 1999 10:14:38 -0600 Received: from servidor.unam.mx (servidor.unam.mx [132.248.10.5]) by servidor.unam.mx (8.9.3/8.9.3) with SMTP id KAA29858 for ; Thu, 9 Dec 1999 10:14:52 -0600 (CST) Date: Thu, 9 Dec 1999 10:14:52 -0600 (CST) From: Hector Yuen To: netdev@roxanne.nuclecu.unam.mx Subject: socket in c++ In-Reply-To: <19991209152958.A1170@fred.muc.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 366 Lines: 8 hello, I am trying to make an IPv6 socket on c++, on linux, I first tried the C version, and works fine, but I want to make a class or something to make things easier, so I renamed the file sock.c to sock.cpp, and tried to compile it with: gcc sock.cpp -lnsl -lsocket -o sock well, with the c version works, but here it displays a lot of errors, what is happening? 
From owner-netdev@oss.sgi.com Thu Dec 9 08:12:43 1999 Received: by oss.sgi.com id ; Thu, 9 Dec 1999 08:12:33 -0800 Received: from x86unx3.comp.nus.edu.sg ([137.132.90.2]:48065 "EHLO x86unx3.comp.nus.edu.sg") by oss.sgi.com with ESMTP id ; Thu, 9 Dec 1999 08:12:26 -0800 Received: from sunA.comp.nus.edu.sg (leebp@sunA.comp.nus.edu.sg [137.132.87.10]) by x86unx3.comp.nus.edu.sg (8.9.1/8.9.1) with ESMTP id AAA01055 for ; Fri, 10 Dec 1999 00:19:54 +0800 (SGT) Received: (from leebp@localhost) by sunA.comp.nus.edu.sg (8.8.5/8.8.5) id AAA00764 for netdev@oss.sgi.com; Fri, 10 Dec 1999 00:19:48 +0800 (GMT-8) From: Lee Boon Peng Message-Id: <199912091619.AAA00764@sunA.comp.nus.edu.sg> Subject: SO_DEBUG & trpt To: netdev@oss.sgi.com Date: Fri, 10 Dec 1999 00:19:48 +0800 (GMT-8) X-Mailer: ELM [version 2.4 PL25] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 305 Lines: 9 Hi there, How would I make use of the socket SO_DEBUG option? There seems to be no trpt command equivalent to get add the debugging information. I would be grateful for any pointers. I couldn't get any help from the Usenet groups, so I'm turning here for some guru info. 
Thanks a lot BP From owner-netdev@oss.sgi.com Thu Dec 9 08:37:23 1999 Received: by oss.sgi.com id ; Thu, 9 Dec 1999 08:37:13 -0800 Received: from laurin.munich.netsurf.de ([194.64.166.1]:53422 "EHLO laurin.munich.netsurf.de") by oss.sgi.com with ESMTP id ; Thu, 9 Dec 1999 08:36:58 -0800 Received: from fred.muc.de (none@ns1179.munich.netsurf.de [195.180.235.179]) by laurin.munich.netsurf.de (8.9.3/8.9.3) with ESMTP id RAA11981; Thu, 9 Dec 1999 17:44:22 +0100 (MET) Received: from andi by fred.muc.de with local (Exim 2.05 #1) id 11w6fu-0000Qr-00; Thu, 9 Dec 1999 17:44:14 +0100 Date: Thu, 9 Dec 1999 17:44:14 +0100 From: Andi Kleen To: Lee Boon Peng Cc: netdev@oss.sgi.com Subject: Re: SO_DEBUG & trpt Message-ID: <19991209174414.A1643@fred.muc.de> References: <199912091619.AAA00764@sunA.comp.nus.edu.sg> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.95.4us In-Reply-To: <199912091619.AAA00764@sunA.comp.nus.edu.sg>; from Lee Boon Peng on Thu, Dec 09, 1999 at 05:19:48PM +0100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 901 Lines: 24 On Thu, Dec 09, 1999 at 05:19:48PM +0100, Lee Boon Peng wrote: > Hi there, > > How would I make use of the socket SO_DEBUG option? There seems > to be no trpt command equivalent to get add the debugging information. > > I would be grateful for any pointers. I couldn't get any help > from the Usenet groups, so I'm turning here for some guru info. SO_DEBUG enables various unsystematic debugging printks that are output into the kernel log (readable via dmesg). They may or may not be helpful in diagnosing anything. It is only active when the kernel was compiled with SOCK_DEBUGGING. This is generally true for 2.3 kernels, but not for stable 2.2/2.0 kernels. You can enable it by enabling the SOCK_DEBUGGING macro in include/net/sock.h and recompiling. If you want TCP state traces you also need to enable STATE_TRACE in include/net/tcp.h. 
-Andi -- This is like TV. I don't like TV. From owner-netdev@oss.sgi.com Thu Dec 9 21:28:50 1999 Received: by oss.sgi.com id ; Thu, 9 Dec 1999 21:28:40 -0800 Received: from linuxcare.canberra.net.au ([203.29.91.49]:13322 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Thu, 9 Dec 1999 21:28:19 -0800 Received: from gargle.linuxcare.com.au (gargle.linuxcare.com.au [10.61.2.18]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id QAA05816 for ; Fri, 10 Dec 1999 16:35:49 +1100 Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Fri, 10 Dec 1999 16:42:36 +1100 (EST) Message-Id: From: Paul Rusty Russell To: netdev@oss.sgi.com Subject: Permissions checks in ethertap device? Date: Fri, 10 Dec 1999 16:42:36 +1100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 159 Lines: 7 Why can't we rely on the normal open permissions checks for /dev/tap0? For user mode kernel I'd like to be able to inject packets... Rusty. -- Hacking time. From owner-netdev@oss.sgi.com Fri Dec 10 10:53:54 1999 Received: by oss.sgi.com id ; Fri, 10 Dec 1999 10:53:44 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:18181 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Fri, 10 Dec 1999 10:53:21 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id WAA17153; Fri, 10 Dec 1999 22:00:43 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912101900.WAA17153@ms2.inr.ac.ru> Subject: Re: Permissions checks in ethertap device? To: Paul.RUssell@linuxcare.COM.AU (Paul Rusty Russell) Date: Fri, 10 Dec 1999 22:00:43 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: from "Paul Rusty Russell" at Dec 10, 99 09:14:11 am X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 267 Lines: 11 Hello! 
> Why can't we rely on the normal open permissions checks for /dev/tap0? No, Paul. You should blame on me. I forgot about this aspect at all. Please, look at this problem, probably, you will invent a way how to fix it. I cannot invent anything now. Alexey From owner-netdev@oss.sgi.com Fri Dec 10 11:01:24 1999 Received: by oss.sgi.com id ; Fri, 10 Dec 1999 11:01:04 -0800 Received: from mainframe.dgrc.crc.ca ([142.92.38.206]:19163 "EHLO mainframe.dgrc.crc.ca") by oss.sgi.com with ESMTP id ; Fri, 10 Dec 1999 11:00:56 -0800 Received: from crc.ca (curly [142.92.38.251]) by mainframe.dgrc.crc.ca (8.9.3/8.9.3) with ESMTP id OAA28502; Fri, 10 Dec 1999 14:01:05 -0500 (EST) Message-ID: <38514EB1.39497B24@crc.ca> Date: Fri, 10 Dec 1999 14:04:17 -0500 From: Guilhem Tardy Organization: CRC X-Mailer: Mozilla 4.5 [en] (X11; I; SunOS 5.7 sun4u) X-Accept-Language: en MIME-Version: 1.0 To: Alberto Escudero CC: Masahito Yoshida , engp7643@leonis.nus.edu.sg, xhmeng@mcn.xidian.edu.cn, jiang@eudoramail.com, ccfoo@hawk.ee.nus.edu.sg, netdev@oss.sgi.com Subject: Destination Options (see at the end of message) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1588 Lines: 46 Alberto Escudero wrote: > > Thanks Masahito! > > I have the Internet DRAFT of 25 June 1999 and i am bit confussed... > may be you can help me with this... > In the Mobility support in IPv6 New Destination Options are defined > > - Binding update Option Format > - Binding ACK OF > - Binding Reques OF > - Home Address OF > Correct. (latest draft is Oct. 22 1999, see draft-ietf-mobileip-ipv6-09.txt, chapter 5) > Checking in the Internet Draft those options are encoded in the TLV format > with different Option Types. 
> > 198=0xC6 for the binding update > 7 for the Binding ACK for example > > Checking mh.c: > > #define BIND_UPDATE_TYPE 195 > #define BIND_ACK_TYPE 2 > > Why does values are different? Do they change in the drafts? I guess either those values were different at the time or the programmers used other values as they sent UDP packets for that matter. > - In the documentation i have (from the TIFF) files ... it is said that > all those bindings were not available in IPv6 kernel of linux and were > implemented using UDP... is it that right in the 1.0 NUS MIPv6? Destination Options are still not implemented in the IPv6 Linux kernel (2.2.13 or 2.3.31), hence there's no Binding Update Option available without a hack. I would like to have your opinion (all) on that: shall we keep on using UDP or is it an option to actually implement the Destination Options in the IPv6 Linux kernel (which has already greatly improved since 2.1.59)? As a complementary question, what is your view on the complexity of the DestOpt implementation? Guilhem. From owner-netdev@oss.sgi.com Sat Dec 11 20:26:36 1999 Received: by oss.sgi.com id ; Sat, 11 Dec 1999 20:24:26 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:25352 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Sat, 11 Dec 1999 20:24:09 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id XAA21494; Fri, 10 Dec 1999 23:19:25 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912102019.XAA21494@ms2.inr.ac.ru> Subject: Re: Destination Options (see at the end of message) To: Guilhem.Tardy@crc.CA (Guilhem Tardy) Date: Fri, 10 Dec 1999 23:19:25 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: <38514EB1.39497B24@crc.ca> from "Guilhem Tardy" at Dec 10, 99 11:13:37 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 298 Lines: 11 Hello! 
> Destination Options are still not implemented in the IPv6 Linux kernel > (2.2.13 or 2.3.31),
Please, explain. "Destination options" are implemented. Mobile IPv6 options are not implemented, because mobile is not implemented. You have to implement it to begin with 8) Alexey Kuznetsov
From owner-netdev@oss.sgi.com Sat Dec 11 20:50:59 1999 Received: by oss.sgi.com id ; Sat, 11 Dec 1999 20:50:31 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:28695 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Fri, 10 Dec 1999 15:31:02 -0800 Received: from primus.cstp.umkc.edu (primus.cstp.umkc.edu [134.193.2.53]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id RAA16898 for ; Fri, 10 Dec 1999 17:29:35 -0600 Received: from primus.cstp.umkc.edu (134.193.2.53) by primus.cstp.umkc.edu (MX V5.1-X AnDk) with SMTP for ; Fri, 10 Dec 1999 17:29:31 -0500 Date: Fri, 10 Dec 1999 17:29:30 -0500 From: Mukunda Reddy Saddi To: netdev@roxanne.nuclecu.unam.mx Subject: need help Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 518 Lines: 18 Hello, I am implementing a new protocol which is similar to GRE (Generic Routing Encapsulation). This is for wireless networks. I have never implemented a protocol stack or a device driver. I know C programming. I would be obliged to receive some guidance in this area. This new protocol would sit directly above IP, so I think it will not be in user space, and I cannot use SOCK_DGRAM as with UDP. Also, any documentation regarding the existing GRE implementation would be helpful. 
Thanks Mukunda From owner-netdev@oss.sgi.com Sat Dec 11 20:51:08 1999 Received: by oss.sgi.com id ; Sat, 11 Dec 1999 20:50:32 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:34322 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Fri, 10 Dec 1999 21:13:12 -0800 Received: from alpha.dtix.com (alpha.dtix.com [198.62.174.1]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id XAA20249 for ; Fri, 10 Dec 1999 23:11:48 -0600 Received: from localhost (sankar@localhost) by alpha.dtix.com (8.9.3/8.8.7) with ESMTP id XAA12163; Fri, 10 Dec 1999 23:20:45 -0500 Date: Fri, 10 Dec 1999 23:20:44 -0500 (EST) From: Sankar Das Reply-To: sankar@alpha.dtix.com To: Hector Yuen cc: netdev@roxanne.nuclecu.unam.mx Subject: Re: socket in c++ In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 454 Lines: 14 Try with g++ instead of gcc. On Thu, 9 Dec 1999, Hector Yuen wrote: > hello, I am trying to make an IPv6 socket on c++, on linux, I first tried > the C version, and works fine, but I want to make a class or something to > make things easier, so I renamed the file sock.c to sock.cpp, and tried to > compile it with: > gcc sock.cpp -lnsl -lsocket -o sock > well, with the c version works, but here it displays a lot of errors, what > is happening? 
> From owner-netdev@oss.sgi.com Sat Dec 11 20:51:09 1999 Received: by oss.sgi.com id ; Sat, 11 Dec 1999 20:50:30 -0800 Received: from mainframe.dgrc.crc.ca ([142.92.38.206]:38375 "EHLO mainframe.dgrc.crc.ca") by oss.sgi.com with ESMTP id ; Fri, 10 Dec 1999 14:41:00 -0800 Received: from crc.ca (curly [142.92.38.251]) by mainframe.dgrc.crc.ca (8.9.3/8.9.3) with ESMTP id RAA00824 for ; Fri, 10 Dec 1999 17:36:20 -0500 (EST) Message-ID: <38518124.EBF52272@crc.ca> Date: Fri, 10 Dec 1999 17:39:32 -0500 From: Guilhem Tardy Organization: CRC X-Mailer: Mozilla 4.5 [en] (X11; I; SunOS 5.7 sun4u) X-Accept-Language: en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: 'memory_start' in init/main.c Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 231 Lines: 11 Hi! I tried to compile 2.3.31, but it fails on init/main.c # in function 'start_kernel': # 'memory_start' undeclared Anyone who managed to compile this beastie (I guess so), would you mind to tell me about this? Guilhem Tardy. 
From owner-netdev@oss.sgi.com Sat Dec 11 20:51:28 1999 Received: by oss.sgi.com id ; Sat, 11 Dec 1999 20:50:34 -0800 Received: from nytoday.whowhere.com ([209.1.236.38]:9122 "HELO mc-qout2.whowhere.com") by oss.sgi.com with SMTP id ; Sat, 11 Dec 1999 09:39:02 -0800 Received: from Unknown/Local ([?.?.?.?]) by shared1-mail.whowhere.com; Sat Dec 11 09:37:29 1999 To: "Alberto Escudero" , "Guilhem Tardy" Date: Sat, 11 Dec 1999 09:37:29 -0800 From: "mingliang jiang" Message-ID: Mime-Version: 1.0 Cc: "Masahito Yoshida" , engp7643@leonis.nus.edu.sg, xhmeng@mcn.xidian.edu.cn, jiang@eudoramail.com, ccfoo@hawk.ee.nus.edu.sg, netdev@oss.sgi.com X-Sent-Mail: on X-Mailer: MailCity Service Subject: Re: Destination Options (see at the end of message) X-Sender-Ip: 203.124.1.60 Organization: QUALCOMM Eudora Web-Mail (http://www.eudoramail.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 2235 Lines: 70 Dear All, I am sorry for keeping quiet, as I am still waiting for a verdict from the Motorola Research Center. :) I think eventually it makes more sense for us to implement the destination option in the Linux kernel. However, as I am not following the 2.3.x development, I am not in a position to comment on the complexity issues. Regards, Yours Humble Mingliang --- Pls cc your message to jiangmingliang@hotmail.com thanks! mingliang On Fri, 10 Dec 1999 14:04:17 Guilhem Tardy wrote: >Alberto Escudero wrote: >> >> Thanks Masahito! >> >> I have the Internet DRAFT of 25 June 1999 and i am bit confussed... >> may be you can help me with this... >> In the Mobility support in IPv6 New Destination Options are defined >> >> - Binding update Option Format >> - Binding ACK OF >> - Binding Reques OF >> - Home Address OF >> >Correct. (latest draft is Oct. 
22 1999, see >draft-ietf-mobileip-ipv6-09.txt, chapter 5) > >> Checking in the Internet Draft those options are encoded in the TLV format >> with different Option Types. >> >> 198=0xC6 for the binding update >> 7 for the Binding ACK for example >> >> Checking mh.c: >> >> #define BIND_UPDATE_TYPE 195 >> #define BIND_ACK_TYPE 2 >> >> Why does values are different? Do they change in the drafts? >I guess either those values were different at the time or the >programmers used other values as they sent UDP packets for that matter. > >> - In the documentation i have (from the TIFF) files ... it is said that >> all those bindings were not available in IPv6 kernel of linux and were >> implemented using UDP... is it that right in the 1.0 NUS MIPv6? >Destination Options are still not implemented in the IPv6 Linux kernel >(2.2.13 or 2.3.31), hence there's no Binding Update Option available >without a hack. > >I would like to have your opinion (all) on that: >shall we keep on using UDP or is it an option to actually implement the >Destination Options in the IPv6 Linux kernel (which has already greatly >improved since 2.1.59)? >As a complementary question, what is your view on the complexity of the >DestOpt implementation? > >Guilhem. 
> Join 18 million Eudora users by signing up for a free Eudora Web-Mail account at http://www.eudoramail.com From owner-netdev@oss.sgi.com Sat Dec 11 20:51:29 1999 Received: by oss.sgi.com id ; Sat, 11 Dec 1999 20:50:33 -0800 Received: from nytoday.whowhere.com ([209.1.236.38]:12450 "HELO mc-qout2.whowhere.com") by oss.sgi.com with SMTP id ; Sat, 11 Dec 1999 09:39:00 -0800 Received: from Unknown/Local ([?.?.?.?]) by shared1-mail.whowhere.com; Sat Dec 11 09:37:35 1999 To: "Alberto Escudero" , "Guilhem Tardy" Date: Sat, 11 Dec 1999 09:37:35 -0800 From: "mingliang jiang" Message-ID: Mime-Version: 1.0 Cc: "Masahito Yoshida" , engp7643@leonis.nus.edu.sg, xhmeng@mcn.xidian.edu.cn, jiang@eudoramail.com, ccfoo@hawk.ee.nus.edu.sg, netdev@oss.sgi.com X-Sent-Mail: on X-Expiredinmiddle: true X-Mailer: MailCity Service Subject: Re: Destination Options (see at the end of message) X-Sender-Ip: 203.124.1.60 Organization: QUALCOMM Eudora Web-Mail (http://www.eudoramail.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 2235 Lines: 70 Dear All, I am sorry for keeping quite, as I am still waitng for a verdict from the Motorola Research Center. :) I think eventually it makes more sense for us to implement the destination option in the linux kernel. However as I am not following the 2.3.x development, I am not in the postion to comment on the complexity issues. Regards, Yours Humble Mingliang --- Pls cc your message to jiangmingliang@hotmail.com thanks! mingliang On Fri, 10 Dec 1999 14:04:17 Guilhem Tardy wrote: >Alberto Escudero wrote: >> >> Thanks Masahito! >> >> I have the Internet DRAFT of 25 June 1999 and i am bit confussed... >> may be you can help me with this... 
>> In the Mobility support in IPv6 New Destination Options are defined >> >> - Binding update Option Format >> - Binding ACK OF >> - Binding Reques OF >> - Home Address OF >> >Correct. (latest draft is Oct. 22 1999, see >draft-ietf-mobileip-ipv6-09.txt, chapter 5) > >> Checking in the Internet Draft those options are encoded in the TLV format >> with different Option Types. >> >> 198=0xC6 for the binding update >> 7 for the Binding ACK for example >> >> Checking mh.c: >> >> #define BIND_UPDATE_TYPE 195 >> #define BIND_ACK_TYPE 2 >> >> Why does values are different? Do they change in the drafts? >I guess either those values were different at the time or the >programmers used other values as they sent UDP packets for that matter. > >> - In the documentation i have (from the TIFF) files ... it is said that >> all those bindings were not available in IPv6 kernel of linux and were >> implemented using UDP... is it that right in the 1.0 NUS MIPv6? >Destination Options are still not implemented in the IPv6 Linux kernel >(2.2.13 or 2.3.31), hence there's no Binding Update Option available >without a hack. > >I would like to have your opinion (all) on that: >shall we keep on using UDP or is it an option to actually implement the >Destination Options in the IPv6 Linux kernel (which has already greatly >improved since 2.1.59)? >As a complementary question, what is your view on the complexity of the >DestOpt implementation? > >Guilhem. 
> Join 18 million Eudora users by signing up for a free Eudora Web-Mail account at http://www.eudoramail.com

From owner-netdev@oss.sgi.com Sat Dec 11 20:51:38 1999
Received: by oss.sgi.com id ; Sat, 11 Dec 1999 20:50:36 -0800
Received: from mainframe.dgrc.crc.ca ([142.92.38.206]:43746 "EHLO mainframe.dgrc.crc.ca") by oss.sgi.com with ESMTP id ; Fri, 10 Dec 1999 12:56:40 -0800
Received: from crc.ca (curly [142.92.38.251]) by mainframe.dgrc.crc.ca (8.9.3/8.9.3) with ESMTP id PAA29907; Fri, 10 Dec 1999 15:51:02 -0500 (EST)
Message-ID: <38516875.6363A31E@crc.ca>
Date: Fri, 10 Dec 1999 15:54:13 -0500
From: Guilhem Tardy
Organization: CRC
X-Mailer: Mozilla 4.5 [en] (X11; I; SunOS 5.7 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: kuznet@ms2.inr.ac.ru
CC: netdev@oss.sgi.com
Subject: DestOpts: are they implemented?
References: <199912102019.XAA21494@ms2.inr.ac.ru>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 1074
Lines: 32

kuznet@ms2.inr.ac.ru wrote:
>
> Hello!
>
> > Destination Options are still not implemented in the IPv6 Linux kernel
> > (2.2.13 or 2.3.31),
>
> Please, explain. "Destination options" are implemented.

I am referring to the Mobile-IPv6 Destination Opts in the kernels above,
where the file exthdrs.c shows:

struct tlvtype_proc tlvprocdestopt_lst[] = {
	/* No destination options are defined now */
	{-1, NULL}
};

My understanding is that IPv6 does not have Dest Opts aside from PAD1 and
PADN, right? Then it all makes sense.

> Mobile IPv6 options are not implemented, because mobile
> is not implemented. You have to implement it for beginning 8)

That's what it is about: there's a bunch of guys out there who want to
port a 2.1.59 Mobile-IPv6 stack to the newer 2.2.x kernel (and possibly
2.3.x later on).
I am just trying to figure out the best way to do it,
as the Linux IPv6 implementation was real short at the time of 2.1.59
and solutions change with the new situation.

Note that I would be interested in a URL for the development IPv6
stack...

Guilhem Tardy.

From owner-netdev@oss.sgi.com Sat Dec 11 21:11:59 1999
Received: by oss.sgi.com id ; Sat, 11 Dec 1999 21:11:49 -0800
Received: from vindaloo.ras.ucalgary.ca ([136.159.55.21]:50561 "HELO vindaloo.ras.ucalgary.ca") by oss.sgi.com with SMTP id ; Sat, 11 Dec 1999 21:11:28 -0800
Received: (from rgooch@localhost) by vindaloo.ras.ucalgary.ca (8.6.12/8.6.12) id WAA11906; Sat, 11 Dec 1999 22:09:51 -0700
Date: Sat, 11 Dec 1999 22:09:51 -0700
Message-Id: <199912120509.WAA11906@vindaloo.ras.ucalgary.ca>
From: Richard Gooch
To: Guilhem Tardy
Cc: netdev@oss.sgi.com
Subject: Re: 'memory_start' in init/main.c
In-Reply-To: <38518124.EBF52272@crc.ca>
References: <38518124.EBF52272@crc.ca>
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 754
Lines: 26

Guilhem Tardy writes:
> Hi!
>
> I tried to compile 2.3.31, but it fails on init/main.c
>
> # in function 'start_kernel':
> # 'memory_start' undeclared
>
> Anyone who managed to compile this beastie (I guess so), would you
> mind to tell me about this?

Known problem. The kernel newsflash page tells you what to do.

BTW: this isn't the appropriate forum for general kernel
questions. The linux-kernel list at vger.rutgers.edu is the place to
ask. And make sure you read the FAQ:
http://www.tux.org/lkml/
and the kernel newsflash page:
http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
before posting to linux-kernel. It's a busy list.

Regards,

Richard....
Permanent: rgooch@atnf.csiro.au
Current: rgooch@ras.ucalgary.ca

From owner-netdev@oss.sgi.com Sun Dec 12 07:56:11 1999
Received: by oss.sgi.com id ; Sun, 12 Dec 1999 07:55:52 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:2820 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Sun, 12 Dec 1999 07:55:32 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id SAA00191; Sun, 12 Dec 1999 18:53:49 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912121553.SAA00191@ms2.inr.ac.ru>
Subject: Re: DestOpts: are they implemented?
To: Guilhem.Tardy@crc.ca (Guilhem Tardy)
Date: Sun, 12 Dec 1999 18:53:49 +0300 (MSK)
Cc: netdev@oss.sgi.com
In-Reply-To: <38516875.6363A31E@crc.ca> from "Guilhem Tardy" at Dec 10, 99 03:54:13 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 670
Lines: 23

Hello!

> My understanding is that IPv6 does not have Dest Opts aside from PAD1 and
> PADN, right?

Exactly, no more standard options exist.
A particular module may add its own, if required.

> 2.3.x later on). I am just trying to figure out the best way to do it,
> as the Linux IPv6 implementation was real short at the time of 2.1.59
> and solutions change with the new situation.

Mmm... To be honest, there have not been many changes since that time.
No structural changes at least, mainly filling holes.

> Note that I would be interested in a URL for the development IPv6
> stack...

You are listening to this URL now 8) (i.e.
netdev) Alexey Kuznetsov From owner-netdev@oss.sgi.com Sun Dec 12 08:12:11 1999 Received: by oss.sgi.com id ; Sun, 12 Dec 1999 08:12:02 -0800 Received: from mailgate.spray.se ([212.78.194.26]:24331 "HELO mailgate.spray.se") by oss.sgi.com with SMTP id ; Sun, 12 Dec 1999 08:11:47 -0800 Received: from 10.6.6.32 by mailgate.spray.se (InterScan E-Mail VirusWall NT); Sun, 12 Dec 1999 17:12:34 -0000 (GMT Standard Time) Received: by ST-EXC01 with Internet Mail Service (5.5.2448.0) id ; Sun, 12 Dec 1999 17:08:27 +0100 Message-ID: <8BBC85E87564D311B87900902799EB8B017C06CC@ST-EXC01> From: Andreas Thorstensson To: netdev@oss.sgi.com Subject: Date: Sun, 12 Dec 1999 17:08:26 +0100 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2448.0) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 8 Lines: 1 SIGNOFF From owner-netdev@oss.sgi.com Mon Dec 13 01:07:04 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 01:06:54 -0800 Received: from smtp01ffm.de.uu.net ([192.76.144.150]:27779 "EHLO smtp01ffm.de.uu.net") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 01:06:38 -0800 Received: from gate.admin.de (gate.admin.de [194.175.103.18]) by smtp01ffm.de.uu.net (5.5.5/5.5.5) with ESMTP id KAA02661 for ; Mon, 13 Dec 1999 10:05:23 +0100 (MET) Received: from castle.bo2.admin.de (castle.bo.admin.de [192.168.90.4]) by gate.admin.de (Postfix) with SMTP id 10826A8804 for ; Mon, 13 Dec 1999 11:09:15 +0100 (CET) From: Lars Heete To: netdev@oss.sgi.com Subject: [PATCH] Fix for broken unix_accept Date: Mon, 13 Dec 1999 09:46:24 +0100 X-Mailer: KMail [version 1.0.28] Content-Type: Multipart/Mixed; boundary="Boundary-=_nWlrBbmQBhCDarzOwKkYHIDdqSCD" MIME-Version: 1.0 Message-Id: <99121310120200.00900@castle.bo2.admin.de> Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 3594 Lines: 60 --Boundary-=_nWlrBbmQBhCDarzOwKkYHIDdqSCD 
Content-Type: text/plain Content-Transfer-Encoding: 8bit Hello, the accept call on unix-domain sockets in current 2.3 Linux is broken. Instead of waiting for an incoming connection, it returns ENOTCONN. The problem is the use of skb_recv_datagram, which checks for TCP_ESTABLISHED state before waiting. My solution is to use an own unix_wait_for_connect function, similar to wait_for_packet in core/datagram.c, but without unneeded error checks. Lars Heete --Boundary-=_nWlrBbmQBhCDarzOwKkYHIDdqSCD Content-Type: text/english; name="unix-accept.diff" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="unix-accept.diff" LS0tIGxpbnV4Lm9yaWcvbmV0L3VuaXgvYWZfdW5peC5jCU1vbiBEZWMgMTMgMDg6Mjg6MTEgMTk5 OQorKysgbGludXgvbmV0L3VuaXgvYWZfdW5peC5jCU1vbiBEZWMgMTMgMDk6MTA6NTUgMTk5OQpA QCAtNDMsMTAgKzQzLDExIEBACiAgKgkJCQkJbnVtYmVyIG9mIHNvY2tzIHRvIDIqbWF4X2ZpbGVz IGFuZAogICoJCQkJCXRoZSBudW1iZXIgb2Ygc2tiIHF1ZXVlYWJsZSBpbiB0aGUKICAqCQkJCQlk Z3JhbSByZWNlaXZlci4KICAqCQlBcnR1ciBTa2F3aW5hICAgOglIYXNoIGZ1bmN0aW9uIG9wdGlt aXphdGlvbnMKICAqCSAgICAgQWxleGV5IEt1em5ldHNvdiAgIDoJRnVsbCBzY2FsZSBTTVAuIExv dCBvZiBidWdzIGFyZSBpbnRyb2R1Y2VkIDgpCisgKgkgICAgIExhcnMgSGVldGUgICAgICAgICA6 CUZpeCBmb3IgdW5peF9hY2NlcHQgYnVnLgogICoKICAqCiAgKiBLbm93biBkaWZmZXJlbmNlcyBm cm9tIHJlZmVyZW5jZSBCU0QgdGhhdCB3YXMgdGVzdGVkOgogICoKICAqCVtUTyBGSVhdCkBAIC05 NDIsMTAgKzk0MywzNyBAQAogCQlzb2NrYi0+c3RhdGU9U1NfQ09OTkVDVEVEOwogCX0KIAlyZXR1 cm4gMDsKIH0KIAorLyoKKyAqIFdhaXQgZm9yIGEgcGFja2V0Li4KKyAqLworCitzdGF0aWMgaW50 IHVuaXhfd2FpdF9mb3JfY29ubmVjdChzdHJ1Y3Qgc29jayAqIHNrLCBpbnQgKmVycikKK3sKKwlp bnQgZXJyb3IgPSAwOworCURFQ0xBUkVfV0FJVFFVRVVFKHdhaXQsIGN1cnJlbnQpOworCisJX19z ZXRfY3VycmVudF9zdGF0ZShUQVNLX0lOVEVSUlVQVElCTEUpOworCWFkZF93YWl0X3F1ZXVlKHNr LT5zbGVlcCwgJndhaXQpOworCisJaWYgKHNrYl9xdWV1ZV9lbXB0eSgmc2stPnJlY2VpdmVfcXVl dWUpKSB7CisJCS8qIGhhbmRsZSBzaWduYWxzICovCisJCWlmIChzaWduYWxfcGVuZGluZyhjdXJy ZW50KSkgeworCQkJZXJyb3IgPSAqZXJyID0gLUVSRVNUQVJUU1lTOworCQl9CisJCWVsc2Ugewor 
CQkJc2NoZWR1bGUoKTsKKwkJfQorCX0KKwljdXJyZW50LT5zdGF0ZSA9IFRBU0tfUlVOTklORzsK KwlyZW1vdmVfd2FpdF9xdWV1ZShzay0+c2xlZXAsICZ3YWl0KTsKKwlyZXR1cm4gZXJyb3I7Cit9 CisKKwogc3RhdGljIGludCB1bml4X2FjY2VwdChzdHJ1Y3Qgc29ja2V0ICpzb2NrLCBzdHJ1Y3Qg c29ja2V0ICpuZXdzb2NrLCBpbnQgZmxhZ3MpCiB7CiAJdW5peF9zb2NrZXQgKnNrID0gc29jay0+ c2s7CiAJdW5peF9zb2NrZXQgKnRzazsKIAlzdHJ1Y3Qgc2tfYnVmZiAqc2tiOwpAQCAtOTU5LDIw ICs5ODcsMjggQEAKIAlpZiAoc2stPnN0YXRlIT1UQ1BfTElTVEVOKQogCQlnb3RvIG91dDsKIAog CS8qIElmIHNvY2tldCBzdGF0ZSBpcyBUQ1BfTElTVEVOIGl0IGNhbm5vdCBjaGFuZ2UsCiAJICAg c28gdGhhdCBubyBsb2NrcyBhcmUgbmVjZXNzYXJ5LgorIAorCSAgIEl0J3Mgbm90IHBvc3NpYmxl IHRvIHVzZSBza2JfcmVjdl9kYXRhZ3JhbSBoZXJlLAorCSAgIGJlY2F1c2UgdGhpcyBjaGVja3Mg aWYgdGhlIHNvY2tldCBpcyBjb25uZWN0ZWQuCiAJICovCi0KLQlza2IgPSBza2JfcmVjdl9kYXRh Z3JhbShzaywgMCwgZmxhZ3MmT19OT05CTE9DSywgJmVycik7Ci0JaWYgKCFza2IpCi0JCWdvdG8g b3V0OworCQorCXdoaWxlICgoc2tiID0gc2tiX2RlcXVldWUoJnNrLT5yZWNlaXZlX3F1ZXVlKSkg PT0gTlVMTCkgeworCQlpZiAoZmxhZ3MmT19OT05CTE9DSykgeworCQkJZXJyID0gLUVBR0FJTjsK KwkJCWdvdG8gb3V0OworCQl9CisJCWlmICh1bml4X3dhaXRfZm9yX2Nvbm5lY3Qoc2ssICZlcnIp ICE9IDApCisJCQlnb3RvIG91dDsKKyAgICAgICAgfQogCiAJdHNrID0gc2tiLT5zazsKKwlrZnJl ZV9za2Ioc2tiKTsKIAlpZiAoc2tiX3F1ZXVlX2xlbigmc2stPnJlY2VpdmVfcXVldWUpIDw9IHNr LT5tYXhfYWNrX2JhY2tsb2cvMikKIAkJd2FrZV91cF9pbnRlcnJ1cHRpYmxlKCZzay0+cHJvdGlu Zm8uYWZfdW5peC5wZWVyX3dhaXQpOwotCXNrYl9mcmVlX2RhdGFncmFtKHNrLCBza2IpOwogCiAJ LyogYXR0YWNoIGFjY2VwdGVkIHNvY2sgdG8gc29ja2V0ICovCiAJdW5peF9zdGF0ZV93bG9jayh0 c2spOwogCW5ld3NvY2stPnN0YXRlID0gU1NfQ09OTkVDVEVEOwogCW5ld3NvY2stPnNrID0gdHNr Owo= --Boundary-=_nWlrBbmQBhCDarzOwKkYHIDdqSCD-- From owner-netdev@oss.sgi.com Mon Dec 13 01:10:44 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 01:10:24 -0800 Received: from pizda.ninka.net ([216.101.162.242]:27140 "EHLO pizda.ninka.net") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 01:10:06 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id BAA01296; Mon, 13 Dec 1999 01:07:46 -0800 Date: Mon, 13 Dec 1999 01:07:46 -0800 
Message-Id: <199912130907.BAA01296@pizda.ninka.net> X-Authentication-Warning: pizda.ninka.net: davem set sender to davem@redhat.com using -f From: "David S. Miller" To: hel@admin.de CC: netdev@oss.sgi.com In-reply-to: <99121310120200.00900@castle.bo2.admin.de> (message from Lars Heete on Mon, 13 Dec 1999 09:46:24 +0100) Subject: Re: [PATCH] Fix for broken unix_accept References: <99121310120200.00900@castle.bo2.admin.de> Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 389 Lines: 13 From: Lars Heete Date: Mon, 13 Dec 1999 09:46:24 +0100 the accept call on unix-domain sockets in current 2.3 Linux is broken. Yes, we know. I've seen your patch several times before and have not lost it. I will integrate it as soon as I am caught up with 2.3.x on sparc64 so I can test and work with things once more. Later, David S. Miller davem@redhat.com From owner-netdev@oss.sgi.com Mon Dec 13 03:47:51 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 03:47:42 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:34144 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 03:47:22 -0800 Received: from ns.meltal.si (gw.meltal.si [195.246.16.135]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id FAA24077 for ; Mon, 13 Dec 1999 05:46:09 -0600 Received: from tecra (tecra.meltal.si [192.168.10.4]) by ns.meltal.si (8.8.6/8.8.6) with SMTP id MAA28801 for ; Mon, 13 Dec 1999 12:44:29 GMT Reply-To: From: "Stane Bozic" To: Date: Mon, 13 Dec 1999 12:43:43 +0100 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) In-Reply-To: Importance: Normal X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 16 Lines: 2 signoff netdev 

From owner-netdev@oss.sgi.com Mon Dec 13 22:16:47 1999
Received: by oss.sgi.com id ; Mon, 13 Dec 1999 22:16:27 -0800
Received: from linuxcare.canberra.net.au ([203.29.91.49]:18443 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 22:16:20 -0800
Received: from gargle.linuxcare.com.au (gargle.linuxcare.com.au [10.61.2.18]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id RAA09857; Tue, 14 Dec 1999 17:15:06 +1100
Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Mon, 13 Dec 1999 11:33:26 +1100 (EST)
Message-Id:
From: Paul Rusty Russell
To: "Pekka Riikonen [Adm]"
Cc: netfilter@lists.samba.org, netdev@oss.sgi.com
Subject: Re: Concurrency within netfilter hooks
In-reply-to: Your message of "Thu, 09 Dec 1999 13:08:20 +0200."
Date: Mon, 13 Dec 1999 11:33:26 +1100
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 920
Lines: 25

In message you write:
> Hi,
>
> Is the network layer or netfilter layer in Linux 2.3 (or in upcoming 2.4)
> threaded so that there is a possibility of, for example, two netfilter hooks
> executing concurrently on an SMP box? For example, if PRE_ROUTING and
> POST_ROUTING hooks would be executing at the same time and they touch
> shared data, locking would have to be used to protect concurrency. Or, is
> it guaranteed that there is never a situation like this, thus locking is
> not needed within the hooks?

You should assume that there will be arbitrary levels of concurrency,
and lock appropriately.

For 2.4, it won't happen, except for packets from userspace being
interrupted by bottom halves and timers, but this is changing: you can
look into Alexey's crystal ball at

ftp://ftp.inr.ac.ru/ip-routing/softnet-*

Hope that helps,
Rusty.
--
Hacking time.
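[Editorial sketch] The advice above, to assume arbitrary concurrency and lock whatever the hooks share, can be shown with a user-space analogy in C. This is not netfilter code: flow_stats, pre_routing_hook and post_routing_hook are hypothetical names, and a pthread mutex stands in for the lock. Real kernel hooks can run in bottom-half context, where a sleeping lock is not allowed, so a kernel version would normally use a spinlock instead.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical state touched by both hooks. */
struct flow_stats {
    pthread_mutex_t lock;     /* protects every field below */
    unsigned long seen_in;    /* packets seen on the input path  */
    unsigned long seen_out;   /* packets seen on the output path */
    unsigned long in_flight;  /* seen on input, not yet on output */
};

static struct flow_stats stats = { PTHREAD_MUTEX_INITIALIZER, 0, 0, 0 };

/* Stand-in for a PRE_ROUTING hook: may run on any CPU at any time. */
static void pre_routing_hook(void)
{
    pthread_mutex_lock(&stats.lock);
    stats.seen_in++;
    stats.in_flight++;
    pthread_mutex_unlock(&stats.lock);
}

/* Stand-in for a POST_ROUTING hook: may overlap with the hook above. */
static void post_routing_hook(void)
{
    pthread_mutex_lock(&stats.lock);
    stats.seen_out++;
    if (stats.in_flight > 0)
        stats.in_flight--;
    pthread_mutex_unlock(&stats.lock);
}

/* Read a consistent value; an unlocked read could race with a hook. */
static unsigned long in_flight_snapshot(void)
{
    unsigned long n;

    pthread_mutex_lock(&stats.lock);
    n = stats.in_flight;
    pthread_mutex_unlock(&stats.lock);
    return n;
}
```

The point of the sketch is only that the two counters updated together in each hook stay consistent because both hooks take the same lock; which lock primitive is appropriate depends on the context the hook runs in.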

From owner-netdev@oss.sgi.com Tue Dec 14 06:56:20 1999
Received: by oss.sgi.com id ; Tue, 14 Dec 1999 06:56:00 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:47372 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Tue, 14 Dec 1999 06:55:47 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id RAA09611; Tue, 14 Dec 1999 17:53:10 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912141453.RAA09611@ms2.inr.ac.ru>
Subject: Re: Concurrency within netfilter hooks
To: Paul.RUssell@linuxcare.COM.AU (Paul Rusty Russell)
Date: Tue, 14 Dec 1999 17:53:10 +0300 (MSK)
Cc: netdev@oss.sgi.com, davem@redhat.com (Dave Miller)
In-Reply-To: from "Paul Rusty Russell" at Dec 14, 99 10:13:05 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 1043
Lines: 27

Hello!

> For 2.4, it won't happen, except for packets from userspace being
> interrupted by bottom halves and timers,

Processes from userspace have really overlapped since 2.3.15.

> but this is changing: you can look into Alexey's crystal ball at

It is not necessary to look into magic crystals. 8)

- Hooks executed in process context, i.e. all output, post-routing
  etc., must be multithreaded.
- Hooks (and all the code) usually executed from net_bh
  (input, forwarding) also must be multithreaded, but softnet
  is not the reason for this. Netfilter itself creates concurrency in all the
  paths that used to be executed in net_bh context, when
  it reinjects packets.

Essentially, softnet adds __nothing__ new to these rules,
except for one thing: concurrency becomes a common, rather than
a marginal, phenomenon in all the paths.

That is the main reason why I am not joking
when I propose adding softnet before 2.4. All the complexity and
all the bugs are already in 2.3, and softnet only clarifies code
and fixes bugs.
8)8)

Alexey

From owner-netdev@oss.sgi.com Tue Dec 14 16:27:01 1999
Received: by oss.sgi.com id ; Tue, 14 Dec 1999 16:26:52 -0800
Received: from [206.171.92.89] ([206.171.92.89]:26887 "HELO penguin.filetron.com") by oss.sgi.com with SMTP id ; Tue, 14 Dec 1999 16:26:34 -0800
Received: (qmail 27119 invoked from network); 15 Dec 1999 00:23:19 -0000
Received: from ns1.filetron.com (httpd@206.171.92.1) by 206.171.92.89 with SMTP; 15 Dec 1999 00:23:19 -0000
Received: (from httpd@localhost) by ns1.filetron.com (8.8.7/8.8.7) id QAA06037; Tue, 14 Dec 1999 16:24:17 -0800
Date: Tue, 14 Dec 1999 16:24:17 -0800
Message-Id: <199912150024.QAA06037@ns1.filetron.com>
Content-Type: text/plain
Content-Disposition: inline
Mime-Version: 1.0
X-Mailer: MIME-tools 4.103 (Entity 4.115)
From: David "lordbeatnik" Jeffery
To: netdev@oss.sgi.com
Subject: two ipv6 oops on 2.3.31
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 833
Lines: 16

Hi all.

I've just joined this mailing list, so if these problems are already known,
sorry for the wasted bandwidth.

With a stock 2.3.31, I can create a kernel oops by having an ipv6 program
try to bind to a udp port. I can bind an ipv6 tcp port without a problem, but
once I try to connect and talk to it, the kernel will die with
"aiee, killing interrupt handler, etc"

I have the ksymoops output of the first oops, and a ksymoops output of what
I could copy off the screen for the tcp death. If the oops traces would be of
any use to anyone, let me know and I'll post them or send them to you. The
programs were socktest and a simple client/server pair I wrote myself. Both
would cause the same oops.

David "LordBeatnik" Jeffery
------
Do you do Linux?
:) Get your FREE @linuxstart.com email address at: http://www.linuxstart.com

From owner-netdev@oss.sgi.com Wed Dec 15 07:04:04 1999
Received: by oss.sgi.com id ; Wed, 15 Dec 1999 07:03:54 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:37382 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Wed, 15 Dec 1999 07:03:33 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id SAA02381; Wed, 15 Dec 1999 18:01:46 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912151501.SAA02381@ms2.inr.ac.ru>
Subject: Re: two ipv6 oops on 2.3.31
To: lordbeatnik@linuxstart.COM (David Jeffery lordbeatnik)
Date: Wed, 15 Dec 1999 18:01:46 +0300 (MSK)
Cc: netdev@oss.sgi.com
In-Reply-To: <199912150024.QAA06037@ns1.filetron.com> from "David Jeffery, lordbeatnik" at Dec 15, 99 04:13:02 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 953
Lines: 39

Hello!

> I've just joined this mailing list, so if these problems are already known,
> sorry for the wasted bandwidth.

Congratulations! You joined just in time. It seems you are the first user of
IPv6 with 2.3. 8) Please, do not disappear. 8)

> once I try to connect and talk to it, the kernel will die with
> "aiee, killing interrupt handler, etc"

I think this was fixed by:

diff -ur --new-file ../2.3.32/linux/net/ipv6/tcp_ipv6.c linux/net/ipv6/tcp_ipv6.c
--- ../2.3.32/linux/net/ipv6/tcp_ipv6.c	Fri Nov 19 22:33:29 1999
+++ linux/net/ipv6/tcp_ipv6.c	Tue Dec 14 22:08:26 1999
@@ -273,8 +271,8 @@
 			}
 		}
 	}
-	if (sk)
-		sock_hold(sk);
+	if (result)
+		sock_hold(result);
 	read_unlock(&tcp_lhash_lock);
 	return result;
 }

> With a stock 2.3.31, I can create a kernel oops by having an ipv6 program
> try to bind to a udp port.

I do not remember such a bug... Please send the test program
and the oops with symbolic information.
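[Editorial sketch] The two-line fix quoted above takes the reference on the entry that is returned (result) rather than on the loop cursor (sk). A user-space C sketch of the same pattern follows. It is an illustration under stated assumptions, not the kernel's lookup routine: struct listener, listener_hold and listener_lookup are hypothetical names, and the scoring rule is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical listening-socket entry with a reference count. */
struct listener {
    int port;
    int bound_addr;           /* 0 means "any address" (wildcard) */
    int refcnt;
    struct listener *next;
};

/* Stand-in for sock_hold(): pin the entry for the caller. */
static void listener_hold(struct listener *l)
{
    l->refcnt++;
}

/* Return the best-matching listener with a reference held, or NULL.
 * An exact address match beats a wildcard match. */
static struct listener *listener_lookup(struct listener *head,
                                        int port, int addr)
{
    struct listener *cur, *result = NULL;
    int best = 0;

    for (cur = head; cur != NULL; cur = cur->next) {
        int score;

        if (cur->port != port)
            continue;
        score = 1;                        /* wildcard match */
        if (cur->bound_addr != 0) {
            if (cur->bound_addr != addr)
                continue;
            score = 2;                    /* exact match */
        }
        if (score > best) {
            best = score;
            result = cur;
        }
    }
    /* Hold `result`, not `cur`: after the loop `cur` is NULL, and in
     * general the cursor is not the entry being handed back. */
    if (result != NULL)
        listener_hold(result);
    return result;
}
```

If the reference were taken on the cursor instead, the returned entry would carry no reference of its own, so a later release by the caller could drop its count too far; that is one plausible way to end up with the kind of crash reported in this thread.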

Alexey Kuznetsov

From owner-netdev@oss.sgi.com Wed Dec 15 10:32:05 1999
Received: by oss.sgi.com id ; Wed, 15 Dec 1999 10:31:54 -0800
Received: from [206.171.92.89] ([206.171.92.89]:24837 "HELO penguin.filetron.com") by oss.sgi.com with SMTP id ; Wed, 15 Dec 1999 10:31:26 -0800
Received: (qmail 10467 invoked from network); 15 Dec 1999 18:28:14 -0000
Received: from ns1.filetron.com (httpd@206.171.92.1) by 206.171.92.89 with SMTP; 15 Dec 1999 18:28:14 -0000
Received: (from httpd@localhost) by ns1.filetron.com (8.8.7/8.8.7) id KAA29421; Wed, 15 Dec 1999 10:29:10 -0800
Date: Wed, 15 Dec 1999 10:29:10 -0800
Message-Id: <199912151829.KAA29421@ns1.filetron.com>
Content-Type: text/plain
Content-Disposition: inline
Mime-Version: 1.0
X-Mailer: MIME-tools 4.103 (Entity 4.115)
From: David Jeffery
To: kuznet@ms2.inr.ac.ru
Subject: tcp oops fix successful
Cc: netdev@oss.sgi.com
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 379
Lines: 6

Thanks Alexey, that patch you sent did the trick on the tcp crash. For some
reason patch didn't like the way I copied your patch and rejected it, but once
I put it in by hand and built it, I've been able to use tcp with ipv6 without
another oops.

David "LordBeatnik" Jeffery
------
Do you do Linux?
:) Get your FREE @linuxstart.com email address at: http://www.linuxstart.com

From owner-netdev@oss.sgi.com Wed Dec 15 15:53:57 1999
Received: by oss.sgi.com id ; Wed, 15 Dec 1999 15:53:28 -0800
Received: from wanwan.sfc.wide.ad.jp ([203.178.140.19]:8632 "EHLO wanwan.sfc.wide.ad.jp") by oss.sgi.com with ESMTP id ; Wed, 15 Dec 1999 15:53:09 -0800
Received: from sfc.wide.ad.jp (localhost [127.0.0.1]) by wanwan.sfc.wide.ad.jp (8.9.3+3.2W/3.7Wpl2-wanwan) with ESMTP id IAA05354; Thu, 16 Dec 1999 08:51:34 +0900 (JST) (envelope-from kusune@sfc.wide.ad.jp)
Message-Id: <199912152351.IAA05354@wanwan.sfc.wide.ad.jp>
To: kuznet@ms2.inr.ac.ru
Cc: netdev@oss.sgi.com
Subject: source address selection
In-reply-to: Your message of "Wed, 15 Dec 1999 18:01:46 JST." <199912151501.SAA02381@ms2.inr.ac.ru>
Mime-Version: 1.0 (generated by tm-edit 7.106)
Content-Type: multipart/mixed; boundary="Multipart_Thu_Dec_16_08:51:33_1999-1"
Content-Transfer-Encoding: 7bit
Date: Thu, 16 Dec 1999 08:51:34 +0900
From: Takeshi Kusune / =?ISO-2022-JP?B?GyRARm86LE06O1YbKEo=?=
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 4350
Lines: 191

--Multipart_Thu_Dec_16_08:51:33_1999-1
Content-Type: text/plain; charset=US-ASCII

Hi, Alex.

In message <199912151501.SAA02381@ms2.inr.ac.ru>
Subject: Re: two ipv6 oops on 2.3.31
kuznet@ms2.inr.ac.ru wrote:

>> > once I try to connect and talk to it, the kernel will die with
>> > "aiee, killing interrupt handler, etc"
>>
>> I think this was fixed by:
>>
>> diff -ur --new-file ../2.3.32/linux/net/ipv6/tcp_ipv6.c linux/net/ipv6/tcp_ipv6.c

That's a nice early X'mas gift :), Thank you.

BTW, I think that Linux's current IPv6 code is not sufficient for
multi-addressed networks, because source address selection is weak.

I wrote a quick hack that does longest-match source address selection.
Please merge this code (or something like it) into the current code.
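[Editorial sketch] Longest-match source address selection, as proposed above, picks the local address that shares the most leading bits with the destination. The following user-space C sketch shows the idea only; common_prefix_len and pick_source are hypothetical names, and the code ignores scope and the deprecated/tentative flags that the attached patch also considers. Note that the patch's inet6_addr_diff, as posted, sets addrlen to 4 and then shifts it right by 2, so it appears to compare only the first 32-bit word; the sketch compares all 128 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of leading bits shared by two 16-byte IPv6 addresses (0..128). */
static int common_prefix_len(const uint8_t a[16], const uint8_t b[16])
{
    int i;

    for (i = 0; i < 16; i++) {
        uint8_t x = (uint8_t)(a[i] ^ b[i]);

        if (x != 0) {
            int bits = 0;

            /* Count equal bits at the top of the first differing byte. */
            while ((x & 0x80) == 0) {
                x = (uint8_t)(x << 1);
                bits++;
            }
            return i * 8 + bits;
        }
    }
    return 128;
}

/* Return the index of the candidate with the longest common prefix with
 * `dst`, or -1 if there are no candidates. Ties keep the earliest entry. */
static int pick_source(const uint8_t dst[16],
                       const uint8_t cand[][16], size_t n)
{
    int best = -1, best_len = -1;
    size_t i;

    assert(dst != NULL);
    for (i = 0; i < n; i++) {
        int len = common_prefix_len(dst, cand[i]);

        if (len > best_len) {
            best_len = len;
            best = (int)i;
        }
    }
    return best;
}
```

With a destination of 2001:db8:1::1 and candidates fe80::1, 2001:db8:2::5 and 2001:db8:1::9, the last candidate wins because it shares the longest prefix with the destination.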
(In fact, this code is originally written for 2.2.13 kernel, and tested with it (works fine). I try to fixed it for 2.3.31 but it is not tested well.) -- Takeshi Kusune # Sorry for my poor english. --Multipart_Thu_Dec_16_08:51:33_1999-1 Content-Type: application/octet-stream; type=patch Content-Disposition: attachment; filename="longest-match.for-2.3.31.diff" Content-Transfer-Encoding: 7bit --- linux-2.3.31/net/ipv6/addrconf.c.orig Sat Nov 20 04:33:29 1999 +++ linux-2.3.31/net/ipv6/addrconf.c Thu Dec 16 08:12:02 1999 @@ -125,6 +125,34 @@ MAX_RTR_SOLICITATION_DELAY, /* rtr solicit delay */ }; +/* + * from ip6_fib.c + */ +static __inline__ int inet6_addr_diff(void *token1, void *token2) +{ + __u32 *a1 = token1; + __u32 *a2 = token2; + int addrlen = 4; + int i; + + addrlen >>= 2; + for (i = 0; i < addrlen; i++) { + __u32 xb; + + xb = a1[i] ^ a2[i]; + if (xb) { + int j = 31; + + xb = ntohl(xb); + while (test_bit(j, &xb) == 0) + j--; + return (i * 32 + 31 - j); + } + } + + return addrlen<<5; +} + int ipv6_addr_type(struct in6_addr *addr) { u32 st; @@ -450,6 +478,8 @@ struct net_device *dev = NULL; struct inet6_dev *idev; struct rt6_info *rt; + int deprecated = 1; + int matchlen = 0; int err; rt = (struct rt6_info *) dst; @@ -480,16 +510,31 @@ read_lock_bh(&idev->lock); for (ifp=idev->addr_list; ifp; ifp=ifp->if_next) { if (ifp->scope == scope) { + int newlen = inet6_addr_diff(daddr, &ifp->addr); + + if (newlen < matchlen) + continue; + if (!(ifp->flags & (IFA_F_DEPRECATED|IFA_F_TENTATIVE))) { + match = ifp; + matchlen = newlen; + deprecated = 0; in6_ifa_hold(ifp); - read_unlock_bh(&idev->lock); - read_unlock(&addrconf_lock); - goto out; + if (match != NULL) + in6_ifa_put(match); + continue; } - if (!match && !(ifp->flags & IFA_F_TENTATIVE)) { + if (!deprecated) + continue; + + if (!(ifp->flags & IFA_F_TENTATIVE)) { + matchlen = newlen; match = ifp; in6_ifa_hold(ifp); + if (match != NULL) + in6_ifa_put(match); + continue; } } } @@ -498,6 +543,9 @@ 
read_unlock(&addrconf_lock); } + if (match != NULL && !deprecated) + goto out; + if (scope == IFA_LINK) goto out; @@ -513,30 +561,43 @@ read_lock_bh(&idev->lock); for (ifp=idev->addr_list; ifp; ifp=ifp->if_next) { if (ifp->scope == scope) { + int newlen = inet6_addr_diff(daddr, &ifp->addr); + + if (newlen < matchlen) + continue; + if (!(ifp->flags&(IFA_F_DEPRECATED|IFA_F_TENTATIVE))) { in6_ifa_hold(ifp); - read_unlock_bh(&idev->lock); - goto out_unlock_base; + if (match != NULL) + in6_ifa_put(match); + matchlen = newlen; + deprecated = 0; + match = ifp; + continue; } - if (!match && !(ifp->flags&IFA_F_TENTATIVE)) { - match = ifp; + if (!deprecated) + continue; + + if (!(ifp->flags & IFA_F_TENTATIVE)) { in6_ifa_hold(ifp); + if (match != NULL) + in6_ifa_put(match); + matchlen = newlen; + match = ifp; + continue; } } } read_unlock_bh(&idev->lock); } } - -out_unlock_base: read_unlock(&addrconf_lock); read_unlock(&dev_base_lock); out: if (ifp == NULL) { ifp = match; - match = NULL; } err = -EADDRNOTAVAIL; @@ -545,8 +606,6 @@ err = 0; in6_ifa_put(ifp); } - if (match) - in6_ifa_put(match); return err; } --Multipart_Thu_Dec_16_08:51:33_1999-1-- From owner-netdev@oss.sgi.com Wed Dec 15 19:14:20 1999 Received: by oss.sgi.com id ; Wed, 15 Dec 1999 19:14:10 -0800 Received: from [206.171.92.89] ([206.171.92.89]:42245 "HELO penguin.filetron.com") by oss.sgi.com with SMTP id ; Wed, 15 Dec 1999 19:13:41 -0800 Received: (qmail 9668 invoked from network); 16 Dec 1999 03:10:34 -0000 Received: from ns1.filetron.com (httpd@206.171.92.1) by 206.171.92.89 with SMTP; 16 Dec 1999 03:10:34 -0000 Received: (from httpd@localhost) by ns1.filetron.com (8.8.7/8.8.7) id TAA28841; Wed, 15 Dec 1999 19:11:29 -0800 Date: Wed, 15 Dec 1999 19:11:29 -0800 Message-Id: <199912160311.TAA28841@ns1.filetron.com> Mime-Version: 1.0 X-Mailer: MIME-tools 4.103 (Entity 4.115) From: David Jeffery To: netdev@oss.sgi.com Subject: [patch] ipv6 netfilter Content-Type: multipart/mixed; 
boundary="----------=_945313889-28839-0" Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 10116 Lines: 169 This is a multi-part message in MIME format... ------------=_945313889-28839-0 Content-Type: text/plain Content-Disposition: inline Hi all. To continue the trend of ipv6 patches flowing around, here is a patch to add some netfilter hooks for ipv6. It modifies ip6_input.c and ip6_output.c and creates netfilter_ipv6.h. I'm sure it misses some packet paths but it should get unicast tcp and udp traffic. I've also got a test module that uses it at http://lordbeatnik.dynodns.net/netfilter named test.c that connects to each netfilter hook and printk's a debug message saying what hook was just called and some info about the packet. I'd like to get some feedback on people's opinions of what I've done. I believe I've placed the NF_HOOKs in the best locations but I don't know for sure. Anyway, I'd like to have netfilter hooks for ipv6 so please send me any suggestions/flames/comments about this patch if someone has the time. David "LordBeatnik" Jeffery p.s. this patch doesn't include Alexey's tcp fix, but I have a patch for it on my website. The attached patch can also be downloaded from my website. ------ Do you do Linux? 
:) Get your FREE @linuxstart.com email address at: http://www.linuxstart.com ------------=_945313889-28839-0 Content-Type: application/octet-stream; name="nf_ipv6_0.2-2.3.33.patch" Content-Disposition: inline; filename="nf_ipv6_0.2-2.3.33.patch" Content-Transfer-Encoding: base64 ZGlmZiAtdSAtciAtTiBsaW51eC0yLjMuMzMvaW5jbHVkZS9saW51eC9uZXRm aWx0ZXJfaXB2Ni5oIGxpbnV4L2luY2x1ZGUvbGludXgvbmV0ZmlsdGVyX2lw djYuaAotLS0gbGludXgtMi4zLjMzL2luY2x1ZGUvbGludXgvbmV0ZmlsdGVy X2lwdjYuaAlXZWQgRGVjIDMxIDE5OjAwOjAwIDE5NjkKKysrIGxpbnV4L2lu Y2x1ZGUvbGludXgvbmV0ZmlsdGVyX2lwdjYuaAlUdWUgRGVjIDE0IDIxOjE0 OjI2IDE5OTkKQEAgLTAsMCArMSwyNSBAQAorI2lmbmRlZiBfX0xJTlVYX0lQ Nl9ORVRGSUxURVJfSAorI2RlZmluZSBfX0xJTlVYX0lQNl9ORVRGSUxURVJf SAorCisvKiBJUHY0LXNwZWNpZmljIGRlZmluZXMgZm9yIG5ldGZpbHRlci4g CisgKiAoQykxOTk4IFJ1c3R5IFJ1c3NlbGwgLS0gVGhpcyBjb2RlIGlzIEdQ TC4KKyAqLworCisjaW5jbHVkZSA8bGludXgvY29uZmlnLmg+CisjaW5jbHVk ZSA8bGludXgvbmV0ZmlsdGVyLmg+CisKKy8qIElQNiBIb29rcyAqLworLyog QWZ0ZXIgcHJvbWlzYyBkcm9wcywgY2hlY2tzdW0gY2hlY2tzLiAqLworI2Rl ZmluZSBORl9JUDZfUFJFX1JPVVRJTkcJMAorLyogSWYgdGhlIHBhY2tldCBp cyBkZXN0aW5lZCBmb3IgdGhpcyBib3guICovCisjZGVmaW5lIE5GX0lQNl9M T0NBTF9JTgkJMQorLyogSWYgdGhlIHBhY2tldCBpcyBkZXN0aW5lZCBmb3Ig YW5vdGhlciBpbnRlcmZhY2UuICovCisjZGVmaW5lIE5GX0lQNl9GT1JXQVJE CQkyCisvKiBQYWNrZXRzIGNvbWluZyBmcm9tIGEgbG9jYWwgcHJvY2Vzcy4g Ki8KKyNkZWZpbmUgTkZfSVA2X0xPQ0FMX09VVAkJMworLyogUGFja2V0cyBh Ym91dCB0byBoaXQgdGhlIHdpcmUuICovCisjZGVmaW5lIE5GX0lQNl9QT1NU X1JPVVRJTkcJNAorI2RlZmluZSBORl9JUDZfTlVNSE9PS1MJCTUKKworCisj ZW5kaWYgLypfX0xJTlVYX0lQNl9ORVRGSUxURVJfSCovCmRpZmYgLXUgLXIg LU4gbGludXgtMi4zLjMzL25ldC9pcHY2L2lwNl9pbnB1dC5jIGxpbnV4L25l dC9pcHY2L2lwNl9pbnB1dC5jCi0tLSBsaW51eC0yLjMuMzMvbmV0L2lwdjYv aXA2X2lucHV0LmMJV2VkIERlYyAgOCAxMjoxNTo1MyAxOTk5CisrKyBsaW51 eC9uZXQvaXB2Ni9pcDZfaW5wdXQuYwlUdWUgRGVjIDE0IDIwOjQ4OjE0IDE5 OTkKQEAgLTI2LDYgKzI2LDkgQEAKICNpbmNsdWRlIDxsaW51eC9pbjYuaD4K ICNpbmNsdWRlIDxsaW51eC9pY21wdjYuaD4KIAorI2luY2x1ZGUgPGxpbnV4 L25ldGZpbHRlci5oPgorI2luY2x1ZGUgPGxpbnV4L25ldGZpbHRlcl9pcHY2 
Lmg+CisKICNpbmNsdWRlIDxuZXQvc29jay5oPgogI2luY2x1ZGUgPG5ldC9z bm1wLmg+CiAKQEAgLTM4LDYgKzQxLDE2IEBACiAjaW5jbHVkZSA8bmV0L2Fk ZHJjb25mLmg+CiAKIAorCitzdGF0aWMgaW50IGlwNl9yY3ZfZmluaXNoKCBz dHJ1Y3Qgc2tfYnVmZiAqc2tiKSAKK3sKKworCWlmIChza2ItPmRzdCA9PSBO VUxMKQorCQlpcDZfcm91dGVfaW5wdXQoc2tiKTsKKworCXJldHVybiBza2It PmRzdC0+aW5wdXQoc2tiKTsKK30KKwogaW50IGlwdjZfcmN2KHN0cnVjdCBz a19idWZmICpza2IsIHN0cnVjdCBuZXRfZGV2aWNlICpkZXYsIHN0cnVjdCBw YWNrZXRfdHlwZSAqcHQpCiB7CiAJc3RydWN0IGlwdjZoZHIgKmhkcjsKQEAg LTc3LDEyICs5MCw3IEBACiAJCQlyZXR1cm4gMDsKIAkJfQogCX0KLQotCWlm IChza2ItPmRzdCA9PSBOVUxMKQotCQlpcDZfcm91dGVfaW5wdXQoc2tiKTsK LQotCXJldHVybiBza2ItPmRzdC0+aW5wdXQoc2tiKTsKLQorCXJldHVybiBO Rl9IT09LKFBGX0lORVQ2LE5GX0lQNl9QUkVfUk9VVElORywgc2tiLCBkZXYs IE5VTEwsIGlwNl9yY3ZfZmluaXNoKTsKIHRydW5jYXRlZDoKIAlpcHY2X3N0 YXRpc3RpY3MuSXA2SW5UcnVuY2F0ZWRQa3RzKys7CiBlcnI6CkBAIC05Nyw3 ICsxMDUsOCBAQAogICoJRGVsaXZlciB0aGUgcGFja2V0IHRvIHRoZSBob3N0 CiAgKi8KIAotaW50IGlwNl9pbnB1dChzdHJ1Y3Qgc2tfYnVmZiAqc2tiKQor CitzdGF0aWMgaW5saW5lIGludCBpcDZfaW5wdXRfZmluaXNoKHN0cnVjdCBz a19idWZmICpza2IpCiB7CiAJc3RydWN0IGlwdjZoZHIgKmhkciA9IHNrYi0+ bmguaXB2Nmg7CiAJc3RydWN0IGluZXQ2X3Byb3RvY29sICppcHByb3Q7CkBA IC0xODIsNiArMTkxLDEyIEBACiAJcmV0dXJuIDA7CiB9CiAKKworaW50IGlw Nl9pbnB1dChzdHJ1Y3Qgc2tfYnVmZiAqc2tiKQoreworCXJldHVybiBORl9I T09LKFBGX0lORVQ2LE5GX0lQNl9MT0NBTF9JTiwgc2tiLCBza2ItPmRldiwg TlVMTCwgaXA2X2lucHV0X2ZpbmlzaCk7Cit9CisKIGludCBpcDZfbWNfaW5w dXQoc3RydWN0IHNrX2J1ZmYgKnNrYikKIHsKIAlzdHJ1Y3QgaXB2NmhkciAq aGRyOwkKQEAgLTIzMSwzICsyNDYsNCBAQAogCiAJcmV0dXJuIDA7CiB9CisK ZGlmZiAtdSAtciAtTiBsaW51eC0yLjMuMzMvbmV0L2lwdjYvaXA2X291dHB1 dC5jIGxpbnV4L25ldC9pcHY2L2lwNl9vdXRwdXQuYwotLS0gbGludXgtMi4z LjMzL25ldC9pcHY2L2lwNl9vdXRwdXQuYwlXZWQgRGVjICA4IDEyOjE1OjU0 IDE5OTkKKysrIGxpbnV4L25ldC9pcHY2L2lwNl9vdXRwdXQuYwlXZWQgRGVj IDE1IDIxOjQyOjI2IDE5OTkKQEAgLTM0LDYgKzM0LDkgQEAKICNpbmNsdWRl IDxsaW51eC9pbjYuaD4KICNpbmNsdWRlIDxsaW51eC9yb3V0ZS5oPgogCisj aW5jbHVkZSA8bGludXgvbmV0ZmlsdGVyLmg+CisjaW5jbHVkZSA8bGludXgv 
bmV0ZmlsdGVyX2lwdjYuaD4KKwogI2luY2x1ZGUgPG5ldC9zb2NrLmg+CiAj aW5jbHVkZSA8bmV0L3NubXAuaD4KIApAQCAtNDcsNiArNTAsMjYgQEAKIAog c3RhdGljIHUzMglpcHY2X2ZyYWdtZW50YXRpb25faWQgPSAxOwogCitzdGF0 aWMgaW5saW5lIGludCBpcDZfb3V0cHV0X2ZpbmlzaChzdHJ1Y3Qgc2tfYnVm ZiAqc2tiKQoreworCisJc3RydWN0IGRzdF9lbnRyeSAqZHN0ID0gc2tiLT5k c3Q7CisJc3RydWN0IGhoX2NhY2hlICpoaCA9IGRzdC0+aGg7CisKKwlpZiAo aGgpIHsKKwkJcmVhZF9sb2NrX2JoKCZoaC0+aGhfbG9jayk7CisJCW1lbWNw eShza2ItPmRhdGEgLSAxNiwgaGgtPmhoX2RhdGEsIDE2KTsKKwkJcmVhZF91 bmxvY2tfYmgoJmhoLT5oaF9sb2NrKTsKKwkgICAgICAgIHNrYl9wdXNoKHNr YiwgaGgtPmhoX2xlbik7CisJCXJldHVybiBoaC0+aGhfb3V0cHV0KHNrYik7 CisJfSBlbHNlIGlmIChkc3QtPm5laWdoYm91cikKKwkJcmV0dXJuIGRzdC0+ bmVpZ2hib3VyLT5vdXRwdXQoc2tiKTsKKworCWtmcmVlX3NrYihza2IpOwor CXJldHVybiAtRUlOVkFMOworCit9CisKIGludCBpcDZfb3V0cHV0KHN0cnVj dCBza19idWZmICpza2IpCiB7CiAJc3RydWN0IGRzdF9lbnRyeSAqZHN0ID0g c2tiLT5kc3Q7CkBAIC03NCwxNyArOTcsMzkgQEAKIAkJaXB2Nl9zdGF0aXN0 aWNzLklwNk91dE1jYXN0UGt0cysrOwogCX0KIAotCWlmIChoaCkgewotCQly ZWFkX2xvY2tfYmgoJmhoLT5oaF9sb2NrKTsKLQkJbWVtY3B5KHNrYi0+ZGF0 YSAtIDE2LCBoaC0+aGhfZGF0YSwgMTYpOwotCQlyZWFkX3VubG9ja19iaCgm aGgtPmhoX2xvY2spOwotCSAgICAgICAgc2tiX3B1c2goc2tiLCBoaC0+aGhf bGVuKTsKLQkJcmV0dXJuIGhoLT5oaF9vdXRwdXQoc2tiKTsKLQl9IGVsc2Ug aWYgKGRzdC0+bmVpZ2hib3VyKQotCQlyZXR1cm4gZHN0LT5uZWlnaGJvdXIt Pm91dHB1dChza2IpOworCXJldHVybiBORl9IT09LKFBGX0lORVQ2LCBORl9J UDZfUE9TVF9ST1VUSU5HLCBza2IsTlVMTCwgc2tiLT5kZXYsaXA2X291dHB1 dF9maW5pc2gpOwogCi0Ja2ZyZWVfc2tiKHNrYik7Ci0JcmV0dXJuIC1FSU5W QUw7Cit9CisKKyNpZmRlZiBDT05GSUdfTkVURklMVEVSCitzdGF0aWMgaW50 IHJvdXRlNl9tZV9oYXJkZXIoc3RydWN0IHNrX2J1ZmYgKnNrYikKK3sKKwlz dHJ1Y3QgaXB2NmhkciAqaXA2aCA9IHNrYi0+bmguaXB2Nmg7CisJc3RydWN0 IHJ0Nl9pbmZvICpydDY7CisvKiBpcyB0aGlzIHJpZ2h0PyBpcyB0aGlzIHVz ZSBvZiBydDZfbG9va3VwIG9uIHRoZXNlIExPQ0FMX09VVCBwYWNrZXRzCith IGdvb2Qgd2F5IHRvIHJlcm91dGUgY2hhbmdlZCBwYWNrZXRzPyBESgorIEZJ WE1FKi8KKwlydDYgPSBydDZfbG9va3VwKCZpcDZoLT5kYWRkciwgJmlwNmgt PnNhZGRyLCAwLCAwKTsKKworCWlmKHJ0NiA9PSBOVUxMKXsKKwkJcHJpbnRr 
KEtFUk5fREVCVUcgInJvdXRlNl9tZV9oYXJkZXIgZmFpbHVyZSBcbiIpOwor CQlyZXR1cm4gLUVJTlZBTDsKKwl9CisJc2tiLT5kc3QgPSAmcnQ2LT51LmRz dDsKKwlyZXR1cm4gMDsKK30KKyNlbmRpZiAvKiBDT05GSUdfTkVURklMVEVS ICovCisKK3N0YXRpYyBpbmxpbmUgaW50IGlwNl9tYXliZV9yZXJvdXRlKHN0 cnVjdCBza19idWZmICpza2IpCit7CisjaWZkZWYgQ09ORklHX05FVEZJTFRF UgorCWlmIChza2ItPm5mY2FjaGUgJiBORkNfQUxURVJFRCl7CisJCWlmIChy b3V0ZTZfbWVfaGFyZGVyKHNrYikgIT0gMCl7CisJCQlrZnJlZV9za2Ioc2ti KTsKKwkJfQorCX0KKyNlbmRpZiAvKiBDT05GSUdfTkVURklMVEVSICovCisJ cmV0dXJuIHNrYi0+ZHN0LT5vdXRwdXQoc2tiKTsKIH0KIAogLyoKQEAgLTE0 OSw3ICsxOTQsNyBAQAogCiAJaWYgKHNrYi0+bGVuIDw9IGRzdC0+cG10dSkg ewogCQlpcHY2X3N0YXRpc3RpY3MuSXA2T3V0UmVxdWVzdHMrKzsKLQkJcmV0 dXJuIGRzdC0+b3V0cHV0KHNrYik7CisJCXJldHVybiBORl9IT09LKFBGX0lO RVQ2LCBORl9JUDZfTE9DQUxfT1VULCBza2IsIE5VTEwsIGRzdC0+ZGV2LCBp cDZfbWF5YmVfcmVyb3V0ZSk7CiAJfQogCiAJcHJpbnRrKEtFUk5fREVCVUcg IklQdjY6IHNlbmRpbmcgcGt0X3Rvb19iaWcgdG8gc2VsZlxuIik7CkBAIC0z NzgsNyArNDIzLDcgQEAKIAogCQkJaXB2Nl9zdGF0aXN0aWNzLklwNkZyYWdD cmVhdGVzKys7CiAJCQlpcHY2X3N0YXRpc3RpY3MuSXA2T3V0UmVxdWVzdHMr KzsKLQkJCWVyciA9IGRzdC0+b3V0cHV0KHNrYik7CisJCQllcnIgPSBORl9I T09LKFBGX0lORVQ2LE5GX0lQNl9MT0NBTF9PVVQsIHNrYiwgTlVMTCwgZHN0 LT5kZXYsIGlwNl9tYXliZV9yZXJvdXRlKTsKIAkJCWlmIChlcnIpIHsKIAkJ CQlrZnJlZV9za2IobGFzdF9za2IpOwogCQkJCXJldHVybiBlcnI7CkBAIC00 MDQsNyArNDQ5LDcgQEAKIAlpcHY2X3N0YXRpc3RpY3MuSXA2RnJhZ0NyZWF0 ZXMrKzsKIAlpcHY2X3N0YXRpc3RpY3MuSXA2RnJhZ09LcysrOwogCWlwdjZf c3RhdGlzdGljcy5JcDZPdXRSZXF1ZXN0cysrOwotCXJldHVybiBkc3QtPm91 dHB1dChsYXN0X3NrYik7CisJcmV0dXJuIE5GX0hPT0soUEZfSU5FVDYsIE5G X0lQNl9MT0NBTF9PVVQsIGxhc3Rfc2tiLCBOVUxMLGRzdC0+ZGV2LCBpcDZf bWF5YmVfcmVyb3V0ZSk7CiB9CiAKIGludCBpcDZfYnVpbGRfeG1pdChzdHJ1 Y3Qgc29jayAqc2ssIGluZXRfZ2V0ZnJhZ190IGdldGZyYWcsIGNvbnN0IHZv aWQgKmRhdGEsCkBAIC01NzIsNyArNjE3LDcgQEAKIAogCQlpZiAoIWVycikg ewogCQkJaXB2Nl9zdGF0aXN0aWNzLklwNk91dFJlcXVlc3RzKys7Ci0JCQll cnIgPSBkc3QtPm91dHB1dChza2IpOworCQkJZXJyID0gTkZfSE9PSyhQRl9J TkVUNiwgTkZfSVA2X0xPQ0FMX09VVCwgc2tiLCBOVUxMLCBkc3QtPmRldiwg 
aXA2X21heWJlX3Jlcm91dGUpOwogCQl9IGVsc2UgewogCQkJZXJyID0gLUVG QVVMVDsKIAkJCWtmcmVlX3NrYihza2IpOwpAQCAtNTg0LDcgKzYyOSw2IEBA CiAJCQllcnIgPSAtRU1TR1NJWkU7CiAJCQlnb3RvIG91dDsKIAkJfQotCiAJ CWVyciA9IGlwNl9mcmFnX3htaXQoc2ssIGdldGZyYWcsIGRhdGEsIGRzdCwg ZmwsIG9wdCwgZmluYWxfZHN0LCBobGltaXQsCiAJCQkJICAgIGZsYWdzLCBs ZW5ndGgsIG10dSk7CiAJfQpAQCAtNjI2LDYgKzY3MCwxMSBAQAogCXJldHVy biAwOwogfQogCitzdGF0aWMgaW5saW5lIGludCBpcDZfZm9yd2FyZF9maW5p c2goc3RydWN0IHNrX2J1ZmYgKnNrYikKK3sKKwlyZXR1cm4gc2tiLT5kc3Qt Pm91dHB1dChza2IpOworfQorCiBpbnQgaXA2X2ZvcndhcmQoc3RydWN0IHNr X2J1ZmYgKnNrYikKIHsKIAlzdHJ1Y3QgZHN0X2VudHJ5ICpkc3QgPSBza2It PmRzdDsKQEAgLTcxNCwxMiArNzYzLDExIEBACiAJLyogTWFuZ2xpbmcgaG9w cyBudW1iZXIgZGVsYXllZCB0byBwb2ludCBhZnRlciBza2IgQ09XICovCiAg CiAJaGRyLT5ob3BfbGltaXQtLTsKLQogCWlwdjZfc3RhdGlzdGljcy5JcDZP dXRGb3J3RGF0YWdyYW1zKys7Ci0JcmV0dXJuIGRzdC0+b3V0cHV0KHNrYik7 Ci0KKwlyZXR1cm4gTkZfSE9PSyhQRl9JTkVUNixORl9JUDZfRk9SV0FSRCwg c2tiLCBza2ItPmRldiwgZHN0LT5kZXYsIGlwNl9mb3J3YXJkX2ZpbmlzaCk7 CiBkcm9wOgogCWlwdjZfc3RhdGlzdGljcy5JcDZJbkFkZHJFcnJvcnMrKzsK IAlrZnJlZV9za2Ioc2tiKTsKIAlyZXR1cm4gLUVJTlZBTDsKIH0KKwo= ------------=_945313889-28839-0-- From owner-netdev@oss.sgi.com Thu Dec 16 01:04:28 1999 Received: by oss.sgi.com id ; Thu, 16 Dec 1999 01:04:18 -0800 Received: from linuxcare.canberra.net.au ([203.29.91.49]:9 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Thu, 16 Dec 1999 01:03:57 -0800 Received: from gargle.linuxcare.com.au (gargle.linuxcare.com.au [10.61.2.18]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id UAA04819; Thu, 16 Dec 1999 20:02:35 +1100 Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Thu, 16 Dec 1999 20:10:28 +1100 (EST) Message-Id: From: Paul Rusty Russell To: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Cc: torvalds@transmeta.com Subject: Re: [PATCH] packet fragmentation after POST_ROUTING netfilter hook In-reply-to: Your message of "Sat, 04 Dec 1999 19:42:33 +0300." 
    <199912041642.TAA31099@ms2.inr.ac.ru>
Date: Thu, 16 Dec 1999 20:10:28 +1100
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 3599
Lines: 77

In message <199912041642.TAA31099@ms2.inr.ac.ru> you write:
> Paul, I am sorry, but it is principial position. Code must be optimal,
> fragmention by ip_fragment() is deprecated. It is usable, but programmer
> _must_ take care of fragmentation itself exactly to feel that he does
> something wrong.

Linus, please apply.

OK. This means that my conntrack code needs to *refragment* as the
very last thing (eg. `ip_fragment(skb, ip_finish_output2)'). This
means that ip_fragment() needs to copy skb->dev, and that the hooks
need access to the okfn for this special case.

Pretty icky, but fragmentation always is.

--- linux-2.3-official/net/ipv4/ip_output.c	Tue Nov 30 17:58:59 1999
+++ linux-2.3/net/ipv4/ip_output.c	Thu Dec 16 17:36:17 1999
@@ -850,6 +854,7 @@
 		if (skb->sk)
 			skb_set_owner_w(skb2, skb->sk);
 		skb2->dst = dst_clone(skb->dst);
+		skb2->dev = skb->dev;

 		/*
 		 *	Copy the packet header into the new buffer.
diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/include/linux/netfilter.h linux-2.3/include/linux/netfilter.h
--- linux-2.3-official/include/linux/netfilter.h	Fri Dec 10 18:40:14 1999
+++ linux-2.3/include/linux/netfilter.h	Sun Dec 12 17:04:37 1999
@@ -36,7 +36,8 @@
 typedef unsigned int nf_hookfn(unsigned int hooknum,
 			       struct sk_buff **skb,
 			       const struct net_device *in,
-			       const struct net_device *out);
+			       const struct net_device *out,
+			       int (*okfn)(struct sk_buff *));

 typedef unsigned int nf_cacheflushfn(const void *packet,
 				     const struct net_device *in,
diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/core/netfilter.c linux-2.3/net/core/netfilter.c
--- linux-2.3-official/net/core/netfilter.c	Tue Nov 30 17:58:19 1999
+++ linux-2.3/net/core/netfilter.c	Sun Dec 12 17:07:22 1999
@@ -353,11 +353,12 @@
 		      int hook,
 		      const struct net_device *indev,
 		      const struct net_device *outdev,
-		      struct list_head **i)
+		      struct list_head **i,
+		      int (*okfn)(struct sk_buff *))
 {
 	for (*i = (*i)->next; *i != head; *i = (*i)->next) {
 		struct nf_hook_ops *elem = (struct nf_hook_ops *)*i;
-		switch (elem->hook(hook, skb, indev, outdev)) {
+		switch (elem->hook(hook, skb, indev, outdev, okfn)) {
 		case NF_QUEUE:
 			NFDEBUG("nf_iterate: NF_QUEUE for %p.\n", *skb);
 			return NF_QUEUE;
@@ -471,7 +472,7 @@
 	read_lock_bh(&nf_lock);
 	elem = &nf_hooks[pf][hook];
 	verdict =
nf_iterate(&nf_hooks[pf][hook], &skb, hook, indev,
-			     outdev, &elem);
+			     outdev, &elem, okfn);
 	if (verdict == NF_QUEUE) {
 		NFDEBUG("nf_hook: Verdict = QUEUE.\n");
 		nf_queue(skb, elem, pf, hook, indev, outdev, okfn);
@@ -553,7 +554,8 @@
 		skb->nfmark = mark;
 		verdict = nf_iterate(&nf_hooks[info->pf][info->hook],
 				     &skb, info->hook,
-				     info->indev, info->outdev, &elem);
+				     info->indev, info->outdev, &elem,
+				     info->okfn);
 	}

 	if (verdict == NF_QUEUE) {
--
Hacking time.

From owner-netdev@oss.sgi.com Thu Dec 16 02:08:08 1999
Received: by oss.sgi.com id ; Thu, 16 Dec 1999 02:07:59 -0800
Received: from mailhost.iitb.ac.in ([202.54.44.115]:46605 "HELO mailhost.iitb.ac.in")
    by oss.sgi.com with SMTP id ; Thu, 16 Dec 1999 02:07:39 -0800
Received: (qmail 31765 invoked from network); 16 Dec 1999 10:27:21 -0000
Received: from surya.cse.iitb.ernet.in (144.16.111.14)
    by mailhost.iitb.ac.in with SMTP; 16 Dec 1999 10:27:21 -0000
Received: from everest.cse.iitb.ernet.in (everest [144.16.111.4])
    by surya.cse.iitb.ernet.in (8.8.8/8.8.8) with ESMTP id PAA21036 for ; Thu, 16 Dec 1999 15:36:17 +0530 (IST)
Received: (from dhiman@localhost)
    by everest.cse.iitb.ernet.in (8.9.2/8.9.2) id PAA26936 for netdev@oss.sgi.com; Thu, 16 Dec 1999 15:36:13 +0530 (GMT)
Date: Thu, 16 Dec 1999 15:36:12 +0530
From: Dhiman Barman
To: netdev@oss.sgi.com
Subject: Differentaited Services under Linux
Message-ID: <19991216153612.A26326@cse.iitb.ernet.in>
Reply-To: dhiman@cse.iitb.ernet.in
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.1i
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 454
Lines: 12

Hi all,

I have the following problem. I have successfully upgraded the
Linux kernel 2.2.3 after patching. Now I want to apply the patches for
Differentiated Services which I got from the site
www.telematik.informatik.uni-lkarlsruhe.de/...
But the patches seem to be extremely buggy. At every line, patch complains.
Has anyone ever tried these patches? Or can anyone give
me some other pointer for the same?

Thanks.

Regards,
Dhiman

From owner-netdev@oss.sgi.com Thu Dec 16 02:53:29 1999
Received: by oss.sgi.com id ; Thu, 16 Dec 1999 02:53:19 -0800
Received: from linuxcare.canberra.net.au ([203.29.91.49]:2569 "EHLO front.linuxcare.com.au")
    by oss.sgi.com with ESMTP id ; Thu, 16 Dec 1999 02:53:03 -0800
Received: from gargle.linuxcare.com.au (gargle.linuxcare.com.au [10.61.2.18])
    by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id VAA05606; Thu, 16 Dec 1999 21:52:03 +1100
Received: from rustcorp.com.au (really [127.0.0.1])
    by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Thu, 16 Dec 1999 21:59:56 +1100 (EST)
Message-Id:
From: Paul Rusty Russell
To: netdev@oss.sgi.com
cc: torvalds@transmeta.com
Subject: [PATCH] TCP syn output interface passed to firewall hooks.
Date: Thu, 16 Dec 1999 21:59:56 +1100
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 483
Lines: 18

Linus, please apply v2.3.32. My fuckup. Checked, and this is the only
case of bad interface args to firewall hooks in ipv4.

--- linux-2.3-official/net/ipv4/ip_output.c	Tue Nov 30 17:58:59 1999
+++ linux-2.3/net/ipv4/ip_output.c	Thu Dec 16 20:51:58 1999
@@ -182,7 +182,7 @@
 	ip_send_check(iph);

 	/* Send it out. */
-	NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, NULL,
+	NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
 		output_maybe_reroute);
 }
--
Hacking time.
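
A minimal user-space sketch of why the output device argument matters to a hook, in case it helps readers of the patch above. It assumes simplified stand-in types and made-up helpers (`drop_on_eth1`, `fake_output`, `run_hook`, `demo` are all hypothetical); only the hook signature, with its extra okfn argument, is modelled on the patched nf_hookfn. It is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types for illustration; not the kernel definitions. */
struct net_device { const char *name; };
struct sk_buff { int len; };

enum { NF_DROP = 0, NF_ACCEPT = 1 };
enum { NF_IP_LOCAL_OUT = 3 };   /* hook number, used only as a label here */

/* Hook signature modelled on the patched nf_hookfn: the hook also
 * receives okfn, the function that continues packet processing. */
typedef unsigned int nf_hookfn(unsigned int hooknum,
                               struct sk_buff **skb,
                               const struct net_device *in,
                               const struct net_device *out,
                               int (*okfn)(struct sk_buff *));

/* Hypothetical rule: drop everything leaving via "eth1".
 * When out is NULL the rule cannot match, so the packet is accepted. */
static unsigned int drop_on_eth1(unsigned int hooknum, struct sk_buff **skb,
                                 const struct net_device *in,
                                 const struct net_device *out,
                                 int (*okfn)(struct sk_buff *))
{
    (void)hooknum; (void)skb; (void)in; (void)okfn;
    if (out != NULL && strcmp(out->name, "eth1") == 0)
        return NF_DROP;
    return NF_ACCEPT;
}

static int packets_sent;

/* Stand-in for the continuation function (the okfn). */
static int fake_output(struct sk_buff *skb)
{
    (void)skb;
    packets_sent++;
    return 0;
}

/* Much-simplified stand-in for NF_HOOK with one registered hook:
 * call okfn on accept, return -1 (an arbitrary error value) on drop. */
static int run_hook(nf_hookfn *hook, unsigned int hooknum,
                    struct sk_buff *skb,
                    const struct net_device *in,
                    const struct net_device *out,
                    int (*okfn)(struct sk_buff *))
{
    if (hook(hooknum, &skb, in, out, okfn) == NF_ACCEPT)
        return okfn(skb);
    return -1;
}

/* Runs both call styles and returns how many packets got through. */
int demo(void)
{
    struct net_device eth1 = { "eth1" };
    struct sk_buff skb = { 40 };

    packets_sent = 0;
    /* Old call style: out is NULL, the interface rule is bypassed. */
    assert(run_hook(drop_on_eth1, NF_IP_LOCAL_OUT, &skb,
                    NULL, NULL, fake_output) == 0);
    /* Fixed call style: out is known, the rule matches and drops. */
    assert(run_hook(drop_on_eth1, NF_IP_LOCAL_OUT, &skb,
                    NULL, &eth1, fake_output) == -1);
    return packets_sent;
}
```

Calling `demo()` from a `main` exercises both cases: with a NULL output device the interface match never fires, which is the kind of silent miss the one-line fix avoids.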
From owner-netdev@oss.sgi.com Thu Dec 16 08:12:04 1999 Received: by oss.sgi.com id ; Thu, 16 Dec 1999 08:11:54 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:11017 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Thu, 16 Dec 1999 08:11:37 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id TAA07161; Thu, 16 Dec 1999 19:09:31 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912161609.TAA07161@ms2.inr.ac.ru> Subject: Re: source address selection To: kusune@sfc.wide.ad.jp (Takeshi Kusune / =?ISO-2022-JP?B?GyRARm86LE06O1YbKEo=?=) Date: Thu, 16 Dec 1999 19:09:31 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: <199912152351.IAA05354@wanwan.sfc.wide.ad.jp> from "Takeshi Kusune / =?ISO-2022-JP?B?GyRARm86LE06O1YbKEo=?=" at Dec 16, 99 08:51:34 am X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 958 Lines: 21 Hello! > BTW, I think that Linux's current IPv6 code is not enough to work with > multi-addressed network, because of weakness in source address selection. The patch is good and correct, but the solution is bad. We may make this thing in IPv4 because function inet_select_addr() is not in data path; selected source addresses are stored in routing tables. Correct solution would cache once found source address in IPv6 routing table, probably, cloning route, if it is required. BTW we have to make this thing in IPv4 because by historical reasons internet routing is very messy and smart source selection is required to route replies back. Moreover, by the same historical reasons, almost all IP apps are confused, when loopback address appears as source address of packet destined to some address of the host. Emerging IPv6 principles should not inherit all of this brain damage, so that the problem is present but it is not critical. 
Alexey Kuznetsov

From owner-netdev@oss.sgi.com Thu Dec 16 15:51:37 1999
Received: by oss.sgi.com id ; Thu, 16 Dec 1999 15:51:18 -0800
Received: from athena.nuclecu.unam.mx ([132.248.29.9]:10761 "EHLO athena.nuclecu.unam.mx")
    by oss.sgi.com with ESMTP id ; Thu, 16 Dec 1999 15:50:57 -0800
Received: from scooter.eye-net.com.au (scooter.eye-net.com.au [202.139.6.76])
    by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id RAA29736 for ; Thu, 16 Dec 1999 17:50:00 -0600
Received: by scooter.eye-net.com.au (Postfix, from userid 1000) id AB4919890; Fri, 17 Dec 1999 19:28:54 +1100 (EST)
Subject: IPv6 tunnels don't work if ifconfig'ed then ip'ed
To: netdev@roxanne.nuclecu.unam.mx
Date: Fri, 17 Dec 1999 19:28:54 +1100 (EST)
X-Mailer: ELM [version 2.4ME+ PL65 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id: <19991217082854.AB4919890@scooter.eye-net.com.au>
From: csmall@scooter.eye-net.com.au (Craig Small)
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 1407
Lines: 41

G'day All,

I've hit a weird bug, in either the network tools or the kernel, while
switching from the ifconfig-type settings to the ip settings (ip is so
much nicer).

It seems that if you ifconfig a tunnel (say sit1), then ifconfig sit1
down, and then try to build a tunnel using ip tunnel add, the tunnel
doesn't get built at all; no errors, no nothing. ip should at the very
least complain.

I am unsure where this bug should go, as I don't know enough about the
tools and the kernel to know whose fault it is.

# uname -a
Linux scooter 2.2.13 #1 Thu Dec 16 18:15:29 EST 1999 i686 unknown
# ip -V
ip utility, iproute2-ss991023
# ifconfig -V
net-tools 1.52
ifconfig 1.39 (1999-03-18)
# ifconfig sit0 up tunnel ::10.1.2.3
# ifconfig sit2 up
# ifconfig sit2 down
# ip tunnel add foo mode sit remote 10.1.2.3
# ifconfig foo
foo: error fetching interface information: Device not found
# ip tunnel add bar mode sit remote 10.1.2.4
# ifconfig bar
bar       Link encap:IPv6-in-IPv4
          POINTOPOINT NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

--
Craig Small VK2XLZ, PGP: AD 8D D8 63 6E BF C3 C7 47 41 B1 A2 1F 46 EC 90
Eye-Net Consulting http://www.eye-net.com.au/
MIEEE Debian developer

From owner-netdev@oss.sgi.com Fri Dec 17 09:22:52 1999
Received: by oss.sgi.com id ; Fri, 17 Dec 1999 09:22:33 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:49669 "HELO ms2.inr.ac.ru")
    by oss.sgi.com with SMTP id ; Fri, 17 Dec 1999 09:22:13 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id UAA04490; Fri, 17 Dec 1999 20:16:37 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912171716.UAA04490@ms2.inr.ac.ru>
Subject: Re: IPv6 tunnels don't work if ifconfig'ed then ip'ed
To: csmall@scooter.eye-net.COM.AU (Craig Small)
Date: Fri, 17 Dec 1999 20:16:36 +0300 (MSK)
Cc: netdev@oss.sgi.com
In-Reply-To: <19991217082854.AB4919890@scooter.eye-net.com.au> from "Craig Small" at Dec 17, 99 03:13:15 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 534
Lines: 18

Hello!

> It seems if you ifconfig a tunnel (say sit1) and then ifconfig sit1 down

BTW if you do not down it, everything will be the same.

> and then try to build a tunnel using ip tunnel add the tunnel doesn't
> get built at all;

It is built successfully, but it is called "sit1", as its original
creator called it.
The "add" operation identifies tunnels by endpoint addresses and does
not rename existing tunnels. The new name is ignored.

You should delete sit1 before attempting to create a new one, or use
"ip link set sit1 name foo".

Alexey

From owner-netdev@oss.sgi.com Fri Dec 17 14:57:09 1999
Received: by oss.sgi.com id ; Fri, 17 Dec 1999 14:56:59 -0800
Received: from relay.planetinternet.be ([194.119.232.24]:33031 "EHLO relay.planetinternet.be")
    by oss.sgi.com with ESMTP id ; Fri, 17 Dec 1999 14:56:41 -0800
Received: from dialup.eunet.be (postfix@dialup421.leuven.eunet.be [195.207.1.221])
    by relay.planetinternet.be (8.9.3/8.9.3) with ESMTP id XAA19021 for ; Fri, 17 Dec 1999 23:55:05 +0100
Received: by dialup.eunet.be (Postfix, from userid 501) id D83A167612; Fri, 17 Dec 1999 23:55:46 +0100 (CET)
Date: Fri, 17 Dec 1999 23:55:46 +0100
From: Q
To: netdev@oss.sgi.com
Subject: ipv6 connect to localhost
Message-ID: <19991217235546.A201@ping.be>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0pre2i
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 192
Lines: 11

I just tried to connect to localhost with ipv6 on kernel 2.3.31
and I got an OOPS.

It looks like a recursive loop from the trace; most of the screen was gone.

Hope this helps a little.

Q

From owner-netdev@oss.sgi.com Fri Dec 17 16:55:48 1999
Received: by oss.sgi.com id ; Fri, 17 Dec 1999 16:55:38 -0800
Received: from ha2.rdc1.bc.wave.home.com ([24.2.10.67]:45495 "EHLO lh2.rdc1.bc.home.com")
    by oss.sgi.com with ESMTP id ; Fri, 17 Dec 1999 16:55:25 -0800
Received: from pinky.deadbeef.ca ([24.113.49.197])
    by lh2.rdc1.bc.home.com (InterMail v4.01.01.00 201-229-111) with SMTP id <19991218005435.TPPZ19112.lh2.rdc1.bc.home.com@pinky.deadbeef.ca> for ; Fri, 17 Dec 1999 16:54:35 -0800
From: "Prairie Flower"
To: "netdev@oss.sgi.com"
Date: Fri, 17 Dec 1999 16:53:57 +0800
Reply-To: "Prairie Flower"
X-Mailer: PMMail 1.96a For OS/2
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Subject: ne2000
Message-Id: <19991218005435.TPPZ19112.lh2.rdc1.bc.home.com@pinky.deadbeef.ca>
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 303
Lines: 12

I just put a third ne2000 card into my firewall to make a demilitarized
zone. I can get any combination of two cards to work but not all three
at the same time. I have the ne2000 driver compiled into the kernel and
am passing ether parameters to the kernel.

Any suggestions?

wildrose@home.com

From owner-netdev@oss.sgi.com Fri Dec 17 21:40:21 1999
Received: by oss.sgi.com id ; Fri, 17 Dec 1999 21:40:12 -0800
Received: from mailhost.iitb.ac.in ([202.54.44.115]:58885 "HELO mailhost.iitb.ac.in")
    by oss.sgi.com with SMTP id ; Fri, 17 Dec 1999 21:39:55 -0800
Received: (qmail 8631 invoked from network); 17 Dec 1999 18:14:07 -0000
Received: from surya.cse.iitb.ernet.in (144.16.111.14)
    by mailhost.iitb.ac.in with SMTP; 17 Dec 1999 18:14:07 -0000
Received: from everest.cse.iitb.ernet.in (dhiman@everest [144.16.111.4])
    by surya.cse.iitb.ernet.in (8.8.8/8.8.8) with ESMTP id XAA11763 for ; Fri, 17 Dec 1999 23:22:58 +0530 (IST)
Received: (from dhiman@localhost)
    by everest.cse.iitb.ernet.in (8.9.2/8.9.2) id XAA26950 for netdev@oss.sgi.com; Fri, 17 Dec 1999 23:22:54 +0530 (GMT)
Date: Fri, 17 Dec 1999 23:22:54 +0530
From: Dhiman Barman
To: netdev@oss.sgi.com
Subject: kernel compilation
Message-ID: <19991217232254.A26580@cse.iitb.ernet.in>
Reply-To: dhiman@cse.iitb.ernet.in
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.1i
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 372
Lines: 15

Hi,

while compiling kernel 2.2.10 (make bzImage), I got the following
error at the end:

tools/build -b bbootsect bsetup compressed/bvmlinux.out CURRENT > bzImage
make[1]: *** [bzImage] Error 139
make[1]: Leaving directory `/usr/src/linux-2.2.10/linux/arch/i386/boot'
make: *** [bzImage] Error 2

Can someone see a problem there? Please respond.
cheers, Dhiman From owner-netdev@oss.sgi.com Sat Dec 18 02:38:03 1999 Received: by oss.sgi.com id ; Sat, 18 Dec 1999 02:37:52 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:48696 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Sat, 18 Dec 1999 02:37:28 -0800 Received: from tunnel.bieringer.de (tunnel.bieringer.de [195.226.187.50]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id EAA19860 for ; Sat, 18 Dec 1999 04:36:39 -0600 Received: (from peter@localhost) by tunnel.bieringer.de (8.9.3/8.9.3) id LAA12106; Sat, 18 Dec 1999 11:36:22 +0100 Message-Id: <3.0.6.32.19991218113723.007fbea0@mail.bieringer.de> X-URL: http://www.bieringer.de/pb/ X-Sender: peter@mail.bieringer.de X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.6 (32) Date: Sat, 18 Dec 1999 11:37:23 +0100 To: linux-ipv6@inner.net, netdev@nuclecu.unam.mx, users@ipv6.org From: Peter Bieringer Subject: IPv6+Linux status page useful? Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 698 Lines: 21 Hi all, a short lecture by me relating to the current status of IPv6 and Linux and short discussions afterwards on the 1. IPv6-Conference of the IPv6-Forum in Germany 1999 (Berlin 08.-09.12.1999) gave me the feeling that there is no real current status of the IPv6 implementation in Linux (on kernel and application side) available. Therefore I've created following HTML page (it's a sneak preview and has no link from the HowTo at the moment): http://www.bieringer.de/linux/IPv6/IPv6-HOWTO/IPv6+Linux-status.html If you feel that this idea is good and you have more information to fill in, please let me know. I will try to update the page as soon as I got new information. 
TIA, Peter From owner-netdev@oss.sgi.com Sat Dec 18 07:19:14 1999 Received: by oss.sgi.com id ; Sat, 18 Dec 1999 07:18:55 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:38406 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Sat, 18 Dec 1999 07:18:39 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id SAA01076; Sat, 18 Dec 1999 18:17:45 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912181517.SAA01076@ms2.inr.ac.ru> Subject: Re: IPv6+Linux status page useful? To: pb@bieringer.DE (Peter Bieringer) Date: Sat, 18 Dec 1999 18:17:45 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: <3.0.6.32.19991218113723.007fbea0@mail.bieringer.de> from "Peter Bieringer" at Dec 18, 99 05:35:28 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 860 Lines: 47 Hello! > Linux kernel IPv6 features ... > Dynamic tunneling What's it? > Statefull autoconfiguration Not a kernel issue. Some DHCPv6 implementations exist, but I did not hear about any attempts to compile it for linux. > Renumbering link net working (2.2) with radvd > Renumbering hierarchicly Not kernel issues. Actually, stateless addrconf also not a kernel issue, but we have it historically there. > Multicast router Not implemented. > Multicast client Multicast client working (2.2/2.3) current kernel source > Anycast router No differences of unicast case. "Anycast listening" is really different, because we should prohibit to use these addresses as source addresses and delay NDISC replies. > IPv6 options... IPv6 options working (2.2/2.3) current kernel source > Discovery features... What's it? 
Alexey From owner-netdev@oss.sgi.com Sat Dec 18 12:46:12 1999 Received: by oss.sgi.com id ; Sat, 18 Dec 1999 12:46:02 -0800 Received: from web123.yahoomail.com ([205.180.60.191]:46607 "HELO web123.yahoomail.com") by oss.sgi.com with SMTP id ; Sat, 18 Dec 1999 12:45:42 -0800 Received: (qmail 16495 invoked by uid 60001); 18 Dec 1999 20:44:59 -0000 Message-ID: <19991218204459.16494.qmail@web123.yahoomail.com> Received: from [209.233.238.122] by web123.yahoomail.com; Sat, 18 Dec 1999 12:44:59 PST Date: Sat, 18 Dec 1999 12:44:59 -0800 (PST) From: Cacophonix Gaul Subject: Re: IPv6+Linux status page useful? To: kuznet@ms2.inr.ac.ru, Peter Bieringer Cc: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1019 Lines: 27 --- kuznet@ms2.inr.ac.ru wrote: > > Anycast router > > No differences of unicast case. > > "Anycast listening" is really different, because we should > prohibit to use these addresses as source addresses > and delay NDISC replies. > You can only prohibit using anycast addresses as source addresses only if you are semantically able to distinguish an anycast address from a unicast address. This may be possible with the reserved subnet anycast addresses that are defined for ipv6. However, nothing prevents the user from using any unicast address as an anycast address. For e.g., I use anycast with IPv4 as a simple mechanism to provide failover between linux servers. You need to understand the constraints, but if you do, anycast is simpler, and more scalable than most of the other clustering schemes, not to mention free. cheers! __________________________________________________ Do You Yahoo!? Thousands of Stores. Millions of Products. All in one place. Yahoo! 
Shopping: http://shopping.yahoo.com From owner-netdev@oss.sgi.com Sat Dec 18 13:06:42 1999 Received: by oss.sgi.com id ; Sat, 18 Dec 1999 13:06:32 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:27252 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Sat, 18 Dec 1999 13:06:21 -0800 Received: from web125.yahoomail.com (web125.yahoomail.com [205.180.60.193]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with SMTP id PAA24374 for ; Sat, 18 Dec 1999 15:05:36 -0600 Received: (qmail 21884 invoked by uid 60001); 18 Dec 1999 21:05:35 -0000 Message-ID: <19991218210535.21883.qmail@web125.yahoomail.com> Received: from [209.233.238.122] by web125.yahoomail.com; Sat, 18 Dec 1999 13:05:35 PST Date: Sat, 18 Dec 1999 13:05:35 -0800 (PST) From: Cacophonix Gaul Subject: Re: IPv6+Linux status page useful? To: Peter Bieringer , linux-ipv6@inner.net, netdev@nuclecu.unam.mx, users@ipv6.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 815 Lines: 22 How about the ipv4-ipv6 socks translator at http://www.socks.nec.com/translator.html I presume by "renumbering hierarchically" you mean router renumbering? What about some of the v4-v6 transition schemes? For e.g, 6to4, etc. At the last IETF someone from Compaq mentioned that they are prototyping some of the v4-v6 translators on linux, I'm not aware of the status. Thanks for setting up the page! --- Peter Bieringer wrote: > > Therefore I've created following HTML page (it's a sneak preview and has no > link from the HowTo at the moment): > http://www.bieringer.de/linux/IPv6/IPv6-HOWTO/IPv6+Linux-status.html > __________________________________________________ Do You Yahoo!? Thousands of Stores. Millions of Products. All in one place. Yahoo! 
Shopping: http://shopping.yahoo.com From owner-netdev@oss.sgi.com Sat Dec 18 17:04:44 1999 Received: by oss.sgi.com id ; Sat, 18 Dec 1999 17:04:34 -0800 Received: from athena.nuclecu.unam.mx ([132.248.29.9]:44656 "EHLO athena.nuclecu.unam.mx") by oss.sgi.com with ESMTP id ; Sat, 18 Dec 1999 17:04:16 -0800 Received: from cerberus.nemoto.ecei.tohoku.ac.jp (root@cerberus.nemoto.ecei.tohoku.ac.jp [130.34.199.67]) by athena.nuclecu.unam.mx (8.8.7/8.8.7) with ESMTP id TAA26266 for ; Sat, 18 Dec 1999 19:03:31 -0600 Received: from localhost (yoshfuji@localhost [127.0.0.1]) by cerberus.nemoto.ecei.tohoku.ac.jp (8.9.3+3.2W/8.9.3/Debian 8.9.3-6) with ESMTP id KAA00504; Sun, 19 Dec 1999 10:03:16 +0900 To: linux-ipv6@inner.net, netdev@nuclecu.unam.mx, users@ipv6.org Subject: Re: IPv6+Linux status page useful? From: Hideaki YOSHIFUJI In-Reply-To: <3.0.6.32.19991218113723.007fbea0@mail.bieringer.de> References: <3.0.6.32.19991218113723.007fbea0@mail.bieringer.de> X-Mailer: Mew version 1.94 on Emacs 20.3 / Mule 4.0 =?iso-2022-jp?B?KBskQjJWMWMbKEIp?= X-URL: http://www.ecei.tohoku.ac.jp/%7Eyoshfuji/ X-Fingerprint: F7 31 65 99 5E B2 BB A7 15 15 13 23 18 06 A9 6F 57 00 6B 25 X-Pgp5-Key-Url: http://cerberus.nemoto.ecei.tohoku.ac.jp/%7Eyoshfuji/yoshfuji@ecei.tohoku.ac.jp.asc Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <19991219100316X.yoshfuji@cerberus.nemoto.ecei.tohoku.ac.jp> Date: Sun, 19 Dec 1999 10:03:16 +0900 X-Dispatcher: imput version 990905(IM130) Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 828 Lines: 21 Hi, In article <3.0.6.32.19991218113723.007fbea0@mail.bieringer.de> (at Sat, 18 Dec 1999 11:37:23 +0100), Peter Bieringer says: > Therefore I've created following HTML page (it's a sneak preview and has no > link from the HowTo at the moment): > http://www.bieringer.de/linux/IPv6/IPv6-HOWTO/IPv6+Linux-status.html You can see some information about IPv6 
patched / enabled applications at . It contains 'beta' patches, but
most of them should be ok.

Also, the IPv6 Application Porting Project has been started.
Please see and join us!

--
Hideaki YOSHIFUJI
Web Page: http://www.ecei.tohoku.ac.jp/%7Eyoshfuji/
PGP5i FP: F731 6599 5EB2 BBA7 1515 1323 1806 A96F 5700 6B25

From owner-netdev@oss.sgi.com Sun Dec 19 07:27:21 1999
Received: by oss.sgi.com id ; Sun, 19 Dec 1999 07:27:12 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:41478 "HELO ms2.inr.ac.ru")
    by oss.sgi.com with SMTP id ; Sun, 19 Dec 1999 07:26:56 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id SAA04119; Sun, 19 Dec 1999 18:23:48 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912191523.SAA04119@ms2.inr.ac.ru>
Subject: Re: IPv6+Linux status page useful?
To: cacophonix@yahoo.com (Cacophonix Gaul)
Date: Sun, 19 Dec 1999 18:23:48 +0300 (MSK)
Cc: pb@bieringer.DE, netdev@oss.sgi.com
In-Reply-To: <19991218204459.16494.qmail@web123.yahoomail.com> from "Cacophonix Gaul" at Dec 18, 99 12:44:59 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing
Content-Length: 1071
Lines: 26

Hello!

> However, nothing prevents the
> user from using any unicast address as an anycast address.

It is impossible even to say "someone uses an unicast as anycast."
Think a bit, anycast != unicast only from receiver viewpoint.
If receiver thinks that an address is anycast, it _is_ anycast.

> For e.g., I use anycast with IPv4 as a simple mechanism to provide
> failover between linux servers. You need to understand the constraints,
> but if you do, anycast is simpler, and more scalable than most of the
> other clustering schemes, not to mention free.

I am afraid you are not talking about anycasts...

Anycasts as defined in IPv6 (and can be used with IPv4 with the same
success, provided ARP listeners agree to some simple requirements,
borrowed from NDISC. Linux-2.2 does.)
are useless for any serious applications beyond scope of stateless datagram services f.e. name services. I am afraid you tell about _unicast_ assigned to multiple hosts. This case has nothing common both with anycasts and with any other practice ever approved by IETF to nowadays. Alexey From owner-netdev@oss.sgi.com Sun Dec 19 14:02:09 1999 Received: by oss.sgi.com id ; Sun, 19 Dec 1999 14:01:49 -0800 Received: from web117.yahoomail.com ([205.180.60.91]:24330 "HELO web117.yahoomail.com") by oss.sgi.com with SMTP id ; Sun, 19 Dec 1999 14:01:23 -0800 Received: (qmail 23998 invoked by uid 60001); 19 Dec 1999 22:00:45 -0000 Message-ID: <19991219220045.23997.qmail@web117.yahoomail.com> Received: from [209.233.238.122] by web117.yahoomail.com; Sun, 19 Dec 1999 14:00:45 PST Date: Sun, 19 Dec 1999 14:00:45 -0800 (PST) From: Cacophonix Gaul Subject: Re: IPv6+Linux status page useful? To: kuznet@ms2.inr.ac.ru Cc: pb@bieringer.DE, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 2039 Lines: 52 Hi Alexey, --- kuznet@ms2.inr.ac.ru wrote: > > However, nothing prevents the > > user from using any unicast address as an anycast address. > > It is impossible even to say "someone uses an unicast as anycast." > Think a bit, anycast != unicast only from receiver viewpoint. > If receiver thinks that an address is anycast, it __ anycast. > Note, I said unicast address as anycast address. Of course, the unicast mechanism is different from anycast. Also, there is no need for the receiver to be aware of the fact that it has an anycast address - it could be part of an anycast group without it's knowledge. > are useless for any serious applications beyond scope of stateless > datagram services f.e. name services. Use of simple anycast requires some additional care, even for things like dns. 
However, depending on your goal, there are -simple- tricks that can be played to possibly meet certain requirements (for e.g, automatic failover) for _certain_ stateful applications. Nevertheless, I concur that anycast is not appropriate for general usage by users who are not familiar with it's implications, except where formal definitions have been made, such as anycast PIM-RP's. I'd written up something a while ago describing one such trick, which I'll be glad to dig up & send to you if you are interested - let me know. > > I am afraid you tell about _unicast_ assigned to multiple hosts. rfc1546 describes this as one possible anycast address allocation mechanism. > This case has nothing common both with anycasts and with any > other practice ever approved by IETF to nowadays. Actually, there was a BoF in the last IETF to discuss anycast (both v4 and v6) & it's applications. While there was no commitment to pursue anything formal yet, there was no intent to prohibit it's use within individual administrative domains either. cheers! __________________________________________________ Do You Yahoo!? Thousands of Stores. Millions of Products. All in one place. Yahoo! Shopping: http://shopping.yahoo.com From owner-netdev@oss.sgi.com Sun Dec 19 22:48:16 1999 Received: by oss.sgi.com id ; Sun, 19 Dec 1999 22:47:56 -0800 Received: from linuxcare.canberra.net.au ([203.29.91.49]:8464 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Sun, 19 Dec 1999 22:47:31 -0800 Received: from gargle.linuxcare.com.au (gargle.linuxcare.com.au [10.61.2.18]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id RAA10806; Mon, 20 Dec 1999 17:46:49 +1100 Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Mon, 20 Dec 1999 17:54:36 +1100 (EST) Message-Id: From: Rusty Russell To: "B. 
James Phillippe" Cc: netdev@oss.sgi.com Subject: Re: Tracking iterations in net_bh In-reply-to: Your message of "Sat, 18 Dec 1999 22:51:17 -0800." Date: Mon, 20 Dec 1999 17:54:36 +1100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 638 Lines: 19 In message you write: > Greetings, > > I have a kernel application (firewall driver) which wants to be able to > tell if, for example, the packet I am looking at in the output chain is the > same one that came in on the input chain. This is hard ATM: consider fragmentation. Other than this, you can use the nf_mark/fwmark field. If the router plugin architecture suggested goes in 2.5, then tracking packets could be one side effect, since the frag code could keep this association intact: http://arl.wustl.edu/~dan/papers/rt_plugins_sigcomm98.pdf Rusty. -- Hacking time. From owner-netdev@oss.sgi.com Mon Dec 20 00:16:36 1999 Received: by oss.sgi.com id ; Mon, 20 Dec 1999 00:16:16 -0800 Received: from earth.terran.org ([216.39.144.215]:18954 "EHLO terran.org") by oss.sgi.com with ESMTP id ; Mon, 20 Dec 1999 00:15:58 -0800 Received: from neptune.terran (IDENT:root@neptune.terran [192.168.2.8]) by terran.org (8.9.3/8.9.3) with ESMTP id AAA08894; Mon, 20 Dec 1999 00:15:12 -0800 Received: from localhost (bryan@localhost) by neptune.terran (8.9.3/8.9.4) with ESMTP id AAA26651; Mon, 20 Dec 1999 00:15:12 -0800 X-Authentication-Warning: neptune.terran: bryan owned process doing -bs Date: Mon, 20 Dec 1999 00:15:12 -0800 (PST) From: "B. 
James Phillippe" To: Rusty Russell cc: netdev@oss.sgi.com Subject: Re: Tracking iterations in net_bh In-Reply-To: Message-ID: Organization: terran.org/darkforest.org X-Subliminal-Message: Use Linux MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1216 Lines: 30 On Mon, 20 Dec 1999, Rusty Russell wrote: > In message you write: > > Greetings, > > > > I have a kernel application (firewall driver) which wants to be able to > > tell if, for example, the packet I am looking at in the output chain is > > the same one that came in on the input chain. > > This is hard ATM: consider fragmentation. Other than this, you can > use the nf_mark/fwmark field. The code in question still hasn't made it to 2.2 yet, although that's RSN. :-) What if, instead of keeping track of the iterations (which is rather indirectly trying to solve the problem), we track the skb itself? With, for example, a unique skb id which starts at 0 and is incrementally assigned to each new skb that is allocated, until it wraps? The goal is to be able to look at a packet in the forward or output chain and very quickly say "yeah, I just saw this guy in the input chain and there's no need to re-run some expensive computations on him". It is otherwise expensive (based on the number of interfaces present) to determine if a packet originated on the firewall itself. 
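A minimal user-space sketch of the skb id idea described above, not kernel code. It assumes a hypothetical `id` field stamped at allocation time and a small direct-mapped "recently seen" cache; `struct pkt`, `pkt_init`, `seen_mark`, `seen_check` and `SEEN_SLOTS` are illustrative names, not existing kernel symbols.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: only the proposed id field. */
struct pkt {
	uint32_t id;
};

/* Global counter: starts at 0 and wraps naturally at 2^32. */
static uint32_t next_pkt_id;

/* Stamp a packet with the next id, as would happen at allocation time. */
static void pkt_init(struct pkt *p)
{
	assert(p != NULL);
	p->id = next_pkt_id++;
}

#define SEEN_SLOTS 256	/* assumed cache size, must be a power of two */

/* Direct-mapped cache of ids already examined in the input chain. */
struct seen_cache {
	uint32_t id[SEEN_SLOTS];
	uint8_t valid[SEEN_SLOTS];
};

/* Input chain: remember that this packet has been examined. */
static void seen_mark(struct seen_cache *c, const struct pkt *p)
{
	uint32_t slot = p->id & (SEEN_SLOTS - 1);

	c->id[slot] = p->id;
	c->valid[slot] = 1;
}

/* Forward/output chain: 1 if this id was marked and not yet evicted. */
static int seen_check(const struct seen_cache *c, const struct pkt *p)
{
	uint32_t slot = p->id & (SEEN_SLOTS - 1);

	return c->valid[slot] && c->id[slot] == p->id;
}
```

A miss only means the expensive computation is run again, so eviction is harmless. Two limits of the sketch: once the counter wraps, a stale entry could match a new packet, and a packet that is fragmented or copied would get fresh ids, which is the fragmentation problem raised earlier in the thread.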
thanks, -bp -- # bryan at terran dot org # http://www.terran.org/~bryan From owner-netdev@oss.sgi.com Mon Dec 20 08:13:09 1999 Received: by oss.sgi.com id ; Mon, 20 Dec 1999 08:12:49 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:41478 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Mon, 20 Dec 1999 08:12:27 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id TAA05288; Mon, 20 Dec 1999 19:10:37 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912201610.TAA05288@ms2.inr.ac.ru> Subject: Re: IPv6+Linux status page useful? To: cacophonix@yahoo.com (Cacophonix Gaul) Date: Mon, 20 Dec 1999 19:10:37 +0300 (MSK) Cc: pb@bieringer.DE, netdev@oss.sgi.com In-Reply-To: <19991219220045.23997.qmail@web117.yahoomail.com> from "Cacophonix Gaul" at Dec 19, 99 02:00:45 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 866 Lines: 27 Hello! > unicast mechanism is different from anycast. Also, there is no > need for the receiver to be aware of the fact that it has an > anycast address - it could be part of an anycast group without > it's knowledge. Nope. It's knowledge is the _only_ thing differing anycast of unicasts. They are indistingushable outside of receivers. I suspect we talk about different things. 8) > I'd written up something a while ago describing one such trick, which > I'll be glad to dig up & send to you if you are interested - let me > know. I am. Please, send if you find it. It is interesting at least as information. > anything formal yet, there was no intent to prohibit it's use within > individual administrative domains either. 
It would be pretty funny, if some IETF meeting prohibited to do something within an individual administrative domain 8)8)8) Alexey From owner-netdev@oss.sgi.com Mon Dec 20 13:30:29 1999 Received: by oss.sgi.com id ; Mon, 20 Dec 1999 13:30:19 -0800 Received: from sabre-wulf.nvg.ntnu.no ([129.241.210.67]:2059 "EHLO sabre-wulf.nvg.ntnu.no") by oss.sgi.com with ESMTP id ; Mon, 20 Dec 1999 13:29:52 -0800 Received: from tyrell.nvg.ntnu.no ([IPv6:::ffff:129.241.210.70]:21771 "EHLO tyrell.nvg.ntnu.no" ident: "TIMEDOUT2") by sabre-wulf.nvg.ntnu.no with ESMTP id ; Mon, 20 Dec 1999 22:15:41 +0100 Received: (from venaas@localhost) by tyrell.nvg.ntnu.no (8.9.3/8.8.4) id WAA10057; Mon, 20 Dec 1999 22:15:10 +0100 From: Date: Mon, 20 Dec 1999 22:15:09 +0100 To: kuznet@ms2.inr.ac.ru Cc: Cacophonix Gaul , pb@bieringer.DE, netdev@oss.sgi.com Subject: Re: IPv6+Linux status page useful? Message-ID: <19991220221509.E9546@nvg.ntnu.no> References: <19991219220045.23997.qmail@web117.yahoomail.com> <199912201610.TAA05288@ms2.inr.ac.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.95.6i In-Reply-To: <199912201610.TAA05288@ms2.inr.ac.ru>; from kuznet@ms2.inr.ac.ru on Mon, Dec 20, 1999 at 07:10:37PM +0300 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1495 Lines: 36 On Mon, Dec 20, 1999 at 07:10:37PM +0300, kuznet@ms2.inr.ac.ru wrote: > Hello! > > > unicast mechanism is different from anycast. Also, there is no > > need for the receiver to be aware of the fact that it has an > > anycast address - it could be part of an anycast group without > > it's knowledge. > > Nope. It's knowledge is the _only_ thing differing anycast of unicasts. > They are indistingushable outside of receivers. > I suspect we talk about different things. 8) My understanding is: "Classical" anycast addresses are indistuingishable from normal addresses for both senders and receivers, the trick is the routing. 
Different senders may reach different receivers using the same address, but a single packet should not go to more that one receiver. RFC 2373 states though that receivers must be configured to know it's an anycast address. There are AFAIK a lot of open issues here. If for instance there are two routers on a link and a host on the link sends a packet to the subnet-router anycast address, how should one decide which one to respond, of course this is a problem in some other scenarios as well. > > anything formal yet, there was no intent to prohibit it's use within > > individual administrative domains either. > > It would be pretty funny, if some IETF meeting prohibited to do something > within an individual administrative domain 8)8)8) Yes. What should be limited though is global usage, since it increases the size of the global routing tables. Stig From owner-netdev@oss.sgi.com Mon Dec 20 14:52:59 1999 Received: by oss.sgi.com id ; Mon, 20 Dec 1999 14:52:49 -0800 Received: from web121.yahoomail.com ([205.180.60.129]:28689 "HELO web121.yahoomail.com") by oss.sgi.com with SMTP id ; Mon, 20 Dec 1999 14:52:30 -0800 Received: (qmail 4838 invoked by uid 60001); 20 Dec 1999 22:51:21 -0000 Message-ID: <19991220225121.4837.qmail@web121.yahoomail.com> Received: from [156.153.255.250] by web121.yahoomail.com; Mon, 20 Dec 1999 14:51:21 PST Date: Mon, 20 Dec 1999 14:51:21 -0800 (PST) From: Cacophonix Gaul Subject: Re: IPv6+Linux status page useful? To: venaas@nvg.ntnu.no, kuznet@ms2.inr.ac.ru Cc: Cacophonix Gaul , pb@bieringer.DE, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1363 Lines: 39 --- venaas@nvg.ntnu.no wrote: > > but a single > packet should not go to more that one receiver. This is the optimal case - however, I don't recollect reading any rfc that says that this is guaranteed in a ip network. 
In practice, this should not happen unless something's broken. > If for > instance there are two routers on a link and a host on the link sends a > packet to the subnet-router anycast address, how should one decide which > one to respond I would expect the packet to get delivered to the MAC address of one of the routers - which of course is unique to each router interface. Which one depends on the outcome of race conditions. > > > Yes. What should be limited though is global usage, since it increases the > size of the global routing tables. yes, there was quite a bit of discussion at the BoF about this. If anycast uses unicast space, then aggregation could follow normal unicast aggregation policies, but perhaps at the expense of less optimal routing of anycast. There was also a draft from someone at MIT suggesting seperate allocation of "anycast address space" - you may find the draft if you search internet drafts for "anycast". cheers, karthik __________________________________________________ Do You Yahoo!? Thousands of Stores. Millions of Products. All in one place. Yahoo! 
Shopping: http://shopping.yahoo.com From owner-netdev@oss.sgi.com Mon Dec 20 18:32:11 1999 Received: by oss.sgi.com id ; Mon, 20 Dec 1999 18:31:51 -0800 Received: from ha2.rdc1.bc.wave.home.com ([24.2.10.67]:37100 "EHLO lh2.rdc1.bc.home.com") by oss.sgi.com with ESMTP id ; Mon, 20 Dec 1999 18:31:36 -0800 Received: from pinky.deadbeef.ca ([24.113.49.197]) by lh2.rdc1.bc.home.com (InterMail v4.01.01.00 201-229-111) with SMTP id <19991221023102.WEFW19112.lh2.rdc1.bc.home.com@pinky.deadbeef.ca> for ; Mon, 20 Dec 1999 18:31:02 -0800 From: "Prairie Flower" To: "netdev@oss.sgi.com" Date: Mon, 20 Dec 1999 18:30:25 +0800 Reply-To: "Prairie Flower" X-Mailer: PMMail 1.96a For OS/2 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Subject: triple homed host Message-Id: <19991221023102.WEFW19112.lh2.rdc1.bc.home.com@pinky.deadbeef.ca> Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 858 Lines: 30 First I would like to thank everyone who offered help with my previous e-mail. I am still unable to get a third NIC working, though I can get any two NICs to work by changing the order of the ether= statements passed to the kernel by lilo. Here are my diagnostics so far: kernel=2.2.13 arch=i486 I have tried changing IRQs and ioports as well as different slots in the motherboard, and just recently I replaced the new ne2k with an SMC-ultra, with the same result. It seems that the kernel is ignoring the third ether= argument entirely. I checked MAX_LINKS, which is #define MAX_LINKS 32. Both the ne2k driver and the SMC driver are statically compiled in, and I have verified that all the cards are in fact working correctly and are on separate IRQs/ioports etc. I suspect I am missing a #define that needs to be tweaked...
wildrose@home.com From owner-netdev@oss.sgi.com Tue Dec 21 10:44:31 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 10:44:21 -0800 Received: from castillo.torrentnet.com ([4.18.161.34]:47634 "EHLO castillo.torrentnet.com") by oss.sgi.com with ESMTP id ; Tue, 21 Dec 1999 10:44:14 -0800 Received: from soco.torrentnet.com (soco.torrentnet.com [4.18.161.90]) by castillo.torrentnet.com (8.9.1/8.9.1) with ESMTP id NAA04060 for ; Tue, 21 Dec 1999 13:43:45 -0500 (EST) Received: (from jleu@localhost) by soco.torrentnet.com (8.8.5/8.8.5) id NAA27600 for netdev@oss.sgi.com; Tue, 21 Dec 1999 13:43:45 -0500 (EST) Message-ID: <19991221134344.I25425@torrentnet.com> Date: Tue, 21 Dec 1999 13:43:44 -0500 From: "James R. Leu" To: netdev@oss.sgi.com Subject: Finding an exact match in FIB Reply-To: jleu@torrentnet.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.93.2i Organization: Ericsson Datacom Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1525 Lines: 65 I want to find an exact match in the FIB. Is there already a function which does this? I came up with a function to get AN answer, I'm just wondering if there is a better way. 
I copied code from a similar function and came up with this:

--------------------------------------------------------------------------
struct fib_info *fn_hash_exact(u32 prefix,int z)
{
	struct fib_table *tb = fib_get_table(RT_TABLE_MAIN); /* JLEU: this doesn't support MULTIPLE_TABLES */
	struct fn_hash *table = (struct fn_hash*)tb->tb_data;
	struct fib_node **fp, **del_fp, *f;
	struct fn_zone *fz;
	fn_key_t key;
#ifdef CONFIG_IP_ROUTE_TOS
	u8 tos = r->rtm_tos;
#endif

	printk("tb(%d)_exact: %d %08x/%d %d\n", tb->tb_id, RTN_UNICAST, prefix, z, -1);

	if (z > 32)
		return NULL;
	if ((fz = table->fn_zones[z]) == NULL)
		return NULL;

	fz_key_0(key);
	{
		if (prefix & ~FZ_MASK(fz))
			return NULL;
		key = fz_key(prefix, fz);
	}

	fp = fz_chain_p(key, fz);

	FIB_SCAN(f, fp) {
		if (fn_key_eq(f->fn_key, key))
			break;
		if (fn_key_leq(key, f->fn_key))
			return NULL;
	}
#ifdef CONFIG_IP_ROUTE_TOS
	FIB_SCAN_KEY(f, fp, key) {
		if (f->fn_tos == tos)
			break;
	}
#endif

	/* JLEU last match wins ! */
	del_fp = NULL;
	FIB_SCAN_TOS(f, fp, key, tos) {
		if (f->fn_state&FN_S_ZOMBIE)
			return NULL;
		del_fp = fp;
	}

	if(del_fp && *del_fp)
		return FIB_INFO(*del_fp);

	return NULL;
}

--
James R. Leu        | Ericsson Datacom, Inc
Software Engineer   | IP Infrastructure
jleu@torrentnet.com | Morrisville, NC USA

From owner-netdev@oss.sgi.com Tue Dec 21 11:23:21 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 11:23:11 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:42514 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Tue, 21 Dec 1999 11:22:51 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id WAA22950; Tue, 21 Dec 1999 22:21:54 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912211921.WAA22950@ms2.inr.ac.ru> Subject: Re: Finding an exact match in FIB To: jleu@torrentnet.COM Date: Tue, 21 Dec 1999 22:21:54 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: <19991221134344.I25425@torrentnet.com> from "James R.
Leu" at Dec 21, 99 10:13:26 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 245 Lines: 12 Hello! > I want to find an exact match in the FIB. Is there already a function which > does this? No. > I copied code from a similar function and came up with this: Yes, function deleting an entry is the closest variant. Alexey Kuznetsov From owner-netdev@oss.sgi.com Wed Dec 22 00:32:10 1999 Received: by oss.sgi.com id ; Wed, 22 Dec 1999 00:31:59 -0800 Received: from linuxcare.canberra.net.au ([203.29.91.49]:20999 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Wed, 22 Dec 1999 00:31:46 -0800 Received: from gargle.linuxcare.com.au (penicillin.linuxcare.com.au [10.61.2.27]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id TAA28376; Wed, 22 Dec 1999 19:31:09 +1100 Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Wed, 22 Dec 1999 19:38:53 +1100 (EST) Message-Id: From: Rusty Russell To: davem@redhat.com cc: netdev@oss.sgi.com Subject: [PATCH] skb_copy_expand Date: Wed, 22 Dec 1999 19:38:53 +1100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 7068 Lines: 242 I just noticed that this never did go in. Patch against 2.3.34. 1) Adds skb_copy_expand, a generalization of skb_realloc_headroom. 2) Makes skb_realloc_headroom a (deprecated) macro. 3) Merges the common code of skb_copy/skb_realloc_headroom. Let's avoid having a `realloc_tailroom' implementation outside skbuff.c, like the ip_masq code in 2.0 and 2.2 does... Rusty. 
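A minimal user-space sketch of the copy-with-new-headroom-and-tailroom idea behind skb_copy_expand, under stated assumptions: `struct buf` and the `buf_*` helpers are hypothetical stand-ins for sk_buff's head/data/tail/end pointers, and only the payload between data and tail is copied here, which is a simplification and not what the patch in this message does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the four sk_buff data pointers. */
struct buf {
	unsigned char *head;	/* start of allocation */
	unsigned char *data;	/* start of payload */
	unsigned char *tail;	/* end of payload */
	unsigned char *end;	/* end of allocation */
};

static size_t buf_headroom(const struct buf *b) { return (size_t)(b->data - b->head); }
static size_t buf_tailroom(const struct buf *b) { return (size_t)(b->end - b->tail); }
static size_t buf_len(const struct buf *b) { return (size_t)(b->tail - b->data); }

/* Allocate headroom + len + tailroom bytes and position the pointers. */
static int buf_alloc(struct buf *b, size_t headroom, size_t len, size_t tailroom)
{
	size_t total = headroom + len + tailroom;

	b->head = malloc(total ? total : 1);
	if (b->head == NULL)
		return -1;
	b->data = b->head + headroom;
	b->tail = b->data + len;
	b->end = b->head + total;
	return 0;
}

/* Copy old's payload into a new buffer with the requested free space. */
static int buf_copy_expand(struct buf *n, const struct buf *old,
			   size_t newheadroom, size_t newtailroom)
{
	assert(n != NULL && old != NULL);
	if (buf_alloc(n, newheadroom, buf_len(old), newtailroom) != 0)
		return -1;
	memcpy(n->data, old->data, buf_len(old));
	return 0;
}

/* Release the allocation and clear the pointers. */
static void buf_free(struct buf *b)
{
	free(b->head);
	b->head = b->data = b->tail = b->end = NULL;
}
```

In these terms, the backwards-compatibility macro in the patch corresponds to calling buf_copy_expand(&n, &old, nhr, buf_tailroom(&old)): new headroom, unchanged tailroom.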
diff -ur linux-2.3-official/include/linux/skbuff.h linux-tmp/include/linux/skbuff.h
--- linux-2.3-official/include/linux/skbuff.h Wed Dec 22 18:34:12 1999
+++ linux-tmp/include/linux/skbuff.h Wed Dec 22 19:31:36 1999
@@ -168,19 +168,25 @@
 extern struct sk_buff * dev_alloc_skb(unsigned int size);
 extern void kfree_skbmem(struct sk_buff *skb);
 extern struct sk_buff * skb_clone(struct sk_buff *skb, int priority);
-extern struct sk_buff * skb_copy(struct sk_buff *skb, int priority);
-extern struct sk_buff * skb_realloc_headroom(struct sk_buff *skb, int newheadroom);
+extern struct sk_buff * skb_copy(const struct sk_buff *skb, int priority);
+extern struct sk_buff * skb_copy_expand(const struct sk_buff *skb,
+					int newheadroom,
+					int newtailroom,
+					int priority);
 #define dev_kfree_skb(a) kfree_skb(a)
 extern unsigned char * skb_put(struct sk_buff *skb, unsigned int len);
 extern unsigned char * skb_push(struct sk_buff *skb, unsigned int len);
 extern unsigned char * skb_pull(struct sk_buff *skb, unsigned int len);
-extern int skb_headroom(struct sk_buff *skb);
-extern int skb_tailroom(struct sk_buff *skb);
+extern int skb_headroom(const struct sk_buff *skb);
+extern int skb_tailroom(const struct sk_buff *skb);
 extern void skb_reserve(struct sk_buff *skb, unsigned int len);
 extern void skb_trim(struct sk_buff *skb, unsigned int len);
 extern void skb_over_panic(struct sk_buff *skb, int len, void *here);
 extern void skb_under_panic(struct sk_buff *skb, int len, void *here);
 
+/* Backwards compatibility */
+#define skb_realloc_headroom(skb, nhr) skb_copy_expand(skb, nhr, skb_tailroom(skb), GFP_ATOMIC)
+
 /* Internal */
 extern __inline__ atomic_t *skb_datarefp(struct sk_buff *skb)
 {
@@ -534,12 +540,12 @@
 	return __skb_pull(skb,len);
 }
 
-extern __inline__ int skb_headroom(struct sk_buff *skb)
+extern __inline__ int skb_headroom(const struct sk_buff *skb)
 {
 	return skb->data-skb->head;
 }
 
-extern __inline__ int skb_tailroom(struct sk_buff *skb)
+extern __inline__ int skb_tailroom(const struct sk_buff *skb)
 {
 	return skb->end-skb->tail;
 }
diff -ur linux-2.3-official/net/core/skbuff.c linux-tmp/net/core/skbuff.c
--- linux-2.3-official/net/core/skbuff.c Tue Nov 30 17:58:36 1999
+++ linux-tmp/net/core/skbuff.c Wed Dec 22 19:31:36 1999
@@ -275,14 +275,48 @@
 	return n;
 }
 
+static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
+{
+	/*
+	 * Shift between the two data areas in bytes
+	 */
+	unsigned long offset = new->data - old->data;
+
+	new->list=NULL;
+	new->sk=NULL;
+	new->dev=old->dev;
+	new->rx_dev=NULL;
+	new->priority=old->priority;
+	new->protocol=old->protocol;
+	new->dst=dst_clone(old->dst);
+	new->h.raw=old->h.raw+offset;
+	new->nh.raw=old->nh.raw+offset;
+	new->mac.raw=old->mac.raw+offset;
+	memcpy(new->cb, old->cb, sizeof(old->cb));
+	new->used=old->used;
+	new->is_clone=0;
+	atomic_set(&new->users, 1);
+	new->pkt_type=old->pkt_type;
+	new->stamp=old->stamp;
+	new->destructor = NULL;
+	new->security=old->security;
+#ifdef CONFIG_NETFILTER
+	new->nfmark=old->nfmark;
+	new->nfreason=old->nfreason;
+	new->nfcache=old->nfcache;
+#ifdef CONFIG_NETFILTER_DEBUG
+	new->nf_debug=old->nf_debug;
+#endif
+#endif
+}
+
 /*
  * This is slower, and copies the whole data area
  */
 
-struct sk_buff *skb_copy(struct sk_buff *skb, int gfp_mask)
+struct sk_buff *skb_copy(const struct sk_buff *skb, int gfp_mask)
 {
 	struct sk_buff *n;
-	unsigned long offset;
 
 	/*
 	 * Allocate the copy buffer
@@ -292,12 +326,6 @@
 	if(n==NULL)
 		return NULL;
 
-	/*
-	 * Shift between the two data areas in bytes
-	 */
-
-	offset=n->head-skb->head;
-
 	/* Set the data pointer */
 	skb_reserve(n,skb->data-skb->head);
 	/* Set the tail pointer and length */
@@ -305,86 +333,35 @@
 	/* Copy the bytes */
 	memcpy(n->head,skb->head,skb->end-skb->head);
 	n->csum = skb->csum;
-	n->list=NULL;
-	n->sk=NULL;
-	n->dev=skb->dev;
-	n->rx_dev=NULL;
-	n->priority=skb->priority;
-	n->protocol=skb->protocol;
-	n->dst=dst_clone(skb->dst);
-	n->h.raw=skb->h.raw+offset;
-	n->nh.raw=skb->nh.raw+offset;
-	n->mac.raw=skb->mac.raw+offset;
-	memcpy(n->cb, skb->cb, sizeof(skb->cb));
-	n->used=skb->used;
-	n->is_clone=0;
-	atomic_set(&n->users, 1);
-	n->pkt_type=skb->pkt_type;
-	n->stamp=skb->stamp;
-	n->destructor = NULL;
-	n->security=skb->security;
-#ifdef CONFIG_NETFILTER
-	n->nfmark=skb->nfmark;
-	n->nfreason=skb->nfreason;
-	n->nfcache=skb->nfcache;
-#ifdef CONFIG_NETFILTER_DEBUG
-	n->nf_debug=skb->nf_debug;
-#endif
-#endif
+	copy_skb_header(n, skb);
+
 	return n;
 }
 
-struct sk_buff *skb_realloc_headroom(struct sk_buff *skb, int newheadroom)
+struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
+				int newheadroom,
+				int newtailroom,
+				int gfp_mask)
 {
 	struct sk_buff *n;
-	unsigned long offset;
 
 	/*
 	 * Allocate the copy buffer
 	 */
 
-	n=alloc_skb((skb->end-skb->data)+newheadroom, GFP_ATOMIC);
+	n=alloc_skb(newheadroom + (skb->tail - skb->data) + newtailroom,
+		    gfp_mask);
 	if(n==NULL)
 		return NULL;
 
 	skb_reserve(n,newheadroom);
 
-	/*
-	 * Shift between the two data areas in bytes
-	 */
-
-	offset=n->data-skb->data;
-
 	/* Set the tail pointer and length */
 	skb_put(n,skb->len);
-	/* Copy the bytes */
-	memcpy(n->data,skb->data,skb->len);
-	n->list=NULL;
-	n->sk=NULL;
-	n->priority=skb->priority;
-	n->protocol=skb->protocol;
-	n->dev=skb->dev;
-	n->rx_dev=NULL;
-	n->dst=dst_clone(skb->dst);
-	n->h.raw=skb->h.raw+offset;
-	n->nh.raw=skb->nh.raw+offset;
-	n->mac.raw=skb->mac.raw+offset;
-	memcpy(n->cb, skb->cb, sizeof(skb->cb));
-	n->used=skb->used;
-	n->is_clone=0;
-	atomic_set(&n->users, 1);
-	n->pkt_type=skb->pkt_type;
-	n->stamp=skb->stamp;
-	n->destructor = NULL;
-	n->security=skb->security;
-#ifdef CONFIG_NETFILTER
-	n->nfmark=skb->nfmark;
-	n->nfreason=skb->nfreason;
-	n->nfcache=skb->nfcache;
-#ifdef CONFIG_NETFILTER_DEBUG
-	n->nf_debug=skb->nf_debug;
-#endif
-#endif
+	/* Copy the bytes: data pointers must point to same data. */
+	memcpy(n->data - skb_headroom(skb), skb->head, skb->end-skb->head);
+
+	copy_skb_header(n, skb);
 
 	return n;
 }
Only in linux-tmp/net/core: skbuff.c.orig
diff -ur linux-2.3-official/net/netsyms.c linux-tmp/net/netsyms.c
--- linux-2.3-official/net/netsyms.c Wed Dec 22 10:06:46 1999
+++ linux-tmp/net/netsyms.c Wed Dec 22 19:31:36 1999
@@ -150,7 +150,7 @@
 EXPORT_SYMBOL(skb_free_datagram);
 EXPORT_SYMBOL(skb_copy_datagram);
 EXPORT_SYMBOL(skb_copy_datagram_iovec);
-EXPORT_SYMBOL(skb_realloc_headroom);
+EXPORT_SYMBOL(skb_copy_expand);
 EXPORT_SYMBOL(datagram_poll);
 EXPORT_SYMBOL(put_cmsg);
 EXPORT_SYMBOL(sock_kmalloc);
Only in linux-tmp/net: netsyms.c.orig
--
Hacking time.

From owner-netdev@oss.sgi.com Wed Dec 22 12:26:57 1999 Received: by oss.sgi.com id ; Wed, 22 Dec 1999 12:26:47 -0800 Received: from ha2.rdc1.bc.wave.home.com ([24.2.10.67]:40358 "EHLO lh2.rdc1.bc.home.com") by oss.sgi.com with ESMTP id ; Wed, 22 Dec 1999 12:26:28 -0800 Received: from pinky.deadbeef.ca ([24.113.49.197]) by lh2.rdc1.bc.home.com (InterMail v4.01.01.00 201-229-111) with SMTP id <19991222202604.ZJFS267.lh2.rdc1.bc.home.com@pinky.deadbeef.ca> for ; Wed, 22 Dec 1999 12:26:04 -0800 From: "Prairie Flower" To: "netdev@oss.sgi.com" Date: Wed, 22 Dec 1999 11:48:27 +0800 Reply-To: "Prairie Flower" X-Mailer: PMMail 1.96a For OS/2 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Subject: /proc/cmdline Message-Id: <19991222202604.ZJFS267.lh2.rdc1.bc.home.com@pinky.deadbeef.ca> Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 148 Lines: 9 I think I have found the source of the disappearing ethernet device. /proc/cmdline has been truncated after 80 characters.
wildrose@home.com From owner-netdev@oss.sgi.com Wed Dec 22 18:14:39 1999 Received: by oss.sgi.com id ; Wed, 22 Dec 1999 18:14:28 -0800 Received: from pizda.ninka.net ([216.101.162.242]:45185 "EHLO pizda.ninka.net") by oss.sgi.com with ESMTP id ; Wed, 22 Dec 1999 18:14:22 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id SAA02281; Wed, 22 Dec 1999 18:12:19 -0800 Date: Wed, 22 Dec 1999 18:12:19 -0800 Message-Id: <199912230212.SAA02281@pizda.ninka.net> X-Authentication-Warning: pizda.ninka.net: davem set sender to davem@redhat.com using -f From: "David S. Miller" To: rusty@linuxcare.com.au CC: netdev@oss.sgi.com In-reply-to: (message from Rusty Russell on Wed, 22 Dec 1999 19:38:53 +1100) Subject: Re: [PATCH] skb_copy_expand References: Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 223 Lines: 10 From: Rusty Russell Date: Wed, 22 Dec 1999 19:38:53 +1100 I just noticed that this never did go in. Patch against 2.3.34. Patch applied, thanks. Later, David S. Miller davem@redhat.com From owner-netdev@oss.sgi.com Thu Dec 23 02:22:21 1999 Received: by oss.sgi.com id ; Thu, 23 Dec 1999 02:22:11 -0800 Received: from linuxcare.canberra.net.au ([203.29.91.49]:65036 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Thu, 23 Dec 1999 02:21:55 -0800 Received: from gargle.linuxcare.com.au (penicillin.linuxcare.com.au [10.61.2.27]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id VAA08078; Thu, 23 Dec 1999 21:21:28 +1100 Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Thu, 23 Dec 1999 21:21:27 +1100 (EST) Message-Id: From: Rusty Russell To: Multiple recipients of list NETFILTER Cc: netdev@oss.sgi.com Subject: Re: More on user-space filtering In-reply-to: Your message of "Wed, 22 Dec 1999 10:57:35 +1100." 
<14432.2404.915004.861739@pelerin.serpentine.com> Date: Thu, 23 Dec 1999 21:21:27 +1100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Content-Length: 1092 Lines: 30 In message <14432.2404.915004.861739@pelerin.serpentine.com> you write: > However, I would like to be able to read a series of packets from > userspace without having to make a decision about each one before I > can see the next. Trying the poll route was the easiest. There seem > to be two other obvious options in front of me: Hi Bryan, Please come up with a better netfilter dev; the current one is simple as all hell. BTW, num packets queued is currently hard limited in netfilter.c. A perfect netfilter dev would have the following properties: 1) Minimum number of system calls: averaging << 1 syscall per packet would rock. 2) Handle out-of-order stuff. Please don't hand out pointers to userspace as cookies unless you have to, unless you verify them somehow when they get back. Even though only root can use netfilter_dev right now, I don't want a coding bug to crash my kernel please! Look at Alexey's memmapped sockpacket code for inspiration, although note that we have the skb itself, not a copy, and must handle modifications. Rusty. -- Hacking time. 
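One way to read the "don't hand out pointers as cookies" advice above, as a minimal user-space sketch rather than anything from netfilter itself: give userspace an opaque integer made of a slot index and a generation count, and check both when the verdict comes back. `struct pkt_queue`, `queue_add`, `queue_take` and `QUEUE_SLOTS` are hypothetical names, and the 8-bit index / 24-bit generation split is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_SLOTS 64		/* assumed queue limit, must be <= 256 */

struct queued {
	void *pkt;		/* stands in for the queued skb */
	uint32_t gen;		/* bumped each time the slot is reused */
	int in_use;
};

struct pkt_queue {
	struct queued slot[QUEUE_SLOTS];
};

/* Queue a packet; returns an opaque cookie, or 0 if the queue is full. */
static uint32_t queue_add(struct pkt_queue *q, void *pkt)
{
	uint32_t i;

	assert(q != NULL);
	for (i = 0; i < QUEUE_SLOTS; i++) {
		struct queued *s = &q->slot[i];

		if (s->in_use)
			continue;
		s->pkt = pkt;
		s->in_use = 1;
		s->gen = (s->gen + 1) & 0xffffff;	/* 24-bit generation */
		if (s->gen == 0)
			s->gen = 1;			/* keep cookie 0 invalid */
		return (s->gen << 8) | i;
	}
	return 0;
}

/* Validate a cookie from userspace; returns the packet or NULL. */
static void *queue_take(struct pkt_queue *q, uint32_t cookie)
{
	uint32_t i = cookie & 0xff;
	uint32_t gen = cookie >> 8;
	struct queued *s;

	assert(q != NULL);
	if (i >= QUEUE_SLOTS)
		return NULL;		/* index out of range */
	s = &q->slot[i];
	if (!s->in_use || s->gen != gen)
		return NULL;		/* stale, replayed, or forged cookie */
	s->in_use = 0;
	return s->pkt;
}
```

Because a cookie is only ever used as a bounds-checked index, a buggy or replayed value is rejected instead of being dereferenced. Since each slot is looked up directly, verdicts can also come back in any order, which fits the out-of-order requirement listed above.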
From owner-netdev@oss.sgi.com Sat Dec 25 18:54:28 1999 Received: by oss.sgi.com id ; Sat, 25 Dec 1999 18:51:38 -0800 Received: from mailhost.uni-koblenz.de ([141.26.64.1]:7611 "EHLO mailhost.uni-koblenz.de") by oss.sgi.com with ESMTP id ; Sat, 25 Dec 1999 18:51:15 -0800 Received: from cacc-11.uni-koblenz.de (cacc-11.uni-koblenz.de [141.26.131.11]) by mailhost.uni-koblenz.de (8.9.1/8.9.1) with ESMTP id DAA13677 for ; Sun, 26 Dec 1999 03:51:05 +0100 (MET) Received: by lappi.waldorf-gmbh.de id ; Sun, 26 Dec 1999 03:48:08 +0100 Date: Sun, 26 Dec 1999 03:48:07 +0100 From: Ralf Baechle To: netdev@oss.sgi.com Subject: ADMIN: netdev@nuclecu.unam.mx disabled Message-ID: <19991226034807.B32381@uni-koblenz.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0pre3us X-Accept-Language: de,en,fr Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Now about 6 weeks after netdev has moved from it's old address netdev@nuclecu.unam.mx to it's current site I've configured the mailer on oss.sgi.com to reject email from athena.nuclecu.unam.mx since the forward from the old address did account for 100% of the spam on the list and the admins didn't remove it after me requesting this several times. This also means that legitimate posters from athena will have to post from another machine. 
Ralf From owner-netdev@oss.sgi.com Sun Dec 26 14:43:59 1999 Received: by oss.sgi.com id ; Sun, 26 Dec 1999 14:41:09 -0800 Received: from cx97923-a.phnx3.az.home.com ([24.9.112.194]:22789 "EHLO grok.myip.org") by oss.sgi.com with ESMTP id ; Sun, 26 Dec 1999 14:40:50 -0800 Received: from candelatech.com (IDENT:greear@localhost [127.0.0.1]) by grok.myip.org (8.9.3/8.9.3) with ESMTP id PAA25640 for ; Sun, 26 Dec 1999 15:57:30 -0700 Message-ID: <38669D5A.338A86D9@candelatech.com> Date: Sun, 26 Dec 1999 15:57:30 -0700 From: Ben Greear Organization: Candela Technologies X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.12-20 i586) X-Accept-Language: en MIME-Version: 1.0 To: "netdev@oss.sgi.com" Subject: Test Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Plz excuse. -- Ben Greear (greearb@candelatech.com) http://scry.wanfear.com/~greear Author of ScryMUD: scry.wanfear.com 4444 (Released under GPL) http://scry.wanfear.com From owner-netdev@oss.sgi.com Mon Dec 27 23:34:07 1999 Received: by oss.sgi.com id ; Mon, 27 Dec 1999 23:33:47 -0800 Received: from linuxcare.canberra.net.au ([203.29.91.49]:2578 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Mon, 27 Dec 1999 23:33:15 -0800 Received: from gargle.linuxcare.com.au (penicillin.linuxcare.com.au [10.61.2.27]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id SAA07554 for ; Tue, 28 Dec 1999 18:33:05 +1100 Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Tue, 28 Dec 1999 18:33:05 +1100 (EST) Message-Id: From: Rusty Russell To: netdev@oss.sgi.com Subject: [PATCH] skb field reservation v2.3.34 Date: Tue, 28 Dec 1999 18:33:05 +1100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Hi all, This allows random parties to grab a field in the skbuff header, and attach 
a destructor. This makes the connection tracking code (a netfilter module) much nicer (ie. real reference counts), and can also be used for other things where you need to play with skbs. It is the scalable replacement for the icky `mark' field which everyone hated (well, I hated, anyway). What do the networking triumvirate think? (Or should that be `networking troika'?) Rusty. diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/include/linux/skbuff.h linux-2.3/include/linux/skbuff.h --- linux-2.3-official/include/linux/skbuff.h Tue Dec 28 18:25:15 1999 +++ linux-2.3/include/linux/skbuff.h Tue Dec 28 15:31:52 1999 @@ -109,11 +109,12 @@ unsigned char *tail; /* Tail pointer */ unsigned char *end; /* End pointer */ void (*destructor)(struct sk_buff *); /* Destruct function */ + + /* See skb_field_reserve()/skb_field_unreserve() */ + unsigned int reserve_gen; + char reserve[32]; + #ifdef CONFIG_NETFILTER - /* Can be used for communication between hooks. */ - unsigned long nfmark; - /* Reason for doing this to the packet (see netfilter.h) */ - __u32 nfreason; /* Cache info */ __u32 nfcache; #ifdef CONFIG_NETFILTER_DEBUG @@ -183,6 +184,38 @@ extern void skb_trim(struct sk_buff *skb, unsigned int len); extern void skb_over_panic(struct sk_buff *skb, int len, void *here); extern void skb_under_panic(struct sk_buff *skb, int len, void *here); + +/* Selling space in skb's: the VCs will love it... */ +struct skb_field +{ + /* Filled in by skb_field_reserve() */ + struct list_head list; + size_t offset; + unsigned int gen; + + /* Filled in by caller. 
*/ + size_t size; + size_t alignment; /* Use __alignof__ */ + /* Don't call skb_field_unreserve from this: deadlock. */ + void (*destructor)(struct sk_buff *); +}; + +extern int skb_field_reserve(struct skb_field *reg); +extern void skb_field_unreserve(struct skb_field *unreg); + +/* Private */ +extern void skb_field_update(struct sk_buff *skb); + +/* Access a field of an skb */ +#define skb_field(skb, reg, type) \ +({ \ + if (((int)(skb)->reserve_gen - (int)(reg)->gen) < 0) \ + skb_field_update(skb); \ + &__skb_field(skb, reg, type); \ +}) + +/* If you reserve at boot, no skb can predate you, so use this. */ +#define __skb_field(skb, reg, type) (*((type *)((skb)->reserve+(reg)->offset))) /* Backwards compatibility */ #define skb_realloc_headroom(skb, nhr) skb_copy_expand(skb, nhr, skb_tailroom(skb), GFP_ATOMIC) diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/core/skbuff.c linux-2.3/net/core/skbuff.c --- linux-2.3-official/net/core/skbuff.c Tue Dec 28 18:25:15 1999 +++ linux-2.3/net/core/skbuff.c Tue Dec 28 15:36:11 1999 @@ -19,6 +19,7 @@ * Ray VanTassle : Fixed --skb->lock in free * Alan Cox : skb_copy copy arp field * Andi Kleen : slabified it. + * Rusty Russell : field reservation * * NOTE: * The __skb_ routines should be called with interrupts @@ -77,6 +78,12 @@ static kmem_cache_t *skbuff_head_cache; +static LIST_HEAD(field_allocs); +/* Doesn't need to be atomic_t, if writes to longs are atomic. + Play safe --RR */ +static atomic_t field_generation = ATOMIC_INIT(0); +static rwlock_t field_lock = RW_LOCK_UNLOCKED; + /* * Keep out-of-line to prevent kernel bloat. 
* __builtin_return_address is not used because it is not always @@ -97,6 +104,65 @@ *(int*)0 = 0; } +int skb_field_reserve(struct skb_field *reg) +{ + struct list_head *i; + size_t align_mask = (reg->alignment - 1); + int ret = 0; + + reg->offset = 0; + write_lock_bh(&field_lock); + reg->gen = atomic_read(&field_generation); + + /* Yes it's an ordered list, no we don't do garbage collection */ + for (i = field_allocs.next; i != &field_allocs; i = i->next) { + struct skb_field *f = (struct skb_field *)i; + + if (reg->offset + reg->size <= f->offset) + break; + + /* offset = last aligned possibility */ + reg->offset = (f->offset + f->size + align_mask) & ~align_mask; + } + + if (reg->offset + reg->size < sizeof(((struct sk_buff *)0)->reserve)) + list_add_tail(&reg->list, i); + else + ret = -ENOMEM; /* -EEDITSKBUFF.H */ + + write_unlock_bh(&field_lock); + return ret; +} + +void skb_field_unreserve(struct skb_field *unreg) +{ + write_lock_bh(&field_lock); + atomic_inc(&field_generation); + list_del(&unreg->list); + write_unlock_bh(&field_lock); +} + +/* Called very rarely; skb alloc'ed before field registration.
*/ +void skb_field_update(struct sk_buff *skb) +{ + struct list_head *i; + unsigned int gen; + + write_lock_bh(&field_lock); + gen = atomic_read(&field_generation); + + /* Clear any registrations newer than this skb */ + for (i = field_allocs.next; i != &field_allocs; i = i->next) { + struct skb_field *f = (struct skb_field *)i; + + if ((int)f->gen - (int)skb->reserve_gen > 0) + memset(skb->reserve+f->offset, 0, f->size); + } + write_unlock_bh(&field_lock); + + skb->reserve_gen = gen; +} + void show_net_buffers(void) { printk("Networking buffers in use : %u\n", @@ -164,6 +230,7 @@ skb->len = 0; skb->is_clone = 0; skb->cloned = 0; + skb->reserve_gen = atomic_read(&field_generation); #ifdef CONFIG_ATM ATM_SKB(skb)->iovcnt = 0; @@ -200,8 +267,9 @@ skb->security = 0; /* By default packets are insecure */ skb->dst = NULL; skb->rx_dev = NULL; + memset(skb->reserve, 0, sizeof(skb->reserve)); #ifdef CONFIG_NETFILTER - skb->nfmark = skb->nfreason = skb->nfcache = 0; + skb->nfcache = 0; #ifdef CONFIG_NETFILTER_DEBUG skb->nf_debug = 0; #endif @@ -222,12 +290,42 @@ atomic_dec(&net_skbcount); } +static inline void reserve_destruct(struct sk_buff *skb) +{ + struct list_head *i; + + read_lock_bh(&field_lock); + + for (i = field_allocs.next; i != &field_allocs; i = i->next) { + struct skb_field *f = (struct skb_field *)i; + size_t j; + + /* skb predates reservation? */ + if ((long)skb->reserve_gen - (long)f->gen < 0) + continue; + + if (!f->destructor) + continue; + + for (j = 0; j < f->size; j++) { + if (skb->reserve[j + f->offset]) { + f->destructor(skb); + break; + } + } + } + + read_unlock_bh(&field_lock); +} + /* * Free an sk_buff. Release anything attached to the buffer. Clean the state.
*/ void __kfree_skb(struct sk_buff *skb) { + size_t i; + if (skb->list) { printk(KERN_WARNING "Warning: kfree_skb passed an skb still " "on a list (from %p).\n", NET_CALLER(skb)); @@ -237,6 +335,16 @@ dst_release(skb->dst); if(skb->destructor) skb->destructor(skb); + + /* Lock and call destructors iff neccessary. */ + /* Could optimize: only iterate over part actually allocated --RR */ + for (i = 0; i < sizeof(skb->reserve); i += sizeof(long)) { + if (*((long *)(skb->reserve + i))) { + reserve_destruct(skb); + break; + } + } + #ifdef CONFIG_NET if(skb->rx_dev) dev_put(skb->rx_dev); @@ -272,6 +380,8 @@ n->is_clone = 1; atomic_set(&n->users, 1); n->destructor = NULL; + memset(n->reserve, 0, sizeof(skb->reserve)); + skb->reserve_gen = atomic_read(&field_generation); return n; } @@ -301,8 +411,6 @@ new->destructor = NULL; new->security=old->security; #ifdef CONFIG_NETFILTER - new->nfmark=old->nfmark; - new->nfreason=old->nfreason; new->nfcache=old->nfcache; #ifdef CONFIG_NETFILTER_DEBUG new->nf_debug=old->nf_debug; diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/include/linux/netfilter.h linux-2.3/include/linux/netfilter.h --- linux-2.3-official/include/linux/netfilter.h Tue Dec 21 20:32:52 1999 +++ linux-2.3/include/linux/netfilter.h Tue Dec 28 15:35:36 1999 @@ -15,7 +15,7 @@ #define NF_ACCEPT 1 #define NF_STOLEN 2 #define NF_QUEUE 3 -#define NF_MAX_VERDICT NF_QUEUE +/* >= NF_QUEUE treated the same. */ /* Generic cache responses from hook functions. */ #define NFC_ALTERED 0x8000 @@ -141,10 +141,8 @@ int pf; /* Bitmask of hook numbers to match (1 << hooknum). */ unsigned int hookmask; - /* If non-zero, only catch packets with this mark. 
*/ - unsigned int mark; - /* If non-zero, only catch packets of this reason. */ - unsigned int reason; + /* If not 0xFFFFFFFF, only catch packets with this queue. */ + int queuenum; struct nf_wakeme *wake; }; @@ -154,11 +152,8 @@ extern void nf_unregister_interest(struct nf_interest *interest); extern void nf_getinfo(const struct sk_buff *skb, struct net_device **indev, - struct net_device **outdev, - unsigned long *mark); -extern void nf_reinject(struct sk_buff *skb, - unsigned long mark, - unsigned int verdict); + struct net_device **outdev); +extern void nf_reinject(struct sk_buff *skb, unsigned int verdict); #ifdef CONFIG_NETFILTER_DEBUG extern void nf_dump_skb(int pf, struct sk_buff *skb); @@ -192,14 +187,4 @@ #define SUMAX(a,b) ((size_t)(a)>(size_t)(b) ? (ssize_t)(a) : (ssize_t)(b)) #define SUMIN(a,b) ((size_t)(a)<(size_t)(b) ? (ssize_t)(a) : (ssize_t)(b)) #endif /*__KERNEL__*/ - -enum nf_reason { - /* Do not, NOT, reorder these. Add at end. */ - NF_REASON_NONE, - NF_REASON_SET_BY_IPCHAINS, - NF_REASON_FOR_ROUTING, - NF_REASON_FOR_CLS_FW, - NF_REASON_MIN_RESERVED_FOR_CONNTRACK = 1024, -}; - #endif /*__LINUX_NETFILTER_H*/ diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/include/net/route.h linux-2.3/include/net/route.h --- linux-2.3-official/include/net/route.h Tue Dec 21 20:32:55 1999 +++ linux-2.3/include/net/route.h Tue Dec 28 15:35:36 1999 @@ -110,7 +110,9 @@ extern int ip_rt_ioctl(unsigned int cmd, void *arg); extern void ip_rt_get_source(u8 *src, struct rtable *rt); extern int ip_rt_dump(struct sk_buff *skb, struct netlink_callback *cb); - +#ifdef CONFIG_IP_ROUTE_FWMARK +extern struct skb_field ip_rt_mark_res; +#endif extern __inline__ 
void ip_rt_put(struct rtable * rt) { diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/core/netfilter.c linux-2.3/net/core/netfilter.c --- linux-2.3-official/net/core/netfilter.c Tue Dec 21 14:20:03 1999 +++ linux-2.3/net/core/netfilter.c Sun Dec 26 20:17:19 1999 @@ -22,7 +22,7 @@ #include /* In this code, we can be waiting indefinitely for userspace to - * service a packet if a hook returns NF_QUEUE. We could keep a count + * service a packet if a hook returns >= NF_QUEUE. We could keep a count * of skbuffs queued for userspace, and not deregister a hook unless * this is zero, but that sucks. Now, we simply check when the * packets come back: if the hook is gone, the packet is discarded. 
*/ @@ -40,8 +40,8 @@ /* If we're sent to userspace, this keeps housekeeping info */ int pf; - unsigned long mark; unsigned int hook; + unsigned int queuenum; struct net_device *indev, *outdev; int (*okfn)(struct sk_buff *); }; @@ -53,6 +53,10 @@ static LIST_HEAD(nf_sockopts); static LIST_HEAD(nf_interested); +static struct skb_field skb_res += { { NULL, NULL }, 0, 0, + sizeof(struct nf_info *), __alignof__(struct nf_info *), NULL }; + int nf_register_hook(struct nf_hook_ops *reg) { struct list_head *i; @@ -358,11 +362,10 @@ { for (*i = (*i)->next; *i != head; *i = (*i)->next) { struct nf_hook_ops *elem = (struct nf_hook_ops *)*i; - switch (elem->hook(hook, skb, indev, outdev, okfn)) { - case NF_QUEUE: - NFDEBUG("nf_iterate: NF_QUEUE for %p.\n", *skb); - return NF_QUEUE; + unsigned int verdict + = elem->hook(hook, skb, indev, outdev, okfn); + switch (verdict) { case NF_STOLEN: NFDEBUG("nf_iterate: NF_STOLEN for %p.\n", *skb); return NF_STOLEN; @@ -371,14 +374,12 @@ NFDEBUG("nf_iterate: NF_DROP for %p.\n", *skb); return NF_DROP; -#ifdef CONFIG_NETFILTER_DEBUG case NF_ACCEPT: break; default: - NFDEBUG("Evil return from %p(%u).\n", - elem->hook, hook); -#endif + NFDEBUG("nf_iterate: %u for %p.\n", verdict, *skb); + return verdict; } } return NF_ACCEPT; @@ -389,7 +390,8 @@ int pf, unsigned int hook, struct net_device *indev, struct net_device *outdev, - int (*okfn)(struct sk_buff *)) + int (*okfn)(struct sk_buff *), + unsigned int queuenum) { struct list_head *i; @@ -402,13 +404,14 @@ /* Can't do struct assignments with arrays in them. Damn. 
*/ info->elem = (struct nf_hook_ops *)elem; - info->mark = skb->nfmark; info->pf = pf; info->hook = hook; info->okfn = okfn; info->indev = indev; info->outdev = outdev; - skb->nfmark = (unsigned long)info; + info->queuenum = queuenum; + + __skb_field(skb, &skb_res, struct nf_info *) = info; /* Bump dev refs so they don't vanish while packet is out */ if (indev) dev_hold(indev); @@ -419,8 +422,8 @@ if ((recip->hookmask & (1 << info->hook)) && info->pf == recip->pf - && (!recip->mark || info->mark == recip->mark) - && (!recip->reason || skb->nfreason == recip->reason)) { + && (recip->queuenum == 0xFFFFFFFF + || info->queuenum == recip->queuenum)) { /* FIXME: Andi says: use netlink. Hmmm... --RR */ if (skb_queue_len(&recip->wake->skbq) >= 100) { NFDEBUG("nf_hook: queue to long.\n"); @@ -428,8 +431,8 @@ } /* Hand it to userspace for collection */ skb_queue_tail(&recip->wake->skbq, skb); - NFDEBUG("Waking up pf=%i hook=%u mark=%lu reason=%u\n", - pf, hook, skb->nfmark, skb->nfreason); + NFDEBUG("Waking up pf=%i hook=%u\n", + pf, hook); wake_up_interruptible(&recip->wake->sleep); return; @@ -473,9 +476,10 @@ elem = &nf_hooks[pf][hook]; verdict = nf_iterate(&nf_hooks[pf][hook], &skb, hook, indev, outdev, &elem, okfn); - if (verdict == NF_QUEUE) { + if (verdict >= NF_QUEUE) { NFDEBUG("nf_hook: Verdict = QUEUE.\n"); - nf_queue(skb, elem, pf, hook, indev, outdev, okfn); + nf_queue(skb, elem, pf, hook, indev, outdev, okfn, + verdict - NF_QUEUE); } read_unlock_bh(&nf_lock); @@ -517,24 +521,24 @@ /* Blow away any queued skbs; this is overzealous. 
*/ while ((skb = skb_dequeue(&interest->wake->skbq)) != NULL) - nf_reinject(skb, 0, NF_DROP); + nf_reinject(skb, NF_DROP); } void nf_getinfo(const struct sk_buff *skb, struct net_device **indev, - struct net_device **outdev, - unsigned long *mark) + struct net_device **outdev) { - const struct nf_info *info = (const struct nf_info *)skb->nfmark; + const struct nf_info *info = + __skb_field(skb, &skb_res, struct nf_info *); *indev = info->indev; *outdev = info->outdev; - *mark = info->mark; } -void nf_reinject(struct sk_buff *skb, unsigned long mark, unsigned int verdict) +void nf_reinject(struct sk_buff *skb, unsigned int verdict) { - struct nf_info *info = (struct nf_info *)skb->nfmark; + const struct nf_info *info = + __skb_field(skb, &skb_res, struct nf_info *); struct list_head *elem = &info->elem->list; struct list_head *i; @@ -551,16 +555,16 @@ /* Continue traversal iff userspace said ok, and devices still exist... */ if (verdict == NF_ACCEPT) { - skb->nfmark = mark; verdict = nf_iterate(&nf_hooks[info->pf][info->hook], &skb, info->hook, info->indev, info->outdev, &elem, info->okfn); } - if (verdict == NF_QUEUE) { + if (verdict >= NF_QUEUE) { nf_queue(skb, elem, info->pf, info->hook, - info->indev, info->outdev, info->okfn); + info->indev, info->outdev, info->okfn, + verdict - NF_QUEUE); } read_unlock_bh(&nf_lock); @@ -626,4 +630,8 @@ for (i = 0; i < NPROTO; i++) for (h = 0; h < NF_MAX_HOOKS; h++) INIT_LIST_HEAD(&nf_hooks[i][h]); + + if (skb_field_reserve(&skb_res) != 0) + panic("Can't reserve a %u byte field in skb\n", + sizeof(struct nf_info *)); } diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/ipv4/route.c linux-2.3/net/ipv4/route.c 
--- linux-2.3-official/net/ipv4/route.c Wed Nov 24 18:19:29 1999 +++ linux-2.3/net/ipv4/route.c Thu Dec 23 21:05:07 1999 @@ -127,6 +127,10 @@ static struct timer_list rt_periodic_timer = { NULL, NULL, 0, 0L, NULL }; +#ifdef CONFIG_IP_ROUTE_FWMARK +struct skb_field ip_rt_mark_res += { { NULL, NULL }, 0, 0, sizeof(__u32), __alignof__(__u32), NULL }; +#endif /* * Interface to generic destination cache. */ @@ -1107,10 +1111,7 @@ rth->rt_dst = daddr; rth->key.tos = tos; #ifdef CONFIG_IP_ROUTE_FWMARK - if (skb->nfreason == NF_REASON_FOR_ROUTING) - rth->key.fwmark = skb->nfmark; - else - rth->key.fwmark = 0; + rth->key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32); #endif rth->key.src = saddr; rth->rt_src = saddr; @@ -1189,10 +1190,7 @@ key.src = saddr; key.tos = tos; #ifdef CONFIG_IP_ROUTE_FWMARK - if (skb->nfreason == NF_REASON_FOR_ROUTING) - key.fwmark = skb->nfmark; - else - key.fwmark = 0; + key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32); #endif key.iif = dev->ifindex; key.oif = 0; @@ -1314,10 +1312,7 @@ rth->rt_dst = daddr; rth->key.tos = tos; #ifdef CONFIG_IP_ROUTE_FWMARK - if (skb->nfreason == NF_REASON_FOR_ROUTING) - rth->key.fwmark = skb->nfmark; - else - rth->key.fwmark = 0; + rth->key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32); #endif rth->key.src = saddr; rth->rt_src = saddr; @@ -1391,10 +1386,7 @@ rth->rt_dst = daddr; rth->key.tos = tos; #ifdef CONFIG_IP_ROUTE_FWMARK - if (skb->nfreason == NF_REASON_FOR_ROUTING) - rth->key.fwmark = skb->nfmark; - else - rth->key.fwmark = 0; + rth->key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32); #endif rth->key.src = saddr; rth->rt_src = saddr; @@ -1482,8 +1474,7 @@ rth->key.oif == 0 && #ifdef CONFIG_IP_ROUTE_FWMARK rth->key.fwmark - == (skb->nfreason == NF_REASON_FOR_ROUTING - ? 
skb->nfmark : 0) && + == __skb_field(skb, &ip_rt_mark_res, __u32) && #endif rth->key.tos == tos) { rth->u.dst.lastuse = jiffies; @@ -2145,7 +2136,6 @@ #endif #endif - void __init ip_rt_init(void) { ipv4_dst_ops.kmem_cachep = kmem_cache_create("ip_dst_cache", @@ -2166,5 +2156,10 @@ proc_net_create ("rt_cache", 0, rt_cache_get_info); #ifdef CONFIG_NET_CLS_ROUTE create_proc_read_entry("net/rt_acct", 0, 0, ip_rt_acct_read, NULL); +#endif + +#ifdef CONFIG_IP_ROUTE_FWMARK + if (skb_field_reserve(&ip_rt_mark_res) != 0) + panic("ip_rt_init: can't reserve mark in skb"); #endif } diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/netsyms.c linux-2.3/net/netsyms.c --- linux-2.3-official/net/netsyms.c Tue Dec 28 18:25:15 1999 +++ linux-2.3/net/netsyms.c Tue Dec 28 16:15:04 1999 @@ -101,6 +101,9 @@ /* Skbuff symbols. 
*/ EXPORT_SYMBOL(skb_over_panic); EXPORT_SYMBOL(skb_under_panic); +EXPORT_SYMBOL(skb_field_reserve); +EXPORT_SYMBOL(skb_field_unreserve); +EXPORT_SYMBOL(skb_field_update); /* Socket layer registration */ EXPORT_SYMBOL(sock_register); @@ -251,6 +254,9 @@ EXPORT_SYMBOL(inetdev_by_index); EXPORT_SYMBOL(in_dev_finish_destroy); EXPORT_SYMBOL(ip_defrag); +#ifdef CONFIG_IP_ROUTE_FWMARK +EXPORT_SYMBOL(ip_rt_mark_res); +#endif /* Route manipulation */ EXPORT_SYMBOL(ip_rt_ioctl); @@ -580,7 +586,11 @@ EXPORT_SYMBOL(nf_register_interest); EXPORT_SYMBOL(nf_unregister_interest); EXPORT_SYMBOL(nf_hook_slow); -#endif +#ifdef CONFIG_NET_CLS_FW +extern struct skb_field cls_fw_res; +EXPORT_SYMBOL(cls_fw_res); +#endif /* CONFIG_NET_CLS_FW */ +#endif /* CONFIG_NETFILTER */ EXPORT_SYMBOL(register_gifconf); diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/sched/cls_api.c linux-2.3/net/sched/cls_api.c --- linux-2.3-official/net/sched/cls_api.c Fri Oct 15 15:51:35 1999 +++ linux-2.3/net/sched/cls_api.c Thu Dec 23 20:41:22 1999 @@ -461,6 +461,14 @@ INIT_TC_FILTER(route4); #endif #ifdef CONFIG_NET_CLS_FW +#ifdef CONFIG_NETFILTER + { + extern struct skb_field cls_fw_res; + + if (skb_field_reserve(&cls_fw_res) != 0) + panic("tc_filter_init: Can't reserve field cls_fw"); + } +#endif INIT_TC_FILTER(fw); #endif #ifdef CONFIG_NET_CLS_RSVP diff -urN --minimal --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude 
install-kernel linux-2.3-official/net/sched/cls_fw.c linux-2.3/net/sched/cls_fw.c --- linux-2.3-official/net/sched/cls_fw.c Fri Oct 15 15:51:29 1999 +++ linux-2.3/net/sched/cls_fw.c Thu Dec 23 20:47:10 1999 @@ -40,6 +40,9 @@ #include #include +struct skb_field cls_fw_res += { { NULL, NULL }, 0, 0, sizeof(__u32), __alignof__(u32), NULL }; + struct fw_head { struct fw_filter *ht[256]; @@ -65,10 +68,11 @@ { struct fw_head *head = (struct fw_head*)tp->root; struct fw_filter *f; -#ifdef CONFIG_NETFILTER - u32 id = (skb->nfreason == NF_REASON_FOR_CLS_FW ? skb->nfmark : 0); -#else u32 id = 0; +#ifdef CONFIG_NETFILTER + u32 *idp = skb_field(skb, &cls_fw_res, u32); + + if (idp) id = *idp; #endif if (head == NULL) @@ -375,11 +379,28 @@ #ifdef MODULE int init_module(void) { - return register_tcf_proto_ops(&cls_fw_ops); + int ret = 0; + +#ifdef CONFIG_NETFILTER + ret = skb_field_reserve(&cls_fw_res); +#endif + + if (ret == 0) { + ret = register_tcf_proto_ops(&cls_fw_ops); +#ifdef CONFIG_NETFILTER + if (ret != 0) + skb_field_unreserve(&cls_fw_res); +#endif + } + + return ret; } void cleanup_module(void) { unregister_tcf_proto_ops(&cls_fw_ops); +#ifdef CONFIG_NETFILTER + skb_field_unreserve(&cls_fw_res); +#endif } #endif -- Hacking time. 
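Editorial sketch of the reservation logic in the patch above, reduced to plain userspace C so it can be compiled and poked at. It is not the patch itself: the names (fields, field_reserve, field_unreserve, gen_before, MAX_FIELDS) are hypothetical, a small sorted array stands in for the kernel linked list, and there is no locking or destructor handling. Two deliberate differences from the posted code: the bounds check uses `>` so a field may end exactly at the last byte of the reserve area (the patch's `<` refuses that case), and the generation test subtracts as unsigned before converting to int, which sidesteps signed overflow while keeping the same wraparound-tolerant ordering on common two's-complement targets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RESERVE_BYTES 32u  /* size of reserve[] in the patch */
#define MAX_FIELDS    8u   /* hypothetical cap for this sketch */

struct field {
    size_t offset;
    size_t size;
};

static struct field fields[MAX_FIELDS];  /* kept sorted by offset */
static size_t nfields;

/* Reserve `size` bytes aligned to `align` (a power of two).
 * First fit over the sorted list; returns the offset, or -1 if full. */
static long field_reserve(size_t size, size_t align)
{
    size_t mask = align - 1, off = 0, i;

    assert(size > 0 && align != 0 && (align & mask) == 0);
    if (nfields == MAX_FIELDS)
        return -1;

    for (i = 0; i < nfields; i++) {
        if (off + size <= fields[i].offset)
            break;  /* fits in the gap before field i */
        /* next candidate: first aligned byte after field i */
        off = (fields[i].offset + fields[i].size + mask) & ~mask;
    }
    if (off + size > RESERVE_BYTES)
        return -1;

    /* insert at position i, keeping the array sorted */
    memmove(&fields[i + 1], &fields[i], (nfields - i) * sizeof(fields[0]));
    fields[i].offset = off;
    fields[i].size = size;
    nfields++;
    return (long)off;
}

/* Release the field starting at `offset`; returns 0, or -1 if unknown. */
static int field_unreserve(size_t offset)
{
    for (size_t i = 0; i < nfields; i++) {
        if (fields[i].offset == offset) {
            memmove(&fields[i], &fields[i + 1],
                    (nfields - i - 1) * sizeof(fields[0]));
            nfields--;
            return 0;
        }
    }
    return -1;
}

/* Wraparound-tolerant "generation a is older than generation b". */
static int gen_before(unsigned int a, unsigned int b)
{
    return (int)(a - b) < 0;
}
```

For example, reserving a 4-byte field, a 1-byte field and an 8-byte field in that order yields offsets 0, 4 and 8; releasing the 1-byte field and then asking for 2 bytes with 2-byte alignment reuses offset 4.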
From owner-netdev@oss.sgi.com Tue Dec 28 04:51:53 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 04:51:43 -0800 Received: from laurin.munich.netsurf.de ([194.64.166.1]:46520 "EHLO laurin.munich.netsurf.de") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 04:51:17 -0800 Received: from fred.muc.de (none@ns1094.munich.netsurf.de [195.180.235.94]) by laurin.munich.netsurf.de (8.9.3/8.9.3) with ESMTP id NAA07236; Tue, 28 Dec 1999 13:50:27 +0100 (MET) Received: from andi by fred.muc.de with local (Exim 2.05 #1) id 122w4k-0000GX-00; Tue, 28 Dec 1999 13:50:06 +0100 Date: Tue, 28 Dec 1999 13:50:06 +0100 From: Andi Kleen To: Rusty Russell Cc: netdev@oss.sgi.com Subject: Re: [PATCH] skb field reservation v2.3.34 Message-ID: <19991228135006.A1007@fred.muc.de> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.95.4us In-Reply-To: ; from Rusty Russell on Tue, Dec 28, 1999 at 08:33:05AM +0100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing On Tue, Dec 28, 1999 at 08:33:05AM +0100, Rusty Russell wrote: > Hi all, > > This allows random parties to grab a field in the skbuff > header, and attach a destructor. This makes the connection tracking > code (a netfilter module) much nicer (ie. real reference counts), and > can also be used for other things where you need to play with skbs. > It is the scalable replacement for the icky `mark' field which > everyone hated (well, I hated, anyway). > > What do the networking triumvirate think? (Or should that be > `networking troika'?) Ugh. It is an ugly duck, but I guess it is needed. It adds at least one cache line access in the hot path. Please at least use a flag bit in the common part that can be tested without fetching the new cache line on destruction (so that hot path code without firewalling does not pay the price). Also the linked list is cache unfriendly.
Because the reserve buffer is limited anyways I think it is ok to use a fixed size array for the destructor pointers. -Andi From owner-netdev@oss.sgi.com Tue Dec 28 05:19:13 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 05:19:03 -0800 Received: from linuxcare.canberra.net.au ([203.29.91.49]:41234 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 05:18:49 -0800 Received: from gargle.linuxcare.com.au (penicillin.linuxcare.com.au [10.61.2.27]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id AAA08760; Wed, 29 Dec 1999 00:18:32 +1100 Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Wed, 29 Dec 1999 00:18:32 +1100 (EST) Message-Id: From: Rusty Russell To: Andi Kleen Cc: netdev@oss.sgi.com Subject: Re: [PATCH] skb field reservation v2.3.34 In-reply-to: Your message of "Tue, 28 Dec 1999 13:50:06 BST." <19991228135006.A1007@fred.muc.de> Date: Wed, 29 Dec 1999 00:18:32 +1100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing In message <19991228135006.A1007@fred.muc.de> you write: > Please use a flag bit in the common part at least, that can be tested > without fetching the new cache line on destruction (so that hot path > code without firewalling does not pay the price) Not sure I understand (common part of what?). I assume you are worried about __kfree_skb... I can have a global destructor count if you want, or better a max_destructed_field_offset: for (i = 0; i < max_destructed_field_offset; i += sizeof(long)) This will be zero iterations for the no-destructor case. > Also the linked list is cache unfriendly. Because the reserve buffer > is limited anyways I think it is ok to use a fixed size array for the > destructor pointers. 
I don't want destructor called unless some bit is set in the area reserved (otherwise we get a function call to ip connection tracking for every AF_UNIX skb: messy *and* slow). That means we have an array of: struct { u_int16_t start; u_int16_t size; void (*destruct)(struct sk_buff *skb); } which is not that much better than a linked list. Rusty. -- Hacking time. From owner-netdev@oss.sgi.com Tue Dec 28 06:42:53 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 06:42:33 -0800 Received: from colin.muc.de ([193.149.48.1]:17427 "HELO colin.muc.de") by oss.sgi.com with SMTP id ; Tue, 28 Dec 1999 06:42:16 -0800 Received: by colin.muc.de id <140552-2>; Tue, 28 Dec 1999 15:42:09 +0100 Message-ID: <19991228154203.35941@colin.muc.de> Date: Tue, 28 Dec 1999 15:42:03 +0100 From: Andi Kleen To: Rusty Russell Cc: Andi Kleen , netdev@oss.sgi.com Subject: Re: [PATCH] skb field reservation v2.3.34 References: <19991228135006.A1007@fred.muc.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.88e In-Reply-To: ; from Rusty Russell on Tue, Dec 28, 1999 at 02:18:32PM +0100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing On Tue, Dec 28, 1999 at 02:18:32PM +0100, Rusty Russell wrote: > In message <19991228135006.A1007@fred.muc.de> you write: > > Please use a flag bit in the common part at least, that can be tested > > without fetching the new cache line on destruction (so that hot path > > code without firewalling does not pay the price) > > Not sure I understand (common part of what?). Along the char flags (used-ip_summed) which are accessed near always here are another two free bytes. These are near free to test because the cache line is already loaded. If you add a flag here (similar to the cloned flag) you can avoid the cache line reference in case it is not needed. > > I assume you are worried about __kfree_skb... 
I can have a global > destructor count if you want, or better a max_destructed_field_offset: > > for (i = 0; i < max_destructed_field_offset; i += sizeof(long)) > > This will be zero iterations for the no-destructor case. > > > Also the linked list is cache unfriendly. Because the reserve buffer > > is limited anyways I think it is ok to use a fixed size array for the > > destructor pointers. > > I don't want destructor called unless some bit is set in the area > reserved (otherwise we get a function call to ip connection tracking > for every AF_UNIX skb: messy *and* slow). > > That means we have an array of: > > struct { > u_int16_t start; > u_int16_t size; > void (*destruct)(struct sk_buff *skb); > } > > which is not that much better than a linked list. It is, because with the linked list you need one cache line per list entry (and the wasted bytes are unlikely to be useful). With the array you can pack 4 entries into a single cache line (assuming 32 bytes lines) -Andi From owner-netdev@oss.sgi.com Tue Dec 28 07:22:13 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 07:21:53 -0800 Received: from [130.131.166.29] ([130.131.166.29]:61892 "EHLO canospam.agcs.com") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 07:21:36 -0800 Received: from frontier. 
(marshal.agcs.com [130.131.60.2]) by canospam.agcs.com (8.9.3/8.9.1) with SMTP id IAA09356 for ; Tue, 28 Dec 1999 08:21:37 -0700 (MST) Received: from bootstrap.agcs.com ([130.131.48.11]) by frontier.agcs.com; Tue, 28 Dec 1999 08:21:24 +0000 (MST) Received: from pxmail1.agcs.com (pxmail1.agcs.com [130.131.168.5]) by bootstrap.agcs.com (8.9.3/8.9.1) with ESMTP id IAA10882 for ; Tue, 28 Dec 1999 08:20:55 -0700 (MST) Posted-Date: Tue, 28 Dec 1999 08:20:55 -0700 (MST) Received: from agcs.com ([130.131.166.110]) by pxmail1.agcs.com (Netscape Messaging Server 3.61) with ESMTP id AAA138F for ; Tue, 28 Dec 1999 08:21:24 -0700 Message-ID: <3868D573.29D8A6BE@agcs.com> Date: Tue, 28 Dec 1999 08:21:23 -0700 From: Ben Greear Reply-To: greearb@agcs.com Organization: AG Communication Systems X-Mailer: Mozilla 4.5 [en] (X11; U; SunOS 5.5.1 sun4u) X-Accept-Language: en MIME-Version: 1.0 To: netdev Subject: Layer 3 (IP) based switching for Linux? (Proxy-ARP??) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing I first sent this to the old .mx mailing list, which seems to be defunct. If you've already seen this, I apologize. After reading more in the intervening days, I am starting to think that what I really want is Proxy-ARP. However, I'm having a hard time finding any info on how to set that up. Here is my original email: (View in fixed-width font.) I am trying to set up a network that looks something like this:

PC1 -------\
5.5.1.2/24  \__ eth0 |-------|
                     |       |             5.5.1.254/24
   ...               | Linux | eth2 ------ [ gateway ] ---- { internet }
                     |       |
PC2 ----------- eth1 |_______|
5.5.1.3/24

Instead of eth**, I'm going to be using my vlan code to have lots of vlan interfaces, probably 100+, maybe 1000+, but for the sake of argument, the eth** should be identical in nature. If the gateway idea is too weird, then the default gateway could reside on eth2 of the Linux box.
The fun part is that I want to be able to 'firewall' the interfaces coming from the PC's, mainly to restrict them to a certain IP address (they are un-trusted, and could possible be malicious.) The IPs will be configured from user-space. PC's should be able to talk to PC's as well, so the linux box will have to do some (hopefully smart) switching at layer 2 (ie ARP.) It will also have to switch layer 3, because the gateway will not want to route a pkt back down the wire, say from PC1 to PC2. At the same time, the bandwidth from the PCs to the Linux box is limited, and should be optimized (the switch needs to be smart.) I believe static routes would work except in the case of PC <-> PC communication? Since the Linux box will be configured to know what IP's belong where, it should *NOT* try to automatically learn the IP addresses. However, if there is code that already does this, then I can just block the out-going pkts with the firewalling rules, I hope. So, does anyone know of any existing software that can do this, or do I need to start hacking into kernel!! Thanks! Ben -- Ben Greear greearb@agcs.com Pager: 202-2717 (623) 581 4980 "More weight!" 
-- _The Crucible._
http://hydrogen:8080/home/greearb/public_html/index.html

From owner-netdev@oss.sgi.com Tue Dec 28 08:05:44 1999
Received: by oss.sgi.com id ; Tue, 28 Dec 1999 08:05:24 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:50863 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Tue, 28 Dec 1999 08:05:18 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id TAA22785; Tue, 28 Dec 1999 19:04:52 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912281604.TAA22785@ms2.inr.ac.ru>
Subject: Re: [PATCH] skb field reservation v2.3.34
To: rusty@linuxcare.COM.AU (Rusty Russell)
Date: Tue, 28 Dec 1999 19:04:52 +0300 (MSK)
Cc: netdev@oss.sgi.com
In-Reply-To: from "Rusty Russell" at Dec 28, 99 11:13:07 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Length: 2158
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello!

> This allows random parties to grab a field in the skbuff
> header, and attach a destructor.

Applications beyond netfilter?

Such objects miss the main required property: conservation across
skb_clone/skb_copy. I see _no_ applications for such a feature
beyond storing nfdebug; even nfmark must be copied.

I am sorry, but you invented something really strange.
To be honest, that is the reason why I am pretty apathetic about this,
so you may consider everything written below as pure cavils 8)

> This makes the connection tracking
> code (a netfilter module) much nicer (ie. real reference counts),

Is it possible to make it nil, at least when CONFIG_NETFILTER
is not defined? Do you see the point? If only netfilter
uses it now, ifdef it properly. You've just allocated 32 mainly
useless bytes and zero them on each skb_alloc/clone.

Actually, it is possible to remove this static array from sk_buff
altogether. F.e. look at TSI chains in gated. It is a pretty expensive
thing, but it has zero offset if nobody needs it, and it is critically
more flexible.

Another variant is to use aggregated objects a la skb->dst.

Well, and if a static array is used, then it makes sense to at least
allocate the objects in it statically too. Or do you plan to build
some binary-independent DDI? Then defer this great task to 2.5 or later;
all these skbuff things will apparently be rewritten from scratch
for zero copy in any case.

> can also be used for other things where you need to play with skbs.

Paul. When we want to play, we just add a new field to skb and that's all. 8)8)
If the game turns out not to be interesting, we delete it.
Did you forget that Linux has public sources?

> + /* Could optimize: only iterate over part actually allocated --RR */
> + for (i = 0; i < sizeof(skb->reserve); i += sizeof(long)) {

If you could, then why did you not? 8)

Also, I remember you liked to talk about coding style. 8)
Please look at the surrounding style and note that size_t and
other posixisms are not used there. It is "unsigned long",
if you really meant this. But something prompts me that you wanted
to write just "int" or "unsigned char" in some places 8)

Alexey

From owner-netdev@oss.sgi.com Tue Dec 28 09:13:34 1999
Received: by oss.sgi.com id ; Tue, 28 Dec 1999 09:13:24 -0800
Received: from dial-02-247-tnt-ms2.btvt.together.net ([206.231.24.247]:20727 "EHLO sparrow.websense.net") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 09:13:11 -0800
Received: from localhost (wstearns@localhost [127.0.0.1]) by sparrow.websense.net (8.8.7/8.8.7) with ESMTP id MAA03462; Tue, 28 Dec 1999 12:13:06 -0500
Date: Tue, 28 Dec 1999 12:12:55 -0500 (EST)
From: William Stearns
X-Sender: wstearns@sparrow.websense.net
Reply-To: William Stearns
To: Ben Greear
cc: netdev , William Stearns
Subject: Re: Layer 3 (IP) based switching for Linux? (Proxy-ARP??)
In-Reply-To: <3868D573.29D8A6BE@agcs.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Good day, all, I've mailed Ben an introduction to the user-level side of proxyarp in a separate email. I'd be happy to privately mail it to anyone else that would like a copy. Ben - once you have the proxyarp set up, you can firewall between the two sides of the network. My apologies if the note I've sent you completely misses the problem you're having, or if you already knew all I mailed... :) Cheers, - Bill On Tue, 28 Dec 1999, Ben Greear wrote: > reading more in the intervening days, I am starting to think that > what I really want is Proxy-ARP. However, I'm having a hard time > finding any info on how to set that up. Here is my original [snip] --------------------------------------------------------------------------- "Love is a snowmobile racing across the tundra and then suddenly it flips over, pinning you underneath. At night, the ice weasels come." -- Matt Groening (Courtesy of Steve Dodd ) -------------------------------------------------------------------------- William Stearns (wstearns@pobox.com). Mason, Buildkernel, named2hosts, and ipfwadm2ipchains are at: http://www.pobox.com/~wstearns/ -------------------------------------------------------------------------- From owner-netdev@oss.sgi.com Tue Dec 28 09:34:54 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 09:34:44 -0800 Received: from [130.131.166.29] ([130.131.166.29]:30931 "EHLO canospam.agcs.com") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 09:34:24 -0800 Received: from frontier. 
(marshal.agcs.com [130.131.60.2]) by canospam.agcs.com (8.9.3/8.9.1) with SMTP id KAA12966 for ; Tue, 28 Dec 1999 10:34:20 -0700 (MST) Received: from bootstrap.agcs.com ([130.131.48.11]) by frontier.agcs.com; Tue, 28 Dec 1999 10:34:07 +0000 (MST) Received: from pxmail1.agcs.com (pxmail1.agcs.com [130.131.168.5]) by bootstrap.agcs.com (8.9.3/8.9.1) with ESMTP id KAA14714 for ; Tue, 28 Dec 1999 10:33:38 -0700 (MST) Posted-Date: Tue, 28 Dec 1999 10:33:38 -0700 (MST) Received: from agcs.com ([130.131.166.110]) by pxmail1.agcs.com (Netscape Messaging Server 3.61) with ESMTP id AAA24C3 for ; Tue, 28 Dec 1999 10:34:07 -0700 Message-ID: <3868F48F.7C363426@agcs.com> Date: Tue, 28 Dec 1999 10:34:07 -0700 From: Ben Greear Reply-To: greearb@agcs.com Organization: AG Communication Systems X-Mailer: Mozilla 4.5 [en] (X11; U; SunOS 5.5.1 sun4u) X-Accept-Language: en MIME-Version: 1.0 To: netdev Subject: Is there a limit on the number of devices you can have? Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing I've seen no hard limits on the number of devices (eth0, vlan0000, lo, etc) that you can have in a system, but I'm curious if anyone knows any practical limits? Are there any linear searches (ie walk the device list) other than one-time configuration and other non-critical path instances? The reason that I ask is I'm considering a box with 1000+ VLAN devices... :) Thanks, Ben -- Ben Greear greearb@agcs.com Pager: 202-2717 (623) 581 4980 "More weight!" 
-- _The Crucible._ http://hydrogen:8080/home/greearb/public_html/index.html From owner-netdev@oss.sgi.com Tue Dec 28 10:31:24 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 10:31:14 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:17793 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Tue, 28 Dec 1999 10:30:54 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id VAA27800; Tue, 28 Dec 1999 21:30:08 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912281830.VAA27800@ms2.inr.ac.ru> Subject: Re: Is there a limit on the number of devices you can have? To: greearb@agcs.COM Date: Tue, 28 Dec 1999 21:30:08 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: <3868F48F.7C363426@agcs.com> from "Ben Greear" at Dec 28, 99 09:13:02 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Length: 580 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Hello! > you can have in a system, but I'm curious if anyone knows any practical > limits? Try and you will see. 8) > Are there any linear searches (ie walk the device list) other than > one-time configuration and other non-critical path instances? Yes, all the devices are in single linked list. This list is not searched in data paths, but some applications will have pain in any case, if list is too long. Actually, if you need more than _several_ devices, your concept is funny. If you need 1000 VLANs, create one NBMA device, what is the problem? 
Alexey Kuznetsov From owner-netdev@oss.sgi.com Tue Dec 28 10:57:34 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 10:57:24 -0800 Received: from laurin.munich.netsurf.de ([194.64.166.1]:20382 "EHLO laurin.munich.netsurf.de") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 10:56:59 -0800 Received: from fred.muc.de (none@ns1212.munich.netsurf.de [195.180.235.212]) by laurin.munich.netsurf.de (8.9.3/8.9.3) with ESMTP id TAA20681; Tue, 28 Dec 1999 19:56:46 +0100 (MET) Received: from andi by fred.muc.de with local (Exim 2.05 #1) id 1231nL-0001H8-00; Tue, 28 Dec 1999 19:56:31 +0100 Date: Tue, 28 Dec 1999 19:56:31 +0100 From: Andi Kleen To: Ben Greear Cc: netdev Subject: Re: Is there a limit on the number of devices you can have? Message-ID: <19991228195631.A4899@fred.muc.de> References: <3868F48F.7C363426@agcs.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.95.4us In-Reply-To: <3868F48F.7C363426@agcs.com>; from Ben Greear on Tue, Dec 28, 1999 at 06:34:07PM +0100 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing On Tue, Dec 28, 1999 at 06:34:07PM +0100, Ben Greear wrote: > I've seen no hard limits on the number of devices (eth0, vlan0000, lo, etc) that > you can have in a system, but I'm curious if anyone knows any practical > limits? Are there any linear searches (ie walk the device list) other than > one-time configuration and other non-critical path instances? > > The reason that I ask is I'm considering a box with 1000+ > VLAN devices... :) The current algorithm to generate device names (eth0, eth1, etc.) is limited to 100 devices. Should be easy to fix. -Andi -- This is like TV. I don't like TV. From owner-netdev@oss.sgi.com Tue Dec 28 11:51:55 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 11:51:45 -0800 Received: from [130.131.166.29] ([130.131.166.29]:1758 "EHLO canospam.agcs.com") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 11:51:31 -0800 Received: from frontier. 
(marshal.agcs.com [130.131.60.2]) by canospam.agcs.com (8.9.3/8.9.1) with SMTP id MAA15582 for ; Tue, 28 Dec 1999 12:51:32 -0700 (MST)
Received: from bootstrap.agcs.com ([130.131.48.11]) by frontier.agcs.com; Tue, 28 Dec 1999 12:51:21 +0000 (MST)
Received: from pxmail1.agcs.com (pxmail1.agcs.com [130.131.168.5]) by bootstrap.agcs.com (8.9.3/8.9.1) with ESMTP id MAA17554 for ; Tue, 28 Dec 1999 12:50:51 -0700 (MST)
Posted-Date: Tue, 28 Dec 1999 12:50:51 -0700 (MST)
Received: from agcs.com ([130.131.166.110]) by pxmail1.agcs.com (Netscape Messaging Server 3.61) with ESMTP id AAA30DD; Tue, 28 Dec 1999 12:51:20 -0700
Message-ID: <386914B8.C51C9C4A@agcs.com>
Date: Tue, 28 Dec 1999 12:51:20 -0700
From: Ben Greear
Reply-To: greearb@agcs.com
Organization: AG Communication Systems
X-Mailer: Mozilla 4.5 [en] (X11; U; SunOS 5.5.1 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: Prairie Flower , netdev
Subject: Re: Layer 3 (IP) based switching for Linux? (Proxy-ARP??)
References: <19991228184741.RVRH10175.mail.rdc2.bc.home.com@pinky.deadbeef.ca>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Prairie Flower wrote:

> On Tue, 28 Dec 1999 08:21:23 -0700, Ben Greear wrote:
>
> [snip]
>
> >PC1 -------\
> >5.5.1.2/24  \__ eth0 |-------|
> >                     |       |            5.5.1.254/24
> >   ...               | Linux | eth2 ------ [ gateway ] ---- { internet }
> >                     |       |
> >PC2 ----------- eth1 |_______|
> >5.5.1.3/24
>
> Are you sure you don't mean 10.5.1.0/24?
>
> wildrose@home.com

Which interface are you talking about? I think it is how I want
it, but let me explain my true goals.

I want to firewall based on VLANs. (I plan on using my vlan code
for that, to make each vlan look like a separate interface.)

I want PCs to look like they are on a normal subnet. In other
words, these are customer machines, and the customers are most likely
clueless (this is a DSL type offering.)

This means no host routes, and no linux-only tweaks.

I want to conserve IP addresses, so no subnet-per-interface (that would
take at least 4 IPs per customer, as well as being a possible headache
for whatever admin had to support the ISP's network.)

The magic box (labeled linux in my picture) can have any amount of
ugly stuff (ie arp proxy, host routes, etc), just so long as it works!!

Currently in the lab, I have this:

On Linux: this setup has been run to create the vlan interfaces and
give them IP addresses:

vconfig add eth1 20
vconfig add eth1 21
ifconfig -i vlan0000 10.1.1.20 # vlan 20
ifconfig -i vlan0001 10.1.1.21 # vlan 21
ifconfig -i vlan0000 up
ifconfig -i vlan0001 up

route add -host 130.131.190.211 vlan0000
route add -host 130.131.190.212 vlan0001

# Do proxy-arp stuff
# Note that all vlan devices on the same NIC (eth1 in this case) have the same MAC.
arp -i vlan0001 -Ds 130.131.190.211 vlan0000 pub
arp -i vlan0000 -Ds 130.131.190.212 vlan0000 pub

PC1 --------vlan1-\
130.131.190.212/24 |      |-------|
                   |      |       |             130.131.190.3
                    -eth1-| Linux | eth0 ------ [ gateway ] ---- { internet }
                   |      |       |
PC2 --------vlan0-/       |_______|
130.131.190.211/24

Things are almost working:

When I try to ping from .212 to .211, the linux box ARP proxies and .212
starts sending icmp requests to 10.1.1.21.

On the vlan0 interface, I see this:

[root@linserv /root]# tcpdump -n -i vlan0000
User level filter, protocol ALL, datagram packet socket
tcpdump: listening on vlan0000
12:36:48.898119 > arp who-has 130.131.190.211 tell 10.1.1.20 (0:60:97:3c:e6:9)
12:36:50.896697 > arp who-has 130.131.190.211 tell 10.1.1.20 (0:60:97:3c:e6:9)

On the vlan0001 interface, I see this:

[root@linserv /root]# tcpdump -n -i vlan0001
User level filter, protocol ALL, datagram packet socket
tcpdump: listening on vlan0001
12:37:12.891220 < 130.131.190.212 > 130.131.190.211: icmp: echo request
12:37:12.898150 > 10.1.1.21 > 130.131.190.212: icmp: host 130.131.190.211 unreachable [tos 0xc0]
12:37:12.898215 > 10.1.1.21 > 130.131.190.212: icmp: host 130.131.190.211 unreachable [tos 0xc0]
12:37:12.898269 > 10.1.1.21 > 130.131.190.212: icmp: host 130.131.190.211 unreachable [tos 0xc0]
12:37:13.891667 < 130.131.190.212 > 130.131.190.211: icmp: echo request
12:37:14.892025 < 130.131.190.212 > 130.131.190.211: icmp: echo request
12:37:15.890706 < 130.131.190.212 > 130.131.190.211: icmp: echo request
12:37:16.888114 > 10.1.1.21 > 130.131.190.212: icmp: host 130.131.190.211 unreachable [tos 0xc0]
12:37:16.888172 > 10.1.1.21 > 130.131.190.212: icmp: host 130.131.190.211 unreachable [tos 0xc0]

The problem is that .211 does not have a host route to tell it how to
get a pkt to 10.1.1.20. (It may have other problems...should it try
to send it to the default gateway?)

So, what if I could set one of the interfaces on Linux to be, say:
130.131.190.200. If I could get the arp to say "tell 130.131.190.200",
instead of 10.1.1.20, then the .211 PC could know how to get the response
back?

All ideas will be appreciated!! :)

Here's some more info that might prove useful:

[root@linserv /root]# ifconfig -a
dummy     Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          BROADCAST NOARP  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

eth0      Link encap:Ethernet  HWaddr 00:60:97:29:6F:B2
          inet addr:130.131.190.238  Bcast:130.131.190.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:479615 errors:0 dropped:0 overruns:0 frame:0
          TX packets:152213 errors:0 dropped:0 overruns:0 carrier:602
          collisions:22 txqueuelen:100
          Interrupt:9 Base address:0xff80

eth1      Link encap:Ethernet  HWaddr 00:60:97:3C:E6:09
          inet addr:192.168.101.1  Bcast:192.168.101.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10716 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14693 errors:0 dropped:0 overruns:0 carrier:0
          collisions:38 txqueuelen:100
          Interrupt:5 Base address:0xff40

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:1859 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1859 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

vlan0000  Link encap:Ethernet  HWaddr 00:60:97:3C:E6:09
          inet addr:10.1.1.20  Bcast:10.255.255.255  Mask:255.0.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5862 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

vlan0001  Link encap:Ethernet  HWaddr 00:60:97:3C:E6:09
          inet addr:10.1.1.21  Bcast:10.255.255.255  Mask:255.0.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6619 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6155 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

[root@linserv /root]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref Use Iface
130.131.190.212 0.0.0.0         255.255.255.255 UH    0      0   0   vlan0001
130.131.190.211 0.0.0.0         255.255.255.255 UH    0      0   0   vlan0000
192.168.101.1   0.0.0.0         255.255.255.255 UH    0      0   0   eth1
130.131.190.238 0.0.0.0         255.255.255.255 UH    0      0   0   eth0
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0   0   eth1
130.131.190.0   0.0.0.0         255.255.255.0   U     0      0   0   eth0
172.0.0.0       130.131.190.229 255.0.0.0       UG    0      0   0   eth0
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0   0   vlan0000
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0   0   vlan0001
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0   0   lo
0.0.0.0         130.131.190.211 0.0.0.0         UG    0      0   0   eth0

[root@linserv /root]# arp -an
? (130.131.190.211) at 00:10:7B:3B:55:01 [ether] on eth0
? (130.131.190.254) at 00:10:4B:7A:A6:D4 [ether] on eth0
? (130.131.190.211) at on vlan0000
? (130.131.190.212) at 00:00:E8:34:22:33 [ether] on vlan0001
? (130.131.190.211) at * PERM PUP on vlan0001
? (130.131.190.212) at * PERM PUP on vlan0000

Thanks,
Ben

--
Ben Greear greearb@agcs.com Pager: 202-2717 (623) 581 4980
"More weight!"
-- _The Crucible._
http://hydrogen:8080/home/greearb/public_html/index.html

From owner-netdev@oss.sgi.com Tue Dec 28 11:56:45 1999
Received: by oss.sgi.com id ; Tue, 28 Dec 1999 11:56:35 -0800
Received: from [130.131.166.29] ([130.131.166.29]:36062 "EHLO canospam.agcs.com") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 11:56:27 -0800
Received: from frontier. (marshal.agcs.com [130.131.60.2]) by canospam.agcs.com (8.9.3/8.9.1) with SMTP id MAA15706 for ; Tue, 28 Dec 1999 12:56:28 -0700 (MST)
Received: from bootstrap.agcs.com ([130.131.48.11]) by frontier.agcs.com; Tue, 28 Dec 1999 12:56:16 +0000 (MST)
Received: from pxmail1.agcs.com (pxmail1.agcs.com [130.131.168.5]) by bootstrap.agcs.com (8.9.3/8.9.1) with ESMTP id MAA17690 for ; Tue, 28 Dec 1999 12:55:47 -0700 (MST)
Posted-Date: Tue, 28 Dec 1999 12:55:47 -0700 (MST)
Received: from agcs.com ([130.131.166.110]) by pxmail1.agcs.com (Netscape Messaging Server 3.61) with ESMTP id AAA3124; Tue, 28 Dec 1999 12:56:16 -0700
Message-ID: <386915E0.67BACD59@agcs.com>
Date: Tue, 28 Dec 1999 12:56:16 -0700
From: Ben Greear
Reply-To: greearb@agcs.com
Organization: AG Communication Systems
X-Mailer: Mozilla 4.5 [en] (X11; U; SunOS 5.5.1 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: kuznet@ms2.inr.ac.ru
CC: netdev@oss.sgi.com
Subject: Re: Is there a limit on the number of devices you can have?
References: <199912281830.VAA27800@ms2.inr.ac.ru>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

kuznet@ms2.inr.ac.ru wrote:

> Actually, if you need more than _several_ devices, your concept
> is funny. If you need 1000 VLANs, create one NBMA device, what is
> the problem?
>
> Alexey Kuznetsov

I'm not too clear on just what an NBMA device is...got any pointers?

Thanks,
Ben

--
Ben Greear greearb@agcs.com Pager: 202-2717 (623) 581 4980
"More weight!"
-- _The Crucible._ http://hydrogen:8080/home/greearb/public_html/index.html From owner-netdev@oss.sgi.com Tue Dec 28 12:05:45 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 12:05:34 -0800 Received: from minus.inr.ac.ru ([193.233.7.97]:38529 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Tue, 28 Dec 1999 12:05:16 -0800 Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id XAA06116; Tue, 28 Dec 1999 23:04:39 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <199912282004.XAA06116@ms2.inr.ac.ru> Subject: Re: Is there a limit on the number of devices you can have? To: greearb@agcs.com Date: Tue, 28 Dec 1999 23:04:39 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: <386915E0.67BACD59@agcs.com> from "Ben Greear" at Dec 28, 99 12:56:16 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Length: 563 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing Hello! > I'm not too clear on just what an NBMA device is...got any pointers? Well, one device for all virtual lans determining how to deliver to a particular vlan via some internally saved information or stored in routing/neighbor tables. F.e. remembering ID of VLAN as sort of destination MAC address and fetching real information, including real MAC addresses from an internal table. I remember, one guy tried to work with about thousand of tunnels. It works, but it is painful even to type ifconfig: too long output 8) And traceroute dumps out 8) Alexey From owner-netdev@oss.sgi.com Tue Dec 28 13:30:37 1999 Received: by oss.sgi.com id ; Tue, 28 Dec 1999 13:30:27 -0800 Received: from [130.131.166.29] ([130.131.166.29]:49636 "EHLO canospam.agcs.com") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 13:30:13 -0800 Received: from frontier. 
(marshal.agcs.com [130.131.60.2]) by canospam.agcs.com (8.9.3/8.9.1) with SMTP id OAA17568 for ; Tue, 28 Dec 1999 14:30:05 -0700 (MST)
Received: from bootstrap.agcs.com ([130.131.48.11]) by frontier.agcs.com; Tue, 28 Dec 1999 14:19:52 +0000 (MST)
Received: from pxmail1.agcs.com (pxmail1.agcs.com [130.131.168.5]) by bootstrap.agcs.com (8.9.3/8.9.1) with ESMTP id OAA19376 for ; Tue, 28 Dec 1999 14:19:23 -0700 (MST)
Posted-Date: Tue, 28 Dec 1999 14:19:23 -0700 (MST)
Received: from agcs.com ([130.131.166.110]) by pxmail1.agcs.com (Netscape Messaging Server 3.61) with ESMTP id AAA3881; Tue, 28 Dec 1999 14:19:52 -0700
Message-ID: <38692977.8CFAEBF1@agcs.com>
Date: Tue, 28 Dec 1999 14:19:51 -0700
From: Ben Greear
Reply-To: greearb@agcs.com
Organization: AG Communication Systems
X-Mailer: Mozilla 4.5 [en] (X11; U; SunOS 5.5.1 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: William Stearns , netdev
Subject: Re: Layer 3 (IP) based switching for Linux? (IT WORKS!!)
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

IT WORKS!! :) (Linux rules!)

After figuring out the win95 box (.211) was no longer on speaking terms
with its NIC, I put a linux box in its place. With a few more arp
statements, everything started working!!

One thing that is a little worrying is that the length of this config
file will grow quadratically as interfaces are added, since every
subscriber needs a proxy entry on every other interface. This may limit
me to fewer interfaces than I was planning on supporting. Or maybe I'll
just need to buy more RAM.

NOTE: .211 became .239, if you are trying to sync this up with the picture
I sent earlier.
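
The growth of the configuration can be estimated with a little arithmetic.
What follows is a minimal sketch, assuming the scheme used in this message:
each subscriber VLAN gets one host route plus one published ARP entry on
every other subscriber VLAN and on the uplink, and each upstream host (DNS
server, gateway) gets one published entry per subscriber VLAN. The function
names are illustrative only and do not come from the vlan code.

```c
/* Number of "arp ... pub" entries for n_subscribers VLAN interfaces and
 * n_upstream hosts proxied toward the subscribers (assumed scheme). */
unsigned long proxy_arp_entries(unsigned long n_subscribers,
                                unsigned long n_upstream)
{
    /* Per subscriber: (n_subscribers - 1) other VLANs + 1 uplink,
     * which is n_subscribers entries in total. */
    unsigned long subscriber_entries = n_subscribers * n_subscribers;

    /* Each upstream host is published once on every subscriber VLAN. */
    unsigned long upstream_entries = n_upstream * n_subscribers;

    return subscriber_entries + upstream_entries;
}

/* One "route add -host" per subscriber. */
unsigned long host_routes(unsigned long n_subscribers)
{
    return n_subscribers;
}
```

With 2 subscribers and 2 upstream hosts this gives 8 published entries and
2 host routes; with 1000 subscribers it gives 1,002,000 entries, so the
count grows with the square of the number of interfaces.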
Here is the end result of the arp-proxy configuration:

vconfig add eth1 20
vconfig add eth1 21
ifconfig -i vlan0000 10.1.1.20 # vlan 20
ifconfig -i vlan0001 10.1.1.21 # vlan 21
ifconfig -i vlan0000 up
ifconfig -i vlan0001 up

# Do proxy-arp stuff
# For each new VLAN Interface (Subscriber):
#   Add vlan interface.
#   Give it IP and configure it UP.
#   Add proxy to every other interface (other than self)
#   Add host route.

# How to find those on vlan0000
route add -host 130.131.190.239 vlan0000
arp -i vlan0001 -Ds 130.131.190.239 vlan0000 pub
arp -i eth0 -Ds 130.131.190.239 eth0 pub

# How to find those on vlan0001
route add -host 130.131.190.212 vlan0001
arp -i vlan0000 -Ds 130.131.190.212 vlan0000 pub
arp -i eth0 -Ds 130.131.190.212 eth0 pub

# Proxy for things on the upstream side.
# How to find .254 (DNS server)
arp -i vlan0000 -Ds 130.131.190.254 vlan0000 pub
arp -i vlan0001 -Ds 130.131.190.254 vlan0000 pub

# How to find .3 (gateway)
arp -i vlan0000 -Ds 130.131.190.3 vlan0000 pub
arp -i vlan0001 -Ds 130.131.190.3 vlan0000 pub

Thanks for everyone's help!!
Ben

--
Ben Greear greearb@agcs.com Pager: 202-2717 (623) 581 4980
"More weight!" -- _The Crucible._
http://hydrogen:8080/home/greearb/public_html/index.html

From owner-netdev@oss.sgi.com Tue Dec 28 16:34:17 1999
Received: by oss.sgi.com id ; Tue, 28 Dec 1999 16:33:57 -0800
Received: from mailhost.uni-koblenz.de ([141.26.64.1]:61152 "EHLO mailhost.uni-koblenz.de") by oss.sgi.com with ESMTP id ; Tue, 28 Dec 1999 16:33:40 -0800
Received: from cacc-17.uni-koblenz.de (cacc-17.uni-koblenz.de [141.26.131.17]) by mailhost.uni-koblenz.de (8.9.1/8.9.1) with ESMTP id BAA00687; Wed, 29 Dec 1999 01:33:44 +0100 (MET)
Received: by lappi.waldorf-gmbh.de id ; Wed, 29 Dec 1999 01:26:48 +0100
Date: Wed, 29 Dec 1999 01:26:48 +0100
From: Ralf Baechle
To: Ben Greear
Cc: netdev
Subject: Re: Layer 3 (IP) based switching for Linux? (Proxy-ARP??)
Message-ID: <19991229012648.C9742@uni-koblenz.de>
References: <3868D573.29D8A6BE@agcs.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0pre3us
In-Reply-To: <3868D573.29D8A6BE@agcs.com>
X-Accept-Language: de,en,fr
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

On Tue, Dec 28, 1999 at 08:21:23AM -0700, Ben Greear wrote:

> I first sent this to the old .mx mailing list, which seems to
> be defunct.

oss.sgi.com rejects all mail via the old list since the old lists had such
an awful percentage of spam.

Ralf

From owner-netdev@oss.sgi.com Wed Dec 29 00:55:41 1999
Received: by oss.sgi.com id ; Wed, 29 Dec 1999 00:55:30 -0800
Received: from azure.mist.sfc.wide.ad.jp ([203.178.140.41]:33286 "EHLO convert rfc822-to-8bit v6.linux.or.jp") by oss.sgi.com with ESMTP id ; Wed, 29 Dec 1999 00:55:03 -0800
Received: from localhost (IDENT:sekiya@LOCALHOST [::ffff:127.0.0.1] (may be forged)) by v6.linux.or.jp (8.9.3+3.1W/3.3Wb4) with ESMTP id RAA02528; Wed, 29 Dec 1999 17:53:13 +0900
Date: Wed, 29 Dec 1999 17:53:00 +0900
Message-ID:
From: Yuji Sekiya
To: kuznet@ms2.inr.ac.ru
Cc: netdev@oss.sgi.com, users@ipv6.org
Cc: Hideaki YOSHIFUJI
Subject: sin6_scope_id
User-Agent: Wanderlust/2.2.12 (Joyride) SEMI/1.13.7 (Awazu) FLIM/1.13.2 (Kasanui) MULE XEmacs/21.2 (beta19) (Shinjuku) (i586-pc-linux)
Organization: Information Sciences Institute
MIME-Version: 1.0 (generated by SEMI 1.13.7 - "Awazu")
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8BIT
X-Dispatcher: imput version 990604(IM116)
Lines: 83
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello Alexey and folks,

We (the Linux IPv6 Group in Japan) have found a portability problem:
the sin6_scope_id member is missing from the Linux kernel.
We ran into this problem while porting some applications to be
IPv6 ready.

RFC 2553 says the following in section 2.4:

    2.4 Structures

    When structures are described the members shown are the ones that must
                                                                      ~~~~
    appear in an implementation.

and defines sockaddr_in6 as follows:

    struct sockaddr_in6 {
        sa_family_t     sin6_family;    /* AF_INET6 */
        in_port_t       sin6_port;      /* transport layer port # */
        uint32_t        sin6_flowinfo;  /* IPv6 traffic class & flow info */
        struct in6_addr sin6_addr;      /* IPv6 address */
        uint32_t        sin6_scope_id;  /* set of interfaces for a scope */
    };

But in the in6.h file of the Linux kernel, sockaddr_in6 is defined as follows:

    struct sockaddr_in6 {
        unsigned short int sin6_family;   /* AF_INET6 */
        __u16              sin6_port;     /* Transport layer port # */
        __u32              sin6_flowinfo; /* IPv6 flow information */
        struct in6_addr    sin6_addr;     /* IPv6 address */
    };

When the destination of an outgoing packet is an IPv6 global
address, the kernel can select an outgoing interface from the routing
table. But for a link-local address, how can I select the
interface for outgoing packets without sin6_scope_id ?

Of course, I know the method of using SO_BINDTODEVICE, but I think
it is a Linux-only method, and it makes it hard to keep applications
portable to other operating systems.

In an IPv4 environment, there are few applications which have to select
an outgoing interface. It is a rare case (at least for me) that we
have to specify an outgoing interface in order to use an application.
AFAIK, tcpdump is the only one.

But in IPv6, we can select among three address types:
global, site-local and link-local. So it is not a rare case for us to
specify an outgoing interface.

From our experience, we have many times hit cases where we want to use
sin6_scope_id for portability of source code.
IMHO, it is better for us to support both
SO_BINDTODEVICE and sin6_scope_id.

Do you have any plan to add sin6_scope_id in in6.h ?
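
For comparison, this is roughly how a portable application names the
outgoing interface for a link-local destination on a system whose
sockaddr_in6 carries the RFC 2553 sin6_scope_id member. It is a minimal
sketch, assuming that member and if_nametoindex() are both available (the
kernel header quoted above does not yet provide sin6_scope_id);
fill_scoped_addr is a hypothetical helper name, not an existing API.

```c
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>

/* Fill *sa with addr_text/port and tie it to the interface named ifname.
 * Returns 0 on success, -1 on a bad address string or unknown interface. */
int fill_scoped_addr(struct sockaddr_in6 *sa, const char *addr_text,
                     const char *ifname, unsigned short port)
{
    unsigned int ifindex;

    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;
    sa->sin6_port = htons(port);

    if (inet_pton(AF_INET6, addr_text, &sa->sin6_addr) != 1)
        return -1;

    /* if_nametoindex() returns 0 when the interface does not exist. */
    ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;

    /* The scope id tells the stack which link the address belongs to. */
    sa->sin6_scope_id = ifindex;
    return 0;
}
```

The filled structure can then be passed to connect() or sendto() unchanged,
which is what makes this approach portable where SO_BINDTODEVICE is not.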
BTW, if we add sin6_scope_id into **ONLY LINUX KERNEL** or **ONLY LIBC HEADER FILE**, we encounter the problem of inconsistency between kernel and libc structure. Many applications which use sockaddr_in6 don't work correctly. As a solution, we would suggest the following alternatives. 1) Implementing the mechanism for backward compatibility in Linux kernel. ( Switch kernel syscall dynamically according to the size of sockaddr_in6 structure. ) 2) Providing kernel option in which we can select sockaddr_in6 type. Are these reasonable solutions for you ? If you aren't interested in the sockaddr_in6 problem, we would like take any solution to the problem. Regards. -- SEKIYA Yuji USC/ISI Computer Networks Division 7 From owner-netdev@oss.sgi.com Wed Dec 29 01:40:30 1999 Received: by oss.sgi.com id ; Wed, 29 Dec 1999 01:40:20 -0800 Received: from azure.mist.sfc.wide.ad.jp ([203.178.140.41]:37894 "EHLO v6.linux.or.jp") by oss.sgi.com with ESMTP id ; Wed, 29 Dec 1999 01:39:58 -0800 Received: from localhost (IDENT:sekiya@LOCALHOST [::ffff:127.0.0.1] (may be forged)) by v6.linux.or.jp (8.9.3+3.1W/3.3Wb4) with ESMTP id SAA02694; Wed, 29 Dec 1999 18:38:11 +0900 Date: Wed, 29 Dec 1999 18:38:03 +0900 Message-ID: From: Yuji Sekiya To: kuznet@ms2.inr.ac.ru Cc: kusune@sfc.wide.ad.jp, netdev@oss.sgi.com Subject: Re: source address selection In-Reply-To: In your message of "Thu, 16 Dec 1999 19:09:31 +0300 (MSK)" <199912161609.TAA07161@ms2.inr.ac.ru> References: <199912152351.IAA05354@wanwan.sfc.wide.ad.jp> <199912161609.TAA07161@ms2.inr.ac.ru> User-Agent: Wanderlust/2.2.12 (Joyride) SEMI/1.13.7 (Awazu) FLIM/1.13.2 (Kasanui) MULE XEmacs/21.2 (beta19) (Shinjuku) (i586-pc-linux) Organization: Information Sciences Institute MIME-Version: 1.0 (generated by SEMI 1.13.7 - "Awazu") Content-Type: text/plain; charset=US-ASCII X-Dispatcher: imput version 990604(IM116) Lines: 32 Sender: owner-netdev@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;netdev-outgoing At Thu, 16 Dec 1999 
19:09:31 +0300 (MSK), kuznet@ms2.inr.ac.ru wrote:

Hi,

> We may make this thing in IPv4 because function inet_select_addr()
> is not in data path; selected source addresses are stored in routing tables.
> Correct solution would cache once found source address
> in IPv6 routing table, probably, cloning route, if it is required.

Hmm... I think what you say is right.

But IMHO, I think that source address selection is more useful in
IPv6 environment than IPv4 environment. Because IPv6 allows us to
assign some IPv6 addresses to one interface, not alias interfaces.
I think we use multi-address or multi-home as ordinary network
environment in IPv6.

If you have no plan to implement source address selection
into IPv4 protocol stack, how about eating kusune's patch
or implementing the feature into IPv6 stack at first ?

Thanks.

P.S. if you have any requests about kusune's solution, please
give us advice. We will improve this patch.

--
SEKIYA Yuji
USC/ISI Computer Networks Division 7

From owner-netdev@oss.sgi.com Wed Dec 29 01:58:20 1999
Received: by oss.sgi.com id ; Wed, 29 Dec 1999 01:58:10 -0800
Received: from wiget.t17.ds.pwr.wroc.pl ([156.17.210.110]:14091 "HELO wiget.t17.ds.pwr.wroc.pl") by oss.sgi.com with SMTP id ; Wed, 29 Dec 1999 01:57:55 -0800
Received: by wiget.t17.ds.pwr.wroc.pl (Postfix, from userid 1000) id 3868F39021; Wed, 29 Dec 1999 10:54:49 +0100 (CET)
Date: Wed, 29 Dec 1999 10:54:49 +0100
From: Artur Frysiak
To: kuznet@ms2.inr.ac.ru
Cc: "Takeshi Kusune / ?$@Fo:,M:;V?(J" , netdev@oss.sgi.com
Subject: Re: source address selection
Message-ID: <19991229105448.C10337@wiget>
Reply-To: Artur Frysiak
Mail-Followup-To: kuznet@ms2.inr.ac.ru, "Takeshi Kusune / ?$@Fo:,M:;V?(J" , netdev@oss.sgi.com
References: <199912152351.IAA05354@wanwan.sfc.wide.ad.jp> <199912161609.TAA07161@ms2.inr.ac.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <199912161609.TAA07161@ms2.inr.ac.ru>; from kuznet@ms2.inr.ac.ru on Thu, Dec 16, 1999
 at 07:09:31PM +0300
X-Operating-System: Linux wiget 2.3.34
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

On Thu, 16 Dec 1999, kuznet@ms2.inr.ac.ru wrote:
> Hello!
>
> > BTW, I think that Linux's current IPv6 code is not enough to work with
> > multi-addressed network, because of weakness in source address selection.
>
> The patch is good and correct, but the solution is bad.
>
> We may make this thing in IPv4 because function inet_select_addr()
> is not in data path; selected source addresses are stored in routing tables.
> Correct solution would cache once found source address
> in IPv6 routing table, probably, cloning route, if it is required.
>
> BTW we have to make this thing in IPv4 because by historical reasons
> internet routing is very messy and smart source selection is required
> to route replies back. Moreover, by the same historical reasons, almost
> all IP apps are confused, when loopback address appears as source
> address of packet destined to some address of the host.
> Emerging IPv6 principles should not inherit all of this brain damage,
> so that the problem is present but it is not critical.

Source address selection for IPv6 is very useful feature.
I give one real example:
I use zebra as routing daemons. My machine has 2 IPv6 address at eth1 (internal
LAN). I must set both address as neighbor in BGP4+ config in other machine in LAN
to get zebra to work. If I may select source address then I may set only one
neighbor.

Wiget

--
wiget@t17.ds.pwr.wroc.pl DS T17 Bofh
PGP key: http://www.t17.ds.pwr.wroc.pl/~wiget/pgp.key 1024D/D3D4CF84
E4D3 6787 284C 57F0 3C1F ADFD A92A 3F2E D3D4 CF84

From owner-netdev@oss.sgi.com Wed Dec 29 08:00:20 1999
Received: by oss.sgi.com id ; Wed, 29 Dec 1999 08:00:10 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:44559 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Wed, 29 Dec 1999 07:59:59 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id SAA04136; Wed, 29 Dec 1999 18:56:02 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912291556.SAA04136@ms2.inr.ac.ru>
Subject: Re: source address selection
To: wiget@ipv6.t17.dhs.org
Date: Wed, 29 Dec 1999 18:56:02 +0300 (MSK)
Cc: kusune@sfc.wide.ad.jp, netdev@oss.sgi.com
In-Reply-To: <19991229105448.C10337@wiget> from "Artur Frysiak" at Dec 29, 99 10:54:49 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Length: 562
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello!

> I give one real example:
> I use zebra as routing daemons. My machine has 2 IPv6 address at eth1 (internal
> LAN). I must set both address as neighbor in BGP4+ config in other machine in LAN
> to get zebra to work. If I may select source address then I may set only one
> neighbor.

zebra, sets source address _itself_ without help from kernel.
Actually, routing daemons _must_ not use any automatic selection at all.

If Zebra really does not bind() its sockets, it is strongly broken.
Check this, and if it is the case, report to Kunihiro.
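
As a rough sketch of the explicit bind() mentioned above (an assumption about how a daemon could do it, not Zebra's actual code), the following pins the local address of a socket so the kernel's automatic source selection is never consulted. parse_source() and bind_source() are hypothetical helper names.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Parse an IPv6 source address into *src. The port is left at 0 so the
 * kernel still picks an ephemeral port. Returns 0 on success, -1 if the
 * text is not a valid IPv6 address. (Hypothetical helper.)
 */
int parse_source(const char *text, struct sockaddr_in6 *src)
{
    assert(text != NULL && src != NULL);

    memset(src, 0, sizeof(*src));
    src->sin6_family = AF_INET6;
    return inet_pton(AF_INET6, text, &src->sin6_addr) == 1 ? 0 : -1;
}

/*
 * Fix the local address of fd before connect(). Returns 0 on success,
 * -1 on a bad address or when bind() itself fails. (Hypothetical helper.)
 */
int bind_source(int fd, const char *text)
{
    struct sockaddr_in6 src;

    if (parse_source(text, &src) != 0)
        return -1;
    return bind(fd, (const struct sockaddr *)&src, sizeof(src));
}
```

A daemon would call bind_source() on each peering socket before connect(), passing the one address it wants peers to see; the other machine then needs only that address configured as its neighbor.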

Alexey

From owner-netdev@oss.sgi.com Wed Dec 29 08:08:40 1999
Received: by oss.sgi.com id ; Wed, 29 Dec 1999 08:08:21 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:19218 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Wed, 29 Dec 1999 08:08:14 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id TAA04827; Wed, 29 Dec 1999 19:07:59 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912291607.TAA04827@ms2.inr.ac.ru>
Subject: Re: sin6_scope_id
To: sekiya@ISI.EDU (Yuji Sekiya)
Date: Wed, 29 Dec 1999 19:07:59 +0300 (MSK)
Cc: netdev@oss.sgi.com, users@ipv6.org, yoshfuji@ecei.tohoku.ac.jp
In-Reply-To: from "Yuji Sekiya" at Dec 29, 99 05:53:00 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Length: 1967
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello!

> We(Linux IPv6 Group in Japan) found the portability problem that
> sin6_scope_id member is missing in Linux kernel.

Yes.

> RFC2553 says in section 2.4 that follow

Linux does not follow this RFC. It complies to previous RFC.

> In case that the destination of outgoing packets is an
> IPv6 global address, kernel may select an outgoing interface as routing
> table. But in case of a link-local address, how can I select the interface
> for outgoing packets without sin6_scope_id ?

Certainly, it is IPV6_PKTINFO.

> Of course, I know the method of using SO_BINDTODEVICE,

Or this one.

> In IPv4 environment, there are few applications which have to select an
> outgoing interface.

IP_PKTINFO. IPv4 has the same problems as IPv6 does.
After thinking a bit, you will understand that IPv4 has both
link local and site local and all the kinds of addresses.
The only difference of IPv6 is returning to brain-dead "classful"
addressing, which was rejected in IPv4 years ago.

> But IPv6, we can select one of addresses from three address types, global,
> site-local and link-local. So it is not rare case for us to specify
> an outgoing interface.

site-local addresses have no differences of global scope at all.

> Do you have any plan to add sin6_scope_id in in6.h ?

No. I really have plan to resist to this silly idea until
the last cartridge 8)8)8) And to accept the defeat 8)

Yes, I do think that this idea is deathly broken. Application
must not refer to this field and then they will be not only portable,
but also free of ambiguity.

> BTW, if we add sin6_scope_id into **ONLY LINUX KERNEL**
> or **ONLY LIBC HEADER FILE**, we encounter the problem of
> inconsistency between kernel and libc structure.

Even not commented. I do not understand at all, how struct definitions
may be got from different sources.

> If you aren't interested in the sockaddr_in6 problem,
> we would like take any solution to the problem.

IPV6_PKTINFO.

Alexey

From owner-netdev@oss.sgi.com Wed Dec 29 08:10:00 1999
Received: by oss.sgi.com id ; Wed, 29 Dec 1999 08:09:50 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:34322 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Wed, 29 Dec 1999 08:09:38 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id TAA04930; Wed, 29 Dec 1999 19:09:22 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912291609.TAA04930@ms2.inr.ac.ru>
Subject: Re: source address selection
To: sekiya@ISI.EDU (Yuji Sekiya)
Date: Wed, 29 Dec 1999 19:09:22 +0300 (MSK)
Cc: kusune@sfc.wide.ad.jp, netdev@oss.sgi.com
In-Reply-To: from "Yuji Sekiya" at Dec 29, 99 06:38:03 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Length: 475
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello!

> But IMHO, I think that source address selection is more useful in
> IPv6 environment than IPv4 environment. Because IPv6 allows us to
> assign some IPv6 addresses to one interface, not alias interfaces.

I do not understand. IPv6 and IPv4 are absolutely identical here.

> If you have no plan to implement source address selection
> into IPv4 protocol stack, how about eating kusune's patch
> or implementing the feature into IPv6 stack at first ?

In 2.5.

Alexey

From owner-netdev@oss.sgi.com Wed Dec 29 09:33:21 1999
Received: by oss.sgi.com id ; Wed, 29 Dec 1999 09:33:11 -0800
Received: from wiget.t17.ds.pwr.wroc.pl ([156.17.210.110]:38925 "HELO wiget.t17.ds.pwr.wroc.pl") by oss.sgi.com with SMTP id ; Wed, 29 Dec 1999 09:32:54 -0800
Received: by wiget.t17.ds.pwr.wroc.pl (Postfix, from userid 1000) id BDBCB39021; Wed, 29 Dec 1999 18:29:35 +0100 (CET)
Date: Wed, 29 Dec 1999 18:29:35 +0100
From: Artur Frysiak
To: kuznet@ms2.inr.ac.ru
Cc: David Jeffery lordbeatnik , netdev@oss.sgi.com, davem@redhat.com
Subject: Re: two ipv6 oops on 2.3.31
Message-ID: <19991229182935.J10337@wiget>
Reply-To: Artur Frysiak
Mail-Followup-To: kuznet@ms2.inr.ac.ru, David Jeffery lordbeatnik , netdev@oss.sgi.com, davem@redhat.com
References: <199912150024.QAA06037@ns1.filetron.com> <199912151501.SAA02381@ms2.inr.ac.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <199912151501.SAA02381@ms2.inr.ac.ru>; from kuznet@ms2.inr.ac.ru on Wed, Dec 15, 1999 at 06:01:46PM +0300
X-Operating-System: Linux wiget 2.3.34
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

On Wed, 15 Dec 1999, kuznet@ms2.inr.ac.ru wrote:
> Hello!
>
> > I've just joined this mailing list, so if these problems are already known,
> > sorry for the wasted bandwidth.
>
> Congratulations! You joined in time. Seems, you are the first user
> of IPv6 with 2.3. 8) Please, do not disappear. 8)
>
>
> > once I try to connect and talk to it, the kernel will die with
> > "aiee, killing interupt handler, etc"
>
> I think this was fixed by:
>
> diff -ur --new-file ../2.3.32/linux/net/ipv6/tcp_ipv6.c linux/net/ipv6/tcp_ipv6.c
> --- ../2.3.32/linux/net/ipv6/tcp_ipv6.c Fri Nov 19 22:33:29 1999
> +++ linux/net/ipv6/tcp_ipv6.c Tue Dec 14 22:08:26 1999
> @@ -273,8 +271,8 @@
>  			}
>  		}
>  	}
> -	if (sk)
> -		sock_hold(sk);
> +	if (result)
> +		sock_hold(result);
>  	read_unlock(&tcp_lhash_lock);
>  	return result;
>  }

Alexey,
Why this patch don't go to main kernel source ?
I check 2.3.35 and this problem is not fixed.
With this patch IPv6 work fine.

Wiget

--
wiget@t17.ds.pwr.wroc.pl DS T17 Bofh
PGP key: http://www.t17.ds.pwr.wroc.pl/~wiget/pgp.key 1024D/D3D4CF84
E4D3 6787 284C 57F0 3C1F ADFD A92A 3F2E D3D4 CF84

From owner-netdev@oss.sgi.com Thu Dec 30 02:11:29 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 02:11:10 -0800
Received: from linuxcare.canberra.net.au ([203.29.91.49]:27908 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Thu, 30 Dec 1999 02:10:59 -0800
Received: from gargle.linuxcare.com.au (penicillin.linuxcare.com.au [10.61.2.27]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id VAA24413 for ; Thu, 30 Dec 1999 21:11:00 +1100
Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Thu, 30 Dec 1999 21:11:00 +1100 (EST)
Message-Id:
From: Rusty Russell
To: netdev@oss.sgi.com
Subject: Re: [PATCH] skb field reservation v2.3.34
In-reply-to: Your message of "Tue, 28 Dec 1999 19:04:52 +0300." <199912281604.TAA22785@ms2.inr.ac.ru>
Date: Thu, 30 Dec 1999 21:11:00 +1100
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

In message <199912281604.TAA22785@ms2.inr.ac.ru> Alexey writes:
> Such objects miss main required property: conservation while
> skb_clone/skb_copy. I see _no_ applications for such feature,

Very good point.
Changes: 1) Region is copied on skb_clone/skb_copy. 2) Callbacks when that is done (just like the destructor). 3) No more dcache lines get hit in __kfree_skb on no-reservations case. 4) Use array for iteration when there are reservations. 5) Everything wrapped in CONFIG_NETFILTER (can be moved later if desired). 6) s/size_t/unsigned int/ 7) Against 2.3.35 Almost all changes in first two files of patch... (skbuff.h and skbuff.c). > > can also be used for other things where you need to play with skbs. > > Paul. When we want to play, we just add new field to skb and that's all. 8)8) > If game turns out to be not interesting, we delete it. Did you forget that > Linux has public sources? I don't want to see the following inside skbuff.h: #if defined(CONFIG_NETFILTER_IP_CONNTRACK) || defined(CONFIG_NETFILTER_IP_CONNTRACK_MODULE) void *ip_connection; /* For connection tracking */ #endif #if defined(CONFIG_NETFILTER_IP_CONNTRACK_FTP) || defined(CONFIG_NETFILTER_IP_CONNTRACK_MODULE_FTP) u_int16_t ftp_offset, ftp_length; /* For ftp tracking */ #endif > Also, I remember you liked to talk about coding style. 8) I only like to talk about other people's coding style 8-). Happy random holiday, Rusty. diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/core/skbuff.c linux-2.3/net/core/skbuff.c --- linux-2.3-official/net/core/skbuff.c Wed Dec 29 23:19:30 1999 +++ linux-2.3/net/core/skbuff.c Thu Dec 30 18:37:46 1999 @@ -19,6 +19,7 @@ * Ray VanTassle : Fixed --skb->lock in free * Alan Cox : skb_copy copy arp field * Andi Kleen : slabified it. 
+ * Rusty Russell : field reservation * * NOTE: * The __skb_ routines should be called with interrupts @@ -77,6 +78,154 @@ static kmem_cache_t *skbuff_head_cache; +#ifdef CONFIG_NETFILTER +static LIST_HEAD(field_allocs); +/* These don't need to be atomic_t, but play safe --RR */ +static atomic_t field_generation = ATOMIC_INIT(0); +static rwlock_t field_lock = RW_LOCK_UNLOCKED; + +/* Cache-friendly form */ +struct field_funcs +{ + u_int16_t offset, size; + int generation; + void (*func)(struct sk_buff *skb); +}; +static atomic_t field_copiers = ATOMIC_INIT(0); +static struct field_funcs field_copy[SKB_RESERVE_SIZE]; +static atomic_t field_destructors = ATOMIC_INIT(0); +static struct field_funcs field_destroy[SKB_RESERVE_SIZE]; + +int skb_field_reserve(struct skb_field *reg) +{ + struct list_head *i; + unsigned int align_mask = (reg->alignment - 1); + int ret = 0; + + reg->offset = 0; + write_lock_bh(&field_lock); + reg->gen = atomic_read(&field_generation) + 1; + + /* Yes it's an ordered list, no we don't do garbage collection */ + for (i = field_allocs.next; i != &field_allocs; i = i->next) { + struct skb_field *f = (struct skb_field *)i; + + if (reg->offset + reg->size <= f->offset) + break; + + /* offset = last aligned possibility */ + reg->offset = (f->offset + f->size + align_mask) & ~align_mask; + } + + if (reg->offset + reg->size < SKB_RESERVE_SIZE) { + list_add_tail(&reg->list, i); + if (reg->destructor) { + field_destroy[atomic_read(&field_destructors)] + = ((struct field_funcs){ reg->offset, + reg->size, + reg->gen, + reg->destructor }); + atomic_inc(&field_destructors); + } + if (reg->copy) { + field_copy[atomic_read(&field_copiers)] + = ((struct field_funcs){ reg->offset, + reg->size, + reg->gen, + reg->copy }); + atomic_inc(&field_copiers); + } + } else + ret = -ENOMEM; /* -EEDITSKBUFF.H */ + + /* Need field_copiers and field_destructors updated first */ + wmb(); + atomic_inc(&field_generation); + write_unlock_bh(&field_lock); + return ret; +} + +static
void reserve_eliminate(struct skb_field *unreg, + struct field_funcs *funcs, + atomic_t *num_funcs) +{ + int i, num; + + atomic_dec(num_funcs); + num = atomic_read(num_funcs); + + for (i = 0; i < num; i++) { + if (funcs[i].offset == unreg->offset) { + /* Move everything down... */ + memmove(funcs + i, funcs + i + 1, + sizeof(struct field_funcs) * (num - i)); + break; + } + } +} + +void skb_field_unreserve(struct skb_field *unreg) +{ + write_lock_bh(&field_lock); + atomic_inc(&field_generation); + wmb(); + list_del(&unreg->list); + if (unreg->destructor) + reserve_eliminate(unreg, field_destroy, &field_destructors); + if (unreg->copy) + reserve_eliminate(unreg, field_copy, &field_copiers); + + write_unlock_bh(&field_lock); +} + +/* Called very rarely; skb alloc'ed before field registration. */ +void skb_field_update(struct sk_buff *skb) +{ + struct list_head *i; + int gen; + + write_lock_bh(&field_lock); + gen = atomic_read(&field_generation); + + /* Clear any registrations newer than this skb */ + for (i = field_allocs.next; i != &field_allocs; i = i->next) { + struct skb_field *f = (struct skb_field *)i; + + if ((int)f->gen - (int)skb->reserve_gen > 0) + memset(skb->reserve+f->offset, f->size, 0); + } + skb->reserved_copies = atomic_read(&field_copiers); + skb->reserved_destructors = atomic_read(&field_destructors); + write_unlock_bh(&field_lock); + + skb->reserve_gen = gen; +} + +static inline void reserve_do_funcs(struct sk_buff *skb, + const struct field_funcs *funcs, + atomic_t *num_funcs) +{ + unsigned int i, num; + + read_lock_bh(&field_lock); + num = atomic_read(num_funcs); + + for (i = 0; i < num; i++) { + if ((int)skb->reserve_gen - (int)funcs[i].generation >= 0) { + unsigned int j; + + for (j = 0; j < funcs[i].size; j++) { + if (skb->reserve[j + funcs[i].offset]) { + funcs[i].func(skb); + break; + } + } + } + } + read_unlock_bh(&field_lock); +} +#endif + /* * Keep out-of-line to prevent kernel bloat. 
* __builtin_return_address is not used because it is not always @@ -165,6 +314,15 @@ skb->is_clone = 0; skb->cloned = 0; +#ifdef CONFIG_NETFILTER + /* Race here ok: if they ever use a field, and generation + changed between these assignment statements, skb will be + updated */ + skb->reserve_gen = atomic_read(&field_generation); + skb->reserved_copies = atomic_read(&field_copiers); + skb->reserved_destructors = atomic_read(&field_destructors); +#endif + #ifdef CONFIG_ATM ATM_SKB(skb)->iovcnt = 0; #endif @@ -201,7 +359,8 @@ skb->dst = NULL; skb->rx_dev = NULL; #ifdef CONFIG_NETFILTER - skb->nfmark = skb->nfreason = skb->nfcache = 0; + memset(skb->reserve, 0, sizeof(skb->reserve)); + skb->nfcache = 0; #ifdef CONFIG_NETFILTER_DEBUG skb->nf_debug = 0; #endif @@ -235,8 +394,13 @@ } dst_release(skb->dst); +#ifdef CONFIG_NETFILTER + if (skb->reserved_destructors) + reserve_do_funcs(skb, field_destroy, &field_destructors); +#endif if(skb->destructor) skb->destructor(skb); + #ifdef CONFIG_NET if(skb->rx_dev) dev_put(skb->rx_dev); @@ -272,6 +436,12 @@ n->is_clone = 1; atomic_set(&n->users, 1); n->destructor = NULL; + +#ifdef CONFIG_NETFILTER + if (n->reserved_copies) + reserve_do_funcs(skb, field_copy, &field_copiers); +#endif + return n; } @@ -301,13 +471,19 @@ new->destructor = NULL; new->security=old->security; #ifdef CONFIG_NETFILTER - new->nfmark=old->nfmark; - new->nfreason=old->nfreason; - new->nfcache=old->nfcache; #ifdef CONFIG_NETFILTER_DEBUG new->nf_debug=old->nf_debug; #endif -#endif + new->nfcache=old->nfcache; + + new->reserve_gen = old->reserve_gen; + new->reserved_destructors = old->reserved_destructors; + new->reserved_copies = old->reserved_copies; + memcpy(new->reserve, old->reserve, sizeof(old->reserve)); + + if (new->reserved_copies) + reserve_do_funcs(new, field_copy, &field_copiers); +#endif /*CONFIG_NETFILTER*/ } /* diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude 
version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/include/linux/skbuff.h linux-2.3/include/linux/skbuff.h --- linux-2.3-official/include/linux/skbuff.h Thu Dec 30 18:37:13 1999 +++ linux-2.3/include/linux/skbuff.h Thu Dec 30 17:25:52 1999 @@ -37,6 +37,8 @@ #define NET_CALLER(arg) __builtin_return_address(0) #endif +#define SKB_RESERVE_SIZE 32 + struct sk_buff_head { /* These two members must be first. */ struct sk_buff * next; @@ -97,7 +99,9 @@ cloned, /* head may be cloned (check refcnt to be sure). */ pkt_type, /* Packet class */ pkt_bridged, /* Tracker for bridging */ - ip_summed; /* Driver fed us an IP checksum */ + ip_summed, /* Driver fed us an IP checksum */ + reserved_copies, /* Reserved fields on copy */ + reserved_destructors; /* Reserved fields on kfree */ __u32 priority; /* Packet queueing priority */ atomic_t users; /* User count - see datagram.c,tcp.c */ unsigned short protocol; /* Packet protocol from driver. */ @@ -109,11 +113,12 @@ unsigned char *tail; /* Tail pointer */ unsigned char *end; /* End pointer */ void (*destructor)(struct sk_buff *); /* Destruct function */ + #ifdef CONFIG_NETFILTER - /* Can be used for communication between hooks. */ - unsigned long nfmark; - /* Reason for doing this to the packet (see netfilter.h) */ - __u32 nfreason; + /* See skb_field_reserve()/skb_field_unreserve() */ + int reserve_gen; + char reserve[SKB_RESERVE_SIZE]; + /* Cache info */ __u32 nfcache; #ifdef CONFIG_NETFILTER_DEBUG @@ -183,6 +188,41 @@ extern void skb_trim(struct sk_buff *skb, unsigned int len); extern void skb_over_panic(struct sk_buff *skb, int len, void *here); extern void skb_under_panic(struct sk_buff *skb, int len, void *here); + +#ifdef CONFIG_NETFILTER +/* Selling space in skb's: the VCs will love it... 
*/ +struct skb_field +{ + /* Filled in by skb_field_reserve() */ + struct list_head list; + unsigned int offset; + int gen; + + /* Filled in by caller. */ + unsigned int size; + unsigned int alignment; /* Use __alignof__ */ + /* Don't call skb_field_unreserve from this: deadlock. */ + void (*destructor)(struct sk_buff *); + /* Called after clone or copy. */ + void (*copy)(struct sk_buff *); +}; + +extern int skb_field_reserve(struct skb_field *reg); +extern void skb_field_unreserve(struct skb_field *unreg); + +/* Private */ +extern void skb_field_update(struct sk_buff *skb); + +/* Access a field of an skb */ +#define skb_field(skb, reg, type) \ +({ \ + if ((skb)->reserve_gen - (reg)->gen < 0) skb_field_update(skb); \ + &__skb_field(skb, reg, type); \ +}) + +/* If you reserve at boot, no skb can predate you, so use this. */ +#define __skb_field(skb, reg, type) (*((type *)((skb)->reserve+(reg)->offset))) +#endif /* CONFIG_NETFILTER */ /* Backwards compatibility */ #define skb_realloc_headroom(skb, nhr) skb_copy_expand(skb, nhr, skb_tailroom(skb), GFP_ATOMIC) diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/include/linux/netfilter.h linux-2.3/include/linux/netfilter.h --- linux-2.3-official/include/linux/netfilter.h Thu Dec 30 18:43:00 1999 +++ linux-2.3/include/linux/netfilter.h Thu Dec 30 17:29:16 1999 @@ -15,7 +15,7 @@ #define NF_ACCEPT 1 #define NF_STOLEN 2 #define NF_QUEUE 3 -#define NF_MAX_VERDICT NF_QUEUE +/* >= NF_QUEUE treated the same. */ /* Generic cache responses from hook functions. */ #define NFC_ALTERED 0x8000 @@ -141,10 +141,8 @@ int pf; /* Bitmask of hook numbers to match (1 << hooknum). 
*/ unsigned int hookmask; - /* If non-zero, only catch packets with this mark. */ - unsigned int mark; - /* If non-zero, only catch packets of this reason. */ - unsigned int reason; + /* If not 0xFFFFFFFF, only catch packets with this queue. */ + int queuenum; struct nf_wakeme *wake; }; @@ -154,11 +152,8 @@ extern void nf_unregister_interest(struct nf_interest *interest); extern void nf_getinfo(const struct sk_buff *skb, struct net_device **indev, - struct net_device **outdev, - unsigned long *mark); -extern void nf_reinject(struct sk_buff *skb, - unsigned long mark, - unsigned int verdict); + struct net_device **outdev); +extern void nf_reinject(struct sk_buff *skb, unsigned int verdict); #ifdef CONFIG_NETFILTER_DEBUG extern void nf_dump_skb(int pf, struct sk_buff *skb); @@ -192,14 +187,4 @@ #define SUMAX(a,b) ((size_t)(a)>(size_t)(b) ? (ssize_t)(a) : (ssize_t)(b)) #define SUMIN(a,b) ((size_t)(a)<(size_t)(b) ? (ssize_t)(a) : (ssize_t)(b)) #endif /*__KERNEL__*/ - -enum nf_reason { - /* Do not, NOT, reorder these. Add at end. 
*/ - NF_REASON_NONE, - NF_REASON_SET_BY_IPCHAINS, - NF_REASON_FOR_ROUTING, - NF_REASON_FOR_CLS_FW, - NF_REASON_MIN_RESERVED_FOR_CONNTRACK = 1024, -}; - #endif /*__LINUX_NETFILTER_H*/ diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/include/net/route.h linux-2.3/include/net/route.h --- linux-2.3-official/include/net/route.h Thu Dec 30 18:43:00 1999 +++ linux-2.3/include/net/route.h Thu Dec 30 17:29:16 1999 @@ -110,7 +110,9 @@ extern int ip_rt_ioctl(unsigned int cmd, void *arg); extern void ip_rt_get_source(u8 *src, struct rtable *rt); extern int ip_rt_dump(struct sk_buff *skb, struct netlink_callback *cb); - +#ifdef CONFIG_IP_ROUTE_FWMARK +extern struct skb_field ip_rt_mark_res; +#endif extern __inline__ void ip_rt_put(struct rtable * rt) { diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/core/netfilter.c linux-2.3/net/core/netfilter.c --- linux-2.3-official/net/core/netfilter.c Tue Dec 21 14:20:03 1999 +++ linux-2.3/net/core/netfilter.c Wed Dec 29 19:27:45 1999 @@ -22,7 +22,7 @@ #include /* In this code, we can be waiting indefinitely for userspace to - * service a packet if a hook returns NF_QUEUE. We could keep a count + * service a packet if a hook returns >= NF_QUEUE. 
We could keep a count * of skbuffs queued for userspace, and not deregister a hook unless * this is zero, but that sucks. Now, we simply check when the * packets come back: if the hook is gone, the packet is discarded. */ @@ -40,8 +40,8 @@ /* If we're sent to userspace, this keeps housekeeping info */ int pf; - unsigned long mark; unsigned int hook; + unsigned int queuenum; struct net_device *indev, *outdev; int (*okfn)(struct sk_buff *); }; @@ -53,6 +53,10 @@ static LIST_HEAD(nf_sockopts); static LIST_HEAD(nf_interested); +static struct skb_field skb_res += { { NULL, NULL }, 0, 0, + sizeof(struct nf_info *), __alignof__(struct nf_info *), NULL, NULL }; + int nf_register_hook(struct nf_hook_ops *reg) { struct list_head *i; @@ -358,11 +362,10 @@ { for (*i = (*i)->next; *i != head; *i = (*i)->next) { struct nf_hook_ops *elem = (struct nf_hook_ops *)*i; - switch (elem->hook(hook, skb, indev, outdev, okfn)) { - case NF_QUEUE: - NFDEBUG("nf_iterate: NF_QUEUE for %p.\n", *skb); - return NF_QUEUE; + unsigned int verdict + = elem->hook(hook, skb, indev, outdev, okfn); + switch (verdict) { case NF_STOLEN: NFDEBUG("nf_iterate: NF_STOLEN for %p.\n", *skb); return NF_STOLEN; @@ -371,14 +374,12 @@ NFDEBUG("nf_iterate: NF_DROP for %p.\n", *skb); return NF_DROP; -#ifdef CONFIG_NETFILTER_DEBUG case NF_ACCEPT: break; default: - NFDEBUG("Evil return from %p(%u).\n", - elem->hook, hook); -#endif + NFDEBUG("nf_iterate: %u for %p.\n", verdict, *skb); + return verdict; } } return NF_ACCEPT; @@ -389,7 +390,8 @@ int pf, unsigned int hook, struct net_device *indev, struct net_device *outdev, - int (*okfn)(struct sk_buff *)) + int (*okfn)(struct sk_buff *), + unsigned int queuenum) { struct list_head *i; @@ -402,13 +404,14 @@ /* Can't do struct assignments with arrays in them. Damn. 
 */
 	info->elem = (struct nf_hook_ops *)elem;
-	info->mark = skb->nfmark;
 	info->pf = pf;
 	info->hook = hook;
 	info->okfn = okfn;
 	info->indev = indev;
 	info->outdev = outdev;
-	skb->nfmark = (unsigned long)info;
+	info->queuenum = queuenum;
+
+	__skb_field(skb, &skb_res, struct nf_info *) = info;
 	/* Bump dev refs so they don't vanish while packet is out */
 	if (indev) dev_hold(indev);
@@ -419,8 +422,8 @@
 		if ((recip->hookmask & (1 << info->hook))
 		    && info->pf == recip->pf
-		    && (!recip->mark || info->mark == recip->mark)
-		    && (!recip->reason || skb->nfreason == recip->reason)) {
+		    && (recip->queuenum == 0xFFFFFFFF
+			|| info->queuenum == recip->queuenum)) {
 			/* FIXME: Andi says: use netlink. Hmmm... --RR */
 			if (skb_queue_len(&recip->wake->skbq) >= 100) {
 				NFDEBUG("nf_hook: queue to long.\n");
@@ -428,8 +431,8 @@
 			}
 			/* Hand it to userspace for collection */
 			skb_queue_tail(&recip->wake->skbq, skb);
-			NFDEBUG("Waking up pf=%i hook=%u mark=%lu reason=%u\n",
-				pf, hook, skb->nfmark, skb->nfreason);
+			NFDEBUG("Waking up pf=%i hook=%u\n",
+				pf, hook);
 			wake_up_interruptible(&recip->wake->sleep);
 			return;
@@ -473,9 +476,10 @@
 	elem = &nf_hooks[pf][hook];
 	verdict = nf_iterate(&nf_hooks[pf][hook], &skb, hook, indev,
 			     outdev, &elem, okfn);
-	if (verdict == NF_QUEUE) {
+	if (verdict >= NF_QUEUE) {
 		NFDEBUG("nf_hook: Verdict = QUEUE.\n");
-		nf_queue(skb, elem, pf, hook, indev, outdev, okfn);
+		nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
+			 verdict - NF_QUEUE);
 	}
 	read_unlock_bh(&nf_lock);
@@ -517,24 +521,24 @@
 	/* Blow away any queued skbs; this is overzealous. */
 	while ((skb = skb_dequeue(&interest->wake->skbq)) != NULL)
-		nf_reinject(skb, 0, NF_DROP);
+		nf_reinject(skb, NF_DROP);
 }
 void nf_getinfo(const struct sk_buff *skb,
 		struct net_device **indev,
-		struct net_device **outdev,
-		unsigned long *mark)
+		struct net_device **outdev)
 {
-	const struct nf_info *info = (const struct nf_info *)skb->nfmark;
+	const struct nf_info *info =
+		__skb_field(skb, &skb_res, struct nf_info *);
 	*indev = info->indev;
 	*outdev = info->outdev;
-	*mark = info->mark;
 }
-void nf_reinject(struct sk_buff *skb, unsigned long mark, unsigned int verdict)
+void nf_reinject(struct sk_buff *skb, unsigned int verdict)
 {
-	struct nf_info *info = (struct nf_info *)skb->nfmark;
+	const struct nf_info *info =
+		__skb_field(skb, &skb_res, struct nf_info *);
 	struct list_head *elem = &info->elem->list;
 	struct list_head *i;
@@ -551,16 +555,16 @@
 	/* Continue traversal iff userspace said ok, and devices still exist... */
 	if (verdict == NF_ACCEPT) {
-		skb->nfmark = mark;
 		verdict = nf_iterate(&nf_hooks[info->pf][info->hook],
 				     &skb, info->hook,
 				     info->indev, info->outdev, &elem,
 				     info->okfn);
 	}
-	if (verdict == NF_QUEUE) {
+	if (verdict >= NF_QUEUE) {
 		nf_queue(skb, elem, info->pf, info->hook,
-			 info->indev, info->outdev, info->okfn);
+			 info->indev, info->outdev, info->okfn,
+			 verdict - NF_QUEUE);
 	}
 	read_unlock_bh(&nf_lock);
@@ -626,4 +630,8 @@
 	for (i = 0; i < NPROTO; i++)
 		for (h = 0; h < NF_MAX_HOOKS; h++)
 			INIT_LIST_HEAD(&nf_hooks[i][h]);
+
+	if (skb_field_reserve(&skb_res) != 0)
+		panic("Can't reserve a %u byte field in skb\n",
+		      sizeof(struct nf_info *));
 }
diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/ipv4/route.c linux-2.3/net/ipv4/route.c
--- linux-2.3-official/net/ipv4/route.c	Wed Dec 29 23:19:30 1999
+++ linux-2.3/net/ipv4/route.c	Thu Dec 30 15:35:47 1999
@@ -127,6 +127,10 @@
 static struct timer_list rt_periodic_timer = { NULL, NULL, 0, 0L, NULL };
+#ifdef CONFIG_IP_ROUTE_FWMARK
+struct skb_field ip_rt_mark_res
+= { { NULL, NULL }, 0, 0, sizeof(__u32), __alignof__(__u32), NULL, NULL };
+#endif
 /*
  * Interface to generic destination cache.
  */
@@ -1107,10 +1111,7 @@
 	rth->rt_dst = daddr;
 	rth->key.tos = tos;
 #ifdef CONFIG_IP_ROUTE_FWMARK
-	if (skb->nfreason == NF_REASON_FOR_ROUTING)
-		rth->key.fwmark = skb->nfmark;
-	else
-		rth->key.fwmark = 0;
+	rth->key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32);
 #endif
 	rth->key.src = saddr;
 	rth->rt_src = saddr;
@@ -1189,10 +1190,7 @@
 	key.src = saddr;
 	key.tos = tos;
 #ifdef CONFIG_IP_ROUTE_FWMARK
-	if (skb->nfreason == NF_REASON_FOR_ROUTING)
-		key.fwmark = skb->nfmark;
-	else
-		key.fwmark = 0;
+	key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32);
 #endif
 	key.iif = dev->ifindex;
 	key.oif = 0;
@@ -1314,10 +1312,7 @@
 	rth->rt_dst = daddr;
 	rth->key.tos = tos;
 #ifdef CONFIG_IP_ROUTE_FWMARK
-	if (skb->nfreason == NF_REASON_FOR_ROUTING)
-		rth->key.fwmark = skb->nfmark;
-	else
-		rth->key.fwmark = 0;
+	rth->key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32);
 #endif
 	rth->key.src = saddr;
 	rth->rt_src = saddr;
@@ -1391,10 +1386,7 @@
 	rth->rt_dst = daddr;
 	rth->key.tos = tos;
 #ifdef CONFIG_IP_ROUTE_FWMARK
-	if (skb->nfreason == NF_REASON_FOR_ROUTING)
-		rth->key.fwmark = skb->nfmark;
-	else
-		rth->key.fwmark = 0;
+	rth->key.fwmark = __skb_field(skb, &ip_rt_mark_res, __u32);
 #endif
 	rth->key.src = saddr;
 	rth->rt_src = saddr;
@@ -1482,8 +1474,7 @@
 		    rth->key.oif == 0 &&
 #ifdef CONFIG_IP_ROUTE_FWMARK
 		    rth->key.fwmark
-		    == (skb->nfreason == NF_REASON_FOR_ROUTING
-			? skb->nfmark : 0) &&
+		    == __skb_field(skb, &ip_rt_mark_res, __u32) &&
 #endif
 		    rth->key.tos == tos) {
 			rth->u.dst.lastuse = jiffies;
@@ -2166,5 +2157,10 @@
 	proc_net_create ("rt_cache", 0, rt_cache_get_info);
 #ifdef CONFIG_NET_CLS_ROUTE
 	create_proc_read_entry("net/rt_acct", 0, 0, ip_rt_acct_read, NULL);
+#endif
+
+#ifdef CONFIG_IP_ROUTE_FWMARK
+	if (skb_field_reserve(&ip_rt_mark_res) != 0)
+		panic("ip_rt_init: can't reserve mark in skb");
 #endif
 }
diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/netsyms.c linux-2.3/net/netsyms.c
--- linux-2.3-official/net/netsyms.c	Wed Dec 29 23:19:30 1999
+++ linux-2.3/net/netsyms.c	Wed Dec 29 19:27:41 1999
@@ -251,6 +251,9 @@
 EXPORT_SYMBOL(inetdev_by_index);
 EXPORT_SYMBOL(in_dev_finish_destroy);
 EXPORT_SYMBOL(ip_defrag);
+#ifdef CONFIG_IP_ROUTE_FWMARK
+EXPORT_SYMBOL(ip_rt_mark_res);
+#endif
 /* Route manipulation */
 EXPORT_SYMBOL(ip_rt_ioctl);
@@ -580,7 +583,14 @@
 EXPORT_SYMBOL(nf_register_interest);
 EXPORT_SYMBOL(nf_unregister_interest);
 EXPORT_SYMBOL(nf_hook_slow);
-#endif
+EXPORT_SYMBOL(skb_field_reserve);
+EXPORT_SYMBOL(skb_field_unreserve);
+EXPORT_SYMBOL(skb_field_update);
+#ifdef CONFIG_NET_CLS_FW
+extern struct skb_field cls_fw_res;
+EXPORT_SYMBOL(cls_fw_res);
+#endif /* CONFIG_NET_CLS_FW */
+#endif /* CONFIG_NETFILTER */
 EXPORT_SYMBOL(register_gifconf);
diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/sched/cls_api.c linux-2.3/net/sched/cls_api.c
--- linux-2.3-official/net/sched/cls_api.c	Fri Oct 15 15:51:35 1999
+++ linux-2.3/net/sched/cls_api.c	Thu Dec 23 20:41:22 1999
@@ -461,6 +461,14 @@
 	INIT_TC_FILTER(route4);
 #endif
 #ifdef CONFIG_NET_CLS_FW
+#ifdef CONFIG_NETFILTER
+	{
+		extern struct skb_field cls_fw_res;
+
+		if (skb_field_reserve(&cls_fw_res) != 0)
+			panic("tc_filter_init: Can't reserve field cls_fw");
+	}
+#endif
 	INIT_TC_FILTER(fw);
 #endif
 #ifdef CONFIG_NET_CLS_RSVP
diff -urN --minimal --exclude classlist.h --exclude devlist.h --exclude *.lds --exclude autoconf.h --exclude compile.h --exclude version.h --exclude .* --exclude *.[oa] --exclude *.orig --exclude config --exclude asm --exclude modules --exclude *.[Ss] --exclude System.map --exclude consolemap_deftbl.c --exclude *~ --exclude TAGS --exclude tags --exclude modversions.h --exclude install-kernel linux-2.3-official/net/sched/cls_fw.c linux-2.3/net/sched/cls_fw.c
--- linux-2.3-official/net/sched/cls_fw.c	Fri Oct 15 15:51:29 1999
+++ linux-2.3/net/sched/cls_fw.c	Wed Dec 29 20:16:20 1999
@@ -40,6 +40,9 @@
 #include
 #include
+struct skb_field cls_fw_res
+= { { NULL, NULL }, 0, 0, sizeof(__u32), __alignof__(u32), NULL, NULL };
+
 struct fw_head {
 	struct fw_filter *ht[256];
@@ -66,7 +69,7 @@
 	struct fw_head *head = (struct fw_head*)tp->root;
 	struct fw_filter *f;
 #ifdef CONFIG_NETFILTER
-	u32 id = (skb->nfreason == NF_REASON_FOR_CLS_FW ? skb->nfmark : 0);
+	u32 id = *skb_field(skb, &cls_fw_res, u32);
 #else
 	u32 id = 0;
 #endif
@@ -375,11 +378,28 @@
 #ifdef MODULE
 int init_module(void)
 {
-	return register_tcf_proto_ops(&cls_fw_ops);
+	int ret = 0;
+
+#ifdef CONFIG_NETFILTER
+	ret = skb_field_reserve(&cls_fw_res);
+#endif
+
+	if (ret == 0) {
+		ret = register_tcf_proto_ops(&cls_fw_ops);
+#ifdef CONFIG_NETFILTER
+		if (ret != 0)
+			skb_field_unreserve(&cls_fw_res);
+#endif
+	}
+
+	return ret;
 }
 void cleanup_module(void)
 {
 	unregister_tcf_proto_ops(&cls_fw_ops);
+#ifdef CONFIG_NETFILTER
+	skb_field_unreserve(&cls_fw_res);
+#endif
 }
 #endif
--
Hacking time.

From owner-netdev@oss.sgi.com Thu Dec 30 02:56:09 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 02:55:59 -0800
Received: from linuxcare.canberra.net.au ([203.29.91.49]:57092 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Thu, 30 Dec 1999 02:55:35 -0800
Received: from gargle.linuxcare.com.au (penicillin.linuxcare.com.au [10.61.2.27]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id VAA25186 for ; Thu, 30 Dec 1999 21:55:45 +1100
Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Thu, 30 Dec 1999 21:55:45 +1100 (EST)
Message-Id:
From: Rusty Russell
To: netdev@oss.sgi.com
Subject: BUG: 2.3.35 SMP tcpdump crash
Date: Thu, 30 Dec 1999 21:55:45 +1100
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Only occurs on SMP when tcpdumping forwarded packets. Completely
repeatable on stock 2.3.35.

Looks like skb->head is bogus on freed skb. This makes me wonder about
the af_packet skb `borrowing' code...

Unable to handle kernel NULL pointer dereference at virtual address 00000008
c0133c3e
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010086
eax: c1263160   ebx: c1263140   ecx: 00000000   edx: c1243728
esi: c79fd85c   edi: 00000000   ebp: c79fd7c0   esp: c7911e18
ds: 0018   es: 0018   ss: 0018
Process tcpdump (pid: 118, stackpage=c7911000)
Stack: c79fc8a0 0000004a c1263160 00000246 c016fa43 c79fd7c0 c79fc840 c016fc41
       c79fc840 c79fc840 c79fc882 c7911efe 0000004a 0000004a c0170b89 c79fc840
       c80148fa c127fcc0 c79fc840 c7e86d90 0000061c c7911ebc c7911f6c c127fcc0
Call Trace: [] [] [] [] [] [
>EIP; c0133c3e   <=====
Trace; c016fa43
Trace; c016fc41 <__kfree_skb+1e1/1ec>
Trace; c0170b89
Trace; c80148fa <[parport]parport_ieee1284_epp_read_data+4a/f4>
Trace; c016b3c9
Code;  c0133c3e
00000000 <_EIP>:
Code;  c0133c3e   <=====
   0:   8b 47 08                  mov    0x8(%edi),%eax   <=====
Code;  c0133c41
   3:   3d 2b 2f c3 a5            cmp    $0xa5c32f2b,%eax
Code;  c0133c46
   8:   0f 85 e4 02 00 00         jne    2f2 <_EIP+0x2f2> c0133f30
Code;  c0133c4c
   e:   f6 43 05 01               testb  $0x1,0x5(%ebx)
Code;  c0133c50
  12:   0f 85 00 00 00 00         jne    18 <_EIP+0x18> c0133c56

--
Hacking time.

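A note on reading the oops above: the faulting instruction is
"mov 0x8(%edi),%eax" and edi is 00000000, so the load address is the
base register plus the displacement, i.e. the reported fault address
00000008. A minimal user-space sketch of that arithmetic follows. The
structure and function names are hypothetical and are not taken from
the kernel source; the layout is only assumed to put a member at
offset 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three 32-bit members, so `magic` is at offset 8. */
struct obj {
	uint32_t a;
	uint32_t b;
	uint32_t magic;
};

/*
 * Address that a load of obj->magic would touch for a given base
 * pointer value. With base == 0 (a NULL pointer) the result is the
 * member offset itself, which is what the oops reports.
 */
static uintptr_t magic_load_address(uintptr_t base)
{
	/* Guard against wrap-around in the illustration. */
	assert(base <= UINTPTR_MAX - offsetof(struct obj, magic));
	return base + offsetof(struct obj, magic);
}
```

Calling magic_load_address(0) gives the offset of the member, which is
why a small non-zero fault address such as 00000008 usually still
means a NULL structure pointer.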
From owner-netdev@oss.sgi.com Thu Dec 30 05:23:01 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 05:22:41 -0800
Received: from azure.mist.sfc.wide.ad.jp ([203.178.140.41]:37639 "EHLO v6.linux.or.jp") by oss.sgi.com with ESMTP id ; Thu, 30 Dec 1999 05:22:28 -0800
Received: from localhost (IDENT:sekiya@LOCALHOST [::ffff:127.0.0.1] (may be forged)) by v6.linux.or.jp (8.9.3+3.1W/3.3Wb4) with ESMTP id WAA03759; Thu, 30 Dec 1999 22:20:04 +0900
Date: Thu, 30 Dec 1999 22:18:56 +0900
Message-ID:
From: Yuji Sekiya
To: kuznet@ms2.inr.ac.ru
Cc: netdev@oss.sgi.com
Cc: venaas@nvg.ntnu.no
Subject: Re: two ipv6 oops on 2.3.31
In-Reply-To: In your message of "Wed, 29 Dec 1999 18:29:35 +0100" <19991229182935.J10337@wiget>
References: <199912150024.QAA06037@ns1.filetron.com> <199912151501.SAA02381@ms2.inr.ac.ru> <19991229182935.J10337@wiget>
User-Agent: Wanderlust/2.2.12 (Joyride) SEMI/1.13.7 (Awazu) FLIM/1.13.2 (Kasanui) MULE XEmacs/21.2 (beta19) (Shinjuku) (i586-pc-linux)
Organization: Information Sciences Institute
MIME-Version: 1.0 (generated by SEMI 1.13.7 - "Awazu")
Content-Type: text/plain; charset=US-ASCII
X-Dispatcher: imput version 990604(IM116)
Lines: 26
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

At Wed, 29 Dec 1999 18:29:35 +0100,
Artur Frysiak wrote:

Hello,

> Alexey, Why this patch don't go to main kernel source ?
> I check 2.3.35 and this problem is not fixed.
> With this patch IPv6 work fine.

And how about this patch, Alexey ?
This patch made by solves ICMPV6_PKT_TOOBIG problem.

--- tcp_ipv6.c.orig	Thu Aug 26 02:29:53 1999
+++ tcp_ipv6.c	Sun Oct 31 00:59:31 1999
@@ -632,7 +632,6 @@
 		if (dst == NULL) {
 			struct flowi fl;
-			struct dst_entry *dst;
 			/* BUGGG_FUTURE: Again, it is not clear how
 			   to handle rthdr case. Ignore this complexity

--
SEKIYA Yuji
USC/ISI Computer Networks Division 7

From owner-netdev@oss.sgi.com Thu Dec 30 07:08:00 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 07:07:51 -0800
Received: from azure.mist.sfc.wide.ad.jp ([203.178.140.41]:47623 "EHLO v6.linux.or.jp") by oss.sgi.com with ESMTP id ; Thu, 30 Dec 1999 07:07:33 -0800
Received: from localhost (IDENT:sekiya@LOCALHOST [::ffff:127.0.0.1] (may be forged)) by v6.linux.or.jp (8.9.3+3.1W/3.3Wb4) with ESMTP id AAA03917; Fri, 31 Dec 1999 00:05:48 +0900
Date: Fri, 31 Dec 1999 00:05:40 +0900
Message-ID:
From: Yuji Sekiya
To: kuznet@ms2.inr.ac.ru
Cc: kusune@sfc.wide.ad.jp, netdev@oss.sgi.com
Subject: Re: source address selection
In-Reply-To: In your message of "Wed, 29 Dec 1999 19:09:22 +0300 (MSK)" <199912291609.TAA04930@ms2.inr.ac.ru>
References: <199912291609.TAA04930@ms2.inr.ac.ru>
User-Agent: Wanderlust/2.2.12 (Joyride) SEMI/1.13.7 (Awazu) FLIM/1.13.2 (Kasanui) MULE XEmacs/21.2 (beta19) (Shinjuku) (i586-pc-linux)
Organization: Information Sciences Institute
MIME-Version: 1.0 (generated by SEMI 1.13.7 - "Awazu")
Content-Type: text/plain; charset=US-ASCII
X-Dispatcher: imput version 990604(IM116)
Lines: 20
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

At Wed, 29 Dec 1999 19:09:22 +0300 (MSK),
kuznet@ms2.inr.ac.ru wrote:

Hello Alexey,

> > If you have no plan to implement source address selection
> > into IPv4 protocol stack, how about eating kusune's patch
> > or implementing the feature into IPv6 stack at first ?
>
> In 2.5.

That's good news ! Thanks for your interest in this problem.
Until your implementation arrives, we will use kusune's patch
for source address selection.

Regards.

--
SEKIYA Yuji
USC/ISI Computer Networks Division 7

From owner-netdev@oss.sgi.com Thu Dec 30 07:39:11 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 07:39:00 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:12051 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Thu, 30 Dec 1999 07:38:39 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id SAA06616; Thu, 30 Dec 1999 18:37:00 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912301537.SAA06616@ms2.inr.ac.ru>
Subject: Re: two ipv6 oops on 2.3.31
To: sekiya@ISI.EDU (Yuji Sekiya)
Date: Thu, 30 Dec 1999 18:37:00 +0300 (MSK)
Cc: netdev@oss.sgi.com, venaas@nvg.ntnu.no
In-Reply-To: from "Yuji Sekiya" at Dec 30, 99 10:18:56 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Length: 360
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello!

> And how about this patch, Alexey ?

Thank you very much for reminding me about this one.

To be honest, I just have no time to track the state of the 2.3 kernels.
It will be done once, after 2.4 is eventually released.

For now: ftp://ftp.inr.ac.ru/ip-routing/softnet-*.dif.gz, README.softnet.
All these old holes are repaired there.

Alexey Kuznetsov

From owner-netdev@oss.sgi.com Thu Dec 30 07:41:30 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 07:41:21 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:16403 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Thu, 30 Dec 1999 07:41:16 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id SAA06756; Thu, 30 Dec 1999 18:40:51 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912301540.SAA06756@ms2.inr.ac.ru>
Subject: Re: source address selection
To: sekiya@ISI.EDU (Yuji Sekiya)
Date: Thu, 30 Dec 1999 18:40:51 +0300 (MSK)
Cc: kusune@sfc.wide.ad.jp, netdev@oss.sgi.com
In-Reply-To: from "Yuji Sekiya" at Dec 31, 99 00:05:40 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Length: 618
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello!

> > > If you have no plan to implement source address selection
> > > into IPv4 protocol stack, how about eating kusune's patch
> > > or implementing the feature into IPv6 stack at first ?
> >
> > In 2.5.

Grrrr, I am replying to myself.... It seems I misread your reply
and you misread the mail I was replying to. 8)
The result is a complete misunderstanding.

> > > If you have no plan to implement source address selection
> > > into IPv4 protocol stack,

Please, explain. What is "source address selection"?
I have no idea what can be added to IPv4 in this area.
All that could be done was done in 2.2.

Alexey

From owner-netdev@oss.sgi.com Thu Dec 30 09:15:42 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 09:15:32 -0800
Received: from minus.inr.ac.ru ([193.233.7.97]:27909 "HELO ms2.inr.ac.ru") by oss.sgi.com with SMTP id ; Thu, 30 Dec 1999 09:15:16 -0800
Received: (from kuznet@localhost) by ms2.inr.ac.ru (8.6.13/ANK) id UAA10334; Thu, 30 Dec 1999 20:14:31 +0300
From: kuznet@ms2.inr.ac.ru
Message-Id: <199912301714.UAA10334@ms2.inr.ac.ru>
Subject: Re: BUG: 2.3.35 SMP tcpdump crash
To: rusty@linuxcare.COM.AU (Rusty Russell)
Date: Thu, 30 Dec 1999 20:14:31 +0300 (MSK)
Cc: netdev@oss.sgi.com
In-Reply-To: from "Rusty Russell" at Dec 30, 99 02:13:02 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Length: 350
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

Hello!

> Only occurs on SMP when tcpdumping forwarded packets. Completely
> repeatable on stock 2.3.35.

For me it occurred on the first boot, without any tcpdump.

> Looks like skb->head is bogus on freed skb. This makes me wonder about
> the af_packet skb `borrowing' code...

Look at the new skb_expand etc.; backing it out fixes the problem.

Alexey

From owner-netdev@oss.sgi.com Thu Dec 30 20:05:07 1999
Received: by oss.sgi.com id ; Thu, 30 Dec 1999 20:04:58 -0800
Received: from linuxcare.canberra.net.au ([203.29.91.49]:12039 "EHLO front.linuxcare.com.au") by oss.sgi.com with ESMTP id ; Thu, 30 Dec 1999 20:04:41 -0800
Received: from gargle.linuxcare.com.au (penicillin.linuxcare.com.au [10.61.2.27]) by front.linuxcare.com.au (8.9.3/8.9.3/Debian/GNU) with ESMTP id PAA30002; Fri, 31 Dec 1999 15:04:46 +1100
Received: from rustcorp.com.au (really [127.0.0.1]) by rustcorp.com.au via in.smtpd with esmtp id (Debian Smail3.2.0.102) for ; Fri, 31 Dec 1999 15:04:46 +1100 (EST)
Message-Id:
From: Rusty Russell
To: kuznet@ms2.inr.ac.ru
Cc: netdev@oss.sgi.com
Subject: Re: BUG: 2.3.35 SMP tcpdump crash
In-reply-to: Your message of "Thu, 30 Dec 1999 20:14:31 +0300." <199912301714.UAA10334@ms2.inr.ac.ru>
Date: Fri, 31 Dec 1999 15:04:46 +1100
Sender: owner-netdev@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;netdev-outgoing

In message <199912301714.UAA10334@ms2.inr.ac.ru> you write:
> Look at new skb_expand etc., backing out this fixes the problem.

Thanks. Looks like someone is doing what I never expected: calling
skb_realloc_headroom() to get *less* headroom (from 32 down to 16).

This fixes it; now I'm looking to see who's doing that. Sorry for the
delay: urgent coffee machine repairs got priority.

Rusty.
--- linux-2.3-official/net/core/skbuff.c	Wed Dec 29 23:19:30 1999
+++ linux-2.3/net/core/skbuff.c	Fri Dec 31 14:55:10 1999
@@ -358,8 +534,9 @@
 	/* Set the tail pointer and length */
 	skb_put(n,skb->len);
-	/* Copy the bytes: data pointers must point to same data. */
-	memcpy(n->data - skb_headroom(skb), skb->head, skb->end-skb->head);
+
+	/* Copy the data only. */
+	memcpy(n->data, skb->data, skb->len);
 	copy_skb_header(n, skb);
 	return n;
--
Hacking time.
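
For readers following the patch above: the removed memcpy() wrote to
n->data - skb_headroom(skb), that is, it stepped back from the new data
pointer by the *old* headroom. Assuming the new buffer places its data
pointer new-headroom bytes after its head, that destination falls before
the start of the new buffer whenever the old headroom is larger than the
new one (32 versus 16 in the case described). The user-space sketch
below models only that arithmetic. struct buf and all function names
are hypothetical stand-ins, not the kernel's sk_buff API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the sk_buff fields that matter here. */
struct buf {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the payload */
	size_t len;		/* payload length in bytes */
};

static size_t buf_headroom(const struct buf *b)
{
	return (size_t)(b->data - b->head);
}

/* Allocate a buffer with `headroom` spare bytes in front of the payload. */
static int buf_init(struct buf *b, size_t headroom,
		    const void *payload, size_t len)
{
	b->head = malloc(headroom + len + 1);	/* +1 avoids a zero-size malloc */
	if (b->head == NULL)
		return -1;
	b->data = b->head + headroom;
	b->len = len;
	memcpy(b->data, payload, len);
	return 0;
}

/*
 * The old copy started at dst->data - buf_headroom(src). That address
 * lies before dst->head whenever the old headroom exceeds the new one.
 */
static int old_copy_starts_before_head(const struct buf *src,
				       size_t new_headroom)
{
	return buf_headroom(src) > new_headroom;
}

/* Copy with a different headroom, moving the payload only (as in the fix). */
static int buf_copy_with_headroom(struct buf *dst, const struct buf *src,
				  size_t new_headroom)
{
	if (buf_init(dst, new_headroom, src->data, src->len) != 0)
		return -1;
	assert(buf_headroom(dst) == new_headroom);
	return 0;
}
```

With a source buffer that has 32 bytes of headroom,
old_copy_starts_before_head(&src, 16) reports the underrun, while
buf_copy_with_headroom(&dst, &src, 16) copies just the payload and is
safe for both growing and shrinking the headroom.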