From gnb@melbourne.sgi.com Wed Jun 1 01:17:29 2005
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 01:17:32 -0700 (PDT)
Received: from larry.melbourne.sgi.com (mverd138.asia.info.net [61.14.31.138])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j518HRXq022082
    for ; Wed, 1 Jun 2005 01:17:28 -0700
Received: from [134.14.55.176] (hole.melbourne.sgi.com [134.14.55.176])
    by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA15672;
    Wed, 1 Jun 2005 18:16:26 +1000
Subject: Re: Locking model for NAPI drivers
From: Greg Banks
To: "David S. Miller"
Cc: Linux Network Development list
In-Reply-To: <20050531.154847.63995530.davem@davemloft.net>
References: <20050531.154847.63995530.davem@davemloft.net>
Content-Type: text/plain
Organization: Silicon Graphics Inc, Australian Software Group.
Message-Id: <1117613796.26331.2479.camel@hole.melbourne.sgi.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6-1mdk
Date: Wed, 01 Jun 2005 18:16:36 +1000
Content-Transfer-Encoding: 7bit
X-archive-position: 1940
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: gnb@melbourne.sgi.com
Precedence: bulk
X-list: netdev

On Wed, 2005-06-01 at 08:48, David S. Miller wrote:
> So the idea is, if we can make all of the spinlocks BH locks we'll
> solve a whole bunch of problems:
> [...]
> 2) the driver will actually produce useful profiling data
>    via oprofile and friends since timer interrupts will run
>    even while holding the locks

That would be really, really nice.

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

From herbert@gondor.apana.org.au Wed Jun 1 01:44:03 2005
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 01:44:09 -0700 (PDT)
Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j518i1Xq023834
    for ; Wed, 1 Jun 2005 01:44:03 -0700
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail)
    by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian))
    id 1DdOoK-0005CB-00; Wed, 01 Jun 2005 18:42:48 +1000
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian))
    id 1DdOoF-00082p-00; Wed, 01 Jun 2005 18:42:43 +1000
From: Herbert Xu
To: ak@muc.de (Andi Kleen)
Subject: Re: Locking model for NAPI drivers
Cc: davem@davemloft.net, netdev@oss.sgi.com
Organization: Core
In-Reply-To:
X-Newsgroups: apana.lists.os.linux.netdev
User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686))
Message-Id:
Date: Wed, 01 Jun 2005 18:42:43 +1000
X-archive-position: 1941
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: herbert@gondor.apana.org.au
Precedence: bulk
X-list: netdev

Andi Kleen wrote:
>
> That is because of the kmap_atomic it does right? At least in the i386
> highmem implementation I don't see any code that would be less safe in
> hard interrupt context compared to BHs. And FRV and mips look like they
> allow it too.

To make it safe we'll have to allocate another precious km_type entry.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

From raghunathan.venkatesan@wipro.com Wed Jun 1 04:40:57 2005
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 04:41:12 -0700 (PDT)
Received: from wip-ec-wd.wipro.com (wip-ec-wd.wipro.com [203.101.113.39])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51BepXq004532
    for ; Wed, 1 Jun 2005 04:40:54 -0700
Received: from wip-ec-wd.wipro.com (localhost.wipro.com [127.0.0.1])
    by localhost (Postfix) with ESMTP id 5CA9D205E4; Wed, 1 Jun 2005 17:00:54 +0530 (IST)
Received: from blr-ec-bh01.wipro.com (unknown [10.201.50.91])
    by wip-ec-wd.wipro.com (Postfix) with ESMTP id 3B276205E1; Wed, 1 Jun 2005 17:00:54 +0530 (IST)
Received: from chn-snr-bh2.wipro.com ([10.145.50.92])
    by blr-ec-bh01.wipro.com with Microsoft SMTPSVC(6.0.3790.211); Wed, 1 Jun 2005 17:09:48 +0530
Received: from CHN-SNR-MBX01.wipro.com ([10.145.50.181])
    by chn-snr-bh2.wipro.com with Microsoft SMTPSVC(6.0.3790.0); Wed, 1 Jun 2005 17:04:44 +0530
X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----_=_NextPart_001_01C5669D.E951F95B"
Subject: Unable to handle kernel paging request at virtual address 04000460
Date: Wed, 1 Jun 2005 17:01:23 +0530
Message-ID: <438662DA48DCAA41B1DF648BD4BD76C0E45DF1@CHN-SNR-MBX01.wipro.com>
X-MS-Has-Attach: yes
X-MS-TNEF-Correlator:
Thread-Topic: Unable to handle kernel paging request at virtual address 04000460
Thread-Index: AcVgd+bKyjXc1BZZTzOXOhBhBJ2c9wAwfS0wAFOR2UABBPMC8A==
From:
To: , ,
X-OriginalArrivalTime: 01 Jun 2005 11:34:44.0715 (UTC) FILETIME=[E8897BB0:01C5669D]
X-archive-position: 1942
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: raghunathan.venkatesan@wipro.com
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.

------_=_NextPart_001_01C5669D.E951F95B
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi Everyone,

We are facing the following crash in a custom Linux 2.4.26 kernel when we
run a netperf TCP Stream test (sizes varying from 64 to 32586 bytes) over
an IPSEC tunnel created between a host and a VPN server through our box.
This is an Au1550 MIPS32 based board (DB1550 Cabernet board from AMD). We
observe that the crash happens randomly (the PrId keeps changing at each
crash), because of burstiness in the traffic generated by the netperf tool.

Please look into the capture below. I'd like some help in debugging this
issue. The same set of IPSEC drivers (not from Linux) works fine on a
custom Linux 2.4.25 based kernel. We debugged the Oops traces and found
that all problems arise in skbuff (we don't know where in skbuff). Is there
a patch that needs to be applied for Linux 2.4.26?

Thanks & Regards,
Raghu Venkatesan
Project Manager (E & PE, Semiconductor & Access),
CDC2, Sozhanganallur,
Chennai - 600 119, INDIA
+91 -44-24500200 Ext.
2643 raghunathan.venkatesan@wipro.com =20 ------_=_NextPart_001_01C5669D.E951F95B Content-Type: application/octet-stream; name="recent.cap_send1.oops" Content-Transfer-Encoding: base64 Content-Description: recent.cap_send1.oops Content-Disposition: attachment; filename="recent.cap_send1.oops" a3N5bW9vcHMgMi40Ljkgb24gaTY4NiAyLjQuMjItMS4yMTE1Lm5wdGwuICBPcHRpb25zIHVzZWQK ICAgICAtdiAvaG9tZS9hbWQvcHJvamVjdC9hbWQva2VybmVsL3ZtbGludXggKGRlZmF1bHQpCiAg ICAgLUsgKHNwZWNpZmllZCkKICAgICAtbCAvcHJvYy9tb2R1bGVzIChkZWZhdWx0KQogICAgIC1v IC9ob21lL2FtZC9wcm9qZWN0L2FtZC9maWxlc3lzdGVtL3Vzci9saWIvbW9kdWxlcy8gKGRlZmF1 bHQpCiAgICAgLW0gL2hvbWUvYW1kL3Byb2plY3QvYW1kL2tlcm5lbC9TeXN0ZW0ubWFwIChkZWZh dWx0KQogICAgIC10IGVsZjMyLWxpdHRsZW1pcHMgLWEgbWlwczo0NjAwCgpObyBtb2R1bGVzIGlu IGtzeW1zLCBza2lwcGluZyBvYmplY3RzCk5vIGtzeW1zLCBza2lwcGluZyBsc21vZApVbmFibGUg dG8gaGFuZGxlIGtlcm5lbCBwYWdpbmcgcmVxdWVzdCBhdCB2aXJ0dWFsIGFkZHJlc3MgMDIwMDA0 ZDQsIGVwYyA9PSA4MDI0YWY2YywgcmEgPT0gODAyNGIwOTQKT29wcyBpbiBmYXVsdC5jOjpkb19w YWdlX2ZhdWx0LCBsaW5lIDIwNjoKJDAgOiAwMDAwMDAwMCAxMDAwZmMwMCA4YWJiYjYwMCAwMjAw MDQ2MCAwMjAwMDQ2MCA4YWJiYjVlYyAwMDAwMDAwMCAwMDAwMDVlYwokOCA6IDVhZDM0MzZlIDhh YmJiZGVjIGIzZGU1ZDcxIDU2NzM2OTg4IDA3ODNmZGZiIDgwMzIzODU4IDgwMzIzODA0IDI0ZTEy YWU1CiQxNjogMDIwMDA0NjAgMDAwMDAwMDEgOGFiYmI4MDAgMDAwMDA2MDAgMDAwMDBjZmMgMDAw MDA1ZGMgMDAwMDAwMTQgMDAwMDM0MDgKJDI0OiAwMDAwMDAwMCAyYWVhM2M3MCAgICAgICAgICAg ICAgICAgICA4MDMyMjAwMCA4MDMyM2EyOCAwMDAwMzQxYyA4MDI0YjA5NApIaSA6IDAwMDAwMDAw CkxvIDogMDAwMDA4MDAKZXBjICAgOiA4MDI0YWY2YyAgICBOb3QgdGFpbnRlZApTdGF0dXM6IDEw MDBmYzAzCkNhdXNlIDogMDA4MDAwMDgKUHJvY2VzcyBzd2FwcGVyIChwaWQ6IDAsIHN0YWNrcGFn ZT04MDMyMjAwMCkKU3RhY2s6ICAgIDhiOTYyNDgwIDAwMDAwMDAwIDAwMDAwMDAwIDAwMDAwMDAw IDAwMDAwODAwIDhiNmFmNDYwIDgwMjRiMDk0CiA4YjZhZjQ2MCA4YWJiYjgwMCAwMDAwMDYwMCAw MDAwMGNmYyAwMDAwMDVkYyAwMDAwMDgwMCA4YjZhZjQ2MCA4MDI0YjdjYwogODAyNGI3YjAgMjRl MTJhZTUgODAzMjM4NTggODAzMjM4MDQgYzAxYzIwNTAgOGI2YWY0NjAgODAzYTA0MDAgMDAwMDA1 YzgKIDgxMmJlMzAwIDgwMjUwMWQ0IDAwMDAwNWRjIDAwMDAwMDE0IDAwMDAyZTQwIDAwMDAwMDAw 
IDJhZWEzYzcwIDhiNmFmNDYwCiA4YWVhMTE2MCAwMDAwMDVjOCA4MDI2YTllOCAwMDAwMmU1NCA4 MDI2YTE4NCAxMDAwZmMwMyAwMDAwMDAwMCA4YjZhZjQ2MAogOGFiYmIwMTAgLi4uCkNhbGwgVHJh Y2U6ICAgWzw4MDI0YjA5ND5dIFs8ODAyNGI3Y2M+XSBbPDgwMjRiN2IwPl0gWzw4MDI1MDFkND5d IFs8ODAyNmE5ZTg+XQogWzw4MDI2YTE4ND5dIFs8ODAyNmEzMGM+XSBbPDgwMjZhMWRjPl0gWzw4 MDI2YTkwYz5dIFs8ODAyNmE5MGM+XSBbPDgwMjljNDE4Pl0KIFs8ODAyNmE5MGM+XSBbPDgwMjZh OTBjPl0gWzw4MDI1YTQ4ND5dIFs8ODAyNmE5MGM+XSBbPDgwMjZhOTBjPl0gWzw4MDI1YTk0OD5d CiBbPDgwMmRhMGUwPl0gWzw4MDI2YTkwYz5dIFs8ODAyNmE4ZDQ+XSBbPDgwMjZhOTBjPl0gWzw4 MDI2YTMwYz5dIFs8ODAyNmExODQ+XQogWzw4MDI2NzEzMD5dIFs8ODAyNjcxYjA+XSBbPDgwMjZh NzQ0Pl0gWzw4MDI1YTk4Yz5dIFs8ODAyOWVkODg+XSBbPDgwMjY3MTMwPl0KIFs8ODAyOWZkMzQ+ XSBbPDgwMjY3MDZjPl0gWzw4MDI2NzEzMD5dIFs8ODAyNjU3Zjg+XSBbPDgwMjY1YTIwPl0gWzw4 MDI1YTQ4ND5dCiBbPGMwMWNlMmE4Pl0gWzw4MDI2NTdmOD5dIFs8ODAyNjU3Zjg+XSBbPDgwMjVh OThjPl0gWzw4MDI1YTk0OD5dIC4uLgpXYXJuaW5nIChPb3BzX3RyYWNlX2xpbmUpOiBnYXJiYWdl ICcuLi4nIGF0IGVuZCBvZiB0cmFjZSBsaW5lIGlnbm9yZWQKQ29kZTogOGM1MDAwMDggIGFjNDAw MDA4ICAwMjAwMjAyMSA8OGM4MjAwNzQ+IDEwNTEwMDA5ICA4ZTEwMDAwMCAgYzA4MzAwNzQgIDAw NzExMDIzICBlMDgyMDA3NAoKCj4+UkE7ICAwMDAwMDAwMDgwMjRiMDk0IDxza2JfcmVsZWFzZV9k YXRhK2IwL2JjPgo+PiQxMzsgMDAwMDAwMDA4MDMyMzg1OCA8aW5pdF90YXNrX3VuaW9uKzE4NTgv MjAwMD4KPj4kMTQ7IDAwMDAwMDAwODAzMjM4MDQgPGluaXRfdGFza191bmlvbisxODA0LzIwMDA+ Cj4+JDI4OyAwMDAwMDAwMDgwMzIyMDAwIDxpbml0X3Rhc2tfdW5pb24rMC8yMDAwPgo+PiQyOTsg MDAwMDAwMDA4MDMyM2EyOCA8aW5pdF90YXNrX3VuaW9uKzFhMjgvMjAwMD4KPj4kMzE7IDAwMDAw MDAwODAyNGIwOTQgPHNrYl9yZWxlYXNlX2RhdGErYjAvYmM+Cgo+PlBDOyAgMDAwMDAwMDA4MDI0 YWY2YyA8c2tiX2Ryb3BfZnJhZ2xpc3QrMzQvNzQ+ICAgPD09PT09CgpUcmFjZTsgMDAwMDAwMDA4 MDI0YjA5NCA8c2tiX3JlbGVhc2VfZGF0YStiMC9iYz4KVHJhY2U7IDAwMDAwMDAwODAyNGI3Y2Mg PHNrYl9saW5lYXJpemUrYzQvMTRjPgpUcmFjZTsgMDAwMDAwMDA4MDI0YjdiMCA8c2tiX2xpbmVh cml6ZSthOC8xNGM+ClRyYWNlOyAwMDAwMDAwMDgwMjUwMWQ0IDxkZXZfcXVldWVfeG1pdCs1MC8z Yjg+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOWU4IDxpcF9maW5pc2hfb3V0cHV0MitlYy8xNTA+ClRy 
YWNlOyAwMDAwMDAwMDgwMjZhMTg0IDxpcF9mcmFnbWVudCsyNDAvNTAwPgpUcmFjZTsgMDAwMDAw MDA4MDI2YTMwYyA8aXBfZnJhZ21lbnQrM2M4LzUwMD4KVHJhY2U7IDAwMDAwMDAwODAyNmExZGMg PGlwX2ZyYWdtZW50KzI5OC81MDA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBjIDxpcF9maW5pc2hf b3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBjIDxpcF9maW5pc2hfb3V0cHV0 MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjljNDE4IDxpcF9yZWZyYWcrNjgvNzQ+ClRyYWNl OyAwMDAwMDAwMDgwMjZhOTBjIDxpcF9maW5pc2hfb3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAw MDAwMDgwMjZhOTBjIDxpcF9maW5pc2hfb3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgw MjVhNDg0IDxuZl9pdGVyYXRlKzk0LzExND4KVHJhY2U7IDAwMDAwMDAwODAyNmE5MGMgPGlwX2Zp bmlzaF9vdXRwdXQyKzEwLzE1MD4KVHJhY2U7IDAwMDAwMDAwODAyNmE5MGMgPGlwX2ZpbmlzaF9v dXRwdXQyKzEwLzE1MD4KVHJhY2U7IDAwMDAwMDAwODAyNWE5NDggPG5mX2hvb2tfc2xvdysxMjgv MWY4PgpUcmFjZTsgMDAwMDAwMDA4MDJkYTBlMCA8bWVtc2V0KzAvMWM+ClRyYWNlOyAwMDAwMDAw MDgwMjZhOTBjIDxpcF9maW5pc2hfb3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZh OGQ0IDxpcF9maW5pc2hfb3V0cHV0KzFhMC8xYTQ+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBjIDxp cF9maW5pc2hfb3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhMzBjIDxpcF9mcmFn bWVudCszYzgvNTAwPgpUcmFjZTsgMDAwMDAwMDA4MDI2YTE4NCA8aXBfZnJhZ21lbnQrMjQwLzUw MD4KVHJhY2U7IDAwMDAwMDAwODAyNjcxMzAgPGlwX2ZvcndhcmRfZmluaXNoKzEwL2EwPgpUcmFj ZTsgMDAwMDAwMDA4MDI2NzFiMCA8aXBfZm9yd2FyZF9maW5pc2grOTAvYTA+ClRyYWNlOyAwMDAw MDAwMDgwMjZhNzQ0IDxpcF9maW5pc2hfb3V0cHV0KzEwLzFhND4KVHJhY2U7IDAwMDAwMDAwODAy NWE5OGMgPG5mX2hvb2tfc2xvdysxNmMvMWY4PgpUcmFjZTsgMDAwMDAwMDA4MDI5ZWQ4OCA8aXBf Y3RfcmVmcmVzaCs4NC9iOD4KVHJhY2U7IDAwMDAwMDAwODAyNjcxMzAgPGlwX2ZvcndhcmRfZmlu aXNoKzEwL2EwPgpUcmFjZTsgMDAwMDAwMDA4MDI5ZmQzNCA8aWNtcF9wYWNrZXQrOTgvOWM+ClRy YWNlOyAwMDAwMDAwMDgwMjY3MDZjIDxfX2dudV9jb21waWxlZF9jKzI2Yy8zMjA+ClRyYWNlOyAw MDAwMDAwMDgwMjY3MTMwIDxpcF9mb3J3YXJkX2ZpbmlzaCsxMC9hMD4KVHJhY2U7IDAwMDAwMDAw ODAyNjU3ZjggPGlwX3Jjdl9maW5pc2grMTAvMmE4PgpUcmFjZTsgMDAwMDAwMDA4MDI2NWEyMCA8 aXBfcmN2X2ZpbmlzaCsyMzgvMmE4PgpUcmFjZTsgMDAwMDAwMDA4MDI1YTQ4NCA8bmZfaXRlcmF0 
ZSs5NC8xMTQ+ClRyYWNlOyAwMDAwMDAwMGMwMWNlMmE4IDxFTkRfT0ZfQ09ERSszZmUzYmFhOC8/ Pz8/PgpUcmFjZTsgMDAwMDAwMDA4MDI2NTdmOCA8aXBfcmN2X2ZpbmlzaCsxMC8yYTg+ClRyYWNl OyAwMDAwMDAwMDgwMjY1N2Y4IDxpcF9yY3ZfZmluaXNoKzEwLzJhOD4KVHJhY2U7IDAwMDAwMDAw ODAyNWE5OGMgPG5mX2hvb2tfc2xvdysxNmMvMWY4PgpUcmFjZTsgMDAwMDAwMDA4MDI1YTk0OCA8 bmZfaG9va19zbG93KzEyOC8xZjg+CgpDb2RlOyAgMDAwMDAwMDA4MDI0YWY2MCA8c2tiX2Ryb3Bf ZnJhZ2xpc3QrMjgvNzQ+CjAwMDAwMDAwIDxfUEM+OgpDb2RlOyAgMDAwMDAwMDA4MDI0YWY2MCA8 c2tiX2Ryb3BfZnJhZ2xpc3QrMjgvNzQ+CiAgIDA6ICAgOGM1MDAwMDggIGx3ICAgICAgczAsOCh2 MCkKQ29kZTsgIDAwMDAwMDAwODAyNGFmNjQgPHNrYl9kcm9wX2ZyYWdsaXN0KzJjLzc0PgogICA0 OiAgIGFjNDAwMDA4ICBzdyAgICAgIHplcm8sOCh2MCkKQ29kZTsgIDAwMDAwMDAwODAyNGFmNjgg PHNrYl9kcm9wX2ZyYWdsaXN0KzMwLzc0PgogICA4OiAgIDAyMDAyMDIxICBtb3ZlICAgIGEwLHMw CkNvZGU7ICAwMDAwMDAwMDgwMjRhZjZjIDxza2JfZHJvcF9mcmFnbGlzdCszNC83ND4gICA8PT09 PT0KICAgYzogICA4YzgyMDA3NCAgbHcgICAgICB2MCwxMTYoYTApICAgPD09PT09CkNvZGU7ICAw MDAwMDAwMDgwMjRhZjcwIDxza2JfZHJvcF9mcmFnbGlzdCszOC83ND4KICAxMDogICAxMDUxMDAw OSAgYmVxICAgICB2MCxzMSwzOCA8X1BDKzB4Mzg+CkNvZGU7ICAwMDAwMDAwMDgwMjRhZjc0IDxz a2JfZHJvcF9mcmFnbGlzdCszYy83ND4KICAxNDogICA4ZTEwMDAwMCAgbHcgICAgICBzMCwwKHMw KQpDb2RlOyAgMDAwMDAwMDA4MDI0YWY3OCA8c2tiX2Ryb3BfZnJhZ2xpc3QrNDAvNzQ+CiAgMTg6 ICAgYzA4MzAwNzQgIGxsICAgICAgdjEsMTE2KGEwKQpDb2RlOyAgMDAwMDAwMDA4MDI0YWY3YyA8 c2tiX2Ryb3BfZnJhZ2xpc3QrNDQvNzQ+CiAgMWM6ICAgMDA3MTEwMjMgIHN1YnUgICAgdjAsdjEs czEKQ29kZTsgIDAwMDAwMDAwODAyNGFmODAgPHNrYl9kcm9wX2ZyYWdsaXN0KzQ4Lzc0PgogIDIw OiAgIGUwODIwMDc0ICBzYyAgICAgIHYwLDExNihhMCkKCktlcm5lbCBwYW5pYzogQWllZSwga2ls bGluZyBpbnRlcnJ1cHQgaGFuZGxlciEKCjEgd2FybmluZyBpc3N1ZWQuICBSZXN1bHRzIG1heSBu b3QgYmUgcmVsaWFibGUuCg== ------_=_NextPart_001_01C5669D.E951F95B Content-Type: application/octet-stream; name="recent.cap.oops" Content-Transfer-Encoding: base64 Content-Description: recent.cap.oops Content-Disposition: attachment; filename="recent.cap.oops" a3N5bW9vcHMgMi40Ljkgb24gaTY4NiAyLjQuMjItMS4yMTE1Lm5wdGwuICBPcHRpb25zIHVzZWQK 
ICAgICAtdiAvaG9tZS9hbWQvcHJvamVjdC9hbWQva2VybmVsL3ZtbGludXggKGRlZmF1bHQpCiAg ICAgLUsgKHNwZWNpZmllZCkKICAgICAtbCAvcHJvYy9tb2R1bGVzIChkZWZhdWx0KQogICAgIC1v IC9ob21lL2FtZC9wcm9qZWN0L2FtZC9maWxlc3lzdGVtL3Vzci9saWIvbW9kdWxlcy8gKGRlZmF1 bHQpCiAgICAgLW0gL2hvbWUvYW1kL3Byb2plY3QvYW1kL2tlcm5lbC9TeXN0ZW0ubWFwIChkZWZh dWx0KQogICAgIC10IGVsZjMyLWxpdHRsZW1pcHMgLWEgbWlwczo0NjAwCgpObyBtb2R1bGVzIGlu IGtzeW1zLCBza2lwcGluZyBvYmplY3RzCk5vIGtzeW1zLCBza2lwcGluZyBsc21vZApVbmFibGUg dG8gaGFuZGxlIGtlcm5lbCBwYWdpbmcgcmVxdWVzdCBhdCB2aXJ0dWFsIGFkZHJlc3MgMDQwMDA0 NjAsIGVwYyA9PSA4MDI0YjIwYywgcmEgPT0gODAyYzQ5ZjgKT29wcyBpbiBmYXVsdC5jOjpkb19w YWdlX2ZhdWx0LCBsaW5lIDIwNjoKJDAgOiAwMDAwMDAwMCAwMDAwMDAwMCAwMDAwMDAwMCAwMDAw MDAwMSA4Yjc4MzU4MCAwMDAwMDAwMCAwNDAwMDQ2MCAwMDAwMDAwMQokOCA6IDAwMDAwMDAwIDAw MDAwMDAwIDAwMDAwMDAyIGQzZDBiMDAwIDgwMzIzYjY4IDAwMDAwMDAwIDgwMzIzZDYwIDdiN2E3 OTc4CiQxNjogODEyYmViMjAgODEyYmViMjAgZmZmZmZmZmYgOGJiMGQ4MDAgODAzYTA4MDQgMDAw MDAwMDAgMDAwMDAwMDIgODAzMjNlMTAKJDI0OiAwMDAwMDAwMCAyYjAwYWM3MCAgICAgICAgICAg ICAgICAgICA4MDMyMjAwMCA4MDMyM2FkMCAwMDAwMjQwMSA4MDJjNDlmOApIaSA6IDAwMDAyMDkx CkxvIDogZDY5MTI4NWUKZXBjICAgOiA4MDI0YjIwYyAgICBOb3QgdGFpbnRlZApTdGF0dXM6IDEw MDBmYzAzCkNhdXNlIDogMDA4MDAwMDgKUHJvY2VzcyBzd2FwcGVyIChwaWQ6IDAsIHN0YWNrcGFn ZT04MDMyMjAwMCkKU3RhY2s6ICAgIDAwMDAwMDAwIDhiYjBkODAwIDgwM2EwODA0IDAwMDAwMDAw IDgxMmJlYjIwIDgwMmM0OWY4IDgwMTA3YzI4CiAwMDAwMDAwMCAwMDAwMDAwMCAwMDAwMDAwMCA4 MTJiZWIyMCA4MTI0ZmM2OCA4YjZhZjVhMCA4MDNhMDgwMCAwMDAwMDAwNAogODAyNTAwODggMDAw MDAwMDAgMDAwMDAwMDAgMDAwMDAwMDAgMDAwMDAwMDAgODEyYjY1NjAgODAzYTA4MDAgOGI2YWY1 YTAKIDgwM2EwODAwIDAwMDAwMDAwIDgwMjVjM2UwIDAwMDAwMDAwIDAwMDAwMDAwIDgwMzIzYzE4 IDgwMzY5YmYwIDgwMzRkN2U4CiA4MDNhMDgwMCAwMDAwMDAwMCA4MDI1MDM3YyA4MDI5YzNlYyAw MDAwMDAwMCA4Yjc4MzU4MCA4YjZhZjVhMCAwMDAwMDAwZQogOGI2YWY1YTAgLi4uCkNhbGwgVHJh Y2U6ICAgWzw4MDJjNDlmOD5dIFs8ODAxMDdjMjg+XSBbPDgwMjUwMDg4Pl0gWzw4MDI1YzNlMD5d IFs8ODAyNTAzN2M+XQogWzw4MDI5YzNlYz5dIFs8ODAyNTczYTg+XSBbPDgwMjVhNDg0Pl0gWzw4 
MDI2YTkwYz5dIFs8ODAyNmE5ZTg+XSBbPDgwMjZhOTBjPl0KIFs8ODAyNWE5OGM+XSBbPDgwMjVh OTQ4Pl0gWzw4MDI2YTkwYz5dIFs8ODAyYTNkOTg+XSBbPDgwMjY3MTMwPl0gWzw4MDI2YThkND5d CiBbPDgwMjZhOTBjPl0gWzw4MDI2NzFjMD5dIFs8ODAyNjcxMzA+XSBbPDgwMjVhOThjPl0gWzw4 MDI5Y2Y1MD5dIFs8ODAyNjcxMzA+XQogWzw4MDI5ZmQwND5dIFs8ODAyNjcwNmM+XSBbPDgwMjY3 MTMwPl0gWzw4MDI2NTdmOD5dIFs8ODAyNjVhMjA+XSBbPDgwMjVhNDg0Pl0KIFs8YzAxY2UyYTg+ XSBbPDgwMjY1N2Y4Pl0gWzw4MDI2NTdmOD5dIFs8ODAyNWE5OGM+XSBbPDgwMjVhOTQ4Pl0gWzw4 MDI2NTdmOD5dCiBbPDgwMjY1NWEwPl0gWzw4MDI2NTdmOD5dIFs8ODAyNTBkNDg+XSBbPDgwMmUw MWY0Pl0gWzw4MDEwN2MyOD5dIC4uLgpXYXJuaW5nIChPb3BzX3RyYWNlX2xpbmUpOiBnYXJiYWdl ICcuLi4nIGF0IGVuZCBvZiB0cmFjZSBsaW5lIGlnbm9yZWQKQ29kZTogOGUwNjAwOWMgIDEwYzAw MDBlICAyNDAzMDAwMSA8OGNjMjAwMDA+IGMwNDUwMDAwICAwMGEzMjAyMyAgZTA0NDAwMDAgIDEw ODBmZmZjICAwMGEzMjAyMwoKCj4+UkE7ICAwMDAwMDAwMDgwMmM0OWY4IDxwYWNrZXRfcmN2X3Nw a3QrMjljLzJiMD4KPj4kMTI7IDAwMDAwMDAwODAzMjNiNjggPGluaXRfdGFza191bmlvbisxYjY4 LzIwMDA+Cj4+JDE0OyAwMDAwMDAwMDgwMzIzZDYwIDxpbml0X3Rhc2tfdW5pb24rMWQ2MC8yMDAw Pgo+PiQyMzsgMDAwMDAwMDA4MDMyM2UxMCA8aW5pdF90YXNrX3VuaW9uKzFlMTAvMjAwMD4KPj4k Mjg7IDAwMDAwMDAwODAzMjIwMDAgPGluaXRfdGFza191bmlvbiswLzIwMDA+Cj4+JDI5OyAwMDAw MDAwMDgwMzIzYWQwIDxpbml0X3Rhc2tfdW5pb24rMWFkMC8yMDAwPgo+PiQzMTsgMDAwMDAwMDA4 MDJjNDlmOCA8cGFja2V0X3Jjdl9zcGt0KzI5Yy8yYjA+Cgo+PlBDOyAgMDAwMDAwMDA4MDI0YjIw YyA8X19rZnJlZV9za2IrYTQvMTMwPiAgIDw9PT09PQoKVHJhY2U7IDAwMDAwMDAwODAyYzQ5Zjgg PHBhY2tldF9yY3Zfc3BrdCsyOWMvMmIwPgpUcmFjZTsgMDAwMDAwMDA4MDEwN2MyOCA8ZG9fZ2V0 dGltZW9mZGF5KzU4LzExND4KVHJhY2U7IDAwMDAwMDAwODAyNTAwODggPGRldl9xdWV1ZV94bWl0 X25pdCtiYy8xMTA+ClRyYWNlOyAwMDAwMDAwMDgwMjVjM2UwIDxfX2dudV9jb21waWxlZF9jKzcw LzE0Yz4KVHJhY2U7IDAwMDAwMDAwODAyNTAzN2MgPGRldl9xdWV1ZV94bWl0KzFmOC8zYjg+ClRy YWNlOyAwMDAwMDAwMDgwMjljM2VjIDxpcF9yZWZyYWcrM2MvNzQ+ClRyYWNlOyAwMDAwMDAwMDgw MjU3M2E4IDxuZWlnaF9yZXNvbHZlX291dHB1dCsxZmMvMjljPgpUcmFjZTsgMDAwMDAwMDA4MDI1 YTQ4NCA8bmZfaXRlcmF0ZSs5NC8xMTQ+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBjIDxpcF9maW5p 
c2hfb3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOWU4IDxpcF9maW5pc2hfb3V0 cHV0MitlYy8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBjIDxpcF9maW5pc2hfb3V0cHV0Misx MC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjVhOThjIDxuZl9ob29rX3Nsb3crMTZjLzFmOD4KVHJh Y2U7IDAwMDAwMDAwODAyNWE5NDggPG5mX2hvb2tfc2xvdysxMjgvMWY4PgpUcmFjZTsgMDAwMDAw MDA4MDI2YTkwYyA8aXBfZmluaXNoX291dHB1dDIrMTAvMTUwPgpUcmFjZTsgMDAwMDAwMDA4MDJh M2Q5OCA8aXB0X2xvY2FsX291dF9ob29rKzQvOGM+ClRyYWNlOyAwMDAwMDAwMDgwMjY3MTMwIDxp cF9mb3J3YXJkX2ZpbmlzaCsxMC9hMD4KVHJhY2U7IDAwMDAwMDAwODAyNmE4ZDQgPGlwX2Zpbmlz aF9vdXRwdXQrMWEwLzFhND4KVHJhY2U7IDAwMDAwMDAwODAyNmE5MGMgPGlwX2ZpbmlzaF9vdXRw dXQyKzEwLzE1MD4KVHJhY2U7IDAwMDAwMDAwODAyNjcxYzAgPGlwX29wdGlvbnNfYnVpbGQrMC8w PgpUcmFjZTsgMDAwMDAwMDA4MDI2NzEzMCA8aXBfZm9yd2FyZF9maW5pc2grMTAvYTA+ClRyYWNl OyAwMDAwMDAwMDgwMjVhOThjIDxuZl9ob29rX3Nsb3crMTZjLzFmOD4KVHJhY2U7IDAwMDAwMDAw ODAyOWNmNTAgPGRlYXRoX2J5X3RpbWVvdXQrM2MvYTg+ClRyYWNlOyAwMDAwMDAwMDgwMjY3MTMw IDxpcF9mb3J3YXJkX2ZpbmlzaCsxMC9hMD4KVHJhY2U7IDAwMDAwMDAwODAyOWZkMDQgPGljbXBf cGFja2V0KzY4LzljPgpUcmFjZTsgMDAwMDAwMDA4MDI2NzA2YyA8X19nbnVfY29tcGlsZWRfYysy NmMvMzIwPgpUcmFjZTsgMDAwMDAwMDA4MDI2NzEzMCA8aXBfZm9yd2FyZF9maW5pc2grMTAvYTA+ ClRyYWNlOyAwMDAwMDAwMDgwMjY1N2Y4IDxpcF9yY3ZfZmluaXNoKzEwLzJhOD4KVHJhY2U7IDAw MDAwMDAwODAyNjVhMjAgPGlwX3Jjdl9maW5pc2grMjM4LzJhOD4KVHJhY2U7IDAwMDAwMDAwODAy NWE0ODQgPG5mX2l0ZXJhdGUrOTQvMTE0PgpUcmFjZTsgMDAwMDAwMDBjMDFjZTJhOCA8RU5EX09G X0NPREUrM2ZlM2JhYTgvPz8/Pz4KVHJhY2U7IDAwMDAwMDAwODAyNjU3ZjggPGlwX3Jjdl9maW5p c2grMTAvMmE4PgpUcmFjZTsgMDAwMDAwMDA4MDI2NTdmOCA8aXBfcmN2X2ZpbmlzaCsxMC8yYTg+ ClRyYWNlOyAwMDAwMDAwMDgwMjVhOThjIDxuZl9ob29rX3Nsb3crMTZjLzFmOD4KVHJhY2U7IDAw MDAwMDAwODAyNWE5NDggPG5mX2hvb2tfc2xvdysxMjgvMWY4PgpUcmFjZTsgMDAwMDAwMDA4MDI2 NTdmOCA8aXBfcmN2X2ZpbmlzaCsxMC8yYTg+ClRyYWNlOyAwMDAwMDAwMDgwMjY1NWEwIDxpcF9y Y3YrNTEwLzU3OD4KVHJhY2U7IDAwMDAwMDAwODAyNjU3ZjggPGlwX3Jjdl9maW5pc2grMTAvMmE4 PgpUcmFjZTsgMDAwMDAwMDA4MDI1MGQ0OCA8bmV0aWZfcmVjZWl2ZV9za2IrMjcwLzJjMD4KVHJh 
Y2U7IDAwMDAwMDAwODAyZTAxZjQgPGF1MTAwMF9JUlErMTM0LzFhMD4KVHJhY2U7IDAwMDAwMDAw ODAxMDdjMjggPGRvX2dldHRpbWVvZmRheSs1OC8xMTQ+CgpDb2RlOyAgMDAwMDAwMDA4MDI0YjIw MCA8X19rZnJlZV9za2IrOTgvMTMwPgowMDAwMDAwMCA8X1BDPjoKQ29kZTsgIDAwMDAwMDAwODAy NGIyMDAgPF9fa2ZyZWVfc2tiKzk4LzEzMD4KICAgMDogICA4ZTA2MDA5YyAgbHcgICAgICBhMiwx NTYoczApCkNvZGU7ICAwMDAwMDAwMDgwMjRiMjA0IDxfX2tmcmVlX3NrYis5Yy8xMzA+CiAgIDQ6 ICAgMTBjMDAwMGUgIGJlcXogICAgYTIsNDAgPF9QQysweDQwPgpDb2RlOyAgMDAwMDAwMDA4MDI0 YjIwOCA8X19rZnJlZV9za2IrYTAvMTMwPgogICA4OiAgIDI0MDMwMDAxICBsaSAgICAgIHYxLDEK Q29kZTsgIDAwMDAwMDAwODAyNGIyMGMgPF9fa2ZyZWVfc2tiK2E0LzEzMD4gICA8PT09PT0KICAg YzogICA4Y2MyMDAwMCAgbHcgICAgICB2MCwwKGEyKSAgIDw9PT09PQpDb2RlOyAgMDAwMDAwMDA4 MDI0YjIxMCA8X19rZnJlZV9za2IrYTgvMTMwPgogIDEwOiAgIGMwNDUwMDAwICBsbCAgICAgIGEx LDAodjApCkNvZGU7ICAwMDAwMDAwMDgwMjRiMjE0IDxfX2tmcmVlX3NrYithYy8xMzA+CiAgMTQ6 ICAgMDBhMzIwMjMgIHN1YnUgICAgYTAsYTEsdjEKQ29kZTsgIDAwMDAwMDAwODAyNGIyMTggPF9f a2ZyZWVfc2tiK2IwLzEzMD4KICAxODogICBlMDQ0MDAwMCAgc2MgICAgICBhMCwwKHYwKQpDb2Rl OyAgMDAwMDAwMDA4MDI0YjIxYyA8X19rZnJlZV9za2IrYjQvMTMwPgogIDFjOiAgIDEwODBmZmZj ICBiZXF6ICAgIGEwLDEwIDxfUEMrMHgxMD4KQ29kZTsgIDAwMDAwMDAwODAyNGIyMjAgPF9fa2Zy ZWVfc2tiK2I4LzEzMD4KICAyMDogICAwMGEzMjAyMyAgc3VidSAgICBhMCxhMSx2MQoKS2VybmVs IHBhbmljOiBBaWVlLCBraWxsaW5nIGludGVycnVwdCBoYW5kbGVyIQoKMSB3YXJuaW5nIGlzc3Vl ZC4gIFJlc3VsdHMgbWF5IG5vdCBiZSByZWxpYWJsZS4K ------_=_NextPart_001_01C5669D.E951F95B Content-Type: application/octet-stream; name="recent.cap_recv.oops" Content-Transfer-Encoding: base64 Content-Description: recent.cap_recv.oops Content-Disposition: attachment; filename="recent.cap_recv.oops" a3N5bW9vcHMgMi40Ljkgb24gaTY4NiAyLjQuMjItMS4yMTE1Lm5wdGwuICBPcHRpb25zIHVzZWQK ICAgICAtdiAvaG9tZS9hbWQvcHJvamVjdC9hbWQva2VybmVsL3ZtbGludXggKGRlZmF1bHQpCiAg ICAgLUsgKHNwZWNpZmllZCkKICAgICAtbCAvcHJvYy9tb2R1bGVzIChkZWZhdWx0KQogICAgIC1v IC9ob21lL2FtZC9wcm9qZWN0L2FtZC9maWxlc3lzdGVtL3Vzci9saWIvbW9kdWxlcy8gKGRlZmF1 bHQpCiAgICAgLW0gL2hvbWUvYW1kL3Byb2plY3QvYW1kL2tlcm5lbC9TeXN0ZW0ubWFwIChkZWZh 
dWx0KQogICAgIC10IGVsZjMyLWxpdHRsZW1pcHMgLWEgbWlwczo0NjAwCgpObyBtb2R1bGVzIGlu IGtzeW1zLCBza2lwcGluZyBvYmplY3RzCk5vIGtzeW1zLCBza2lwcGluZyBsc21vZApVbmFibGUg dG8gaGFuZGxlIGtlcm5lbCBwYWdpbmcgcmVxdWVzdCBhdCB2aXJ0dWFsIGFkZHJlc3MgMDAwMDMy NjAsIGVwYyA9PSA4MDI0YjIwYywgcmEgPT0gODAyYzQ5ZjgKT29wcyBpbiBmYXVsdC5jOjpkb19w YWdlX2ZhdWx0LCBsaW5lIDIwNjoKJDAgOiAwMDAwMDAwMCAwMDAwMDAwMCAwMDAwMDAwMCAwMDAw MDAwMSA4Yjc4MDc2MCAwMDAwMDAwMCAwMDAwMzI2MCAwMDAwMDAwMQokOCA6IDAwMDAwMDAwIDAw MDAwMDAwIDAwMDAwMDAyIGQzZDBiMDAwIGMwMTE1MDAwIDAwMDAxNGI4IDhiOWJmZDI4IDdiN2E3 OTc4CiQxNjogOGI2YjU0NjAgOGI2YjU0NjAgZmZmZmZmZmYgOGI5MGY4MDAgODAzYTA4MDQgMDAw MDAwMDAgMDAwMDAwMDIgOGI5YmZkZDgKJDI0OiAwMDAwMDAwMCAyYWNhZDU1MCAgICAgICAgICAg ICAgICAgICA4YjliZTAwMCA4YjliZmE5OCAwMDAwNDc5ZCA4MDJjNDlmOApIaSA6IDAwMDAyMzYx CkxvIDogNzY1MGYxMDgKZXBjICAgOiA4MDI0YjIwYyAgICBOb3QgdGFpbnRlZApTdGF0dXM6IDEw MDBmYzAzCkNhdXNlIDogMDA4MDAwMDgKUHJvY2VzcyB2b3Nsb2cgKHBpZDogMTM0LCBzdGFja3Bh Z2U9OGI5YmUwMDApClN0YWNrOiAgICAwMDAwMDAwMCA4YjkwZjgwMCA4MDNhMDgwNCAwMDAwMDAw MCA4YjZiNTQ2MCA4MDJjNDlmOCA4MDEwN2MyOAogMDAwMDAwMDAgMDAwMDAwMDAgMDAwMDAwMDAg OGI2YjU0NjAgODEyNGZjNjggODEyYmVkMDAgODAzYTA4MDAgMDAwMDAwMDQKIDgwMjUwMDg4IDAw MDAwMDAwIDAwMDAwMDAwIDgwMjlkMzgwIDAwMDAwMDAwIDgxMmI2NTYwIDgwM2EwODAwIDgxMmJl ZDAwCiA4MDNhMDgwMCAwMDAwMDAwMCA4MDI1YzNlMCA4MDI2YTkwYyAwMDAwMDAwMyAwMDAwMDAw MiA4MDI5YzNhYyA4MDM0ZDdlOAogODAzYTA4MDAgMDAwMDAwMDAgODAyNTAzN2MgODAyOWMzZWMg MDAwMDAwMDAgOGI3ODA3NjAgODEyYmVkMDAgMDAwMDAwMGUKIDgxMmJlZDAwIC4uLgpDYWxsIFRy YWNlOiAgIFs8ODAyYzQ5Zjg+XSBbPDgwMTA3YzI4Pl0gWzw4MDI1MDA4OD5dIFs8ODAyOWQzODA+ XSBbPDgwMjVjM2UwPl0KIFs8ODAyNmE5MGM+XSBbPDgwMjljM2FjPl0gWzw4MDI1MDM3Yz5dIFs8 ODAyOWMzZWM+XSBbPDgwMjU3M2E4Pl0gWzw4MDI1YTQ4ND5dCiBbPDgwMjZhOTBjPl0gWzw4MDI2 YTllOD5dIFs8ODAyNmE5MGM+XSBbPDgwMjVhOThjPl0gWzw4MDI1YTk0OD5dIFs8ODAyNmE5MGM+ XQogWzw4MDJhM2Q5OD5dIFs8ODAyNjcxMzA+XSBbPDgwMjZhOGQ0Pl0gWzw4MDI2YTkwYz5dIFs8 ODAyNjcxYzA+XSBbPDgwMjY3MTMwPl0KIFs8ODAyNWE5OGM+XSBbPDgwMjY3MTMwPl0gWzw4MDI5 
ZmQzND5dIFs8ODAyNjcwNmM+XSBbPDgwMjY3MTMwPl0gWzw4MDI2NTdmOD5dCiBbPDgwMjY1YTIw Pl0gWzw4MDI1YTQ4ND5dIFs8YzAxY2UyYTg+XSBbPDgwMjY1N2Y4Pl0gWzw4MDI2NTdmOD5dIFs8 ODAyNWE5OGM+XQogWzw4MDI1YTk0OD5dIFs8ODAyNjU3Zjg+XSBbPDgwMjY1NWEwPl0gWzw4MDI2 NTdmOD5dIFs8ODAxMDEzM2M+XSAuLi4KV2FybmluZyAoT29wc190cmFjZV9saW5lKTogZ2FyYmFn ZSAnLi4uJyBhdCBlbmQgb2YgdHJhY2UgbGluZSBpZ25vcmVkCkNvZGU6IDhlMDYwMDljICAxMGMw MDAwZSAgMjQwMzAwMDEgPDhjYzIwMDAwPiBjMDQ1MDAwMCAgMDBhMzIwMjMgIGUwNDQwMDAwICAx MDgwZmZmYyAgMDBhMzIwMjMKCgo+PlJBOyAgMDAwMDAwMDA4MDJjNDlmOCA8cGFja2V0X3Jjdl9z cGt0KzI5Yy8yYjA+Cj4+JDMxOyAwMDAwMDAwMDgwMmM0OWY4IDxwYWNrZXRfcmN2X3Nwa3QrMjlj LzJiMD4KCj4+UEM7ICAwMDAwMDAwMDgwMjRiMjBjIDxfX2tmcmVlX3NrYithNC8xMzA+ICAgPD09 PT09CgpUcmFjZTsgMDAwMDAwMDA4MDJjNDlmOCA8cGFja2V0X3Jjdl9zcGt0KzI5Yy8yYjA+ClRy YWNlOyAwMDAwMDAwMDgwMTA3YzI4IDxkb19nZXR0aW1lb2ZkYXkrNTgvMTE0PgpUcmFjZTsgMDAw MDAwMDA4MDI1MDA4OCA8ZGV2X3F1ZXVlX3htaXRfbml0K2JjLzExMD4KVHJhY2U7IDAwMDAwMDAw ODAyOWQzODAgPF9faXBfY29ubnRyYWNrX2NvbmZpcm0rMjM4LzJjOD4KVHJhY2U7IDAwMDAwMDAw ODAyNWMzZTAgPF9fZ251X2NvbXBpbGVkX2MrNzAvMTRjPgpUcmFjZTsgMDAwMDAwMDA4MDI2YTkw YyA8aXBfZmluaXNoX291dHB1dDIrMTAvMTUwPgpUcmFjZTsgMDAwMDAwMDA4MDI5YzNhYyA8aXBf Y29uZmlybSs0OC80Yz4KVHJhY2U7IDAwMDAwMDAwODAyNTAzN2MgPGRldl9xdWV1ZV94bWl0KzFm OC8zYjg+ClRyYWNlOyAwMDAwMDAwMDgwMjljM2VjIDxpcF9yZWZyYWcrM2MvNzQ+ClRyYWNlOyAw MDAwMDAwMDgwMjU3M2E4IDxuZWlnaF9yZXNvbHZlX291dHB1dCsxZmMvMjljPgpUcmFjZTsgMDAw MDAwMDA4MDI1YTQ4NCA8bmZfaXRlcmF0ZSs5NC8xMTQ+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBj IDxpcF9maW5pc2hfb3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOWU4IDxpcF9m aW5pc2hfb3V0cHV0MitlYy8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBjIDxpcF9maW5pc2hf b3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjVhOThjIDxuZl9ob29rX3Nsb3crMTZj LzFmOD4KVHJhY2U7IDAwMDAwMDAwODAyNWE5NDggPG5mX2hvb2tfc2xvdysxMjgvMWY4PgpUcmFj ZTsgMDAwMDAwMDA4MDI2YTkwYyA8aXBfZmluaXNoX291dHB1dDIrMTAvMTUwPgpUcmFjZTsgMDAw MDAwMDA4MDJhM2Q5OCA8aXB0X2xvY2FsX291dF9ob29rKzQvOGM+ClRyYWNlOyAwMDAwMDAwMDgw 
MjY3MTMwIDxpcF9mb3J3YXJkX2ZpbmlzaCsxMC9hMD4KVHJhY2U7IDAwMDAwMDAwODAyNmE4ZDQg PGlwX2ZpbmlzaF9vdXRwdXQrMWEwLzFhND4KVHJhY2U7IDAwMDAwMDAwODAyNmE5MGMgPGlwX2Zp bmlzaF9vdXRwdXQyKzEwLzE1MD4KVHJhY2U7IDAwMDAwMDAwODAyNjcxYzAgPGlwX29wdGlvbnNf YnVpbGQrMC8wPgpUcmFjZTsgMDAwMDAwMDA4MDI2NzEzMCA8aXBfZm9yd2FyZF9maW5pc2grMTAv YTA+ClRyYWNlOyAwMDAwMDAwMDgwMjVhOThjIDxuZl9ob29rX3Nsb3crMTZjLzFmOD4KVHJhY2U7 IDAwMDAwMDAwODAyNjcxMzAgPGlwX2ZvcndhcmRfZmluaXNoKzEwL2EwPgpUcmFjZTsgMDAwMDAw MDA4MDI5ZmQzNCA8aWNtcF9wYWNrZXQrOTgvOWM+ClRyYWNlOyAwMDAwMDAwMDgwMjY3MDZjIDxf X2dudV9jb21waWxlZF9jKzI2Yy8zMjA+ClRyYWNlOyAwMDAwMDAwMDgwMjY3MTMwIDxpcF9mb3J3 YXJkX2ZpbmlzaCsxMC9hMD4KVHJhY2U7IDAwMDAwMDAwODAyNjU3ZjggPGlwX3Jjdl9maW5pc2gr MTAvMmE4PgpUcmFjZTsgMDAwMDAwMDA4MDI2NWEyMCA8aXBfcmN2X2ZpbmlzaCsyMzgvMmE4PgpU cmFjZTsgMDAwMDAwMDA4MDI1YTQ4NCA8bmZfaXRlcmF0ZSs5NC8xMTQ+ClRyYWNlOyAwMDAwMDAw MGMwMWNlMmE4IDxFTkRfT0ZfQ09ERSszZmUzYmFhOC8/Pz8/PgpUcmFjZTsgMDAwMDAwMDA4MDI2 NTdmOCA8aXBfcmN2X2ZpbmlzaCsxMC8yYTg+ClRyYWNlOyAwMDAwMDAwMDgwMjY1N2Y4IDxpcF9y Y3ZfZmluaXNoKzEwLzJhOD4KVHJhY2U7IDAwMDAwMDAwODAyNWE5OGMgPG5mX2hvb2tfc2xvdysx NmMvMWY4PgpUcmFjZTsgMDAwMDAwMDA4MDI1YTk0OCA8bmZfaG9va19zbG93KzEyOC8xZjg+ClRy YWNlOyAwMDAwMDAwMDgwMjY1N2Y4IDxpcF9yY3ZfZmluaXNoKzEwLzJhOD4KVHJhY2U7IDAwMDAw MDAwODAyNjU1YTAgPGlwX3Jjdis1MTAvNTc4PgpUcmFjZTsgMDAwMDAwMDA4MDI2NTdmOCA8aXBf cmN2X2ZpbmlzaCsxMC8yYTg+ClRyYWNlOyAwMDAwMDAwMDgwMTAxMzNjIDxkb19JUlErZjQvMTE4 PgoKQ29kZTsgIDAwMDAwMDAwODAyNGIyMDAgPF9fa2ZyZWVfc2tiKzk4LzEzMD4KMDAwMDAwMDAg PF9QQz46CkNvZGU7ICAwMDAwMDAwMDgwMjRiMjAwIDxfX2tmcmVlX3NrYis5OC8xMzA+CiAgIDA6 ICAgOGUwNjAwOWMgIGx3ICAgICAgYTIsMTU2KHMwKQpDb2RlOyAgMDAwMDAwMDA4MDI0YjIwNCA8 X19rZnJlZV9za2IrOWMvMTMwPgogICA0OiAgIDEwYzAwMDBlICBiZXF6ICAgIGEyLDQwIDxfUEMr MHg0MD4KQ29kZTsgIDAwMDAwMDAwODAyNGIyMDggPF9fa2ZyZWVfc2tiK2EwLzEzMD4KICAgODog ICAyNDAzMDAwMSAgbGkgICAgICB2MSwxCkNvZGU7ICAwMDAwMDAwMDgwMjRiMjBjIDxfX2tmcmVl X3NrYithNC8xMzA+ICAgPD09PT09CiAgIGM6ICAgOGNjMjAwMDAgIGx3ICAgICAgdjAsMChhMikg 
ICA8PT09PT0KQ29kZTsgIDAwMDAwMDAwODAyNGIyMTAgPF9fa2ZyZWVfc2tiK2E4LzEzMD4KICAx MDogICBjMDQ1MDAwMCAgbGwgICAgICBhMSwwKHYwKQpDb2RlOyAgMDAwMDAwMDA4MDI0YjIxNCA8 X19rZnJlZV9za2IrYWMvMTMwPgogIDE0OiAgIDAwYTMyMDIzICBzdWJ1ICAgIGEwLGExLHYxCkNv ZGU7ICAwMDAwMDAwMDgwMjRiMjE4IDxfX2tmcmVlX3NrYitiMC8xMzA+CiAgMTg6ICAgZTA0NDAw MDAgIHNjICAgICAgYTAsMCh2MCkKQ29kZTsgIDAwMDAwMDAwODAyNGIyMWMgPF9fa2ZyZWVfc2ti K2I0LzEzMD4KICAxYzogICAxMDgwZmZmYyAgYmVxeiAgICBhMCwxMCA8X1BDKzB4MTA+CkNvZGU7 ICAwMDAwMDAwMDgwMjRiMjIwIDxfX2tmcmVlX3NrYitiOC8xMzA+CiAgMjA6ICAgMDBhMzIwMjMg IHN1YnUgICAgYTAsYTEsdjEKCktlcm5lbCBwYW5pYzogQWllZSwga2lsbGluZyBpbnRlcnJ1cHQg aGFuZGxlciEKCjEgd2FybmluZyBpc3N1ZWQuICBSZXN1bHRzIG1heSBub3QgYmUgcmVsaWFibGUu Cg== ------_=_NextPart_001_01C5669D.E951F95B Content-Type: application/octet-stream; name="recent.cap_send.oops" Content-Transfer-Encoding: base64 Content-Description: recent.cap_send.oops Content-Disposition: attachment; filename="recent.cap_send.oops" a3N5bW9vcHMgMi40Ljkgb24gaTY4NiAyLjQuMjItMS4yMTE1Lm5wdGwuICBPcHRpb25zIHVzZWQK ICAgICAtdiAvaG9tZS9hbWQvcHJvamVjdC9hbWQva2VybmVsL3ZtbGludXggKGRlZmF1bHQpCiAg ICAgLUsgKHNwZWNpZmllZCkKICAgICAtbCAvcHJvYy9tb2R1bGVzIChkZWZhdWx0KQogICAgIC1v IC9ob21lL2FtZC9wcm9qZWN0L2FtZC9maWxlc3lzdGVtL3Vzci9saWIvbW9kdWxlcy8gKGRlZmF1 bHQpCiAgICAgLW0gL2hvbWUvYW1kL3Byb2plY3QvYW1kL2tlcm5lbC9TeXN0ZW0ubWFwIChkZWZh dWx0KQogICAgIC10IGVsZjMyLWxpdHRsZW1pcHMgLWEgbWlwczo0NjAwCgpObyBtb2R1bGVzIGlu IGtzeW1zLCBza2lwcGluZyBvYmplY3RzCk5vIGtzeW1zLCBza2lwcGluZyBsc21vZApVbmFibGUg dG8gaGFuZGxlIGtlcm5lbCBwYWdpbmcgcmVxdWVzdCBhdCB2aXJ0dWFsIGFkZHJlc3MgMDAwMDMy ZDQsIGVwYyA9PSA4MDI0YWY2YywgcmEgPT0gODAyNGIwOTQKT29wcyBpbiBmYXVsdC5jOjpkb19w YWdlX2ZhdWx0LCBsaW5lIDIwNjoKJDAgOiAwMDAwMDAwMCAxMDAwZmMwMCA4YWM4MWUwMCAwMDAw MzI2MCAwMDAwMzI2MCAwMDAwMDAwMCAwMDAwMDAwMCA4YjM4YjM0MAokOCA6IDAwMDAwMDMwIDgw MmRhMWEwIDAwMDAwMDEwIGJmYmViZGJjIGEzYTJhMWEwIDAwMDAwMDAwIDhhYjc5ZGU4IGE3YTZh NWE0CiQxNjogMDAwMDMyNjAgMDAwMDAwMDEgOGFlYTgyNjAgYzAxNzI5NGMgMDAwMDAwMGYgODAy 
NGIxNzggYzAxNjdhYjggYzAxNzI5NTAKJDI0OiAwMDAwMDAxMCAwMDQwZTBmMCAgICAgICAgICAg ICAgICAgICA4YWI3ODAwMCA4YWI3OWE2OCBjMDE3MjdkOCA4MDI0YjA5NApIaSA6IDAwMDAwMDAw CkxvIDogMDAwMDAwMGIKZXBjICAgOiA4MDI0YWY2YyAgICBOb3QgdGFpbnRlZApTdGF0dXM6IDEw MDBmYzAzCkNhdXNlIDogMDA4MDAwMDgKUHJvY2VzcyBtZG0td2lwcm8tbm8tZGUgKHBpZDogNDEw LCBzdGFja3BhZ2U9OGFiNzgwMDApClN0YWNrOiAgICA4YWI3OWFkOCA4MDM2OWJmMCAwMDAwMDAw NCA4MDI1YTQ4NCA4YjZiNTQ2MCA4YjZiNTQ2MCA4MDI0YjA5NAogZmZmYmM0NzMgODAyNmE5MGMg ODAzYTA0MDAgODEyYmVhODAgODAzYTA0MDAgOGI2YjU0NjAgOGIzOGIzNjAgODAyNGIwYzQKIDAw MDAwMDAwIDAwMDAwMDAyIDAwMDA0MGQyIDgwMmRhMGUwIDhhYjc5YzU4IDhiNmI1NDYwIDgwMjRi Mjk4IDgxMmI2NDYwCiA4MDNhMDQwMCA4MDNhMDQwMCA4YWI3OWFkOCA4YjZiNTQ2MCBjMDE3MWY1 OCA4MTJiZWJjMCA4MDM5MDZhOCAwMDAwMDAyMAogODAyNGFlMzggOGI2YjU3ODAgOGFhYzQwZjYg OGI0MjhkNjAgMDAwMDAwMDAgODEyYmViYzAgOGFhYzAwMTAgOGFlYTgyNjAKIDAwMDA0MGQyIC4u LgpDYWxsIFRyYWNlOiAgIFs8ODAyNWE0ODQ+XSBbPDgwMjRiMDk0Pl0gWzw4MDI2YTkwYz5dIFs8 ODAyNGIwYzQ+XSBbPDgwMmRhMGUwPl0KIFs8ODAyNGIyOTg+XSBbPGMwMTcxZjU4Pl0gWzw4MDI0 YWUzOD5dIFs8ODAyZDlkODA+XSBbPGMwMTcxZTEwPl0gWzw4MDI2YTkwYz5dCiBbPGMwMTcyNDE0 Pl0gWzxjMDE3NDBlOD5dIFs8ODAyNWE0ODQ+XSBbPDgwMjZhOTBjPl0gWzw4MDI2YTkwYz5dIFs8 ODAyNWE5NDg+XQogWzw4MDJkYTBlMD5dIFs8ODAyNmE5MGM+XSBbPGMwMTc1MWRjPl0gWzw4MDI2 YThkND5dIFs8ODAyNmE5MGM+XSBbPDgwMjZhMzBjPl0KIFs8ODAyNmExODQ+XSBbPDgwMjY3MTMw Pl0gWzw4MDI2NzFiMD5dIFs8ODAyNmE3NDQ+XSBbPDgwMjVhOThjPl0gWzw4MDI2NzEzMD5dCiBb PDgwMjY3MDZjPl0gWzw4MDI2NzEzMD5dIFs8ODAyNjU3Zjg+XSBbPDgwMjY1YTIwPl0gWzw4MDI1 YTQ4ND5dIFs8YzAxY2UyYTg+XQogWzw4MDI2NTdmOD5dIFs8ODAyNjU3Zjg+XSBbPDgwMjVhOThj Pl0gWzw4MDI1YTk0OD5dIFs8ODAyNjU3Zjg+XSAuLi4KV2FybmluZyAoT29wc190cmFjZV9saW5l KTogZ2FyYmFnZSAnLi4uJyBhdCBlbmQgb2YgdHJhY2UgbGluZSBpZ25vcmVkCkNvZGU6IDhjNTAw MDA4ICBhYzQwMDAwOCAgMDIwMDIwMjEgPDhjODIwMDc0PiAxMDUxMDAwOSAgOGUxMDAwMDAgIGMw ODMwMDc0ICAwMDcxMTAyMyAgZTA4MjAwNzQKCgo+PlJBOyAgMDAwMDAwMDA4MDI0YjA5NCA8c2ti X3JlbGVhc2VfZGF0YStiMC9iYz4KPj4kOTsgMDAwMDAwMDA4MDJkYTFhMCA8bWVtc2V0X3BhcnRp 
YWwrMjQvNmM+Cj4+JDIxOyAwMDAwMDAwMDgwMjRiMTc4IDxfX2tmcmVlX3NrYisxMC8xMzA+Cj4+ JDMxOyAwMDAwMDAwMDgwMjRiMDk0IDxza2JfcmVsZWFzZV9kYXRhK2IwL2JjPgoKPj5QQzsgIDAw MDAwMDAwODAyNGFmNmMgPHNrYl9kcm9wX2ZyYWdsaXN0KzM0Lzc0PiAgIDw9PT09PQoKVHJhY2U7 IDAwMDAwMDAwODAyNWE0ODQgPG5mX2l0ZXJhdGUrOTQvMTE0PgpUcmFjZTsgMDAwMDAwMDA4MDI0 YjA5NCA8c2tiX3JlbGVhc2VfZGF0YStiMC9iYz4KVHJhY2U7IDAwMDAwMDAwODAyNmE5MGMgPGlw X2ZpbmlzaF9vdXRwdXQyKzEwLzE1MD4KVHJhY2U7IDAwMDAwMDAwODAyNGIwYzQgPGtmcmVlX3Nr Ym1lbSsyNC9jOD4KVHJhY2U7IDAwMDAwMDAwODAyZGEwZTAgPG1lbXNldCswLzFjPgpUcmFjZTsg MDAwMDAwMDA4MDI0YjI5OCA8c2tiX2Nsb25lKzAvMjUwPgpUcmFjZTsgMDAwMDAwMDBjMDE3MWY1 OCA8RU5EX09GX0NPREUrM2ZkZGY3NTgvPz8/Pz4KVHJhY2U7IDAwMDAwMDAwODAyNGFlMzggPGFs bG9jX3NrYisxNjAvMjYwPgpUcmFjZTsgMDAwMDAwMDA4MDJkOWQ4MCA8bWVtY3B5KzAvND4KVHJh Y2U7IDAwMDAwMDAwYzAxNzFlMTAgPEVORF9PRl9DT0RFKzNmZGRmNjEwLz8/Pz8+ClRyYWNlOyAw MDAwMDAwMDgwMjZhOTBjIDxpcF9maW5pc2hfb3V0cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAw MGMwMTcyNDE0IDxFTkRfT0ZfQ09ERSszZmRkZmMxNC8/Pz8/PgpUcmFjZTsgMDAwMDAwMDBjMDE3 NDBlOCA8RU5EX09GX0NPREUrM2ZkZTE4ZTgvPz8/Pz4KVHJhY2U7IDAwMDAwMDAwODAyNWE0ODQg PG5mX2l0ZXJhdGUrOTQvMTE0PgpUcmFjZTsgMDAwMDAwMDA4MDI2YTkwYyA8aXBfZmluaXNoX291 dHB1dDIrMTAvMTUwPgpUcmFjZTsgMDAwMDAwMDA4MDI2YTkwYyA8aXBfZmluaXNoX291dHB1dDIr MTAvMTUwPgpUcmFjZTsgMDAwMDAwMDA4MDI1YTk0OCA8bmZfaG9va19zbG93KzEyOC8xZjg+ClRy YWNlOyAwMDAwMDAwMDgwMmRhMGUwIDxtZW1zZXQrMC8xYz4KVHJhY2U7IDAwMDAwMDAwODAyNmE5 MGMgPGlwX2ZpbmlzaF9vdXRwdXQyKzEwLzE1MD4KVHJhY2U7IDAwMDAwMDAwYzAxNzUxZGMgPEVO RF9PRl9DT0RFKzNmZGUyOWRjLz8/Pz8+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOGQ0IDxpcF9maW5p c2hfb3V0cHV0KzFhMC8xYTQ+ClRyYWNlOyAwMDAwMDAwMDgwMjZhOTBjIDxpcF9maW5pc2hfb3V0 cHV0MisxMC8xNTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhMzBjIDxpcF9mcmFnbWVudCszYzgvNTAw PgpUcmFjZTsgMDAwMDAwMDA4MDI2YTE4NCA8aXBfZnJhZ21lbnQrMjQwLzUwMD4KVHJhY2U7IDAw MDAwMDAwODAyNjcxMzAgPGlwX2ZvcndhcmRfZmluaXNoKzEwL2EwPgpUcmFjZTsgMDAwMDAwMDA4 MDI2NzFiMCA8aXBfZm9yd2FyZF9maW5pc2grOTAvYTA+ClRyYWNlOyAwMDAwMDAwMDgwMjZhNzQ0 
IDxpcF9maW5pc2hfb3V0cHV0KzEwLzFhND4KVHJhY2U7IDAwMDAwMDAwODAyNWE5OGMgPG5mX2hv b2tfc2xvdysxNmMvMWY4PgpUcmFjZTsgMDAwMDAwMDA4MDI2NzEzMCA8aXBfZm9yd2FyZF9maW5p c2grMTAvYTA+ClRyYWNlOyAwMDAwMDAwMDgwMjY3MDZjIDxfX2dudV9jb21waWxlZF9jKzI2Yy8z MjA+ClRyYWNlOyAwMDAwMDAwMDgwMjY3MTMwIDxpcF9mb3J3YXJkX2ZpbmlzaCsxMC9hMD4KVHJh Y2U7IDAwMDAwMDAwODAyNjU3ZjggPGlwX3Jjdl9maW5pc2grMTAvMmE4PgpUcmFjZTsgMDAwMDAw MDA4MDI2NWEyMCA8aXBfcmN2X2ZpbmlzaCsyMzgvMmE4PgpUcmFjZTsgMDAwMDAwMDA4MDI1YTQ4 NCA8bmZfaXRlcmF0ZSs5NC8xMTQ+ClRyYWNlOyAwMDAwMDAwMGMwMWNlMmE4IDxFTkRfT0ZfQ09E RSszZmUzYmFhOC8/Pz8/PgpUcmFjZTsgMDAwMDAwMDA4MDI2NTdmOCA8aXBfcmN2X2ZpbmlzaCsx MC8yYTg+ClRyYWNlOyAwMDAwMDAwMDgwMjY1N2Y4IDxpcF9yY3ZfZmluaXNoKzEwLzJhOD4KVHJh Y2U7IDAwMDAwMDAwODAyNWE5OGMgPG5mX2hvb2tfc2xvdysxNmMvMWY4PgpUcmFjZTsgMDAwMDAw MDA4MDI1YTk0OCA8bmZfaG9va19zbG93KzEyOC8xZjg+ClRyYWNlOyAwMDAwMDAwMDgwMjY1N2Y4 IDxpcF9yY3ZfZmluaXNoKzEwLzJhOD4KCkNvZGU7ICAwMDAwMDAwMDgwMjRhZjYwIDxza2JfZHJv cF9mcmFnbGlzdCsyOC83ND4KMDAwMDAwMDAgPF9QQz46CkNvZGU7ICAwMDAwMDAwMDgwMjRhZjYw IDxza2JfZHJvcF9mcmFnbGlzdCsyOC83ND4KICAgMDogICA4YzUwMDAwOCAgbHcgICAgICBzMCw4 KHYwKQpDb2RlOyAgMDAwMDAwMDA4MDI0YWY2NCA8c2tiX2Ryb3BfZnJhZ2xpc3QrMmMvNzQ+CiAg IDQ6ICAgYWM0MDAwMDggIHN3ICAgICAgemVybyw4KHYwKQpDb2RlOyAgMDAwMDAwMDA4MDI0YWY2 OCA8c2tiX2Ryb3BfZnJhZ2xpc3QrMzAvNzQ+CiAgIDg6ICAgMDIwMDIwMjEgIG1vdmUgICAgYTAs czAKQ29kZTsgIDAwMDAwMDAwODAyNGFmNmMgPHNrYl9kcm9wX2ZyYWdsaXN0KzM0Lzc0PiAgIDw9 PT09PQogICBjOiAgIDhjODIwMDc0ICBsdyAgICAgIHYwLDExNihhMCkgICA8PT09PT0KQ29kZTsg IDAwMDAwMDAwODAyNGFmNzAgPHNrYl9kcm9wX2ZyYWdsaXN0KzM4Lzc0PgogIDEwOiAgIDEwNTEw MDA5ICBiZXEgICAgIHYwLHMxLDM4IDxfUEMrMHgzOD4KQ29kZTsgIDAwMDAwMDAwODAyNGFmNzQg PHNrYl9kcm9wX2ZyYWdsaXN0KzNjLzc0PgogIDE0OiAgIDhlMTAwMDAwICBsdyAgICAgIHMwLDAo czApCkNvZGU7ICAwMDAwMDAwMDgwMjRhZjc4IDxza2JfZHJvcF9mcmFnbGlzdCs0MC83ND4KICAx ODogICBjMDgzMDA3NCAgbGwgICAgICB2MSwxMTYoYTApCkNvZGU7ICAwMDAwMDAwMDgwMjRhZjdj IDxza2JfZHJvcF9mcmFnbGlzdCs0NC83ND4KICAxYzogICAwMDcxMTAyMyAgc3VidSAgICB2MCx2 
MSxzMQpDb2RlOyAgMDAwMDAwMDA4MDI0YWY4MCA8c2tiX2Ryb3BfZnJhZ2xpc3QrNDgvNzQ+CiAg MjA6ICAgZTA4MjAwNzQgIHNjICAgICAgdjAsMTE2KGEwKQoKS2VybmVsIHBhbmljOiBBaWVlLCBr aWxsaW5nIGludGVycnVwdCBoYW5kbGVyIQoKMSB3YXJuaW5nIGlzc3VlZC4gIFJlc3VsdHMgbWF5 IG5vdCBiZSByZWxpYWJsZS4K ------_=_NextPart_001_01C5669D.E951F95B-- From jaegert@us.ibm.com Wed Jun 1 07:00:44 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 07:00:54 -0700 (PDT) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.143]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51E0bXq012794 for ; Wed, 1 Jun 2005 07:00:44 -0700 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e3.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j51Dxf89024306 for ; Wed, 1 Jun 2005 09:59:41 -0400 Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay02.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j51DxftR261752 for ; Wed, 1 Jun 2005 09:59:41 -0400 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j51DxfQI017299 for ; Wed, 1 Jun 2005 09:59:41 -0400 Received: from d01ml605.pok.ibm.com (d01ml605.pok.ibm.com [9.56.227.91]) by d01av02.pok.ibm.com (8.12.11/8.12.11) with ESMTP id j51DxfnV017282; Wed, 1 Jun 2005 09:59:41 -0400 In-Reply-To: To: James Morris Cc: chrisw@osdl.org, latten@austin.ibm.com, netdev@oss.sgi.com, sds@tycho.nsa.gov, serue@us.ibm.com MIME-Version: 1.0 Subject: Re: [PATCH 2/2] Resend: LSM-IPSec Networking Hooks X-Mailer: Lotus Notes Release 6.0.2CF1 June 9, 2003 From: Trent Jaeger Message-ID: Date: Wed, 1 Jun 2005 09:59:40 -0400 X-MIMETrack: Serialize by Router on D01ML605/01/M/IBM(Build V70_M4_01112005 Beta 3|January 11, 2005) at 06/01/2005 09:59:40, Serialize complete at 06/01/2005 09:59:40 Content-Type: multipart/alternative; boundary="=_alternative 004CDF1985257013_=" X-archive-position: 1943 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: jaegert@us.ibm.com Precedence: bulk X-list: netdev This is a multipart message in MIME format. --=_alternative 004CDF1985257013_= Content-Type: text/plain; charset="US-ASCII" OK. Thanks for the detailed comments. I will review and get back with comments and mods (probably next week). Regards, Trent. ------------------------------------------------------------ Trent Jaeger IBM T.J. Watson Research Center 19 Skyline Drive, Hawthorne, NY 10532 (914) 784-7225, FAX (914) 784-7225 James Morris 05/31/2005 12:15 AM To: Trent Jaeger/Watson/IBM@IBMUS cc: netdev@oss.sgi.com, , serue@us.ltcfwd.linux.ibm.com, , Subject: Re: [PATCH 2/2] Resend: LSM-IPSec Networking Hooks On Tue, 17 May 2005, jaegert wrote: Ok, my last review in this iteration. > @@ -984,6 +1029,13 @@ static struct xfrm_state * pfkey_msg2xfr > x->lft.soft_add_expires_seconds = lifetime->sadb_lifetime_addtime; > x->lft.soft_use_expires_seconds = lifetime->sadb_lifetime_usetime; > } > + > + sec_ctx = (struct sadb_x_sec_ctx *) ext_hdrs[SADB_X_EXT_SEC_CTX-1]; > + if (sec_ctx != NULL) { > + if (security_xfrm_state_alloc(x, sec_ctx)) > + goto out; You should propagate the return value of security_xfrm_state_alloc() here by assigning it to err. > -selinux-y := avc.o hooks.o selinuxfs.o netlink.o nlmsgtab.o > +selinux-y := avc.o hooks.o selinuxfs.o netlink.o nlmsgtab.o nethooks.o What about making nethooks.o (or whatever it'll be called) conditionally compiled via CONFIG_SECURITY_NETWORK_XFRM ? (see netif.o) > + * ISSUES: > + * 1. Caching packets, so they are not dropped during negotiation This needs to be done for IPsec in general, not sure what the status is. > + * 2. Emulating a reasonable SO_PEERSEC across machines This may not be too difficult if we limit this to connected TCP sockets. > + * 3. Testing sk_policy setting with context What does this mean? Overall, this looks like a really good approach to the problem. 
- James -- James Morris --=_alternative 004CDF1985257013_= Content-Type: text/html; charset="US-ASCII"
OK.

Thanks for the detailed comments.  

I will review and get back with comments and mods (probably next week).

Regards,
Trent.
------------------------------------------------------------
Trent Jaeger
IBM T.J. Watson Research Center
19 Skyline Drive, Hawthorne, NY 10532
(914) 784-7225, FAX (914) 784-7225



James Morris <jmorris@redhat.com>

05/31/2005 12:15 AM

       
To: Trent Jaeger/Watson/IBM@IBMUS
cc: netdev@oss.sgi.com, <chrisw@osdl.org>, serue@us.ltcfwd.linux.ibm.com, <latten@austin.ibm.com>, <sds@tycho.nsa.gov>
Subject: Re: [PATCH 2/2] Resend: LSM-IPSec Networking Hooks



On Tue, 17 May 2005, jaegert wrote:

Ok, my last review in this iteration.

> @@ -984,6 +1029,13 @@ static struct xfrm_state * pfkey_msg2xfr
>                x->lft.soft_add_expires_seconds = lifetime->sadb_lifetime_addtime;
>                x->lft.soft_use_expires_seconds = lifetime->sadb_lifetime_usetime;
>        }
> +
> +       sec_ctx = (struct sadb_x_sec_ctx *) ext_hdrs[SADB_X_EXT_SEC_CTX-1];
> +       if (sec_ctx != NULL) {
> +               if (security_xfrm_state_alloc(x, sec_ctx))
> +                       goto out;

You should propagate the return value of security_xfrm_state_alloc() here
by assigning it to err.
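
For illustration only, here is a minimal user-space sketch of the pattern being
asked for. It is not the actual pfkey code: demo_state_alloc() is a hypothetical
stand-in for security_xfrm_state_alloc(), demo_msg2state() is a hypothetical
caller, and the -ENOBUFS default is an assumption made for this sketch. The
point is that the hook's own error code reaches the caller instead of being
replaced by the function's default.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for security_xfrm_state_alloc(): returns 0 on
 * success or a negative errno on failure.
 */
static int demo_state_alloc(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/*
 * Hypothetical caller, loosely shaped like the quoted hunk. The default
 * error value is an assumption for this sketch only.
 */
static int demo_msg2state(int has_sec_ctx, int should_fail)
{
	int err = -ENOBUFS;	/* assumed default error */

	if (has_sec_ctx) {
		/* Keep the hook's return value rather than only testing it. */
		err = demo_state_alloc(should_fail);
		if (err)
			goto out;
	}

	err = 0;
out:
	return err;
}
```

With this shape, demo_msg2state(1, 1) reports the hook's -ENOMEM rather than
the default error, which is the behaviour the comment above asks for.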

> -selinux-y := avc.o hooks.o selinuxfs.o netlink.o nlmsgtab.o
> +selinux-y := avc.o hooks.o selinuxfs.o netlink.o nlmsgtab.o nethooks.o

What about making nethooks.o (or whatever it'll be called) conditionally
compiled via CONFIG_SECURITY_NETWORK_XFRM? (see netif.o)


> + * ISSUES:
> + *   1. Caching packets, so they are not dropped during negotiation

This needs to be done for IPsec in general, not sure what the status is.

> + *   2. Emulating a reasonable SO_PEERSEC across machines

This may not be too difficult if we limit this to connected TCP sockets.

> + *   3. Testing sk_policy setting with context

What does this mean?


Overall, this looks like a really good approach to the problem.


- James
--
James Morris
<jmorris@redhat.com>



--=_alternative 004CDF1985257013_=-- From kernel@linuxace.com Wed Jun 1 10:01:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 10:01:59 -0700 (PDT) Received: from linuxace.com (adsl-67-120-171-161.dsl.lsan03.pacbell.net [67.120.171.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j51H1sXq026236 for ; Wed, 1 Jun 2005 10:01:54 -0700 Received: (qmail 20132 invoked by uid 0); 1 Jun 2005 17:00:58 -0000 Date: Wed, 1 Jun 2005 10:00:58 -0700 From: Phil Oester To: Herbert Xu Cc: netdev@oss.sgi.com, akpm@osdl.org Subject: Re: 2.6.12-rcx networking oops Message-ID: <20050601170058.GA20112@linuxace.com> References: <20050531224012.GA16789@linuxace.com> <20050601054955.GA2625@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050601054955.GA2625@gondor.apana.org.au> User-Agent: Mutt/1.4.1i X-archive-position: 1944 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@linuxace.com Precedence: bulk X-list: netdev On Wed, Jun 01, 2005 at 03:49:55PM +1000, Herbert Xu wrote: > This looks like stack overflow. %esi is meant to be "res" which is > a local variable. As you can see, it's pointing below %esp and > threadinfo. Ok, so I enabled DEBUG_STACKOVERFLOW in addition to CONFIG_DEBUG_SLAB and CONFIG_DEBUG_PAGEALLOC, and got the below today...so maybe it is a slab issue? 0xc0238cdd is in dst_alloc (net/core/dst.c:124). 119 if (ops->gc && atomic_read(&ops->entries) > ops->gc_thresh) { 120 if (ops->gc()) 121 return NULL; 122 } 123 dst = kmem_cache_alloc(ops->kmem_cachep, SLAB_ATOMIC); 0xc013912b is at mm/slab.c:3077. 
3072 size = kmem_cache_size(c); 3073 local_irq_restore(flags); 3074 } 3075 3076 return size; 3077 } Phil invalid operand: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 1 EIP: 0060:[] Not tainted VLI EFLAGS: 00016292 (2.6.12-rc5-git5) EIP is at ksize+0x7b/0x100 eax: c0238cdd ebx: f7ba9c20 ecx: f7babf78 edx: dcc59000 esi: 00000020 edi: 0000e3ba ebp: c0338d98 esp: c0338d88 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c0338000 task=c1989b00) Stack: 00000000 04000000 c02d1a00 ffffff97 c0338db0 c0238cdd c0338e58 04000000 00000000 ffffff97 c0338eb4 c0245cb7 00000002 f7b01000 c0338dec c0338df0 f7318ef8 00000000 00000000 00000001 f72dbef8 0000a704 103c243b f27ceec0 Call Trace: [] show_stack+0x7a/0x90 [] show_registers+0x14d/0x1b0 [] die+0xf9/0x180 [] do_trap+0xa0/0xb0 [] do_invalid_op+0xa9/0xc0 [] error_code+0x4f/0x54 [] dst_alloc+0x2d/0xa0 [] ip_route_input_slow+0x4a7/0x840 [] ip_route_input+0x9a/0x160 [] ip_rcv+0x3b0/0x4d0 [] netif_receive_skb+0x13a/0x1a0 [] e1000_clean_rx_irq+0x180/0x4d0 [] e1000_clean+0x40/0xe0 [] net_rx_action+0x90/0x130 [] __do_softirq+0xd4/0xf0 [] do_softirq+0x52/0x70 ======================= [] irq_exit+0x3a/0x40 [] do_IRQ+0x68/0xa0 [] common_interrupt+0x1a/0x20 [] cpu_idle+0x7b/0x80 [] start_secondary+0x73/0x90 [<00000000>] stext+0x3feffd6c/0xc [] 0xc198afb4 Code: 8d 05 0c e2 34 c0 e8 e9 25 15 00 e9 96 dd ff ff 8d 05 0c e2 34 c0 e8 a9 25 15 00 e9 00 e2 ff ff 8d 05 0c e2 34 c0 e8 c9 25 15 00 23 e2 ff ff 8d 05 0c e2 34 c0 e8 89 25 15 00 e9 84 e2 ff ff <0>Kernel panic - not syncing: Fatal exception in interrupt From mmporter@cox.net Wed Jun 1 11:26:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 11:26:39 -0700 (PDT) Received: from fed1rmmtao09.cox.net (fed1rmmtao09.cox.net [68.230.241.30]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51IQXXq032754 for ; Wed, 1 Jun 2005 11:26:34 -0700 Received: from liberty.homelinux.org ([68.2.41.86]) by fed1rmmtao09.cox.net (InterMail vM.6.01.04.00 201-2131-118-20041027) 
with ESMTP id <20050601182536.OPJC7275.fed1rmmtao09.cox.net@liberty.homelinux.org>; Wed, 1 Jun 2005 14:25:36 -0400 Received: (from mmporter@localhost) by liberty.homelinux.org (8.9.3/8.9.3/Debian 8.9.3-21) id LAA16886; Wed, 1 Jun 2005 11:25:34 -0700 Date: Wed, 1 Jun 2005 11:25:34 -0700 From: Matt Porter To: torvalds@osdl.org, akpm@osdl.org, jgarzik@pobox.com Cc: linux-kernel@vger.kernel.org, linuxppc-embedded@ozlabs.org, netdev@oss.sgi.com Subject: [PATCH][3/3] RapidIO support: net driver over messaging Message-ID: <20050601112534.C16559@cox.net> References: <20050601110836.A16559@cox.net> <20050601111516.B16559@cox.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20050601111516.B16559@cox.net>; from mporter@kernel.crashing.org on Wed, Jun 01, 2005 at 11:15:17AM -0700 X-archive-position: 1945 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mporter@kernel.crashing.org Precedence: bulk X-list: netdev Adds an "Ethernet" driver which sends Ethernet packets over the standard RapidIO messaging. This depends on the core RIO patch for mailbox/doorbell access. 
Signed-off-by: Matt Porter Index: drivers/net/Kconfig =================================================================== --- f0bf7810dbe8c4073832d6c3785364084e9523a7/drivers/net/Kconfig (mode:100644) +++ 4ed27b6e30a69f314a2ca131e80ac45e2111f245/drivers/net/Kconfig (mode:100644) @@ -2185,6 +2185,20 @@ tristate "iSeries Virtual Ethernet driver support" depends on NETDEVICES && PPC_ISERIES +config RIONET + tristate "RapidIO Ethernet over messaging driver support" + depends on NETDEVICES && RAPIDIO + +config RIONET_TX_SIZE + int "Number of outbound queue entries" + depends on RIONET + default "128" + +config RIONET_RX_SIZE + int "Number of inbound queue entries" + depends on RIONET + default "128" + config FDDI bool "FDDI driver support" depends on NETDEVICES && (PCI || EISA) Index: drivers/net/Makefile =================================================================== --- f0bf7810dbe8c4073832d6c3785364084e9523a7/drivers/net/Makefile (mode:100644) +++ 4ed27b6e30a69f314a2ca131e80ac45e2111f245/drivers/net/Makefile (mode:100644) @@ -58,6 +58,7 @@ obj-$(CONFIG_VIA_RHINE) += via-rhine.o obj-$(CONFIG_VIA_VELOCITY) += via-velocity.o obj-$(CONFIG_ADAPTEC_STARFIRE) += starfire.o +obj-$(CONFIG_RIONET) += rionet.o # # end link order section Index: drivers/net/rionet.c =================================================================== --- /dev/null (tree:f0bf7810dbe8c4073832d6c3785364084e9523a7) +++ 4ed27b6e30a69f314a2ca131e80ac45e2111f245/drivers/net/rionet.c (mode:100644) @@ -0,0 +1,622 @@ +/* + * rionet - Ethernet driver over RapidIO messaging services + * + * Copyright 2005 MontaVista Software, Inc. + * Matt Porter + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#define DRV_NAME "rionet" +#define DRV_VERSION "0.1" +#define DRV_AUTHOR "Matt Porter " +#define DRV_DESC "Ethernet over RapidIO" + +MODULE_AUTHOR(DRV_AUTHOR); +MODULE_DESCRIPTION(DRV_DESC); +MODULE_LICENSE("GPL"); + +#define RIONET_DEFAULT_MSGLEVEL 0 +#define RIONET_DOORBELL_JOIN 0x1000 +#define RIONET_DOORBELL_LEAVE 0x1001 + +#define RIONET_MAILBOX 0 + +#define RIONET_TX_RING_SIZE CONFIG_RIONET_TX_SIZE +#define RIONET_RX_RING_SIZE CONFIG_RIONET_RX_SIZE + +LIST_HEAD(rionet_peers); + +struct rionet_private { + struct rio_mport *mport; + struct sk_buff *rx_skb[RIONET_RX_RING_SIZE]; + struct sk_buff *tx_skb[RIONET_TX_RING_SIZE]; + struct net_device_stats stats; + int rx_slot; + int tx_slot; + int tx_cnt; + int ack_slot; + spinlock_t lock; + u32 msg_enable; +}; + +struct rionet_peer { + struct list_head node; + struct rio_dev *rdev; + struct resource *res; +}; + +static int rionet_check = 0; +static int rionet_capable = 1; +static struct net_device *sndev = NULL; + +/* + * This is a fast lookup table for for translating TX + * Ethernet packets into a destination RIO device. It + * could be made into a hash table to save memory depending + * on system trade-offs. 
+ */ +static struct rio_dev *rionet_active[RIO_MAX_ROUTE_ENTRIES]; + +#define is_rionet_capable(pef, src_ops, dst_ops) \ + ((pef & RIO_PEF_INB_MBOX) && \ + (pef & RIO_PEF_INB_DOORBELL) && \ + (src_ops & RIO_SRC_OPS_DOORBELL) && \ + (dst_ops & RIO_DST_OPS_DOORBELL)) +#define dev_rionet_capable(dev) \ + is_rionet_capable(dev->pef, dev->src_ops, dev->dst_ops) + +#define RIONET_MAC_MATCH(x) (*(u32 *)x == 0x00010001) +#define RIONET_GET_DESTID(x) (*(u16 *)(x + 4)) + +static struct net_device_stats *rionet_stats(struct net_device *ndev) +{ + struct rionet_private *rnet = ndev->priv; + return &rnet->stats; +} + +static int rionet_rx_clean(struct net_device *ndev) +{ + int i; + int error = 0; + struct rionet_private *rnet = ndev->priv; + void *data; + + i = rnet->rx_slot; + + do { + if (!rnet->rx_skb[i]) { + rnet->stats.rx_dropped++; + continue; + } + + if (!(data = rio_get_inb_message(rnet->mport, RIONET_MAILBOX))) + break; + + rnet->rx_skb[i]->data = data; + skb_put(rnet->rx_skb[i], RIO_MAX_MSG_SIZE); + rnet->rx_skb[i]->dev = sndev; + rnet->rx_skb[i]->protocol = + eth_type_trans(rnet->rx_skb[i], sndev); + error = netif_rx(rnet->rx_skb[i]); + + if (error == NET_RX_DROP) { + rnet->stats.rx_dropped++; + } else if (error == NET_RX_BAD) { + if (netif_msg_rx_err(rnet)) + printk(KERN_WARNING "%s: bad rx packet\n", + DRV_NAME); + rnet->stats.rx_errors++; + } else { + rnet->stats.rx_packets++; + rnet->stats.rx_bytes += RIO_MAX_MSG_SIZE; + } + + } while ((i = (i + 1) % RIONET_RX_RING_SIZE) != rnet->rx_slot); + + return i; +} + +static void rionet_rx_fill(struct net_device *ndev, int end) +{ + int i; + struct rionet_private *rnet = ndev->priv; + + i = rnet->rx_slot; + do { + rnet->rx_skb[i] = dev_alloc_skb(RIO_MAX_MSG_SIZE); + + if (!rnet->rx_skb[i]) + break; + + rio_add_inb_buffer(rnet->mport, RIONET_MAILBOX, + rnet->rx_skb[i]->data); + } while ((i = (i + 1) % RIONET_RX_RING_SIZE) != end); + + rnet->rx_slot = i; +} + +static int rionet_queue_tx_msg(struct sk_buff *skb, struct 
net_device *ndev, + struct rio_dev *rdev) +{ + struct rionet_private *rnet = ndev->priv; + + rio_add_outb_message(rnet->mport, rdev, 0, skb->data, skb->len); + rnet->tx_skb[rnet->tx_slot] = skb; + + rnet->stats.tx_packets++; + rnet->stats.tx_bytes += skb->len; + + if (++rnet->tx_cnt == RIONET_TX_RING_SIZE) + netif_stop_queue(ndev); + + if (++rnet->tx_slot == RIONET_TX_RING_SIZE) + rnet->tx_slot = 0; + + if (netif_msg_tx_queued(rnet)) + printk(KERN_INFO "%s: queued skb %8.8x len %8.8x\n", DRV_NAME, + (u32) skb, skb->len); + + return 0; +} + +static int rionet_start_xmit(struct sk_buff *skb, struct net_device *ndev) +{ + int i; + struct rionet_private *rnet = ndev->priv; + struct ethhdr *eth = (struct ethhdr *)skb->data; + u16 destid; + + spin_lock_irq(&rnet->lock); + + if ((rnet->tx_cnt + 1) > RIONET_TX_RING_SIZE) { + netif_stop_queue(ndev); + spin_unlock_irq(&rnet->lock); + return -EBUSY; + } + + if (eth->h_dest[0] & 0x01) { + /* + * XXX Need to delay queuing if ring max is reached, + * flush additional packets in tx_event() before + * awakening the queue. We can easily exceed ring + * size with a large number of nodes or even a + * small number where the ring is relatively full + * on entrance to hard_start_xmit. 
+ */ + for (i = 0; i < RIO_MAX_ROUTE_ENTRIES; i++) + if (rionet_active[i]) + rionet_queue_tx_msg(skb, ndev, + rionet_active[i]); + } else if (RIONET_MAC_MATCH(eth->h_dest)) { + destid = RIONET_GET_DESTID(eth->h_dest); + if (rionet_active[destid]) + rionet_queue_tx_msg(skb, ndev, rionet_active[destid]); + } + + spin_unlock_irq(&rnet->lock); + + return 0; +} + +static int rionet_set_mac_address(struct net_device *ndev, void *p) +{ + struct sockaddr *addr = p; + + if (!is_valid_ether_addr(addr->sa_data)) + return -EADDRNOTAVAIL; + + memcpy(ndev->dev_addr, addr->sa_data, ndev->addr_len); + + return 0; +} + +static int rionet_change_mtu(struct net_device *ndev, int new_mtu) +{ + struct rionet_private *rnet = ndev->priv; + + if (netif_msg_drv(rnet)) + printk(KERN_WARNING + "%s: rionet_change_mtu(): not implemented\n", DRV_NAME); + + return 0; +} + +static void rionet_set_multicast_list(struct net_device *ndev) +{ + struct rionet_private *rnet = ndev->priv; + + if (netif_msg_drv(rnet)) + printk(KERN_WARNING + "%s: rionet_set_multicast_list(): not implemented\n", + DRV_NAME); +} + +static void rionet_dbell_event(struct rio_mport *mport, u16 sid, u16 tid, + u16 info) +{ + struct net_device *ndev = sndev; + struct rionet_private *rnet = ndev->priv; + struct rionet_peer *peer; + + if (netif_msg_intr(rnet)) + printk(KERN_INFO "%s: doorbell sid %4.4x tid %4.4x info %4.4x", + DRV_NAME, sid, tid, info); + if (info == RIONET_DOORBELL_JOIN) { + if (!rionet_active[sid]) { + list_for_each_entry(peer, &rionet_peers, node) { + if (peer->rdev->destid == sid) + rionet_active[sid] = peer->rdev; + } + rio_mport_send_doorbell(mport, sid, + RIONET_DOORBELL_JOIN); + } + } else if (info == RIONET_DOORBELL_LEAVE) { + rionet_active[sid] = NULL; + } else { + if (netif_msg_intr(rnet)) + printk(KERN_WARNING "%s: unhandled doorbell\n", + DRV_NAME); + } +} + +static void rionet_inb_msg_event(struct rio_mport *mport, int mbox, int slot) +{ + int n; + struct net_device *ndev = sndev; + struct 
rionet_private *rnet = (struct rionet_private *)ndev->priv; + + if (netif_msg_intr(rnet)) + printk(KERN_INFO "%s: inbound message event, mbox %d slot %d\n", + DRV_NAME, mbox, slot); + + spin_lock(&rnet->lock); + if ((n = rionet_rx_clean(ndev)) != rnet->rx_slot) + rionet_rx_fill(ndev, n); + spin_unlock(&rnet->lock); +} + +static void rionet_outb_msg_event(struct rio_mport *mport, int mbox, int slot) +{ + struct net_device *ndev = sndev; + struct rionet_private *rnet = ndev->priv; + + spin_lock(&rnet->lock); + + if (netif_msg_intr(rnet)) + printk(KERN_INFO + "%s: outbound message event, mbox %d slot %d\n", + DRV_NAME, mbox, slot); + + while (rnet->tx_cnt && (rnet->ack_slot != slot)) { + /* dma unmap single */ + dev_kfree_skb_irq(rnet->tx_skb[rnet->ack_slot]); + rnet->tx_skb[rnet->ack_slot] = NULL; + if (++rnet->ack_slot == RIONET_TX_RING_SIZE) + rnet->ack_slot = 0; + rnet->tx_cnt--; + } + + if (rnet->tx_cnt < RIONET_TX_RING_SIZE) + netif_wake_queue(ndev); + + spin_unlock(&rnet->lock); +} + +static int rionet_open(struct net_device *ndev) +{ + int i, rc = 0; + struct rionet_peer *peer, *tmp; + u32 pwdcsr; + struct rionet_private *rnet = ndev->priv; + + if (netif_msg_ifup(rnet)) + printk(KERN_INFO "%s: open\n", DRV_NAME); + + if ((rc = rio_request_inb_dbell(rnet->mport, + RIONET_DOORBELL_JOIN, + RIONET_DOORBELL_LEAVE, + rionet_dbell_event)) < 0) + goto out; + + if ((rc = rio_request_inb_mbox(rnet->mport, + RIONET_MAILBOX, + RIONET_RX_RING_SIZE, + rionet_inb_msg_event)) < 0) + goto out; + + if ((rc = rio_request_outb_mbox(rnet->mport, + RIONET_MAILBOX, + RIONET_TX_RING_SIZE, + rionet_outb_msg_event)) < 0) + goto out; + + /* Initialize inbound message ring */ + for (i = 0; i < RIONET_RX_RING_SIZE; i++) + rnet->rx_skb[i] = NULL; + rnet->rx_slot = 0; + rionet_rx_fill(ndev, 0); + + rnet->tx_slot = 0; + rnet->tx_cnt = 0; + rnet->ack_slot = 0; + + spin_lock_init(&rnet->lock); + + rnet->msg_enable = RIONET_DEFAULT_MSGLEVEL; + + netif_carrier_on(ndev); + 
netif_start_queue(ndev); + + list_for_each_entry_safe(peer, tmp, &rionet_peers, node) { + if (!(peer->res = rio_request_outb_dbell(peer->rdev, + RIONET_DOORBELL_JOIN, + RIONET_DOORBELL_LEAVE))) + { + printk(KERN_ERR "%s: error requesting doorbells\n", + DRV_NAME); + continue; + } + + /* + * If device has initialized inbound doorbells, + * send a join message + */ + rio_read_config_32(peer->rdev, RIO_WRITE_PORT_CSR, &pwdcsr); + if (pwdcsr & RIO_DOORBELL_AVAIL) + rio_send_doorbell(peer->rdev, RIONET_DOORBELL_JOIN); + } + + out: + return rc; +} + +static int rionet_close(struct net_device *ndev) +{ + struct rionet_private *rnet = (struct rionet_private *)ndev->priv; + struct rionet_peer *peer, *tmp; + int i; + + if (netif_msg_ifup(rnet)) + printk(KERN_INFO "%s: close\n", DRV_NAME); + + netif_stop_queue(ndev); + netif_carrier_off(ndev); + + for (i = 0; i < RIONET_RX_RING_SIZE; i++) + if (rnet->rx_skb[i]) + kfree_skb(rnet->rx_skb[i]); + + list_for_each_entry_safe(peer, tmp, &rionet_peers, node) { + if (rionet_active[peer->rdev->destid]) { + rio_send_doorbell(peer->rdev, RIONET_DOORBELL_LEAVE); + rionet_active[peer->rdev->destid] = NULL; + } + rio_release_outb_dbell(peer->rdev, peer->res); + } + + rio_release_inb_dbell(rnet->mport, RIONET_DOORBELL_JOIN, + RIONET_DOORBELL_LEAVE); + rio_release_inb_mbox(rnet->mport, RIONET_MAILBOX); + rio_release_outb_mbox(rnet->mport, RIONET_MAILBOX); + + return 0; +} + +static void rionet_remove(struct rio_dev *rdev) +{ + struct net_device *ndev = NULL; + struct rionet_peer *peer, *tmp; + + unregister_netdev(ndev); + kfree(ndev); + + list_for_each_entry_safe(peer, tmp, &rionet_peers, node) { + list_del(&peer->node); + kfree(peer); + } +} + +static int rionet_ioctl(struct net_device *ndev, struct ifreq *rq, int cmd) +{ + return -EOPNOTSUPP; +} + +static void rionet_get_drvinfo(struct net_device *ndev, + struct ethtool_drvinfo *info) +{ + struct rionet_private *rnet = ndev->priv; + + strcpy(info->driver, DRV_NAME); + strcpy(info->version, 
DRV_VERSION); + strcpy(info->fw_version, "n/a"); + sprintf(info->bus_info, "RIO master port %d", rnet->mport->id); +} + +static u32 rionet_get_msglevel(struct net_device *ndev) +{ + struct rionet_private *rnet = ndev->priv; + + return rnet->msg_enable; +} + +static void rionet_set_msglevel(struct net_device *ndev, u32 value) +{ + struct rionet_private *rnet = ndev->priv; + + rnet->msg_enable = value; +} + +static u32 rionet_get_link(struct net_device *ndev) +{ + return netif_carrier_ok(ndev); +} + +static struct ethtool_ops rionet_ethtool_ops = { + .get_drvinfo = rionet_get_drvinfo, + .get_msglevel = rionet_get_msglevel, + .set_msglevel = rionet_set_msglevel, + .get_link = rionet_get_link, +}; + +static int rionet_setup_netdev(struct rio_mport *mport) +{ + int rc = 0; + struct net_device *ndev = NULL; + struct rionet_private *rnet; + u16 device_id; + + /* Allocate our net_device structure */ + ndev = alloc_etherdev(sizeof(struct rionet_private)); + if (ndev == NULL) { + printk(KERN_INFO "%s: could not allocate ethernet device.\n", + DRV_NAME); + rc = -ENOMEM; + goto out; + } + + /* + * XXX hack, store point a static at ndev so we can get it... + * Perhaps need an array of these that the handler can + * index via the mbox number. 
+ */ + sndev = ndev; + + /* Set up private area */ + rnet = (struct rionet_private *)ndev->priv; + rnet->mport = mport; + + /* Set the default MAC address */ + device_id = rio_local_get_device_id(mport); + ndev->dev_addr[0] = 0x00; + ndev->dev_addr[1] = 0x01; + ndev->dev_addr[2] = 0x00; + ndev->dev_addr[3] = 0x01; + ndev->dev_addr[4] = device_id >> 8; + ndev->dev_addr[5] = device_id & 0xff; + + /* Fill in the driver function table */ + ndev->open = &rionet_open; + ndev->hard_start_xmit = &rionet_start_xmit; + ndev->stop = &rionet_close; + ndev->get_stats = &rionet_stats; + ndev->change_mtu = &rionet_change_mtu; + ndev->set_mac_address = &rionet_set_mac_address; + ndev->set_multicast_list = &rionet_set_multicast_list; + ndev->do_ioctl = &rionet_ioctl; + SET_ETHTOOL_OPS(ndev, &rionet_ethtool_ops); + + ndev->mtu = RIO_MAX_MSG_SIZE - 14; + + SET_MODULE_OWNER(ndev); + + rc = register_netdev(ndev); + if (rc != 0) + goto out; + + printk("%s: %s %s Version %s, MAC %02x:%02x:%02x:%02x:%02x:%02x\n", + ndev->name, + DRV_NAME, + DRV_DESC, + DRV_VERSION, + ndev->dev_addr[0], ndev->dev_addr[1], ndev->dev_addr[2], + ndev->dev_addr[3], ndev->dev_addr[4], ndev->dev_addr[5]); + + out: + return rc; +} + +/* + * XXX Make multi-net safe + */ +static int rionet_probe(struct rio_dev *rdev, const struct rio_device_id *id) +{ + int rc = -ENODEV; + u32 lpef, lsrc_ops, ldst_ops; + struct rionet_peer *peer; + + /* If local device is not rionet capable, give up quickly */ + if (!rionet_capable) + goto out; + + /* + * First time through, make sure local device is rionet + * capable, setup netdev, and set flags so this is skipped + * on later probes + */ + if (!rionet_check) { + rio_local_read_config_32(rdev->net->hport, RIO_PEF_CAR, &lpef); + rio_local_read_config_32(rdev->net->hport, RIO_SRC_OPS_CAR, + &lsrc_ops); + rio_local_read_config_32(rdev->net->hport, RIO_DST_OPS_CAR, + &ldst_ops); + if (!is_rionet_capable(lpef, lsrc_ops, ldst_ops)) { + printk(KERN_ERR + "%s: local device is not network 
capable\n", + DRV_NAME); + rionet_check = 1; + rionet_capable = 0; + goto out; + } + + rc = rionet_setup_netdev(rdev->net->hport); + rionet_check = 1; + } + + /* + * If the remote device has mailbox/doorbell capabilities, + * add it to the peer list. + */ + if (dev_rionet_capable(rdev)) { + if (!(peer = kmalloc(sizeof(struct rionet_peer), GFP_KERNEL))) { + rc = -ENOMEM; + goto out; + } + peer->rdev = rdev; + list_add_tail(&peer->node, &rionet_peers); + } + + out: + return rc; +} + +static struct rio_device_id rionet_id_table[] = { + {RIO_DEVICE(RIO_ANY_ID, RIO_ANY_ID)} +}; + +static struct rio_driver rionet_driver = { + .name = "rionet", + .id_table = rionet_id_table, + .probe = rionet_probe, + .remove = rionet_remove, +}; + +static int __init rionet_init(void) +{ + return rio_register_driver(&rionet_driver); +} + +static void __exit rionet_exit(void) +{ + rio_unregister_driver(&rionet_driver); +} + +module_init(rionet_init); +module_exit(rionet_exit); From davem@davemloft.net Wed Jun 1 11:56:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 11:56:30 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51IuMXq001813 for ; Wed, 1 Jun 2005 11:56:28 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DdYMW-0003Ku-N6; Wed, 01 Jun 2005 11:54:44 -0700 Date: Wed, 01 Jun 2005 11:54:44 -0700 (PDT) Message-Id: <20050601.115444.68157121.davem@davemloft.net> To: raghunathan.venkatesan@wipro.com Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com, linux@der-keiler.de Subject: Re: Unable to handle kernel paging request at virtual address 04000460 From: "David S. 
Miller" In-Reply-To: <438662DA48DCAA41B1DF648BD4BD76C0E45DF1@CHN-SNR-MBX01.wipro.com> References: <438662DA48DCAA41B1DF648BD4BD76C0E45DF1@CHN-SNR-MBX01.wipro.com> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1946 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Please don't ask the community to debug your custom kernel with private VPN driver modules installed. From afleming@freescale.com Wed Jun 1 13:46:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 13:46:40 -0700 (PDT) Received: from az33egw01.freescale.net (az33egw01.freescale.net [192.88.158.102]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51KkZXq010355 for ; Wed, 1 Jun 2005 13:46:36 -0700 Received: from az33smr02.freescale.net (az33smr02.freescale.net [10.64.34.200]) by az33egw01.freescale.net (8.12.11/az33egw01) with ESMTP id j51KouYg020960; Wed, 1 Jun 2005 13:50:56 -0700 (MST) Received: from [10.82.17.56] ([10.82.17.56]) by az33smr02.freescale.net (8.13.1/8.13.0) with ESMTP id j51KmgBH016530; Wed, 1 Jun 2005 15:48:42 -0500 (CDT) In-Reply-To: <20050531105939.7486e071@dxpl.pdx.osdl.net> References: <1107b64b01fb8e9a6c84359bb56881a6@freescale.com> <20050531105939.7486e071@dxpl.pdx.osdl.net> Mime-Version: 1.0 (Apple Message framework v730) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com> Cc: Netdev , Embedded PPC Linux list , Kumar Gala Content-Transfer-Encoding: 7bit From: Andy Fleming Subject: Re: RFC: PHY Abstraction Layer II Date: Wed, 1 Jun 2005 15:45:26 -0500 To: Stephen Hemminger X-Mailer: Apple Mail (2.730) X-archive-position: 1947 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: afleming@freescale.com Precedence: bulk X-list: netdev On May 31, 2005, at 12:59, Stephen Hemminger wrote: > Here are some patches: > * allow phy's to be modules > * use driver owner for ref count > * make local functions static where ever possible I agree with all these. > * get rid of bus read may sleep implication in comment. > since you are holding phy spin lock it better not!! But not this one. The phy_read and phy_write functions are reading from and writing to a bus. It is a reasonable implementation to have the operation block in the bus driver, and be awoken when an interrupt signals the operation is done. All of the phydev spinlocks have been arranged so as to prevent the lock being taken during interrupt time. Unless I've misunderstood spinlocks (it wouldn't be the first time), as long as the lock is never taken in interrupt time, it should be ok to hold the lock, and wait for an interrupt before clearing the lock. Andy Fleming From gwingerde@home.nl Wed Jun 1 13:58:29 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 13:58:33 -0700 (PDT) Received: from smtpq3.home.nl (smtpq3.home.nl [213.51.128.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51KwRXq011274 for ; Wed, 1 Jun 2005 13:58:29 -0700 Received: from [213.51.128.134] (port=47200 helo=smtp3.home.nl) by smtpq3.home.nl with esmtp (Exim 4.30) id 1DdaHH-0007SM-4Z; Wed, 01 Jun 2005 22:57:27 +0200 Received: from cc10088-a.ensch1.ov.home.nl ([217.123.128.105]:58103 helo=[192.168.14.1]) by smtp3.home.nl with esmtp (Exim 4.30) id 1DdaHF-0000hZ-Q1; Wed, 01 Jun 2005 22:57:25 +0200 Message-ID: <429E1FAB.6080503@home.nl> Date: Wed, 01 Jun 2005 22:50:51 +0200 From: Gertjan van Wingerde User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050322) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com, jgarzik@pobox.com Subject: [PATCH] ieee80211: Update generic definitions to latest specs. 
Content-Type: multipart/mixed; boundary="------------020800010603020503020809" X-AtHome-MailScanner-Information: Neem contact op met support@home.nl voor meer informatie X-AtHome-MailScanner: Found to be clean X-archive-position: 1948 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gwingerde@home.nl Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------020800010603020503020809 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi, Attached patch updates the definitions of the generic ieee80211 stack to the latest versions of the published 802.11x specification suite. Please review and apply. Signed-off-by: Gertjan van Wingerde --------------020800010603020503020809 Content-Type: text/plain; name="ieee80211.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ieee80211.diff" Index: include/net/ieee80211.h =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/include/net/ieee80211.h (mode:100644) +++ uncommitted/include/net/ieee80211.h (mode:100644) @@ -103,7 +103,7 @@ #define MAX_FRAG_THRESHOLD 2346U /* Frame control field constants */ -#define IEEE80211_FCTL_VERS 0x0002 +#define IEEE80211_FCTL_VERS 0x0003 #define IEEE80211_FCTL_FTYPE 0x000c #define IEEE80211_FCTL_STYPE 0x00f0 #define IEEE80211_FCTL_TODS 0x0100 @@ -111,8 +111,8 @@ #define IEEE80211_FCTL_MOREFRAGS 0x0400 #define IEEE80211_FCTL_RETRY 0x0800 #define IEEE80211_FCTL_PM 0x1000 -#define IEEE80211_FCTL_MOREDATA 0x2000 -#define IEEE80211_FCTL_WEP 0x4000 +#define IEEE80211_FCTL_MOREDATA 0x2000 +#define IEEE80211_FCTL_PROTECTEDFRAME 0x4000 #define IEEE80211_FCTL_ORDER 0x8000 #define IEEE80211_FTYPE_MGMT 0x0000 @@ -131,6 +131,7 @@ #define IEEE80211_STYPE_DISASSOC 0x00A0 #define IEEE80211_STYPE_AUTH 0x00B0 #define IEEE80211_STYPE_DEAUTH 0x00C0 +#define IEEE80211_STYPE_ACTION 
0x00D0 /* control */ #define IEEE80211_STYPE_PSPOLL 0x00A0 @@ -251,6 +252,7 @@ #define SNAP_SIZE sizeof(struct ieee80211_snap_hdr) +#define WLAN_FC_GET_VERS(fc) ((fc) & IEEE80211_FCTL_VERS) #define WLAN_FC_GET_TYPE(fc) ((fc) & IEEE80211_FCTL_FTYPE) #define WLAN_FC_GET_STYPE(fc) ((fc) & IEEE80211_FCTL_STYPE) @@ -271,6 +273,9 @@ #define WLAN_CAPABILITY_SHORT_PREAMBLE (1<<5) #define WLAN_CAPABILITY_PBCC (1<<6) #define WLAN_CAPABILITY_CHANNEL_AGILITY (1<<7) +#define WLAN_CAPABILITY_SPECTRUM_MGMT (1<<8) +#define WLAN_CAPABILITY_SHORT_SLOT_TIME (1<<10) +#define WLAN_CAPABILITY_OSSS_OFDM (1<<13) /* Status codes */ #define WLAN_STATUS_SUCCESS 0 @@ -285,9 +290,24 @@ #define WLAN_STATUS_AP_UNABLE_TO_HANDLE_NEW_STA 17 #define WLAN_STATUS_ASSOC_DENIED_RATES 18 /* 802.11b */ -#define WLAN_STATUS_ASSOC_DENIED_NOSHORT 19 +#define WLAN_STATUS_ASSOC_DENIED_NOSHORTPREAMBLE 19 #define WLAN_STATUS_ASSOC_DENIED_NOPBCC 20 #define WLAN_STATUS_ASSOC_DENIED_NOAGILITY 21 +/* 802.11h */ +#define WLAN_STATUS_ASSOC_DENIED_SPECTRUM_MGMT_REQUIRED 22 +#define WLAN_STATUS_ASSOC_REJECTED_POWER_CAP_UNACCEPTABLE 23 +#define WLAN_STATUS_ASSOC_REJECTED_SUPP_CHANNELS_UNACCEPTABLE 24 +/* 802.11g */ +#define WLAN_STATUS_ASSOC_DENIED_NOSHORTTIME 25 +#define WLAN_STATUS_ASSOC_DENIED_NODSSSOFDM 26 +/* 802.11i */ +#define WLAN_STATUS_INVALID_IE 40 +#define WLAN_STATUS_INVALID_GROUP_CIPHER 41 +#define WLAN_STATUS_INVALID_PAIRWISE_CIPHER 42 +#define WLAN_STATUS_INVALID_AKMP 43 +#define WLAN_STATUS_UNSUPP_RSN_VERSION 44 +#define WLAN_STATUS_INVALID_RSN_IE_CAP 45 +#define WLAN_STATUS_CIPHER_SUITE_REJECTED 46 /* Reason codes */ #define WLAN_REASON_UNSPECIFIED 1 @@ -299,6 +319,22 @@ #define WLAN_REASON_CLASS3_FRAME_FROM_NONASSOC_STA 7 #define WLAN_REASON_DISASSOC_STA_HAS_LEFT 8 #define WLAN_REASON_STA_REQ_ASSOC_WITHOUT_AUTH 9 +/* 802.11h */ +#define WLAN_REASON_DISASSOC_POWER_CAP_UNACCEPTABLE 10 +#define WLAN_REASON_DISASSOC_SUPP_CHANNELS_UNACCEPTABLE 11 +/* 802.11i */ +#define WLAN_REASON_INVALID_IE 13 +#define 
WLAN_REASON_MIC_FAILURE 14 +#define WLAN_REASON_4WAY_HANDSHAKE_TIMEOUT 15 +#define WLAN_REASON_GROUP_KEY_HANDSHAKE_TIMEOUT 16 +#define WLAN_REASON_IE_DIFFERENT 17 +#define WLAN_REASON_INVALID_GROUP_CIPHER 18 +#define WLAN_REASON_INVALID_PAIRWISE_CIPHER 19 +#define WLAN_REASON_INVALID_AKMP 20 +#define WLAN_REASON_UNSUPP_RSN_VERSION 21 +#define WLAN_REASON_INVALID_RSN_IE_CAP 22 +#define WLAN_REASON_IEEE8021X_FAILED 23 +#define WLAN_REASON_CIPHER_SUITE_REJECTED 24 #define IEEE80211_STATMASK_SIGNAL (1<<0) @@ -477,17 +513,32 @@ #define BEACON_PROBE_SSID_ID_POSITION 12 /* Management Frame Information Element Types */ -#define MFIE_TYPE_SSID 0 -#define MFIE_TYPE_RATES 1 -#define MFIE_TYPE_FH_SET 2 -#define MFIE_TYPE_DS_SET 3 -#define MFIE_TYPE_CF_SET 4 -#define MFIE_TYPE_TIM 5 -#define MFIE_TYPE_IBSS_SET 6 -#define MFIE_TYPE_CHALLENGE 16 -#define MFIE_TYPE_RSN 48 -#define MFIE_TYPE_RATES_EX 50 -#define MFIE_TYPE_GENERIC 221 +#define MFIE_TYPE_SSID 0 +#define MFIE_TYPE_RATES 1 +#define MFIE_TYPE_FH_SET 2 +#define MFIE_TYPE_DS_SET 3 +#define MFIE_TYPE_CF_SET 4 +#define MFIE_TYPE_TIM 5 +#define MFIE_TYPE_IBSS_SET 6 +#define MFIE_TYPE_COUNTRY 7 +#define MFIE_TYPE_HOP_PARAMS 8 +#define MFIE_TYPE_HOP_TABLE 9 +#define MFIE_TYPE_REQUEST 10 +#define MFIE_TYPE_CHALLENGE 16 +#define MFIE_TYPE_POWER_CONSTRAINT 32 +#define MFIE_TYPE_POWER_CAPABILITY 33 +#define MFIE_TYPE_TPC_REQUEST 34 +#define MFIE_TYPE_TPC_REPORT 35 +#define MFIE_TYPE_SUPP_CHANNELS 36 +#define MFIE_TYPE_CSA 37 +#define MFIE_TYPE_MEASURE_REQUEST 38 +#define MFIE_TYPE_MEASURE_REPORT 39 +#define MFIE_TYPE_QUIET 40 +#define MFIE_TYPE_IBSS_DFS 41 +#define MFIE_TYPE_ERP_INFO 42 +#define MFIE_TYPE_RSN 48 +#define MFIE_TYPE_RATES_EX 50 +#define MFIE_TYPE_GENERIC 221 struct ieee80211_info_element_hdr { u8 id; Index: net/ieee80211/ieee80211_rx.c =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/net/ieee80211/ieee80211_rx.c (mode:100644) +++ 
uncommitted/net/ieee80211/ieee80211_rx.c (mode:100644) @@ -440,7 +440,7 @@ crypt->ops->decrypt_mpdu == NULL)) crypt = NULL; - if (!crypt && (fc & IEEE80211_FCTL_WEP)) { + if (!crypt && (fc & IEEE80211_FCTL_PROTECTEDFRAME)) { /* This seems to be triggered by some (multicast?) * frames from other than current BSS, so just drop the * frames silently instead of filling system log with @@ -456,7 +456,7 @@ #ifdef NOT_YET if (type != WLAN_FC_TYPE_DATA) { if (type == WLAN_FC_TYPE_MGMT && stype == WLAN_FC_STYPE_AUTH && - fc & IEEE80211_FCTL_WEP && ieee->host_decrypt && + fc & IEEE80211_FCTL_PROTECTEDFRAME && ieee->host_decrypt && (keyidx = hostap_rx_frame_decrypt(ieee, skb, crypt)) < 0) { printk(KERN_DEBUG "%s: failed to decrypt mgmt::auth " @@ -557,7 +557,7 @@ /* skb: hdr + (possibly fragmented, possibly encrypted) payload */ - if (ieee->host_decrypt && (fc & IEEE80211_FCTL_WEP) && + if (ieee->host_decrypt && (fc & IEEE80211_FCTL_PROTECTEDFRAME) && (keyidx = ieee80211_rx_frame_decrypt(ieee, skb, crypt)) < 0) goto rx_dropped; @@ -565,7 +565,7 @@ /* skb: hdr + (possibly fragmented) plaintext payload */ // PR: FIXME: hostap has additional conditions in the "if" below: - // ieee->host_decrypt && (fc & IEEE80211_FCTL_WEP) && + // ieee->host_decrypt && (fc & IEEE80211_FCTL_PROTECTEDFRAME) && if ((frag != 0 || (fc & IEEE80211_FCTL_MOREFRAGS))) { int flen; struct sk_buff *frag_skb = ieee80211_frag_cache_get(ieee, hdr); @@ -621,12 +621,12 @@ /* skb: hdr + (possible reassembled) full MSDU payload; possibly still * encrypted/authenticated */ - if (ieee->host_decrypt && (fc & IEEE80211_FCTL_WEP) && + if (ieee->host_decrypt && (fc & IEEE80211_FCTL_PROTECTEDFRAME) && ieee80211_rx_frame_decrypt_msdu(ieee, skb, keyidx, crypt)) goto rx_dropped; hdr = (struct ieee80211_hdr *) skb->data; - if (crypt && !(fc & IEEE80211_FCTL_WEP) && !ieee->open_wep) { + if (crypt && !(fc & IEEE80211_FCTL_PROTECTEDFRAME) && !ieee->open_wep) { if (/*ieee->ieee802_1x &&*/ ieee80211_is_eapol_frame(ieee, skb)) { 
#ifdef CONFIG_IEEE80211_DEBUG @@ -647,7 +647,7 @@ } #ifdef CONFIG_IEEE80211_DEBUG - if (crypt && !(fc & IEEE80211_FCTL_WEP) && + if (crypt && !(fc & IEEE80211_FCTL_PROTECTEDFRAME) && ieee80211_is_eapol_frame(ieee, skb)) { struct eapol *eap = (struct eapol *)(skb->data + 24); @@ -656,7 +656,7 @@ } #endif - if (crypt && !(fc & IEEE80211_FCTL_WEP) && !ieee->open_wep && + if (crypt && !(fc & IEEE80211_FCTL_PROTECTEDFRAME) && !ieee->open_wep && !ieee80211_is_eapol_frame(ieee, skb)) { IEEE80211_DEBUG_DROP( "dropped unencrypted RX data " Index: net/ieee80211/ieee80211_tx.c =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/net/ieee80211/ieee80211_tx.c (mode:100644) +++ uncommitted/net/ieee80211/ieee80211_tx.c (mode:100644) @@ -314,7 +314,7 @@ if (encrypt) fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA | - IEEE80211_FCTL_WEP; + IEEE80211_FCTL_PROTECTEDFRAME; else fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA; Index: drivers/net/wireless/atmel.c =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/drivers/net/wireless/atmel.c (mode:100644) +++ uncommitted/drivers/net/wireless/atmel.c (mode:100644) @@ -867,7 +867,7 @@ header.duration_id = 0; header.seq_ctl = 0; if (priv->wep_is_on) - frame_ctl |= IEEE80211_FCTL_WEP; + frame_ctl |= IEEE80211_FCTL_PROTECTEDFRAME; if (priv->operating_mode == IW_MODE_ADHOC) { memcpy(&header.addr1, skb->data, 6); memcpy(&header.addr2, dev->dev_addr, 6); @@ -1117,7 +1117,7 @@ /* probe for CRC use here if needed once five packets have arrived with the same crc status, we assume we know what's happening and stop probing */ if (priv->probe_crc) { - if (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_WEP)) { + if (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_PROTECTEDFRAME)) { priv->do_rx_crc = probe_crc(priv, rx_packet_loc, msdu_size); } else { priv->do_rx_crc = probe_crc(priv, rx_packet_loc + 24, msdu_size - 24); @@ 
-1132,7 +1132,7 @@ } /* don't CRC header when WEP in use */ - if (priv->do_rx_crc && (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_WEP))) { + if (priv->do_rx_crc && (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_PROTECTEDFRAME))) { crc = crc32_le(0xffffffff, (unsigned char *)&header, 24); } msdu_size -= 24; /* header */ @@ -2677,7 +2677,7 @@ auth.alg = cpu_to_le16(C80211_MGMT_AAN_SHAREDKEY); /* no WEP for authentication frames with TrSeqNo 1 */ if (priv->CurrentAuthentTransactionSeqNum != 1) - header.frame_ctl |= cpu_to_le16(IEEE80211_FCTL_WEP); + header.frame_ctl |= cpu_to_le16(IEEE80211_FCTL_PROTECTEDFRAME); } else { auth.alg = cpu_to_le16(C80211_MGMT_AAN_OPENSYSTEM); } --------------020800010603020503020809-- From shemminger@osdl.org Wed Jun 1 14:20:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 14:20:21 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51LKIXq012488 for ; Wed, 1 Jun 2005 14:20:19 -0700 Received: from [10.8.0.74] (fw.osdl.org [65.172.181.6]) (authenticated bits=0) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j51LJFj9029727 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Wed, 1 Jun 2005 14:19:16 -0700 Message-ID: <429E2653.6010101@osdl.org> Date: Wed, 01 Jun 2005 14:19:15 -0700 From: Stephen Hemminger User-Agent: Mozilla Thunderbird 1.0.2-1.3.3 (X11/20050513) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andy Fleming CC: Netdev , Embedded PPC Linux list , Kumar Gala Subject: Re: RFC: PHY Abstraction Layer II References: <1107b64b01fb8e9a6c84359bb56881a6@freescale.com> <20050531105939.7486e071@dxpl.pdx.osdl.net> <92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com> In-Reply-To: <92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1949 
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shemminger@osdl.org
Precedence: bulk
X-list: netdev

Andy Fleming wrote:

>
> On May 31, 2005, at 12:59, Stephen Hemminger wrote:
>
>> Here are some patches:
>>     * allow phy's to be modules
>>     * use driver owner for ref count
>>     * make local functions static where ever possible
>
> I agree with all these.
>
>>     * get rid of bus read may sleep implication in comment.
>>        since you are holding phy spin lock it better not!!
>
> But not this one.  The phy_read and phy_write functions are reading
> from and writing to a bus.  It is a reasonable implementation to have
> the operation block in the bus driver, and be awoken when an
> interrupt signals the operation is done.  All of the phydev spinlocks
> have been arranged so as to prevent the lock being taken during
> interrupt time.
>
> Unless I've misunderstood spinlocks (it wouldn't be the first time),
> as long as the lock is never taken in interrupt time, it should be ok
> to hold the lock, and wait for an interrupt before clearing the lock.

The problem is that sleeping is defined in the Linux kernel as waiting
on a mutual exclusion primitive (like a semaphore) that puts the current
thread to sleep.  It is not legal to sleep with a spinlock held.
In the phy_read code you do:

    spin_lock_bh(&bus->mdio_lock);
    retval = bus->read(bus, phydev->addr, regnum);
    spin_unlock_bh(&bus->mdio_lock);

If the bus->read function were to do something like start a request and
wait on a semaphore, then you would be sleeping with a spinlock held.
So bus->read cannot sleep (as sleep is defined in the Linux kernel).
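
The rule described above can be modelled in user space.  What follows is
only a rough sketch under stated assumptions, not kernel code: every
model_* name, the atomic_depth counter and the fake register file are
hypothetical stand-ins, and model_might_sleep() merely imitates the idea
of a debug check that complains when a sleeping call is made while a
spinlock is held.  It contrasts a bus read that wants to sleep with one
that only polls.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_EATOMIC (-1)  /* hypothetical error: tried to sleep in atomic context */

static int atomic_depth;    /* > 0 while a spinlock-style lock is held (model only) */

struct model_bus {
	uint16_t regs[32];  /* fake PHY register file */
	int (*read)(struct model_bus *bus, unsigned regnum);
};

/* Imitates a "may I sleep here?" debug check; not a kernel API. */
static int model_might_sleep(void)
{
	return atomic_depth > 0 ? MODEL_EATOMIC : 0;
}

/* A bus read that would block until a completion interrupt arrives. */
int sleeping_read(struct model_bus *bus, unsigned regnum)
{
	int err = model_might_sleep();

	if (err)
		return err;
	return bus->regs[regnum & 31];
}

/* A bus read that never sleeps, so it is fine under a spinlock. */
int polling_read(struct model_bus *bus, unsigned regnum)
{
	return bus->regs[regnum & 31];
}

/* Shape of the phy_read quoted above: the callback runs in atomic context. */
int phy_read_spinlocked(struct model_bus *bus, unsigned regnum)
{
	int retval;

	assert(bus && bus->read);
	atomic_depth++;                  /* models spin_lock_bh()   */
	retval = bus->read(bus, regnum);
	atomic_depth--;                  /* models spin_unlock_bh() */
	return retval;
}

/* Alternative: guard the bus with a lock that is allowed to sleep. */
int phy_read_sleepable(struct model_bus *bus, unsigned regnum)
{
	assert(bus && bus->read);
	/* A semaphore or mutex would be taken here; atomic_depth stays 0. */
	return bus->read(bus, regnum);
}
```

Under these assumptions there are two ways out: keep the spinlock and
require every bus->read to be non-sleeping, or guard the bus with a lock
that is allowed to sleep.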
From mchan@broadcom.com Wed Jun 1 14:32:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 14:32:36 -0700 (PDT) Received: from MMS2.broadcom.com (mms2.broadcom.com [216.31.210.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51LWXXq013485 for ; Wed, 1 Jun 2005 14:32:33 -0700 Received: from 10.10.64.121 by MMS2.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Wed, 01 Jun 2005 14:31:20 -0700 X-Server-Uuid: 1F20ACF3-9CAF-44F7-AB47-F294E2D5B4EA Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Wed, 1 Jun 2005 14:31:18 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BBO09844; Wed, 1 Jun 2005 14:31:15 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id OAA04566; Wed, 1 Jun 2005 14:31:15 -0700 (PDT) Received: from 10.7.18.177 ([10.7.18.177]) by NT-IRVA-0741.brcm.ad.broadcom.com ([10.8.194.54]) with Microsoft Exchange Server HTTP-DAV ; Wed, 1 Jun 2005 21:31:14 +0000 Received: from rh4 by nt-irva-0741; 01 Jun 2005 13:33:39 -0700 Subject: Re: Locking model for NAPI drivers From: "Michael Chan" To: "David S. 
Miller"
cc: netdev@oss.sgi.com
In-Reply-To: <20050531.154847.63995530.davem@davemloft.net>
References: <20050531.154847.63995530.davem@davemloft.net>
Date: Wed, 01 Jun 2005 13:33:39 -0700
Message-ID: <1117658019.4310.58.camel@rh4>
MIME-Version: 1.0
X-Mailer: Evolution 2.0.2 (2.0.2-3)
X-WSS-ID: 6E80F6A21VO4407082-01-01
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-archive-position: 1950
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: mchan@broadcom.com
Precedence: bulk
X-list: netdev

On Tue, 2005-05-31 at 15:48 -0700, David S. Miller wrote:

> Once we make this transformation, we need some way to synchronize
> with the IRQ handler when shutting down the device or making major
> configuration changes to the chip.
>
> The idea I came up with is a two-bit atomic bitmask.  When base
> level code wants to quiesce interrupt processing, it takes the
> necessary driver spinlocks, sets the "SYNC" bit in the bitmask,
> forces an IRQ to be asserted by the tg3 card, then waits for the
> COMPLETE bit to get set by the interrupt handler.
>

During light testing, I found a race condition that caused
tg3_irq_quiesce() to spin forever. The race condition is shown below.

    CPU1                            CPU2

    tg3_interrupt_tagged()
                                    tg3_netif_stop()
                                    netif_poll_disable()
    netif_rx_schedule()
    will do nothing
                                    tg3_full_lock()
                                    tg3_irq_quiesce()

Because netif_poll_disable() is called, netif_rx_schedule() will do
nothing in the interrupt handler. As a result, tg3_poll() will never be
called to re-enable interrupts. Since interrupts are disabled,
tg3_irq_quiesce() will not be able to set the interrupts and cause the
interrupt handler to be called again, and therefore will wait forever.

Even adding another call to tg3_irq_sync() at the end of the interrupt
handler does not eliminate the race condition.

I suppose we can enable interrupts in tg3_irq_quiesce() after setting
the SYNC bit.
From shemminger@osdl.org Wed Jun 1 14:38:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 14:38:41 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51LcbXq014435 for ; Wed, 1 Jun 2005 14:38:37 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j51LbZjA032260 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 1 Jun 2005 14:37:35 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j51LbYcg019087; Wed, 1 Jun 2005 14:37:35 -0700 Date: Wed, 1 Jun 2005 14:37:34 -0700 From: Stephen Hemminger To: Gertjan van Wingerde Cc: netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: [PATCH] ieee80211: Update generic definitions to latest specs. Message-ID: <20050601143734.3b7a49ca@dxpl.pdx.osdl.net> In-Reply-To: <429E1FAB.6080503@home.nl> References: <429E1FAB.6080503@home.nl> Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1951 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Wed, 01 Jun 2005 22:50:51 +0200 Gertjan van Wingerde wrote: > Hi, > > Attached patch updates the definitions of the generic ieee80211 stack to > the latest versions of the published 802.11x specification suite. > Please review and apply. 
>
> Signed-off-by: Gertjan van Wingerde
>

Could you change the elements that fit to be enums instead of defines?
For example:

/* Management Frame Information Element Types */
enum ieee80211_mfie {
	MFIE_TYPE_SSID = 0,
	MFIE_TYPE_RATES = 1,
	MFIE_TYPE_FH_SET = 2,
	...

From shemminger@osdl.org Wed Jun 1 14:42:24 2005
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 14:42:28 -0700 (PDT)
Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51LgOXq015034 for ; Wed, 1 Jun 2005 14:42:24 -0700
Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j51LfNjA032676 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 1 Jun 2005 14:41:24 -0700
Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j51LfN66019413; Wed, 1 Jun 2005 14:41:23 -0700
Date: Wed, 1 Jun 2005 14:41:23 -0700
From: Stephen Hemminger
To: Andy Fleming
Cc: Netdev , Embedded PPC Linux list , Kumar Gala
Subject: Re: RFC: PHY Abstraction Layer II
Message-ID: <20050601144123.2bc11c06@dxpl.pdx.osdl.net>
In-Reply-To: <92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com>
References: <1107b64b01fb8e9a6c84359bb56881a6@freescale.com> <20050531105939.7486e071@dxpl.pdx.osdl.net> <92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com>
Organization: Open Source Development Lab
X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu)
X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$
 /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-MIMEDefang-Filter: osdl$Revision: 1.109 $
X-Scanned-By: MIMEDefang 2.36
X-archive-position: 1952
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender:
shemminger@osdl.org Precedence: bulk X-list: netdev On Wed, 1 Jun 2005 15:45:26 -0500 Andy Fleming wrote: > > On May 31, 2005, at 12:59, Stephen Hemminger wrote: > > > Here are some patches: > > * allow phy's to be modules > > * use driver owner for ref count > > * make local functions static where ever possible > > I agree with all these. > > > * get rid of bus read may sleep implication in comment. > > since you are holding phy spin lock it better not!! > On a different note, I am not sure that using sysfs/kobject bus object is the right thing for this object. Isn't the phy instance really just an kobject whose parent is the network device? I can't see a 1 to N relationship between phy bus and phy objects existing. The main use I can see for being a driver object is to catch suspend/resume, and wouldn't you want that to be tied to the network device. From davem@davemloft.net Wed Jun 1 15:22:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 15:23:02 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51MMwXq017804 for ; Wed, 1 Jun 2005 15:22:59 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Ddbag-0004kn-VP; Wed, 01 Jun 2005 15:21:35 -0700 Date: Wed, 01 Jun 2005 15:21:34 -0700 (PDT) Message-Id: <20050601.152134.120445266.davem@davemloft.net> To: mchan@broadcom.com Cc: netdev@oss.sgi.com Subject: Re: Locking model for NAPI drivers From: "David S. 
Miller"
In-Reply-To: <1117658019.4310.58.camel@rh4>
References: <20050531.154847.63995530.davem@davemloft.net> <1117658019.4310.58.camel@rh4>
X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 1953
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev

From: "Michael Chan"
Date: Wed, 01 Jun 2005 13:33:39 -0700

> I suppose we can enable interrupts in tg3_irq_quiesce() after setting
> the SYNC bit.

Since the caller shuts down NAPI ->poll(), after setting the SYNC bit
we can just check the MAILBOX register, and if a '1' is there just
return.  Does one need to mask out the upper bits of the register in
order to avoid seeing the IRQ tag in such a comparison?

Another potential problem is if the chip is hung for some reason, and
even though an interrupt is asserted it does not send the interrupt.
We'd hang in this case as well.  Therefore it may be wise to add a
timeout to the COMPLETE bit polling loop in order to handle such cases
properly.
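
The quiesce logic discussed in this message can be sketched as a small
user-space model.  This is only an illustration under assumptions, not
the tg3 code: struct model_nic, model_irq_handler(), model_irq_quiesce(),
the irq_delay knob that simulates when the forced interrupt arrives, and
the poll-count style of timeout are all hypothetical.  Only the two-bit
SYNC/COMPLETE mask, the "mailbox reads back 1" shortcut and the idea of
a bounded wait come from the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define IRQ_SYNC     (1u << 0)  /* base level asks the handler to check in */
#define IRQ_COMPLETE (1u << 1)  /* handler acknowledges the request        */

struct model_nic {
	atomic_uint irq_state;  /* the two-bit SYNC/COMPLETE mask             */
	uint32_t    mailbox;    /* 1 means chip interrupts are disabled       */
	int         irq_delay;  /* polls until the fake IRQ fires; < 0 = hung */
};

/* What the interrupt handler would do when it notices SYNC. */
void model_irq_handler(struct model_nic *nic)
{
	if (atomic_load(&nic->irq_state) & IRQ_SYNC)
		atomic_fetch_or(&nic->irq_state, IRQ_COMPLETE);
}

/* Returns 0 once interrupt processing is quiesced, -1 on timeout. */
int model_irq_quiesce(struct model_nic *nic, int max_polls)
{
	assert(max_polls >= 0);
	atomic_store(&nic->irq_state, IRQ_SYNC);

	/*
	 * The caller has already shut down ->poll(); if the mailbox holds 1,
	 * interrupts are masked and nothing will re-enable them, so there is
	 * no handler left to wait for.
	 */
	if (nic->mailbox == 1)
		return 0;

	for (int i = 0; i < max_polls; i++) {
		/* Simulate the forced interrupt arriving after irq_delay polls. */
		if (nic->irq_delay >= 0 && i >= nic->irq_delay)
			model_irq_handler(nic);
		if (atomic_load(&nic->irq_state) & IRQ_COMPLETE)
			return 0;
	}
	return -1;  /* chip hung or interrupt never delivered */
}
```

A real driver would bound the loop by elapsed time rather than by an
iteration count; the count just keeps the model deterministic.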
From mchan@broadcom.com Wed Jun 1 15:32:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 15:32:58 -0700 (PDT) Received: from MMS2.broadcom.com (mms2.broadcom.com [216.31.210.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51MWsXq018653 for ; Wed, 1 Jun 2005 15:32:55 -0700 Received: from 10.10.64.121 by MMS2.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Wed, 01 Jun 2005 15:31:50 -0700 X-Server-Uuid: 1F20ACF3-9CAF-44F7-AB47-F294E2D5B4EA Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Wed, 1 Jun 2005 15:31:49 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BBP11233; Wed, 1 Jun 2005 15:31:45 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id PAA24578; Wed, 1 Jun 2005 15:31:45 -0700 (PDT) Received: from 10.7.18.177 ([10.7.18.177]) by NT-IRVA-0741.brcm.ad.broadcom.com ([10.8.194.54]) with Microsoft Exchange Server HTTP-DAV ; Wed, 1 Jun 2005 22:31:45 +0000 Received: from rh4 by nt-irva-0741; 01 Jun 2005 14:34:10 -0700 Subject: Re: Locking model for NAPI drivers From: "Michael Chan" To: "David S. 
Miller"
cc: netdev@oss.sgi.com
In-Reply-To: <20050601.152134.120445266.davem@davemloft.net>
References: <20050531.154847.63995530.davem@davemloft.net> <1117658019.4310.58.camel@rh4> <20050601.152134.120445266.davem@davemloft.net>
Date: Wed, 01 Jun 2005 14:34:10 -0700
Message-ID: <1117661650.4310.62.camel@rh4>
MIME-Version: 1.0
X-Mailer: Evolution 2.0.2 (2.0.2-3)
X-WSS-ID: 6E80E8DC1VO4417184-01-01
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-archive-position: 1954
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: mchan@broadcom.com
Precedence: bulk
X-list: netdev

On Wed, 2005-06-01 at 15:21 -0700, David S. Miller wrote:

> From: "Michael Chan"
> Date: Wed, 01 Jun 2005 13:33:39 -0700
>
> > I suppose we can enable interrupts in tg3_irq_quiesce() after setting
> > the SYNC bit.
>
> Since the caller shuts down NAPI ->poll(), after setting the SYNC bit
> we can just check the MAILBOX register, and if a '1' is there just
> return.  Does one need to mask out the upper bits of the register in
> order to avoid seeing the IRQ tag in such a comparison?
>

No, just check for the value 1 since that's the value we use to disable
interrupts. The value read back will always be 1 if 1 was previously
written to it.
From afleming@freescale.com Wed Jun 1 15:38:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 15:38:03 -0700 (PDT) Received: from az33egw02.freescale.net (az33egw02.freescale.net [192.88.158.103]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51MbxXq019309 for ; Wed, 1 Jun 2005 15:37:59 -0700 Received: from az33smr02.freescale.net (az33smr02.freescale.net [10.64.34.200]) by az33egw02.freescale.net (8.12.11/az33egw02) with ESMTP id j51Mf5oC009076; Wed, 1 Jun 2005 15:41:06 -0700 (MST) Received: from [10.82.17.56] ([10.82.17.56]) by az33smr02.freescale.net (8.13.1/8.13.0) with ESMTP id j51Me9Xo018231; Wed, 1 Jun 2005 17:40:10 -0500 (CDT) In-Reply-To: <20050601144123.2bc11c06@dxpl.pdx.osdl.net> References: <1107b64b01fb8e9a6c84359bb56881a6@freescale.com> <20050531105939.7486e071@dxpl.pdx.osdl.net> <92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com> <20050601144123.2bc11c06@dxpl.pdx.osdl.net> Mime-Version: 1.0 (Apple Message framework v730) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <9A2D608A-D818-455B-96F4-ED42413556C0@freescale.com> Cc: Netdev , Embedded PPC Linux list , Kumar Gala Content-Transfer-Encoding: 7bit From: Andy Fleming Subject: Re: RFC: PHY Abstraction Layer II Date: Wed, 1 Jun 2005 17:36:54 -0500 To: Stephen Hemminger X-Mailer: Apple Mail (2.730) X-archive-position: 1955 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: afleming@freescale.com Precedence: bulk X-list: netdev On Jun 1, 2005, at 16:41, Stephen Hemminger wrote: > On Wed, 1 Jun 2005 15:45:26 -0500 > Andy Fleming wrote: >> >>> * get rid of bus read may sleep implication in comment. >>> since you are holding phy spin lock it better not!! >>> >> >> > > On a different note, I am not sure that using sysfs/kobject bus object > is the right thing for this object. Isn't the phy instance really > just > an kobject whose parent is the network device? 
> I can't see a 1 to N relationship between phy bus and phy objects
> existing.

Well, the MII Management bus is, in fact, a bus.  When a network driver
wants to modify a PHY, it must access that bus.  Many Ethernet
controllers have a 1 to 1 relationship, since a typical NIC is a PCI
card with 1 Ethernet port (meaning one controller, and one PHY).
However, many systems have multiple Ethernet controllers attached to
one bus, which configures multiple PHYs.

Until now, these systems have relied on luck to prevent multiple
simultaneous accesses to the same bus.  This tends to work because all
of the PHY support is contained within the Ethernet driver, so it is
easy for such drivers to ensure that only one PHY transaction is done
at a time.  This system begins to fall apart, though, when the PHY
drivers start operating more independently to react to changing PHY
state.

It really begins to fall apart if you have multiple drivers trying to
access a shared bus.  For instance, the 8560 ADS board has 2 gigabit
Ethernet ports controlled by the gianfar driver, and 2 10/100 ports in
the CPM subsystem, controlled by the fcc_enet driver.  These two
drivers each have an access point for the bus, and they use different
mechanisms (one is a bit-bang interface, and one is register-based).
Using the new abstraction, it is possible for the FCC driver to use the
gianfar driver's bus, thus saving code and reducing complexity.

>
> The main use I can see for being a driver object is to catch
> suspend/resume,
> and wouldn't you want that to be tied to the network device.

It would be quite easy for the network driver to suspend or resume the
PHY and bus objects under the new abstraction.  However, if eth0 is
suspended, should it suspend the whole bus, and all the PHYs on it?  By
making the MII bus an independent entity, eth0 can be suspended, and it
can choose to suspend its PHY, but eth1 can continue to access its PHY
over the bus, since those aren't suspended.
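
The 1 to N relationship argued for above can be pictured with a small
user-space model.  This is a sketch under assumptions only, not the
proposed PHY layer: struct model_mii_bus, struct model_phy, the
transactions counter and the suspended flag are hypothetical names
invented for illustration.  It shows two PHYs, owned by different MAC
drivers, going through one bus object, and one PHY being suspended
while the other stays reachable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MII_MAX_PHYS 32
#define MII_MAX_REGS 32

/* One bus object: the single place every PHY access passes through. */
struct model_mii_bus {
	uint16_t regs[MII_MAX_PHYS][MII_MAX_REGS];  /* fake PHY register files */
	unsigned transactions;                      /* all traffic is counted here */
};

/* One PHY: an address on some bus, owned by whichever MAC driver uses it. */
struct model_phy {
	struct model_mii_bus *bus;
	unsigned addr;
	int suspended;
};

/* Returns the register value, or -1 if this PHY is suspended. */
int model_phy_read(struct model_phy *phy, unsigned regnum)
{
	assert(phy->bus != NULL);
	if (phy->suspended)
		return -1;
	/* A real bus would take its lock here to serialize transactions. */
	phy->bus->transactions++;
	return phy->bus->regs[phy->addr % MII_MAX_PHYS][regnum % MII_MAX_REGS];
}

/* Returns 0 on success, or -1 if this PHY is suspended. */
int model_phy_write(struct model_phy *phy, unsigned regnum, uint16_t val)
{
	assert(phy->bus != NULL);
	if (phy->suspended)
		return -1;
	phy->bus->transactions++;
	phy->bus->regs[phy->addr % MII_MAX_PHYS][regnum % MII_MAX_REGS] = val;
	return 0;
}
```

Because both PHYs point at the same bus object, a lock placed in that
object would cover accesses from either driver, which is the point of
treating the bus as its own entity.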
From afleming@freescale.com Wed Jun 1 15:43:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 15:44:01 -0700 (PDT) Received: from az33egw01.freescale.net (az33egw01.freescale.net [192.88.158.102]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j51MhwXq020018 for ; Wed, 1 Jun 2005 15:43:58 -0700 Received: from az33smr02.freescale.net (az33smr02.freescale.net [10.64.34.200]) by az33egw01.freescale.net (8.12.11/az33egw01) with ESMTP id j51MmPBu017065; Wed, 1 Jun 2005 15:48:25 -0700 (MST) Received: from [10.82.17.56] ([10.82.17.56]) by az33smr02.freescale.net (8.13.1/8.13.0) with ESMTP id j51MkCFY019534; Wed, 1 Jun 2005 17:46:12 -0500 (CDT) In-Reply-To: <429E2653.6010101@osdl.org> References: <1107b64b01fb8e9a6c84359bb56881a6@freescale.com> <20050531105939.7486e071@dxpl.pdx.osdl.net> <92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com> <429E2653.6010101@osdl.org> Mime-Version: 1.0 (Apple Message framework v730) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: Cc: Netdev , Embedded PPC Linux list , Kumar Gala Content-Transfer-Encoding: 7bit From: Andy Fleming Subject: Re: RFC: PHY Abstraction Layer II Date: Wed, 1 Jun 2005 17:42:56 -0500 To: Stephen Hemminger X-Mailer: Apple Mail (2.730) X-archive-position: 1956 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: afleming@freescale.com Precedence: bulk X-list: netdev On Jun 1, 2005, at 16:19, Stephen Hemminger wrote: > Andy Fleming wrote: >> >> But not this one. The phy_read and phy_write functions are >> reading from and writing to a bus. It is a reasonable >> implementation to have the operation block in the bus driver, and >> be awoken when an interrupt signals the operation is done. All >> of the phydev spinlocks have been arranged so as to prevent the >> lock being taken during interrupt time. 
>> >> Unless I've misunderstood spinlocks (it wouldn't be the first >> time), as long as the lock is never taken in interrupt time, it >> should be ok to hold the lock, and wait for an interrupt before >> clearing the lock. >> > > > The problem is that sleeping is defined in the linux kernel as > meaning waiting on a mutual exclusion > primitive (like semaphore) that puts the current thread to sleep. > It is not legal to sleep with a spinlock held. > In the phy_read code you do: > spin_lock_bh(&bus->mdio_lock); > retval = bus->read(bus, phydev->addr, regnum); > spin_unlock_bh(&bus->mdio_lock); > > If the bus->read function were to do something like start a request > and wait on a semaphore, then > you would be sleeping with a spin lock held. So bus->read can not > sleep! (as sleep is defined in the > linux kernel). Hmm... I understand this reasoning, but I still need a way for a bus read to wait for an interrupt before returning. I suppose I can just have the code spin while it waits, but that seems wrong, somehow. I'm open to any suggestions. 
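One possible way out, sketched here as a user-space analogy rather than kernel code, is to guard the bus with a lock that is allowed to sleep (a semaphore or mutex) instead of a spinlock, and to call such reads only from contexts that may block. In the sketch below a pthread mutex stands in for the sleeping lock and a condition variable stands in for "wait until the interrupt says the transfer is done". The names bus_model, bus_model_irq and bus_model_read are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical model of a bus whose reads complete via an interrupt. */
struct bus_model {
	pthread_mutex_t bus_lock;  /* sleeping lock: one transaction at a time */
	pthread_mutex_t done_lock; /* protects 'done' and 'result' */
	pthread_cond_t done_cond;  /* signalled when a transfer completes */
	int done;
	uint16_t result;
};

void bus_model_init(struct bus_model *b)
{
	pthread_mutex_init(&b->bus_lock, NULL);
	pthread_mutex_init(&b->done_lock, NULL);
	pthread_cond_init(&b->done_cond, NULL);
	b->done = 0;
	b->result = 0;
}

/* Stand-in for the interrupt handler: publish the value and wake the reader. */
void bus_model_irq(struct bus_model *b, uint16_t value)
{
	pthread_mutex_lock(&b->done_lock);
	b->result = value;
	b->done = 1;
	pthread_cond_signal(&b->done_cond);
	pthread_mutex_unlock(&b->done_lock);
}

/*
 * Blocking read. The caller may sleep here, which is acceptable only
 * because bus_lock is a sleeping lock and not a spinlock.
 */
uint16_t bus_model_read(struct bus_model *b)
{
	uint16_t value;

	pthread_mutex_lock(&b->bus_lock);
	/* A real driver would start the hardware transaction here. */
	pthread_mutex_lock(&b->done_lock);
	while (!b->done)
		pthread_cond_wait(&b->done_cond, &b->done_lock);
	value = b->result;
	b->done = 0; /* consume the completion */
	pthread_mutex_unlock(&b->done_lock);
	pthread_mutex_unlock(&b->bus_lock);
	return value;
}
```

The trade-off is that a read implemented this way cannot be issued from interrupt or softirq context, so callers there would have to defer the work to a context that can block.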
From gwingerde@home.nl Wed Jun 1 20:55:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 20:55:50 -0700 (PDT) Received: from smtpq1.home.nl (smtpq1.home.nl [213.51.128.196]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j523thXq012188 for ; Wed, 1 Jun 2005 20:55:46 -0700 Received: from [213.51.128.134] (port=59856 helo=smtp3.home.nl) by smtpq1.home.nl with esmtp (Exim 4.30) id 1Ddgn8-0007D7-Al; Thu, 02 Jun 2005 05:54:46 +0200 Received: from cc10088-a.ensch1.ov.home.nl ([217.123.128.105]:59933 helo=[192.168.14.1]) by smtp3.home.nl with esmtp (Exim 4.30) id 1Ddgn6-00071G-DC; Thu, 02 Jun 2005 05:54:44 +0200 Message-ID: <429E8175.7010609@home.nl> Date: Thu, 02 Jun 2005 05:48:05 +0200 From: Gertjan van Wingerde User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050322) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Stephen Hemminger CC: netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: [PATCH] ieee80211: Update generic definitions to latest specs. References: <429E1FAB.6080503@home.nl> <20050601143734.3b7a49ca@dxpl.pdx.osdl.net> In-Reply-To: <20050601143734.3b7a49ca@dxpl.pdx.osdl.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-AtHome-MailScanner-Information: Neem contact op met support@home.nl voor meer informatie X-AtHome-MailScanner: Found to be clean X-archive-position: 1958 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gwingerde@home.nl Precedence: bulk X-list: netdev Stephen Hemminger wrote: >On Wed, 01 Jun 2005 22:50:51 +0200 >Gertjan van Wingerde wrote: > > > >>Hi, >> >>Attached patch updates the definitions of the generic ieee80211 stack to >>the latest versions of the published 802.11x specification suite. >>Please review and apply. 
>> >>Signed-off-by: Gertjan van Wingerde >> >> >> >Could you change the elements that fix to be enum's instead of define's > >example: > >/* Management Frame Information Element Types */ >enum ieee80211_mfie { > MFIE_TYPE_SSID = 0, > MFIE_TYPE_RATES = 1, > MFIE_TYPE_FH_SET= 2, >... > Hi Stephen, Well, my patch is really just an add-on to the existing code. Converting to enums is really a follow-up patch that can be applied on top of this one. I'm happy to produce a patch if everybody agrees. Jeff, any opinions on this? Best regards, Gertjan. From raghunathan.venkatesan@wipro.com Wed Jun 1 20:54:49 2005 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Jun 2005 20:54:52 -0700 (PDT) Received: from wip-ec-wd.wipro.com (wip-ec-wd.wipro.com [203.101.113.39]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j523sjXq012029 for ; Wed, 1 Jun 2005 20:54:48 -0700 Received: from wip-ec-wd.wipro.com (localhost.wipro.com [127.0.0.1]) by localhost (Postfix) with ESMTP id CA4D8205E7; Thu, 2 Jun 2005 09:14:50 +0530 (IST) Received: from blr-ec-bh01.wipro.com (unknown [10.201.50.91]) by wip-ec-wd.wipro.com (Postfix) with ESMTP id B055A205E5; Thu, 2 Jun 2005 09:14:50 +0530 (IST) Received: from chn-snr-bh2.wipro.com ([10.145.50.92]) by blr-ec-bh01.wipro.com with Microsoft SMTPSVC(6.0.3790.211); Thu, 2 Jun 2005 09:23:44 +0530 Received: from CHN-SNR-MBX01.wipro.com ([10.145.50.181]) by chn-snr-bh2.wipro.com with Microsoft SMTPSVC(6.0.3790.0); Thu, 2 Jun 2005 09:23:43 +0530 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: Unable to handle kernel paging request at virtual address 04000460 Date: Thu, 2 Jun 2005 09:20:21 +0530 Message-ID: <438662DA48DCAA41B1DF648BD4BD76C0E461B8@CHN-SNR-MBX01.wipro.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Unable to handle kernel paging request at virtual address 04000460 Thread-Index: 
AcVm23XgN0MUhNi8RfmehJZdjhz+YAASlBxQ From: To: Cc: , , X-OriginalArrivalTime: 02 Jun 2005 03:53:43.0508 (UTC) FILETIME=[AB960140:01C56726] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j523sjXq012029 X-archive-position: 1957 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghunathan.venkatesan@wipro.com Precedence: bulk X-list: netdev Hi David, I understand that the Linux community may not be able to debug this for me. All I am asking is that, if people have seen similar problems (ours involve kfree_skb and skb_drop_fraglist crashing for some reason, which could be a memory management issue or something we are not aware of), they let us know of any patches so that we can try them out here. Thank you for your response. Regards, Raghu -----Original Message----- From: David S. Miller [mailto:davem@davemloft.net] Sent: Thursday, June 02, 2005 12:25 AM To: Raghunathan Venkatesan (WT01 - EMBEDDED & PRODUCT ENGINEERING SOLUTIONS) Cc: linux-net@vger.kernel.org; netdev@oss.sgi.com; linux@der-keiler.de Subject: Re: Unable to handle kernel paging request at virtual address 04000460 Please don't ask the community to debug your custom kernel with private VPN driver modules installed. 
From herbert@gondor.apana.org.au Thu Jun 2 02:45:17 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 02:45:22 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j529jEXq000794 for ; Thu, 2 Jun 2005 02:45:15 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DdmFD-0007y2-00; Thu, 02 Jun 2005 19:44:07 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DdmFA-0006o5-00; Thu, 02 Jun 2005 19:44:04 +1000 Date: Thu, 2 Jun 2005 19:44:04 +1000 To: "David S. Miller" , netdev@oss.sgi.com Subject: [IPV4/IPV6] Replace spin_lock_irq with spin_lock_bh Message-ID: <20050602094404.GA10316@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="cWoXeonUoKmBZSoM" Content-Disposition: inline User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 1959 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --cWoXeonUoKmBZSoM Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi Dave: In light of my recent patch to net/ipv4/udp.c that replaced the spin_lock_irq calls on the receive queue lock with spin_lock_bh, here is a similar patch for all other occurrences of spin_lock_irq on receive/error queue locks in IPv4 and IPv6. In these stacks, we know that they can only be entered from user or softirq context. Therefore it's safe to disable BH only. Signed-off-by: Herbert Xu Since this patch simply improves the consistent use of locking primitives rather than fixing any real bugs, it should probably go into net-2.6.13. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --cWoXeonUoKmBZSoM Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -360,14 +360,14 @@ int ip_recv_error(struct sock *sk, struc err = copied; /* Reset and regenerate socket error */ - spin_lock_irq(&sk->sk_error_queue.lock); + spin_lock_bh(&sk->sk_error_queue.lock); sk->sk_err = 0; if ((skb2 = skb_peek(&sk->sk_error_queue)) != NULL) { sk->sk_err = SKB_EXT_ERR(skb2)->ee.ee_errno; - spin_unlock_irq(&sk->sk_error_queue.lock); + spin_unlock_bh(&sk->sk_error_queue.lock); sk->sk_error_report(sk); } else - spin_unlock_irq(&sk->sk_error_queue.lock); + spin_unlock_bh(&sk->sk_error_queue.lock); out_free_skb: kfree_skb(skb); diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -691,11 +691,11 @@ static int raw_ioctl(struct sock *sk, in struct sk_buff *skb; int amount = 0; - spin_lock_irq(&sk->sk_receive_queue.lock); + spin_lock_bh(&sk->sk_receive_queue.lock); skb = skb_peek(&sk->sk_receive_queue); if (skb != NULL) amount = skb->len; - spin_unlock_irq(&sk->sk_receive_queue.lock); + spin_unlock_bh(&sk->sk_receive_queue.lock); return put_user(amount, (int __user *)arg); } diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -353,14 +353,14 @@ int ipv6_recv_error(struct sock *sk, str err = copied; /* Reset and regenerate socket error */ - spin_lock_irq(&sk->sk_error_queue.lock); + spin_lock_bh(&sk->sk_error_queue.lock); sk->sk_err = 0; if ((skb2 = skb_peek(&sk->sk_error_queue)) != NULL) { sk->sk_err = SKB_EXT_ERR(skb2)->ee.ee_errno; - spin_unlock_irq(&sk->sk_error_queue.lock); + spin_unlock_bh(&sk->sk_error_queue.lock); sk->sk_error_report(sk); } 
else { - spin_unlock_irq(&sk->sk_error_queue.lock); + spin_unlock_bh(&sk->sk_error_queue.lock); } out_free_skb: diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -434,12 +434,12 @@ csum_copy_err: /* Clear queue. */ if (flags&MSG_PEEK) { int clear = 0; - spin_lock_irq(&sk->sk_receive_queue.lock); + spin_lock_bh(&sk->sk_receive_queue.lock); if (skb == skb_peek(&sk->sk_receive_queue)) { __skb_unlink(skb, &sk->sk_receive_queue); clear = 1; } - spin_unlock_irq(&sk->sk_receive_queue.lock); + spin_unlock_bh(&sk->sk_receive_queue.lock); if (clear) kfree_skb(skb); } @@ -971,11 +971,11 @@ static int rawv6_ioctl(struct sock *sk, struct sk_buff *skb; int amount = 0; - spin_lock_irq(&sk->sk_receive_queue.lock); + spin_lock_bh(&sk->sk_receive_queue.lock); skb = skb_peek(&sk->sk_receive_queue); if (skb != NULL) amount = skb->tail - skb->h.raw; - spin_unlock_irq(&sk->sk_receive_queue.lock); + spin_unlock_bh(&sk->sk_receive_queue.lock); return put_user(amount, (int __user *)arg); } diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -300,12 +300,12 @@ csum_copy_err: /* Clear queue. 
*/ if (flags&MSG_PEEK) { int clear = 0; - spin_lock_irq(&sk->sk_receive_queue.lock); + spin_lock_bh(&sk->sk_receive_queue.lock); if (skb == skb_peek(&sk->sk_receive_queue)) { __skb_unlink(skb, &sk->sk_receive_queue); clear = 1; } - spin_unlock_irq(&sk->sk_receive_queue.lock); + spin_unlock_bh(&sk->sk_receive_queue.lock); if (clear) kfree_skb(skb); } --cWoXeonUoKmBZSoM-- From herbert@gondor.apana.org.au Thu Jun 2 02:56:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 02:56:10 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j529u2Xq001566 for ; Thu, 2 Jun 2005 02:56:03 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DdmPm-00084I-00; Thu, 02 Jun 2005 19:55:02 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DdmPj-0007pT-00; Thu, 02 Jun 2005 19:54:59 +1000 Date: Thu, 2 Jun 2005 19:54:59 +1000 To: "David S. Miller" , netdev@oss.sgi.com Subject: [SCTP] Replace spin_lock_irqsave with spin_lock_bh Message-ID: <20050602095459.GA26638@gondor.apana.org.au> References: <20050602094404.GA10316@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="2oS5YaxWCcQjTEyO" Content-Disposition: inline In-Reply-To: <20050602094404.GA10316@gondor.apana.org.au> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 1960 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --2oS5YaxWCcQjTEyO Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi Dave: This patch replaces the spin_lock_irqsave call on the receive queue lock in SCTP with spin_lock_bh. 
Despite the proliferation of spin_lock_irqsave calls in this stack, it is only entered from the IPv4/IPv6 stack and user space. That is, it is never entered from hardirq context. The call in question is only called from recvmsg which means that IRQs aren't disabled. Therefore it is safe to replace it with spin_lock_bh. Signed-off-by: Herbert Xu As before, this should probably only go into net-2.6.13. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --2oS5YaxWCcQjTEyO Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p diff --git a/net/sctp/socket.c b/net/sctp/socket.c --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -4368,15 +4368,11 @@ static struct sk_buff *sctp_skb_recv_dat * However, this function was corrent in any case. 8) */ if (flags & MSG_PEEK) { - unsigned long cpu_flags; - - sctp_spin_lock_irqsave(&sk->sk_receive_queue.lock, - cpu_flags); + spin_lock_bh(&sk->sk_receive_queue.lock); skb = skb_peek(&sk->sk_receive_queue); if (skb) atomic_inc(&skb->users); - sctp_spin_unlock_irqrestore(&sk->sk_receive_queue.lock, - cpu_flags); + spin_unlock_bh(&sk->sk_receive_queue.lock); } else { skb = skb_dequeue(&sk->sk_receive_queue); } --2oS5YaxWCcQjTEyO-- From jtbbesaa@bipt106.bi.ehu.es Thu Jun 2 03:40:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 03:40:04 -0700 (PDT) Received: from bipt106.bi.ehu.es (bipt106.bi.ehu.es [158.227.67.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52AdsXq003451 for ; Thu, 2 Jun 2005 03:39:59 -0700 Received: from bipt54.bi.ehu.es ([158.227.75.54] helo=ibook.ziberghetto.dhis.org) by bipt106.bi.ehu.es with esmtp (Exim 3.35 #1 (Debian)) id 1Ddn6I-0002Yr-00; Thu, 02 Jun 2005 12:38:58 +0200 Received: by ibook.ziberghetto.dhis.org (Postfix, from userid 1000) id 1D9BB20F1F; Thu, 2 Jun 2005 12:38:26 +0200 (CEST) From: Alfredo 
Beaumont Sainz Organization: Euskal Herriko Unibertsitatea To: netdev@oss.sgi.com Subject: Problems with Broadcom and Intel PRO/1000 cards Date: Thu, 2 Jun 2005 12:38:19 +0200 User-Agent: KMail/1.8 MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart2076079.6N4Hu5pznk"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200506021238.25615.jtbbesaa@aintel.bi.ehu.es> X-archive-position: 1961 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jtbbesaa@bipt106.bi.ehu.es Precedence: bulk X-list: netdev --nextPart2076079.6N4Hu5pznk Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Hi, I've a dual opteron machine with an integrated dual Broadcom 5704 10/100/1000 (tg3 driver) and an Intel PRO/1000 MT (e1000 driver). It seems that I cannot make them work at Gbps. I've a crossover cable connecting an interface of the Broadcom (eth1) with the Intel (eth2), but they connect at 100Mbps: # /sbin/mii-tool -v eth1: negotiated 100baseTx-FD, link ok product info: vendor 00:08:18, model 25 rev 0 basic mode: autonegotiation enabled basic status: autonegotiation complete, link ok capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD eth2: negotiated 100baseTx-FD, link ok product info: vendor 00:50:43, model 2 rev 5 basic mode: autonegotiation enabled basic status: autonegotiation complete, link ok capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control As you can see, there's no 1000 FD advertising. 
Forcing it with ethtool makes them lose link connection: # /usr/sbin/ethtool -s eth1 speed 1000 duplex full # /sbin/mii-tool -v eth1: no link product info: vendor 00:08:18, model 25 rev 0 basic mode: autonegotiation enabled basic status: no link capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD eth2: no link product info: vendor 00:50:43, model 2 rev 5 basic mode: autonegotiation enabled basic status: no link capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control After some secs link is recovered, at 100 again, and dmesg shows the following kernel messages: tg3: eth1: Link is down. e1000: eth2: e1000_watchdog: NIC Link is Down tg3: eth1: Link is up at 1000 Mbps, full duplex. tg3: eth1: Flow control is off for TX and off for RX. e1000: eth2: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex According to the messages links would be at 1000 but they are not really. The same happens when forcing eth2. I'm using kernel version 2.6.11.11 but it also happened with previous version of the kernel. Any hints? Thanks. -- Alfredo Beaumont. GPG: http://aintel.bi.ehu.es/~jtbbesaa/jtbbesaa.gpg.asc Elektronika eta Telekomunikazioak Saila (Ingeniaritza Telematikoa) Euskal Herriko Unibertsitatea, Bilbao (Basque Country). 
http://www.ehu.es --nextPart2076079.6N4Hu5pznk Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQBCnuGh6KTU/EgLc1ERAgOsAJ9Bs8oPJEelifI+GtiP62cMEfl8ZQCfXxc6 e2z/CGhpOy0qWoXNj22/SMQ= =4LuQ -----END PGP SIGNATURE----- --nextPart2076079.6N4Hu5pznk-- From postman@harrier.cohaesio.com Thu Jun 2 04:34:16 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 04:34:19 -0700 (PDT) Received: from harrier.cohaesio.com (harrier.cohaesio.com [212.97.128.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52BYFXq009978 for ; Thu, 2 Jun 2005 04:34:16 -0700 Received: by harrier.cohaesio.com (Postfix, from userid 88) id 7BF0647; Thu, 2 Jun 2005 13:33:14 +0200 (CEST) X-Original-To: news2mail@news.cohaesio.com Delivered-To: news2mail@news.cohaesio.com From: "Anders K. Pedersen" Subject: Re: Problems with Broadcom and Intel PRO/1000 cards Date: Thu, 02 Jun 2005 13:34:09 +0200 Organization: Cohaesio A/S Lines: 13 Message-ID: References: <200506021238.25615.jtbbesaa@aintel.bi.ehu.es> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: harrier.cohaesio.com 1117711993 26359 212.97.128.136 (2 Jun 2005 11:33:13 GMT) X-Complaints-To: newsmaster@news.cohaesio.com X-Accept-Language: en-us, en To: netdev@oss.sgi.com X-archive-position: 1962 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akp@cohaesio.com Precedence: bulk X-list: netdev Alfredo Beaumont Sainz wrote: > I've a dual opteron machine with an integrated dual Broadcom 5704 10/100/1000 > (tg3 driver) and an Intel PRO/1000 MT (e1000 driver). It seems that I cannot > make them work a Gbps. 
I've a crossover cable connecting a interface of the > Broadcom (eth1) with the Intel (eth2), but they connect at 100Mbps: > > # /sbin/mii-tool -v mii-tool does not (yet) support more than 100 Mbit/s, so it will report a 1000 Mbit/s connection as only running 100 Mbit/s. Use ethtool for now. Regards, Anders K. Pedersen From bunk@stusta.de Thu Jun 2 05:16:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 05:16:26 -0700 (PDT) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j52CGMXq011914 for ; Thu, 2 Jun 2005 05:16:23 -0700 Received: (qmail 16519 invoked from network); 2 Jun 2005 12:15:12 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 2 Jun 2005 12:15:12 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 6DB05BB5F8; Thu, 2 Jun 2005 14:15:11 +0200 (CEST) Date: Thu, 2 Jun 2005 14:15:11 +0200 From: Adrian Bunk To: Andrew Morton , shemminger@osdl.org Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: 2.6.12-rc5-mm2: "bic unavailable using TCP reno" messages Message-ID: <20050602121511.GE4992@stusta.de> References: <20050601022824.33c8206e.akpm@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050601022824.33c8206e.akpm@osdl.org> User-Agent: Mutt/1.5.9i X-archive-position: 1963 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev On Wed, Jun 01, 2005 at 02:28:24AM -0700, Andrew Morton wrote: >... > Changes since 2.6.12-rc5-mm1: >... > +tcp-tcp_infra.patch >... > Steve Hemminger's TCP enhancements. >... I said "no" to CONFIG_TCP_CONG_BIC, and now my syslog is full of messages kernel: bic unavailable using TCP reno I have no problem with such a message being shown once - but once should be enough. 
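The "show it once" behaviour asked for above can be sketched with a static flag. This is a user-space illustration only, not the code in tcp-tcp_infra.patch: the helper name warn_fallback_once is hypothetical, the message text simply mirrors the syslog line quoted above, and the flag is not thread-safe as written.

```c
#include <stdio.h>

/* Set after the first warning so later calls stay quiet. */
static int warned_fallback;

/*
 * Print the fallback warning at most once per process.
 * Returns 1 if the message was emitted by this call, 0 otherwise.
 */
int warn_fallback_once(const char *wanted, const char *fallback, FILE *out)
{
	if (warned_fallback)
		return 0;
	warned_fallback = 1;
	fprintf(out, "%s unavailable using TCP %s\n", wanted, fallback);
	return 1;
}
```

Calling warn_fallback_once("bic", "reno", stderr) on every affected socket would then log a single line instead of filling the log.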
cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed From hadi@cyberus.ca Thu Jun 2 05:27:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 05:27:55 -0700 (PDT) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52CRkXq012728 for ; Thu, 2 Jun 2005 05:27:47 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Ddomh-0004kV-UR for netdev@oss.sgi.com; Thu, 02 Jun 2005 08:26:51 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ddomg-0006iO-KG; Thu, 02 Jun 2005 08:26:50 -0400 Subject: Re: RFC: NAPI packet weighting patch From: jamal Reply-To: hadi@cyberus.ca To: Jon Mason Cc: "David S. Miller" , mitch.a.williams@intel.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, john.ronciak@intel.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <200505311828.44304.jdmason@us.ibm.com> References: <1117241786.6251.7.camel@localhost.localdomain> <200505311707.54487.jdmason@us.ibm.com> <20050531.151443.74564699.davem@davemloft.net> <200505311828.44304.jdmason@us.ibm.com> Content-Type: text/plain Organization: unknown Date: Thu, 02 Jun 2005 08:26:46 -0400 Message-Id: <1117715207.6050.21.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 1964 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2005-31-05 at 18:28 -0500, Jon Mason wrote: > On Tuesday 31 May 2005 05:14 pm, David S. 
Miller wrote: > > From: Jon Mason > > Date: Tue, 31 May 2005 17:07:54 -0500 > > > > > Of course some performance analysis would have to be done to determine the > > > optimal numbers for each speed/duplexity setting per driver. > > > > per cpu speed, per memory bus speed, per I/O bus speed, and add in other > > complications such as NUMA > > > > My point is that whatever experimental number you come up with will be > > good for that driver on your systems, not necessarily for others. > > > > Even within a system, whatever number you select will be the wrong > > thing to use if one starts a continuous I/O stream to the SATA > > controller in the next PCI slot, for example. > > > > We keep getting bitten by this, as the Altix perf data continually shows, > > and we need to absolutely stop thinking this way. > > > > The way to go is to make selections based upon observed events and > > measurements. > > I'm not arguing against a /proc entry to tune dev->weight for those sysadmins > advanced enough to do that. I am arguing that we can make the driver smarter > (at little/no cost) for "out of the box" users. > What is the point of making the driver "smarter"? Recall, the algorithm used to schedule the netdevices is based on an extension of Weighted Round Robin from Varghese et al known as DRR, Deficit Round Robin (ask Google for details). The idea is to provide fairness amongst many drivers. As an example, if you have a gige driver it shouldn't be taking all the resources at the expense of starving the fastether driver. If the admin wants one driver to be more "important" than the other, s/he will make sure it has a higher weight. 
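The fairness idea can be illustrated with a much simplified DRR-style round in plain C. This is a sketch under stated assumptions and not the kernel's net_rx_action(): the struct dev_model and the function drr_round are hypothetical, there is no global budget, and devices are always visited in array order.

```c
/* Hypothetical per-device state for a DRR-style polling round. */
struct dev_model {
	int weight;  /* packets the device may handle per round */
	int quota;   /* credit still available to the device */
	int backlog; /* packets waiting in the device's receive ring */
	int handled; /* packets processed so far */
};

/*
 * Run one round: every device with a backlog earns 'weight' packets of
 * credit and is polled for at most that much work, so a busy fast device
 * cannot starve a slower one. Returns the total processed this round.
 */
int drr_round(struct dev_model *devs, int ndevs)
{
	int total = 0;
	int i;

	for (i = 0; i < ndevs; i++) {
		struct dev_model *d = &devs[i];
		int work;

		if (d->backlog == 0) {
			d->quota = 0; /* idle devices do not bank credit */
			continue;
		}
		d->quota += d->weight;
		work = d->quota < d->backlog ? d->quota : d->backlog;
		d->backlog -= work;
		d->handled += work;
		d->quota -= work;
		total += work;
		if (d->backlog == 0)
			d->quota = 0;
	}
	return total;
}
```

With a gige device at weight 64 and a fastether device at weight 16, each round serves at most 64 and 16 packets respectively, so raising a weight raises that device's share without ever excluding the other.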
cheers, jamal From hadi@cyberus.ca Thu Jun 2 06:05:57 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 06:06:04 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52D5rXq015814 for ; Thu, 2 Jun 2005 06:05:57 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1DdpNa-0004ha-Tm for netdev@oss.sgi.com; Thu, 02 Jun 2005 09:04:58 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DdpNY-0004wq-08; Thu, 02 Jun 2005 09:04:56 -0400 Subject: PATCH: explicit typing WAS(Re: PATCH: rtnetlink explicit flags setting From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: tgraf@suug.ch, netdev@oss.sgi.com In-Reply-To: <20050531.153125.95894437.davem@davemloft.net> References: <1117197157.6688.24.camel@localhost.localdomain> <20050531.144338.112623594.davem@davemloft.net> <20050531222646.GK15391@postel.suug.ch> <20050531.153125.95894437.davem@davemloft.net> Content-Type: multipart/mixed; boundary="=-MNGFh9ieSNAM2tZgwH9J" Organization: unknown Date: Thu, 02 Jun 2005 09:04:52 -0400 Message-Id: <1117717493.6050.29.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 X-archive-position: 1965 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev --=-MNGFh9ieSNAM2tZgwH9J Content-Type: text/plain Content-Transfer-Encoding: 7bit On Tue, 2005-31-05 at 15:31 -0700, David S. Miller wrote: > From: Thomas Graf > Date: Wed, 1 Jun 2005 00:26:46 +0200 > > > > Please use explicit "unsigned int flags" instead of "unsigned flags". > > > > I converted this already in the two patches later in the thread. > > I see, thanks for pointing this out. 
> If you want to do it right, it should be a u16 actually ;-> In any case since we are being gracious - let's fix where i cutnpasted it from using TheLinuxWay ;-> ------------- This patch converts "unsigned flags" to use more explicit types like u16 instead and incrementally introduces NLMSG_NEW(). Signed-off-by: Jamal Hadi Salim cheers, jamal --=-MNGFh9ieSNAM2tZgwH9J Content-Disposition: attachment; filename=expl_p Content-Type: text/plain; name=expl_p; charset=UTF-8 Content-Transfer-Encoding: 7bit net/ipv6/addrconf.c: needs update net/sched/act_api.c: needs update net/sched/cls_api.c: needs update net/sched/sch_api.c: needs update Index: net/ipv6/addrconf.c =================================================================== --- faa2ccd541211d62ece040534da95da9476d4f14/net/ipv6/addrconf.c (mode:100644) +++ uncommitted/net/ipv6/addrconf.c (mode:100644) @@ -131,7 +131,7 @@ static int addrconf_ifdown(struct net_device *dev, int how); -static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags); +static void addrconf_dad_start(struct inet6_ifaddr *ifp, u32 flags); static void addrconf_dad_timer(unsigned long data); static void addrconf_dad_completed(struct inet6_ifaddr *ifp); static void addrconf_rs_timer(unsigned long data); @@ -491,7 +491,7 @@ static struct inet6_ifaddr * ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr, int pfxlen, - int scope, unsigned flags) + int scope, u32 flags) { struct inet6_ifaddr *ifa = NULL; struct rt6_info *rt; @@ -1319,7 +1319,7 @@ static void addrconf_prefix_route(struct in6_addr *pfx, int plen, struct net_device *dev, - unsigned long expires, unsigned flags) + unsigned long expires, u32 flags) { struct in6_rtmsg rtmsg; @@ -2228,7 +2228,7 @@ /* * Duplicate Address Detection */ -static void addrconf_dad_start(struct inet6_ifaddr *ifp, int flags) +static void addrconf_dad_start(struct inet6_ifaddr *ifp, u32 flags) { struct inet6_dev *idev = ifp->idev; struct net_device *dev = idev->dev; @@ -2670,7 +2670,7 @@ } 
static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca, - u32 pid, u32 seq, int event, unsigned flags) + u32 pid, u32 seq, int event, u16 flags) { struct ifaddrmsg *ifm; struct nlmsghdr *nlh; Index: net/sched/act_api.c =================================================================== --- faa2ccd541211d62ece040534da95da9476d4f14/net/sched/act_api.c (mode:100644) +++ uncommitted/net/sched/act_api.c (mode:100644) @@ -428,15 +428,15 @@ static int tca_get_fill(struct sk_buff *skb, struct tc_action *a, u32 pid, u32 seq, - unsigned flags, int event, int bind, int ref) + u16 flags, int event, int bind, int ref) { struct tcamsg *t; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rtattr *x; - nlh = NLMSG_PUT(skb, pid, seq, event, sizeof(*t)); - nlh->nlmsg_flags = flags; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*t), flags); + t = NLMSG_DATA(nlh); t->tca_family = AF_UNSPEC; @@ -669,7 +669,7 @@ } static int tcf_add_notify(struct tc_action *a, u32 pid, u32 seq, int event, - unsigned flags) + u16 flags) { struct tcamsg *t; struct nlmsghdr *nlh; @@ -684,8 +684,7 @@ b = (unsigned char *)skb->tail; - nlh = NLMSG_PUT(skb, pid, seq, event, sizeof(*t)); - nlh->nlmsg_flags = flags; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*t), flags); t = NLMSG_DATA(nlh); t->tca_family = AF_UNSPEC; Index: net/sched/cls_api.c =================================================================== --- faa2ccd541211d62ece040534da95da9476d4f14/net/sched/cls_api.c (mode:100644) +++ uncommitted/net/sched/cls_api.c (mode:100644) @@ -322,14 +322,13 @@ static int tcf_fill_node(struct sk_buff *skb, struct tcf_proto *tp, unsigned long fh, - u32 pid, u32 seq, unsigned flags, int event) + u32 pid, u32 seq, u16 flags, int event) { struct tcmsg *tcm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; - nlh = NLMSG_PUT(skb, pid, seq, event, sizeof(*tcm)); - nlh->nlmsg_flags = flags; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*tcm), flags); tcm = NLMSG_DATA(nlh); 
tcm->tcm_family = AF_UNSPEC; tcm->tcm_ifindex = tp->q->dev->ifindex; Index: net/sched/sch_api.c =================================================================== --- faa2ccd541211d62ece040534da95da9476d4f14/net/sched/sch_api.c (mode:100644) +++ uncommitted/net/sched/sch_api.c (mode:100644) @@ -760,15 +760,14 @@ } static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid, - u32 pid, u32 seq, unsigned flags, int event) + u32 pid, u32 seq, u16 flags, int event) { struct tcmsg *tcm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct gnet_dump d; - nlh = NLMSG_PUT(skb, pid, seq, event, sizeof(*tcm)); - nlh->nlmsg_flags = flags; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*tcm), flags); tcm = NLMSG_DATA(nlh); tcm->tcm_family = AF_UNSPEC; tcm->tcm_ifindex = q->dev->ifindex; @@ -997,7 +996,7 @@ static int tc_fill_tclass(struct sk_buff *skb, struct Qdisc *q, unsigned long cl, - u32 pid, u32 seq, unsigned flags, int event) + u32 pid, u32 seq, u16 flags, int event) { struct tcmsg *tcm; struct nlmsghdr *nlh; @@ -1005,8 +1004,7 @@ struct gnet_dump d; struct Qdisc_class_ops *cl_ops = q->ops->cl_ops; - nlh = NLMSG_PUT(skb, pid, seq, event, sizeof(*tcm)); - nlh->nlmsg_flags = flags; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*tcm), flags); tcm = NLMSG_DATA(nlh); tcm->tcm_family = AF_UNSPEC; tcm->tcm_ifindex = q->dev->ifindex; --=-MNGFh9ieSNAM2tZgwH9J-- From abonilla@linuxwireless.org Thu Jun 2 06:06:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 06:06:22 -0700 (PDT) Received: from linuxwireless.org.ve.carpathiahost.net (linuxwireless.org.ve.carpathiahost.net [66.117.45.234]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52D6EXq015848 for ; Thu, 2 Jun 2005 06:06:15 -0700 Received: from WCRSJO2KPAB047 ([200.9.49.66]) by linuxwireless.org.ve.carpathiahost.net (8.12.10/8.12.10) with SMTP id j52D4vgC001796; Thu, 2 Jun 2005 09:04:58 -0400 Reply-To: From: "Alejandro Bonilla" To: "'Alfredo Beaumont Sainz'" , Subject: 
RE: Problems with Broadcom and Intel PRO/1000 cards Date: Thu, 2 Jun 2005 07:04:43 -0600 Message-ID: <001c01c56773$a5684060$600cc60a@amer.sykes.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.6604 (9.0.2911.0) In-Reply-To: <200506021238.25615.jtbbesaa@aintel.bi.ehu.es> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1478 Importance: Normal X-archive-position: 1966 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: abonilla@linuxwireless.org Precedence: bulk X-list: netdev > Hi, > > I've a dual opteron machine with an integrated dual Broadcom > 5704 10/100/1000 > (tg3 driver) and an Intel PRO/1000 MT (e1000 driver). It > seems that I cannot > make them work a Gbps. I've a crossover cable connecting a > interface of the > Broadcom (eth1) with the Intel (eth2), but they connect at 100Mbps: > Only time that I have seen this before, it was because I was using an incorrect cable. Make sure you have the _REAL_ Gb crossover cable. http://logout.sh/computers/net/gigabit/ Also, I would trust in dmesg and not in some other tool. 
.Alejandro From baruch@ev-en.org Thu Jun 2 06:59:24 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 06:59:26 -0700 (PDT) Received: from galon.ev-en.org (rrcs-24-123-59-149.central.biz.rr.com [24.123.59.149]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52DxNXq020273 for ; Thu, 2 Jun 2005 06:59:24 -0700 Received: by galon.ev-en.org (Postfix, from userid 105) id 9282711A953; Thu, 2 Jun 2005 16:58:24 +0300 (IDT) Received: from [10.220.3.66] (hamilton.nuim.ie [149.157.192.252]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by galon.ev-en.org (Postfix) with ESMTP id 9DB9D11A952; Thu, 2 Jun 2005 16:58:21 +0300 (IDT) Message-ID: <429F1079.5070701@ev-en.org> Date: Thu, 02 Jun 2005 14:58:17 +0100 From: Baruch Even User-Agent: Debian Thunderbird 1.0.2 (X11/20050331) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Adrian Bunk Cc: Andrew Morton , shemminger@osdl.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.6.12-rc5-mm2: "bic unavailable using TCP reno" messages References: <20050601022824.33c8206e.akpm@osdl.org> <20050602121511.GE4992@stusta.de> In-Reply-To: <20050602121511.GE4992@stusta.de> X-Enigmail-Version: 0.91.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 1967 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: baruch@ev-en.org Precedence: bulk X-list: netdev Adrian Bunk wrote: > On Wed, Jun 01, 2005 at 02:28:24AM -0700, Andrew Morton wrote: > >>... >>Changes since 2.6.12-rc5-mm1: >>... >>+tcp-tcp_infra.patch >>... >> Steve Hemminger's TCP enhancements. >>... > > > I said "no" to CONFIG_TCP_CONG_BIC, and now my syslog is full of messages > kernel: bic unavailable using TCP reno > > I have no problem with such a message being shown once - but once should > be enough. 
The best solution for this would be to check the available protocols at setup time and not at connection creation time. This would also provide a better feedback to the user, since he will either see that what he set was taken, or it wasn't. In the current mechanism you can set the protocol to 'foo' and it will show back as 'foo'. You'll get complaints only once a connection is attempted with this protocol. It does mean some extra work in the sysctl stage, but it's better IMO to do it there rather than at connection setup time. Baruch From hadi@cyberus.ca Thu Jun 2 07:13:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 07:14:02 -0700 (PDT) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52EDsXq021395 for ; Thu, 2 Jun 2005 07:13:59 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1DdqRN-0007RQ-AU for netdev@oss.sgi.com; Thu, 02 Jun 2005 08:12:57 -0600 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ddpp0-0002iM-B1; Thu, 02 Jun 2005 09:33:18 -0400 Subject: Re: [PATCH 3/4] [NEIGH] neighbour table configuration and statistics via rtnetlink From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20050531161315.GH15391@postel.suug.ch> References: <20050527151608.GZ15391@postel.suug.ch> <1117209411.6383.104.camel@localhost.localdomain> <20050527163516.GB15391@postel.suug.ch> <1117244567.6251.34.camel@localhost.localdomain> <20050528120731.GP15391@postel.suug.ch> <1117533847.6134.32.camel@localhost.localdomain> <20050531114251.GC15391@postel.suug.ch> <1117543711.6134.48.camel@localhost.localdomain> <20050531131747.GF15391@postel.suug.ch> <1117551561.6279.2.camel@localhost.localdomain> <20050531161315.GH15391@postel.suug.ch> Content-Type: text/plain Organization: unknown Date: Thu, 02 Jun 2005 09:33:15 -0400 Message-Id: <1117719195.6050.54.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 1969 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2005-31-05 at 18:13 +0200, Thomas Graf wrote: [..] > So what I propose is to have the neighbour table parameters, > e.g. everything in arp_tbl be distributed over RTM_NEIGHTBL > and put the device specific parameters into devconfig, > e.g. in_dev->arp_parms. > Right, this is what i am saying a well. The only caveat i was pointing out is that the devconfig piece is more than just the neighbor stuff - and of course it hasnt been written, yet;-> The major challenge will be events - some change via /proc, sysfs etc should generate event. I suggest something along usage of notifier_block with something like NETDEV_CONFIG to transport these things around. Damn, if only i can find my patch .... I had already started doing events based on changes from /proc or sysctl etc. > Absolutely, more specific: > > netdevice -> inet_device -> parameter set -> neighbour table > or: > neighbour table -> list of parameter sets -> netdevice > > both ways are possible right now. Sounds good to me. 
cheers, jamal From jtbbesaa@bipt106.bi.ehu.es Thu Jun 2 07:13:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 07:13:34 -0700 (PDT) Received: from bipt106.bi.ehu.es (bipt106.bi.ehu.es [158.227.67.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52EDPXq021320 for ; Thu, 2 Jun 2005 07:13:28 -0700 Received: from bipt54.bi.ehu.es ([158.227.75.54] helo=ibook.ziberghetto.dhis.org) by bipt106.bi.ehu.es with esmtp (Exim 3.35 #1 (Debian)) id 1DdqQu-0005HX-00 for ; Thu, 02 Jun 2005 16:12:28 +0200 Received: by ibook.ziberghetto.dhis.org (Postfix, from userid 1000) id 04FA121151; Thu, 2 Jun 2005 16:11:55 +0200 (CEST) From: Alfredo Beaumont Sainz Organization: Euskal Herriko Unibertsitatea To: netdev@oss.sgi.com Subject: Re: Problems with Broadcom and Intel PRO/1000 cards Date: Thu, 2 Jun 2005 16:11:42 +0200 User-Agent: KMail/1.8 References: <200506021238.25615.jtbbesaa@aintel.bi.ehu.es> In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1279143.OyKeIErFOt"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200506021611.54933.jtbbesaa@aintel.bi.ehu.es> X-archive-position: 1968 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jtbbesaa@bipt106.bi.ehu.es Precedence: bulk X-list: netdev --nextPart1279143.OyKeIErFOt Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Og, 2005eko Ekaren 02a 13:34(e)an, Anders K. Pedersen(e)k idatzi zuen: > Alfredo Beaumont Sainz wrote: > > I've a dual opteron machine with an integrated dual Broadcom 5704 > > 10/100/1000 (tg3 driver) and an Intel PRO/1000 MT (e1000 driver). It > > seems that I cannot make them work a Gbps. 
I've a crossover cable > > connecting a interface of the Broadcom (eth1) with the Intel (eth2), but > > they connect at 100Mbps: > > > > # /sbin/mii-tool -v > > mii-tool does not (yet) support more than 100 Mbit/s, so it will report > a 1000 Mbit/s connection as only running 100 Mbit/s. Use ethtool for now. Ouch, you are right. They are really working at 1000Mbit/s. I should have=20 checked that. They work with a crossover cable, but I still have problems w= ith=20 the switch. I'll further investigate before posting again. Thanks! =2D-=20 Alfredo Beaumont. GPG: http://aintel.bi.ehu.es/~jtbbesaa/jtbbesaa.gpg.asc Elektronika eta Telekomunikazioak Saila (Ingeniaritza Telematikoa) Euskal Herriko Unibertsitatea, Bilbao (Basque Country). http://www.ehu.es --nextPart1279143.OyKeIErFOt Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQBCnxOq6KTU/EgLc1ERAsbKAJ9U+j2OiPemLbu1oNp/t/T1ijHWDQCeLRho mxzLFdj20GxHxb4LXD7z5pM= =drCD -----END PGP SIGNATURE----- --nextPart1279143.OyKeIErFOt-- From Peter.Kutschera@arcs.ac.at Thu Jun 2 08:51:32 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 08:51:41 -0700 (PDT) Received: from s0ms2.arc.local (arcmail.arcs.ac.at [62.218.164.36]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52FpUXq031245 for ; Thu, 2 Jun 2005 08:51:31 -0700 Received: from s1ms3.D01.arc.local ([172.24.10.15]) by s0ms2.arc.local with Microsoft SMTPSVC(6.0.3790.0); Thu, 2 Jun 2005 17:50:28 +0200 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: R8169 from U.S.Robotics not found by driver Date: Thu, 2 Jun 2005 17:50:28 +0200 Message-ID: <3BDD1137DBC16749ACF2C93F82FCA98DA107D2@s1ms3.D01.arc.local> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: R8169 from U.S.Robotics not found by driver Thread-Index: AcVnisxe/CbtXBnaRKOcSTb93OxXtg== From: "Kutschera 
Peter"
To: "Linux r8169 crew"
X-OriginalArrivalTime: 02 Jun 2005 15:50:28.0291 (UTC) FILETIME=[CC6F2130:01C5678A]
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j52FpUXq031245
X-archive-position: 1970
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Peter.Kutschera@arcs.ac.at
Precedence: bulk
X-list: netdev

Hello to whoever is out there!

I found your e-mail address in r8169.c:

MODULE_AUTHOR("Realtek and the Linux r8169 crew ");
MODULE_DESCRIPTION("RealTek RTL-8169 Gigabit Ethernet driver");

Maybe you are interested in the following problem?

I just bought a new 1000 Mbit/s NIC from U.S. Robotics since I thought
there was a driver for it in kernel 2.6.8. There wasn't.
But there is a driver (on the CD and also downloadable from
http://www.usr.com/support/product-template.asp?prod=7902 (see linux.exe :-))
And there is also a newer driver in 2.6.11.

The different results are:

Modprobe r8169 with the driver from 2.6.8 or 2.6.11 simply has no effect -
the module is loaded but there is no error message, no eth1 (it's my 2nd
network card, eth0 is onboard) and nothing in dmesg :-(

I built and used the driver from U.S. Robotics with 2.6.8 and 2.6.11:

pinguc1:~# modprobe r8169
pinguc1:~# dmesg | tail
ACPI: PCI interrupt 0000:00:04.0[A] -> GSI 25 (level, low) -> IRQ 193
eth1: Identified chip type is 'RTL8169s/8110s'.
eth1: U.S. Robotics 10/100/1000 PCI NIC driver version 2.0 at 0xf89e8000, 00:c0:49:59:28:71, IRQ 193
eth1: Auto-negotiation Enabled.
eth1: 1000Mbps Full-duplex operation.
pinguc1:~# ifup eth1
pinguc1:~# ping cluster2
PING cluster2 (192.168.1.2) 56(84) bytes of data.
64 bytes from cluster2 (192.168.1.2): icmp_seq=1 ttl=64 time=0.069 ms

Fine, isn't it? NO IT IS NOT :-(

It works fine for a while, but when I start to push LOTS OF DATA over this
interface:

pinguc1:~# dmesg | tail
irq 193: nobody cared!
 [] __report_bad_irq+0x31/0x77
 [] note_interrupt+0x4c/0x71
 [] __do_IRQ+0xd9/0x121
 [] do_IRQ+0x1b/0x28
 [] common_interrupt+0x1a/0x20
 [] default_idle+0x0/0x29
 [] default_idle+0x23/0x29
 [] cpu_idle+0x39/0x4e
 [] start_kernel+0x178/0x17c
handlers:
[] (rtl8169_interrupt+0x0/0x7e [r8169])
Disabling IRQ #193

No interrupt - No data transfer

Maybe some of the following is useful for you?

pinguc1:~# lspci
0000:00:00.0 Host bridge: ServerWorks GCNB-LE Host Bridge (rev 32)
0000:00:00.1 Host bridge: ServerWorks GCNB-LE Host Bridge
0000:00:02.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)
0000:00:04.0 Ethernet controller: U.S. Robotics: Unknown device 0116 (rev 10)
0000:00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
0000:00:0f.0 Host bridge: ServerWorks CSB5 South Bridge (rev 93)
0000:00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
0000:00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 05)
0000:00:0f.3 ISA bridge: ServerWorks CSB5 LPC bridge
0000:00:10.0 Host bridge: ServerWorks CIOB-X2 PCI-X I/O Bridge (rev 05)
0000:00:10.2 Host bridge: ServerWorks CIOB-X2 PCI-X I/O Bridge (rev 05)
0000:01:02.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 02)
0000:01:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)

pinguc1:~# hd /proc/bus/pci/01/04.0
00000000  00 10 30 00 17 01 30 02  07 00 00 01 10 48 00 00  |..0...0......H..|
00000010  01 dc 00 00 04 00 f1 fc  00 00 00 00 04 00 f0 fc  |.Ü....ñü......ðü|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 28 10 35 01  |............(.5.|
00000030  00 00 e0 fc 50 00 00 00  00 00 00 00 0b 01 11 12  |..àüP...........|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  01 58 02 06 00 00 00 00  05 00 80 00 00 00 00 00  |.X..............|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100

I am not sure if this is a problem of the USR driver or the hardware
(Dell PowerEdge 1600). I would like to test your driver but it seems to me
that your driver can't find the card.

On another PC running the same software (Debian sarge with a 2.6.8 kernel
and the USR driver, on the other end of the cable) the module from USR seems
to work.

If you have any tips please let me know. In the meantime I will try another
PCI slot and, as I expect this will not help, an old 3C509. Not the best
choice for a Linux cluster, I think.

Thanks
Peter

--
Dipl.-Ing.
Peter Kutschera tel: +43 664 620 7642 http://Peter.Kutschera.at/ mailto:Peter@Kutschera.at From jbenc@suse.cz Thu Jun 2 09:51:37 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 09:51:38 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52GpZXq001174 for ; Thu, 2 Jun 2005 09:51:36 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 38BA662830C; Thu, 2 Jun 2005 18:50:38 +0200 (CEST) Date: Thu, 2 Jun 2005 18:50:38 +0200 From: Jiri Benc To: Gertjan van Wingerde Cc: netdev@oss.sgi.com, jgarzik@pobox.com, jbohac@suse.cz Subject: Re: [PATCH] ieee80211: Update generic definitions to latest specs. Message-ID: <20050602185038.4fd9dafb@griffin.suse.cz> In-Reply-To: <429E1FAB.6080503@home.nl> References: <429E1FAB.6080503@home.nl> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1971 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev On Wed, 01 Jun 2005 22:50:51 +0200, Gertjan van Wingerde wrote: > +#define WLAN_STATUS_ASSOC_DENIED_SPECTRUM_MGMT_REQUIRED 22 > +#define WLAN_STATUS_ASSOC_REJECTED_POWER_CAP_UNACCEPTABLE 23 > +#define WLAN_STATUS_ASSOC_REJECTED_SUPP_CHANNELS_UNACCEPTABLE 24 > (...) > +/* 802.11h */ > +#define WLAN_REASON_DISASSOC_POWER_CAP_UNACCEPTABLE 10 > +#define WLAN_REASON_DISASSOC_SUPP_CHANNELS_UNACCEPTABLE 11 Aren't these identifiers a bit too long? It seems to be unpractical to use them. 
-- Jiri Benc SUSE Labs From shemminger@osdl.org Thu Jun 2 10:32:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 10:32:39 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52HWRXq003829 for ; Thu, 2 Jun 2005 10:32:27 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52HUqjA028494 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 10:30:53 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52HUqCQ005577; Thu, 2 Jun 2005 10:30:52 -0700 Date: Thu, 2 Jun 2005 10:30:52 -0700 From: Stephen Hemminger To: hadi@cyberus.ca Cc: Jon Mason , "David S. Miller" , mitch.a.williams@intel.com, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, john.ronciak@intel.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch Message-ID: <20050602103052.66f12f21@dxpl.pdx.osdl.net> In-Reply-To: <1117715207.6050.21.camel@localhost.localdomain> References: <1117241786.6251.7.camel@localhost.localdomain> <200505311707.54487.jdmason@us.ibm.com> <20050531.151443.74564699.davem@davemloft.net> <200505311828.44304.jdmason@us.ibm.com> <1117715207.6050.21.camel@localhost.localdomain> Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1972 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org 
Precedence: bulk X-list: netdev On Thu, 02 Jun 2005 08:26:46 -0400 jamal wrote: > On Tue, 2005-31-05 at 18:28 -0500, Jon Mason wrote: > > On Tuesday 31 May 2005 05:14 pm, David S. Miller wrote: > > > From: Jon Mason > > > Date: Tue, 31 May 2005 17:07:54 -0500 > > > > > > > Of course some performace analysis would have to be done to determine the > > > > optimal numbers for each speed/duplexity setting per driver. > > > > > > per cpu speed, per memory bus speed, per I/O bus speed, and add in other > > > complications such as NUMA > > > > > > My point is that whatever experimental number you come up with will be > > > good for that driver on your systems, not necessarily for others. > > > > > > Even within a system, whatever number you select will be the wrong > > > thing to use if one starts a continuous I/O stream to the SATA > > > controller in the next PCI slot, for example. > > > > > > We keep getting bitten by this, as the Altix perf data continually shows, > > > and we need to absolutely stop thinking this way. > > > > > > The way to go is to make selections based upon observed events and > > > mesaurements. > > > > I'm not arguing against a /proc entry to tune dev->weight for those sysadmins > > advanced enough to do that. I am arguing that we can make the driver smarter > > (at little/no cost) for "out of the box" users. > > > > What is the point of making the driver "smarter"? > Recall, the algorithm used to schedule the netdevices is based on an > extension of Weighted Round Robin from Varghese et al known as DRR (ask > gooogle for details). > The idea is to provide fairness amongst many drivers. As an example, if > you have a gige driver it shouldnt be taking all the resources at the > expense of starving the fastether driver. > If the admin wants one driver to be more "important" than the other, > s/he will make sure it has a higher weight. 
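To make the weighting argument concrete, here is a minimal user-space sketch of one deficit-round-robin pass, assuming every packet costs one unit of work. The names (rx_dev, drr_round) are hypothetical and this is not the kernel's net_rx_action code; it only illustrates how a larger weight buys a larger share per round without starving the other devices.

```c
#include <stddef.h>

/* Hypothetical per-device state for the illustration. */
struct rx_dev {
	const char *name;
	int weight;	/* quantum granted each round */
	int quota;	/* deficit counter: packets the device may still process */
	int backlog;	/* packets waiting to be processed */
	int handled;	/* packets processed so far */
};

/* One DRR round over all devices, with unit cost per packet. */
static void drr_round(struct rx_dev *devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct rx_dev *d = &devs[i];
		int work;

		if (d->backlog == 0) {
			d->quota = 0;	/* idle devices do not bank credit */
			continue;
		}

		d->quota += d->weight;
		work = d->backlog < d->quota ? d->backlog : d->quota;
		d->backlog -= work;
		d->handled += work;
		d->quota -= work;

		if (d->backlog == 0)
			d->quota = 0;	/* queue drained: drop leftover credit */
	}
}
```

With weights 64 and 16 and both devices backlogged, each round processes packets in a 4:1 ratio, and the lighter device still makes progress every round.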

In fact, the default weighting should be based on the amount of CPU time
expended per frame rather than on link speed. The point is that a more
"heavy weight" driver shouldn't starve out all the others.

From shemminger@osdl.org Thu Jun 2 10:39:06 2005
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 10:39:11 -0700 (PDT)
Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52Hd6Xq004550 for ; Thu, 2 Jun 2005 10:39:06 -0700
Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52Hc6jA029209 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 10:38:06 -0700
Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52Hc5BT006053; Thu, 2 Jun 2005 10:38:05 -0700
Date: Thu, 2 Jun 2005 10:38:05 -0700
From: Stephen Hemminger
To: Baruch Even
Cc: Adrian Bunk , Andrew Morton , linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: 2.6.12-rc5-mm2: "bic unavailable using TCP reno" messages
Message-ID: <20050602103805.6beb4f4e@dxpl.pdx.osdl.net>
In-Reply-To: <429F1079.5070701@ev-en.org>
References: <20050601022824.33c8206e.akpm@osdl.org> <20050602121511.GE4992@stusta.de> <429F1079.5070701@ev-en.org>
Organization: Open Source Development Lab
X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu)
X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$
 /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-MIMEDefang-Filter: osdl$Revision: 1.109 $
X-Scanned-By: MIMEDefang 2.36
X-archive-position: 1973
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shemminger@osdl.org
Precedence: bulk
X-list: netdev

On Thu, 02 Jun 2005
14:58:17 +0100 Baruch Even wrote:

> Adrian Bunk wrote:
> > On Wed, Jun 01, 2005 at 02:28:24AM -0700, Andrew Morton wrote:
> >
> >>...
> >>Changes since 2.6.12-rc5-mm1:
> >>...
> >>+tcp-tcp_infra.patch
> >>...
> >> Steve Hemminger's TCP enhancements.
> >>...
> >
> >
> > I said "no" to CONFIG_TCP_CONG_BIC, and now my syslog is full of messages
> > kernel: bic unavailable using TCP reno
> >
> > I have no problem with such a message being shown once - but once should
> > be enough.
>
> The best solution for this would be to check the available protocols at
> setup time and not at connection creation time. This would also provide
> a better feedback to the user, since he will either see that what he set
> was taken, or it wasn't.
>
> In the current mechanism you can set the protocol to 'foo' and it will
> show back as 'foo'. You'll get complaints only once a connection is
> attempted with this protocol.
>
> It does mean some extra work in the sysctl stage, but it's better IMO to
> do it there rather than at connection setup time.
>
> Baruch

You're right, the sysctl handler should be smarter, but that is not the
problem here. The problem is that the default value is set to BIC to be
compatible with earlier kernels. Since 75% of the world isn't smart enough
to figure out how to use sysctl, there is a question of what the default
should be, and what to do if that is missing.

One version had a messy ifdef chain to try and avoid the warning:

char sysctl_tcp_congestion_control[] =
#if defined(CONFIG_TCP_BIC)
	"bic"
#elif defined(CONFIG_TCP_HTCP)
	"htcp"
#else
	"reno"
#endif
;

but that was ugly. Another possibility is putting it in as yet another
config value at kernel build time.

To stop the warning from repeating, probably the best solution would be to
rewrite the string if we have to revert to reno, but carefully, to avoid
SMP issues. This also implies a smarter sysctl string handler for this
value as well.
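
The "rewrite the string on fallback" idea can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: configured_ca, registered_ca and ca_pick_for_new_connection are hypothetical names, not kernel symbols, and the sketch ignores the SMP locking that a real version would need.

```c
#include <stdio.h>
#include <string.h>

#define CA_NAME_MAX 16

/* Hypothetical stand-ins for the sysctl string and the registered
 * algorithms; "bic" is configured but was not built in. */
static char configured_ca[CA_NAME_MAX] = "bic";
static const char *const registered_ca[] = { "reno", "htcp" };
static unsigned int fallback_warnings;

static int ca_is_registered(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(registered_ca) / sizeof(registered_ca[0]); i++)
		if (strcmp(registered_ca[i], name) == 0)
			return 1;
	return 0;
}

/* Called for each new connection; returns the algorithm name to use. */
static const char *ca_pick_for_new_connection(void)
{
	if (!ca_is_registered(configured_ca)) {
		fprintf(stderr, "%s unavailable using TCP reno\n",
			configured_ca);
		fallback_warnings++;
		/* Rewrite the setting so later connections find "reno"
		 * directly and the warning is printed only once.  In a
		 * kernel this write would have to be protected against
		 * concurrent readers and writers. */
		snprintf(configured_ca, sizeof(configured_ca), "reno");
	}
	return configured_ca;
}
```

After the first connection the stored value reads back as "reno", so the user also sees what is really in effect, which is the feedback Baruch asked for.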
P.s: saw your comparison paper, after a little more corroboration I would like to make H-TCP the default. From shemminger@osdl.org Thu Jun 2 10:45:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 10:45:34 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52HjRXq005497 for ; Thu, 2 Jun 2005 10:45:27 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52HiOjA029583 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 10:44:25 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52HiNYe006311; Thu, 2 Jun 2005 10:44:23 -0700 Date: Thu, 2 Jun 2005 10:44:23 -0700 From: Stephen Hemminger To: Cc: , , , Subject: Re: Unable to handle kernel paging request at virtual address 04000460 Message-ID: <20050602104423.2c3825e5@dxpl.pdx.osdl.net> In-Reply-To: <438662DA48DCAA41B1DF648BD4BD76C0E461B8@CHN-SNR-MBX01.wipro.com> References: <438662DA48DCAA41B1DF648BD4BD76C0E461B8@CHN-SNR-MBX01.wipro.com> Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1974 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Thu, 2 Jun 2005 09:20:21 +0530 wrote: > Hi David, > I understand that the linux community may not be able to debug it for > me. 
All I require is if people have seen similar problems (the problems > we face are w.r.t to kfree_skb and skb_drop_fraglist crashing due to > some reason, which could be a Memory Management issue or some thing we > are not aware of), then let us know the patches, so that we can try them > out here. Turn on Debug memory allocations, spinlock debugging, sleep-inside-spinlock checking, and preempt, it will help your debugging. If you are not building your own kernel from source learn how. You are probably freeing memory twice, or not doing ref counting properly or other locking issues. Since it is your code, good luck debugging it, if you want the community help it needs to be open source code that is available for download or be in the kernel.org kernel. From shemminger@osdl.org Thu Jun 2 10:53:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 10:53:44 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52HrcXq006428 for ; Thu, 2 Jun 2005 10:53:38 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52HqcjA030322 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 10:52:39 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52HqcMY006729; Thu, 2 Jun 2005 10:52:38 -0700 Date: Thu, 2 Jun 2005 10:52:38 -0700 From: Stephen Hemminger To: Andrew Morton Cc: John Heffner , netdev@oss.sgi.com Subject: [PATCH] Scalable TCP (cleaned) Message-ID: <20050602105238.69b6bcb3@dxpl.pdx.osdl.net> In-Reply-To: <200505251550.42252.jheffner@psc.edu> References: <200505251550.42252.jheffner@psc.edu> Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ 
/`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1975 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Here is a whitespace cleaned up version of John's scaleable TCP patch to go with the other TCP congestion algorithms, to put in -mm. -------- This patch implements Tom Kelly's Scalable TCP congestion control algorithm for the modular framework. The algorithm has some nice scaling properties, and has been used a fair bit in research, though is known to have significant fairness issues, so it's not really suitable for general purpose use. Signed-off-by: John Heffner Index: 2.6.12-rc5-tcp3/net/ipv4/Makefile =================================================================== --- 2.6.12-rc5-tcp3.orig/net/ipv4/Makefile +++ 2.6.12-rc5-tcp3/net/ipv4/Makefile @@ -35,6 +35,7 @@ obj-$(CONFIG_TCP_CONG_HSTCP) += tcp_high obj-$(CONFIG_TCP_CONG_HYBLA) += tcp_hybla.o obj-$(CONFIG_TCP_CONG_HTCP) += tcp_htcp.o obj-$(CONFIG_TCP_CONG_VEGAS) += tcp_vegas.o +obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \ xfrm4_output.o Index: 2.6.12-rc5-tcp3/net/ipv4/tcp_scalable.c =================================================================== --- /dev/null +++ 2.6.12-rc5-tcp3/net/ipv4/tcp_scalable.c @@ -0,0 +1,68 @@ +/* Tom Kelly's Scalable TCP + * + * See htt://www-lce.eng.cam.ac.uk/~ctk21/scalable/ + * + * John Heffner + */ + +#include +#include +#include + +/* These factors derived from the recommended values in the paper: + * .01 and and 7/8. We use 50 instead of 100 to account for + * delayed ack. 
+ */ +#define TCP_SCALABLE_AI_CNT 50U +#define TCP_SCALABLE_MD_SCALE 3 + +static void tcp_scalable_cong_avoid(struct tcp_sock *tp, u32 ack, u32 rtt, + u32 in_flight, int flag) +{ + if (in_flight < tp->snd_cwnd) + return; + + if (tp->snd_cwnd <= tp->snd_ssthresh) { + tp->snd_cwnd++; + } else { + tp->snd_cwnd_cnt++; + if (tp->snd_cwnd_cnt > min(tp->snd_cwnd, TCP_SCALABLE_AI_CNT)){ + tp->snd_cwnd++; + tp->snd_cwnd_cnt = 0; + } + } + tp->snd_cwnd = min_t(u32, tp->snd_cwnd, tp->snd_cwnd_clamp); + tp->snd_cwnd_stamp = tcp_time_stamp; +} + +static u32 tcp_scalable_ssthresh(struct tcp_sock *tp) +{ + return max(tp->snd_cwnd - (tp->snd_cwnd>>TCP_SCALABLE_MD_SCALE), 2U); +} + + +static struct tcp_congestion_ops tcp_scalable = { + .ssthresh = tcp_scalable_ssthresh, + .cong_avoid = tcp_scalable_cong_avoid, + .min_cwnd = tcp_reno_min_cwnd, + + .owner = THIS_MODULE, + .name = "scalable", +}; + +static int __init tcp_scalable_register(void) +{ + return tcp_register_congestion_control(&tcp_scalable); +} + +static void __exit tcp_scalable_unregister(void) +{ + tcp_unregister_congestion_control(&tcp_scalable); +} + +module_init(tcp_scalable_register); +module_exit(tcp_scalable_unregister); + +MODULE_AUTHOR("John Heffner"); +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("Scalable TCP"); Index: 2.6.12-rc5-tcp3/net/ipv4/Kconfig =================================================================== --- 2.6.12-rc5-tcp3.orig/net/ipv4/Kconfig +++ 2.6.12-rc5-tcp3/net/ipv4/Kconfig @@ -481,6 +481,15 @@ config TCP_CONG_VEGAS window. TCP Vegas should provide less packet loss, but it is not as aggressive as TCP Reno. +config TCP_CONG_SCALABLE + tristate "Scalable TCP" + depends on EXPERIMENTAL + default n + ---help--- + Scalable TCP is a sender-side only change to TCP which uses a + MIMD congestion control algorithm which has some nice scaling + properties, though is known to have fairness issues. 
+ See http://www-lce.eng.cam.ac.uk/~ctk21/scalable/ endmenu From shemminger@osdl.org Thu Jun 2 11:15:43 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 11:15:47 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52IFgXq008328 for ; Thu, 2 Jun 2005 11:15:43 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52IEbjA000394 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 11:14:38 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52IEbtP008613; Thu, 2 Jun 2005 11:14:37 -0700 Date: Thu, 2 Jun 2005 11:14:37 -0700 From: Stephen Hemminger To: "David S. Miller" Cc: Mitch Williams , netdev@oss.sgi.com, john.ronciak@intel.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: [PATCH] net: allow controlling NAPI weight with sysfs Message-ID: <20050602111437.1c492138@dxpl.pdx.osdl.net> In-Reply-To: References: Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1976 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Simple interface to allow changing network device scheduling weight with sysfs. Please consider this for 2.6.12, since risk/impact is small. 
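A hedged user-space sketch of how the new attribute might be driven from a program, assuming it shows up as /sys/class/net/<ifname>/weight next to the other net class attributes such as mtu. The helper names weight_attr_path() and set_weight() are made up for this illustration and are not part of the patch; writing the file is expected to need sufficient privileges.

```c
#include <stdio.h>

/*
 * Build "/sys/class/net/<ifname>/weight" into buf.
 * The path layout is an assumption based on where the other
 * net class attributes live. Returns 0 on success, -1 if the
 * path does not fit in the buffer.
 */
static int weight_attr_path(char *buf, size_t len, const char *ifname)
{
	int n = snprintf(buf, len, "/sys/class/net/%s/weight", ifname);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a new scheduling weight for one interface.
 * Returns 0 on success, -1 if the file cannot be opened or written
 * (for example, when the attribute does not exist on this kernel).
 */
static int set_weight(const char *ifname, unsigned long weight)
{
	char path[128];
	FILE *f;
	int ok;

	if (weight_attr_path(path, sizeof(path), ifname) < 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%lu\n", weight) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

For example, set_weight("eth0", 32) would try to write 32 to the attribute for eth0; both the interface name and the value are arbitrary here.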
Signed-off-by: Stephen Hemminger Index: napi-sysfs/net/core/net-sysfs.c =================================================================== --- napi-sysfs.orig/net/core/net-sysfs.c +++ napi-sysfs/net/core/net-sysfs.c @@ -184,6 +184,22 @@ static ssize_t store_tx_queue_len(struct static CLASS_DEVICE_ATTR(tx_queue_len, S_IRUGO | S_IWUSR, show_tx_queue_len, store_tx_queue_len); +NETDEVICE_SHOW(weight, fmt_ulong); + +static int change_weight(struct net_device *net, unsigned long new_weight) +{ + net->weight = new_weight; + return 0; +} + +static ssize_t store_weight(struct class_device *dev, const char *buf, size_t len) +{ + return netdev_store(dev, buf, len, change_weight); +} + +static CLASS_DEVICE_ATTR(weight, S_IRUGO | S_IWUSR, show_weight, + store_weight); + static struct class_device_attribute *net_class_attributes[] = { &class_device_attr_ifindex, @@ -193,6 +209,7 @@ static struct class_device_attribute *ne &class_device_attr_features, &class_device_attr_mtu, &class_device_attr_flags, + &class_device_attr_weight, &class_device_attr_type, &class_device_attr_address, &class_device_attr_broadcast, From shemminger@osdl.org Thu Jun 2 11:20:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 11:20:13 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52IKAXq008963 for ; Thu, 2 Jun 2005 11:20:10 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52IJ9jA001475 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 11:19:10 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52IJ97X009167; Thu, 2 Jun 2005 11:19:09 -0700 Date: Thu, 2 Jun 2005 11:19:09 -0700 From: Stephen Hemminger To: "David S. 
Miller" Cc: Mitch Williams , netdev@oss.sgi.com Subject: [PATCH] net: fix sysctl_ Message-ID: <20050602111909.63ef419a@dxpl.pdx.osdl.net> In-Reply-To: References: Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1977 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Changing the sysctl net.core.dev_weight has no effect because the weight of the backlog devices is set during initialization and never changed. This patch propagates any changes to the global value affected by sysctl to the per-cpu devices. It is done every time the packet handler function is run. 
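The stale-copy problem described above can be sketched in user space roughly as follows. This is not kernel code: global_weight stands in for the sysctl-controlled value, and struct fake_backlog_dev, backlog_init() and backlog_poll() are hypothetical names invented for the illustration; the initial value 64 is arbitrary.

```c
/* Stand-in for the global value that the sysctl changes. */
static int global_weight = 64;

/* Hypothetical stand-in for a per-CPU backlog device. */
struct fake_backlog_dev {
	int weight;
};

/* The weight is copied exactly once, at initialization time. */
static void backlog_init(struct fake_backlog_dev *dev)
{
	dev->weight = global_weight;
}

/*
 * Poll handler. With refresh != 0 the global is re-read on every
 * call, which is the behaviour the one-line fix aims for; with
 * refresh == 0 the copy taken at init is used unchanged.
 * Returns the weight that this poll would use.
 */
static int backlog_poll(struct fake_backlog_dev *dev, int refresh)
{
	if (refresh)
		dev->weight = global_weight;
	return dev->weight;
}
```

If global_weight is changed after backlog_init(), backlog_poll(&dev, 0) keeps returning the value seen at init, while backlog_poll(&dev, 1) returns the new one.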
Signed-off-by: Stephen Hemminger Index: skge-0.8/net/core/dev.c =================================================================== --- skge-0.8.orig/net/core/dev.c +++ skge-0.8/net/core/dev.c @@ -1732,6 +1732,7 @@ static int process_backlog(struct net_de struct softnet_data *queue = &__get_cpu_var(softnet_data); unsigned long start_time = jiffies; + backlog_dev->weight = weight_p; for (;;) { struct sk_buff *skb; struct net_device *dev; From romieu@fr.zoreil.com Thu Jun 2 11:36:06 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 11:36:10 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52Ia4Xq010316 for ; Thu, 2 Jun 2005 11:36:05 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.1/8.12.1) with ESMTP id j52IZ24i006169; Thu, 2 Jun 2005 20:35:02 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.1/8.13.1/Submit) id j52IYuZT006168; Thu, 2 Jun 2005 20:34:56 +0200 Date: Thu, 2 Jun 2005 20:34:56 +0200 From: Francois Romieu To: Kutschera Peter Cc: Linux r8169 crew , jgarzik@pobox.com Subject: Re: R8169 from U.S.Robotics not found by driver Message-ID: <20050602183456.GA5606@electric-eye.fr.zoreil.com> References: <3BDD1137DBC16749ACF2C93F82FCA98DA107D2@s1ms3.D01.arc.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3BDD1137DBC16749ACF2C93F82FCA98DA107D2@s1ms3.D01.arc.local> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-Subliminal-Message: Merge the r8169 driver in mainline X-archive-position: 1978 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Kutschera Peter : [...] > If you have any tips please let me know. 
Upgrade to (at your option): - 2.6.12-rc5 + Jeff Garzik's r8169 git branch; - 2.6.12-rc5-mm2. Both contain the latest r8169 driver. It will handle USR hardware. If you manage to kill it, please report it. Would your setup allow to test the driver in the Mpps range by any luck ? -- Ueimor From gwingerde@home.nl Thu Jun 2 12:03:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 12:03:29 -0700 (PDT) Received: from smtpq3.home.nl (smtpq3.home.nl [213.51.128.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52J3QXq012315 for ; Thu, 2 Jun 2005 12:03:27 -0700 Message-Id: <200506021903.j52J3QXq012315@oss.sgi.com> Received: from [213.51.128.134] (port=56855 helo=smtp3.home.nl) by smtpq3.home.nl with esmtp (Exim 4.30) id 1DduxW-0005WB-6W; Thu, 02 Jun 2005 21:02:26 +0200 Received: from [10.100.3.12] (port=33042 helo=mail.home.nl) by smtp3.home.nl with smtp (Exim 4.30) id 1DduxU-00011G-VC; Thu, 02 Jun 2005 21:02:24 +0200 X-Mailer: Openwave WebEngine, version 2.8.12 (webedge20-101-197-20030912) X-Originating-IP: [213.84.184.98] From: To: Jiri Benc CC: , , Subject: Antw: Re: [PATCH] ieee80211: Update generic definitions to latest specs. Date: Thu, 2 Jun 2005 21:02:24 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-AtHome-MailScanner-Information: Neem contact op met support@home.nl voor meer informatie X-AtHome-MailScanner: Found to be clean X-archive-position: 1979 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gwingerde@home.nl Precedence: bulk X-list: netdev Content-Length: 724 Lines: 22 On Thu, 02 Jun 2005, Jiri Benc wrote: > On Wed, 01 Jun 2005 22:50:51 +0200, Gertjan van Wingerde wrote: > > +#define WLAN_STATUS_ASSOC_DENIED_SPECTRUM_MGMT_REQUIRED 22 > > +#define WLAN_STATUS_ASSOC_REJECTED_POWER_CAP_UNACCEPTABLE 23 > > +#define WLAN_STATUS_ASSOC_REJECTED_SUPP_CHANNELS_UNACCEPTABLE 24 > > (...) 
> > +/* 802.11h */ > > +#define WLAN_REASON_DISASSOC_POWER_CAP_UNACCEPTABLE 10 > > +#define WLAN_REASON_DISASSOC_SUPP_CHANNELS_UNACCEPTABLE 11 > > Aren't these identifiers a bit too long? It seems to be impractical to use > them. > I was thinking about that too, but couldn't find a proper shorter version without losing the descriptive meaning. Do you have any suggestions to shorten them? BR, Gertjan From davem@davemloft.net Thu Jun 2 13:07:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:07:58 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52K7sXq019659 for ; Thu, 2 Jun 2005 13:07:54 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Ddvxo-0001uE-Nv; Thu, 02 Jun 2005 13:06:48 -0700 Date: Thu, 02 Jun 2005 13:06:48 -0700 (PDT) Message-Id: <20050602.130648.75428139.davem@davemloft.net> To: bunk@stusta.de Cc: akpm@osdl.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, yoshfuji@linux-ipv6.org Subject: Re: [2.6 patch] net/ipv6/ipv6_syms.c: unexport fl6_sock_lookup From: "David S. Miller" In-Reply-To: <20050530205653.GZ10441@stusta.de> References: <20050530205653.GZ10441@stusta.de> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1981 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 254 Lines: 9 From: Adrian Bunk Date: Mon, 30 May 2005 22:56:53 +0200 > There is no usage of this EXPORT_SYMBOL in the kernel. > > Signed-off-by: Adrian Bunk > Acked-by: Hideaki YOSHIFUJI Applied, thanks. 
From davem@davemloft.net Thu Jun 2 13:03:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:03:51 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52K3lXq019130 for ; Thu, 2 Jun 2005 13:03:47 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Ddvth-0001s9-Sa; Thu, 02 Jun 2005 13:02:33 -0700 Date: Thu, 02 Jun 2005 13:02:33 -0700 (PDT) Message-Id: <20050602.130233.59653068.davem@davemloft.net> To: bunk@stusta.de Cc: ja@ssi.bg, wensong@LinuxVirtualServer.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] remove net/ipv4/ipvs/ip_vs_proto_icmp.c? From: "David S. Miller" In-Reply-To: <20050515132906.GW16549@stusta.de> References: <20050513041622.GE3603@stusta.de> <20050515132906.GW16549@stusta.de> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1980 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 189 Lines: 8 From: Adrian Bunk Date: Sun, 15 May 2005 15:29:06 +0200 > ip_vs_proto_icmp.c was never finished. > > Signed-off-by: Adrian Bunk Applied, thanks Adrian. 
From bunk@stusta.de Thu Jun 2 13:08:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:08:13 -0700 (PDT) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j52K81Xq019698 for ; Thu, 2 Jun 2005 13:08:02 -0700 Received: (qmail 32434 invoked from network); 2 Jun 2005 20:07:04 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 2 Jun 2005 20:07:04 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 67F5ABBFA9; Thu, 2 Jun 2005 22:07:02 +0200 (CEST) Date: Thu, 2 Jun 2005 22:07:02 +0200 From: Adrian Bunk To: Andrew Morton , jkmaline@cc.hut.fi, jgarzik@pobox.com Cc: linux-kernel@vger.kernel.org, hostap@shmoo.com, netdev@oss.sgi.com Subject: [-mm patch] fix recursive IPW2200 dependencies Message-ID: <20050602200701.GG4992@stusta.de> References: <20050601022824.33c8206e.akpm@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050601022824.33c8206e.akpm@osdl.org> User-Agent: Mutt/1.5.9i X-archive-position: 1982 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev Content-Length: 999 Lines: 34 On Wed, Jun 01, 2005 at 02:28:24AM -0700, Andrew Morton wrote: >... > Changes since 2.6.12-rc5-mm1: >... > +git-netdev-we18-ieee80211-wifi.patch > > Various things added and merged in netdev land. >... This results in recursive dependencies: - IPW2200 depends on NET_RADIO - IPW2200 selects IEEE80211 - IEEE80211 selects NET_RADIO This patch fixes the IPW2200 dependencies in a way that they are similar to the IPW2100 dependencies. 
Signed-off-by: Adrian Bunk --- linux-2.6.12-rc5-mm2-full/drivers/net/wireless/Kconfig.old 2005-06-02 22:04:02.000000000 +0200 +++ linux-2.6.12-rc5-mm2-full/drivers/net/wireless/Kconfig 2005-06-02 22:04:40.000000000 +0200 @@ -192,9 +192,8 @@ config IPW2200 tristate "Intel PRO/Wireless 2200BG and 2915ABG Network Connection" - depends on NET_RADIO && PCI + depends on IEEE80211 && PCI select FW_LOADER - select IEEE80211 ---help--- A driver for the Intel PRO/Wireless 2200BG and 2915ABG Network Connection adapters. From davem@davemloft.net Thu Jun 2 13:14:20 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:14:22 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52KEJXq021078 for ; Thu, 2 Jun 2005 13:14:20 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Ddw42-0001wh-Vk; Thu, 02 Jun 2005 13:13:15 -0700 Date: Thu, 02 Jun 2005 13:13:14 -0700 (PDT) Message-Id: <20050602.131314.21926883.davem@davemloft.net> To: bunk@stusta.de Cc: akpm@osdl.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [RFC: 2.6 patch] net/ipv4/: possible cleanups From: "David S. 
Miller" In-Reply-To: <20050530205651.GY10441@stusta.de> References: <20050530205651.GY10441@stusta.de> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1983 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 955 Lines: 28 From: Adrian Bunk Subject: [RFC: 2.6 patch] net/ipv4/: possible cleanups Date: Mon, 30 May 2005 22:56:51 +0200 > This patch contains the following possible cleanups: > - make needlessly global code static > - #if 0 the following unused global function: > - xfrm4_state.c: xfrm4_state_fini > - remove the following unneeded EXPORT_SYMBOL's: > - ip_output.c: ip_finish_output > - ip_output.c: sysctl_ip_default_ttl > - fib_frontend.c: ip_dev_find > - inetpeer.c: inet_peer_idlock > - ip_options.c: ip_options_compile > - ip_options.c: ip_options_undo > - tcp_ipv4.c: sysctl_max_syn_backlog > > Please review which of these changes are correct and which might > conflict with pending patches. Please keep all of the ECN implementation in the tcp_ecn.h header file, even if the routine is only called in one C file. And therefore, please do not remove the tcp_enter_quickack_mode() extern declaration from tcp.h Thanks. 
From davem@davemloft.net Thu Jun 2 13:15:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:15:35 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52KFWXq021474 for ; Thu, 2 Jun 2005 13:15:32 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Ddw5E-0001xY-Nx; Thu, 02 Jun 2005 13:14:28 -0700 Date: Thu, 02 Jun 2005 13:14:28 -0700 (PDT) Message-Id: <20050602.131428.28787855.davem@davemloft.net> To: bunk@stusta.de Cc: akpm@osdl.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/socket.c: unexport move_addr_to_kernel From: "David S. Miller" In-Reply-To: <20050530205647.GW10441@stusta.de> References: <20050530205647.GW10441@stusta.de> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1984 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 329 Lines: 10 From: Adrian Bunk Date: Mon, 30 May 2005 22:56:47 +0200 > I didn't find any modular usage in the kernel. > > Signed-off-by: Adrian Bunk Yes, but as a part of the socket kernel API, I could definitely see some out-of-tree code legitimately using this interface. Let's keep it around for now. 
From abonilla@linuxwireless.org Thu Jun 2 13:20:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:20:19 -0700 (PDT) Received: from linuxwireless.org.ve.carpathiahost.net (linuxwireless.org.ve.carpathiahost.net [66.117.45.234]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52KKEXq022260 for ; Thu, 2 Jun 2005 13:20:15 -0700 Received: from WCRSJO2KPAB047 ([200.9.49.66]) by linuxwireless.org.ve.carpathiahost.net (8.12.10/8.12.10) with SMTP id j52KJEnE002565; Thu, 2 Jun 2005 16:19:14 -0400 Reply-To: From: "Alejandro Bonilla" To: "'Adrian Bunk'" , "'Andrew Morton'" , , Cc: , , Subject: RE: [-mm patch] fix recursive IPW2200 dependencies Date: Thu, 2 Jun 2005 14:19:10 -0600 Message-ID: <003a01c567b0$56bed860$600cc60a@amer.sykes.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.6604 (9.0.2911.0) In-Reply-To: <20050602200701.GG4992@stusta.de> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1478 Importance: Normal X-archive-position: 1985 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: abonilla@linuxwireless.org Precedence: bulk X-list: netdev Content-Length: 1333 Lines: 49 > On Wed, Jun 01, 2005 at 02:28:24AM -0700, Andrew Morton wrote: > >... > > Changes since 2.6.12-rc5-mm1: > >... > > +git-netdev-we18-ieee80211-wifi.patch > > > > Various things added and merged in netdev land. > >... > > This results in recursive dependencies: > - IPW2200 depends on NET_RADIO > - IPW2200 selects IEEE80211 > - IEEE80211 selects NET_RADIO > > > This patch fixes the IPW2200 dependencies in a way that they > are similar > to the IPW2100 dependencies. 
> > Signed-off-by: Adrian Bunk > > --- > linux-2.6.12-rc5-mm2-full/drivers/net/wireless/Kconfig.old > 2005-06-02 22:04:02.000000000 +0200 > +++ linux-2.6.12-rc5-mm2-full/drivers/net/wireless/Kconfig > 2005-06-02 22:04:40.000000000 +0200 > @@ -192,9 +192,8 @@ > > config IPW2200 > tristate "Intel PRO/Wireless 2200BG and 2915ABG Network > Connection" > - depends on NET_RADIO && PCI > + depends on IEEE80211 && PCI > select FW_LOADER > - select IEEE80211 > ---help--- > A driver for the Intel PRO/Wireless 2200BG and > 2915ABG Network > Connection adapters. I think the normal usage of the name is Intel PRO/Wireless 2200BG/2915ABG Network Connection. I'm just saying this in case that you care about Intel Trademarking or about a more unified usage of the name of the Adapter. maybe this is something silly. .Alejandro From bunk@stusta.de Thu Jun 2 13:39:24 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:39:28 -0700 (PDT) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j52KdMXq023360 for ; Thu, 2 Jun 2005 13:39:23 -0700 Received: (qmail 922 invoked from network); 2 Jun 2005 20:38:25 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 2 Jun 2005 20:38:25 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 942E4AFA78; Thu, 2 Jun 2005 22:38:23 +0200 (CEST) Date: Thu, 2 Jun 2005 22:38:23 +0200 From: Adrian Bunk To: Stephen Hemminger Cc: Baruch Even , Andrew Morton , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.6.12-rc5-mm2: "bic unavailable using TCP reno" messages Message-ID: <20050602203823.GI4992@stusta.de> References: <20050601022824.33c8206e.akpm@osdl.org> <20050602121511.GE4992@stusta.de> <429F1079.5070701@ev-en.org> <20050602103805.6beb4f4e@dxpl.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20050602103805.6beb4f4e@dxpl.pdx.osdl.net> User-Agent: Mutt/1.5.9i X-archive-position: 1986 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev Content-Length: 1619 Lines: 56 On Thu, Jun 02, 2005 at 10:38:05AM -0700, Stephen Hemminger wrote: > On Thu, 02 Jun 2005 14:58:17 +0100 > Baruch Even wrote: > > >... > > Your right, the sysctl handler should be smarter, but that is not the problem here. > The problem is that the default value is set to be BIC to be compatible with earlier kernels. > Since 75% of the world isn't smart enough to figure out how to use sysctl, there is a > question of what the default should be, and what to do if that is missing. > > One version had a messy ifdef chain to try and avoid the warning: > > char sysctl_tcp_congestion_control[] = > #if defined(CONFIG_TCP_BIC) > "bic" > #elif defined(CONFIG_TCP_HTCP) > "htcp" > #else > "reno" > #endif > ; > > but that was ugly. > > Another possibility is putting it in as yet another config value at kernel build time. >... One thing that currently makes all solutions harder (and the #ifdef example above not ugly but simply wrong) is that you allow modular congestion control options for the always static net support. Is this really required? The IO schedulers have a similar problem, and they are using the #ifdef approach for selecting the default. One approach is to actually choose the default using #ifdef's. You could also do any kind of runtime selection, but please don't print the warning more than once. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. 
Buck - Dragon Seed From mchan@broadcom.com Thu Jun 2 13:55:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 13:55:44 -0700 (PDT) Received: from MMS1.broadcom.com (mms1.broadcom.com [216.31.210.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52KtfXq024547 for ; Thu, 2 Jun 2005 13:55:41 -0700 Received: from 10.10.64.121 by MMS1.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Thu, 02 Jun 2005 13:54:34 -0700 X-Server-Uuid: 146C3151-C1DE-4F71-9D02-C3BE503878DD Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Thu, 2 Jun 2005 13:54:32 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BBU46339; Thu, 2 Jun 2005 13:54:28 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id NAA08432; Thu, 2 Jun 2005 13:54:28 -0700 (PDT) Received: from 10.7.18.177 ([10.7.18.177]) by NT-IRVA-0741.brcm.ad.broadcom.com ([10.8.194.54]) with Microsoft Exchange Server HTTP-DAV ; Thu, 2 Jun 2005 20:54:27 +0000 Received: from rh4 by nt-irva-0741; 02 Jun 2005 12:56:53 -0700 Subject: Re: Locking model for NAPI drivers From: "Michael Chan" To: "David S. 
Miller" cc: netdev@oss.sgi.com In-Reply-To: <1117661650.4310.62.camel@rh4> References: <20050531.154847.63995530.davem@davemloft.net> <1117658019.4310.58.camel@rh4> <20050601.152134.120445266.davem@davemloft.net> <1117661650.4310.62.camel@rh4> Date: Thu, 02 Jun 2005 12:56:52 -0700 Message-ID: <1117742212.22670.24.camel@rh4> MIME-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-WSS-ID: 6E81AD802U44899064-01-01 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-archive-position: 1987 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev Content-Length: 1057 Lines: 27 On Wed, 2005-06-01 at 14:34 -0700, Michael Chan wrote: > On Wed, 2005-06-01 at 15:21 -0700, David S. Miller wrote: > > Since the caller shuts down NAPI ->poll(), after setting the SYNC bit > > we can just check the MAILBOX register, and if a '1' is there just > > return. Does one need to mask out the upper bits of the register in > > order to avoid seeing the IRQ tag in such a comparison? > > > No, just check for the value 1 since that's the value we use to disable > interrupts. The value read back will always be 1 if 1 was previously > written to it. > One more race condition: CPU1 CPU2 tg3_poll() __netif_rx_complete() tg3_netif_stop() netif_poll_disable() tg3_full_lock() tg3_irq_quiesce() tg3_restart_ints() BUG_ON(tp->irq_state) This race condition is somewhat harmless but I think we need to take care of it for correctness. Any simple ways to fix it? 
From john.ronciak@intel.com Thu Jun 2 14:22:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 14:22:46 -0700 (PDT) Received: from orsfmr004.jf.intel.com (fmr19.intel.com [134.134.136.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52LMRXq026133 for ; Thu, 2 Jun 2005 14:22:27 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr004.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j52LK7HM032086; Thu, 2 Jun 2005 21:20:07 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j52LK5go030673; Thu, 2 Jun 2005 21:20:05 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs041.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060214200507391 ; Thu, 02 Jun 2005 14:20:05 -0700 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Thu, 2 Jun 2005 14:19:56 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: RFC: NAPI packet weighting patch Date: Thu, 2 Jun 2005 14:19:55 -0700 Message-ID: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: RFC: NAPI packet weighting patch Thread-Index: AcVnbmGZxxYUID7BQ5qgE6xlxM/aIwASbefA From: "Ronciak, John" To: , "Jon Mason" Cc: "David S. 
Miller" , "Williams, Mitch A" , , , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" X-OriginalArrivalTime: 02 Jun 2005 21:19:56.0276 (UTC) FILETIME=[D3124340:01C567B8] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j52LMRXq026133 X-archive-position: 1988 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: john.ronciak@intel.com Precedence: bulk X-list: netdev Content-Length: 4140 Lines: 100 The DRR algorithm assumes a perfect world, where hardware resources are infinite, packets arrive continuously (or separated by very long delays), there are no bus latencies, and CPU speed is infinite. The real world is much messier: hardware starves for resources if it's not serviced quickly enough, packets arrive at inconvenient intervals (especially at 10 and 100 Mbps speeds), and buses and CPUs are slow. Thus, the driver should have the intelligence built into it to make an "intelligent" choice on what the weight should be for that driver/hardware. The calculation in the driver should take into account all the factors that the driver has access to. These include link speed, bus type and speed, processor speed and some amount of actual device FIFO size and latency smarts. The driver would use all of the factors to come up with a weight to prevent it from dropping frames and not to starve out other devices in the system or hinder performance. It seems to us that the driver is the one that knows best and should try to come up with a reasonable value for weight based on its own knowledge of the hardware. This has been showing up in our NAPI test data which Mitch is currently scrubbing for release. It shows that there is a need for either better default static weight numbers or for them to be calculated based on some system dynamic variables. 
We would like to see the latter tried but the only problem is that each driver would have to make its own calculations, and it may not have access to all of the system-wide data it would need to make a proper calculation. Even with a more intelligent driver, we still would like to see some mechanism for the weight to be changed at runtime, such as with Stephen's sysfs patch. This would allow a sysadmin (or user-space app) to tune the system based on statistical data that isn't available to the individual driver. Cheers, John > -----Original Message----- > From: jamal [mailto:hadi@cyberus.ca] > Sent: Thursday, June 02, 2005 5:27 AM > To: Jon Mason > Cc: David S. Miller; Williams, Mitch A; shemminger@osdl.org; > netdev@oss.sgi.com; Robert.Olsson@data.slu.se; Ronciak, John; > Venkatesan, Ganesh; Brandeburg, Jesse > Subject: Re: RFC: NAPI packet weighting patch > > > On Tue, 2005-31-05 at 18:28 -0500, Jon Mason wrote: > > On Tuesday 31 May 2005 05:14 pm, David S. Miller wrote: > > > From: Jon Mason > > > Date: Tue, 31 May 2005 17:07:54 -0500 > > > > > > > Of course some performace analysis would have to be > done to determine the > > > > optimal numbers for each speed/duplexity setting per driver. > > > > > > per cpu speed, per memory bus speed, per I/O bus speed, > and add in other > > > complications such as NUMA > > > > > > My point is that whatever experimental number you come up > with will be > > > good for that driver on your systems, not necessarily for others. > > > > > > Even within a system, whatever number you select will be the wrong > > > thing to use if one starts a continuous I/O stream to the SATA > > > controller in the next PCI slot, for example. > > > > > > We keep getting bitten by this, as the Altix perf data > continually shows, > > > and we need to absolutely stop thinking this way. > > > > > > The way to go is to make selections based upon observed events and > > > mesaurements. 
> > > > I'm not arguing against a /proc entry to tune dev->weight > for those sysadmins > > advanced enough to do that. I am arguing that we can make > the driver smarter > > (at little/no cost) for "out of the box" users. > > > > What is the point of making the driver "smarter"? > Recall, the algorithm used to schedule the netdevices is based on an > extension of Weighted Round Robin from Varghese et al known > as DRR (ask > gooogle for details). > The idea is to provide fairness amongst many drivers. As an > example, if > you have a gige driver it shouldnt be taking all the resources at the > expense of starving the fastether driver. > If the admin wants one driver to be more "important" than the other, > s/he will make sure it has a higher weight. > > cheers, > jamal > > From shemminger@osdl.org Thu Jun 2 14:33:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 14:33:14 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52LX1Xq027235 for ; Thu, 2 Jun 2005 14:33:01 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52LVQjA018683 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 14:31:26 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52LVQL9019198; Thu, 2 Jun 2005 14:31:26 -0700 Date: Thu, 2 Jun 2005 14:31:26 -0700 From: Stephen Hemminger To: "Ronciak, John" Cc: , "Jon Mason" , "David S. 
Miller" , "Williams, Mitch A" , , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" Subject: Re: RFC: NAPI packet weighting patch Message-ID: <20050602143126.7c302cfd@dxpl.pdx.osdl.net> In-Reply-To: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> References: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1989 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 2869 Lines: 57 On Thu, 2 Jun 2005 14:19:55 -0700 "Ronciak, John" wrote: > The DRR algorithm assumes a perfect world, where hardware resources are > infinite, packets arrive continuously (or separated by very long > delays), there are no bus latencies, and CPU speed is infinite. > > The real world is much messier: hardware starves for resources if it's > not serviced quickly enough, packets arrive at inconvenient intervals > (especially at 10 and 100 Mbps speeds), and buses and CPUs are slow. > > Thus, the driver should have the intelligence built into it to make an > "intelligent" choice on what the weight should be for that > driver/hardware. The calculation in the driver should take into account > all the factors that the driver has access to. These include link > speed, bus type and speed, processor speed and some amount of actual > device FIFO size and latency smarts. 
The driver would use all of the > factors to come up with a weight to prevent it from dropping frames and > not to starve out other devices in the system or hinder performance. It > seems to us that the driver is the one that know best and should try to > come up with a reasonable value for weight based on its own knowledge of > the hardware. This is like saying each CPU vendor should write their own process scheduler for Linux. Now with NUMA and HT, it is getting almost that bad but we still try and keep it CPU neutral. For networking the problem is worse, the "right" choice depends on workload and relationship between components in the system. I can't see how you could ever expect a driver specific solution. > This has been showing up in our NAPI test data which Mitch is currently > scrubbing for release. It shows that there is a need for either better > default static weight numbers or for them to be calculated based on some > system dynamic variables. We would like to see the latter tried but the > only problem is that each driver would have to make its own > calculations, and it may not have access to all of the system-wide data > it would need to make a proper calculation. And for other workloads, and other systems (think about the Altix with long access latencies), your numbers will be wrong. Perhaps we need to quit trying for a perfect solution and just get a "good enough" one that works. Let's keep the intelligence out of the driver. Most of the existing smart drivers end up looking like crap and don't work that well. > Even with a more intelligent driver, we still would like to see some > mechanism for the weight to be changed at runtime, such as with > Stephen's sysfs patch. This would allow a sysadmin (or user-space app) > to tune the system based on statistical data that isn't available to the > individual driver. > It will be yet another knob that all except the benchmark tweakers can ignore (hopefully). 
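For readers following the weight argument, here is a minimal user-space sketch of the fairness idea being discussed: each device gets a per-round quantum (its weight) and the whole round is capped by a global budget. It is a simplified weighted round robin in the spirit of DRR, not the kernel's net_rx_action(); struct sim_dev, sim_poll() and sim_rx_round() are hypothetical names invented for illustration, and the deficit carry-over of real DRR is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a NAPI device; not the kernel's struct net_device. */
struct sim_dev {
    int weight;     /* per-round quantum, playing the role of dev->weight */
    int quota;      /* credit left in the current round */
    int backlog;    /* packets waiting in the RX ring */
    int processed;  /* total packets handled so far */
};

/* Handle at most min(backlog, quota, budget) packets; return how many. */
static int sim_poll(struct sim_dev *d, int budget)
{
    int n = d->backlog;

    if (n > d->quota)
        n = d->quota;
    if (n > budget)
        n = budget;

    d->backlog -= n;
    d->quota -= n;
    d->processed += n;
    return n;
}

/*
 * One scheduling round: refill every device's quota from its weight and
 * poll the devices in order until the global budget is used up.
 * Returns the total number of packets processed in the round.
 */
static int sim_rx_round(struct sim_dev *devs, size_t ndevs, int budget)
{
    int done = 0;

    for (size_t i = 0; i < ndevs && budget > 0; i++) {
        assert(devs[i].weight > 0);
        devs[i].quota = devs[i].weight;
        int n = sim_poll(&devs[i], budget);
        budget -= n;
        done += n;
    }
    return done;
}
```

With a weight-64 device and a weight-16 device both saturated, one round hands out work in a 4:1 ratio, which is the sense in which a larger weight makes a device "more important". Neither device can take the whole budget.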
From mmporter@cox.net Thu Jun 2 14:35:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 14:35:08 -0700 (PDT) Received: from fed1rmmtao06.cox.net (fed1rmmtao06.cox.net [68.230.241.33]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52LZ2Xq028054 for ; Thu, 2 Jun 2005 14:35:03 -0700 Received: from liberty.homelinux.org ([68.2.41.86]) by fed1rmmtao06.cox.net (InterMail vM.6.01.04.00 201-2131-118-20041027) with ESMTP id <20050602213405.WKPG19494.fed1rmmtao06.cox.net@liberty.homelinux.org>; Thu, 2 Jun 2005 17:34:05 -0400 Received: (from mmporter@localhost) by liberty.homelinux.org (8.9.3/8.9.3/Debian 8.9.3-21) id OAA26210; Thu, 2 Jun 2005 14:34:04 -0700 Date: Thu, 2 Jun 2005 14:34:04 -0700 From: Matt Porter To: torvalds@osdl.org, akpm@osdl.org, jgarzik@pobox.com Cc: linux-kernel@vger.kernel.org, linuxppc-embedded@ozlabs.org, netdev@oss.sgi.com Subject: [PATCH][5/5] RapidIO support: net driver over messaging Message-ID: <20050602143404.F24818@cox.net> References: <20050602140359.B24818@cox.net> <20050602141247.C24818@cox.net> <20050602141946.D24818@cox.net> <20050602142509.E24818@cox.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20050602142509.E24818@cox.net>; from mporter@kernel.crashing.org on Thu, Jun 02, 2005 at 02:25:10PM -0700 X-archive-position: 1990 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mporter@kernel.crashing.org Precedence: bulk X-list: netdev Content-Length: 17633 Lines: 670 Adds an "Ethernet" driver which sends Ethernet packets over the standard RapidIO messaging. This depends on the core RIO patch for mailbox/doorbell access. 
Signed-off-by: Matt Porter Index: drivers/net/Kconfig =================================================================== --- 711ec47634f5d5ded866eaa965a0f7dadcbc65f4/drivers/net/Kconfig (mode:100644) +++ 8bdd37ff79724c95795ed39c28588a94e1f13e60/drivers/net/Kconfig (mode:100644) @@ -2185,6 +2185,20 @@ tristate "iSeries Virtual Ethernet driver support" depends on NETDEVICES && PPC_ISERIES +config RIONET + tristate "RapidIO Ethernet over messaging driver support" + depends on NETDEVICES && RAPIDIO + +config RIONET_TX_SIZE + int "Number of outbound queue entries" + depends on RIONET + default "128" + +config RIONET_RX_SIZE + int "Number of inbound queue entries" + depends on RIONET + default "128" + config FDDI bool "FDDI driver support" depends on NETDEVICES && (PCI || EISA) Index: drivers/net/Makefile =================================================================== --- 711ec47634f5d5ded866eaa965a0f7dadcbc65f4/drivers/net/Makefile (mode:100644) +++ 8bdd37ff79724c95795ed39c28588a94e1f13e60/drivers/net/Makefile (mode:100644) @@ -58,6 +58,7 @@ obj-$(CONFIG_VIA_RHINE) += via-rhine.o obj-$(CONFIG_VIA_VELOCITY) += via-velocity.o obj-$(CONFIG_ADAPTEC_STARFIRE) += starfire.o +obj-$(CONFIG_RIONET) += rionet.o # # end link order section Index: drivers/net/rionet.c =================================================================== --- /dev/null (tree:711ec47634f5d5ded866eaa965a0f7dadcbc65f4) +++ 8bdd37ff79724c95795ed39c28588a94e1f13e60/drivers/net/rionet.c (mode:100644) @@ -0,0 +1,622 @@ +/* + * rionet - Ethernet driver over RapidIO messaging services + * + * Copyright 2005 MontaVista Software, Inc. + * Matt Porter + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#define DRV_NAME "rionet" +#define DRV_VERSION "0.1" +#define DRV_AUTHOR "Matt Porter " +#define DRV_DESC "Ethernet over RapidIO" + +MODULE_AUTHOR(DRV_AUTHOR); +MODULE_DESCRIPTION(DRV_DESC); +MODULE_LICENSE("GPL"); + +#define RIONET_DEFAULT_MSGLEVEL 0 +#define RIONET_DOORBELL_JOIN 0x1000 +#define RIONET_DOORBELL_LEAVE 0x1001 + +#define RIONET_MAILBOX 0 + +#define RIONET_TX_RING_SIZE CONFIG_RIONET_TX_SIZE +#define RIONET_RX_RING_SIZE CONFIG_RIONET_RX_SIZE + +LIST_HEAD(rionet_peers); + +struct rionet_private { + struct rio_mport *mport; + struct sk_buff *rx_skb[RIONET_RX_RING_SIZE]; + struct sk_buff *tx_skb[RIONET_TX_RING_SIZE]; + struct net_device_stats stats; + int rx_slot; + int tx_slot; + int tx_cnt; + int ack_slot; + spinlock_t lock; + u32 msg_enable; +}; + +struct rionet_peer { + struct list_head node; + struct rio_dev *rdev; + struct resource *res; +}; + +static int rionet_check = 0; +static int rionet_capable = 1; +static struct net_device *sndev = NULL; + +/* + * This is a fast lookup table for for translating TX + * Ethernet packets into a destination RIO device. It + * could be made into a hash table to save memory depending + * on system trade-offs. 
+ */ +static struct rio_dev *rionet_active[RIO_MAX_ROUTE_ENTRIES]; + +#define is_rionet_capable(pef, src_ops, dst_ops) \ + ((pef & RIO_PEF_INB_MBOX) && \ + (pef & RIO_PEF_INB_DOORBELL) && \ + (src_ops & RIO_SRC_OPS_DOORBELL) && \ + (dst_ops & RIO_DST_OPS_DOORBELL)) +#define dev_rionet_capable(dev) \ + is_rionet_capable(dev->pef, dev->src_ops, dev->dst_ops) + +#define RIONET_MAC_MATCH(x) (*(u32 *)x == 0x00010001) +#define RIONET_GET_DESTID(x) (*(u16 *)(x + 4)) + +static struct net_device_stats *rionet_stats(struct net_device *ndev) +{ + struct rionet_private *rnet = ndev->priv; + return &rnet->stats; +} + +static int rionet_rx_clean(struct net_device *ndev) +{ + int i; + int error = 0; + struct rionet_private *rnet = ndev->priv; + void *data; + + i = rnet->rx_slot; + + do { + if (!rnet->rx_skb[i]) { + rnet->stats.rx_dropped++; + continue; + } + + if (!(data = rio_get_inb_message(rnet->mport, RIONET_MAILBOX))) + break; + + rnet->rx_skb[i]->data = data; + skb_put(rnet->rx_skb[i], RIO_MAX_MSG_SIZE); + rnet->rx_skb[i]->dev = sndev; + rnet->rx_skb[i]->protocol = + eth_type_trans(rnet->rx_skb[i], sndev); + error = netif_rx(rnet->rx_skb[i]); + + if (error == NET_RX_DROP) { + rnet->stats.rx_dropped++; + } else if (error == NET_RX_BAD) { + if (netif_msg_rx_err(rnet)) + printk(KERN_WARNING "%s: bad rx packet\n", + DRV_NAME); + rnet->stats.rx_errors++; + } else { + rnet->stats.rx_packets++; + rnet->stats.rx_bytes += RIO_MAX_MSG_SIZE; + } + + } while ((i = (i + 1) % RIONET_RX_RING_SIZE) != rnet->rx_slot); + + return i; +} + +static void rionet_rx_fill(struct net_device *ndev, int end) +{ + int i; + struct rionet_private *rnet = ndev->priv; + + i = rnet->rx_slot; + do { + rnet->rx_skb[i] = dev_alloc_skb(RIO_MAX_MSG_SIZE); + + if (!rnet->rx_skb[i]) + break; + + rio_add_inb_buffer(rnet->mport, RIONET_MAILBOX, + rnet->rx_skb[i]->data); + } while ((i = (i + 1) % RIONET_RX_RING_SIZE) != end); + + rnet->rx_slot = i; +} + +static int rionet_queue_tx_msg(struct sk_buff *skb, struct 
net_device *ndev, + struct rio_dev *rdev) +{ + struct rionet_private *rnet = ndev->priv; + + rio_add_outb_message(rnet->mport, rdev, 0, skb->data, skb->len); + rnet->tx_skb[rnet->tx_slot] = skb; + + rnet->stats.tx_packets++; + rnet->stats.tx_bytes += skb->len; + + if (++rnet->tx_cnt == RIONET_TX_RING_SIZE) + netif_stop_queue(ndev); + + if (++rnet->tx_slot == RIONET_TX_RING_SIZE) + rnet->tx_slot = 0; + + if (netif_msg_tx_queued(rnet)) + printk(KERN_INFO "%s: queued skb %8.8x len %8.8x\n", DRV_NAME, + (u32) skb, skb->len); + + return 0; +} + +static int rionet_start_xmit(struct sk_buff *skb, struct net_device *ndev) +{ + int i; + struct rionet_private *rnet = ndev->priv; + struct ethhdr *eth = (struct ethhdr *)skb->data; + u16 destid; + + spin_lock_irq(&rnet->lock); + + if ((rnet->tx_cnt + 1) > RIONET_TX_RING_SIZE) { + netif_stop_queue(ndev); + spin_unlock_irq(&rnet->lock); + return -EBUSY; + } + + if (eth->h_dest[0] & 0x01) { + /* + * XXX Need to delay queuing if ring max is reached, + * flush additional packets in tx_event() before + * awakening the queue. We can easily exceed ring + * size with a large number of nodes or even a + * small number where the ring is relatively full + * on entrance to hard_start_xmit. 
+ */ + for (i = 0; i < RIO_MAX_ROUTE_ENTRIES; i++) + if (rionet_active[i]) + rionet_queue_tx_msg(skb, ndev, + rionet_active[i]); + } else if (RIONET_MAC_MATCH(eth->h_dest)) { + destid = RIONET_GET_DESTID(eth->h_dest); + if (rionet_active[destid]) + rionet_queue_tx_msg(skb, ndev, rionet_active[destid]); + } + + spin_unlock_irq(&rnet->lock); + + return 0; +} + +static int rionet_set_mac_address(struct net_device *ndev, void *p) +{ + struct sockaddr *addr = p; + + if (!is_valid_ether_addr(addr->sa_data)) + return -EADDRNOTAVAIL; + + memcpy(ndev->dev_addr, addr->sa_data, ndev->addr_len); + + return 0; +} + +static int rionet_change_mtu(struct net_device *ndev, int new_mtu) +{ + struct rionet_private *rnet = ndev->priv; + + if (netif_msg_drv(rnet)) + printk(KERN_WARNING + "%s: rionet_change_mtu(): not implemented\n", DRV_NAME); + + return 0; +} + +static void rionet_set_multicast_list(struct net_device *ndev) +{ + struct rionet_private *rnet = ndev->priv; + + if (netif_msg_drv(rnet)) + printk(KERN_WARNING + "%s: rionet_set_multicast_list(): not implemented\n", + DRV_NAME); +} + +static void rionet_dbell_event(struct rio_mport *mport, u16 sid, u16 tid, + u16 info) +{ + struct net_device *ndev = sndev; + struct rionet_private *rnet = ndev->priv; + struct rionet_peer *peer; + + if (netif_msg_intr(rnet)) + printk(KERN_INFO "%s: doorbell sid %4.4x tid %4.4x info %4.4x", + DRV_NAME, sid, tid, info); + if (info == RIONET_DOORBELL_JOIN) { + if (!rionet_active[sid]) { + list_for_each_entry(peer, &rionet_peers, node) { + if (peer->rdev->destid == sid) + rionet_active[sid] = peer->rdev; + } + rio_mport_send_doorbell(mport, sid, + RIONET_DOORBELL_JOIN); + } + } else if (info == RIONET_DOORBELL_LEAVE) { + rionet_active[sid] = NULL; + } else { + if (netif_msg_intr(rnet)) + printk(KERN_WARNING "%s: unhandled doorbell\n", + DRV_NAME); + } +} + +static void rionet_inb_msg_event(struct rio_mport *mport, int mbox, int slot) +{ + int n; + struct net_device *ndev = sndev; + struct 
rionet_private *rnet = (struct rionet_private *)ndev->priv; + + if (netif_msg_intr(rnet)) + printk(KERN_INFO "%s: inbound message event, mbox %d slot %d\n", + DRV_NAME, mbox, slot); + + spin_lock(&rnet->lock); + if ((n = rionet_rx_clean(ndev)) != rnet->rx_slot) + rionet_rx_fill(ndev, n); + spin_unlock(&rnet->lock); +} + +static void rionet_outb_msg_event(struct rio_mport *mport, int mbox, int slot) +{ + struct net_device *ndev = sndev; + struct rionet_private *rnet = ndev->priv; + + spin_lock(&rnet->lock); + + if (netif_msg_intr(rnet)) + printk(KERN_INFO + "%s: outbound message event, mbox %d slot %d\n", + DRV_NAME, mbox, slot); + + while (rnet->tx_cnt && (rnet->ack_slot != slot)) { + /* dma unmap single */ + dev_kfree_skb_irq(rnet->tx_skb[rnet->ack_slot]); + rnet->tx_skb[rnet->ack_slot] = NULL; + if (++rnet->ack_slot == RIONET_TX_RING_SIZE) + rnet->ack_slot = 0; + rnet->tx_cnt--; + } + + if (rnet->tx_cnt < RIONET_TX_RING_SIZE) + netif_wake_queue(ndev); + + spin_unlock(&rnet->lock); +} + +static int rionet_open(struct net_device *ndev) +{ + int i, rc = 0; + struct rionet_peer *peer, *tmp; + u32 pwdcsr; + struct rionet_private *rnet = ndev->priv; + + if (netif_msg_ifup(rnet)) + printk(KERN_INFO "%s: open\n", DRV_NAME); + + if ((rc = rio_request_inb_dbell(rnet->mport, + RIONET_DOORBELL_JOIN, + RIONET_DOORBELL_LEAVE, + rionet_dbell_event)) < 0) + goto out; + + if ((rc = rio_request_inb_mbox(rnet->mport, + RIONET_MAILBOX, + RIONET_RX_RING_SIZE, + rionet_inb_msg_event)) < 0) + goto out; + + if ((rc = rio_request_outb_mbox(rnet->mport, + RIONET_MAILBOX, + RIONET_TX_RING_SIZE, + rionet_outb_msg_event)) < 0) + goto out; + + /* Initialize inbound message ring */ + for (i = 0; i < RIONET_RX_RING_SIZE; i++) + rnet->rx_skb[i] = NULL; + rnet->rx_slot = 0; + rionet_rx_fill(ndev, 0); + + rnet->tx_slot = 0; + rnet->tx_cnt = 0; + rnet->ack_slot = 0; + + spin_lock_init(&rnet->lock); + + rnet->msg_enable = RIONET_DEFAULT_MSGLEVEL; + + netif_carrier_on(ndev); + 
netif_start_queue(ndev); + + list_for_each_entry_safe(peer, tmp, &rionet_peers, node) { + if (!(peer->res = rio_request_outb_dbell(peer->rdev, + RIONET_DOORBELL_JOIN, + RIONET_DOORBELL_LEAVE))) + { + printk(KERN_ERR "%s: error requesting doorbells\n", + DRV_NAME); + continue; + } + + /* + * If device has initialized inbound doorbells, + * send a join message + */ + rio_read_config_32(peer->rdev, RIO_WRITE_PORT_CSR, &pwdcsr); + if (pwdcsr & RIO_DOORBELL_AVAIL) + rio_send_doorbell(peer->rdev, RIONET_DOORBELL_JOIN); + } + + out: + return rc; +} + +static int rionet_close(struct net_device *ndev) +{ + struct rionet_private *rnet = (struct rionet_private *)ndev->priv; + struct rionet_peer *peer, *tmp; + int i; + + if (netif_msg_ifup(rnet)) + printk(KERN_INFO "%s: close\n", DRV_NAME); + + netif_stop_queue(ndev); + netif_carrier_off(ndev); + + for (i = 0; i < RIONET_RX_RING_SIZE; i++) + if (rnet->rx_skb[i]) + kfree_skb(rnet->rx_skb[i]); + + list_for_each_entry_safe(peer, tmp, &rionet_peers, node) { + if (rionet_active[peer->rdev->destid]) { + rio_send_doorbell(peer->rdev, RIONET_DOORBELL_LEAVE); + rionet_active[peer->rdev->destid] = NULL; + } + rio_release_outb_dbell(peer->rdev, peer->res); + } + + rio_release_inb_dbell(rnet->mport, RIONET_DOORBELL_JOIN, + RIONET_DOORBELL_LEAVE); + rio_release_inb_mbox(rnet->mport, RIONET_MAILBOX); + rio_release_outb_mbox(rnet->mport, RIONET_MAILBOX); + + return 0; +} + +static void rionet_remove(struct rio_dev *rdev) +{ + struct net_device *ndev = NULL; + struct rionet_peer *peer, *tmp; + + unregister_netdev(ndev); + kfree(ndev); + + list_for_each_entry_safe(peer, tmp, &rionet_peers, node) { + list_del(&peer->node); + kfree(peer); + } +} + +static int rionet_ioctl(struct net_device *ndev, struct ifreq *rq, int cmd) +{ + return -EOPNOTSUPP; +} + +static void rionet_get_drvinfo(struct net_device *ndev, + struct ethtool_drvinfo *info) +{ + struct rionet_private *rnet = ndev->priv; + + strcpy(info->driver, DRV_NAME); + strcpy(info->version, 
DRV_VERSION); + strcpy(info->fw_version, "n/a"); + sprintf(info->bus_info, "RIO master port %d", rnet->mport->id); +} + +static u32 rionet_get_msglevel(struct net_device *ndev) +{ + struct rionet_private *rnet = ndev->priv; + + return rnet->msg_enable; +} + +static void rionet_set_msglevel(struct net_device *ndev, u32 value) +{ + struct rionet_private *rnet = ndev->priv; + + rnet->msg_enable = value; +} + +static u32 rionet_get_link(struct net_device *ndev) +{ + return netif_carrier_ok(ndev); +} + +static struct ethtool_ops rionet_ethtool_ops = { + .get_drvinfo = rionet_get_drvinfo, + .get_msglevel = rionet_get_msglevel, + .set_msglevel = rionet_set_msglevel, + .get_link = rionet_get_link, +}; + +static int rionet_setup_netdev(struct rio_mport *mport) +{ + int rc = 0; + struct net_device *ndev = NULL; + struct rionet_private *rnet; + u16 device_id; + + /* Allocate our net_device structure */ + ndev = alloc_etherdev(sizeof(struct rionet_private)); + if (ndev == NULL) { + printk(KERN_INFO "%s: could not allocate ethernet device.\n", + DRV_NAME); + rc = -ENOMEM; + goto out; + } + + /* + * XXX hack, store point a static at ndev so we can get it... + * Perhaps need an array of these that the handler can + * index via the mbox number. 
+ */ + sndev = ndev; + + /* Set up private area */ + rnet = (struct rionet_private *)ndev->priv; + rnet->mport = mport; + + /* Set the default MAC address */ + device_id = rio_local_get_device_id(mport); + ndev->dev_addr[0] = 0x00; + ndev->dev_addr[1] = 0x01; + ndev->dev_addr[2] = 0x00; + ndev->dev_addr[3] = 0x01; + ndev->dev_addr[4] = device_id >> 8; + ndev->dev_addr[5] = device_id & 0xff; + + /* Fill in the driver function table */ + ndev->open = &rionet_open; + ndev->hard_start_xmit = &rionet_start_xmit; + ndev->stop = &rionet_close; + ndev->get_stats = &rionet_stats; + ndev->change_mtu = &rionet_change_mtu; + ndev->set_mac_address = &rionet_set_mac_address; + ndev->set_multicast_list = &rionet_set_multicast_list; + ndev->do_ioctl = &rionet_ioctl; + SET_ETHTOOL_OPS(ndev, &rionet_ethtool_ops); + + ndev->mtu = RIO_MAX_MSG_SIZE - 14; + + SET_MODULE_OWNER(ndev); + + rc = register_netdev(ndev); + if (rc != 0) + goto out; + + printk("%s: %s %s Version %s, MAC %02x:%02x:%02x:%02x:%02x:%02x\n", + ndev->name, + DRV_NAME, + DRV_DESC, + DRV_VERSION, + ndev->dev_addr[0], ndev->dev_addr[1], ndev->dev_addr[2], + ndev->dev_addr[3], ndev->dev_addr[4], ndev->dev_addr[5]); + + out: + return rc; +} + +/* + * XXX Make multi-net safe + */ +static int rionet_probe(struct rio_dev *rdev, const struct rio_device_id *id) +{ + int rc = -ENODEV; + u32 lpef, lsrc_ops, ldst_ops; + struct rionet_peer *peer; + + /* If local device is not rionet capable, give up quickly */ + if (!rionet_capable) + goto out; + + /* + * First time through, make sure local device is rionet + * capable, setup netdev, and set flags so this is skipped + * on later probes + */ + if (!rionet_check) { + rio_local_read_config_32(rdev->net->hport, RIO_PEF_CAR, &lpef); + rio_local_read_config_32(rdev->net->hport, RIO_SRC_OPS_CAR, + &lsrc_ops); + rio_local_read_config_32(rdev->net->hport, RIO_DST_OPS_CAR, + &ldst_ops); + if (!is_rionet_capable(lpef, lsrc_ops, ldst_ops)) { + printk(KERN_ERR + "%s: local device is not network 
capable\n", + DRV_NAME); + rionet_check = 1; + rionet_capable = 0; + goto out; + } + + rc = rionet_setup_netdev(rdev->net->hport); + rionet_check = 1; + } + + /* + * If the remote device has mailbox/doorbell capabilities, + * add it to the peer list. + */ + if (dev_rionet_capable(rdev)) { + if (!(peer = kmalloc(sizeof(struct rionet_peer), GFP_KERNEL))) { + rc = -ENOMEM; + goto out; + } + peer->rdev = rdev; + list_add_tail(&peer->node, &rionet_peers); + } + + out: + return rc; +} + +static struct rio_device_id rionet_id_table[] = { + {RIO_DEVICE(RIO_ANY_ID, RIO_ANY_ID)} +}; + +static struct rio_driver rionet_driver = { + .name = "rionet", + .id_table = rionet_id_table, + .probe = rionet_probe, + .remove = rionet_remove, +}; + +static int __init rionet_init(void) +{ + return rio_register_driver(&rionet_driver); +} + +static void __exit rionet_exit(void) +{ + rio_unregister_driver(&rionet_driver); +} + +module_init(rionet_init); +module_exit(rionet_exit); From davem@davemloft.net Thu Jun 2 14:41:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 14:41:19 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52LfFXq028972 for ; Thu, 2 Jun 2005 14:41:15 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DdxQ3-00058E-Ju; Thu, 02 Jun 2005 14:40:03 -0700 Date: Thu, 02 Jun 2005 14:40:03 -0700 (PDT) Message-Id: <20050602.144003.35660495.davem@davemloft.net> To: shemminger@osdl.org Cc: john.ronciak@intel.com, hadi@cyberus.ca, jdmason@us.ibm.com, mitch.a.williams@intel.com, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. 
Miller" In-Reply-To: <20050602143126.7c302cfd@dxpl.pdx.osdl.net> References: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> <20050602143126.7c302cfd@dxpl.pdx.osdl.net> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1991 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1077 Lines: 25 From: Stephen Hemminger Date: Thu, 2 Jun 2005 14:31:26 -0700 > For networking the problem is worse, the "right" choice depends on workload > and relationship between components in the system. I can't see how you could > ever expect a driver specific solution. I totally agree, even the mere concept of driver-centric decisions in this area is pretty bogus. > And for other workloads, and other systems (think about the Altix with > long access latencies), your numbers will be wrong. Perhaps we need > to quit trying for a perfect solution and just get a "good enough" one > that works. I don't understand why nobody is investigating doing this stuff by generic measurements that the core kernel can perform. The generic ->poll() runner code can say, wow it took N-usec to process M packets, perhaps I should adjust the weight. I haven't seen one concrete suggestion along those lines, yet that is where the answer to this kind of stuff is. Those kinds of solutions are completely CPU, memory, I/O bus, network device, and workload independent. 
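The "it took N usec to process M packets, perhaps I should adjust the weight" idea can be sketched in user space roughly as follows. This is only one possible reading of the suggestion, not code from any kernel tree: adjust_weight(), the target time slice, the WEIGHT_MIN/WEIGHT_MAX clamps and the halfway damping step are all assumptions made for illustration.

```c
#include <assert.h>

#define WEIGHT_MIN 8    /* assumed lower clamp */
#define WEIGHT_MAX 64   /* assumed upper clamp */

/*
 * Hypothetical helper: given the weight used for the last poll, how long
 * that poll took (usecs) and how many packets it handled, return a new
 * weight that aims to keep one poll within target_usecs.
 */
static int adjust_weight(int weight, unsigned int usecs,
                         unsigned int packets, unsigned int target_usecs)
{
    assert(weight > 0);

    /* No measurement available: keep the current weight. */
    if (usecs == 0 || packets == 0)
        return weight;

    /* Packets that would have fit into the target slice at the observed rate. */
    unsigned long long fit =
        (unsigned long long)packets * target_usecs / usecs;

    /* Move only halfway toward the estimate to damp oscillation. */
    long long next = ((long long)weight + (long long)fit) / 2;

    if (next < WEIGHT_MIN)
        next = WEIGHT_MIN;
    if (next > WEIGHT_MAX)
        next = WEIGHT_MAX;
    return (int)next;
}
```

Nothing here depends on link speed, bus type or driver internals: the only inputs are what the generic poll loop can observe. For instance, a poll at weight 64 that took 200 usec for 64 packets against a 100 usec target would be stepped down toward 32.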
From jdmason@us.ibm.com Thu Jun 2 14:53:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 14:53:06 -0700 (PDT) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52Lr3Xq000575 for ; Thu, 2 Jun 2005 14:53:03 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e33.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j52Lq6mD244468 for ; Thu, 2 Jun 2005 17:52:06 -0400 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j52Lq5Jj038004 for ; Thu, 2 Jun 2005 15:52:05 -0600 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j52Lq4EK018604 for ; Thu, 2 Jun 2005 15:52:05 -0600 Received: from [192.168.0.29] (dreadnought.austin.ibm.com [9.53.90.32]) by d03av04.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j52Lq4KL018571; Thu, 2 Jun 2005 15:52:04 -0600 From: Jon Mason Organization: IBM To: Stephen Hemminger Subject: Re: RFC: NAPI packet weighting patch Date: Thu, 2 Jun 2005 16:51:48 -0500 User-Agent: KMail/1.7.2 Cc: "Ronciak, John" , hadi@cyberus.ca, "David S. 
Miller" , "Williams, Mitch A" , netdev@oss.sgi.com, Robert.Olsson@data.slu.se, "Venkatesan, Ganesh" , "Brandeburg, Jesse" References: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> <20050602143126.7c302cfd@dxpl.pdx.osdl.net> In-Reply-To: <20050602143126.7c302cfd@dxpl.pdx.osdl.net> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200506021651.49013.jdmason@us.ibm.com> X-archive-position: 1992 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jdmason@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1355 Lines: 26 On Thursday 02 June 2005 04:31 pm, Stephen Hemminger wrote: <...> > For networking the problem is worse, the "right" choice depends on workload > and relationship between components in the system. I can't see how you > could ever expect a driver specific solution. I think there is a way to do a generic driver NAPI enhancement: modify the weight depending on link speed. Here is the problem as I see it: NAPI enablement for slow media speeds causes unneeded strain on the system. This is because of the "weight" of NAPI. Let's look at e1000 as an example. Currently the NAPI weight is 64, regardless of link media speed. This weight is probably fine for a gigabit link, but for 10/100 this is way too large, causing interrupts to be enabled/disabled after every poll/interrupt. Lots of overhead, and not very smart. Why not have the driver set the weight to 16/32 respectively for 10/100 (or better yet, have someone run numbers to find weights that are closer to what the adapter can actually use)? While these numbers may not be optimal for every system, this is much better than the current system, and would only require 5 or so extra lines of code per NAPI enabled driver. 
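The "5 or so extra lines" could look something like the sketch below, written as plain user-space C so it can be compiled on its own. weight_for_speed() is a hypothetical helper name; the values 16, 32 and 64 are the ones floated in this message, not measured optima, and a real driver would call something like this from its link-change handling and assign the result to its weight field.

```c
#include <assert.h>

/*
 * Hypothetical helper: choose a NAPI weight from the negotiated link
 * speed in Mbps, using the 16/32/64 values proposed in this thread.
 */
static int weight_for_speed(int speed_mbps)
{
    assert(speed_mbps > 0);

    if (speed_mbps <= 10)
        return 16;
    if (speed_mbps <= 100)
        return 32;
    return 64;  /* gigabit and faster keep the current default */
}
```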
For those who want to have an optimal weight for their tuned system, let them use the /proc entry that is being proposed. Thanks, Jon From shemminger@osdl.org Thu Jun 2 15:06:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 15:06:48 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52M6jXq001487 for ; Thu, 2 Jun 2005 15:06:45 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j52M5hjA021673 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 2 Jun 2005 15:05:43 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j52M5hJ8021042; Thu, 2 Jun 2005 15:05:43 -0700 Date: Thu, 2 Jun 2005 15:05:43 -0700 From: Stephen Hemminger To: Matt Porter Cc: torvalds@osdl.org, akpm@osdl.org, jgarzik@pobox.com, linux-kernel@vger.kernel.org, linuxppc-embedded@ozlabs.org, netdev@oss.sgi.com Subject: Re: [PATCH][5/5] RapidIO support: net driver over messaging Message-ID: <20050602150543.7e4326b6@dxpl.pdx.osdl.net> In-Reply-To: <20050602143404.F24818@cox.net> References: <20050602140359.B24818@cox.net> <20050602141247.C24818@cox.net> <20050602141946.D24818@cox.net> <20050602142509.E24818@cox.net> <20050602143404.F24818@cox.net> Organization: Open Source Development Lab X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 1993 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk 
X-list: netdev Content-Length: 3600 Lines: 131

How much is this like ethernet? Does it still do ARP? Can it do
promiscuous receive?

> +LIST_HEAD(rionet_peers);

Does this have to be global? Not sure about the locking of this stuff,
are you relying on the RTNL?

> +
> +static int rionet_change_mtu(struct net_device *ndev, int new_mtu)
> +{
> +	struct rionet_private *rnet = ndev->priv;
> +
> +	if (netif_msg_drv(rnet))
> +		printk(KERN_WARNING
> +		       "%s: rionet_change_mtu(): not implemented\n", DRV_NAME);
> +
> +	return 0;
> +}

If you can allow any mtu then you don't need this at all. Or if you are
limited then better return an error for bad values.

> +static void rionet_set_multicast_list(struct net_device *ndev)
> +{
> +	struct rionet_private *rnet = ndev->priv;
> +
> +	if (netif_msg_drv(rnet))
> +		printk(KERN_WARNING
> +		       "%s: rionet_set_multicast_list(): not implemented\n",
> +		       DRV_NAME);
> +}

If you can't handle it then just leave dev->set_multicast_list as NULL
and all attempts to add or delete will get -EINVAL.

> +
> +static int rionet_open(struct net_device *ndev)
> +{
> +	/* Initialize inbound message ring */
> +	for (i = 0; i < RIONET_RX_RING_SIZE; i++)
> +		rnet->rx_skb[i] = NULL;
> +	rnet->rx_slot = 0;
> +	rionet_rx_fill(ndev, 0);
> +
> +	rnet->tx_slot = 0;
> +	rnet->tx_cnt = 0;
> +	rnet->ack_slot = 0;
> +
> +	spin_lock_init(&rnet->lock);
> +
> +	rnet->msg_enable = RIONET_DEFAULT_MSGLEVEL;

Better to do all initialization of the per device data in the place
it is allocated (rionet_setup_netdev).

> +
> +static int rionet_ioctl(struct net_device *ndev, struct ifreq *rq, int cmd)
> +{
> +	return -EOPNOTSUPP;
> +}

Unneeded, if dev->do_ioctl is NULL, then all private ioctls will
return -EINVAL, which is what you want.

> +
> +static u32 rionet_get_link(struct net_device *ndev)
> +{
> +	return netif_carrier_ok(ndev);
> +}

Use ethtool_op_get_link

> +
> +static int rionet_setup_netdev(struct rio_mport *mport)
> +{
> +	int rc = 0;
> +	struct net_device *ndev = NULL;
> +	struct rionet_private *rnet;
> +	u16 device_id;
> +
> +	/* Allocate our net_device structure */
> +	ndev = alloc_etherdev(sizeof(struct rionet_private));
> +	if (ndev == NULL) {
> +		printk(KERN_INFO "%s: could not allocate ethernet device.\n",
> +		       DRV_NAME);
> +		rc = -ENOMEM;
> +		goto out;
> +	}
> +
> +	/*
> +	 * XXX hack, store point a static at ndev so we can get it...
> +	 * Perhaps need an array of these that the handler can
> +	 * index via the mbox number.
> +	 */
> +	sndev = ndev;
> +
> +	/* Set up private area */
> +	rnet = (struct rionet_private *)ndev->priv;
> +	rnet->mport = mport;
> +
> +	/* Set the default MAC address */
> +	device_id = rio_local_get_device_id(mport);
> +	ndev->dev_addr[0] = 0x00;
> +	ndev->dev_addr[1] = 0x01;
> +	ndev->dev_addr[2] = 0x00;
> +	ndev->dev_addr[3] = 0x01;
> +	ndev->dev_addr[4] = device_id >> 8;
> +	ndev->dev_addr[5] = device_id & 0xff;
> +
> +	/* Fill in the driver function table */
> +	ndev->open = &rionet_open;
> +	ndev->hard_start_xmit = &rionet_start_xmit;
> +	ndev->stop = &rionet_close;
> +	ndev->get_stats = &rionet_stats;
> +	ndev->change_mtu = &rionet_change_mtu;
> +	ndev->set_mac_address = &rionet_set_mac_address;
> +	ndev->set_multicast_list = &rionet_set_multicast_list;
> +	ndev->do_ioctl = &rionet_ioctl;
> +	SET_ETHTOOL_OPS(ndev, &rionet_ethtool_ops);
> +
> +	ndev->mtu = RIO_MAX_MSG_SIZE - 14;
> +
> +	SET_MODULE_OWNER(ndev);

Can you set any ndev->features to get better performance?
Can you take >32bit data addresses? Then set HIGHDMA.
You are doing your own locking, can you use LLTX?
Does the hardware support scatter gather?
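
For the change_mtu comment in the review above, a bounds-checking variant might look roughly like the sketch below. This is only an illustration, not code from the posted patch: struct net_device here is a one-field stand-in so it builds outside the kernel, and the RIO_MAX_MSG_SIZE value, RIONET_HLEN and RIONET_MIN_MTU are assumptions made up for the example.

```c
#include <errno.h>

/* Hypothetical stand-ins so the sketch compiles outside the kernel. */
#define RIO_MAX_MSG_SIZE 4096	/* assumed value, for illustration only */
#define RIONET_HLEN      14	/* Ethernet header, as in "RIO_MAX_MSG_SIZE - 14" */
#define RIONET_MIN_MTU   68	/* assumed lower bound (IPv4 minimum MTU) */

struct net_device {
	int mtu;
};

/* Reject MTU values the transport cannot carry instead of silently
 * returning 0; a rejected request leaves ndev->mtu untouched. */
int rionet_change_mtu(struct net_device *ndev, int new_mtu)
{
	if (new_mtu < RIONET_MIN_MTU ||
	    new_mtu > RIO_MAX_MSG_SIZE - RIONET_HLEN)
		return -EINVAL;

	ndev->mtu = new_mtu;
	return 0;
}
```

With these assumed limits, a request for 2000 is accepted and one for 5000 or 10 gets -EINVAL.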
From davem@davemloft.net Thu Jun 2 15:13:24 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 15:13:27 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52MDOXq002149 for ; Thu, 2 Jun 2005 15:13:24 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DdxvA-0005GI-CT; Thu, 02 Jun 2005 15:12:12 -0700 Date: Thu, 02 Jun 2005 15:12:12 -0700 (PDT) Message-Id: <20050602.151212.35014607.davem@davemloft.net> To: jdmason@us.ibm.com Cc: shemminger@osdl.org, john.ronciak@intel.com, hadi@cyberus.ca, mitch.a.williams@intel.com, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: <200506021651.49013.jdmason@us.ibm.com> References: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> <20050602143126.7c302cfd@dxpl.pdx.osdl.net> <200506021651.49013.jdmason@us.ibm.com> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1994 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 880 Lines: 20 From: Jon Mason Date: Thu, 2 Jun 2005 16:51:48 -0500 > Why not have the driver set the weight to 16/32 respectively for the > weight (or better yet, have someone run numbers to find weight that > are closer to what the adapter can actually use)? While these > numbers may not be optimal for every system, this is much better > that the current system, and would only require 5 or so extra lines > of code per NAPI enabled driver. 
Why do this when we can adjust the weight in one spot, namely the upper level NAPI ->poll() running loop? It can measure the overhead, how many packets processed, etc. and make intelligent decisions based upon that. This is a CPU speed, memory speed, I/O bus speed, and link speed agnostic solution. The driver need not take any part in this, and the scheme will dynamically adjust to resource usage changes in the system. From Robert.Olsson@data.slu.se Thu Jun 2 15:17:05 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 15:17:14 -0700 (PDT) Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52MH4Xq002804 for ; Thu, 2 Jun 2005 15:17:05 -0700 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j52MFs9E022315; Fri, 3 Jun 2005 00:15:54 +0200 Received: by robur.slu.se (Postfix, from userid 1000) id 45CA6EE3F0; Fri, 3 Jun 2005 00:15:51 +0200 (CEST) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17055.34070.718986.664873@robur.slu.se> Date: Fri, 3 Jun 2005 00:15:50 +0200 To: Jon Mason Cc: Stephen Hemminger , "Ronciak, John" , hadi@cyberus.ca, "David S. 
Miller" , "Williams, Mitch A" , netdev@oss.sgi.com, Robert.Olsson@data.slu.se, "Venkatesan, Ganesh" , "Brandeburg, Jesse" Subject: Re: RFC: NAPI packet weighting patch In-Reply-To: <200506021651.49013.jdmason@us.ibm.com> References: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> <20050602143126.7c302cfd@dxpl.pdx.osdl.net> <200506021651.49013.jdmason@us.ibm.com> X-Mailer: VM 7.18 under Emacs 21.4.1 X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70 X-archive-position: 1995 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 1819 Lines: 43 Differentiate the meaning of weight a bit. Let weight only limit the number of pkts we deliver per ->poll. Have some other mechanism or threshold to control when interrupts are to be turned on. The first approximation for this could be to poll as long as we see any pkt on the RX ring, as interrupts seem expensive on all platforms. Cheers. --ro Jon Mason writes: > On Thursday 02 June 2005 04:31 pm, Stephen Hemminger wrote: > <...> > > For networking the problem is worse, the "right" choice depends on workload > > and relationship between components in the system. I can't see how you > > could ever expect a driver specific solution. > > I think there is a way for a generic driver NAPI enhancement. That is to > modify the weight dependent on link speed. > > Here is the problem as I see it, NAPI enablement for slow media speeds causes > unneeded strain on the system. This is because of the "weight" of NAPI. > Lets look at e1000 as an example. Currently the NAPI weight is 64, > regardless of link media speed. This weight is probably fine for a gigabit > link, but for 10/100 this is way to large. Thus causing interrupts to be > enabled/disabled after every poll/interrupt. Lots of overhead, and not very > smart.
Why not have the driver set the weight to 16/32 respectively for the > weight (or better yet, have someone run numbers to find weight that are > closer to what the adapter can actually use)? While these numbers may not be > optimal for every system, this is much better that the current system, and > would only require 5 or so extra lines of code per NAPI enabled driver. > > For those who want to have an optimal weight for their tuned system, let them > use the /proc entry that is being proposed. > > Thanks, > Jon From jdmason@us.ibm.com Thu Jun 2 15:21:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 15:21:04 -0700 (PDT) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52ML0Xq003409 for ; Thu, 2 Jun 2005 15:21:00 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e33.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j52MK3mD022896 for ; Thu, 2 Jun 2005 18:20:03 -0400 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j52MK2uC222176 for ; Thu, 2 Jun 2005 16:20:02 -0600 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j52MK1bg015726 for ; Thu, 2 Jun 2005 16:20:02 -0600 Received: from [192.168.0.29] (dreadnought.austin.ibm.com [9.53.90.32]) by d03av03.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j52MK1hV015713; Thu, 2 Jun 2005 16:20:01 -0600 From: Jon Mason Organization: IBM To: "David S. 
Miller" Subject: Re: RFC: NAPI packet weighting patch Date: Thu, 2 Jun 2005 17:19:46 -0500 User-Agent: KMail/1.7.2 Cc: shemminger@osdl.org, john.ronciak@intel.com, hadi@cyberus.ca, mitch.a.williams@intel.com, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com References: <468F3FDA28AA87429AD807992E22D07E0450BFD0@orsmsx408> <200506021651.49013.jdmason@us.ibm.com> <20050602.151212.35014607.davem@davemloft.net> In-Reply-To: <20050602.151212.35014607.davem@davemloft.net> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200506021719.47459.jdmason@us.ibm.com> X-archive-position: 1996 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jdmason@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1047 Lines: 23 On Thursday 02 June 2005 05:12 pm, David S. Miller wrote: > From: Jon Mason > Date: Thu, 2 Jun 2005 16:51:48 -0500 > > > Why not have the driver set the weight to 16/32 respectively for the > > weight (or better yet, have someone run numbers to find weight that > > are closer to what the adapter can actually use)? While these > > numbers may not be optimal for every system, this is much better > > that the current system, and would only require 5 or so extra lines > > of code per NAPI enabled driver. > > Why do this when we can adjust the weight in one spot, > namely the upper level NAPI ->poll() running loop? > > It can measure the overhead, how many packets processed, etc. > and make intelligent decisions based upon that. This is a CPU > speed, memory speed, I/O bus speed, and link speed agnostic > solution. > > The driver need not take any part in this, and the scheme will > dynamically adjust to resource usage changes in the system. Yes, a much better idea to do this generically. I 100% agree with you. 
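
To make the "adjust the weight in one spot" idea a bit more concrete, here is a toy sketch of what a core-side adjustment step could look like. Nothing in the thread specifies the actual heuristic; the doubling/halving rule, the WEIGHT_MIN/WEIGHT_MAX bounds, struct poll_state and adjust_weight() are all invented for illustration and are not kernel APIs.

```c
/* Hypothetical bounds for the per-device quota. */
#define WEIGHT_MIN 4
#define WEIGHT_MAX 64

/* Hypothetical per-device state kept by the core poll loop,
 * not by the driver. */
struct poll_state {
	int weight;
};

/* Called after each ->poll() with the number of packets the driver
 * actually processed. Returns the weight to use on the next poll. */
int adjust_weight(struct poll_state *ps, int work_done)
{
	if (work_done >= ps->weight) {
		/* Quota exhausted: the ring probably still holds packets,
		 * so allow a larger batch next time. */
		ps->weight *= 2;
		if (ps->weight > WEIGHT_MAX)
			ps->weight = WEIGHT_MAX;
	} else if (work_done < ps->weight / 2) {
		/* Less than half the quota used: shrink towards what the
		 * device is really delivering. */
		ps->weight /= 2;
		if (ps->weight < WEIGHT_MIN)
			ps->weight = WEIGHT_MIN;
	}
	return ps->weight;
}
```

The point of the sketch is only that the decision uses what the poll loop can observe (quota versus work done), so no driver needs to know its link speed. A real scheme would presumably also look at time spent and drops.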
From hadi@cyberus.ca Thu Jun 2 15:22:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 15:22:54 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52MMkXq003771 for ; Thu, 2 Jun 2005 15:22:46 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1Ddy4U-0008Hr-T6 for netdev@oss.sgi.com; Thu, 02 Jun 2005 18:21:50 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DdpmN-0002Jn-DX; Thu, 02 Jun 2005 09:30:35 -0400 Subject: Re: PATCH: explicit typing WAS(Re: PATCH: rtnetlink explicit flags setting From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: tgraf@suug.ch, netdev@oss.sgi.com In-Reply-To: <1117717493.6050.29.camel@localhost.localdomain> References: <1117197157.6688.24.camel@localhost.localdomain> <20050531.144338.112623594.davem@davemloft.net> <20050531222646.GK15391@postel.suug.ch> <20050531.153125.95894437.davem@davemloft.net> <1117717493.6050.29.camel@localhost.localdomain> Content-Type: text/plain Organization: unknown Date: Thu, 02 Jun 2005 09:30:32 -0400 Message-Id: <1117719032.6050.50.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 1997 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 344 Lines: 17 I should say this patch is against net-2.6.13.git as of 6am this morning. cheers, jamal On Thu, 2005-02-06 at 09:04 -0400, jamal wrote: > ------------- > This patch converts "unsigned flags" to use more explict types like u16 > instead and incrementally introduces NLMSG_NEW(). 
> > Signed-off-by: Jamal Hadi Salim > From ravinandan.arakali@neterion.com Thu Jun 2 16:20:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 16:20:39 -0700 (PDT) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52NKZXq010483 for ; Thu, 2 Jun 2005 16:20:36 -0700 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id j52NJ5OC005380; Thu, 2 Jun 2005 19:19:05 -0400 (EDT) Received: from rarakali ([10.16.16.57]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id j52NJ1VG016944; Thu, 2 Jun 2005 19:19:02 -0400 (EDT) From: "Ravinandan Arakali" To: "'David S. Miller'" Cc: , , , , , Subject: RE: [PATCH 2.6.12-rc4] IPv4/IPv6: UDP Large Send Offload feature Date: Thu, 2 Jun 2005 16:18:55 -0700 Message-ID: <003201c567c9$73322240$3910100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) In-Reply-To: <20050527.120215.26278001.davem@davemloft.net> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-archive-position: 1998 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 1822 Lines: 52 David, Since there seems to be pros and cons for both the approaches, we are planning to submit two separate patches(one for each approach). These patches also include the ethtool changes. In terms of performance, we did not observe any diff between the two approaches although the first approach(using SG) minimizes coalescing in driver. Also, some changes will be required in the ethtool user-level utility. 
I'm not sure if this is the right forum to submit patches for the ethtool utility as well.. Thanks, Ravi -----Original Message----- From: David S. Miller [mailto:davem@davemloft.net] Sent: Friday, May 27, 2005 12:02 PM To: ravinandan.arakali@neterion.com Cc: jgarzik@pobox.com; netdev@oss.sgi.com; raghavendra.koushik@neterion.com; leonid.grossman@neterion.com; ananda.raju@neterion.com; rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12-rc4] IPv4/IPv6: UDP Large Send Offload feature From: "Ravinandan Arakali" Date: Fri, 27 May 2005 09:32:00 -0700 > Thanks for the quick feedback. > At that time when we considered using skb_shinfo(skb)->fraglist, > it contained fragments of MTU size. So, for a 60k udp datagram > and 1500 MTU we will have 60k/1500 = 45 fragments which is > more than MAX_SKB_FRAGS(18). > > However we will relook at fraglist for the possibility of increasing > frag size to >MTU. MAX_SKB_FRAGS controls the limit of skb_shinfo(skb)->frags[] entries, not how many SKBs may be chained via skb_shinfo(skb)->fraglist, there is no limit on the latter. Note that there is much coalescing that can be performed on the SKB list data areas, particularly if UDP sendfile() is being used. But such coalescing is messy to be performing inside of the drivers. It may end up being the case that your approach ends up being a better one for these reasons. 
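
The fragment arithmetic in the quoted exchange can be checked with a few lines of C. This is a back-of-the-envelope sketch under stated assumptions: a 20-byte IPv4 header with no options, an 8-byte UDP header, and fragment payloads rounded down to a multiple of 8; ip_frag_count() is a made-up helper, and MAX_SKB_FRAGS is simply set to the 18 quoted in the mail.

```c
#define IPV4_HLEN     20	/* assumed: no IP options */
#define UDP_HLEN       8
#define MAX_SKB_FRAGS 18	/* value quoted in the mail above */

/* Number of MTU-sized IPv4 fragments needed for a UDP datagram
 * carrying udp_payload bytes of user data. */
int ip_frag_count(int udp_payload, int mtu)
{
	int per_frag = (mtu - IPV4_HLEN) & ~7;	/* fragment data is 8-byte aligned */
	int total = udp_payload + UDP_HLEN;	/* UDP header travels in the first fragment */

	return (total + per_frag - 1) / per_frag;
}
```

Under these assumptions a 60000-byte payload at MTU 1500 needs 41 fragments, and the largest possible payload (65507 bytes) needs 45, which is consistent with the figure in the quoted mail. Either count is well above 18, but as noted above that limit applies to frags[] entries, not to SKBs chained on the fraglist.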
From davem@davemloft.net Thu Jun 2 16:23:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 16:23:15 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52NNCXq010825 for ; Thu, 2 Jun 2005 16:23:13 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Ddz0m-0006ic-Pq; Thu, 02 Jun 2005 16:22:04 -0700 Date: Thu, 02 Jun 2005 16:22:04 -0700 (PDT) Message-Id: <20050602.162204.68041633.davem@davemloft.net> To: ravinandan.arakali@neterion.com Cc: jgarzik@pobox.com, netdev@oss.sgi.com, raghavendra.koushik@neterion.com, leonid.grossman@neterion.com, ananda.raju@neterion.com, rapuru.sriram@neterion.com Subject: Re: [PATCH 2.6.12-rc4] IPv4/IPv6: UDP Large Send Offload feature From: "David S. Miller" In-Reply-To: <003201c567c9$73322240$3910100a@pc.s2io.com> References: <20050527.120215.26278001.davem@davemloft.net> <003201c567c9$73322240$3910100a@pc.s2io.com> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1999 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 831 Lines: 20 From: "Ravinandan Arakali" Date: Thu, 2 Jun 2005 16:18:55 -0700 > Since there seems to be pros and cons for both the approaches, we are > planning > to submit two separate patches(one for each approach). These patches also > include the ethtool changes. In terms of performance, we did not observe any > diff between the two approaches although the first approach(using SG) > minimizes > coalescing in driver. Ok. I think minimizing driver specific work is probably going to make the SG approach more desirable, but we'll see. 
> Also, some changes will be required in the ethtool user-level utility. > I'm not sure if this is the right forum to submit patches for the ethtool > utility as well.. Making sure jgarzik@pobox.com gets the patch is usually the way to go wrt. ethtool submissions. From davem@davemloft.net Thu Jun 2 16:37:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 16:37:41 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52NbaXq012155 for ; Thu, 2 Jun 2005 16:37:36 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DdzEi-0007AN-Q4; Thu, 02 Jun 2005 16:36:28 -0700 Date: Thu, 02 Jun 2005 16:36:28 -0700 (PDT) Message-Id: <20050602.163628.01205145.davem@davemloft.net> To: hch@lst.de Cc: netdev@oss.sgi.com Subject: Re: [PATCH] shaper.c: fix locking From: "David S. Miller" In-Reply-To: <20050601052149.GA11935@lst.de> References: <20050527115450.GA19469@lst.de> <20050531.144114.78710204.davem@davemloft.net> <20050601052149.GA11935@lst.de> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2001 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 927 Lines: 22 From: Christoph Hellwig Date: Wed, 1 Jun 2005 07:21:50 +0200 > On Tue, May 31, 2005 at 02:41:14PM -0700, David S. Miller wrote: > > From: Christoph Hellwig > > Subject: [PATCH] shaper.c: fix locking > > Date: Fri, 27 May 2005 13:54:50 +0200 > > > > > o use a semaphore instead of an opencoded and racy lock > > > o move locking out of shaper_kick and into the callers - most just > > > released the lock before calling shaper_kick > > > o remove in_interrupt() tests. 
from ->close we can always block, from > > > ->hard_start_xmit and timer context never > > > > Do you really want to use a semaphore for a lock taken > > %99 of the time in software IRQ context, which obviously > > cannot sleep? > > I want to change as little as possible from the previous variant ;-) Fair enough, patch applied. If this driver breaks as a result of these changes, you get to keep the pieces ok? :-) From davem@davemloft.net Thu Jun 2 16:36:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 16:36:25 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52NaJXq011978 for ; Thu, 2 Jun 2005 16:36:19 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DdzDU-00073m-CE; Thu, 02 Jun 2005 16:35:12 -0700 Date: Thu, 02 Jun 2005 16:35:12 -0700 (PDT) Message-Id: <20050602.163512.10298458.davem@davemloft.net> To: baruch@ev-en.org Cc: netdev@oss.sgi.com, shemminger@osdl.org, doug.leith@nuim.ie Subject: Re: Comparison of several congestion control algorithms From: "David S. Miller" In-Reply-To: <4298E045.9050009@ev-en.org> References: <4298E045.9050009@ev-en.org> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2000 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1163 Lines: 24 From: Baruch Even Date: Sat, 28 May 2005 22:19:01 +0100 > I wanted to point you to a comparison of congestion control algorithm > done at the Hamilton Institute. These experiments compare Scalable-TCP, > High-Speed TCP, FAST-TCP, BIC-TCP, H-TCP and Standard TCP. They compared > fairness, compatibility with TCP and link utilisation. 
> > You can find the results and a report at http://hamilton.ie/net/eval/ Nice work, I enjoyed this paper very much. There is something that none of these papers mention, but is essential for interpreting results. Did you use interfaces with TSO enabled? There is a very serious congestion window growth bug with TSO enabled in the current 2.6.x tree. The problem is due to congestion window validation. When we build TSO frames, even if we have packets to send, we may defer a few frames until the full TSO packet can go out. But this causes the congestion window validation checks in tcp_ack() to not pass, and thus the congestion window does not grow. I am going to have this fixed, but for now people should do congestion window algorithm tests with TSO explicitly disabled on their interfaces. From baruch@ev-en.org Thu Jun 2 16:51:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 16:51:14 -0700 (PDT) Received: from galon.ev-en.org (rrcs-24-123-59-149.central.biz.rr.com [24.123.59.149]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52NpAXq013555 for ; Thu, 2 Jun 2005 16:51:10 -0700 Received: by galon.ev-en.org (Postfix, from userid 105) id 1AFEF11A953; Fri, 3 Jun 2005 02:50:12 +0300 (IDT) Received: from [10.220.3.66] (hamilton.nuim.ie [149.157.192.252]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by galon.ev-en.org (Postfix) with ESMTP id 3141E11A951; Fri, 3 Jun 2005 02:50:08 +0300 (IDT) Message-ID: <429F9B2F.8030507@ev-en.org> Date: Fri, 03 Jun 2005 00:50:07 +0100 From: Baruch Even User-Agent: Debian Thunderbird 1.0.2 (X11/20050331) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. 
Miller" Cc: netdev@oss.sgi.com, shemminger@osdl.org, doug.leith@nuim.ie Subject: Re: Comparison of several congestion control algorithms References: <4298E045.9050009@ev-en.org> <20050602.163512.10298458.davem@davemloft.net> In-Reply-To: <20050602.163512.10298458.davem@davemloft.net> X-Enigmail-Version: 0.91.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 2002 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: baruch@ev-en.org Precedence: bulk X-list: netdev Content-Length: 1075 Lines: 28 David S. Miller wrote: > From: Baruch Even > Date: Sat, 28 May 2005 22:19:01 +0100 > > >>I wanted to point you to a comparison of congestion control algorithm >>done at the Hamilton Institute. These experiments compare Scalable-TCP, >>High-Speed TCP, FAST-TCP, BIC-TCP, H-TCP and Standard TCP. They compared >> fairness, compatibility with TCP and link utilisation. >> >>You can find the results and a report at http://hamilton.ie/net/eval/ > > > Nice work, I enjoyed this paper very much. > > There is something that none of these papers mention, but is essential > for interpreting results. Did you use interfaces with TSO enabled? I did not do these experiments myself, but to the best of my knowledge, none of the experiments done so far in Hamilton have used the TSO feature. This is in part because of the start of the work that was based on 2.4 kernels and even as far as the 2.6.6 kernel which had disabled TSO once it saw SACKs. This made TSO unusable for our needs. AFAIK, the tests reported in that document used kernel 2.6.6. 
Baruch From davem@davemloft.net Thu Jun 2 16:54:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 16:54:46 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j52NsfXq014133 for ; Thu, 2 Jun 2005 16:54:41 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DdzVO-0000j6-1S; Thu, 02 Jun 2005 16:53:42 -0700 Date: Thu, 02 Jun 2005 16:53:41 -0700 (PDT) Message-Id: <20050602.165341.63126720.davem@davemloft.net> To: baruch@ev-en.org Cc: netdev@oss.sgi.com, shemminger@osdl.org, doug.leith@nuim.ie Subject: Re: Comparison of several congestion control algorithms From: "David S. Miller" In-Reply-To: <429F9B2F.8030507@ev-en.org> References: <4298E045.9050009@ev-en.org> <20050602.163512.10298458.davem@davemloft.net> <429F9B2F.8030507@ev-en.org> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2003 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 689 Lines: 18 From: Baruch Even Date: Fri, 03 Jun 2005 00:50:07 +0100 > This is in part because of the start of the work that was based on 2.4 > kernels and even as far as the 2.6.6 kernel which had disabled TSO once > it saw SACKs. This made TSO unusable for our needs. > > AFAIK, the tests reported in that document used kernel 2.6.6. Sure SACKs turn off TSO currently, but you'll have them enabled at the beginning until the first loss and this affects how fast the cwnd will grow. If you have e1000 cards, for example, you're getting TSO enabled by default. You really need to look into this, as it has a real and very non-trivial effect on all of the results you obtained. 
From john.ronciak@intel.com Thu Jun 2 17:14:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 17:14:47 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j530EYXq015670 for ; Thu, 2 Jun 2005 17:14:35 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j530CDM3022374; Fri, 3 Jun 2005 00:12:13 GMT Received: from orsmsxvs041.jf.intel.com (orsmsxvs041.jf.intel.com [192.168.65.54]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j530CCh0023812; Fri, 3 Jun 2005 00:12:13 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs041.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060217121301969 ; Thu, 02 Jun 2005 17:12:13 -0700 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Thu, 2 Jun 2005 17:11:21 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: RFC: NAPI packet weighting patch Date: Thu, 2 Jun 2005 17:11:20 -0700 Message-ID: <468F3FDA28AA87429AD807992E22D07E0450BFDB@orsmsx408> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: RFC: NAPI packet weighting patch Thread-Index: AcVnwTswxSKQFuwsSBqFR1SjFJna0QADuSdA From: "Ronciak, John" To: "Jon Mason" , "David S. 
Miller" Cc: , , "Williams, Mitch A" , , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" X-OriginalArrivalTime: 03 Jun 2005 00:11:21.0645 (UTC) FILETIME=[C5A105D0:01C567D0] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j530EYXq015670 X-archive-position: 2005 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: john.ronciak@intel.com Precedence: bulk X-list: netdev Content-Length: 1961 Lines: 53 I like this idea as well but I do see an issue with it. How would this stack code find out that the weight is too high and packets are being dropped (not being polled fast enough)? It would have to check the controller stats to see the error count increasing for some period. I'm not sure this is workable unless we have some sort of feedback which the driver could send up (or set) saying that this is happening and the dynamic weight code could take into account. Comments? Cheers, John > -----Original Message----- > From: Jon Mason [mailto:jdmason@us.ibm.com] > Sent: Thursday, June 02, 2005 3:20 PM > To: David S. Miller > Cc: shemminger@osdl.org; Ronciak, John; hadi@cyberus.ca; > Williams, Mitch A; netdev@oss.sgi.com; > Robert.Olsson@data.slu.se; Venkatesan, Ganesh; Brandeburg, Jesse > Subject: Re: RFC: NAPI packet weighting patch > > > On Thursday 02 June 2005 05:12 pm, David S. Miller wrote: > > From: Jon Mason > > Date: Thu, 2 Jun 2005 16:51:48 -0500 > > > > > Why not have the driver set the weight to 16/32 > respectively for the > > > weight (or better yet, have someone run numbers to find > weight that > > > are closer to what the adapter can actually use)? While these > > > numbers may not be optimal for every system, this is much better > > > that the current system, and would only require 5 or so > extra lines > > > of code per NAPI enabled driver.
> > > > Why do this when we can adjust the weight in one spot, > > namely the upper level NAPI ->poll() running loop? > > > > It can measure the overhead, how many packets processed, etc. > > and make intelligent decisions based upon that. This is a CPU > > speed, memory speed, I/O bus speed, and link speed agnostic > > solution. > > > > The driver need not take any part in this, and the scheme will > > dynamically adjust to resource usage changes in the system. > > Yes, a much better idea to do this generically. I 100% agree > with you. > From hadi@cyberus.ca Thu Jun 2 17:14:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 17:14:39 -0700 (PDT) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j530EYXq015668 for ; Thu, 2 Jun 2005 17:14:35 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Ddzoj-0001X1-Ur for netdev@oss.sgi.com; Thu, 02 Jun 2005 20:13:41 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ddq7c-0005rn-D5; Thu, 02 Jun 2005 09:52:32 -0400 Subject: PATCH: ioctl send PID in netlink events From: jamal Reply-To: hadi@cyberus.ca To: "David S. 
Miller" Cc: netdev Content-Type: multipart/mixed; boundary="=-Vf8rMgMoYxExZv+wVDZr" Organization: unknown Date: Thu, 02 Jun 2005 09:52:29 -0400 Message-Id: <1117720349.6050.59.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 X-archive-position: 2004 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 4221 Lines: 131 --=-Vf8rMgMoYxExZv+wVDZr Content-Type: text/plain Content-Transfer-Encoding: 7bit This is where i was trying to get to ;-> This patch is on top of the earlier one i sent for explicit types. I still have to think about how to best do IPV6 routes as well as ARP and NDISC. If anyone has suggestions or wants to tackle them let me know, the v6 route is not going to be a pretty one i think. cheers, jamal This patch ensures that netlink events created as a result of programs using ioctls (such as ifconfig, route etc) contain the correct PID of those programs. 
Signed-off-by: Jamal Hadi Salim --=-Vf8rMgMoYxExZv+wVDZr Content-Disposition: attachment; filename=ifconf_pid_p Content-Type: text/plain; name=ifconf_pid_p; charset=UTF-8 Content-Transfer-Encoding: 7bit net/core/rtnetlink.c: needs update net/ipv4/devinet.c: needs update net/ipv4/fib_semantics.c: needs update net/ipv6/addrconf.c: needs update Index: net/core/rtnetlink.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/core/rtnetlink.c (mode:100644) +++ uncommitted/net/core/rtnetlink.c (mode:100644) @@ -452,7 +452,7 @@ if (!skb) return; - if (rtnetlink_fill_ifinfo(skb, dev, type, 0, 0, change, 0) < 0) { + if (rtnetlink_fill_ifinfo(skb, dev, type, current->pid, 0, change, 0) < 0) { kfree_skb(skb); return; } Index: net/ipv4/devinet.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/devinet.c (mode:100644) +++ uncommitted/net/ipv4/devinet.c (mode:100644) @@ -236,6 +236,7 @@ struct in_ifaddr *promote = NULL; struct in_ifaddr *ifa1 = *ifap; + printk("inet_del_ifa: pid %d\n",current->pid); ASSERT_RTNL(); /* 1. 
Deleting primary ifaddr forces deletion all secondaries @@ -305,6 +306,7 @@ ASSERT_RTNL(); + printk("inet_insert_ifa: pid %d\n",current->pid); if (!ifa->ifa_local) { inet_free_ifa(ifa); return 0; @@ -1112,7 +1114,7 @@ if (!skb) netlink_set_err(rtnl, 0, RTMGRP_IPV4_IFADDR, ENOBUFS); - else if (inet_fill_ifaddr(skb, ifa, 0, 0, event, 0) < 0) { + else if (inet_fill_ifaddr(skb, ifa, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV4_IFADDR, EINVAL); } else { Index: net/ipv4/fib_semantics.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/fib_semantics.c (mode:100644) +++ uncommitted/net/ipv4/fib_semantics.c (mode:100644) @@ -276,7 +276,7 @@ struct nlmsghdr *n, struct netlink_skb_parms *req) { struct sk_buff *skb; - u32 pid = req ? req->pid : 0; + u32 pid = req ? req->pid : n->nlmsg_pid; int size = NLMSG_SPACE(sizeof(struct rtmsg)+256); skb = alloc_skb(size, GFP_KERNEL); @@ -1035,7 +1035,7 @@ } nl->nlmsg_flags = NLM_F_REQUEST; - nl->nlmsg_pid = 0; + nl->nlmsg_pid = current->pid; nl->nlmsg_seq = 0; nl->nlmsg_len = NLMSG_LENGTH(sizeof(*rtm)); if (cmd == SIOCDELRT) { Index: net/ipv6/addrconf.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv6/addrconf.c (mode:100644) +++ uncommitted/net/ipv6/addrconf.c (mode:100644) @@ -2872,7 +2872,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFADDR, ENOBUFS); return; } - if (inet6_fill_ifaddr(skb, ifa, 0, 0, event, 0) < 0) { + if (inet6_fill_ifaddr(skb, ifa, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFADDR, EINVAL); return; @@ -3007,7 +3007,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFINFO, ENOBUFS); return; } - if (inet6_fill_ifinfo(skb, idev, 0, 0, event, 0) < 0) { + if (inet6_fill_ifinfo(skb, idev, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFINFO, EINVAL); return; @@ 
-3064,7 +3064,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_PREFIX, ENOBUFS); return; } - if (inet6_fill_prefix(skb, idev, pinfo, 0, 0, event, 0) < 0) { + if (inet6_fill_prefix(skb, idev, pinfo, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_PREFIX, EINVAL); return; --=-Vf8rMgMoYxExZv+wVDZr-- From davem@davemloft.net Thu Jun 2 17:19:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 17:19:24 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j530JKXq016790 for ; Thu, 2 Jun 2005 17:19:21 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Ddzt6-0001j3-Rq; Thu, 02 Jun 2005 17:18:12 -0700 Date: Thu, 02 Jun 2005 17:18:12 -0700 (PDT) Message-Id: <20050602.171812.48807872.davem@davemloft.net> To: john.ronciak@intel.com Cc: jdmason@us.ibm.com, shemminger@osdl.org, hadi@cyberus.ca, mitch.a.williams@intel.com, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: <468F3FDA28AA87429AD807992E22D07E0450BFDB@orsmsx408> References: <468F3FDA28AA87429AD807992E22D07E0450BFDB@orsmsx408> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2006 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 827 Lines: 15 From: "Ronciak, John" Date: Thu, 2 Jun 2005 17:11:20 -0700 > I like this idea as well but I do have an issue with it. How would this > stack code find out that the weight is too high and packets are being > dropped (not being polled fast enough)? 
It would have to check the > controller stats to see the error count increasing for some period. I'm > not sure this is workable unless we have some sort of feedback which the > driver could send up (or set) saying that this is happening and the > dynamic weight code could take into account. What more do you need other than checking the statistics counter? The drop statistics (the ones we care about) are incremented in real time by the ->poll() code, so it's not like we have to trigger some asynchronous event to get a current version of the number. From ravinandan.arakali@neterion.com Thu Jun 2 17:51:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 17:51:40 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j530paXq018846 for ; Thu, 2 Jun 2005 17:51:36 -0700 Received: by linux.site (Postfix, from userid 0) id 28C4B7B99F; Thu, 2 Jun 2005 17:43:58 -0700 (PDT) To: davem@davemloft.net, jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, ananda.raju@neterion.com, rapuru.sriram@neterion.com From: ravinandan.arakali@neterion.com Subject: [PATCH 2.6.12-rc4] ethtool: Support for UDP Large Send Offload Message-Id: <20050603004358.28C4B7B99F@linux.site> Date: Thu, 2 Jun 2005 17:43:58 -0700 (PDT) X-archive-position: 2009 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 3847 Lines: 136 Hi, Attached below is a patch to the ethtool utility to support USO (UDP Large Send Offload). Please review the patch. Usage: 1. To view USO setting # ethtool -k 2. 
To set/unset USO # ethtool -K uso on|off Signed-off-by: Ananda Raju Signed-off-by: Ravinandan Arakali --- diff -uNr ethtool-3/ethtool-copy.h ethtool-3_uso/ethtool-copy.h --- ethtool-3/ethtool-copy.h 2005-01-28 01:50:26.000000000 +0545 +++ ethtool-3_uso/ethtool-copy.h 2005-06-02 23:06:48.000000000 +0545 @@ -283,6 +283,8 @@ #define ETHTOOL_GSTATS 0x0000001d /* get NIC-specific statistics */ #define ETHTOOL_GTSO 0x0000001e /* Get TSO enable (ethtool_value) */ #define ETHTOOL_STSO 0x0000001f /* Set TSO enable (ethtool_value) */ +#define ETHTOOL_GUSO 0x00000020 /* Get USO enable (ethtool_value) */ +#define ETHTOOL_SUSO 0x00000021 /* Set USO enable (ethtool_value) */ /* compatibility with older code */ #define SPARC_ETH_GSET ETHTOOL_GSET diff -uNr ethtool-3/ethtool.c ethtool-3_uso/ethtool.c --- ethtool-3/ethtool.c 2005-01-28 04:19:29.000000000 +0545 +++ ethtool-3_uso/ethtool.c 2005-06-02 23:06:52.000000000 +0545 @@ -119,6 +119,7 @@ * [ tx on|off ] \ * [ sg on|off ] \ * [ tso on|off ] + * [ uso on|off ] * ethtool -r DEVNAME * ethtool -p DEVNAME [ %d ] * ethtool -t DEVNAME [ online|offline ] @@ -191,6 +192,7 @@ " [ tx on|off ] \\\n" " [ sg on|off ] \\\n" " [ tso on|off ]\n" + " [ uso on|off ]\n" " ethtool -r DEVNAME\n" " ethtool -p DEVNAME [ %%d ]\n" " ethtool -t DEVNAME [online|(offline)]\n" @@ -236,6 +238,7 @@ static int off_csum_tx_wanted = -1; static int off_sg_wanted = -1; static int off_tso_wanted = -1; +static int off_uso_wanted = -1; static struct ethtool_pauseparam epause; static int gpause_changed = 0; @@ -339,6 +342,7 @@ { "tx", CMDL_BOOL, &off_csum_tx_wanted, NULL }, { "sg", CMDL_BOOL, &off_sg_wanted, NULL }, { "tso", CMDL_BOOL, &off_tso_wanted, NULL }, + { "uso", CMDL_BOOL, &off_uso_wanted, NULL }, }; static struct cmdline_info cmdline_pause[] = { @@ -1184,17 +1188,19 @@ return 0; } -static int dump_offload (int rx, int tx, int sg, int tso) +static int dump_offload (int rx, int tx, int sg, int tso, int uso) { fprintf(stdout, "rx-checksumming: %s\n" 
"tx-checksumming: %s\n" "scatter-gather: %s\n" - "tcp segmentation offload: %s\n", + "tcp segmentation offload: %s\n" + "udp large send offload: %s\n", rx ? "on" : "off", tx ? "on" : "off", sg ? "on" : "off", - tso ? "on" : "off"); + tso ? "on" : "off", + uso ? "on" : "off"); return 0; } @@ -1458,7 +1464,7 @@ static int do_goffload(int fd, struct ifreq *ifr) { struct ethtool_value eval; - int err, allfail = 1, rx = 0, tx = 0, sg = 0, tso = 0; + int err, allfail = 1, rx = 0, tx = 0, sg = 0, tso = 0, uso = 0; fprintf(stdout, "Offload parameters for %s:\n", devname); @@ -1502,12 +1508,22 @@ allfail = 0; } + eval.cmd = ETHTOOL_GUSO; + ifr->ifr_data = (caddr_t)&eval; + err = ioctl(fd, SIOCETHTOOL, ifr); + if (err) + perror("Cannot get device udp large send offload settings"); + else { + uso = eval.data; + allfail = 0; + } + if (allfail) { fprintf(stdout, "no offload info available\n"); return 83; } - return dump_offload(rx, tx, sg, tso); + return dump_offload(rx, tx, sg, tso, uso); } static int do_soffload(int fd, struct ifreq *ifr) @@ -1562,6 +1578,17 @@ return 88; } } + if (off_uso_wanted >= 0) { + changed = 1; + eval.cmd = ETHTOOL_SUSO; + eval.data = (off_uso_wanted == 1); + ifr->ifr_data = (caddr_t)&eval; + err = ioctl(fd, SIOCETHTOOL, ifr); + if (err) { + perror("Cannot set device udp large send offload settings"); + return 89; + } + } if (!changed) { fprintf(stdout, "no offload settings changed\n"); } From ravinandan.arakali@neterion.com Thu Jun 2 17:48:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 17:48:52 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j530mlXq018431 for ; Thu, 2 Jun 2005 17:48:48 -0700 Received: by linux.site (Postfix, from userid 0) id BAB6A7B990; Thu, 2 Jun 2005 17:41:06 -0700 (PDT) To: davem@davemloft.net, jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, 
ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, ananda.raju@neterion.com, rapuru.sriram@neterion.com From: ravinandan.arakali@neterion.com Subject: [PATCH 2.6.12-rc4] IPv4/IPv6: USO v2, Scatter-gather approach Message-Id: <20050603004106.BAB6A7B990@linux.site> Date: Thu, 2 Jun 2005 17:41:06 -0700 (PDT) X-archive-position: 2007 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 14893 Lines: 444 Hi, Attached below is version 2 of kernel patch for UDP Large send offload feature. This patch uses the "Scatter-Gather" approach. It also incorporates David Miller's comments on the first version. Also, below is a "how-to" on changes required in network drivers to use the USO interface. UDP Large Send Offload (USO) Interface: -------------------------------------- USO is a feature wherein the Linux kernel network stack will offload the IP fragmentation functionality of large UDP datagram to hardware. This will reduce the overhead of stack in fragmenting the large UDP datagram to MTU sized packets. 1) Drivers indicate their capability of USO using dev->features |= NETIF_F_USO | NETIF_F_HW_CSUM | NETIF_F_SG NETIF_F_HW_CSUM is required for USO over ipv6. 2) USO packet will be submitted for transmission using driver xmit routine. USO packet will have a non-zero value for "skb_shinfo(skb)->uso_size" skb_shinfo(skb)->uso_size will indicate the length of data part in each IP fragment going out of the adapter after IP fragmentation by hardware. skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[] contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW indicating that hardware has to do checksum calculation. Hardware should compute the UDP checksum of complete datagram and also ip header checksum of each fragmented IP packet. 
For IPV6 the USO provides the fragment identification-id in skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating IPv6 fragments. Signed-off-by: Ananda Raju Signed-off-by: Ravinandan Arakali --- diff -uNr linux-2.6.12-rc4.org/include/linux/ethtool.h linux-2.6.12-rc4/include/linux/ethtool.h --- linux-2.6.12-rc4.org/include/linux/ethtool.h 2005-06-01 19:56:58.000000000 +0545 +++ linux-2.6.12-rc4/include/linux/ethtool.h 2005-06-01 19:51:47.000000000 +0545 @@ -260,6 +260,8 @@ int ethtool_op_set_sg(struct net_device *dev, u32 data); u32 ethtool_op_get_tso(struct net_device *dev); int ethtool_op_set_tso(struct net_device *dev, u32 data); +u32 ethtool_op_get_uso(struct net_device *dev); +int ethtool_op_set_uso(struct net_device *dev, u32 data); /** * &ethtool_ops - Alter and report network device settings @@ -289,6 +291,8 @@ * set_sg: Turn scatter-gather on or off * get_tso: Report whether TCP segmentation offload is enabled * set_tso: Turn TCP segmentation offload on or off + * get_uso: Report whether UDP large send offload is enabled + * set_uso: Turn UDP large send offload on or off * self_test: Run specified self-tests * get_strings: Return a set of strings that describe the requested objects * phys_id: Identify the device @@ -353,6 +357,8 @@ void (*get_ethtool_stats)(struct net_device *, struct ethtool_stats *, u64 *); int (*begin)(struct net_device *); void (*complete)(struct net_device *); + u32 (*get_uso)(struct net_device *); + int (*set_uso)(struct net_device *, u32); }; /* CMDs currently supported */ @@ -388,6 +394,8 @@ #define ETHTOOL_GSTATS 0x0000001d /* get NIC-specific statistics */ #define ETHTOOL_GTSO 0x0000001e /* Get TSO enable (ethtool_value) */ #define ETHTOOL_STSO 0x0000001f /* Set TSO enable (ethtool_value) */ +#define ETHTOOL_GUSO 0x00000020 /* Get USO enable (ethtool_value) */ +#define ETHTOOL_SUSO 0x00000021 /* Set USO enable (ethtool_value) */ /* compatibility with older code */ #define SPARC_ETH_GSET ETHTOOL_GSET diff -uNr 
linux-2.6.12-rc4.org/include/linux/netdevice.h linux-2.6.12-rc4/include/linux/netdevice.h --- linux-2.6.12-rc4.org/include/linux/netdevice.h 2005-05-25 17:18:11.000000000 +0545 +++ linux-2.6.12-rc4/include/linux/netdevice.h 2005-06-01 14:33:12.000000000 +0545 @@ -414,6 +414,7 @@ #define NETIF_F_VLAN_CHALLENGED 1024 /* Device cannot handle VLAN packets */ #define NETIF_F_TSO 2048 /* Can offload TCP/IP segmentation */ #define NETIF_F_LLTX 4096 /* LockLess TX */ +#define NETIF_F_USO 8192 /* Can offload UDP Large Send*/ /* Called after device is detached from network. */ void (*uninit)(struct net_device *dev); diff -uNr linux-2.6.12-rc4.org/include/linux/skbuff.h linux-2.6.12-rc4/include/linux/skbuff.h --- linux-2.6.12-rc4.org/include/linux/skbuff.h 2005-05-25 17:18:20.000000000 +0545 +++ linux-2.6.12-rc4/include/linux/skbuff.h 2005-06-01 15:18:44.000000000 +0545 @@ -135,6 +135,8 @@ atomic_t dataref; unsigned int nr_frags; unsigned short tso_size; + unsigned short uso_size; + unsigned int ip6_frag_id; unsigned short tso_segs; struct sk_buff *frag_list; skb_frag_t frags[MAX_SKB_FRAGS]; diff -uNr linux-2.6.12-rc4.org/include/net/sock.h linux-2.6.12-rc4/include/net/sock.h --- linux-2.6.12-rc4.org/include/net/sock.h 2005-05-25 17:18:44.000000000 +0545 +++ linux-2.6.12-rc4/include/net/sock.h 2005-05-25 20:28:14.000000000 +0545 @@ -1296,5 +1296,11 @@ return -ENODEV; } #endif +struct sk_buff *sock_append_data(struct sock *sk, + int getfrag(void *from, char *to, int offset, int len, + int odd, struct sk_buff *skb), + void *from, int length, int transhdrlen, + int hh_len, int fragheaderlen, + unsigned int flags,int *err); #endif /* _SOCK_H */ diff -uNr linux-2.6.12-rc4.org/net/core/dev.c linux-2.6.12-rc4/net/core/dev.c --- linux-2.6.12-rc4.org/net/core/dev.c 2005-06-01 14:35:01.000000000 +0545 +++ linux-2.6.12-rc4/net/core/dev.c 2005-06-01 19:46:03.000000000 +0545 @@ -2793,6 +2793,18 @@ dev->name); dev->features &= ~NETIF_F_TSO; } + if (dev->features & NETIF_F_USO) { + if 
(!(dev->features & NETIF_F_HW_CSUM)) { + printk("%s: Dropping NETIF_F_USO since no ", dev->name); + printk("NETIF_F_HW_CSUM feature.\n"); + dev->features &= ~NETIF_F_USO; + } + if (!(dev->features & NETIF_F_SG)) { + printk("%s: Dropping NETIF_F_USO since no ", dev->name); + printk("NETIF_F_SG feature.\n"); + dev->features &= ~NETIF_F_USO; + } + } /* * nil rebuild_header routine, diff -uNr linux-2.6.12-rc4.org/net/core/ethtool.c linux-2.6.12-rc4/net/core/ethtool.c --- linux-2.6.12-rc4.org/net/core/ethtool.c 2005-06-01 19:48:31.000000000 +0545 +++ linux-2.6.12-rc4/net/core/ethtool.c 2005-06-01 23:02:39.000000000 +0545 @@ -72,6 +72,21 @@ return 0; } +u32 ethtool_op_get_uso(struct net_device *dev) +{ + return (dev->features & NETIF_F_USO) != 0; +} + +int ethtool_op_set_uso(struct net_device *dev, u32 data) +{ + if (data) + dev->features |= NETIF_F_USO; + else + dev->features &= ~NETIF_F_USO; + + return 0; +} + /* Handlers for each ethtool command */ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr) @@ -460,6 +475,9 @@ err = dev->ethtool_ops->set_tso(dev, 0); if (err) return err; + err = dev->ethtool_ops->set_uso(dev, 0); + if (err) + return err; } return dev->ethtool_ops->set_sg(dev, data); @@ -548,6 +566,39 @@ return dev->ethtool_ops->set_tso(dev, edata.data); } +static int ethtool_get_uso(struct net_device *dev, char __user *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GTSO }; + + if (!dev->ethtool_ops->get_uso) + return -EOPNOTSUPP; + + edata.data = dev->ethtool_ops->get_uso(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_uso(struct net_device *dev, char __user *useraddr) +{ + struct ethtool_value edata; + + if (!dev->ethtool_ops->set_uso) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + if (edata.data && !(dev->features & NETIF_F_SG)) + return -EINVAL; + + if (edata.data && !(dev->features & 
NETIF_F_HW_CSUM)) + return -EINVAL; + + return dev->ethtool_ops->set_uso(dev, edata.data); +} + static int ethtool_self_test(struct net_device *dev, char __user *useraddr) { struct ethtool_test test; @@ -795,6 +846,12 @@ case ETHTOOL_GSTATS: rc = ethtool_get_stats(dev, useraddr); break; + case ETHTOOL_GUSO: + rc = ethtool_get_uso(dev, useraddr); + break; + case ETHTOOL_SUSO: + rc = ethtool_set_uso(dev, useraddr); + break; default: rc = -EOPNOTSUPP; } @@ -817,3 +874,6 @@ EXPORT_SYMBOL(ethtool_op_set_sg); EXPORT_SYMBOL(ethtool_op_set_tso); EXPORT_SYMBOL(ethtool_op_set_tx_csum); +EXPORT_SYMBOL(ethtool_op_set_uso); +EXPORT_SYMBOL(ethtool_op_get_uso); + diff -uNr linux-2.6.12-rc4.org/net/core/skbuff.c linux-2.6.12-rc4/net/core/skbuff.c --- linux-2.6.12-rc4.org/net/core/skbuff.c 2005-05-25 20:25:35.000000000 +0545 +++ linux-2.6.12-rc4/net/core/skbuff.c 2005-06-01 14:34:27.000000000 +0545 @@ -159,6 +159,8 @@ skb_shinfo(skb)->tso_size = 0; skb_shinfo(skb)->tso_segs = 0; skb_shinfo(skb)->frag_list = NULL; + skb_shinfo(skb)->uso_size = 0; + skb_shinfo(skb)->ip6_frag_id = 0; out: return skb; nodata: diff -uNr linux-2.6.12-rc4.org/net/core/sock.c linux-2.6.12-rc4/net/core/sock.c --- linux-2.6.12-rc4.org/net/core/sock.c 2005-05-25 20:25:47.000000000 +0545 +++ linux-2.6.12-rc4/net/core/sock.c 2005-06-01 19:40:03.000000000 +0545 @@ -1401,6 +1401,102 @@ EXPORT_SYMBOL(proto_unregister); +/* + * sock_append_data - append the user data to a skb, + * sk - sock structure which contains skbs for transmission + * getfrag - The function to be called to get the data from the user. 
+ * from - pointer to user message iov + * length - length of the iov message + * transhdrlen - transport header length + * hh_len - hardware header length + * fragheaderlen - length of the IP header + * flags - iov message flags + * err - Error code returned + * + * This procedure will allocate a skb enough to hold protocol headers and + * append the user data in the fragment part of the skb and add the skb to + * socket write queue + */ +struct sk_buff *sock_append_data(struct sock *sk, + int getfrag(void *from, char *to, int offset, int len, + int odd, struct sk_buff *skb), + void *from, int length, int transhdrlen, + int hh_len, int fragheaderlen, + unsigned int flags,int *err) +{ + struct sk_buff *skb; + int frg_cnt = 0; + skb_frag_t *frag = NULL; + struct page *page = NULL; + int copy, left; + int offset = 0; + + if (skb_queue_len(&sk->sk_write_queue)) { + *err = -EOPNOTSUPP; + return NULL; + } + + skb = sock_alloc_send_skb(sk, + hh_len + fragheaderlen + transhdrlen + 20, + (flags & MSG_DONTWAIT), err); + if (skb == NULL) { + *err = -ENOMEM; + return NULL; + } + /* reserve space for Hardware header */ + skb_reserve(skb, hh_len); + /* create space for UDP/IP header */ + skb_put(skb,fragheaderlen + transhdrlen); + /* initialize network header pointer */ + skb->nh.raw = skb->data; + /* initialize protocol header pointer */ + skb->h.raw = skb->data + fragheaderlen; + skb->ip_summed = CHECKSUM_HW; + skb->csum = 0; + do { + copy = length; + if (frg_cnt >= MAX_SKB_FRAGS) { + *err = -EFAULT; + kfree_skb(skb); + return NULL; + } + page = alloc_pages(sk->sk_allocation, 0); + if (page == NULL) { + *err = -ENOMEM; + kfree_skb(skb); + return NULL; + } + sk->sk_sndmsg_page = page; + sk->sk_sndmsg_off = 0; + skb_fill_page_desc(skb, frg_cnt, page, 0, 0); + skb->truesize += PAGE_SIZE; + atomic_add(PAGE_SIZE, &sk->sk_wmem_alloc); + frg_cnt = skb_shinfo(skb)->nr_frags; + frag = &skb_shinfo(skb)->frags[frg_cnt - 1]; + left = PAGE_SIZE - frag->page_offset; + if (copy > left) + 
copy = left; + if (getfrag(from, page_address(frag->page)+ + frag->page_offset+frag->size, + offset, copy, 0, skb) < 0) { + *err = -EFAULT; + kfree_skb(skb); + return NULL; + } + sk->sk_sndmsg_off += copy; + frag->size += copy; + skb->len += copy; + skb->data_len += copy; + offset += copy; + length -= copy; + page = NULL; + } while (length > 0); + __skb_queue_tail(&sk->sk_write_queue, skb); + *err = 0; + return skb; +} +EXPORT_SYMBOL(sock_append_data); + #ifdef CONFIG_PROC_FS static inline struct proto *__proto_head(void) { diff -uNr linux-2.6.12-rc4.org/net/ipv4/ip_output.c linux-2.6.12-rc4/net/ipv4/ip_output.c --- linux-2.6.12-rc4.org/net/ipv4/ip_output.c 2005-05-25 20:26:07.000000000 +0545 +++ linux-2.6.12-rc4/net/ipv4/ip_output.c 2005-06-02 22:04:59.000000000 +0545 @@ -291,7 +291,8 @@ { IP_INC_STATS(IPSTATS_MIB_OUTREQUESTS); - if (skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->tso_size) + if (skb->len > dst_mtu(skb->dst) && + !(skb_shinfo(skb)->uso_size || skb_shinfo(skb)->tso_size)) return ip_fragment(skb, ip_finish_output); else return ip_finish_output(skb); @@ -789,6 +790,28 @@ inet->cork.length += length; + if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) && + (rt->u.dst.dev->features & NETIF_F_USO)) { + /* There is support for UDP large send offload by network + * device, so create one single skb packet containing complete + * udp datagram + */ + skb = sock_append_data(sk, getfrag, from, + (length - transhdrlen), transhdrlen, + hh_len, fragheaderlen, flags, &err); + if (skb != NULL) { + /* specify the length of each IP datagram fragment*/ + skb_shinfo(skb)->uso_size = (mtu - fragheaderlen); + return 0; + } else if (err == -EOPNOTSUPP) { + /* There is not enough support do UPD LSO, + * so follow normal path + */ + err = 0; + } else + goto error; + } + /* So, what's going on in the loop below? 
* * We use calculated fragment length to generate chained skb, diff -uNr linux-2.6.12-rc4.org/net/ipv6/ip6_output.c linux-2.6.12-rc4/net/ipv6/ip6_output.c --- linux-2.6.12-rc4.org/net/ipv6/ip6_output.c 2005-05-25 20:26:17.000000000 +0545 +++ linux-2.6.12-rc4/net/ipv6/ip6_output.c 2005-06-02 22:05:24.000000000 +0545 @@ -147,7 +147,8 @@ int ip6_output(struct sk_buff *skb) { - if (skb->len > dst_mtu(skb->dst) || dst_allfrag(skb->dst)) + if ((skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->uso_size) || + dst_allfrag(skb->dst)) return ip6_fragment(skb, ip6_output2); else return ip6_output2(skb); @@ -898,6 +899,33 @@ */ inet->cork.length += length; + if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) && + (rt->u.dst.dev->features & NETIF_F_USO)) { + + /* There is support for UDP large send offload by network + * device, so create one single skb packet containing complete + * udp datagram + */ + skb = sock_append_data(sk, getfrag, from, + (length - transhdrlen), transhdrlen, + hh_len, fragheaderlen, flags, &err); + if (skb != NULL) { + struct frag_hdr fhdr; + + /* specify the length of each IP datagram fragment*/ + skb_shinfo(skb)->uso_size = (mtu - fragheaderlen - + sizeof(struct frag_hdr)); + ipv6_select_ident(skb, &fhdr); + skb_shinfo(skb)->ip6_frag_id = fhdr.identification; + return 0; + } else if (err == -EOPNOTSUPP){ + /* There is not enough support for UDP LSO, + * so follow normal path + */ + err = 0; + } else + goto error; + } if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) goto alloc_new_skb; From ravinandan.arakali@neterion.com Thu Jun 2 17:51:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 17:51:32 -0700 (PDT) Received: from linux.site (adsl-67-120-213-161.dsl.sntc01.pacbell.net [67.120.213.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j530pTXq018773 for ; Thu, 2 Jun 2005 17:51:29 -0700 Received: by linux.site (Postfix, from userid 0) id 41BFB7B990; Thu, 2 Jun 2005 17:43:51 -0700 (PDT) To: 
davem@davemloft.net, jgarzik@pobox.com, netdev@oss.sgi.com Cc: raghavendra.koushik@neterion.com, ravinandan.arakali@neterion.com, leonid.grossman@neterion.com, ananda.raju@neterion.com, rapuru.sriram@neterion.com From: ravinandan.arakali@neterion.com Subject: [PATCH 2.6.12-rc4] IPv4/IPv6: USO v2, fragment list approach Message-Id: <20050603004351.41BFB7B990@linux.site> Date: Thu, 2 Jun 2005 17:43:51 -0700 (PDT) X-archive-position: 2008 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@neterion.com Precedence: bulk X-list: netdev Content-Length: 11495 Lines: 322 Hi, Attached below is version 2 of kernel patch for UDP Large send offload feature. This patch uses the "fragment list" approach. It also incorporates David Miller's comments on the first version. Also, below is a "how-to" on changes required in network drivers to use the USO interface. UDP Large Send Offload (USO) Interface: --------------------------------------- USO is a feature wherein the Linux kernel network stack will offload the IP fragmentation functionality of large UDP datagram to hardware. This will reduce the overhead of stack in fragmenting the large UDP datagram to MTU sized packets. 1) Drivers indicate their capability of USO using dev->features |= NETIF_F_USO | NETIF_F_HW_CSUM | NETIF_F_FRAGLIST NETIF_F_HW_CSUM is required for USO over IPv6. 2) USO packet will be submitted for transmission using driver xmit routine. USO packet will have a non zero value for "skb_shinfo(skb)->uso_size" skb_shinfo(skb)->uso_size indicates the length of data part in each IP fragment going out of the adapter after IP fragmentation by hardware. skb->data and skb_shinfo(skb)->frag_list will contain complete large UDP datagram. The driver is required to traverse each skb in skb_shinfo(skb)->frag_list to get complete UDP packet. 
The skb->ip_summed will be set to CHECKSUM_HW indicating that hardware has to perform checksum calculation. Hardware should compute the UDP checksum of complete UDP datagram and also ip header checksum of each fragmented IP packet. For IPV6 the USO provides the fragment identification id in skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating IPv6 fragments. Signed-off-by: Ananda Raju Signed-off-by: Ravinandan Arakali --- diff -uNr linux-2.6.12-rc4.org/include/linux/ethtool.h linux-2.6.12-rc4/include/linux/ethtool.h --- linux-2.6.12-rc4.org/include/linux/ethtool.h 2005-06-02 16:55:51.000000000 +0545 +++ linux-2.6.12-rc4/include/linux/ethtool.h 2005-06-02 16:56:46.000000000 +0545 @@ -260,6 +260,8 @@ int ethtool_op_set_sg(struct net_device *dev, u32 data); u32 ethtool_op_get_tso(struct net_device *dev); int ethtool_op_set_tso(struct net_device *dev, u32 data); +u32 ethtool_op_get_uso(struct net_device *dev); +int ethtool_op_set_uso(struct net_device *dev, u32 data); /** * &ethtool_ops - Alter and report network device settings @@ -289,6 +291,8 @@ * set_sg: Turn scatter-gather on or off * get_tso: Report whether TCP segmentation offload is enabled * set_tso: Turn TCP segmentation offload on or off + * get_uso: Report whether UDP large send offload is enabled + * set_uso: Turn UDP large send offload on or off * self_test: Run specified self-tests * get_strings: Return a set of strings that describe the requested objects * phys_id: Identify the device @@ -353,6 +357,8 @@ void (*get_ethtool_stats)(struct net_device *, struct ethtool_stats *, u64 *); int (*begin)(struct net_device *); void (*complete)(struct net_device *); + u32 (*get_uso)(struct net_device *); + int (*set_uso)(struct net_device *, u32); }; /* CMDs currently supported */ @@ -388,6 +394,8 @@ #define ETHTOOL_GSTATS 0x0000001d /* get NIC-specific statistics */ #define ETHTOOL_GTSO 0x0000001e /* Get TSO enable (ethtool_value) */ #define ETHTOOL_STSO 0x0000001f /* Set TSO enable 
(ethtool_value) */ +#define ETHTOOL_GUSO 0x00000020 /* Get USO enable (ethtool_value) */ +#define ETHTOOL_SUSO 0x00000021 /* Set USO enable (ethtool_value) */ /* compatibility with older code */ #define SPARC_ETH_GSET ETHTOOL_GSET diff -uNr linux-2.6.12-rc4.org/include/linux/netdevice.h linux-2.6.12-rc4/include/linux/netdevice.h --- linux-2.6.12-rc4.org/include/linux/netdevice.h 2005-05-27 23:22:46.000000000 +0545 +++ linux-2.6.12-rc4/include/linux/netdevice.h 2005-05-31 10:02:02.000000000 +0545 @@ -414,6 +414,7 @@ #define NETIF_F_VLAN_CHALLENGED 1024 /* Device cannot handle VLAN packets */ #define NETIF_F_TSO 2048 /* Can offload TCP/IP segmentation */ #define NETIF_F_LLTX 4096 /* LockLess TX */ +#define NETIF_F_USO 8192 /* Can offload UDP Large Send*/ /* Called after device is detached from network. */ void (*uninit)(struct net_device *dev); diff -uNr linux-2.6.12-rc4.org/include/linux/skbuff.h linux-2.6.12-rc4/include/linux/skbuff.h --- linux-2.6.12-rc4.org/include/linux/skbuff.h 2005-05-27 23:22:46.000000000 +0545 +++ linux-2.6.12-rc4/include/linux/skbuff.h 2005-06-02 20:27:43.000000000 +0545 @@ -136,6 +136,8 @@ unsigned int nr_frags; unsigned short tso_size; unsigned short tso_segs; + unsigned short uso_size; + unsigned int ip6_frag_id; struct sk_buff *frag_list; skb_frag_t frags[MAX_SKB_FRAGS]; }; diff -uNr linux-2.6.12-rc4.org/net/core/dev.c linux-2.6.12-rc4/net/core/dev.c --- linux-2.6.12-rc4.org/net/core/dev.c 2005-05-28 01:49:18.000000000 +0545 +++ linux-2.6.12-rc4/net/core/dev.c 2005-05-31 22:57:22.000000000 +0545 @@ -2793,6 +2793,18 @@ dev->name); dev->features &= ~NETIF_F_TSO; } + if (dev->features & NETIF_F_USO) { + if(!(dev->features & NETIF_F_FRAGLIST)) { + printk("%s: Dropping NETIF_F_USO since no ", dev->name); + printk("NETIF_F_FRAGLIST feature.\n"); + dev->features &= ~NETIF_F_USO; + } + if(!(dev->features & NETIF_F_HW_CSUM)) { + printk("%s: Dropping NETIF_F_USO since no ", dev->name); + printk("NETIF_F_HW_CSUM feature.\n"); + dev->features &= 
~NETIF_F_USO; + } + } /* * nil rebuild_header routine, diff -uNr linux-2.6.12-rc4.org/net/core/ethtool.c linux-2.6.12-rc4/net/core/ethtool.c --- linux-2.6.12-rc4.org/net/core/ethtool.c 2005-06-02 16:55:32.000000000 +0545 +++ linux-2.6.12-rc4/net/core/ethtool.c 2005-06-02 21:53:16.000000000 +0545 @@ -72,6 +72,21 @@ return 0; } +u32 ethtool_op_get_uso(struct net_device *dev) +{ + return (dev->features & NETIF_F_USO) != 0; +} + +int ethtool_op_set_uso(struct net_device *dev, u32 data) +{ + if (data) + dev->features |= NETIF_F_USO; + else + dev->features &= ~NETIF_F_USO; + + return 0; +} + /* Handlers for each ethtool command */ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr) @@ -548,6 +563,39 @@ return dev->ethtool_ops->set_tso(dev, edata.data); } +static int ethtool_get_uso(struct net_device *dev, char __user *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GUSO }; + + if (!dev->ethtool_ops->get_uso) + return -EOPNOTSUPP; + + edata.data = dev->ethtool_ops->get_uso(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_uso(struct net_device *dev, char __user *useraddr) +{ + struct ethtool_value edata; + + if (!dev->ethtool_ops->set_uso) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return -EFAULT; + + if (edata.data && !(dev->features & NETIF_F_FRAGLIST)) + return -EINVAL; + + if (edata.data && !(dev->features & NETIF_F_HW_CSUM)) + return -EINVAL; + + return dev->ethtool_ops->set_uso(dev, edata.data); +} + static int ethtool_self_test(struct net_device *dev, char __user *useraddr) { struct ethtool_test test; @@ -795,6 +843,12 @@ case ETHTOOL_GSTATS: rc = ethtool_get_stats(dev, useraddr); break; + case ETHTOOL_GUSO: + rc = ethtool_get_uso(dev, useraddr); + break; + case ETHTOOL_SUSO: + rc = ethtool_set_uso(dev, useraddr); + break; default: rc = -EOPNOTSUPP; } @@ -817,3 +871,6 @@ EXPORT_SYMBOL(ethtool_op_set_sg);
EXPORT_SYMBOL(ethtool_op_set_tso); EXPORT_SYMBOL(ethtool_op_set_tx_csum); +EXPORT_SYMBOL(ethtool_op_set_uso); +EXPORT_SYMBOL(ethtool_op_get_uso); + diff -uNr linux-2.6.12-rc4.org/net/core/skbuff.c linux-2.6.12-rc4/net/core/skbuff.c --- linux-2.6.12-rc4.org/net/core/skbuff.c 2005-05-27 23:22:46.000000000 +0545 +++ linux-2.6.12-rc4/net/core/skbuff.c 2005-06-02 20:27:27.000000000 +0545 @@ -159,6 +159,8 @@ skb_shinfo(skb)->tso_size = 0; skb_shinfo(skb)->tso_segs = 0; skb_shinfo(skb)->frag_list = NULL; + skb_shinfo(skb)->ip6_frag_id = 0; + skb_shinfo(skb)->uso_size = 0; out: return skb; nodata: diff -uNr linux-2.6.12-rc4.org/net/ipv4/ip_output.c linux-2.6.12-rc4/net/ipv4/ip_output.c --- linux-2.6.12-rc4.org/net/ipv4/ip_output.c 2005-05-27 23:22:46.000000000 +0545 +++ linux-2.6.12-rc4/net/ipv4/ip_output.c 2005-05-31 15:55:39.000000000 +0545 @@ -291,7 +291,8 @@ { IP_INC_STATS(IPSTATS_MIB_OUTREQUESTS); - if (skb->len > dst_mtu(skb->dst) && !skb_shinfo(skb)->tso_size) + if (skb->len > dst_mtu(skb->dst) && + !(skb_shinfo(skb)->tso_size || skb_shinfo(skb)->uso_size)) return ip_fragment(skb, ip_finish_output); else return ip_finish_output(skb); @@ -768,7 +769,6 @@ mtu = inet->cork.fragsize; } hh_len = LL_RESERVED_SPACE(rt->u.dst.dev); - fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0); maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen; @@ -864,6 +864,12 @@ skb->ip_summed = csummode; skb->csum = 0; skb_reserve(skb, hh_len); + if ((!offset) && (length > mtu) && + (sk->sk_protocol == IPPROTO_UDP) && + (rt->u.dst.dev->features & NETIF_F_USO)) { + skb_shinfo(skb)->uso_size = mtu - fragheaderlen; + skb->ip_summed = CHECKSUM_HW; + } /* * Find where to start putting bytes. 
diff -uNr linux-2.6.12-rc4.org/net/ipv4/udp.c linux-2.6.12-rc4/net/ipv4/udp.c --- linux-2.6.12-rc4.org/net/ipv4/udp.c 2005-05-27 23:23:55.000000000 +0545 +++ linux-2.6.12-rc4/net/ipv4/udp.c 2005-05-31 21:14:44.000000000 +0545 @@ -424,9 +424,10 @@ goto send; } - if (skb_queue_len(&sk->sk_write_queue) == 1) { + if ((skb_queue_len(&sk->sk_write_queue) == 1) || + (skb_shinfo(skb)->uso_size)) { /* - * Only one fragment on the socket. + * Only one fragment on the socket or it is udp lso skb. */ if (skb->ip_summed == CHECKSUM_HW) { skb->csum = offsetof(struct udphdr, check); diff -uNr linux-2.6.12-rc4.org/net/ipv6/ip6_output.c linux-2.6.12-rc4/net/ipv6/ip6_output.c --- linux-2.6.12-rc4.org/net/ipv6/ip6_output.c 2005-05-27 23:22:46.000000000 +0545 +++ linux-2.6.12-rc4/net/ipv6/ip6_output.c 2005-06-02 20:27:55.000000000 +0545 @@ -147,7 +147,8 @@ int ip6_output(struct sk_buff *skb) { - if (skb->len > dst_mtu(skb->dst) || dst_allfrag(skb->dst)) + if ((skb->len > dst_mtu(skb->dst) || dst_allfrag(skb->dst)) && + !skb_shinfo(skb)->uso_size) return ip6_fragment(skb, ip6_output2); else return ip6_output2(skb); @@ -977,6 +978,19 @@ skb->csum = 0; /* reserve for fragmentation */ skb_reserve(skb, hh_len+sizeof(struct frag_hdr)); + if ((!offset) && (length > mtu) && + (sk->sk_protocol == IPPROTO_UDP) && + (rt->u.dst.dev->features & NETIF_F_USO)) { + struct frag_hdr fhdr; + + skb_shinfo(skb)->uso_size = + (mtu - fragheaderlen - + sizeof(struct frag_hdr)); + skb->ip_summed = CHECKSUM_HW; + ipv6_select_ident(skb, &fhdr); + skb_shinfo(skb)->ip6_frag_id = + fhdr.identification; + } /* * Find where to start putting bytes diff -uNr linux-2.6.12-rc4.org/net/ipv6/udp.c linux-2.6.12-rc4/net/ipv6/udp.c --- linux-2.6.12-rc4.org/net/ipv6/udp.c 2005-05-27 23:24:12.000000000 +0545 +++ linux-2.6.12-rc4/net/ipv6/udp.c 2005-05-31 17:32:31.000000000 +0545 @@ -590,7 +590,8 @@ goto send; } - if (skb_queue_len(&sk->sk_write_queue) == 1) { + if ((skb_queue_len(&sk->sk_write_queue) == 1) || + 
(skb_shinfo(skb)->uso_size)) { skb->csum = csum_partial((char *)uh, sizeof(struct udphdr), skb->csum); uh->check = csum_ipv6_magic(&fl->fl6_src, From tgraf@suug.ch Thu Jun 2 18:01:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 18:01:42 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j5311dXq021235 for ; Thu, 2 Jun 2005 18:01:39 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 3FBB51C0EF; Fri, 3 Jun 2005 03:00:59 +0200 (CEST) Date: Fri, 3 Jun 2005 03:00:59 +0200 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev Subject: Re: PATCH: ioctl send PID in netlink events Message-ID: <20050603010059.GU15391@postel.suug.ch> References: <1117720349.6050.59.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1117720349.6050.59.camel@localhost.localdomain> X-archive-position: 2010 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1453 Lines: 48 > Index: net/ipv4/devinet.c > =================================================================== > --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/devinet.c (mode:100644) > +++ uncommitted/net/ipv4/devinet.c (mode:100644) > @@ -236,6 +236,7 @@ > struct in_ifaddr *promote = NULL; > struct in_ifaddr *ifa1 = *ifap; > > + printk("inet_del_ifa: pid %d\n",current->pid); > ASSERT_RTNL(); > > /* 1. Deleting primary ifaddr forces deletion all secondaries > @@ -305,6 +306,7 @@ > > ASSERT_RTNL(); > > + printk("inet_insert_ifa: pid %d\n",current->pid); > if (!ifa->ifa_local) { > inet_free_ifa(ifa); > return 0; Don't you want to remove these? 
> Index: net/ipv4/fib_semantics.c > =================================================================== > --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/fib_semantics.c (mode:100644) > +++ uncommitted/net/ipv4/fib_semantics.c (mode:100644) > @@ -276,7 +276,7 @@ > struct nlmsghdr *n, struct netlink_skb_parms *req) > { > struct sk_buff *skb; > - u32 pid = req ? req->pid : 0; > + u32 pid = req ? req->pid : n->nlmsg_pid; > int size = NLMSG_SPACE(sizeof(struct rtmsg)+256); > > skb = alloc_skb(size, GFP_KERNEL); > @@ -1035,7 +1035,7 @@ > } > > nl->nlmsg_flags = NLM_F_REQUEST; > - nl->nlmsg_pid = 0; > + nl->nlmsg_pid = current->pid; > nl->nlmsg_seq = 0; > nl->nlmsg_len = NLMSG_LENGTH(sizeof(*rtm)); > if (cmd == SIOCDELRT) { Neat ;-> From hadi@cyberus.ca Thu Jun 2 18:38:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 18:38:44 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j531cfXq023346 for ; Thu, 2 Jun 2005 18:38:41 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1De18A-0000Ng-QE for netdev@oss.sgi.com; Thu, 02 Jun 2005 21:37:50 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1De181-0000zG-1e; Thu, 02 Jun 2005 21:37:41 -0400 Subject: Re: PATCH: ioctl send PID in netlink events From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev In-Reply-To: <20050603010059.GU15391@postel.suug.ch> References: <1117720349.6050.59.camel@localhost.localdomain> <20050603010059.GU15391@postel.suug.ch> Content-Type: text/plain Organization: unknown Date: Thu, 02 Jun 2005 21:37:35 -0400 Message-Id: <1117762655.6095.3.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2011 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 1809 Lines: 62 On Fri, 2005-03-06 at 03:00 +0200, Thomas Graf wrote: > > Index: net/ipv4/devinet.c > > =================================================================== > > --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/devinet.c (mode:100644) > > +++ uncommitted/net/ipv4/devinet.c (mode:100644) > > @@ -236,6 +236,7 @@ > > struct in_ifaddr *promote = NULL; > > struct in_ifaddr *ifa1 = *ifap; > > > > + printk("inet_del_ifa: pid %d\n",current->pid); > > ASSERT_RTNL(); > > > > /* 1. Deleting primary ifaddr forces deletion all secondaries > > @@ -305,6 +306,7 @@ > > > > ASSERT_RTNL(); > > > > + printk("inet_insert_ifa: pid %d\n",current->pid); > > if (!ifa->ifa_local) { > > inet_free_ifa(ifa); > > return 0; > > Don't you want to remove these? > > Yes, how did those get there? ;-> > > Index: net/ipv4/fib_semantics.c > > =================================================================== > > --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/fib_semantics.c (mode:100644) > > +++ uncommitted/net/ipv4/fib_semantics.c (mode:100644) > > @@ -276,7 +276,7 @@ > > struct nlmsghdr *n, struct netlink_skb_parms *req) > > { > > struct sk_buff *skb; > > - u32 pid = req ? req->pid : 0; > > + u32 pid = req ? 
req->pid : n->nlmsg_pid; > > int size = NLMSG_SPACE(sizeof(struct rtmsg)+256); > > > > skb = alloc_skb(size, GFP_KERNEL); > > @@ -1035,7 +1035,7 @@ > > } > > > > nl->nlmsg_flags = NLM_F_REQUEST; > > - nl->nlmsg_pid = 0; > > + nl->nlmsg_pid = current->pid; > > nl->nlmsg_seq = 0; > > nl->nlmsg_len = NLMSG_LENGTH(sizeof(*rtm)); > > if (cmd == SIOCDELRT) { > > Neat ;-> The second one could probably use the new macros. Maybe i will wait until Dave puts this in his tree and send a small change; else you could send it. cheers, jamal From hadi@cyberus.ca Thu Jun 2 19:37:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 19:37:29 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j532bMXq027358 for ; Thu, 2 Jun 2005 19:37:27 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1De22u-00056A-E8 for netdev@oss.sgi.com; Thu, 02 Jun 2005 22:36:28 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1De22p-0002eo-L3; Thu, 02 Jun 2005 22:36:23 -0400 Subject: Re: [PATCH] shaper.c: fix locking From: jamal Reply-To: hadi@cyberus.ca To: "David S. 
Miller" Cc: hch@lst.de, netdev@oss.sgi.com In-Reply-To: <20050602.163628.01205145.davem@davemloft.net> References: <20050527115450.GA19469@lst.de> <20050531.144114.78710204.davem@davemloft.net> <20050601052149.GA11935@lst.de> <20050602.163628.01205145.davem@davemloft.net> Content-Type: text/plain Organization: unknown Date: Thu, 02 Jun 2005 22:36:17 -0400 Message-Id: <1117766177.6095.51.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2013 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 249 Lines: 10 On Thu, 2005-02-06 at 16:36 -0700, David S. Miller wrote: > Fair enough, patch applied. If this driver breaks as a result of > these changes, you get to keep the pieces ok? :-) The question is anyone really using this driver? ;-> cheers, jamal From hadi@cyberus.ca Thu Jun 2 19:33:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 19:33:45 -0700 (PDT) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j532XfXq027002 for ; Thu, 2 Jun 2005 19:33:41 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1De1zI-0000o5-El for netdev@oss.sgi.com; Thu, 02 Jun 2005 22:32:44 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1De1zG-00029R-2M; Thu, 02 Jun 2005 22:32:42 -0400 Subject: Re: RFC: NAPI packet weighting patch From: jamal Reply-To: hadi@cyberus.ca To: "David S. 
Miller" Cc: john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, mitch.a.williams@intel.com, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <20050602.171812.48807872.davem@davemloft.net> References: <468F3FDA28AA87429AD807992E22D07E0450BFDB@orsmsx408> <20050602.171812.48807872.davem@davemloft.net> Content-Type: text/plain Organization: unknown Date: Thu, 02 Jun 2005 22:32:33 -0400 Message-Id: <1117765954.6095.49.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2012 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 2544 Lines: 54 On Thu, 2005-02-06 at 17:18 -0700, David S. Miller wrote: > From: "Ronciak, John" > Date: Thu, 2 Jun 2005 17:11:20 -0700 > > > I like this idea as well but I do an issue with it. How would this > > stack code find out that the weight is too high and pacekts are being > > dropped (not being polled fast enough)? It would have to check the > > controller stats to see the error count increasing for some period. I'm > > not sure this is workable unless we have some sort of feedback which the > > driver could send up (or set) saying that this is happening and the > > dynamic weight code could take into acount. > > What more do you need other than checking the statistics counter? The > drop statistics (the ones we care about) are incremented in real time > by the ->poll() code, so it's not like we have to trigger some > asynchronous event to get a current version of the number. I am reading through all the emails and I think either the problem is not being clearly stated or not understood. 
I was going to say "or I am on crack" - but I know I am clean ;-> Here's what I think I saw as a flow of events: Someone posted a theory that if you happen to reduce the weight (iirc the reduction was via a shift) then the DRR would give fewer CPU cycles to the driver - What's the big surprise there? That's the DRR design intent. Stephen has a patch which allows people to reduce the weight. DRR provides fairness. If you have 10 NICs coming at different wire rates, the weights provide a fairness quota without caring about what those speeds are. So it doesn't make any sense IMO to have the weight based on what the NIC speed is. In fact I claim it is _nonsense_. You don't need to factor in speed. And the claim that DRR is not real world is blasphemous. Having said that: I have a feeling that the issue being waded around is the amount of CPU that the softirq chews (unfortunately a well known issue) and to some extent the amount a specific driver chews per packet depending on the path it takes. In other words, for the DRR algorithm to enhance fairness it should consider not only fairness in the number of packets the driver injects into the system but also the amount of CPU that driver chews. At the moment we lump all drivers together as far as CPU cycles are concerned. If we could narrow it down to this, then I think there is something that could lead to meaningful discussion. This, however, does not eradicate the need for DRR and is absolutely not driver specific.
cheers, jamal From raghunathan.venkatesan@wipro.com Thu Jun 2 20:03:02 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 20:03:05 -0700 (PDT) Received: from wip-ec-wd.wipro.com (wip-ec-wd.wipro.com [203.101.113.39]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53331Xq029634 for ; Thu, 2 Jun 2005 20:03:01 -0700 Received: from wip-ec-wd.wipro.com (localhost.wipro.com [127.0.0.1]) by localhost (Postfix) with ESMTP id B83EC205E8; Fri, 3 Jun 2005 08:23:02 +0530 (IST) Received: from blr-ec-bh01.wipro.com (unknown [10.201.50.91]) by wip-ec-wd.wipro.com (Postfix) with ESMTP id 9C493205E5; Fri, 3 Jun 2005 08:23:02 +0530 (IST) Received: from chn-snr-bh2.wipro.com ([10.145.50.92]) by blr-ec-bh01.wipro.com with Microsoft SMTPSVC(6.0.3790.211); Fri, 3 Jun 2005 08:31:47 +0530 Received: from CHN-SNR-MBX01.wipro.com ([10.145.50.181]) by chn-snr-bh2.wipro.com with Microsoft SMTPSVC(6.0.3790.0); Fri, 3 Jun 2005 08:32:04 +0530 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: Unable to handle kernel paging request at virtual address 04000460 Date: Fri, 3 Jun 2005 08:28:34 +0530 Message-ID: <438662DA48DCAA41B1DF648BD4BD76C0E98682@CHN-SNR-MBX01.wipro.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Unable to handle kernel paging request at virtual address 04000460 Thread-Index: AcVnmr7o+HCu/Cf9S06Mjf+yNyDZmwATUH0g From: To: Cc: , , , X-OriginalArrivalTime: 03 Jun 2005 03:02:04.0645 (UTC) FILETIME=[9EEEC950:01C567E8] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j53331Xq029634 X-archive-position: 2014 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: raghunathan.venkatesan@wipro.com Precedence: bulk X-list: netdev Content-Length: 1431 Lines: 40 Hi Stephen, I appreciate you response. 
We'll get deeper into the problem after turning on these debugs. Thanks, Raghu -----Original Message----- From: Stephen Hemminger [mailto:shemminger@osdl.org] Sent: Thursday, June 02, 2005 11:14 PM To: Raghunathan Venkatesan (WT01 - EMBEDDED & PRODUCT ENGINEERING SOLUTIONS) Cc: davem@davemloft.net; linux-net@vger.kernel.org; netdev@oss.sgi.com; linux@der-keiler.de Subject: Re: Unable to handle kernel paging request at virtual address 04000460 On Thu, 2 Jun 2005 09:20:21 +0530 wrote: > Hi David, > I understand that the linux community may not be able to debug it for > me. All I require is if people have seen similar problems (the > problems we face are w.r.t to kfree_skb and skb_drop_fraglist crashing > due to some reason, which could be a Memory Management issue or some > thing we are not aware of), then let us know the patches, so that we > can try them out here. Turn on Debug memory allocations, spinlock debugging, sleep-inside-spinlock checking, and preempt, it will help your debugging. If you are not building your own kernel from source learn how. You are probably freeing memory twice, or not doing ref counting properly or other locking issues. Since it is your code, good luck debugging it, if you want the community help it needs to be open source code that is available for download or be in the kernel.org kernel. 
From hadi@cyberus.ca Thu Jun 2 20:34:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 20:34:02 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j533XxXq002309 for ; Thu, 2 Jun 2005 20:33:59 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1De2vl-0007ca-IK for netdev@oss.sgi.com; Thu, 02 Jun 2005 23:33:09 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1De2vg-0004O7-AG; Thu, 02 Jun 2005 23:33:04 -0400 Subject: Re: PATCH: ioctl send PID in netlink events From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev In-Reply-To: <1117762655.6095.3.camel@localhost.localdomain> References: <1117720349.6050.59.camel@localhost.localdomain> <20050603010059.GU15391@postel.suug.ch> <1117762655.6095.3.camel@localhost.localdomain> Content-Type: multipart/mixed; boundary="=-Ufypw+g9dyMzRlXKv19C" Organization: unknown Date: Thu, 02 Jun 2005 23:32:55 -0400 Message-Id: <1117769575.6095.91.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 X-archive-position: 2015 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 3665 Lines: 113 --=-Ufypw+g9dyMzRlXKv19C Content-Type: text/plain Content-Transfer-Encoding: 7bit Dave, If you havent applied that patch to net-2.6.13 heres one that removes those extrenous printks. On Thu, 2005-02-06 at 21:37 -0400, jamal wrote: > The second one could probably use the new macros. > Maybe i will wait until Dave puts this in his tree and send a small > change; else you could send it. 
> Actually cant be done, sorry i lied ;-> cheers, jamal --=-Ufypw+g9dyMzRlXKv19C Content-Disposition: attachment; filename=ifconf-2 Content-Type: text/plain; name=ifconf-2; charset=utf-8 Content-Transfer-Encoding: 7bit net/core/rtnetlink.c: needs update net/ipv4/devinet.c: needs update net/ipv4/fib_semantics.c: needs update net/ipv6/addrconf.c: needs update Index: net/core/rtnetlink.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/core/rtnetlink.c (mode:100644) +++ uncommitted/net/core/rtnetlink.c (mode:100644) @@ -452,7 +452,7 @@ if (!skb) return; - if (rtnetlink_fill_ifinfo(skb, dev, type, 0, 0, change, 0) < 0) { + if (rtnetlink_fill_ifinfo(skb, dev, type, current->pid, 0, change, 0) < 0) { kfree_skb(skb); return; } Index: net/ipv4/devinet.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/devinet.c (mode:100644) +++ uncommitted/net/ipv4/devinet.c (mode:100644) @@ -1112,7 +1112,7 @@ if (!skb) netlink_set_err(rtnl, 0, RTMGRP_IPV4_IFADDR, ENOBUFS); - else if (inet_fill_ifaddr(skb, ifa, 0, 0, event, 0) < 0) { + else if (inet_fill_ifaddr(skb, ifa, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV4_IFADDR, EINVAL); } else { Index: net/ipv4/fib_semantics.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv4/fib_semantics.c (mode:100644) +++ uncommitted/net/ipv4/fib_semantics.c (mode:100644) @@ -276,7 +276,7 @@ struct nlmsghdr *n, struct netlink_skb_parms *req) { struct sk_buff *skb; - u32 pid = req ? req->pid : 0; + u32 pid = req ? 
req->pid : n->nlmsg_pid; int size = NLMSG_SPACE(sizeof(struct rtmsg)+256); skb = alloc_skb(size, GFP_KERNEL); @@ -1035,7 +1035,7 @@ } nl->nlmsg_flags = NLM_F_REQUEST; - nl->nlmsg_pid = 0; + nl->nlmsg_pid = current->pid; nl->nlmsg_seq = 0; nl->nlmsg_len = NLMSG_LENGTH(sizeof(*rtm)); if (cmd == SIOCDELRT) { Index: net/ipv6/addrconf.c =================================================================== --- e4f7366a04d973a42a948d3b4175d66e9adf143e/net/ipv6/addrconf.c (mode:100644) +++ uncommitted/net/ipv6/addrconf.c (mode:100644) @@ -2872,7 +2872,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFADDR, ENOBUFS); return; } - if (inet6_fill_ifaddr(skb, ifa, 0, 0, event, 0) < 0) { + if (inet6_fill_ifaddr(skb, ifa, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFADDR, EINVAL); return; @@ -3007,7 +3007,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFINFO, ENOBUFS); return; } - if (inet6_fill_ifinfo(skb, idev, 0, 0, event, 0) < 0) { + if (inet6_fill_ifinfo(skb, idev, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_IFINFO, EINVAL); return; @@ -3064,7 +3064,7 @@ netlink_set_err(rtnl, 0, RTMGRP_IPV6_PREFIX, ENOBUFS); return; } - if (inet6_fill_prefix(skb, idev, pinfo, 0, 0, event, 0) < 0) { + if (inet6_fill_prefix(skb, idev, pinfo, current->pid, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTMGRP_IPV6_PREFIX, EINVAL); return; --=-Ufypw+g9dyMzRlXKv19C-- From davem@davemloft.net Thu Jun 2 22:10:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 22:10:03 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j535A1Xq007191 for ; Thu, 2 Jun 2005 22:10:01 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1De4QW-0002C8-6G; Thu, 02 Jun 2005 22:09:00 -0700 Date: Thu, 02 Jun 2005 22:09:00 -0700 (PDT) Message-Id: 
<20050602.220900.92343575.davem@davemloft.net> To: hadi@cyberus.ca Cc: tgraf@suug.ch, netdev@oss.sgi.com Subject: Re: PATCH: ioctl send PID in netlink events From: "David S. Miller" In-Reply-To: <1117769575.6095.91.camel@localhost.localdomain> References: <20050603010059.GU15391@postel.suug.ch> <1117762655.6095.3.camel@localhost.localdomain> <1117769575.6095.91.camel@localhost.localdomain> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2017 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 192 Lines: 7 From: jamal Date: Thu, 02 Jun 2005 23:32:55 -0400 > If you havent applied that patch to net-2.6.13 heres one that removes > those extrenous printks. Applied, thanks Jamal. From davem@davemloft.net Thu Jun 2 22:12:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 22:12:21 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j535CIXq007864 for ; Thu, 2 Jun 2005 22:12:18 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1De4Sm-0002Dl-1U; Thu, 02 Jun 2005 22:11:20 -0700 Date: Thu, 02 Jun 2005 22:11:19 -0700 (PDT) Message-Id: <20050602.221119.105431518.davem@davemloft.net> To: herbert@gondor.apana.org.au Cc: netdev@oss.sgi.com Subject: Re: [SCTP] Replace spin_lock_irqsave with spin_lock_bh From: "David S. 
Miller" In-Reply-To: <20050602095459.GA26638@gondor.apana.org.au> References: <20050602094404.GA10316@gondor.apana.org.au> <20050602095459.GA26638@gondor.apana.org.au> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2019 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 334 Lines: 10 From: Herbert Xu Date: Thu, 2 Jun 2005 19:54:59 +1000 > The call in question is only called from recvmsg which means that > IRQs aren't disabled. Therefore it is safe to replace it with > spin_lock_bh. > > Signed-off-by: Herbert Xu Also applied to net-2.6.13, thanks. From davem@davemloft.net Thu Jun 2 22:11:29 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 22:11:33 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j535BSXq007449 for ; Thu, 2 Jun 2005 22:11:29 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1De4Rv-0002Cn-3Z; Thu, 02 Jun 2005 22:10:27 -0700 Date: Thu, 02 Jun 2005 22:10:26 -0700 (PDT) Message-Id: <20050602.221026.112287995.davem@davemloft.net> To: herbert@gondor.apana.org.au Cc: netdev@oss.sgi.com Subject: Re: [IPV4/IPV6] Replace spin_lock_irq with spin_lock_bh From: "David S. 
Miller" In-Reply-To: <20050602094404.GA10316@gondor.apana.org.au> References: <20050602094404.GA10316@gondor.apana.org.au> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2018 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 570 Lines: 14 From: Herbert Xu Date: Thu, 2 Jun 2005 19:44:04 +1000 > In light of my recent patch to net/ipv4/udp.c that replaced the > spin_lock_irq calls on the receive queue lock with spin_lock_bh, > here is a similar patch for all other occurences of spin_lock_irq > on receive/error queue locks in IPv4 and IPv6. > > In these stacks, we know that they can only be entered from user > or softirq context. Therefore it's safe to disable BH only. > > Signed-off-by: Herbert Xu Applied to net-2.6.13, thanks Herbert. From davem@davemloft.net Thu Jun 2 22:09:52 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 22:10:00 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j5359pXq007173 for ; Thu, 2 Jun 2005 22:09:52 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1De4QD-0002Bu-Br; Thu, 02 Jun 2005 22:08:41 -0700 Date: Thu, 02 Jun 2005 22:08:41 -0700 (PDT) Message-Id: <20050602.220841.48530513.davem@davemloft.net> To: hadi@cyberus.ca Cc: tgraf@suug.ch, netdev@oss.sgi.com Subject: Re: PATCH: explicit typing WAS(Re: PATCH: rtnetlink explicit flags setting From: "David S. 
Miller" In-Reply-To: <1117717493.6050.29.camel@localhost.localdomain> References: <20050531222646.GK15391@postel.suug.ch> <20050531.153125.95894437.davem@davemloft.net> <1117717493.6050.29.camel@localhost.localdomain> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2016 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 276 Lines: 9 From: jamal Date: Thu, 02 Jun 2005 09:04:52 -0400 > This patch converts "unsigned flags" to use more explicit types like u16 > instead and incrementally introduces NLMSG_NEW(). > > Signed-off-by: Jamal Hadi Salim Applied, thanks Jamal. From kostodo@gmail.com Thu Jun 2 22:46:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 22:46:35 -0700 (PDT) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.195]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j535kSXq011247 for ; Thu, 2 Jun 2005 22:46:28 -0700 Received: by rproxy.gmail.com with SMTP id z35so260962rne for ; Thu, 02 Jun 2005 22:45:30 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=KQ5UPSnKzwwBGN3AFNeRIlX1sOp1cCO/MNatNpmKTAhYoLC1XeEAx3av3TlIqQZvlkMGDolXZe03jUYIGpqA2y7A+mmAtgXvqbB4ZKDzv8Mnnd86y8t+3XSeSlrwz8nxwsfXDtAkC3KJaI1bQsz25lDItg+8J98GonoMfPcqz1c= Received: by 10.38.88.3 with SMTP id l3mr730055rnb; Thu, 02 Jun 2005 22:45:30 -0700 (PDT) Received: by 10.38.208.46 with HTTP; Thu, 2 Jun 2005 22:45:30 -0700 (PDT) Message-ID: Date: Fri, 3 Jun 2005 09:45:30 +0400 From: Kosta Todorovic Reply-To: Kosta Todorovic To: Ben Greear Subject: Re: Network card driver problem (znb.o/tulip) Cc: 
jgarzik@pobox.com, tulip-users@lists.sourceforge.net, netdev@oss.sgi.com In-Reply-To: <428E0B3B.1090507@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <428CC958.1080909@candelatech.com> <428E0B3B.1090507@candelatech.com> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j535kSXq011247 X-archive-position: 2020 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kostodo@gmail.com Precedence: bulk X-list: netdev Content-Length: 5281 Lines: 152 I'm not too concerned with backward compatibility. I see silicom-usa provides both Broadcom and Intel based chipsets. Is there any reason in particular that you recommended Broadcom? And can standard kernel drivers be used for these cards? I've had bad experience with custom manufacturer drivers once they discontinue development and support for their card. How reliable is Silicom-usa? As a management decision, who would you purchase 10 quad port cards from and which kinds of cards would u get? Thanks, K On 5/20/05, Ben Greear wrote: > Kosta Todorovic wrote: > > 2 more questions: > > > > 1) Is there anything special I will need to compile in terms of the > > linux kernel for 64-bit PCI bus mode (PCI-X) ? (Currently I'm using > > kernel 2.4.x but that is because my current card drivers do not > > support 2.6.x) > > Nothing special...2.4 and 2.6 kernels since way back will work just fine. > > > 2) The machine actually has a PCI extension with 9 other PCI-X slots. > > The current cards are 64-bit (pci-x) but as a test I'm planning on > > replacing them with DLinks DFE-580tx's. Unfortunately these are 32-bit > > cards (legacy pci). How will these 4 ports work in 32-bit mode? What > > will the effect be on the speed? > > If you put a 33Mhz NIC in a PCI-X bus it makes the entire bus run at > 33Mhz speed. 
> > If you do want full backwards compatibility, get the 'universal' 4-port > broadcom NIC from silicom-usa. It works fine in 32-bit PCI busses, and > though I haven't personally tested it, it should work fine in PCI-X > busses at high speed as well. > > Ben > > > > > > > > > On 5/19/05, Ben Greear wrote: > > > >>Kosta Todorovic wrote: > >> > >>>What's the best 4-port NIC currently available? I'm interested in > >>>purchasing 10 4-port NICs as a replacement for my current cards. > >>> > >>>I am looking for 10/100Mbps and a good driver for linux (2.4.x and > >>>2.6.x). Preferably a mainstream company but that's not a priority. > >>> > >>>Could the community please recommend the best card available? Money is > >>>not an issue since I'm really interested in the best of the best. > >> > >>Get an Intel 4-port GigE NIC. It will do 10/100/1000, and if you really > >>want to use all 4 ports at even 100Mbps, you need the 64-bit PCI bus... > >> > >>I have been getting mine from silicom-usa.com lately. They also have > >>6-port NICs, and 4-port broadcom GigE nics that can be used in 32-bit > >>PCI slots. (The Intel 4-port NICs will only work in 64-bit PCI slots.) > >> > >>If you really want 10/100 nics, try the p430tx from aei: > >>http://www.aei-it.com/hardware/fastenet/p430tx.htm > >> > >>These are like the old DFE570tx NICs, and use the tulip driver. They > >>are almost as expensive as the GigE NICs though... > >> > >>Thanks, > >>Ben > >> > >> > >>>Any suggestions? > >>> > >>>Regards, > >>>Kosta > >>> > >>> > >>> > >>>On 3/11/05, Kosta Todorovic wrote: > >>> > >>> > >>>>My company has recently purchased several ZNYX ZX274 network cards. > >>>>These cards are Four Channel, 10/100 PCI Adapters. They use Intel chipsets. > >>>> > >>>>Unfortunately there exist no drivers for the linux amd64 architecture. > >>>>There are 32bit drivers found at: > >>>>http://www.znyx.com/support/drivers/ZX374_drivers.htm but naturally > >>>>they won't compile under my amd64 system. 
> >>>> > >>>>The driver itself is called znb.o and can be downloaded from ZNYX's > >>>>website. I spoke to support staff there but they told me they have > >>>>discontinued support and development for this series of cards. > >>>> > >>>>The system is running gentoo and I have tried both 2.4.x and 2.6.x > >>>>kernels but no luck. > >>>> > >>>>Unfortunately there are NO 64bit drivers available for ANY platform, not even MS. > >>>> > >>>>Does anyone know of a customised znb.o driver built for amd64? > >>>>Is there any chance of anyone modifying the source code of the driver > >>>>to compile under an amd64 system? > >>>> > >>>>I've noticed that "tulip" drivers get loaded as a module at boot time, > >>>>but they don't function correctly. (It lets you start the device and > >>>>attach ips but can't talk through it.) > >>>> > >>>>Are there any variants of the tulip driver that will work for this? > >>>> > >>>>Help much appreciated. > >>>> > >>>> > >>>>/proc/pci extract for network cards: > >>>> > >>>> Bus 5, device 5, function 0: > >>>> Ethernet controller: Digital Equipment Corporation DECchip > >>>>21142/43 (#30) (rev 65). > >>>> IRQ 30. > >>>> Master Capable. Latency=128. Min Gnt=20.Max Lat=40. > >>>> I/O at 0x0 [0x7f]. > >>>> Non-prefetchable 32 bit memory at 0xfa1ff400 [0xfa1ff7ff]. > >>>> Bus 5, device 4, function 0: > >>>> Ethernet controller: Digital Equipment Corporation DECchip > >>>>21142/43 (#29) (rev 65). > >>>> IRQ 29. > >>>> Master Capable. No bursts. Min Gnt=20.Max Lat=40. > >>>> I/O at 0x0 [0x7f]. > >>>> Non-prefetchable 32 bit memory at 0xf9f00000 [0xf9f003ff]. 
> >>>> > >>> > >>> > >> > >>-- > >>Ben Greear > >>Candela Technologies Inc http://www.candelatech.com > >> > >> > > > > > > > -- > Ben Greear > Candela Technologies Inc http://www.candelatech.com > > From greearb@candelatech.com Thu Jun 2 22:54:57 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 22:55:00 -0700 (PDT) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j535suXq012007 for ; Thu, 2 Jun 2005 22:54:56 -0700 Received: from [71.112.207.80] (pool-71-112-207-80.sttlwa.dsl-w.verizon.net [71.112.207.80]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j536RJ5I026433; Thu, 2 Jun 2005 23:27:20 -0700 Message-ID: <429FF071.8040707@candelatech.com> Date: Thu, 02 Jun 2005 22:53:53 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Kosta Todorovic CC: jgarzik@pobox.com, tulip-users@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: Network card driver problem (znb.o/tulip) References: <428CC958.1080909@candelatech.com> <428E0B3B.1090507@candelatech.com> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2021 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 1133 Lines: 30 Kosta Todorovic wrote: > I'm not too concerned with backward compatibility. I see silicom-usa > provides both Broadcom and Intel based chipsets. > > Is there any reason in particular that you recommended Broadcom? And > can standard kernel drivers be used for these cards? I've had bad > experience with custom manufacturer drivers once they discontinue > development and support for their card. 
The BCM NICs will work in a normal 32-bit PCI bus..the 4-port Intels will not. If you have 64-bit PCI-X, then I'd get Intel..but that's just because I've used them longer...I have no reason to believe the BCM is inferior at this time. > How reliable is Silicom-usa? > > As a management decision, who would you purchase 10 quad port cards > from and which kinds of cards would u get? Heh, I've already purchased more than 10 from silicom, and have shipped them all over the world. So far...no complaints! But, if you don't need the BCM, you can get good ole Intel quad GigE NICs from www.newegg.com and a million other places. Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From kostodo@gmail.com Thu Jun 2 23:00:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 23:00:39 -0700 (PDT) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.205]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j5360aXq014080 for ; Thu, 2 Jun 2005 23:00:36 -0700 Received: by rproxy.gmail.com with SMTP id z35so262012rne for ; Thu, 02 Jun 2005 22:59:38 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=MqhClYLMKtWNDtSO9HOZE3iKsNC79hPqIupJwBH2jx6xreuo5TRlp9QQFHhzyGLH5SLAhrmumGFo4fiZ1vFXdn/aagDSzJDm86MD91MT/9f8Xb6GAGMIa+w9ejQB8Rhwa4zQcP1uiiE7mPh7ik91ChBO2Huj4Cq1uNzTH4RBzvI= Received: by 10.38.88.1 with SMTP id l1mr727356rnb; Thu, 02 Jun 2005 22:58:44 -0700 (PDT) Received: by 10.38.208.46 with HTTP; Thu, 2 Jun 2005 22:58:44 -0700 (PDT) Message-ID: Date: Fri, 3 Jun 2005 09:58:44 +0400 From: Kosta Todorovic Reply-To: Kosta Todorovic To: Ben Greear Subject: Re: Network card driver problem (znb.o/tulip) Cc: jgarzik@pobox.com, tulip-users@lists.sourceforge.net, netdev@oss.sgi.com In-Reply-To: <429FF071.8040707@candelatech.com> Mime-Version: 1.0 Content-Type: 
text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <428CC958.1080909@candelatech.com> <428E0B3B.1090507@candelatech.com> <429FF071.8040707@candelatech.com> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j5360aXq014080 X-archive-position: 2022 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kostodo@gmail.com Precedence: bulk X-list: netdev Content-Length: 1341 Lines: 36 So intel gigE nics use standard tulip linux drivers that come shipped with a vanilla kernel? On 6/3/05, Ben Greear wrote: > Kosta Todorovic wrote: > > I'm not too concerned with backward compatibility. I see silicom-usa > > provides both Broadcom and Intel based chipsets. > > > > Is there any reason in particular that you recommended Broadcom? And > > can standard kernel drivers be used for these cards? I've had bad > > experience with custom manufacturer drivers once they discontinue > > development and support for their card. > > The BCM NICs will work in a normal 32-bit PCI bus..the 4-port Intels will > not. If you have 64-bit PCI-X, then I'd get Intel..but that's just because > I've used them longer...I have no reason to believe the BCM is inferior at > this time. > > > How reliable is Silicom-usa? > > > > As a management decision, who would you purchase 10 quad port cards > > from and which kinds of cards would u get? > > Heh, I've already purchased more than 10 from silicom, and have shipped > them all over the world. So far...no complaints! But, if you don't > need the BCM, you can get good ole Intel quad GigE NICs from www.newegg.com > and a million other places. 
> > Ben > > -- > Ben Greear > Candela Technologies Inc http://www.candelatech.com > > From greearb@candelatech.com Thu Jun 2 23:26:56 2005 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Jun 2005 23:27:00 -0700 (PDT) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j536QuXq015870 for ; Thu, 2 Jun 2005 23:26:56 -0700 Received: from [71.112.207.80] (pool-71-112-207-80.sttlwa.dsl-w.verizon.net [71.112.207.80]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j536xJ5I026786; Thu, 2 Jun 2005 23:59:19 -0700 Message-ID: <429FF7F0.7050505@candelatech.com> Date: Thu, 02 Jun 2005 23:25:52 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Kosta Todorovic CC: jgarzik@pobox.com, tulip-users@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: Network card driver problem (znb.o/tulip) References: <428CC958.1080909@candelatech.com> <428E0B3B.1090507@candelatech.com> <429FF071.8040707@candelatech.com> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2023 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 369 Lines: 15 Kosta Todorovic wrote: > So intel gigE nics use standard tulip linux drivers that come shipped > with a vanilla kernel? No..forget about tulip. It uses standard e1000 driver shipped with vanilla kernel. The BCM chipsets use standard drivers in the kernel as well. 
Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From jbenc@suse.cz Fri Jun 3 02:34:43 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 02:34:46 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j539YgXq029261 for ; Fri, 3 Jun 2005 02:34:43 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 1EC99628312; Fri, 3 Jun 2005 11:33:44 +0200 (CEST) Date: Fri, 3 Jun 2005 11:33:43 +0200 From: Jiri Benc To: Cc: , Subject: Re: [PATCH] ieee80211: Update generic definitions to latest specs. Message-ID: <20050603113343.55d19cfc@griffin.suse.cz> In-Reply-To: <20050602190232.340996282D7@mail.suse.cz> References: <20050602190232.340996282D7@mail.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2024 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 753 Lines: 24 On Thu, 2 Jun 2005 21:02:24 +0200, gwingerde@home.nl wrote: > I was thinking about that too, but couldn't find a proper shorter > version without losing the descriptive meaning. > > Do you have any suggestions to shorten them? Maybe we can lose a bit of descriptiveness and put comments above definitions instead? I can imagine names such as WLAN_STATUS_ASSOC_DENIED_NOSPECTRUM, WLAN_STATUS_ASSOC_DENIED_BAD_POWER, WLAN_STATUS_ASSOC_DENIED_BAD_SUPPCHANNS, WLAN_REASON_DISASSOC_BAD_POWER, and so on. Also WLAN_STATUS_ASSOC_DENIED_NOSHORT seems to be acceptable for me. More often used identifiers probably could have even shorter name - what about renaming IEEE80211_FCTL_PROTECTEDFRAME to IEEE80211_FCTL_PROTECTED? 
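A minimal sketch of that naming scheme, with the descriptive meaning moved into a comment above each definition. The enum tag and the helper function are hypothetical, and the numeric values are assumed from the 802.11 status-code table and frame control layout, so check them against the spec before relying on them:

```c
#include <stdint.h>

/*
 * Illustrative only: shortened identifiers, with the full meaning kept in
 * a comment above each definition. The enum tag is hypothetical and the
 * numeric values are assumptions taken from the 802.11 status-code table.
 */
enum wlan_status_sketch {
	/* Association denied: Spectrum Management capability is required */
	WLAN_STATUS_ASSOC_DENIED_NOSPECTRUM = 22,
	/* Association denied: Power Capability element is unacceptable */
	WLAN_STATUS_ASSOC_DENIED_BAD_POWER = 23,
	/* Association denied: Supported Channels element is unacceptable */
	WLAN_STATUS_ASSOC_DENIED_BAD_SUPPCHANNS = 24,
};

/* Frame control: Protected Frame bit (assumed to be bit 14) */
#define IEEE80211_FCTL_PROTECTED 0x4000

/* Hypothetical helper: nonzero if the frame control word has the bit set. */
static inline int fctl_is_protected(uint16_t fc)
{
	return (fc & IEEE80211_FCTL_PROTECTED) != 0;
}
```

The shorter names keep call sites readable, while the comments carry the detail the longer identifiers used to spell out.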
Thanks, -- Jiri Benc SUSE Labs From baruch@ev-en.org Fri Jun 3 06:43:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 06:44:06 -0700 (PDT) Received: from galon.ev-en.org (rrcs-24-123-59-149.central.biz.rr.com [24.123.59.149]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53DhvXq018401 for ; Fri, 3 Jun 2005 06:43:58 -0700 Received: by galon.ev-en.org (Postfix, from userid 105) id 3DBED11A953; Fri, 3 Jun 2005 16:42:59 +0300 (IDT) Received: from [10.220.3.66] (hamilton.nuim.ie [149.157.192.252]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by galon.ev-en.org (Postfix) with ESMTP id 5196D11A951; Fri, 3 Jun 2005 16:42:53 +0300 (IDT) Message-ID: <42A05E5C.9050408@ev-en.org> Date: Fri, 03 Jun 2005 14:42:52 +0100 From: Baruch Even User-Agent: Debian Thunderbird 1.0.2 (X11/20050331) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com, shemminger@osdl.org, doug.leith@nuim.ie Subject: Re: Comparison of several congestion control algorithms References: <4298E045.9050009@ev-en.org> <20050602.163512.10298458.davem@davemloft.net> <429F9B2F.8030507@ev-en.org> <20050602.165341.63126720.davem@davemloft.net> In-Reply-To: <20050602.165341.63126720.davem@davemloft.net> X-Enigmail-Version: 0.91.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 2025 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: baruch@ev-en.org Precedence: bulk X-list: netdev Content-Length: 936 Lines: 27 David S. Miller wrote: > From: Baruch Even > Date: Fri, 03 Jun 2005 00:50:07 +0100 > > >>This is in part because of the start of the work that was based on 2.4 >>kernels and even as far as the 2.6.6 kernel which had disabled TSO once >>it saw SACKs. This made TSO unusable for our needs. >> >>AFAIK, the tests reported in that document used kernel 2.6.6. 
> > > Sure SACKs turn off TSO currently, but you'll have them enabled > at the beginning until the first loss and this affects how fast > the cwnd will grow. > > If you have e1000 cards, for example, you're getting TSO enabled > by default. > > You really need to look into this, as it has a real and very > non-trivial effect on all of the results you obtained. I checked that now and ethtool -k shows TSO to be disabled after boot. Since none of the test scripts touch ethtool, I can be sure that TSO was off during all of our tests. Baruch From jbenc@suse.cz Fri Jun 3 09:27:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:27:31 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GROXq031421 for ; Fri, 3 Jun 2005 09:27:24 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 8B6CE6282FC; Fri, 3 Jun 2005 18:26:25 +0200 (CEST) Date: Fri, 3 Jun 2005 18:26:25 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [0/9] ieee80211: Improvements to the layer Message-ID: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2026 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 456 Lines: 14 The following patches are nearly the same as those sent a couple of days ago. However, they are against the current netdev-2.6 tree and they contain some more fixes (TKIP compilation, a new file for protocol layer functions). The HH_DATA_OFF bugfix is needed too (http://oss.sgi.com/projects/netdev/archive/2005-05/msg00962.html); it's not included here as it is already in Linus' tree. 
Also there are two patches from Adrian Bunk included. -- Jiri Benc SUSE Labs From jbenc@suse.cz Fri Jun 3 09:29:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:29:23 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GTIXq031749 for ; Fri, 3 Jun 2005 09:29:18 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id EBDDB628305; Fri, 3 Jun 2005 18:28:19 +0200 (CEST) Date: Fri, 3 Jun 2005 18:28:19 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [1/9] ieee80211: remove pci.h #include's Message-ID: <20050603182819.44500c27@griffin.suse.cz> In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2027 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 1527 Lines: 44 From: Adrian Bunk I was wondering why editing pci.h triggered the rebuild of three files under net/, and as far as I can see, there's no reason for these three files to #include pci.h . 
Signed-off-by: Adrian Bunk Signed-off-by: Jiri Benc --- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_module.c.old 2005-04-30 23:23:14.000000000 +0200 +++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_module.c 2005-04-30 23:23:18.000000000 +0200 @@ -40,7 +40,6 @@ #include #include #include -#include #include #include #include --- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_tx.c.old 2005-04-30 23:23:25.000000000 +0200 +++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_tx.c 2005-04-30 23:23:32.000000000 +0200 @@ -33,7 +33,6 @@ #include #include #include -#include #include #include #include --- linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_rx.c.old 2005-04-30 23:23:42.000000000 +0200 +++ linux-2.6.12-rc3-mm1-full/net/ieee80211/ieee80211_rx.c 2005-04-30 23:23:46.000000000 +0200 @@ -23,7 +23,6 @@ #include #include #include -#include #include #include #include -- Jiri Benc SUSE Labs From jbenc@suse.cz Fri Jun 3 09:30:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:30:22 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GUIXq032183 for ; Fri, 3 Jun 2005 09:30:18 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 5C4466282FC; Fri, 3 Jun 2005 18:29:20 +0200 (CEST) Date: Fri, 3 Jun 2005 18:29:20 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [2/9] ieee80211: fix recursive ipw2200 dependencies Message-ID: <20050603182920.689a269f@griffin.suse.cz> In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2028 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 896 Lines: 32 From: Adrian Bunk This results in recursive dependencies: - IPW2200 depends on NET_RADIO - IPW2200 selects IEEE80211 - IEEE80211 selects NET_RADIO This patch fixes the IPW2200 dependencies in a way that they are similar to the IPW2100 dependencies. Signed-off-by: Adrian Bunk Signed-off-by: Jiri Benc --- linux-2.6.12-rc5-mm2-full/drivers/net/wireless/Kconfig.old 2005-06-02 22:04:02.000000000 +0200 +++ linux-2.6.12-rc5-mm2-full/drivers/net/wireless/Kconfig 2005-06-02 22:04:40.000000000 +0200 @@ -192,9 +192,8 @@ config IPW2200 tristate "Intel PRO/Wireless 2200BG and 2915ABG Network Connection" - depends on NET_RADIO && PCI + depends on IEEE80211 && PCI select FW_LOADER - select IEEE80211 ---help--- A driver for the Intel PRO/Wireless 2200BG and 2915ABG Network Connection adapters. -- Jiri Benc SUSE Labs From jbenc@suse.cz Fri Jun 3 09:31:47 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:31:51 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GVkXq000509 for ; Fri, 3 Jun 2005 09:31:47 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 5E157628305; Fri, 3 Jun 2005 18:30:48 +0200 (CEST) Date: Fri, 3 Jun 2005 18:30:48 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [3/9] ieee80211: fix ipw 64bit compilation warnings Message-ID: <20050603183048.7786f98b@griffin.suse.cz> In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2029 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 7028 Lines: 237 This patch fixes warnings when compiling ipw2100 and ipw2200 on x86_64. Signed-off-by: Jiri Benc Signed-off-by: Jirka Bohac Index: netdev/drivers/net/wireless/ipw2200.c =================================================================== --- netdev.orig/drivers/net/wireless/ipw2200.c 2005-06-01 11:03:37.000000000 +0200 +++ netdev/drivers/net/wireless/ipw2200.c 2005-06-03 15:46:31.000000000 +0200 @@ -241,8 +241,8 @@ IPW_DEBUG_IO(" reg = 0x%8X : value = 0x%8X\n", reg, value); _ipw_write32(priv, CX2_INDIRECT_ADDR, reg & CX2_INDIRECT_ADDR_MASK); _ipw_write8(priv, CX2_INDIRECT_DATA, value); - IPW_DEBUG_IO(" reg = 0x%8X : value = 0x%8X\n", - (unsigned)(priv->hw_base + CX2_INDIRECT_DATA), + IPW_DEBUG_IO(" reg = 0x%8lX : value = 0x%8X\n", + (unsigned long)(priv->hw_base + CX2_INDIRECT_DATA), value); } @@ -508,7 +508,7 @@ /* verify we have enough room to store the value */ if (*len < sizeof(u32)) { IPW_DEBUG_ORD("ordinal buffer length too small, " - "need %d\n", sizeof(u32)); + "need %d\n", (int)sizeof(u32)); return -EINVAL; } @@ -541,7 +541,7 @@ /* verify we have enough room to store the value */ if (*len < sizeof(u32)) { IPW_DEBUG_ORD("ordinal buffer length too small, " - "need %d\n", sizeof(u32)); + "need %d\n", (int)sizeof(u32)); return -EINVAL; } @@ -1740,7 +1740,7 @@ u32 address = CX2_SHARED_SRAM_DMA_CONTROL + (sizeof(struct command_block) * index); IPW_DEBUG_FW(">> :\n"); - ipw_write_indirect(priv, address, (u8*)cb, sizeof(struct command_block)); + ipw_write_indirect(priv, address, (u8*)cb, (int)sizeof(struct command_block)); IPW_DEBUG_FW("<< :\n"); return 0; @@ -2342,11 +2342,11 @@ return -EINVAL; } - IPW_DEBUG_INFO("Loading firmware '%s' file v%d.%d (%d bytes)\n", + IPW_DEBUG_INFO("Loading firmware '%s' file v%d.%d (%ld bytes)\n", name, IPW_FW_MAJOR(header->version), IPW_FW_MINOR(header->version), - (*fw)->size - sizeof(struct fw_header)); + (long)(*fw)->size - 
sizeof(struct fw_header)); return 0; } @@ -2698,7 +2698,7 @@ q->bd = pci_alloc_consistent(dev,sizeof(q->bd[0])*count, &q->q.dma_addr); if (!q->bd) { IPW_ERROR("pci_alloc_consistent(%d) failed\n", - sizeof(q->bd[0]) * count); + (int)sizeof(q->bd[0]) * count); kfree(q->txb); q->txb = NULL; return -ENOMEM; @@ -3467,7 +3467,7 @@ } else { IPW_DEBUG_SCAN("Scan result of wrong size %d " "(should be %d)\n", - notif->size,sizeof(*x)); + notif->size, (int)sizeof(*x)); } break; } @@ -3483,7 +3483,7 @@ } else { IPW_ERROR("Scan completed of wrong size %d " "(should be %d)\n", - notif->size,sizeof(*x)); + notif->size, (int)sizeof(*x)); } priv->status &= ~(STATUS_SCANNING | STATUS_SCAN_ABORTING); @@ -3516,7 +3516,7 @@ } else { IPW_ERROR("Frag length of wrong size %d " "(should be %d)\n", - notif->size, sizeof(*x)); + notif->size, (int)sizeof(*x)); } break; } @@ -3533,7 +3533,7 @@ } else { IPW_ERROR("Link Deterioration of wrong size %d " "(should be %d)\n", - notif->size,sizeof(*x)); + notif->size, (int)sizeof(*x)); } break; } @@ -3552,7 +3552,7 @@ struct notif_beacon_state *x = &notif->u.beacon_state; if (notif->size != sizeof(*x)) { IPW_ERROR("Beacon state of wrong size %d (should " - "be %d)\n", notif->size, sizeof(*x)); + "be %d)\n", notif->size, (int)sizeof(*x)); break; } @@ -3603,7 +3603,7 @@ } IPW_ERROR("TGi Tx Key of wrong size %d (should be %d)\n", - notif->size,sizeof(*x)); + notif->size, (int)sizeof(*x)); break; } @@ -3617,7 +3617,7 @@ } IPW_ERROR("Calibration of wrong size %d (should be %d)\n", - notif->size,sizeof(*x)); + notif->size, (int)sizeof(*x)); break; } @@ -3629,7 +3629,7 @@ } IPW_ERROR("Noise stat is wrong size %d (should be %d)\n", - notif->size, sizeof(u32)); + notif->size, (int)sizeof(u32)); break; } @@ -4823,7 +4823,7 @@ } /* Advance skb->data to the start of the actual payload */ - skb_reserve(rxb->skb, (u32)&pkt->u.frame.data[0] - (u32)pkt); + skb_reserve(rxb->skb, offsetof(struct ipw_rx_packet, u.frame.data)); /* Set the size of the skb to the size of the 
frame */ skb_put(rxb->skb, pkt->u.frame.length); Index: netdev/drivers/net/wireless/ipw2100.c =================================================================== --- netdev.orig/drivers/net/wireless/ipw2100.c 2005-06-01 11:03:37.000000000 +0200 +++ netdev/drivers/net/wireless/ipw2100.c 2005-06-03 15:43:53.000000000 +0200 @@ -494,7 +494,7 @@ IPW_DEBUG_WARNING(DRV_NAME ": ordinal buffer length too small, need %d\n", - IPW_ORD_TAB_1_ENTRY_SIZE); + (int)IPW_ORD_TAB_1_ENTRY_SIZE); return -EINVAL; } @@ -2302,7 +2302,7 @@ #endif IPW_DEBUG_INFO(DRV_NAME ": PCI latency error detected at " - "0x%04X.\n", i * sizeof(struct ipw2100_status)); + "0x%04X.\n", i * (int)sizeof(struct ipw2100_status)); #ifdef ACPI_CSTATE_LIMIT_DEFINED IPW_DEBUG_INFO(DRV_NAME ": Disabling C3 transitions.\n"); @@ -2398,7 +2398,7 @@ /* Make a copy of the frame so we can dump it to the logs if * ieee80211_rx fails */ memcpy(packet_data, packet->skb->data, - min(status->frame_size, IPW_RX_NIC_BUFFER_LENGTH)); + min_t(u32, status->frame_size, IPW_RX_NIC_BUFFER_LENGTH)); #endif if (!ieee80211_rx(priv->ieee, packet->skb, stats)) { @@ -2730,21 +2730,21 @@ { int i = txq->oldest; IPW_DEBUG_TX( - "TX%d V=%p P=%p T=%p L=%d\n", i, + "TX%d V=%p P=%04X T=%04X L=%d\n", i, &txq->drv[i], - (void*)txq->nic + i * sizeof(struct ipw2100_bd), - (void*)txq->drv[i].host_addr, + (u32)(txq->nic + i * sizeof(struct ipw2100_bd)), + txq->drv[i].host_addr, txq->drv[i].buf_length); if (packet->type == DATA) { i = (i + 1) % txq->entries; IPW_DEBUG_TX( - "TX%d V=%p P=%p T=%p L=%d\n", i, + "TX%d V=%p P=%04X T=%04X L=%d\n", i, &txq->drv[i], - (void*)txq->nic + i * - sizeof(struct ipw2100_bd), - (void*)txq->drv[i].host_addr, + (u32)(txq->nic + i * + sizeof(struct ipw2100_bd)), + (u32)txq->drv[i].host_addr, txq->drv[i].buf_length); } } @@ -4212,7 +4212,7 @@ { IPW_DEBUG_INFO("enter\n"); - IPW_DEBUG_INFO("initializing bd queue at virt=%p, phys=%08x\n", q->drv, q->nic); + IPW_DEBUG_INFO("initializing bd queue at virt=%p, phys=%08x\n", 
q->drv, (u32)q->nic); write_register(priv->net_dev, base, q->nic); write_register(priv->net_dev, size, q->entries); @@ -8431,8 +8431,8 @@ priv->net_dev->name, fw_name); return rc; } - IPW_DEBUG_INFO("firmware data %p size %d\n", fw->fw_entry->data, - fw->fw_entry->size); + IPW_DEBUG_INFO("firmware data %p size %ld\n", fw->fw_entry->data, + (long)fw->fw_entry->size); ipw2100_mod_firmware_load(fw); -- Jiri Benc SUSE Labs From jbenc@suse.cz Fri Jun 3 09:32:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:32:51 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GWlXq000977 for ; Fri, 3 Jun 2005 09:32:48 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 6E2FD6282FC; Fri, 3 Jun 2005 18:31:49 +0200 (CEST) Date: Fri, 3 Jun 2005 18:31:49 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [4/9] ieee80211: ieee80211_device alignment fix and cleanup Message-ID: <20050603183149.228ab747@griffin.suse.cz> In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2030 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 9617 Lines: 310 Changes to the ieee80211 layer: - fixes a serious alignment problem of the driver's private data - makes the drivers use the ieee80211_device instead of the net_device where appropriate (will ease further development of ieee80211 as a self-contained layer) Signed-off-by: Jiri Benc Signed-off-by: Jirka Bohac Index: netdev/include/net/ieee80211.h 
=================================================================== --- netdev.orig/include/net/ieee80211.h 2005-06-01 11:05:06.000000000 +0200 +++ netdev/include/net/ieee80211.h 2005-06-03 13:20:46.000000000 +0200 @@ -704,15 +704,13 @@ int abg_ture; /* ABG flag */ /* Callback functions */ - void (*set_security)(struct net_device *dev, + void (*set_security)(struct ieee80211_device *ieee, struct ieee80211_security *sec); int (*hard_start_xmit)(struct ieee80211_txb *txb, - struct net_device *dev); - int (*reset_port)(struct net_device *dev); + struct ieee80211_device *ieee); + int (*reset_port)(struct ieee80211_device *ieee); - /* This must be the last item so that it points to the data - * allocated beyond this structure by alloc_ieee80211 */ - u8 priv[0]; + void *priv; }; #define IEEE_A (1<<0) @@ -720,9 +718,27 @@ #define IEEE_G (1<<2) #define IEEE_MODE_MASK (IEEE_A|IEEE_B|IEEE_G) -extern inline void *ieee80211_priv(struct net_device *dev) +static inline void *ieee80211_priv(struct ieee80211_device *ieee) { - return ((struct ieee80211_device *)netdev_priv(dev))->priv; + return (char *)ieee + + ((sizeof(struct ieee80211_device) + NETDEV_ALIGN_CONST) + & ~NETDEV_ALIGN_CONST); +} + +static inline void *ieee80211_dev_to_priv(struct net_device *dev) +{ + return (char *)dev + + ((sizeof(struct net_device) + NETDEV_ALIGN_CONST) + & ~NETDEV_ALIGN_CONST) + + ((sizeof(struct ieee80211_device) + NETDEV_ALIGN_CONST) + & ~NETDEV_ALIGN_CONST); +} + +static inline struct net_device *ieee80211_dev(struct ieee80211_device *ieee) +{ + return (struct net_device *)((char *)ieee - + ((sizeof(struct net_device) + NETDEV_ALIGN_CONST) + & ~NETDEV_ALIGN_CONST)); } extern inline int ieee80211_is_empty_essid(const char *essid, int essid_len) @@ -795,8 +811,8 @@ /* ieee80211.c */ -extern void free_ieee80211(struct net_device *dev); -extern struct net_device *alloc_ieee80211(int sizeof_priv); +extern void free_ieee80211(struct ieee80211_device *ieee); +extern struct ieee80211_device 
*alloc_ieee80211(int sizeof_priv); extern int ieee80211_set_encryption(struct ieee80211_device *ieee); Index: netdev/net/ieee80211/ieee80211_module.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_module.c 2005-06-03 13:20:40.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_module.c 2005-06-03 13:20:46.000000000 +0200 @@ -69,7 +69,7 @@ GFP_KERNEL); if (!ieee->networks) { printk(KERN_WARNING "%s: Out of memory allocating beacons\n", - ieee->dev->name); + ieee80211_dev(ieee)->name); return -ENOMEM; } @@ -98,23 +98,28 @@ } -struct net_device *alloc_ieee80211(int sizeof_priv) +struct ieee80211_device *alloc_ieee80211(int sizeof_priv) { struct ieee80211_device *ieee; struct net_device *dev; + int alloc_size; int err; IEEE80211_DEBUG_INFO("Initializing...\n"); - dev = alloc_etherdev(sizeof(struct ieee80211_device) + sizeof_priv); + alloc_size = ((sizeof(struct ieee80211_device) + NETDEV_ALIGN_CONST) + & ~NETDEV_ALIGN_CONST) + + sizeof_priv; + dev = alloc_etherdev(alloc_size); if (!dev) { IEEE80211_ERROR("Unable to network device.\n"); goto failed; } ieee = netdev_priv(dev); - dev->hard_start_xmit = ieee80211_xmit; - ieee->dev = dev; + ieee->priv = ieee80211_priv(ieee); + + dev->hard_start_xmit = ieee80211_xmit; err = ieee80211_networks_allocate(ieee); if (err) { @@ -147,7 +152,7 @@ ieee->privacy_invoked = 0; ieee->ieee802_1x = 1; - return dev; + return ieee; failed: if (dev) @@ -156,10 +161,8 @@ } -void free_ieee80211(struct net_device *dev) +void free_ieee80211(struct ieee80211_device *ieee) { - struct ieee80211_device *ieee = netdev_priv(dev); - int i; del_timer_sync(&ieee->crypt_deinit_timer); @@ -178,7 +181,7 @@ } ieee80211_networks_free(ieee); - free_netdev(dev); + free_netdev(ieee80211_dev(ieee)); } #ifdef CONFIG_IEEE80211_DEBUG Index: netdev/net/ieee80211/ieee80211_rx.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_rx.c 2005-06-03 
13:20:40.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_rx.c 2005-06-03 13:20:46.000000000 +0200 @@ -99,7 +99,7 @@ if (frag == 0) { /* Reserve enough space to fit maximum frame length */ - skb = dev_alloc_skb(ieee->dev->mtu + + skb = dev_alloc_skb(ieee80211_dev(ieee)->mtu + sizeof(struct ieee80211_hdr) + 8 /* LLC */ + 2 /* alignment */ + @@ -175,7 +175,7 @@ { if (ieee->iw_mode == IW_MODE_MASTER) { printk(KERN_DEBUG "%s: Master mode not yet suppported.\n", - ieee->dev->name); + ieee80211_dev(ieee)->name); return 0; /* hostap_update_sta_ps(ieee, (struct hostap_ieee80211_hdr *) @@ -233,7 +233,7 @@ static int ieee80211_is_eapol_frame(struct ieee80211_device *ieee, struct sk_buff *skb) { - struct net_device *dev = ieee->dev; + struct net_device *dev = ieee80211_dev(ieee); u16 fc, ethertype; struct ieee80211_hdr *hdr; u8 *pos; @@ -289,7 +289,7 @@ if (net_ratelimit()) { printk(KERN_DEBUG "%s: TKIP countermeasures: dropped " "received packet from " MAC_FMT "\n", - ieee->dev->name, MAC_ARG(hdr->addr2)); + ieee80211_dev(ieee)->name, MAC_ARG(hdr->addr2)); } return -1; } @@ -334,7 +334,7 @@ if (res < 0) { printk(KERN_DEBUG "%s: MSDU decryption/MIC verification failed" " (SA=" MAC_FMT " keyidx=%d)\n", - ieee->dev->name, MAC_ARG(hdr->addr2), keyidx); + ieee80211_dev(ieee)->name, MAC_ARG(hdr->addr2), keyidx); return -1; } @@ -348,7 +348,7 @@ int ieee80211_rx(struct ieee80211_device *ieee, struct sk_buff *skb, struct ieee80211_rx_stats *rx_stats) { - struct net_device *dev = ieee->dev; + struct net_device *dev = ieee80211_dev(ieee); struct ieee80211_hdr *hdr; size_t hdrlen; u16 fc, type, stype, sc; @@ -1194,7 +1194,7 @@ IEEE80211_DEBUG_MGMT("received UNKNOWN (%d)\n", WLAN_FC_GET_STYPE(header->frame_ctl)); IEEE80211_WARNING("%s: Unknown management packet: %d\n", - ieee->dev->name, + ieee80211_dev(ieee)->name, WLAN_FC_GET_STYPE(header->frame_ctl)); break; } Index: netdev/net/ieee80211/ieee80211_tx.c =================================================================== --- 
netdev.orig/net/ieee80211/ieee80211_tx.c 2005-06-03 13:20:40.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_tx.c 2005-06-03 13:20:46.000000000 +0200 @@ -171,7 +171,7 @@ if (net_ratelimit()) { printk(KERN_DEBUG "%s: TKIP countermeasures: dropped " "TX packet to " MAC_FMT "\n", - ieee->dev->name, MAC_ARG(header->addr1)); + ieee80211_dev(ieee)->name, MAC_ARG(header->addr1)); } return -1; } @@ -192,7 +192,7 @@ atomic_dec(&crypt->refcnt); if (res < 0) { printk(KERN_INFO "%s: Encryption failed: len=%d.\n", - ieee->dev->name, frag->len); + ieee80211_dev(ieee)->name, frag->len); ieee->ieee_stats.tx_discards++; return -1; } @@ -269,13 +269,13 @@ * creating it... */ if (!ieee->hard_start_xmit) { printk(KERN_WARNING "%s: No xmit handler.\n", - ieee->dev->name); + dev->name); goto success; } if (unlikely(skb->len < SNAP_SIZE + sizeof(u16))) { printk(KERN_WARNING "%s: skb too small (%d).\n", - ieee->dev->name, skb->len); + dev->name, skb->len); goto success; } @@ -371,7 +371,7 @@ txb = ieee80211_alloc_txb(nr_frags, frag_size, GFP_ATOMIC); if (unlikely(!txb)) { printk(KERN_WARNING "%s: Could not allocate TXB\n", - ieee->dev->name); + dev->name); goto failed; } txb->encrypted = encrypt; @@ -426,7 +426,7 @@ dev_kfree_skb_any(skb); if (txb) { - if ((*ieee->hard_start_xmit)(txb, dev) == 0) { + if ((*ieee->hard_start_xmit)(txb, ieee) == 0) { stats->tx_packets++; stats->tx_bytes += txb->payload_size; return 0; Index: netdev/net/ieee80211/ieee80211_wx.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_wx.c 2005-06-01 11:05:14.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_wx.c 2005-06-03 13:20:46.000000000 +0200 @@ -252,7 +252,7 @@ union iwreq_data *wrqu, char *keybuf) { struct iw_point *erq = &(wrqu->encoding); - struct net_device *dev = ieee->dev; + struct net_device *dev = ieee80211_dev(ieee); struct ieee80211_security sec = { .flags = 0 }; @@ -402,7 +402,7 @@ sec.level = SEC_LEVEL_1; /* 40 and 104 bit WEP */ if 
(ieee->set_security) - ieee->set_security(dev, &sec); + ieee->set_security(ieee, &sec); /* Do not reset port if card is in Managed mode since resetting will * generate new IEEE 802.11 authentication which may end up in looping @@ -411,7 +411,7 @@ * the callbacks structures used to initialize the 802.11 stack. */ if (ieee->reset_on_keychange && ieee->iw_mode != IW_MODE_INFRA && - ieee->reset_port && ieee->reset_port(dev)) { + ieee->reset_port && ieee->reset_port(ieee)) { printk(KERN_DEBUG "%s: reset_port failed\n", dev->name); return -EINVAL; } -- Jiri Benc SUSE Labs From jbenc@suse.cz Fri Jun 3 09:33:54 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:33:57 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GXqXq001489 for ; Fri, 3 Jun 2005 09:33:53 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 86D3B6282FC; Fri, 3 Jun 2005 18:32:54 +0200 (CEST) Date: Fri, 3 Jun 2005 18:32:54 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [5/9] ipw: fix after "ieee80211_device alignment fix" Message-ID: <20050603183254.03afaa81@griffin.suse.cz> In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2031 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 30535 Lines: 1000 Fixes ipw2100 and ipw2200 after the API change (alignment, struct ieee80211_device). 
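For readers following the API change, here is a rough userspace sketch of the memory layout the new accessors rely on: net_device, then ieee80211_device, then the driver's private data, each rounded up to the alignment boundary. It is not kernel code. The fake_* structs and functions are hypothetical stand-ins for the real types and for ieee80211_priv(), ieee80211_dev_to_priv() and ieee80211_dev() from patch 4/9, and the 32-byte alignment (mask 31) is an assumption standing in for NETDEV_ALIGN_CONST.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: 32-byte alignment, so the mask is 31 (stand-in for NETDEV_ALIGN_CONST). */
#define ALIGN_CONST ((size_t)31)
/* Round n up to the next multiple of (ALIGN_CONST + 1). */
#define ALIGN_UP(n) (((size_t)(n) + ALIGN_CONST) & ~ALIGN_CONST)

/* Hypothetical stand-ins for the kernel structures. */
struct fake_net_device  { char name[16]; int mtu; };
struct fake_ieee_device { int iw_mode; };
struct fake_priv        { long status; };

/* Total size of one block: aligned net_device + aligned ieee device + priv. */
static size_t fake_alloc_size(void)
{
	return ALIGN_UP(sizeof(struct fake_net_device)) +
	       ALIGN_UP(sizeof(struct fake_ieee_device)) +
	       sizeof(struct fake_priv);
}

/* net_device -> ieee device (what netdev_priv() does in the patch). */
static struct fake_ieee_device *fake_netdev_priv(struct fake_net_device *dev)
{
	return (struct fake_ieee_device *)
		((char *)dev + ALIGN_UP(sizeof(struct fake_net_device)));
}

/* ieee device -> driver private data (mirrors ieee80211_priv()). */
static void *fake_ieee_priv(struct fake_ieee_device *ieee)
{
	return (char *)ieee + ALIGN_UP(sizeof(struct fake_ieee_device));
}

/* net_device -> driver private data (mirrors ieee80211_dev_to_priv()). */
static void *fake_dev_to_priv(struct fake_net_device *dev)
{
	return (char *)dev + ALIGN_UP(sizeof(struct fake_net_device)) +
	       ALIGN_UP(sizeof(struct fake_ieee_device));
}

/* ieee device -> net_device (mirrors ieee80211_dev()). */
static struct fake_net_device *fake_ieee_dev(struct fake_ieee_device *ieee)
{
	return (struct fake_net_device *)
		((char *)ieee - ALIGN_UP(sizeof(struct fake_net_device)));
}

/* Returns 1 when all accessors agree on the same layout, 0 otherwise. */
static int layout_is_consistent(void)
{
	struct fake_net_device *dev = calloc(1, fake_alloc_size());
	struct fake_ieee_device *ieee;
	void *priv;
	int ok;

	if (!dev)
		return 0;
	ieee = fake_netdev_priv(dev);
	priv = fake_ieee_priv(ieee);
	ok = fake_ieee_dev(ieee) == dev &&
	     priv == fake_dev_to_priv(dev) &&
	     (size_t)((char *)priv - (char *)dev) % (ALIGN_CONST + 1) == 0;
	free(dev);
	return ok;
}
```

A callback that receives a net_device (open, close, the wireless extension handlers) would use the dev_to_priv form, while a callback that receives the ieee80211_device (set_security, hard_start_xmit) would use the priv form. Both arrive at the same address.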
Signed-off-by: Jiri Benc Signed-off-by: Jirka Bohac Index: netdev/drivers/net/wireless/ipw2100.c =================================================================== --- netdev.orig/drivers/net/wireless/ipw2100.c 2005-06-01 11:03:37.000000000 +0200 +++ netdev/drivers/net/wireless/ipw2100.c 2005-06-03 11:57:33.000000000 +0200 @@ -1772,7 +1772,7 @@ /* Called by register_netdev() */ static int ipw2100_net_init(struct net_device *dev) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); return ipw2100_up(priv, 1); } @@ -3248,9 +3248,9 @@ return IRQ_NONE; } -static int ipw2100_tx(struct ieee80211_txb *txb, struct net_device *dev) +static int ipw2100_tx(struct ieee80211_txb *txb, struct ieee80211_device *ieee) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_priv(ieee); struct list_head *element; struct ipw2100_tx_packet *packet; unsigned long flags; @@ -3260,7 +3260,7 @@ if (!(priv->status & STATUS_ASSOCIATED)) { IPW_DEBUG_INFO("Can not transmit when not connected.\n"); priv->ieee->stats.tx_carrier_errors++; - netif_stop_queue(dev); + netif_stop_queue(ieee80211_dev(ieee)); goto fail_unlock; } @@ -3291,7 +3291,7 @@ return 0; fail_unlock: - netif_stop_queue(dev); + netif_stop_queue(ieee80211_dev(ieee)); spin_unlock_irqrestore(&priv->low_lock, flags); return 1; } @@ -5418,10 +5418,10 @@ ipw2100_configure_security(priv, 0); } -static void shim__set_security(struct net_device *dev, +static void shim__set_security(struct ieee80211_device *ieee, struct ieee80211_security *sec) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_priv(ieee); int i, force_update = 0; down(&priv->action_sem); @@ -5609,7 +5609,7 @@ * method as well) to talk to the firmware */ static int ipw2100_set_address(struct net_device *dev, void *p) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); struct 
sockaddr *addr = p; int err = 0; @@ -5637,7 +5637,7 @@ static int ipw2100_open(struct net_device *dev) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); unsigned long flags; IPW_DEBUG_INFO("dev->open\n"); @@ -5651,7 +5651,7 @@ static int ipw2100_close(struct net_device *dev) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); unsigned long flags; struct list_head *element; struct ipw2100_tx_packet *packet; @@ -5692,7 +5692,7 @@ */ static void ipw2100_tx_timeout(struct net_device *dev) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); priv->ieee->stats.tx_errors++; @@ -5715,7 +5715,7 @@ */ static struct net_device_stats *ipw2100_stats(struct net_device *dev) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); return &priv->ieee->stats; } @@ -5802,7 +5802,7 @@ } if (ieee->set_security) - ieee->set_security(ieee->dev, &sec); + ieee->set_security(ieee, &sec); else ret = -EOPNOTSUPP; @@ -5829,7 +5829,7 @@ } if (ieee->set_security) - ieee->set_security(ieee->dev, &sec); + ieee->set_security(ieee, &sec); else ret = -EOPNOTSUPP; @@ -5839,7 +5839,7 @@ static int ipw2100_wpa_set_param(struct net_device *dev, u8 name, u32 value){ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int ret=0; switch(name){ @@ -5878,7 +5878,7 @@ static int ipw2100_wpa_mlme(struct net_device *dev, int command, int reason){ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int ret=0; switch(command){ @@ -5920,8 +5920,8 @@ static int ipw2100_wpa_set_wpa_ie(struct net_device *dev, struct ipw2100_param *param, int plen){ - struct ipw2100_priv *priv = ieee80211_priv(dev); - struct ieee80211_device *ieee = priv->ieee; + struct ieee80211_device *ieee = 
netdev_priv(dev); + struct ipw2100_priv *priv = ieee80211_priv(ieee); u8 *buf; if (! ieee->wpa_enabled) @@ -5960,8 +5960,8 @@ struct ipw2100_param *param, int param_len){ int ret = 0; - struct ipw2100_priv *priv = ieee80211_priv(dev); - struct ieee80211_device *ieee = priv->ieee; + struct ieee80211_device *ieee = netdev_priv(dev); + struct ipw2100_priv *priv = ieee80211_priv(ieee); struct ieee80211_crypto_ops *ops; struct ieee80211_crypt_data **crypt; @@ -6081,7 +6081,7 @@ } done: if (ieee->set_security) - ieee->set_security(ieee->dev, &sec); + ieee->set_security(ieee, &sec); /* Do not reset port if card is in Managed mode since resetting will * generate new IEEE 802.11 authentication which may end up in looping @@ -6178,7 +6178,7 @@ static void ipw_ethtool_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); char fw_ver[64], ucode_ver[64]; strcpy(info->driver, DRV_NAME); @@ -6195,7 +6195,7 @@ static u32 ipw2100_ethtool_get_link(struct net_device *dev) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); return (priv->status & STATUS_ASSOCIATED) ? 
1 : 0; } @@ -6288,12 +6288,14 @@ { struct ipw2100_priv *priv; struct net_device *dev; + struct ieee80211_device *ieee; - dev = alloc_ieee80211(sizeof(struct ipw2100_priv)); - if (!dev) + ieee = alloc_ieee80211(sizeof(struct ipw2100_priv)); + if (!ieee) return NULL; - priv = ieee80211_priv(dev); - priv->ieee = netdev_priv(dev); + dev = ieee80211_dev(ieee); + priv = ieee80211_priv(ieee); + priv->ieee = ieee; priv->pci_dev = pci_dev; priv->net_dev = dev; @@ -6477,7 +6479,7 @@ return err; } - priv = ieee80211_priv(dev); + priv = ieee80211_dev_to_priv(dev); pci_set_master(pci_dev); pci_set_drvdata(pci_dev, priv); @@ -6618,7 +6620,7 @@ ipw2100_queues_free(priv); sysfs_remove_group(&pci_dev->dev.kobj, &ipw2100_attribute_group); - free_ieee80211(dev); + free_ieee80211(netdev_priv(dev)); pci_set_drvdata(pci_dev, NULL); } @@ -6675,7 +6677,7 @@ if (dev->base_addr) iounmap((unsigned char *)dev->base_addr); - free_ieee80211(dev); + free_ieee80211(netdev_priv(dev)); } pci_release_regions(pci_dev); @@ -6918,7 +6920,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); if (!(priv->status & STATUS_ASSOCIATED)) strcpy(wrqu->name, "unassociated"); else @@ -6933,7 +6935,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); struct iw_freq *fwrq = &wrqu->freq; int err = 0; @@ -6984,7 +6986,7 @@ * This can be called at any time. 
No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); wrqu->freq.e = 0; @@ -7005,7 +7007,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err = 0; IPW_DEBUG_WX("SET Mode -> %d \n", wrqu->mode); @@ -7048,7 +7050,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); wrqu->mode = priv->ieee->iw_mode; IPW_DEBUG_WX("GET Mode -> %d\n", wrqu->mode); @@ -7084,7 +7086,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); struct iw_range *range = (struct iw_range *)extra; u16 val; int i, level; @@ -7196,7 +7198,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err = 0; static const unsigned char any[] = { @@ -7251,7 +7253,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); /* If we are associated, trying to associate, or have a statically * configured BSSID then return that; otherwise return ANY */ @@ -7271,7 +7273,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); char *essid = ""; /* ANY */ int length = 0; int err = 0; @@ -7325,7 +7327,7 @@ * This can be called at any time. 
No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); /* If we are associated, trying to associate, or have a statically * configured ESSID then return that; otherwise return ANY */ @@ -7353,7 +7355,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); if (wrqu->data.length > IW_ESSID_MAX_SIZE) return -E2BIG; @@ -7375,7 +7377,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); wrqu->data.length = strlen(priv->nick) + 1; memcpy(extra, priv->nick, wrqu->data.length); @@ -7390,7 +7392,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); u32 target_rate = wrqu->bitrate.value; u32 rate; int err = 0; @@ -7431,7 +7433,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int val; int len = sizeof(val); int err = 0; @@ -7483,7 +7485,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int value, err; /* Auto RTS not yet supported */ @@ -7523,7 +7525,7 @@ * This can be called at any time. 
No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); wrqu->rts.value = priv->rts_threshold & ~RTS_DISABLED; wrqu->rts.fixed = 1; /* no auto select */ @@ -7540,7 +7542,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err = 0, value; if (priv->ieee->iw_mode != IW_MODE_ADHOC) @@ -7580,7 +7582,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); if (priv->ieee->iw_mode != IW_MODE_ADHOC) { wrqu->power.disabled = 1; @@ -7616,7 +7618,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); if (!wrqu->frag.fixed) return -EINVAL; @@ -7646,7 +7648,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); wrqu->frag.value = priv->frag_threshold & ~FRAG_DISABLED; wrqu->frag.fixed = 0; /* no auto select */ wrqu->frag.disabled = (priv->frag_threshold & FRAG_DISABLED) ? 1 : 0; @@ -7660,7 +7662,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err = 0; if (wrqu->retry.flags & IW_RETRY_LIFETIME || @@ -7709,7 +7711,7 @@ * This can be called at any time. 
No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); wrqu->retry.disabled = 0; /* can't be disabled */ @@ -7738,7 +7740,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err = 0; down(&priv->action_sem); @@ -7769,7 +7771,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); return ieee80211_wx_get_scan(priv->ieee, info, wrqu, extra); } @@ -7785,7 +7787,7 @@ * No check of STATUS_INITIALIZED required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); return ieee80211_wx_set_encode(priv->ieee, info, wrqu, key); } @@ -7797,7 +7799,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); return ieee80211_wx_get_encode(priv->ieee, info, wrqu, key); } @@ -7805,7 +7807,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err = 0; down(&priv->action_sem); @@ -7855,7 +7857,7 @@ * This can be called at any time. 
No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); if (!(priv->power_mode & IPW_POWER_ENABLED)) { wrqu->power.disabled = 1; @@ -7880,7 +7882,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int *parms = (int *)extra; int enable = (parms[0] > 0); int err = 0; @@ -7911,7 +7913,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); if (priv->status & STATUS_INITIALIZED) schedule_reset(priv); return 0; @@ -7923,7 +7925,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err = 0, mode = *(int *)extra; down(&priv->action_sem); @@ -7951,7 +7953,7 @@ * This can be called at any time. No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int level = IPW_POWER_LEVEL(priv->power_mode); s32 timeout, period; @@ -7988,7 +7990,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); int err, mode = *(int *)extra; down(&priv->action_sem); @@ -8021,7 +8023,7 @@ * This can be called at any time. 
No action lock required */ - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); if (priv->config & CFG_LONG_PREAMBLE) snprintf(wrqu->name, IFNAMSIZ, "long (1)"); @@ -8163,7 +8165,7 @@ int tx_qual; int beacon_qual; - struct ipw2100_priv *priv = ieee80211_priv(dev); + struct ipw2100_priv *priv = ieee80211_dev_to_priv(dev); struct iw_statistics *wstats; u32 rssi, quality, tx_retries, missed_beacons, tx_failures; u32 ord_len = sizeof(u32); Index: netdev/drivers/net/wireless/ipw2200.c =================================================================== --- netdev.orig/drivers/net/wireless/ipw2200.c 2005-06-01 11:03:37.000000000 +0200 +++ netdev/drivers/net/wireless/ipw2200.c 2005-06-03 11:57:33.000000000 +0200 @@ -5157,7 +5157,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); if (!(priv->status & STATUS_ASSOCIATED)) strcpy(wrqu->name, "unassociated"); else @@ -5210,7 +5210,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); struct iw_freq *fwrq = &wrqu->freq; /* if setting by freq convert to channel */ @@ -5244,7 +5244,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); wrqu->freq.e = 0; @@ -5264,7 +5264,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); int err = 0; IPW_DEBUG_WX("Set MODE: %d\n", wrqu->mode); @@ -5317,7 +5317,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); wrqu->mode = 
priv->ieee->iw_mode; IPW_DEBUG_WX("Get MODE -> %d\n", wrqu->mode); @@ -5354,7 +5354,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); struct iw_range *range = (struct iw_range *)extra; u16 val; int i; @@ -5418,7 +5418,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); static const unsigned char any[] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff @@ -5472,7 +5472,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); /* If we are associated, trying to associate, or have a statically * configured BSSID then return that; otherwise return ANY */ if (priv->config & CFG_STATIC_BSSID || @@ -5491,7 +5491,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); char *essid = ""; /* ANY */ int length = 0; @@ -5543,7 +5543,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); /* If we are associated, trying to associate, or have a statically * configured ESSID then return that; otherwise return ANY */ @@ -5567,7 +5567,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); IPW_DEBUG_WX("Setting nick to '%s'\n", extra); if (wrqu->data.length > IW_ESSID_MAX_SIZE) @@ -5586,7 +5586,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); 
IPW_DEBUG_WX("Getting nick\n"); wrqu->data.length = strlen(priv->nick) + 1; memcpy(extra, priv->nick, wrqu->data.length); @@ -5607,7 +5607,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv * priv = ieee80211_priv(dev); + struct ipw_priv * priv = ieee80211_dev_to_priv(dev); wrqu->bitrate.value = priv->last_rate; IPW_DEBUG_WX("GET Rate -> %d \n", wrqu->bitrate.value); @@ -5619,7 +5619,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); if (wrqu->rts.disabled) priv->rts_threshold = DEFAULT_RTS_THRESHOLD; @@ -5640,7 +5640,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); wrqu->rts.value = priv->rts_threshold; wrqu->rts.fixed = 0; /* no auto select */ wrqu->rts.disabled = @@ -5655,7 +5655,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); struct ipw_tx_power tx_power; int i; @@ -5699,7 +5699,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); wrqu->power.value = priv->tx_power; wrqu->power.fixed = 1; @@ -5717,7 +5717,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); if (wrqu->frag.disabled) priv->ieee->fts = DEFAULT_FTS; @@ -5738,7 +5738,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); wrqu->frag.value = priv->ieee->fts; wrqu->frag.fixed = 0; /* no auto select */ wrqu->frag.disabled = @@ -5771,7 
+5771,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); IPW_DEBUG_WX("Start scan\n"); if (ipw_request_scan(priv)) return -EIO; @@ -5782,7 +5782,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); return ieee80211_wx_get_scan(priv->ieee, info, wrqu, extra); } @@ -5790,7 +5790,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *key) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); return ieee80211_wx_set_encode(priv->ieee, info, wrqu, key); } @@ -5798,7 +5798,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *key) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); return ieee80211_wx_get_encode(priv->ieee, info, wrqu, key); } @@ -5806,7 +5806,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); int err; if (wrqu->power.disabled) { @@ -5855,7 +5855,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); if (!(priv->power_mode & IPW_POWER_ENABLED)) { wrqu->power.disabled = 1; @@ -5872,7 +5872,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); int mode = *(int *)extra; int err; @@ -5900,7 +5900,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); int level = IPW_POWER_LEVEL(priv->power_mode); char *p = extra; @@ 
-5932,7 +5932,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); int mode = *(int *)extra; u8 band = 0, modulation = 0; @@ -5998,7 +5998,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); switch (priv->ieee->freq_band) { case IEEE80211_24GHZ_BAND: @@ -6046,7 +6046,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); int *parms = (int *)extra; int enable = (parms[0] > 0); @@ -6072,7 +6072,7 @@ struct iw_request_info *info, union iwreq_data *wrqu, char *extra) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); IPW_DEBUG_WX("RESET\n"); ipw_adapter_restart(priv); return 0; @@ -6185,7 +6185,7 @@ */ static struct iw_statistics *ipw_get_wireless_stats(struct net_device * dev) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); struct iw_statistics *wstats; wstats = &priv->wstats; @@ -6248,7 +6248,7 @@ static int ipw_net_open(struct net_device *dev) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); IPW_DEBUG_INFO("dev->open\n"); /* we should be verifying the device is ready to be opened */ if (!(priv->status & STATUS_RF_KILL_MASK) && @@ -6394,9 +6394,9 @@ } static int ipw_net_hard_start_xmit(struct ieee80211_txb *txb, - struct net_device *dev) + struct ieee80211_device *ieee) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_priv(ieee); unsigned long flags; IPW_DEBUG_TX("dev->xmit(%d bytes)\n", txb->payload_size); @@ -6406,7 +6406,7 @@ if (!(priv->status & STATUS_ASSOCIATED)) { IPW_DEBUG_INFO("Tx attempt while not 
associated.\n"); priv->ieee->stats.tx_carrier_errors++; - netif_stop_queue(dev); + netif_stop_queue(ieee80211_dev(ieee)); goto fail_unlock; } @@ -6422,7 +6422,7 @@ static struct net_device_stats *ipw_net_get_stats(struct net_device *dev) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); priv->ieee->stats.tx_packets = priv->tx_packets; priv->ieee->stats.rx_packets = priv->rx_packets; @@ -6436,7 +6436,7 @@ static int ipw_net_set_mac_address(struct net_device *dev, void *p) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); struct sockaddr *addr = p; if (!is_valid_ether_addr(addr->sa_data)) return -EADDRNOTAVAIL; @@ -6451,7 +6451,7 @@ static void ipw_ethtool_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info) { - struct ipw_priv *p = ieee80211_priv(dev); + struct ipw_priv *p = ieee80211_dev_to_priv(dev); char vers[64]; char date[32]; u32 len; @@ -6472,7 +6472,7 @@ static u32 ipw_ethtool_get_link(struct net_device *dev) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); return (priv->status & STATUS_ASSOCIATED) != 0; } @@ -6484,7 +6484,7 @@ static int ipw_ethtool_get_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom, u8 *bytes) { - struct ipw_priv *p = ieee80211_priv(dev); + struct ipw_priv *p = ieee80211_dev_to_priv(dev); if (eeprom->offset + eeprom->len > CX2_EEPROM_IMAGE_SIZE) return -EINVAL; @@ -6496,7 +6496,7 @@ static int ipw_ethtool_set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom, u8 *bytes) { - struct ipw_priv *p = ieee80211_priv(dev); + struct ipw_priv *p = ieee80211_dev_to_priv(dev); int i; if (eeprom->offset + eeprom->len > CX2_EEPROM_IMAGE_SIZE) @@ -6633,10 +6633,10 @@ } -static void shim__set_security(struct net_device *dev, +static void shim__set_security(struct ieee80211_device *ieee, struct ieee80211_security *sec) { - struct ipw_priv *priv = 
ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_priv(ieee); int i; for (i = 0; i < 4; i++) { @@ -6874,7 +6874,7 @@ /* Called by register_netdev() */ static int ipw_net_init(struct net_device *dev) { - struct ipw_priv *priv = ieee80211_priv(dev); + struct ipw_priv *priv = ieee80211_dev_to_priv(dev); if (priv->status & STATUS_RF_KILL_SW) { IPW_WARNING("Radio disabled by module parameter.\n"); @@ -6952,19 +6952,21 @@ { int err = 0; struct net_device *net_dev; + struct ieee80211_device *ieee; void __iomem *base; u32 length, val; struct ipw_priv *priv; int band, modulation; - net_dev = alloc_ieee80211(sizeof(struct ipw_priv)); - if (net_dev == NULL) { + ieee = alloc_ieee80211(sizeof(struct ipw_priv)); + if (ieee == NULL) { err = -ENOMEM; goto out; } + net_dev = ieee80211_dev(ieee); - priv = ieee80211_priv(net_dev); - priv->ieee = netdev_priv(net_dev); + priv = ieee80211_priv(ieee); + priv->ieee = ieee; priv->net_dev = net_dev; priv->pci_dev = pdev; #ifdef CONFIG_IPW_DEBUG @@ -7160,7 +7162,7 @@ pci_disable_device(pdev); pci_set_drvdata(pdev, NULL); out_free_ieee80211: - free_ieee80211(priv->net_dev); + free_ieee80211(priv->ieee); out: return err; } @@ -7202,7 +7204,7 @@ pci_release_regions(pdev); pci_disable_device(pdev); pci_set_drvdata(pdev, NULL); - free_ieee80211(priv->net_dev); + free_ieee80211(priv->ieee); #ifdef CONFIG_PM if (fw_loaded) { -- Jiri Benc SUSE Labs From jbenc@suse.cz Fri Jun 3 09:35:18 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:35:21 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GZGXq002144 for ; Fri, 3 Jun 2005 09:35:17 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 960F16282FC; Fri, 3 Jun 2005 18:34:18 +0200 (CEST) Date: Fri, 3 Jun 2005 18:34:18 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [6/9] ieee80211: ethernet 
independency Message-ID: <20050603183418.58c47b0c@griffin.suse.cz> In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2032 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 38022 Lines: 1183 Makes the 802.11 layer independent of ethernet. (The previous implementation had the ethernet headers built by the ethernet layer and then parsed them and rebuilt them into 802.11 headers.) Signed-off-by: Jiri Benc Signed-off-by: Jirka Bohac Index: netdev/include/linux/netdevice.h =================================================================== --- netdev.orig/include/linux/netdevice.h 2005-06-01 11:05:01.000000000 +0200 +++ netdev/include/linux/netdevice.h 2005-06-03 13:21:00.000000000 +0200 @@ -83,13 +83,18 @@ * used. */ -#if !defined(CONFIG_AX25) && !defined(CONFIG_AX25_MODULE) && !defined(CONFIG_TR) +#if !defined(CONFIG_AX25) && !defined(CONFIG_AX25_MODULE) && !defined(CONFIG_TR) \ + && !defined(CONFIG_IEEE80211) #define LL_MAX_HEADER 32 #else #if defined(CONFIG_AX25) || defined(CONFIG_AX25_MODULE) #define LL_MAX_HEADER 96 #else +#if defined(CONFIG_TR) #define LL_MAX_HEADER 48 +#else +#define LL_MAX_HEADER 38 +#endif #endif #endif Index: netdev/include/net/ieee80211.h =================================================================== --- netdev.orig/include/net/ieee80211.h 2005-06-03 13:20:46.000000000 +0200 +++ netdev/include/net/ieee80211.h 2005-06-03 13:21:00.000000000 +0200 @@ -20,7 +20,6 @@ */ #ifndef IEEE80211_H #define IEEE80211_H -#include /* ETH_ALEN */ #include /* ARRAY_SIZE */ #if WIRELESS_EXT < 17 @@ -42,25 +41,26 @@ WEP IV and ICV. 
(this interpretation suggested by Ramiro Barreiro) */ +#define IEEE80211_ALEN 6 #define IEEE80211_HLEN 30 #define IEEE80211_FRAME_LEN (IEEE80211_DATA_LEN + IEEE80211_HLEN) struct ieee80211_hdr { u16 frame_ctl; u16 duration_id; - u8 addr1[ETH_ALEN]; - u8 addr2[ETH_ALEN]; - u8 addr3[ETH_ALEN]; + u8 addr1[IEEE80211_ALEN]; + u8 addr2[IEEE80211_ALEN]; + u8 addr3[IEEE80211_ALEN]; u16 seq_ctl; - u8 addr4[ETH_ALEN]; + u8 addr4[IEEE80211_ALEN]; } __attribute__ ((packed)); struct ieee80211_hdr_3addr { u16 frame_ctl; u16 duration_id; - u8 addr1[ETH_ALEN]; - u8 addr2[ETH_ALEN]; - u8 addr3[ETH_ALEN]; + u8 addr1[IEEE80211_ALEN]; + u8 addr2[IEEE80211_ALEN]; + u8 addr3[IEEE80211_ALEN]; u16 seq_ctl; } __attribute__ ((packed)); @@ -233,7 +233,7 @@ #define ETH_P_PREAUTH 0x88C7 /* IEEE 802.11i pre-authentication */ #ifndef ETH_P_80211_RAW -#define ETH_P_80211_RAW (ETH_P_ECONET + 1) +#define ETH_P_80211_RAW 0x0003 #endif /* IEEE 802.11 defines */ @@ -246,11 +246,29 @@ u8 ssap; /* always 0xAA */ u8 ctrl; /* always 0x03 */ u8 oui[P80211_OUI_LEN]; /* organizational universal id */ + u16 type; /* packet type ID field */ } __attribute__ ((packed)); #define SNAP_SIZE sizeof(struct ieee80211_snap_hdr) +#define IEEE80211_SNAP_IS_RFC1042(snap) \ + ((snap)->oui[0] == 0 && (snap)->oui[1] == 0 && (snap)->oui[2] == 0) +#define IEEE80211_SNAP_IS_BRIDGE_TUNNEL(snap) \ + ((snap)->oui[0] == 0 && (snap)->oui[1] == 0 && (snap)->oui[2] == 0xf8) + +#define IEEE80211_FC_GET_TODS(hdr) \ + ((hdr)->frame_ctl & __constant_cpu_to_le16(IEEE80211_FCTL_TODS)) +#define IEEE80211_FC_GET_FROMDS(hdr) \ + ((hdr)->frame_ctl & __constant_cpu_to_le16(IEEE80211_FCTL_FROMDS)) +#define IEEE80211_GET_DADDR(hdr) \ + (IEEE80211_FC_GET_TODS(hdr) ? (hdr)->addr3 : (hdr)->addr1) +#define IEEE80211_GET_SADDR(hdr) \ + (IEEE80211_FC_GET_FROMDS(hdr) ? \ + (IEEE80211_FC_GET_TODS(hdr) ? (hdr)->addr4 : (hdr)->addr3) \ + : (hdr)->addr2) +/* IEEE80211_GET_xADDR do not work when both TODS and FROMDS are set. 
*/ + #define WLAN_FC_GET_TYPE(fc) ((fc) & IEEE80211_FCTL_FTYPE) #define WLAN_FC_GET_STYPE(fc) ((fc) & IEEE80211_FCTL_STYPE) @@ -395,8 +413,8 @@ unsigned int seq; unsigned int last_frag; struct sk_buff *skb; - u8 src_addr[ETH_ALEN]; - u8 dst_addr[ETH_ALEN]; + u8 src_addr[IEEE80211_ALEN]; + u8 dst_addr[IEEE80211_ALEN]; }; struct ieee80211_stats { @@ -507,7 +525,7 @@ u16 auth_sequence; u16 beacon_interval; u16 capability; - u8 current_ap[ETH_ALEN]; + u8 current_ap[IEEE80211_ALEN]; u16 listen_interval; struct { u16 association_id:14, reserved:2; @@ -537,7 +555,7 @@ struct ieee80211_assoc_request_frame { u16 capability; u16 listen_interval; - u8 current_ap[ETH_ALEN]; + u8 current_ap[IEEE80211_ALEN]; struct ieee80211_info_element info_element; } __attribute__ ((packed)); @@ -581,7 +599,7 @@ struct ieee80211_network { /* These entries are used to identify a unique network */ - u8 bssid[ETH_ALEN]; + u8 bssid[IEEE80211_ALEN]; u8 channel; /* Ensure null-terminated for any debug msgs */ u8 ssid[IW_ESSID_MAX_SIZE + 1]; @@ -625,12 +643,12 @@ #define MAC_ARG(x) ((u8*)(x))[0],((u8*)(x))[1],((u8*)(x))[2],((u8*)(x))[3],((u8*)(x))[4],((u8*)(x))[5] -extern inline int is_multicast_ether_addr(const u8 *addr) +extern inline int is_multicast_ieee80211_addr(const u8 *addr) { return ((addr[0] != 0xff) && (0x01 & addr[0])); } -extern inline int is_broadcast_ether_addr(const u8 *addr) +extern inline int is_broadcast_ieee80211_addr(const u8 *addr) { return ((addr[0] == 0xff) && (addr[1] == 0xff) && (addr[2] == 0xff) && \ (addr[3] == 0xff) && (addr[4] == 0xff) && (addr[5] == 0xff)); @@ -694,7 +712,7 @@ u16 fts; /* Fragmentation Threshold */ /* Association info */ - u8 bssid[ETH_ALEN]; + u8 bssid[IEEE80211_ALEN]; enum ieee80211_state state; @@ -783,7 +801,7 @@ return 0; } -extern inline int ieee80211_get_hdrlen(u16 fc) +extern inline int __ieee80211_get_hdrlen(u16 fc) { int hdrlen = IEEE80211_3ADDR_LEN; @@ -807,12 +825,29 @@ return hdrlen; } +#define ieee80211_get_hdrlen(hdr) 
__ieee80211_get_hdrlen(le16_to_cpu((hdr)->frame_ctl)) +#define IEEE80211_GET_DATA_HDR_LEN(hdr) \ + ((((hdr)->frame_ctl & \ + __constant_cpu_to_le16(IEEE80211_FCTL_TODS | IEEE80211_FCTL_FROMDS)) \ + == __constant_cpu_to_le16(IEEE80211_FCTL_TODS | IEEE80211_FCTL_FROMDS)) \ + ? IEEE80211_4ADDR_LEN : IEEE80211_3ADDR_LEN) +#define IEEE80211_GET_SNAP(hdr) \ + ((struct ieee80211_snap_hdr *) \ + ((u8 *)(hdr) + IEEE80211_GET_DATA_HDR_LEN(hdr))) + +extern inline int ieee80211_get_proto(struct ieee80211_hdr *header) +{ + struct ieee80211_snap_hdr *snap = IEEE80211_GET_SNAP(header); + return (snap->dsap == 0xaa && snap->ssap == 0xaa ? + ntohs(snap->type) : ETH_P_802_2); +} /* ieee80211.c */ extern void free_ieee80211(struct ieee80211_device *ieee); extern struct ieee80211_device *alloc_ieee80211(int sizeof_priv); +extern void ieee80211_setup(struct net_device *dev); extern int ieee80211_set_encryption(struct ieee80211_device *ieee); Index: netdev/net/ieee80211/ieee80211_rx.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_rx.c 2005-06-03 13:20:46.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_rx.c 2005-06-03 13:21:00.000000000 +0200 @@ -41,11 +41,10 @@ struct ieee80211_rx_stats *rx_stats) { struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; - u16 fc = le16_to_cpu(hdr->frame_ctl); skb->dev = ieee->dev; skb->mac.raw = skb->data; - skb_pull(skb, ieee80211_get_hdrlen(fc)); + skb_pull(skb, ieee80211_get_hdrlen(hdr)); skb->pkt_type = PACKET_OTHERHOST; skb->protocol = __constant_htons(ETH_P_80211_RAW); memset(skb->cb, 0, sizeof(skb->cb)); @@ -75,8 +74,8 @@ if (entry->skb != NULL && entry->seq == seq && (entry->last_frag + 1 == frag || frag == -1) && - memcmp(entry->src_addr, src, ETH_ALEN) == 0 && - memcmp(entry->dst_addr, dst, ETH_ALEN) == 0) + memcmp(entry->src_addr, src, IEEE80211_ALEN) == 0 && + memcmp(entry->dst_addr, dst, IEEE80211_ALEN) == 0) return entry; } @@ -103,7 +102,7 @@ sizeof(struct 
ieee80211_hdr) + 8 /* LLC */ + 2 /* alignment */ + - 8 /* WEP */ + ETH_ALEN /* WDS */); + 8 /* WEP */ + IEEE80211_ALEN /* WDS */); if (skb == NULL) return NULL; @@ -119,8 +118,8 @@ entry->seq = seq; entry->last_frag = frag; entry->skb = skb; - memcpy(entry->src_addr, hdr->addr2, ETH_ALEN); - memcpy(entry->dst_addr, hdr->addr1, ETH_ALEN); + memcpy(entry->src_addr, hdr->addr2, IEEE80211_ALEN); + memcpy(entry->dst_addr, hdr->addr1, IEEE80211_ALEN); } else { /* received a fragment of a frame for which the head fragment * should have already been received */ @@ -220,15 +219,6 @@ #endif -/* See IEEE 802.1H for LLC/SNAP encapsulation/decapsulation */ -/* Ethernet-II snap header (RFC1042 for most EtherTypes) */ -static unsigned char rfc1042_header[] = -{ 0xaa, 0xaa, 0x03, 0x00, 0x00, 0x00 }; -/* Bridge-Tunnel header (for EtherTypes ETH_P_AARP and ETH_P_IPX) */ -static unsigned char bridge_tunnel_header[] = -{ 0xaa, 0xaa, 0x03, 0x00, 0x00, 0xf8 }; -/* No encapsulation header if EtherType < 0x600 (=length) */ - /* Called by ieee80211_rx_frame_decrypt */ static int ieee80211_is_eapol_frame(struct ieee80211_device *ieee, struct sk_buff *skb) @@ -236,7 +226,6 @@ struct net_device *dev = ieee80211_dev(ieee); u16 fc, ethertype; struct ieee80211_hdr *hdr; - u8 *pos; if (skb->len < 24) return 0; @@ -247,12 +236,12 @@ /* check that the frame is unicast frame to us */ if ((fc & (IEEE80211_FCTL_TODS | IEEE80211_FCTL_FROMDS)) == IEEE80211_FCTL_TODS && - memcmp(hdr->addr1, dev->dev_addr, ETH_ALEN) == 0 && - memcmp(hdr->addr3, dev->dev_addr, ETH_ALEN) == 0) { + memcmp(hdr->addr1, dev->dev_addr, IEEE80211_ALEN) == 0 && + memcmp(hdr->addr3, dev->dev_addr, IEEE80211_ALEN) == 0) { /* ToDS frame with own addr BSSID and DA */ } else if ((fc & (IEEE80211_FCTL_TODS | IEEE80211_FCTL_FROMDS)) == IEEE80211_FCTL_FROMDS && - memcmp(hdr->addr1, dev->dev_addr, ETH_ALEN) == 0) { + memcmp(hdr->addr1, dev->dev_addr, IEEE80211_ALEN) == 0) { /* FromDS frame with own addr as DA */ } else return 0; @@ -261,8 
+250,7 @@ return 0; /* check for port access entity Ethernet type */ - pos = skb->data + 24; - ethertype = (pos[6] << 8) | pos[7]; + ethertype = ieee80211_get_proto(hdr); if (ethertype == ETH_P_PAE) return 1; @@ -281,7 +269,7 @@ return 0; hdr = (struct ieee80211_hdr *) skb->data; - hdrlen = ieee80211_get_hdrlen(le16_to_cpu(hdr->frame_ctl)); + hdrlen = ieee80211_get_hdrlen(hdr); #ifdef CONFIG_IEEE80211_CRYPT_TKIP if (ieee->tkip_countermeasures && @@ -326,7 +314,7 @@ return 0; hdr = (struct ieee80211_hdr *) skb->data; - hdrlen = ieee80211_get_hdrlen(le16_to_cpu(hdr->frame_ctl)); + hdrlen = ieee80211_get_hdrlen(hdr); atomic_inc(&crypt->refcnt); res = crypt->ops->decrypt_msdu(skb, keyidx, hdrlen, crypt->priv); @@ -342,6 +330,44 @@ } +unsigned short ieee80211_type_trans(struct sk_buff *skb, + struct ieee80211_device *ieee) +{ + struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; + struct ieee80211_snap_hdr *snap; + int hdrlen; + u8 *daddr = IEEE80211_GET_DADDR(hdr); + unsigned short type; + + skb->mac.raw = skb->data; + + hdrlen = ieee80211_get_hdrlen(hdr); + snap = (struct ieee80211_snap_hdr *)(skb->data + hdrlen); + if (snap->dsap == 0xaa && snap->ssap == 0xaa && + ((IEEE80211_SNAP_IS_RFC1042(snap) && + snap->type != __constant_htons(ETH_P_AARP) && + snap->type != __constant_htons(ETH_P_IPX)) || + IEEE80211_SNAP_IS_BRIDGE_TUNNEL(snap))) { + type = snap->type; + skb_pull(skb, hdrlen + SNAP_SIZE); + } + else { + type = __constant_htons(ETH_P_802_2); + skb_pull(skb, hdrlen); + } + + skb->input_dev = ieee->dev; + if (is_broadcast_ieee80211_addr(daddr)) + skb->pkt_type = PACKET_BROADCAST; + else if (is_multicast_ieee80211_addr(daddr)) + skb->pkt_type = PACKET_MULTICAST; + else if (memcmp(daddr, ieee->dev->dev_addr, IEEE80211_ALEN)) + skb->pkt_type = PACKET_OTHERHOST; + + return type; +} + + /* All received frames are sent to this function. @skb contains the frame in * IEEE 802.11 format, i.e., in the format it was sent over air. 
* This function is called only as a tasklet (software IRQ). */ @@ -354,8 +380,6 @@ u16 fc, type, stype, sc; struct net_device_stats *stats; unsigned int frag; - u8 *payload; - u16 ethertype; #ifdef NOT_YET struct net_device *wds = NULL; struct sk_buff *skb2 = NULL; @@ -364,8 +388,8 @@ int from_assoc_ap = 0; void *sta = NULL; #endif - u8 dst[ETH_ALEN]; - u8 src[ETH_ALEN]; + u8 dst[IEEE80211_ALEN]; + u8 src[IEEE80211_ALEN]; struct ieee80211_crypt_data *crypt = NULL; int keyidx = 0; @@ -383,7 +407,7 @@ stype = WLAN_FC_GET_STYPE(fc); sc = le16_to_cpu(hdr->seq_ctl); frag = WLAN_GET_SEQ_FRAG(sc); - hdrlen = ieee80211_get_hdrlen(fc); + hdrlen = __ieee80211_get_hdrlen(fc); #ifdef NOT_YET #if WIRELESS_EXT > 15 @@ -479,22 +503,23 @@ switch (fc & (IEEE80211_FCTL_FROMDS | IEEE80211_FCTL_TODS)) { case IEEE80211_FCTL_FROMDS: - memcpy(dst, hdr->addr1, ETH_ALEN); - memcpy(src, hdr->addr3, ETH_ALEN); + memcpy(dst, hdr->addr1, IEEE80211_ALEN); + memcpy(src, hdr->addr3, IEEE80211_ALEN); break; case IEEE80211_FCTL_TODS: - memcpy(dst, hdr->addr3, ETH_ALEN); - memcpy(src, hdr->addr2, ETH_ALEN); + memcpy(dst, hdr->addr3, IEEE80211_ALEN); + memcpy(src, hdr->addr2, IEEE80211_ALEN); break; case IEEE80211_FCTL_FROMDS | IEEE80211_FCTL_TODS: if (skb->len < IEEE80211_4ADDR_LEN) goto rx_dropped; - memcpy(dst, hdr->addr3, ETH_ALEN); - memcpy(src, hdr->addr4, ETH_ALEN); + memcpy(dst, hdr->addr3, IEEE80211_ALEN); + memcpy(src, hdr->addr4, IEEE80211_ALEN); + /* FIXME: this is wrong */ break; case 0: - memcpy(dst, hdr->addr1, ETH_ALEN); - memcpy(src, hdr->addr2, ETH_ALEN); + memcpy(dst, hdr->addr1, IEEE80211_ALEN); + memcpy(src, hdr->addr2, IEEE80211_ALEN); break; } @@ -509,7 +534,7 @@ if (ieee->iw_mode == IW_MODE_MASTER && !wds && (fc & (IEEE80211_FCTL_TODS | IEEE80211_FCTL_FROMDS)) == IEEE80211_FCTL_FROMDS && ieee->stadev && - memcmp(hdr->addr2, ieee->assoc_ap_addr, ETH_ALEN) == 0) { + memcmp(hdr->addr2, ieee->assoc_ap_addr, IEEE80211_ALEN) == 0) { /* Frame from BSSID of the AP for which we are a 
client */ skb->dev = dev = ieee->stadev; stats = hostap_get_stats(dev); @@ -667,9 +692,6 @@ /* skb: hdr + (possible reassembled) full plaintext payload */ - payload = skb->data + hdrlen; - ethertype = (payload[6] << 8) | payload[7]; - #ifdef NOT_YET /* If IEEE 802.1X is used, check whether the port is authorized to send * the received frame. */ @@ -696,38 +718,6 @@ } #endif - /* convert hdr + possible LLC headers into Ethernet header */ - if (skb->len - hdrlen >= 8 && - ((memcmp(payload, rfc1042_header, SNAP_SIZE) == 0 && - ethertype != ETH_P_AARP && ethertype != ETH_P_IPX) || - memcmp(payload, bridge_tunnel_header, SNAP_SIZE) == 0)) { - /* remove RFC1042 or Bridge-Tunnel encapsulation and - * replace EtherType */ - skb_pull(skb, hdrlen + SNAP_SIZE); - memcpy(skb_push(skb, ETH_ALEN), src, ETH_ALEN); - memcpy(skb_push(skb, ETH_ALEN), dst, ETH_ALEN); - } else { - u16 len; - /* Leave Ethernet header part of hdr and full payload */ - skb_pull(skb, hdrlen); - len = htons(skb->len); - memcpy(skb_push(skb, 2), &len, 2); - memcpy(skb_push(skb, ETH_ALEN), src, ETH_ALEN); - memcpy(skb_push(skb, ETH_ALEN), dst, ETH_ALEN); - } - -#ifdef NOT_YET - if (wds && ((fc & (IEEE80211_FCTL_TODS | IEEE80211_FCTL_FROMDS)) == - IEEE80211_FCTL_TODS) && - skb->len >= ETH_HLEN + ETH_ALEN) { - /* Non-standard frame: get addr4 from its bogus location after - * the payload */ - memcpy(skb->data + ETH_ALEN, - skb->data + skb->len - ETH_ALEN, ETH_ALEN); - skb_trim(skb, skb->len - ETH_ALEN); - } -#endif - stats->rx_packets++; stats->rx_bytes += skb->len; @@ -753,7 +743,7 @@ if (skb2 != NULL) { /* send to wireless media */ - skb2->protocol = __constant_htons(ETH_P_802_3); + skb2->protocol = ieee80211_type_trans(skb2, ieee); skb2->mac.raw = skb2->nh.raw = skb2->data; /* skb2->nh.raw = skb2->data + ETH_HLEN; */ skb2->dev = dev; @@ -763,7 +753,7 @@ #endif if (skb) { - skb->protocol = eth_type_trans(skb, dev); + skb->protocol = ieee80211_type_trans(skb, ieee); memset(skb->cb, 0, sizeof(skb->cb)); 
skb->dev = dev; skb->ip_summed = CHECKSUM_NONE; /* 802.11 crc not sufficient */ @@ -820,7 +810,7 @@ u8 i; /* Pull out fixed field data */ - memcpy(network->bssid, beacon->header.addr3, ETH_ALEN); + memcpy(network->bssid, beacon->header.addr3, IEEE80211_ALEN); network->capability = beacon->capability; network->last_scanned = jiffies; network->time_stamp[0] = beacon->time_stamp[0]; @@ -848,7 +838,7 @@ while (left >= sizeof(struct ieee80211_info_element_hdr)) { if (sizeof(struct ieee80211_info_element_hdr) + info_element->len > left) { IEEE80211_DEBUG_SCAN("SCAN: parse failed: info_element->len + 2 > left : info_element->len+2=%d left=%d.\n", - info_element->len + sizeof(struct ieee80211_info_element), + info_element->len + (int)sizeof(struct ieee80211_info_element), left); return 1; } @@ -1016,7 +1006,7 @@ * as one network */ return ((src->ssid_len == dst->ssid_len) && (src->channel == dst->channel) && - !memcmp(src->bssid, dst->bssid, ETH_ALEN) && + !memcmp(src->bssid, dst->bssid, IEEE80211_ALEN) && !memcmp(src->ssid, dst->ssid, src->ssid_len)); } Index: netdev/net/ieee80211/ieee80211_module.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_module.c 2005-06-03 13:20:46.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_module.c 2005-06-03 13:21:00.000000000 +0200 @@ -47,7 +47,6 @@ #include #include #include -#include #include #include @@ -102,24 +101,22 @@ { struct ieee80211_device *ieee; struct net_device *dev; - int alloc_size; + int alloc_size; int err; IEEE80211_DEBUG_INFO("Initializing...\n"); - alloc_size = ((sizeof(struct ieee80211_device) + NETDEV_ALIGN_CONST) - & ~NETDEV_ALIGN_CONST) - + sizeof_priv; - dev = alloc_etherdev(alloc_size); + alloc_size = ((sizeof(struct ieee80211_device) + NETDEV_ALIGN_CONST) + & ~NETDEV_ALIGN_CONST) + + sizeof_priv; + dev = alloc_netdev(alloc_size, "wlan%d", ieee80211_setup); if (!dev) { - IEEE80211_ERROR("Unable to network device.\n"); + IEEE80211_ERROR("Unable to 
allocate network device.\n"); goto failed; } ieee = netdev_priv(dev); ieee->dev = dev; ieee->priv = ieee80211_priv(ieee); - - dev->hard_start_xmit = ieee80211_xmit; err = ieee80211_networks_allocate(ieee); if (err) { Index: netdev/net/ieee80211/ieee80211_tx.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_tx.c 2005-06-03 13:20:46.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_tx.c 2005-06-03 13:21:00.000000000 +0200 @@ -83,16 +83,6 @@ Total: 8 non-data bytes -802.3 Ethernet Data Frame - - ,-----------------------------------------. -Bytes | 6 | 6 | 2 | Variable | 4 | - |-------|-------|------|-----------|------| -Desc. | Dest. | Source| Type | IP Packet | fcs | - | MAC | MAC | | | | - `-----------------------------------------' -Total: 18 non-data bytes - In the event that fragmentation is required, the incoming payload is split into N parts of size ieee->fts. The first fragment contains the SNAP header and the remaining packets are just data. @@ -103,56 +93,8 @@ encryption it will take 3 frames. With WEP it will take 4 frames as the payload of each frame is reduced to 492 bytes. -* SKB visualization -* -* ,- skb->data -* | -* | ETHERNET HEADER ,-<-- PAYLOAD -* | | 14 bytes from skb->data -* | 2 bytes for Type --> ,T. | (sizeof ethhdr) -* | | | | -* |,-Dest.--. ,--Src.---. 
| | | -* | 6 bytes| | 6 bytes | | | | -* v | | | | | | -* 0 | v 1 | v | v 2 -* 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 -* ^ | ^ | ^ | -* | | | | | | -* | | | | `T' <---- 2 bytes for Type -* | | | | -* | | '---SNAP--' <-------- 6 bytes for SNAP -* | | -* `-IV--' <-------------------- 4 bytes for IV (WEP) -* -* SNAP HEADER -* */ -static u8 P802_1H_OUI[P80211_OUI_LEN] = { 0x00, 0x00, 0xf8 }; -static u8 RFC1042_OUI[P80211_OUI_LEN] = { 0x00, 0x00, 0x00 }; - -static inline int ieee80211_put_snap(u8 *data, u16 h_proto) -{ - struct ieee80211_snap_hdr *snap; - u8 *oui; - - snap = (struct ieee80211_snap_hdr *)data; - snap->dsap = 0xaa; - snap->ssap = 0xaa; - snap->ctrl = 0x03; - - if (h_proto == 0x8137 || h_proto == 0x80f3) - oui = P802_1H_OUI; - else - oui = RFC1042_OUI; - snap->oui[0] = oui[0]; - snap->oui[1] = oui[1]; - snap->oui[2] = oui[2]; - - *(u16 *)(data + SNAP_SIZE) = htons(h_proto); - - return SNAP_SIZE + sizeof(u16); -} static inline int ieee80211_encrypt_fragment( struct ieee80211_device *ieee, @@ -247,19 +189,16 @@ struct net_device *dev) { struct ieee80211_device *ieee = netdev_priv(dev); + struct ieee80211_hdr *header = (struct ieee80211_hdr *)skb->data; struct ieee80211_txb *txb = NULL; struct ieee80211_hdr *frag_hdr; int i, bytes_per_frag, nr_frags, bytes_last_frag, frag_size; unsigned long flags; struct net_device_stats *stats = &ieee->stats; - int ether_type, encrypt; + int type, encrypt; int bytes, fc, hdr_len; struct sk_buff *skb_frag; - struct ieee80211_hdr header = { /* Ensure zero initialized */ - .duration_id = 0, - .seq_ctl = 0 - }; - u8 dest[ETH_ALEN], src[ETH_ALEN]; + u8 *dest; struct ieee80211_crypt_data* crypt; @@ -268,76 +207,48 @@ /* If there is no driver handler to take the TXB, dont' bother * creating it... 
*/ if (!ieee->hard_start_xmit) { - printk(KERN_WARNING "%s: No xmit handler.\n", - dev->name); + if (printk_ratelimit()) + printk(KERN_WARNING "%s: No xmit handler.\n", + dev->name); goto success; } - if (unlikely(skb->len < SNAP_SIZE + sizeof(u16))) { - printk(KERN_WARNING "%s: skb too small (%d).\n", - dev->name, skb->len); - goto success; - } - - ether_type = ntohs(((struct ethhdr *)skb->data)->h_proto); + type = ieee80211_get_proto(header); + dest = IEEE80211_GET_DADDR(header); + hdr_len = ieee80211_get_hdrlen(header); crypt = ieee->crypt[ieee->tx_keyidx]; - encrypt = !(ether_type == ETH_P_PAE && ieee->ieee802_1x) && + encrypt = !(type == ETH_P_PAE && ieee->ieee802_1x) && ieee->host_encrypt && crypt && crypt->ops; if (!encrypt && ieee->ieee802_1x && - ieee->drop_unencrypted && ether_type != ETH_P_PAE) { + ieee->drop_unencrypted && type != ETH_P_PAE) { stats->tx_dropped++; goto success; } #ifdef CONFIG_IEEE80211_DEBUG - if (crypt && !encrypt && ether_type == ETH_P_PAE) { - struct eapol *eap = (struct eapol *)(skb->data + - sizeof(struct ethhdr) - SNAP_SIZE - sizeof(u16)); + if (crypt && !encrypt && type == ETH_P_PAE) { + struct eapol *eap = (struct eapol *)(skb->data + hdr_len); IEEE80211_DEBUG_EAP("TX: IEEE 802.11 EAPOL frame: %s\n", eap_get_type(eap->type)); } #endif - /* Save source and destination addresses */ - memcpy(&dest, skb->data, ETH_ALEN); - memcpy(&src, skb->data+ETH_ALEN, ETH_ALEN); - - /* Advance the SKB to the start of the payload */ - skb_pull(skb, sizeof(struct ethhdr)); - /* Determine total amount of storage required for TXB packets */ - bytes = skb->len + SNAP_SIZE + sizeof(u16); + bytes = skb->len - hdr_len; + fc = le16_to_cpu(header->frame_ctl); if (encrypt) - fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA | - IEEE80211_FCTL_WEP; - else - fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA; + fc |= IEEE80211_FCTL_WEP; - if (ieee->iw_mode == IW_MODE_INFRA) { - fc |= IEEE80211_FCTL_TODS; - /* To DS: Addr1 = BSSID, Addr2 = SA, - Addr3 = DA */ - 
memcpy(&header.addr1, ieee->bssid, ETH_ALEN); - memcpy(&header.addr2, &src, ETH_ALEN); - memcpy(&header.addr3, &dest, ETH_ALEN); - } else if (ieee->iw_mode == IW_MODE_ADHOC) { - /* not From/To DS: Addr1 = DA, Addr2 = SA, - Addr3 = BSSID */ - memcpy(&header.addr1, dest, ETH_ALEN); - memcpy(&header.addr2, src, ETH_ALEN); - memcpy(&header.addr3, ieee->bssid, ETH_ALEN); - } - header.frame_ctl = cpu_to_le16(fc); - hdr_len = IEEE80211_3ADDR_LEN; + header->frame_ctl = cpu_to_le16(fc); /* Determine fragmentation size based on destination (multicast * and broadcast are not fragmented) */ - if (is_multicast_ether_addr(dest) || - is_broadcast_ether_addr(dest)) + if (is_multicast_ieee80211_addr(dest) || + is_broadcast_ieee80211_addr(dest)) frag_size = MAX_FRAG_THRESHOLD; else frag_size = ieee->fts; @@ -346,7 +257,7 @@ * this stack is providing the full 802.11 header, one will * eventually be affixed to this fragment -- so we must account for * it when determining the amount of payload space. */ - bytes_per_frag = frag_size - IEEE80211_3ADDR_LEN; + bytes_per_frag = frag_size - hdr_len; if (ieee->config & (CFG_IEEE80211_COMPUTE_FCS | CFG_IEEE80211_RESERVE_FCS)) bytes_per_frag -= IEEE80211_FCS_LEN; @@ -377,6 +288,8 @@ txb->encrypted = encrypt; txb->payload_size = bytes; + skb_pull(skb, hdr_len); + for (i = 0; i < nr_frags; i++) { skb_frag = txb->fragments[i]; @@ -384,7 +297,7 @@ skb_reserve(skb_frag, crypt->ops->extra_prefix_len); frag_hdr = (struct ieee80211_hdr *)skb_put(skb_frag, hdr_len); - memcpy(frag_hdr, &header, hdr_len); + memcpy(frag_hdr, header, hdr_len); /* If this is not the last fragment, then add the MOREFRAGS * bit to the frame control */ @@ -397,14 +310,6 @@ bytes = bytes_last_frag; } - /* Put a SNAP header on the first fragment */ - if (i == 0) { - ieee80211_put_snap( - skb_put(skb_frag, SNAP_SIZE + sizeof(u16)), - ether_type); - bytes -= SNAP_SIZE + sizeof(u16); - } - memcpy(skb_put(skb_frag, bytes), skb->data, bytes); /* Advance the SKB... 
*/ Index: netdev/net/ieee80211/ieee80211_wx.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_wx.c 2005-06-03 13:20:46.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_wx.c 2005-06-03 13:21:00.000000000 +0200 @@ -53,7 +53,7 @@ /* First entry *MUST* be the AP MAC address */ iwe.cmd = SIOCGIWAP; iwe.u.ap_addr.sa_family = ARPHRD_ETHER; - memcpy(iwe.u.ap_addr.sa_data, network->bssid, ETH_ALEN); + memcpy(iwe.u.ap_addr.sa_data, network->bssid, IEEE80211_ALEN); start = iwe_stream_add_event(start, stop, &iwe, IW_EV_ADDR_LEN); /* Remaining entries will be displayed in the order we provide them */ Index: netdev/net/ieee80211/ieee80211_crypt_ccmp.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_crypt_ccmp.c 2005-06-01 11:05:14.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_crypt_ccmp.c 2005-06-03 13:21:00.000000000 +0200 @@ -17,7 +17,6 @@ #include #include #include -#include #include #include #include @@ -156,7 +155,7 @@ * Dlen */ b0[0] = 0x59; b0[1] = qc; - memcpy(b0 + 2, hdr->addr2, ETH_ALEN); + memcpy(b0 + 2, hdr->addr2, IEEE80211_ALEN); memcpy(b0 + 8, pn, CCMP_PN_LEN); b0[14] = (dlen >> 8) & 0xff; b0[15] = dlen & 0xff; @@ -173,13 +172,13 @@ aad[1] = aad_len & 0xff; aad[2] = pos[0] & 0x8f; aad[3] = pos[1] & 0xc7; - memcpy(aad + 4, hdr->addr1, 3 * ETH_ALEN); + memcpy(aad + 4, hdr->addr1, 3 * IEEE80211_ALEN); pos = (u8 *) &hdr->seq_ctl; aad[22] = pos[0] & 0x0f; aad[23] = 0; /* all bits masked */ memset(aad + 24, 0, 8); if (a4_included) - memcpy(aad + 24, hdr->addr4, ETH_ALEN); + memcpy(aad + 24, hdr->addr4, IEEE80211_ALEN); if (qc_included) { aad[a4_included ? 
30 : 24] = qc; /* rest of QC masked */ Index: netdev/net/ieee80211/ieee80211_crypt_tkip.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_crypt_tkip.c 2005-06-01 11:05:14.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_crypt_tkip.c 2005-06-03 13:21:00.000000000 +0200 @@ -17,7 +17,6 @@ #include #include #include -#include #include #include @@ -461,20 +460,20 @@ switch (le16_to_cpu(hdr11->frame_ctl) & (IEEE80211_FCTL_FROMDS | IEEE80211_FCTL_TODS)) { case IEEE80211_FCTL_TODS: - memcpy(hdr, hdr11->addr3, ETH_ALEN); /* DA */ - memcpy(hdr + ETH_ALEN, hdr11->addr2, ETH_ALEN); /* SA */ + memcpy(hdr, hdr11->addr3, IEEE80211_ALEN); /* DA */ + memcpy(hdr + IEEE80211_ALEN, hdr11->addr2, IEEE80211_ALEN); /* SA */ break; case IEEE80211_FCTL_FROMDS: - memcpy(hdr, hdr11->addr1, ETH_ALEN); /* DA */ - memcpy(hdr + ETH_ALEN, hdr11->addr3, ETH_ALEN); /* SA */ + memcpy(hdr, hdr11->addr1, IEEE80211_ALEN); /* DA */ + memcpy(hdr + IEEE80211_ALEN, hdr11->addr3, IEEE80211_ALEN); /* SA */ break; case IEEE80211_FCTL_FROMDS | IEEE80211_FCTL_TODS: - memcpy(hdr, hdr11->addr3, ETH_ALEN); /* DA */ - memcpy(hdr + ETH_ALEN, hdr11->addr4, ETH_ALEN); /* SA */ + memcpy(hdr, hdr11->addr3, IEEE80211_ALEN); /* DA */ + memcpy(hdr + IEEE80211_ALEN, hdr11->addr4, IEEE80211_ALEN); /* SA */ break; case 0: - memcpy(hdr, hdr11->addr1, ETH_ALEN); /* DA */ - memcpy(hdr + ETH_ALEN, hdr11->addr2, ETH_ALEN); /* SA */ + memcpy(hdr, hdr11->addr1, IEEE80211_ALEN); /* DA */ + memcpy(hdr + IEEE80211_ALEN, hdr11->addr2, IEEE80211_ALEN); /* SA */ break; } @@ -521,7 +520,7 @@ else ev.flags |= IW_MICFAILURE_PAIRWISE; ev.src_addr.sa_family = ARPHRD_ETHER; - memcpy(ev.src_addr.sa_data, hdr->addr2, ETH_ALEN); + memcpy(ev.src_addr.sa_data, hdr->addr2, IEEE80211_ALEN); memset(&wrqu, 0, sizeof(wrqu)); wrqu.data.length = sizeof(ev); wireless_send_event(dev, IWEVMICHAELMICFAILURE, &wrqu, (char *) &ev); Index: netdev/net/ieee80211/Makefile 
=================================================================== --- netdev.orig/net/ieee80211/Makefile 2005-06-01 11:05:14.000000000 +0200 +++ netdev/net/ieee80211/Makefile 2005-06-03 13:21:00.000000000 +0200 @@ -5,6 +5,7 @@ obj-$(CONFIG_IEEE80211_CRYPT_TKIP) += ieee80211_crypt_tkip.o ieee80211-objs := \ ieee80211_module.o \ + ieee80211_proto.o \ ieee80211_tx.o \ ieee80211_rx.o \ ieee80211_wx.o Index: netdev/net/ieee80211/ieee80211_proto.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ netdev/net/ieee80211/ieee80211_proto.c 2005-06-03 13:21:00.000000000 +0200 @@ -0,0 +1,239 @@ +/******************************************************************************* + + Copyright (c) 2005 Jiri Benc and Jirka Bohac + Copyright (c) 2004 Intel Corporation. All rights reserved. + (Contact: James P. Ketrenos ) + + Sponsored by SuSE. + + This program is free software; you can redistribute it and/or modify it + under the terms of version 2 of the GNU General Public License as + published by the Free Software Foundation. + + This program is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + You should have received a copy of the GNU General Public License along with + this program; if not, write to the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
+ +*******************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +static int ieee80211_change_mtu(struct net_device *dev, int new_mtu) +{ + if ((new_mtu < 68) || (new_mtu > IEEE80211_DATA_LEN - 8 - SNAP_SIZE)) + return -EINVAL; + dev->mtu = new_mtu; + return 0; +} + + +static u8 P802_1H_OUI[P80211_OUI_LEN] = { 0x00, 0x00, 0xf8 }; +static u8 RFC1042_OUI[P80211_OUI_LEN] = { 0x00, 0x00, 0x00 }; + +static inline int __ieee80211_put_snap(u8 *data, u16 h_proto) +{ + struct ieee80211_snap_hdr *snap; + u8 *oui; + + snap = (struct ieee80211_snap_hdr *)data; + snap->dsap = 0xaa; + snap->ssap = 0xaa; + snap->ctrl = 0x03; + + if (h_proto == __constant_htons(ETH_P_IPX) || + h_proto == __constant_htons(ETH_P_AARP)) + oui = P802_1H_OUI; + else + oui = RFC1042_OUI; + snap->oui[0] = oui[0]; + snap->oui[1] = oui[1]; + snap->oui[2] = oui[2]; + + snap->type = h_proto; + + return SNAP_SIZE; +} + +static inline int ieee80211_put_snap(u8 *data, u16 h_proto) +{ + return __ieee80211_put_snap(data, htons(h_proto)); +} + +/* + * Create the IEEE 802.11 MAC header for an arbitrary protocol layer + * + * saddr=NULL means use device source address + * daddr=NULL means leave destination address (eg unresolved arp) + */ +static int ieee80211_header(struct sk_buff *skb, struct net_device *dev, + unsigned short type, void *daddr, void *saddr, unsigned len) +{ + struct ieee80211_device *ieee = netdev_priv(dev); + struct ieee80211_hdr *header; + int fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA; + int hdr_len = IEEE80211_3ADDR_LEN; + + if (type != ETH_P_802_3 && type != ETH_P_802_2) { + ieee80211_put_snap(skb_push(skb, SNAP_SIZE), type); + hdr_len += SNAP_SIZE; + } + + if (!saddr) saddr = dev->dev_addr; + header = (struct ieee80211_hdr *)skb_push(skb, IEEE80211_3ADDR_LEN); + header->duration_id = header->seq_ctl = 0; + if (ieee->iw_mode == IW_MODE_INFRA) { + fc |= IEEE80211_FCTL_TODS; 
+ /* To DS: Addr1 = BSSID, Addr2 = SA, + Addr3 = DA */ + memcpy(header->addr1, ieee->bssid, IEEE80211_ALEN); + memcpy(header->addr2, saddr, IEEE80211_ALEN); + if (daddr) + memcpy(header->addr3, daddr, IEEE80211_ALEN); + else + memset(header->addr3, 0, IEEE80211_ALEN); + } else if (ieee->iw_mode == IW_MODE_ADHOC) { + /* not From/To DS: Addr1 = DA, Addr2 = SA, + Addr3 = BSSID */ + if (daddr) + memcpy(header->addr1, daddr, IEEE80211_ALEN); + else + memset(header->addr1, 0, IEEE80211_ALEN); + memcpy(header->addr2, saddr, IEEE80211_ALEN); + memcpy(header->addr3, ieee->bssid, IEEE80211_ALEN); + } + header->frame_ctl = cpu_to_le16(fc); + + if (!daddr || (dev->flags & (IFF_LOOPBACK | IFF_NOARP))) + return -hdr_len; + return hdr_len; +} + +static int ieee80211_rebuild_header(struct sk_buff *skb) +{ + struct ieee80211_hdr *header = (struct ieee80211_hdr *)skb->data; + struct net_device *dev = skb->dev; + unsigned short type; + + type = ieee80211_get_proto(header); + + switch (type) { +#ifdef CONFIG_INET + case ETH_P_IP: + return arp_find(IEEE80211_GET_DADDR(header), skb); +#endif + default: + printk(KERN_DEBUG + "%s: unable to resolve type %X addresses.\n", + dev->name, type); + break; + } + + return 0; +} + +static int ieee80211_mac_addr(struct net_device *dev, void *p) +{ + struct sockaddr *addr = p; + + if (netif_running(dev)) + return -EBUSY; + memcpy(dev->dev_addr, addr->sa_data, dev->addr_len); + return 0; +} + +static int ieee80211_header_cache(struct neighbour *neigh, struct hh_cache *hh) +{ + struct net_device *dev = neigh->dev; + struct ieee80211_device *ieee = netdev_priv(dev); + unsigned short type = hh->hh_type; + struct ieee80211_hdr *header; + int fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA; + + if (type == __constant_htons(ETH_P_802_3) || + type == __constant_htons(ETH_P_802_2)) + return -1; + + header = (struct ieee80211_hdr *) + (((u8 *)hh->hh_data) + + (HH_DATA_OFF(IEEE80211_3ADDR_LEN + SNAP_SIZE))); + __ieee80211_put_snap((u8 *)header + 
IEEE80211_3ADDR_LEN, type); + + header->duration_id = header->seq_ctl = 0; + if (ieee->iw_mode == IW_MODE_INFRA) { + fc |= IEEE80211_FCTL_TODS; + /* To DS: Addr1 = BSSID, Addr2 = SA, + Addr3 = DA */ + memcpy(header->addr1, ieee->bssid, IEEE80211_ALEN); + memcpy(header->addr2, dev->dev_addr, IEEE80211_ALEN); + memcpy(header->addr3, neigh->ha, IEEE80211_ALEN); + } else if (ieee->iw_mode == IW_MODE_ADHOC) { + /* not From/To DS: Addr1 = DA, Addr2 = SA, + Addr3 = BSSID */ + memcpy(header->addr1, neigh->ha, IEEE80211_ALEN); + memcpy(header->addr2, dev->dev_addr, IEEE80211_ALEN); + memcpy(header->addr3, ieee->bssid, IEEE80211_ALEN); + } + header->frame_ctl = cpu_to_le16(fc); + + hh->hh_len = IEEE80211_3ADDR_LEN + SNAP_SIZE; + return 0; +} + +static void ieee80211_header_cache_update(struct hh_cache *hh, + struct net_device *dev, unsigned char *haddr) +{ + struct ieee80211_hdr *header; + + header = (struct ieee80211_hdr *) + (((u8 *)hh->hh_data) + + (HH_DATA_OFF(IEEE80211_3ADDR_LEN + SNAP_SIZE))); + memcpy(IEEE80211_GET_DADDR(header), haddr, dev->addr_len); +} + +static int ieee80211_header_parse(struct sk_buff *skb, unsigned char *haddr) +{ + struct ieee80211_hdr *header = (struct ieee80211_hdr *)skb->data; + + memcpy(haddr, IEEE80211_GET_SADDR(header), IEEE80211_ALEN); + return IEEE80211_ALEN; +} + + +void ieee80211_setup(struct net_device *dev) +{ + dev->change_mtu = ieee80211_change_mtu; + dev->hard_header = ieee80211_header; + dev->rebuild_header = ieee80211_rebuild_header; + dev->set_mac_address = ieee80211_mac_addr; + dev->hard_header_cache = ieee80211_header_cache; + dev->header_cache_update = ieee80211_header_cache_update; + dev->hard_header_parse = ieee80211_header_parse; + + dev->hard_start_xmit = ieee80211_xmit; + + dev->type = ARPHRD_ETHER; + dev->hard_header_len = IEEE80211_3ADDR_LEN + SNAP_SIZE; + dev->mtu = IEEE80211_DATA_LEN - 8 - SNAP_SIZE; + dev->addr_len = IEEE80211_ALEN; + dev->tx_queue_len = 1000; + dev->flags = IFF_BROADCAST | IFF_MULTICAST; + + 
memset(dev->broadcast, 0xFF, IEEE80211_ALEN); +} + + +EXPORT_SYMBOL(ieee80211_setup); -- Jiri Benc SUSE Labs From jbenc@suse.cz Fri Jun 3 09:36:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:36:29 -0700 (PDT) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GaPXq002608 for ; Fri, 3 Jun 2005 09:36:25 -0700 Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 052A06282FC; Fri, 3 Jun 2005 18:35:27 +0200 (CEST) Date: Fri, 3 Jun 2005 18:35:26 +0200 From: Jiri Benc To: NetDev Cc: Jeff Garzik , Jirka Bohac Subject: [7/9] ipw: fix after "ieee80211: ethernet independency" Message-ID: <20050603183526.0effd2b0@griffin.suse.cz> In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2033 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 1866 Lines: 59 Fixes ipw2200 after making the ieee80211 layer independent of ethernet. 
Signed-off-by: Jiri Benc
Signed-off-by: Jirka Bohac

Index: netdev/drivers/net/wireless/ipw2200.c
===================================================================
--- netdev.orig/drivers/net/wireless/ipw2200.c	2005-05-31 18:25:53.000000000 +0200
+++ netdev/drivers/net/wireless/ipw2200.c	2005-05-31 18:32:18.000000000 +0200
@@ -4920,8 +4920,8 @@
 			    ETH_ALEN) ||
 		    !memcmp(header->addr3, priv->bssid,
 			    ETH_ALEN) ||
-		    is_broadcast_ether_addr(header->addr1) ||
-		    is_multicast_ether_addr(header->addr1);
+		    is_broadcast_ieee80211_addr(header->addr1) ||
+		    is_multicast_ieee80211_addr(header->addr1);
 		break;
 
 	case IW_MODE_INFRA:
@@ -4932,8 +4932,8 @@
 		    !memcmp(header->addr1,
 			    priv->net_dev->dev_addr,
 			    ETH_ALEN) ||
-		    is_broadcast_ether_addr(header->addr1) ||
-		    is_multicast_ether_addr(header->addr1);
+		    is_broadcast_ieee80211_addr(header->addr1) ||
+		    is_multicast_ieee80211_addr(header->addr1);
 		break;
 	}
 
@@ -6285,8 +6285,8 @@
 	switch (priv->ieee->iw_mode) {
 	case IW_MODE_ADHOC:
 		hdr_len = IEEE80211_3ADDR_LEN;
-		unicast = !is_broadcast_ether_addr(hdr->addr1) &&
-			  !is_multicast_ether_addr(hdr->addr1);
+		unicast = !is_broadcast_ieee80211_addr(hdr->addr1) &&
+			  !is_multicast_ieee80211_addr(hdr->addr1);
 		id = ipw_find_station(priv, hdr->addr1);
 		if (id == IPW_INVALID_STATION) {
 			id = ipw_add_station(priv, hdr->addr1);
@@ -6301,8 +6301,8 @@
 
 	case IW_MODE_INFRA:
 	default:
-		unicast = !is_broadcast_ether_addr(hdr->addr3) &&
-			  !is_multicast_ether_addr(hdr->addr3);
+		unicast = !is_broadcast_ieee80211_addr(hdr->addr3) &&
+			  !is_multicast_ieee80211_addr(hdr->addr3);
 		hdr_len = IEEE80211_3ADDR_LEN;
 		id = 0;
 		break;

--
Jiri Benc
SUSE Labs

From jbenc@suse.cz Fri Jun 3 09:37:16 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:37:21 -0700 (PDT)
Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GbGXq003065 for ; Fri, 3 Jun 2005 09:37:16 -0700
Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR
	ESMTP Mailer) with ESMTP id 022006282FC; Fri, 3 Jun 2005 18:36:18 +0200 (CEST)
Date: Fri, 3 Jun 2005 18:36:17 +0200
From: Jiri Benc
To: NetDev
Cc: Jeff Garzik , Jirka Bohac
Subject: [8/9] ieee80211: add sequence numbers
Message-ID: <20050603183617.7903c5a0@griffin.suse.cz>
In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz>
References: <20050603182625.64d33be3@griffin.suse.cz>
X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 2034
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jbenc@suse.cz
Precedence: bulk
X-list: netdev
Content-Length: 2286
Lines: 72

Adds sequence numbers to IEEE 802.11 headers.

Signed-off-by: Jiri Benc
Signed-off-by: Jirka Bohac

Index: netdev/include/net/ieee80211.h
===================================================================
--- netdev.orig/include/net/ieee80211.h	2005-06-03 13:21:00.000000000 +0200
+++ netdev/include/net/ieee80211.h	2005-06-03 13:21:06.000000000 +0200
@@ -711,6 +711,8 @@
 	unsigned int frag_next_idx;
 	u16 fts; /* Fragmentation Threshold */
 
+	u16 seq_number; /* sequence number in transmitted frames */
+
 	/* Association info */
 	u8 bssid[IEEE80211_ALEN];
 
Index: netdev/net/ieee80211/ieee80211_module.c
===================================================================
--- netdev.orig/net/ieee80211/ieee80211_module.c	2005-06-03 13:21:00.000000000 +0200
+++ netdev/net/ieee80211/ieee80211_module.c	2005-06-03 13:21:06.000000000 +0200
@@ -128,6 +128,7 @@
 
 	/* Default fragmentation threshold is maximum payload size */
 	ieee->fts = DEFAULT_FTS;
+	ieee->seq_number = 0;
 	ieee->scan_age = DEFAULT_MAX_SCAN_AGE;
 	ieee->open_wep = 1;
 
Index: netdev/net/ieee80211/ieee80211_tx.c
===================================================================
--- netdev.orig/net/ieee80211/ieee80211_tx.c	2005-06-03 13:21:00.000000000 +0200
+++ netdev/net/ieee80211/ieee80211_tx.c	2005-06-03 13:21:06.000000000 +0200
@@ -276,6 +276,13 @@
 	else
 		bytes_last_frag = bytes_per_frag;
 
+	if (nr_frags > 16) {
+		/* Should never happen */
+		printk(KERN_WARNING "%s: Fragmentation threshold too low\n",
+		       dev->name);
+		goto failed;
+	}
+
 	/* When we allocate the TXB we allocate enough space for the reserve
 	 * and full fragment bytes (bytes_per_frag doesn't include prefix,
 	 * postfix, header, FCS, etc.) */
@@ -299,6 +306,8 @@
 		frag_hdr = (struct ieee80211_hdr *)skb_put(skb_frag, hdr_len);
 		memcpy(frag_hdr, header, hdr_len);
 
+		frag_hdr->seq_ctl = cpu_to_le16(ieee->seq_number | i);
+
 		/* If this is not the last fragment, then add the MOREFRAGS
 		 * bit to the frame control */
 		if (i != nr_frags - 1) {
@@ -323,7 +332,7 @@
 		    (CFG_IEEE80211_COMPUTE_FCS | CFG_IEEE80211_RESERVE_FCS))
 			skb_put(skb_frag, 4);
 	}
-
+	ieee->seq_number += 0x10;
 
  success:
 	spin_unlock_irqrestore(&ieee->lock, flags);

--
Jiri Benc
SUSE Labs

From jbenc@suse.cz Fri Jun 3 09:38:26 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 09:38:30 -0700 (PDT)
Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53GcPXq003682 for ; Fri, 3 Jun 2005 09:38:25 -0700
Received: from griffin.suse.cz (griffin.suse.cz [10.20.1.99]) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id C06036282FC; Fri, 3 Jun 2005 18:37:26 +0200 (CEST)
Date: Fri, 3 Jun 2005 18:37:26 +0200
From: Jiri Benc
To: NetDev
Cc: Jeff Garzik , Jirka Bohac
Subject: [9/9] ieee80211: ETH_P_802_11 ethertype
Message-ID: <20050603183726.482a91d2@griffin.suse.cz>
In-Reply-To: <20050603182625.64d33be3@griffin.suse.cz>
References: <20050603182625.64d33be3@griffin.suse.cz>
X-Mailer: Sylpheed-Claws 1.0.4a (GTK+ 1.2.10; x86_64-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 2035
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to:
netdev-bounce@oss.sgi.com X-original-sender: jbenc@suse.cz Precedence: bulk X-list: netdev Content-Length: 3775 Lines: 110 Introduced new ETH_P_802_11 ethertype. Fixed ieee80211_type_trans() to return ETH_P_802_11 in case of non-data frame. Signed-off-by: Jiri Benc Signed-off-by: Jirka Bohac Index: netdev/include/linux/if_ether.h =================================================================== --- netdev.orig/include/linux/if_ether.h 2005-06-01 11:04:59.000000000 +0200 +++ netdev/include/linux/if_ether.h 2005-06-03 13:21:15.000000000 +0200 @@ -92,6 +92,7 @@ #define ETH_P_ECONET 0x0018 /* Acorn Econet */ #define ETH_P_HDLC 0x0019 /* HDLC frames */ #define ETH_P_ARCNET 0x001A /* 1A for ArcNet :-) */ +#define ETH_P_802_11 0x001B /* 802.11 frames */ /* * This is an Ethernet frame header. Index: netdev/include/net/ieee80211.h =================================================================== --- netdev.orig/include/net/ieee80211.h 2005-06-03 13:21:10.000000000 +0200 +++ netdev/include/net/ieee80211.h 2005-06-03 13:21:15.000000000 +0200 @@ -232,10 +232,6 @@ #define ETH_P_PREAUTH 0x88C7 /* IEEE 802.11i pre-authentication */ -#ifndef ETH_P_80211_RAW -#define ETH_P_80211_RAW 0x0003 -#endif - /* IEEE 802.11 defines */ #define P80211_OUI_LEN 3 Index: netdev/net/ieee80211/ieee80211_rx.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_rx.c 2005-06-03 13:21:00.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_rx.c 2005-06-03 13:21:15.000000000 +0200 @@ -46,7 +46,7 @@ skb->mac.raw = skb->data; skb_pull(skb, ieee80211_get_hdrlen(hdr)); skb->pkt_type = PACKET_OTHERHOST; - skb->protocol = __constant_htons(ETH_P_80211_RAW); + skb->protocol = __constant_htons(ETH_P_802_11); memset(skb->cb, 0, sizeof(skb->cb)); netif_rx(skb); } @@ -338,22 +338,33 @@ int hdrlen; u8 *daddr = IEEE80211_GET_DADDR(hdr); unsigned short type; + u16 fc; skb->mac.raw = skb->data; - hdrlen = ieee80211_get_hdrlen(hdr); - snap = (struct 
ieee80211_snap_hdr *)(skb->data + hdrlen); - if (snap->dsap == 0xaa && snap->ssap == 0xaa && - ((IEEE80211_SNAP_IS_RFC1042(snap) && - snap->type != __constant_htons(ETH_P_AARP) && - snap->type != __constant_htons(ETH_P_IPX)) || - IEEE80211_SNAP_IS_BRIDGE_TUNNEL(snap))) { - type = snap->type; - skb_pull(skb, hdrlen + SNAP_SIZE); + fc = le16_to_cpu(hdr->frame_ctl); + if (WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_DATA && + WLAN_FC_GET_STYPE(fc) == IEEE80211_STYPE_DATA) { + hdrlen = __ieee80211_get_hdrlen(fc); + snap = (struct ieee80211_snap_hdr *)(skb->data + hdrlen); + if (snap->dsap == 0xaa && snap->ssap == 0xaa && + ((IEEE80211_SNAP_IS_RFC1042(snap) && + snap->type != __constant_htons(ETH_P_AARP) && + snap->type != __constant_htons(ETH_P_IPX)) || + IEEE80211_SNAP_IS_BRIDGE_TUNNEL(snap))) { + type = snap->type; + skb_pull(skb, hdrlen + SNAP_SIZE); + } + else { + type = __constant_htons(ETH_P_802_2); + skb_pull(skb, hdrlen); + } } else { - type = __constant_htons(ETH_P_802_2); - skb_pull(skb, hdrlen); + /* If the type isn't data we want to keep the 802.11 header + * in place. 
+ */ + type = __constant_htons(ETH_P_802_11); } skb->input_dev = ieee->dev; Index: netdev/net/ieee80211/ieee80211_proto.c =================================================================== --- netdev.orig/net/ieee80211/ieee80211_proto.c 2005-06-03 13:21:00.000000000 +0200 +++ netdev/net/ieee80211/ieee80211_proto.c 2005-06-03 13:21:15.000000000 +0200 @@ -87,6 +87,8 @@ int fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA; int hdr_len = IEEE80211_3ADDR_LEN; + if (type == ETH_P_802_11) + return 0; if (type != ETH_P_802_3 && type != ETH_P_802_2) { ieee80211_put_snap(skb_push(skb, SNAP_SIZE), type); hdr_len += SNAP_SIZE; -- Jiri Benc SUSE Labs From mitch.a.williams@intel.com Fri Jun 3 10:45:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 10:45:45 -0700 (PDT) Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53HjeXq009108 for ; Fri, 3 Jun 2005 10:45:41 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j53HhWV5003802; Fri, 3 Jun 2005 17:43:32 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j53HhWSc004792; Fri, 3 Jun 2005 17:43:32 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.124]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j53HhWSL028048; Fri, 3 Jun 2005 10:43:32 -0700 Date: Fri, 3 Jun 2005 10:43:32 -0700 From: Mitch Williams X-X-Sender: mawilli1@mawilli1-desk2.amr.corp.intel.com To: jamal cc: "David S. 
Miller" , "Ronciak, John" , jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, "Venkatesan, Ganesh" , "Brandeburg, Jesse" Subject: Re: RFC: NAPI packet weighting patch In-Reply-To: <1117765954.6095.49.camel@localhost.localdomain> Message-ID: References: <468F3FDA28AA87429AD807992E22D07E0450BFDB@orsmsx408> <20050602.171812.48807872.davem@davemloft.net> <1117765954.6095.49.camel@localhost.localdomain> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2037 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 4153 Lines: 88 On Thu, 2 Jun 2005, jamal wrote: > > Heres what i think i saw as a flow of events: > Someone posted a theory that if you happen to reduce the weight > (iirc the reduction was via a shift) then the DRR would give less CPU > time cycle to the driver - Whats the big suprise there? thats DRR design > intent. Well, that was me. Or at least I was the original poster on this thread. But my theory (if you can call it that) really wasn't about CPU time. I spent several weeks in our lab with the somewhat nebulous task of "look at Linux performance". And what I found was, to me, counterintuitive: reducing weight improved performance, sometimes significantly. > > Stephen has a patch which allows people to reduce the weight. > DRR provides fairness. If you have 10 NICs coming at different wire > rates, the weights provide a fairness quota without caring about what > those speeds are. So it doesnt make any sense IMO to have the weight > based on what the NIC speed is. Infact i claim it is _nonsense_. You > dont need to factor speed. And the claim that DRR is not real world > is blasphemous. OK, well, call me a blasphemer (against whom?). 
I'm not really saying that the DRR algorithm is not real-world, but
rather that NAPI as currently implemented has some significant
performance limitations.

In my mind, there are two major problems with NAPI as it stands today.
First, at Gigabit and higher speeds, the default settings don't allow
the driver to process received packets in a timely manner. This causes
dropped packets due to lack of receive resources. Lowering the weight
can fix this, at least in a single-adapter environment.

Second, at 10Mbps and 100Mbps, modern processors are just too fast for
the network. The NAPI polling loop runs so much quicker than the wire
speed that only one or two packets are processed per softirq -- which
effectively puts the adapter back in interrupt mode. Because of this,
you can easily bog down a very fast box with relatively slow traffic,
just due to the massive number of interrupts generated.

My original post (and patch) were to address the first issue. By using
the shift value on the quota, I effectively lowered the weight for every
driver in the system. Stephen sent out a patch that allowed you to
adjust each driver's weight individually. My testing has shown that, as
expected, you can achieve the same performance gain either way.

In a multiple-adapter environment, you need to adjust the weight of all
drivers together to fix the dropped packets issue. Lowering the weight
on one adapter won't help it if the other interfaces are still taking up
a lot of time in their receive loops. My patch gave you one knob to
twiddle that would correct this issue. Stephen's patch gave you one knob
for each adapter, but now you need to twiddle them all to see any
benefit.

The second issue currently has no fix. What is needed is a way for the
driver to request a delayed poll, possibly based on line speed. If we
could wait, say, 8 packet times before polling, we could significantly
reduce the number of interrupts the system has to deal with, at the cost
of higher latency.
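For a rough sense of scale of the "8 packet times" idea, the delay can
be worked out from the line speed. The sketch below is only an
illustration, not driver code: packet_time_ns() and poll_delay_ns() are
hypothetical helper names, and the 20 bytes of per-frame wire overhead
(preamble, start-of-frame delimiter, inter-frame gap) is standard
Ethernet framing rather than anything stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed wire overhead per frame, outside the frame itself:
 * 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap. */
#define ETH_WIRE_OVERHEAD 20u

/* Time one frame occupies the wire, in nanoseconds.
 * frame_bytes counts the Ethernet header, payload and FCS. */
static uint64_t packet_time_ns(uint32_t frame_bytes, uint32_t link_mbps)
{
	assert(link_mbps > 0);
	/* bits * 1000 / (Mbit/s) gives nanoseconds */
	return (uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD) * 8u * 1000u / link_mbps;
}

/* Hypothetical helper: how long a poll deferred by n packet times waits. */
static uint64_t poll_delay_ns(uint32_t n_packets, uint32_t frame_bytes,
			      uint32_t link_mbps)
{
	return (uint64_t)n_packets * packet_time_ns(frame_bytes, link_mbps);
}
```

With full-size 1518-byte frames, eight packet times come to roughly
0.98 ms at 100Mbps and 9.8 ms at 10Mbps; with minimum-size 64-byte
frames at 100Mbps it is only about 54 us, so the added latency depends
heavily on the frame size assumed.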
We haven't had time to investigate this at all, but the need is clearly present -- we've had customer calls about this issue. > > Having said that: > I have a feeling that issue which is which is being waded around is the > amount that the softirq chews in the CPU (unfortunately a well known > issue) and to some extent the packet flow a specific driver chews > depending on the path it takes. I fiddled with this concept a little bit, but didn't see much performance gain by doing so. But it may be something that we can go back and look at. Either way, I think the netdev community needs to look critically at NAPI, and make some changes. Network performance in 2.6.12-rcWhatever is pretty poor. 2.4.30 beats it handily, and it really shouldn't be that way. > This, however, does not eradicate the need for DRR and is absolutely not > driver specific. Agreed. All of the changes I've experimented with at the NAPI level have affected performance similarly on multiple drivers. -Mitch From john.ronciak@intel.com Fri Jun 3 10:43:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 10:43:21 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53HhCXq008752 for ; Fri, 3 Jun 2005 10:43:12 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j53HepFT003163; Fri, 3 Jun 2005 17:40:51 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j53HeadQ026719; Fri, 3 Jun 2005 17:40:48 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060310404812872 ; Fri, 03 Jun 2005 10:40:48 -0700 Received: from 
orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Fri, 3 Jun 2005 10:40:48 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: RFC: NAPI packet weighting patch Date: Fri, 3 Jun 2005 10:40:47 -0700 Message-ID: <468F3FDA28AA87429AD807992E22D07E0450BFE6@orsmsx408> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: RFC: NAPI packet weighting patch Thread-Index: AcVn0fHLt/WdosjHQo2U4D6fkFIrvwAkDmig From: "Ronciak, John" To: "David S. Miller" Cc: , , , "Williams, Mitch A" , , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" X-OriginalArrivalTime: 03 Jun 2005 17:40:48.0018 (UTC) FILETIME=[60830F20:01C56863] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j53HhCXq008752 X-archive-position: 2036 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: john.ronciak@intel.com Precedence: bulk X-list: netdev Content-Length: 899 Lines: 23 > What more do you need other than checking the statistics counter? The > drop statistics (the ones we care about) are incremented in real time > by the ->poll() code, so it's not like we have to trigger some > asynchronous event to get a current version of the number. > I think that there is some more confusion here. I'm talking about frames dropped by the Ethernet controller at the hardware level (no descriptor available). This for example is happening now with our driver with the weight set to 64. This is also what started us looking into what was going on with the weight. I don't see how the NAPI code to dynamically adjust the weight could easily get the hardware stats number to know if frames are being dropped or not. Sorry if I caused the confusion here. 
Mitch is working on a response to Jamal's last mail trying to level set what we are seeing and doing. Cheers, John From Robert.Olsson@data.slu.se Fri Jun 3 11:10:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 11:10:15 -0700 (PDT) Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53IA3Xq010457 for ; Fri, 3 Jun 2005 11:10:04 -0700 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j53I8mkB020710; Fri, 3 Jun 2005 20:08:48 +0200 Received: by robur.slu.se (Postfix, from userid 1000) id 143BBEE3F0; Fri, 3 Jun 2005 20:08:48 +0200 (CEST) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17056.40112.39108.32685@robur.slu.se> Date: Fri, 3 Jun 2005 20:08:48 +0200 To: "Ronciak, John" Cc: "David S. Miller" , , , , "Williams, Mitch A" , , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" Subject: RE: RFC: NAPI packet weighting patch In-Reply-To: <468F3FDA28AA87429AD807992E22D07E0450BFE6@orsmsx408> References: <468F3FDA28AA87429AD807992E22D07E0450BFE6@orsmsx408> X-Mailer: VM 7.18 under Emacs 21.4.1 X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70 X-archive-position: 2038 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 1102 Lines: 25 Ronciak, John writes: > > What more do you need other than checking the statistics counter? The > > drop statistics (the ones we care about) are incremented in real time > > by the ->poll() code, so it's not like we have to trigger some > > asynchronous event to get a current version of the number. > > > > I think that there is some more confusion here. I'm talking about > frames dropped by the Ethernet controller at the hardware level (no > descriptor available). 
This for example is happening now with our > driver with the weight set to 64. This is also what started us looking > into what was going on with the weight. I don't see how the NAPI code > to dynamically adjust the weight could easily get the hardware stats > number to know if frames are being dropped or not. Sorry if I caused > the confusion here. It's not obvious that weight is to blame for frames dropped. I would look into RX ring size in relation to HW mitigation. And of course if you system is very loaded the RX softirq gives room for other jobs and frames get dropped Cheers. --ro From john.ronciak@intel.com Fri Jun 3 11:21:21 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 11:21:33 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53ILIXq011580 for ; Fri, 3 Jun 2005 11:21:21 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j53IJ3FT010376; Fri, 3 Jun 2005 18:19:03 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j53IIcdm021983; Fri, 3 Jun 2005 18:19:03 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060311190319676 ; Fri, 03 Jun 2005 11:19:03 -0700 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Fri, 3 Jun 2005 11:19:03 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: RFC: NAPI packet weighting patch Date: Fri, 3 Jun 2005 11:19:02 -0700 Message-ID: 
<468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: RFC: NAPI packet weighting patch Thread-Index: AcVoZ1hMIfRnNfjORXaqb2xRuIo6IAAAQaHw From: "Ronciak, John" To: "Robert Olsson" Cc: "David S. Miller" , , , , "Williams, Mitch A" , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" X-OriginalArrivalTime: 03 Jun 2005 18:19:03.0501 (UTC) FILETIME=[B8B9F7D0:01C56868] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j53ILIXq011580 X-archive-position: 2039 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: john.ronciak@intel.com Precedence: bulk X-list: netdev Content-Length: 584 Lines: 16 > It's not obvious that weight is to blame for frames dropped. I would > look into RX ring size in relation to HW mitigation. > And of course if you system is very loaded the RX softirq gives room > for other jobs and frames get dropped > With the same system (fairly high end with nothing major running on it) we got rid of the dropped frames by just reducing the weight from 64. So the weight did have something to do with the dropped frames. Maybe other factors as well, but in static tests like this it sure looks like the 64 value is wrong in some cases. 
Cheers, John From greearb@candelatech.com Fri Jun 3 11:34:15 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 11:34:26 -0700 (PDT) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53IYEXq012436 for ; Fri, 3 Jun 2005 11:34:15 -0700 Received: from [71.112.207.80] (pool-71-112-207-80.sttlwa.dsl-w.verizon.net [71.112.207.80]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j53J6U5I003158; Fri, 3 Jun 2005 12:06:31 -0700 Message-ID: <42A0A25C.8000503@candelatech.com> Date: Fri, 03 Jun 2005 11:33:00 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Ronciak, John" CC: Robert Olsson , "David S. Miller" , jdmason@us.ibm.com, shemminger@osdl.org, hadi@cyberus.ca, "Williams, Mitch A" , netdev@oss.sgi.com, "Venkatesan, Ganesh" , "Brandeburg, Jesse" Subject: Re: RFC: NAPI packet weighting patch References: <468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> In-Reply-To: <468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2040 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 1320 Lines: 37 Ronciak, John wrote: >> It's not obvious that weight is to blame for frames dropped. I would >> look into RX ring size in relation to HW mitigation. >> And of course if you system is very loaded the RX softirq gives room >> for other jobs and frames get dropped >> > > With the same system (fairly high end with nothing major running on it) > we got rid of the dropped frames by just reducing the weight for 64. 
So > the weight did have something to do with the dropped frames. Maybe > other factors as well, but in static tests like this it sure looks like > the 64 value is wrong is some cases. Is this implying that having the NAPI poll do less work per poll of the driver actually increases performance? I would have guessed that the opposite would be true. Maybe the poll is disabling the IRQs on the NIC for too long, or something like that? For e1000, are you using larger than the default 256 receive descriptors? I have seen that increasing these descriptors helps decrease drops by a small percentage. Have you tried increasing the netdev-backlog setting to see if that fixes the problem (while leaving the weight at the default)? What packet sizes and speeds are you using for your tests? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From davem@davemloft.net Fri Jun 3 11:39:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 11:39:33 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53IdTXq013121 for ; Fri, 3 Jun 2005 11:39:29 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeH3h-0001to-BE; Fri, 03 Jun 2005 11:38:17 -0700 Date: Fri, 03 Jun 2005 11:38:17 -0700 (PDT) Message-Id: <20050603.113817.74562842.davem@davemloft.net> To: mitch.a.williams@intel.com Cc: hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. 
Miller" In-Reply-To: References: <20050602.171812.48807872.davem@davemloft.net> <1117765954.6095.49.camel@localhost.localdomain> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2041 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1504 Lines: 33 From: Mitch Williams Date: Fri, 3 Jun 2005 10:43:32 -0700 > In my mind, there are two major problems with NAPI as it stands today. > First, at Gigabit and higher speeds, the default settings don't allow the > driver to process received packets in a timely manner. This causes > dropped packets due to lack of receive resources. Lowering the weight can > fix this, at least in a single-adapter environment. I really don't see how changing the weight can change things in the single adapter case. When we hit the quota, we just loop and process more packets. It doesn't fundamentally change anything about how the NAPI code operates. Please investigate what exactly is happening. I have a few theories. First, is it the case that with a lower weight we drop out of the loop because 'jiffies' advanced one tick? Some simple instrumentation in net/core/dev.c:net_rx_action() would show what's going on. Actually, we keep this statistic via netdev_rx_stat, so just cat /proc/net/softnet_stat to get a look at if "time_squeeze" is being incremented when dev->weight is 64 in your tests. Next, I don't think "budget" in that function is going down to zero, that's set to 300 by default. If the quota is consumed, the device is just added right back to the tail of the poll_list, and if it's the only device active we jump right back into its ->poll() routine over and over until there is no more pending work in the device or we hit the "jiffies - start_time > 1" test. 
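[Editorial sketch, not part of the original message.] The loop described above can be illustrated with a toy model. This is not the kernel code: simulate_softirq, its parameters, and the jiffies_per_packet cost model are all hypothetical names introduced for illustration. The only details taken from the message are the default budget of 300, the per-device weight, the re-polling of a single active device, and the "jiffies - start_time > 1" exit test.

```python
def simulate_softirq(pending, weight, budget=300, jiffies_per_packet=0.0):
    """Toy model of one softirq run with a single active device.

    pending            -- packets waiting in the device (hypothetical input)
    weight             -- per-poll quota, standing in for dev->weight
    budget             -- total packets allowed per run (300 per the message)
    jiffies_per_packet -- assumed processing cost, in jiffies, per packet

    Returns (processed, polls, squeezed); squeezed is True when the run
    ended with work still pending, the case counted as "time_squeeze".
    """
    elapsed = 0.0
    processed = 0
    polls = 0
    while pending > 0:
        # Exit test: budget used up, or more than one jiffy has passed.
        if budget <= 0 or elapsed > 1:
            return processed, polls, True
        # One ->poll() call handles at most `weight` packets, then the
        # device goes back on the list and is polled again.
        work = min(weight, pending, budget)
        pending -= work
        budget -= work
        processed += work
        elapsed += work * jiffies_per_packet
        polls += 1
    return processed, polls, False


if __name__ == "__main__":
    for w in (64, 16):
        print(w, simulate_softirq(100, w), simulate_softirq(200, w, jiffies_per_packet=0.01))
```

In this model a lower weight leaves the total work unchanged when nothing limits the run (100 packets are drained either way), but the time check is evaluated more often, so with a nonzero per-packet cost the run can stop after fewer packets. That is only a way to picture the first theory above; it says nothing about what the real hardware does.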
From hadi@cyberus.ca Fri Jun 3 11:44:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 11:44:03 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53IhxXq013815 for ; Fri, 3 Jun 2005 11:43:59 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1DeH8M-0003zL-Lx for netdev@oss.sgi.com; Fri, 03 Jun 2005 14:43:06 -0400 Received: from [216.209.86.2] (helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DeH8J-0005ai-51; Fri, 03 Jun 2005 14:43:03 -0400 Subject: Re: RFC: NAPI packet weighting patch From: jamal Reply-To: hadi@cyberus.ca To: Mitch Williams Cc: "David S. Miller" , "Ronciak, John" , jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, "Venkatesan, Ganesh" , "Brandeburg, Jesse" In-Reply-To: References: <468F3FDA28AA87429AD807992E22D07E0450BFDB@orsmsx408> <20050602.171812.48807872.davem@davemloft.net> <1117765954.6095.49.camel@localhost.localdomain> Content-Type: text/plain Organization: unknown Date: Fri, 03 Jun 2005 14:42:30 -0400 Message-Id: <1117824150.6071.34.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2042 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 5233 Lines: 124 On Fri, 2005-03-06 at 10:43 -0700, Mitch Williams wrote: > > On Thu, 2 Jun 2005, jamal wrote: > > > > Heres what i think i saw as a flow of events: > > Someone posted a theory that if you happen to reduce the weight > > (iirc the reduction was via a shift) then the DRR would give less CPU > > time cycle to the driver - Whats the big suprise there? thats DRR design > > intent. > > Well, that was me. Or at least I was the original poster on this thread. 
> But my theory (if you can call it that) really wasn't about CPU time. I > spent several weeks in our lab with the somewhat nebulous task of "look at > Linux performance". And what I found was, to me, counterintuitive: > reducing weight improved performance, sometimes significantly. > When you reduce the weight, the system is spending less time in the softirq processing packets before softirq yields. If this gives more opportunity to your app to run, then the performance will go up. Is this what you are seeing? > OK, well, call me a blasphemer (against whom?). > I'm not really saying > that the DRR algorithm is not real-world, but rather that NAPI as > currently implemented has some significant performance limitations. > And we need to be fair and investigate why. > In my mind, there are two major problems with NAPI as it stands today. > First, at Gigabit and higher speeds, the default settings don't allow the > driver to process received packets in a timely manner. What do you mean by timely? > This causes > dropped packets due to lack of receive resources. Lowering the weight can > fix this, at least in a single-adapter environment. > If you know your workload you could tune the weight. Additionally you could tune the softirq using nice. > Second, at 10Mbps and 100Mbps, modern processors are just too fast for the > network. The NAPI polling loop runs so much quicker than the wire speed > that only one or two packets are processed per softirq -- which > effectively puts the adapter back in interrupt mode. Because of this, you > can easily bog down a very fast box with relatively slow traffic, just due > to the massive number of interrupts generated. > Massive is an overstatement. The issue is really IO. If you process one packet in each interrupt then NAPI does add extra IO costs at "low" traffic levels. Note that this is also a known issue - reference the threads from waay back from people like Manfred Spraul and recently from the SGI folks. 
IO unfortunately hasn't kept up with CPU speeds; hardware vendors such as your company have been busy making processors faster but forgetting about IO and RAM latencies. PCI-E seems promising from what i have heard; interim PCI-E bridging to PCI-X is, from what i have heard, worse in its IO performance. > My original post (and patch) were to address the first issue. By using > the shift value on the quota, I effectively lowered the weight for every > driver in the system. Stephen sent out a patch that allowed you to > adjust each driver's weight individually. My testing has shown that, as > expected, you can achieve the same performance gain either way. > Ok, glad to hear that's resolved. > In a multiple-adapter environment, you need to adjust the weight of all > drivers together to fix the dropped packets issue. Lowering the weight on > one adapter won't help it if the other interfaces are still taking up a > lot of time in their receive loops. > > My patch gave you one knob to twiddle that would correct this issue. > Stephen's patch gave you one knob for each adapter, but now you need to > twiddle them all to see any benefit. > makes sense > The second issue currently has no fix. What is needed is a way for the > driver to request a delayed poll, possibly based on line speed. If we > could wait, say, 8 packet times before polling, we could significantly > reduce the number of interrupts the system has to deal with, at the cost > of higher latency. We haven't had time to investigate this at all, but > the need is clearly present -- we've had customer calls about this issue. > I can believe you (note it has to do with IO costs though) having seen how horrific MMIO numbers are on faster processors. Talk to Jesse, he has seen a little program from Lennert/Robert/Harald that does MMIO measurements. It seems the trend is that as CPUs get faster, IO gets more expensive in both cpu cycles as well as absolute time. 
The solution to this issue is to be found in mitigation at the moment in conjunction with NAPI. The SGI folks have made some real progress with recent patches from Davem and Michael Chan on tg3. I have been experimenting with some patches but they introduce unacceptable jitter in latency. So lets summarize it this way: This is something that needs to be resolved - but whatever solution needs to be generic. > Either way, I think the netdev community needs to look critically at NAPI, > and make some changes. I think what you call as the second issue needs a solution. Mitigation is the only generic solution at the moment. > Network performance in 2.6.12-rcWhatever is > pretty poor. 2.4.30 beats it handily, and it really shouldn't be that > way. > Are you using NAPI as well on 2.4.30? cheers, jamal From davem@davemloft.net Fri Jun 3 11:51:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 11:51:07 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53Ip3Xq014729 for ; Fri, 3 Jun 2005 11:51:03 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeHEs-0001vW-KJ; Fri, 03 Jun 2005 11:49:50 -0700 Date: Fri, 03 Jun 2005 11:49:50 -0700 (PDT) Message-Id: <20050603.114950.119242486.davem@davemloft.net> To: greearb@candelatech.com Cc: john.ronciak@intel.com, Robert.Olsson@data.slu.se, jdmason@us.ibm.com, shemminger@osdl.org, hadi@cyberus.ca, mitch.a.williams@intel.com, netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. 
Miller" In-Reply-To: <42A0A25C.8000503@candelatech.com> References: <468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> <42A0A25C.8000503@candelatech.com> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2044 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1080 Lines: 28 From: Ben Greear Date: Fri, 03 Jun 2005 11:33:00 -0700 > Is this implying that having the NAPI poll do less work per poll > of the driver actually increases performance? I would have guessed that > the opposite would be true. Exactly my thoughts as well :) > Maybe the poll is disabling the IRQs on the NIC for too long, or something > like that? In a reply I just sent out to this thread, I postulate that the jiffies check is hitting earlier with a lower weight value, a quick look at /proc/net/softnet_stat during their testing will confirm or deny this theory. It could also just be a simple bug in the dev->quota accounting somewhere. Note that, in all of this, I do not have any objections to providing a way to configure the dev->weight values. I will be applying Stephen Hemminger's patches. But I think we MUST find out the reason for the observed behavior, especially in the single-adapter case since the result is so illogical. We could find an important bug in the NAPI implementation, or learn something new about how NAPI behaves. 
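[Editorial sketch, not part of the original message.] The check suggested above, looking at "time_squeeze" in /proc/net/softnet_stat, can be done by eye with cat, but a small parser makes before/after comparisons easier. The sketch below assumes the layout used by kernels of this era: one row per CPU, whitespace-separated hexadecimal fields, with the first three being packets processed, packets dropped, and time_squeeze. The function name parse_softnet_stat and the dictionary keys are hypothetical, chosen for illustration.

```python
def parse_softnet_stat(text):
    """Parse the contents of /proc/net/softnet_stat.

    Assumes each non-empty row holds hexadecimal counters whose first
    three fields are: processed, dropped, time_squeeze.
    Returns one dict per row, in file order.
    """
    rows = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for index, line in enumerate(lines):
        fields = line.split()
        if len(fields) < 3:
            continue  # not a row in the assumed layout
        rows.append({
            "row": index,
            "processed": int(fields[0], 16),
            "dropped": int(fields[1], 16),
            "time_squeeze": int(fields[2], 16),
        })
    return rows


def squeeze_delta(before, after):
    """Per-row increase in time_squeeze between two parsed snapshots."""
    return [b["time_squeeze"] - a["time_squeeze"] for a, b in zip(before, after)]
```

A possible use is to read the file once before a test run and once after, parse both snapshots, and pass them to squeeze_delta; a nonzero entry would suggest the softirq loop was cut short on that CPU during the run, which is what the jiffies theory predicts for the lower weight.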
From fubar@us.ibm.com Fri Jun 3 11:50:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 11:50:15 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53Io8Xq014546 for ; Fri, 3 Jun 2005 11:50:08 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e35.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j53In9MK501054 for ; Fri, 3 Jun 2005 14:49:09 -0400 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j53In96g177088 for ; Fri, 3 Jun 2005 12:49:09 -0600 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j53In8aT012300 for ; Fri, 3 Jun 2005 12:49:09 -0600 Received: from death.nxdomain.ibm.com (lig32-225-151-29.us.lig-dial.ibm.com [32.225.151.29]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j53Imu67011588; Fri, 3 Jun 2005 12:48:57 -0600 Received: from death.nxdomain.ibm.com (localhost [127.0.0.1]) by death.nxdomain.ibm.com (8.12.8/8.12.8) with ESMTP id j53ImVse031367; Fri, 3 Jun 2005 11:48:51 -0700 Received: from death (fubar@localhost) by death.nxdomain.ibm.com (8.12.8/8.12.8/Submit) with ESMTP id j53ImAwZ031354; Fri, 3 Jun 2005 11:48:30 -0700 Message-Id: <200506031848.j53ImAwZ031354@death.nxdomain.ibm.com> To: netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net Subject: [PATCH 2.6.12-rc5] bonding: documentation update X-Mailer: MH-E 7.83; nmh 1.0.4; GNU Emacs 21.3.1 Date: Fri, 03 Jun 2005 11:48:09 -0700 From: Jay Vosburgh X-archive-position: 2043 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: fubar@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 57102 Lines: 1266 Documentation update: added some more configuration info, (hopefully) better examples, 
updated some out of date info, and a bonus pass through ispell to banish the "paramters." -J --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com Signed-off-by: Jay Vosburgh diff -ur linux-2.6.12-rc5/Documentation/networking/bonding.txt linux-doc/Documentation/networking/bonding.txt --- linux-2.6.12-rc5/Documentation/networking/bonding.txt 2005-06-03 11:29:04.394823672 -0700 +++ linux-doc/Documentation/networking/bonding.txt 2005-06-03 11:29:41.143237064 -0700 @@ -1,5 +1,7 @@ - Linux Ethernet Bonding Driver HOWTO + Linux Ethernet Bonding Driver HOWTO + + Latest update: 2 June 2005 Initial release : Thomas Davis Corrections, HA extensions : 2000/10/03-15 : @@ -11,15 +13,22 @@ Reorganized and updated Feb 2005 by Jay Vosburgh -Note : ------- +Introduction +============ + + The Linux bonding driver provides a method for aggregating +multiple network interfaces into a single logical "bonded" interface. +The behavior of the bonded interfaces depends upon the mode; generally +speaking, modes provide either hot standby or load balancing services. +Additionally, link integrity monitoring may be performed. -The bonding driver originally came from Donald Becker's beowulf patches for -kernel 2.0. It has changed quite a bit since, and the original tools from -extreme-linux and beowulf sites will not work with this version of the driver. + The bonding driver originally came from Donald Becker's +beowulf patches for kernel 2.0. It has changed quite a bit since, and +the original tools from extreme-linux and beowulf sites will not work +with this version of the driver. -For new versions of the driver, patches for older kernels and the updated -userspace tools, please follow the links at the end of this file. + For new versions of the driver, updated userspace tools, and +who to ask for help, please follow the links at the end of this file. Table of Contents ================= @@ -30,9 +39,13 @@ 3. 
Configuring Bonding Devices 3.1 Configuration with sysconfig support +3.1.1 Using DHCP with sysconfig +3.1.2 Configuring Multiple Bonds with sysconfig 3.2 Configuration with initscripts support +3.2.1 Using DHCP with initscripts +3.2.2 Configuring Multiple Bonds with initscripts 3.3 Configuring Bonding Manually -3.4 Configuring Multiple Bonds +3.3.1 Configuring Multiple Bonds Manually 5. Querying Bonding Configuration 5.1 Bonding Configuration @@ -56,21 +69,28 @@ 11. Promiscuous mode -12. High Availability Information +12. Configuring Bonding for High Availability 12.1 High Availability in a Single Switch Topology -12.1.1 Bonding Mode Selection for Single Switch Topology -12.1.2 Link Monitoring for Single Switch Topology 12.2 High Availability in a Multiple Switch Topology -12.2.1 Bonding Mode Selection for Multiple Switch Topology -12.2.2 Link Monitoring for Multiple Switch Topology -12.3 Switch Behavior Issues for High Availability +12.2.1 HA Bonding Mode Selection for Multiple Switch Topology +12.2.2 HA Link Monitoring for Multiple Switch Topology + +13. Configuring Bonding for Maximum Throughput +13.1 Maximum Throughput in a Single Switch Topology +13.1.1 MT Bonding Mode Selection for Single Switch Topology +13.1.2 MT Link Monitoring for Single Switch Topology +13.2 Maximum Throughput in a Multiple Switch Topology +13.2.1 MT Bonding Mode Selection for Multiple Switch Topology +13.2.2 MT Link Monitoring for Multiple Switch Topology -13. Hardware Specific Considerations -13.1 IBM BladeCenter +14. Switch Behavior Issues -14. Frequently Asked Questions +15. Hardware Specific Considerations +15.1 IBM BladeCenter -15. Resources and Links +16. Frequently Asked Questions + +17. Resources and Links 1. 
Bonding Driver Installation @@ -86,16 +106,10 @@ 1.1 Configure and build the kernel with bonding ----------------------------------------------- - The latest version of the bonding driver is available in the + The current version of the bonding driver is available in the drivers/net/bonding subdirectory of the most recent kernel source -(which is available on http://kernel.org). - - Prior to the 2.4.11 kernel, the bonding driver was maintained -largely outside the kernel tree; patches for some earlier kernels are -available on the bonding sourceforge site, although those patches are -still several years out of date. Most users will want to use either -the most recent kernel from kernel.org or whatever kernel came with -their distro. +(which is available on http://kernel.org). Most users "rolling their +own" will want to use the most recent kernel from kernel.org. Configure kernel with "make menuconfig" (or "make xconfig" or "make config"), then select "Bonding driver support" in the "Network @@ -103,8 +117,8 @@ driver as module since it is currently the only way to pass parameters to the driver or configure more than one bonding device. - Build and install the new kernel and modules, then proceed to -step 2. + Build and install the new kernel and modules, then continue +below to install ifenslave. 1.2 Install ifenslave Control Utility ------------------------------------- @@ -147,9 +161,9 @@ Options for the bonding driver are supplied as parameters to the bonding module at load time. They may be given as command line arguments to the insmod or modprobe command, but are usually specified -in either the /etc/modprobe.conf configuration file, or in a -distro-specific configuration file (some of which are detailed in the -next section). +in either the /etc/modules.conf or /etc/modprobe.conf configuration +file, or in a distro-specific configuration file (some of which are +detailed in the next section). The available bonding driver parameters are listed below. 
If a parameter is not specified the default value is used. When initially @@ -162,34 +176,34 @@ support at least miimon, so there is really no reason not to use it. Options with textual values will accept either the text name - or, for backwards compatibility, the option value. E.g., - "mode=802.3ad" and "mode=4" set the same mode. +or, for backwards compatibility, the option value. E.g., +"mode=802.3ad" and "mode=4" set the same mode. The parameters are as follows: arp_interval - Specifies the ARP monitoring frequency in milli-seconds. If - ARP monitoring is used in a load-balancing mode (mode 0 or 2), - the switch should be configured in a mode that evenly - distributes packets across all links - such as round-robin. If - the switch is configured to distribute the packets in an XOR + Specifies the ARP link monitoring frequency in milliseconds. + If ARP monitoring is used in an etherchannel compatible mode + (modes 0 and 2), the switch should be configured in a mode + that evenly distributes packets across all links. If the + switch is configured to distribute the packets in an XOR fashion, all replies from the ARP targets will be received on the same link which could cause the other team members to - fail. ARP monitoring should not be used in conjunction with - miimon. A value of 0 disables ARP monitoring. The default + fail. ARP monitoring should not be used in conjunction with + miimon. A value of 0 disables ARP monitoring. The default value is 0. arp_ip_target - Specifies the ip addresses to use when arp_interval is > 0. - These are the targets of the ARP request sent to determine the - health of the link to the targets. Specify these values in - ddd.ddd.ddd.ddd format. Multiple ip adresses must be - seperated by a comma. At least one IP address must be given - for ARP monitoring to function. The maximum number of targets - that can be specified is 16. The default value is no IP - addresses. 
+ Specifies the IP addresses to use as ARP monitoring peers when + arp_interval is > 0. These are the targets of the ARP request + sent to determine the health of the link to the targets. + Specify these values in ddd.ddd.ddd.ddd format. Multiple IP + addresses must be separated by a comma. At least one IP + address must be given for ARP monitoring to function. The + maximum number of targets that can be specified is 16. The + default value is no IP addresses. downdelay @@ -207,11 +221,13 @@ are: slow or 0 - Request partner to transmit LACPDUs every 30 seconds (default) + Request partner to transmit LACPDUs every 30 seconds fast or 1 Request partner to transmit LACPDUs every 1 second + The default is slow. + max_bonds Specifies the number of bonding devices to create for this @@ -221,10 +237,11 @@ miimon - Specifies the frequency in milli-seconds that MII link - monitoring will occur. A value of zero disables MII link - monitoring. A value of 100 is a good starting point. The - use_carrier option, below, affects how the link state is + Specifies the MII link monitoring frequency in milliseconds. + This determines how often the link state of each slave is + inspected for link failures. A value of zero disables MII + link monitoring. A value of 100 is a good starting point. + The use_carrier option, below, affects how the link state is determined. See the High Availability section for additional information. The default value is 0. @@ -270,7 +287,7 @@ duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification. - Pre-requisites: + Prerequisites: 1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave. @@ -333,7 +350,7 @@ When a link is reconnected or a new slave joins the bond the receive traffic is redistributed among all - active slaves in the bond by intiating ARP Replies + active slaves in the bond by initiating ARP Replies with the selected mac address to each of the clients. 
The updelay parameter (detailed below) must be set to a value equal or greater than the switch's @@ -448,8 +465,9 @@ slave devices. On SLES 9, this is most easily done by running the yast2 sysconfig configuration utility. The goal is for to create an ifcfg-id file for each slave device. The simplest way to accomplish -this is to configure the devices for DHCP. The name of the -configuration file for each device will be of the form: +this is to configure the devices for DHCP (this is only to get the +file ifcfg-id file created; see below for some issues with DHCP). The +name of the configuration file for each device will be of the form: ifcfg-id-xx:xx:xx:xx:xx:xx @@ -459,7 +477,7 @@ Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been created, it is necessary to edit the configuration files for the slave devices (the MAC addresses correspond to those of the slave devices). -Before editing, the file will contain muliple lines, and will look +Before editing, the file will contain multiple lines, and will look something like this: BOOTPROTO='dhcp' @@ -501,11 +519,6 @@ Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK values with the appropriate values for your network. - Note that configuring the bonding device with BOOTPROTO='dhcp' -does not work; the scripts attempt to obtain the device address from -DHCP prior to adding any of the slave devices. Without active slaves, -the DHCP requests are not sent to the network. - The STARTMODE specifies when the device is brought online. The possible values are: @@ -544,7 +557,7 @@ Note that the network control script (/sbin/ifdown) will remove the bonding module as part of the network shutdown processing, so it is not necessary to remove the module by hand if, e.g., the -module paramters have changed. +module parameters have changed. 
Also, at this writing, YaST/YaST2 will not manage bonding devices (they do not show bonding interfaces on its list of network @@ -559,12 +572,37 @@ Note that the template does not document the various BONDING_ settings described above, but does describe many of the other options. +3.1.1 Using DHCP with sysconfig +------------------------------- + + Under sysconfig, configuring a device with BOOTPROTO='dhcp' +will cause it to query DHCP for its IP address information. At this +writing, this does not function for bonding devices; the scripts +attempt to obtain the device address from DHCP prior to adding any of +the slave devices. Without active slaves, the DHCP requests are not +sent to the network. + +3.1.2 Configuring Multiple Bonds with sysconfig +----------------------------------------------- + + The sysconfig network initialization system is capable of +handling multiple bonding devices. All that is necessary is for each +bonding instance to have an appropriately configured ifcfg-bondX file +(as described above). Do not specify the "max_bonds" parameter to any +instance of bonding, as this will confuse sysconfig. If you require +multiple bonding devices with identical parameters, create multiple +ifcfg-bondX files. + + Because the sysconfig scripts supply the bonding module +options in the ifcfg-bondX file, it is not necessary to add them to +the system /etc/modules.conf or /etc/modprobe.conf configuration file. + 3.2 Configuration with initscripts support ------------------------------------------ This section applies to distros using a version of initscripts with bonding support, for example, Red Hat Linux 9 or Red Hat -Enterprise Linux version 3. On these systems, the network +Enterprise Linux version 3 or 4. On these systems, the network initialization scripts have some knowledge of bonding, and can be configured to control bonding devices. 
@@ -614,10 +652,11 @@ Be sure to change the networking specific lines (IPADDR, NETMASK, NETWORK and BROADCAST) to match your network configuration. - Finally, it is necessary to edit /etc/modules.conf to load the -bonding module when the bond0 interface is brought up. The following -sample lines in /etc/modules.conf will load the bonding module, and -select its options: + Finally, it is necessary to edit /etc/modules.conf (or +/etc/modprobe.conf, depending upon your distro) to load the bonding +module with your desired options when the bond0 interface is brought +up. The following lines in /etc/modules.conf (or modprobe.conf) will +load the bonding module, and select its options: alias bond0 bonding options bond0 mode=balance-alb miimon=100 @@ -629,6 +668,33 @@ will restart the networking subsystem and your bond link should be now up and running. +3.2.1 Using DHCP with initscripts +--------------------------------- + + Recent versions of initscripts (the version supplied with +Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do +have support for assigning IP information to bonding devices via DHCP. + + To configure bonding for DHCP, configure it as described +above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp" +and add a line consisting of "TYPE=Bonding". Note that the TYPE value +is case sensitive. + +3.2.2 Configuring Multiple Bonds with initscripts +------------------------------------------------- + + At this writing, the initscripts package does not directly +support loading the bonding driver multiple times, so the process for +doing so is the same as described in the "Configuring Multiple Bonds +Manually" section, below. + + NOTE: It has been observed that some Red Hat supplied kernels +are apparently unable to rename modules at load time (the "-obonding1" +part). Attempts to pass that option to modprobe will produce an +"Operation not permitted" error. 
This has been reported on some +Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels +exhibiting this problem, it will be impossible to configure multiple +bonds with differing parameters. 3.3 Configuring Bonding Manually -------------------------------- @@ -638,10 +704,11 @@ knowledge of bonding. One such distro is SuSE Linux Enterprise Server version 8. - The general methodology for these systems is to place the -bonding module parameters into /etc/modprobe.conf, then add modprobe -and/or ifenslave commands to the system's global init script. The -name of the global init script differs; for sysconfig, it is + The general method for these systems is to place the bonding +module parameters into /etc/modules.conf or /etc/modprobe.conf (as +appropriate for the installed distro), then add modprobe and/or +ifenslave commands to the system's global init script. The name of +the global init script differs; for sysconfig, it is /etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local. For example, if you wanted to make a simple bond of two e100 @@ -649,7 +716,7 @@ reboots, edit the appropriate file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the following: -modprobe bonding -obond0 mode=balance-alb miimon=100 +modprobe bonding mode=balance-alb miimon=100 modprobe e100 ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up ifenslave bond0 eth0 @@ -657,11 +724,7 @@ Replace the example bonding module parameters and bond0 network configuration (IP address, netmask, etc) with the appropriate -values for your configuration. The above example loads the bonding -module with the name "bond0," this simplifies the naming if multiple -bonding modules are loaded (each successive instance of the module is -given a different name, and the module instance names match the -bonding interface names). +values for your configuration. Unfortunately, this method will not provide support for the ifup and ifdown scripts on the bond devices. 
To reload the bonding @@ -684,20 +747,23 @@ the following: # ifconfig bond0 down -# rmmod bond0 +# rmmod bonding # rmmod e100 Again, for convenience, it may be desirable to create a script with these commands. -3.4 Configuring Multiple Bonds ------------------------------- +3.3.1 Configuring Multiple Bonds Manually +----------------------------------------- This section contains information on configuring multiple -bonding devices with differing options. If you require multiple -bonding devices, but all with the same options, see the "max_bonds" -module paramter, documented above. +bonding devices with differing options for those systems whose network +initialization scripts lack support for configuring multiple bonds. + + If you require multiple bonding devices, but all with the same +options, you may wish to use the "max_bonds" module parameter, +documented above. To create multiple bonding devices with differing options, it is necessary to load the bonding driver multiple times. Note that @@ -724,11 +790,16 @@ miimon of 100. The second instance is named "bond1" and creates the bond1 device in balance-alb mode with an miimon of 50. + In some circumstances (typically with older distributions), +the above does not work, and the second bonding instance never sees +its options. In that case, the second options line can be substituted +as follows: + +install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50 + This may be repeated any number of times, specifying a new and -unique name in place of bond0 or bond1 for each instance. +unique name in place of bond1 for each subsequent instance. - When the appropriate module paramters are in place, then -configure bonding according to the instructions for your distro. 5. Querying Bonding Configuration ================================= @@ -846,8 +917,8 @@ self generated packets. 
For reasons of simplicity, and to support the use of adapters -that can do VLAN hardware acceleration offloding, the bonding -interface declares itself as fully hardware offloaing capable, it gets +that can do VLAN hardware acceleration offloading, the bonding +interface declares itself as fully hardware offloading capable, it gets the add_vid/kill_vid notifications to gather the necessary information, and it propagates those actions to the slaves. In case of mixed adapter types, hardware accelerated tagged packets that @@ -880,7 +951,7 @@ matches the hardware address of the VLAN interfaces. Note that changing a VLAN interface's HW address would set the -underlying device -- i.e. the bonding interface -- to promiscouos +underlying device -- i.e. the bonding interface -- to promiscuous mode, which might not be what you want. @@ -923,7 +994,7 @@ an additional target (or several) increases the reliability of the ARP monitoring. - Multiple ARP targets must be seperated by commas as follows: + Multiple ARP targets must be separated by commas as follows: # example options for ARP monitoring with three targets alias bond0 bonding @@ -1045,7 +1116,7 @@ This will, when loading the bonding module, rather than performing the normal action, instead execute the provided command. This command loads the device drivers in the order needed, then calls -modprobe with --ingore-install to cause the normal action to then take +modprobe with --ignore-install to cause the normal action to then take place. Full documentation on this can be found in the modprobe.conf and modprobe manual pages. @@ -1130,14 +1201,14 @@ common to enable promiscuous mode on the device, so that all traffic is seen (instead of seeing only traffic destined for the local host). The bonding driver handles promiscuous mode changes to the bonding -master device (e.g., bond0), and propogates the setting to the slave +master device (e.g., bond0), and propagates the setting to the slave devices. 
For the balance-rr, balance-xor, broadcast, and 802.3ad modes, -the promiscuous mode setting is propogated to all slaves. +the promiscuous mode setting is propagated to all slaves. For the active-backup, balance-tlb and balance-alb modes, the -promiscuous mode setting is propogated only to the active slave. +promiscuous mode setting is propagated only to the active slave. For balance-tlb mode, the active slave is the slave currently receiving inbound traffic. @@ -1148,46 +1219,182 @@ For the active-backup, balance-tlb and balance-alb modes, when the active slave changes (e.g., due to a link failure), the -promiscuous setting will be propogated to the new active slave. +promiscuous setting will be propagated to the new active slave. -12. High Availability Information -================================= +12. Configuring Bonding for High Availability +============================================= High Availability refers to configurations that provide maximum network availability by having redundant or backup devices, -links and switches between the host and the rest of the world. - - There are currently two basic methods for configuring to -maximize availability. They are dependent on the network topology and -the primary goal of the configuration, but in general, a configuration -can be optimized for maximum available bandwidth, or for maximum -network availability. +links or switches between the host and the rest of the world. The +goal is to provide the maximum availability of network connectivity +(i.e., the network always works), even though other configurations +could provide higher throughput. 
12.1 High Availability in a Single Switch Topology -------------------------------------------------- - If two hosts (or a host and a switch) are directly connected -via multiple physical links, then there is no network availability -penalty for optimizing for maximum bandwidth: there is only one switch -(or peer), so if it fails, you have no alternative access to fail over -to. - -Example 1 : host to switch (or other host) - - +----------+ +----------+ - | |eth0 eth0| switch | - | Host A +--------------------------+ or | - | +--------------------------+ other | - | |eth1 eth1| host | - +----------+ +----------+ + If two hosts (or a host and a single switch) are directly +connected via multiple physical links, then there is no availability +penalty to optimizing for maximum bandwidth. In this case, there is +only one switch (or peer), so if it fails, there is no alternative +access to fail over to. Additionally, the bonding load balance modes +support link monitoring of their members, so if individual links fail, +the load will be rebalanced across the remaining devices. + + See Section 13, "Configuring Bonding for Maximum Throughput" +for information on configuring bonding with one peer device. +12.2 High Availability in a Multiple Switch Topology +---------------------------------------------------- -12.1.1 Bonding Mode Selection for single switch topology --------------------------------------------------------- + With multiple switches, the configuration of bonding and the +network changes dramatically. In multiple switch topologies, there is +a trade off between network availability and usable bandwidth. 
+ + Below is a sample network, configured to maximize the +availability of the network: + + | | + |port3 port3| + +-----+----+ +-----+----+ + | |port2 ISL port2| | + | switch A +--------------------------+ switch B | + | | | | + +-----+----+ +-----++---+ + |port1 port1| + | +-------+ | + +-------------+ host1 +---------------+ + eth0 +-------+ eth1 + + In this configuration, there is a link between the two +switches (ISL, or inter switch link), and multiple ports connecting to +the outside world ("port3" on each switch). There is no technical +reason that this could not be extended to a third switch. + +12.2.1 HA Bonding Mode Selection for Multiple Switch Topology +------------------------------------------------------------- + + In a topology such as the example above, the active-backup and +broadcast modes are the only useful bonding modes when optimizing for +availability; the other modes require all links to terminate on the +same peer for them to behave rationally. + +active-backup: This is generally the preferred mode, particularly if + the switches have an ISL and play together well. If the + network configuration is such that one switch is specifically + a backup switch (e.g., has lower capacity, higher cost, etc), + then the primary option can be used to insure that the + preferred link is always used when it is available. + +broadcast: This mode is really a special purpose mode, and is suitable + only for very specific needs. For example, if the two + switches are not connected (no ISL), and the networks beyond + them are totally independent. In this case, if it is + necessary for some specific one-way traffic to reach both + independent networks, then the broadcast mode may be suitable. + +12.2.2 HA Link Monitoring Selection for Multiple Switch Topology +---------------------------------------------------------------- + + The choice of link monitoring ultimately depends upon your +switch. 
If the switch can reliably fail ports in response to other +failures, then either the MII or ARP monitors should work. For +example, in the above example, if the "port3" link fails at the remote +end, the MII monitor has no direct means to detect this. The ARP +monitor could be configured with a target at the remote end of port3, +thus detecting that failure without switch support. + + In general, however, in a multiple switch topology, the ARP +monitor can provide a higher level of reliability in detecting end to +end connectivity failures (which may be caused by the failure of any +individual component to pass traffic for any reason). Additionally, +the ARP monitor should be configured with multiple targets (at least +one for each switch in the network). This will insure that, +regardless of which switch is active, the ARP monitor has a suitable +target to query. + + +13. Configuring Bonding for Maximum Throughput +============================================== + +13.1 Maximizing Throughput in a Single Switch Topology +------------------------------------------------------ + + In a single switch configuration, the best method to maximize +throughput depends upon the application and network environment. The +various load balancing modes each have strengths and weaknesses in +different environments, as detailed below. + + For this discussion, we will break down the topologies into +two categories. Depending upon the destination of most traffic, we +categorize them into either "gatewayed" or "local" configurations. + + In a gatewayed configuration, the "switch" is acting primarily +as a router, and the majority of traffic passes through this router to +other networks. 
An example would be the following: + + + +----------+ +----------+ + | |eth0 port1| | to other networks + | Host A +---------------------+ router +-------------------> + | +---------------------+ | Hosts B and C are out + | |eth1 port2| | here somewhere + +----------+ +----------+ + + The router may be a dedicated router device, or another host +acting as a gateway. For our discussion, the important point is that +the majority of traffic from Host A will pass through the router to +some other network before reaching its final destination. + + In a gatewayed network configuration, although Host A may +communicate with many other systems, all of its traffic will be sent +and received via one other peer on the local network, the router. + + Note that the case of two systems connected directly via +multiple physical links is, for purposes of configuring bonding, the +same as a gatewayed configuration. In that case, it happens that all +traffic is destined for the "gateway" itself, not some other network +beyond the gateway. + + In a local configuration, the "switch" is acting primarily as +a switch, and the majority of traffic passes through this switch to +reach other stations on the same network. An example would be the +following: + + +----------+ +----------+ +--------+ + | |eth0 port1| +-------+ Host B | + | Host A +------------+ switch |port3 +--------+ + | +------------+ | +--------+ + | |eth1 port2| +------------------+ Host C | + +----------+ +----------+port4 +--------+ + + + Again, the switch may be a dedicated switch device, or another +host acting as a gateway. For our discussion, the important point is +that the majority of traffic from Host A is destined for other hosts +on the same local network (Hosts B and C in the above example). + + In summary, in a gatewayed configuration, traffic to and from +the bonded device will be to the same MAC level peer on the network +(the gateway itself, i.e., the router), regardless of its final +destination. 
In a local configuration, traffic flows directly to and +from the final destinations, thus, each destination (Host B, Host C) +will be addressed directly by their individual MAC addresses. + + This distinction between a gatewayed and a local network +configuration is important because many of the load balancing modes +available use the MAC addresses of the local network source and +destination to make load balancing decisions. The behavior of each +mode is described below. + + +13.1.1 MT Bonding Mode Selection for Single Switch Topology +----------------------------------------------------------- This configuration is the easiest to set up and to understand, although you will have to decide which bonding mode best suits your -needs. The tradeoffs for each mode are detailed below: +needs. The trade offs for each mode are detailed below: balance-rr: This mode is the only mode that will permit a single TCP/IP connection to stripe traffic across multiple @@ -1206,6 +1413,23 @@ interface's worth of throughput, even after adjusting tcp_reordering. + Note that this out of order delivery occurs when both the + sending and receiving systems are utilizing a multiple + interface bond. Consider a configuration in which a + balance-rr bond feeds into a single higher capacity network + channel (e.g., multiple 100Mb/sec ethernets feeding a single + gigabit ethernet via an etherchannel capable switch). In this + configuration, traffic sent from the multiple 100Mb devices to + a destination connected to the gigabit device will not see + packets out of order. However, traffic sent from the gigabit + device to the multiple 100Mb devices may or may not see + traffic out of order, depending upon the balance policy of the + switch. Many switches do not support any modes that stripe + traffic (instead choosing a port based upon IP or MAC level + addresses); for those devices, traffic flowing from the + gigabit device to the many 100Mb devices will only utilize one + interface. 
+ If you are utilizing protocols other than TCP/IP, UDP for example, and your application can tolerate out of order delivery, then this mode can allow for single stream datagram @@ -1220,16 +1444,21 @@ connected to the same peer as the primary. In this case, a load balancing mode (with link monitoring) will provide the same level of network availability, but with increased - available bandwidth. On the plus side, it does not require - any configuration of the switch. + available bandwidth. On the plus side, active-backup mode + does not require any configuration of the switch, so it may + have value if the hardware available does not support any of + the load balance modes. balance-xor: This mode will limit traffic such that packets destined for specific peers will always be sent over the same interface. Since the destination is determined by the MAC - addresses involved, this may be desirable if you have a large - network with many hosts. It is likely to be suboptimal if all - your traffic is passed through a single router, however. As - with balance-rr, the switch ports need to be configured for + addresses involved, this mode works best in a "local" network + configuration (as described above), with destinations all on + the same local network. This mode is likely to be suboptimal + if all your traffic is passed through a single router (i.e., a + "gatewayed" network configuration, as described above). + + As with balance-rr, the switch ports need to be configured for "etherchannel" or "trunking." broadcast: Like active-backup, there is not much advantage to this @@ -1241,122 +1470,128 @@ protocol includes automatic configuration of the aggregates, so minimal manual configuration of the switch is needed (typically only to designate that some set of devices is - usable for 802.3ad). The 802.3ad standard also mandates that - frames be delivered in order (within certain limits), so in - general single connections will not see misordering of + available for 802.3ad). 
The 802.3ad standard also mandates + that frames be delivered in order (within certain limits), so + in general single connections will not see misordering of packets. The 802.3ad mode does have some drawbacks: the standard mandates that all devices in the aggregate operate at the same speed and duplex. Also, as with all bonding load balance modes other than balance-rr, no single connection will be able to utilize more than a single interface's worth of - bandwidth. Additionally, the linux bonding 802.3ad - implementation distributes traffic by peer (using an XOR of - MAC addresses), so in general all traffic to a particular - destination will use the same interface. Finally, the 802.3ad - mode mandates the use of the MII monitor, therefore, the ARP - monitor is not available in this mode. - -balance-tlb: This mode is also a good choice for this type of - topology. It has no special switch configuration - requirements, and balances outgoing traffic by peer, in a - vaguely intelligent manner (not a simple XOR as in balance-xor - or 802.3ad mode), so that unlucky MAC addresses will not all - "bunch up" on a single interface. Interfaces may be of - differing speeds. On the down side, in this mode all incoming - traffic arrives over a single interface, this mode requires - certain ethtool support in the network device driver of the - slave interfaces, and the ARP monitor is not available. - -balance-alb: This mode is everything that balance-tlb is, and more. It - has all of the features (and restrictions) of balance-tlb, and - will also balance incoming traffic from peers (as described in - the Bonding Module Options section, above). The only extra - down side to this mode is that the network device driver must - support changing the hardware address while the device is - open. + bandwidth. 
-12.1.2 Link Monitoring for Single Switch Topology -------------------------------------------------- + Additionally, the linux bonding 802.3ad implementation + distributes traffic by peer (using an XOR of MAC addresses), + so in a "gatewayed" configuration, all outgoing traffic will + generally use the same device. Incoming traffic may also end + up on a single device, but that is dependent upon the + balancing policy of the peer's 802.3ad implementation. In a + "local" configuration, traffic will be distributed across the + devices in the bond. + + Finally, the 802.3ad mode mandates the use of the MII monitor, + therefore, the ARP monitor is not available in this mode. + +balance-tlb: The balance-tlb mode balances outgoing traffic by peer. + Since the balancing is done according to MAC address, in a + "gatewayed" configuration (as described above), this mode will + send all traffic across a single device. However, in a + "local" network configuration, this mode balances multiple + local network peers across devices in a vaguely intelligent + manner (not a simple XOR as in balance-xor or 802.3ad mode), + so that mathematically unlucky MAC addresses (i.e., ones that + XOR to the same value) will not all "bunch up" on a single + interface. + + Unlike 802.3ad, interfaces may be of differing speeds, and no + special switch configuration is required. On the down side, + in this mode all incoming traffic arrives over a single + interface, this mode requires certain ethtool support in the + network device driver of the slave interfaces, and the ARP + monitor is not available. + +balance-alb: This mode is everything that balance-tlb is, and more. + It has all of the features (and restrictions) of balance-tlb, + and will also balance incoming traffic from local network + peers (as described in the Bonding Module Options section, + above). 
+ + The only additional down side to this mode is that the network + device driver must support changing the hardware address while + the device is open. + +13.1.2 MT Link Monitoring for Single Switch Topology +---------------------------------------------------- The choice of link monitoring may largely depend upon which mode you choose to use. The more advanced load balancing modes do not support the use of the ARP monitor, and are thus restricted to using -the MII monitor (which does not provide as high a level of assurance -as the ARP monitor). - - -12.2 High Availability in a Multiple Switch Topology ----------------------------------------------------- - - With multiple switches, the configuration of bonding and the -network changes dramatically. In multiple switch topologies, there is -a tradeoff between network availability and usable bandwidth. - - Below is a sample network, configured to maximize the -availability of the network: - - | | - |port3 port3| - +-----+----+ +-----+----+ - | |port2 ISL port2| | - | switch A +--------------------------+ switch B | - | | | | - +-----+----+ +-----++---+ - |port1 port1| - | +-------+ | - +-------------+ host1 +---------------+ - eth0 +-------+ eth1 - - In this configuration, there is a link between the two -switches (ISL, or inter switch link), and multiple ports connecting to -the outside world ("port3" on each switch). There is no technical -reason that this could not be extended to a third switch. +the MII monitor (which does not provide as high a level of end to end +assurance as the ARP monitor). 
-12.2.1 Bonding Mode Selection for Multiple Switch Topology ----------------------------------------------------------- +13.2 Maximum Throughput in a Multiple Switch Topology +----------------------------------------------------- - In a topology such as this, the active-backup and broadcast -modes are the only useful bonding modes; the other modes require all -links to terminate on the same peer for them to behave rationally. - -active-backup: This is generally the preferred mode, particularly if - the switches have an ISL and play together well. If the - network configuration is such that one switch is specifically - a backup switch (e.g., has lower capacity, higher cost, etc), - then the primary option can be used to insure that the - preferred link is always used when it is available. + Multiple switches may be utilized to optimize for throughput +when they are configured in parallel as part of an isolated network +between two or more systems, for example: + + +-----------+ + | Host A | + +-+---+---+-+ + | | | + +--------+ | +---------+ + | | | + +------+---+ +-----+----+ +-----+----+ + | Switch A | | Switch B | | Switch C | + +------+---+ +-----+----+ +-----+----+ + | | | + +--------+ | +---------+ + | | | + +-+---+---+-+ + | Host B | + +-----------+ + + In this configuration, the switches are isolated from one +another. One reason to employ a topology such as this is for an +isolated network with many hosts (a cluster configured for high +performance, for example), using multiple smaller switches can be more +cost effective than a single larger switch, e.g., on a network with 24 +hosts, three 24 port switches can be significantly less expensive than +a single 72 port switch. + + If access beyond the network is required, an individual host +can be equipped with an additional network device connected to an +external network; this host then additionally acts as a gateway. 
-broadcast: This mode is really a special purpose mode, and is suitable - only for very specific needs. For example, if the two - switches are not connected (no ISL), and the networks beyond - them are totally independant. In this case, if it is - necessary for some specific one-way traffic to reach both - independent networks, then the broadcast mode may be suitable. - -12.2.2 Link Monitoring Selection for Multiple Switch Topology +13.2.1 MT Bonding Mode Selection for Multiple Switch Topology ------------------------------------------------------------- - The choice of link monitoring ultimately depends upon your -switch. If the switch can reliably fail ports in response to other -failures, then either the MII or ARP monitors should work. For -example, in the above example, if the "port3" link fails at the remote -end, the MII monitor has no direct means to detect this. The ARP -monitor could be configured with a target at the remote end of port3, -thus detecting that failure without switch support. - - In general, however, in a multiple switch topology, the ARP -monitor can provide a higher level of reliability in detecting link -failures. Additionally, it should be configured with multiple targets -(at least one for each switch in the network). This will insure that, -regardless of which switch is active, the ARP monitor has a suitable -target to query. - + In actual practice, the bonding mode typically employed in +configurations of this type is balance-rr. Historically, in this +network configuration, the usual caveats about out of order packet +delivery are mitigated by the use of network adapters that do not do +any kind of packet coalescing (via the use of NAPI, or because the +device itself does not generate interrupts until some number of +packets has arrived). When employed in this fashion, the balance-rr +mode allows individual connections between two hosts to effectively +utilize greater than one interface's bandwidth. 
+ +13.2.2 MT Link Monitoring for Multiple Switch Topology +------------------------------------------------------ + + Again, in actual practice, the MII monitor is most often used +in this configuration, as performance is given preference over +availability. The ARP monitor will function in this topology, but its +advantages over the MII monitor are mitigated by the volume of probes +needed as the number of systems involved grows (remember that each +host in the network is configured with bonding). -12.3 Switch Behavior Issues for High Availability -------------------------------------------------- +14. Switch Behavior Issues +-------------------------- - You may encounter issues with the timing of link up and down -reporting by the switch. + Some switches exhibit undesirable behavior with regard to the +timing of link up and down reporting by the switch. First, when a link comes up, some switches may indicate that the link is up (carrier available), but not pass traffic over the @@ -1370,30 +1605,31 @@ Second, some switches may "bounce" the link state one or more times while a link is changing state. This occurs most commonly while the switch is initializing. Again, an appropriate updelay value may -help, but note that if all links are down, then updelay is ignored -when any link becomes active (the slave closest to completing its -updelay is chosen). +help. Note that when a bonding interface has no active links, the -driver will immediately reuse the first link that goes up, even if -updelay parameter was specified. If there are slave interfaces -waiting for the updelay timeout to expire, the interface that first -went into that state will be immediately reused. This reduces down -time of the network if the value of updelay has been overestimated. +driver will immediately reuse the first link that goes up, even if the +updelay parameter has been specified (the updelay is ignored in this +case). 
If there are slave interfaces waiting for the updelay timeout +to expire, the interface that first went into that state will be +immediately reused. This reduces down time of the network if the +value of updelay has been overestimated, and since this occurs only in +cases with no connectivity, there is no additional penalty for +ignoring the updelay. In addition to the concerns about switch timings, if your switches take a long time to go into backup mode, it may be desirable to not activate a backup interface immediately after a link goes down. Failover may be delayed via the downdelay bonding module option. -13. Hardware Specific Considerations +15. Hardware Specific Considerations ==================================== This section contains additional information for configuring bonding on specific hardware platforms, or for interfacing bonding with particular switches or other devices. -13.1 IBM BladeCenter +15.1 IBM BladeCenter -------------------- This applies to the JS20 and similar systems. @@ -1407,12 +1643,12 @@ -------------------------------- All JS20s come with two Broadcom Gigabit Ethernet ports -integrated on the planar. In the BladeCenter chassis, the eth0 port -of all JS20 blades is hard wired to I/O Module #1; similarly, all eth1 -ports are wired to I/O Module #2. An add-on Broadcom daughter card -can be installed on a JS20 to provide two more Gigabit Ethernet ports. -These ports, eth2 and eth3, are wired to I/O Modules 3 and 4, -respectively. +integrated on the planar (that's "motherboard" in IBM-speak). In the +BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to +I/O Module #1; similarly, all eth1 ports are wired to I/O Module #2. +An add-on Broadcom daughter card can be installed on a JS20 to provide +two more Gigabit Ethernet ports. These ports, eth2 and eth3, are +wired to I/O Modules 3 and 4, respectively. 
Each I/O Module may contain either a switch or a passthrough module (which allows ports to be directly connected to an external @@ -1432,29 +1668,30 @@ of ways, this discussion will be confined to describing basic configurations. - Normally, Ethernet Switch Modules (ESM) are used in I/O + Normally, Ethernet Switch Modules (ESMs) are used in I/O modules 1 and 2. In this configuration, the eth0 and eth1 ports of a JS20 will be connected to different internal switches (in the respective I/O modules). - An optical passthru module (OPM) connects the I/O module -directly to an external switch. By using OPMs in I/O module #1 and -#2, the eth0 and eth1 interfaces of a JS20 can be redirected to the -outside world and connected to a common external switch. - - Depending upon the mix of ESM and OPM modules, the network -will appear to bonding as either a single switch topology (all OPM -modules) or as a multiple switch topology (one or more ESM modules, -zero or more OPM modules). It is also possible to connect ESM modules -together, resulting in a configuration much like the example in "High -Availability in a multiple switch topology." - -Requirements for specifc modes ------------------------------- - - The balance-rr mode requires the use of OPM modules for -devices in the bond, all connected to an common external switch. That -switch must be configured for "etherchannel" or "trunking" on the + A passthrough module (OPM or CPM, optical or copper, +passthrough module) connects the I/O module directly to an external +switch. By using PMs in I/O module #1 and #2, the eth0 and eth1 +interfaces of a JS20 can be redirected to the outside world and +connected to a common external switch. + + Depending upon the mix of ESMs and PMs, the network will +appear to bonding as either a single switch topology (all PMs) or as a +multiple switch topology (one or more ESMs, zero or more PMs). 
It is +also possible to connect ESMs together, resulting in a configuration +much like the example in "High Availability in a Multiple Switch +Topology," above. + +Requirements for specific modes +------------------------------- + + The balance-rr mode requires the use of passthrough modules +for devices in the bond, all connected to an common external switch. +That switch must be configured for "etherchannel" or "trunking" on the appropriate ports, as is usual for balance-rr. The balance-alb and balance-tlb modes will function with @@ -1484,17 +1721,18 @@ Other concerns -------------- - The Serial Over LAN link is established over the primary + The Serial Over LAN (SoL) link is established over the primary ethernet (eth0) only, therefore, any loss of link to eth0 will result in losing your SoL connection. It will not fail over with other -network traffic. +network traffic, as the SoL system is beyond the control of the +bonding driver. It may be desirable to disable spanning tree on the switch (either the internal Ethernet Switch Module, or an external switch) to -avoid fail-over delays issues when using bonding. +avoid fail-over delay issues when using bonding. -14. Frequently Asked Questions +16. Frequently Asked Questions ============================== 1. Is it SMP safe? @@ -1505,8 +1743,8 @@ 2. What type of cards will work with it? Any Ethernet type cards (you can even mix cards - a Intel -EtherExpress PRO/100 and a 3com 3c905b, for example). They need not -be of the same speed. +EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes, +devices need not be of the same speed. 3. How many bonding devices can I have? @@ -1524,11 +1762,12 @@ disabled. The active-backup mode will fail over to a backup link, and other modes will ignore the failed link. The link will continue to be monitored, and should it recover, it will rejoin the bond (in whatever -manner is appropriate for the mode). See the section on High -Availability for additional information. 
+manner is appropriate for the mode). See the sections on High +Availability and the documentation for each mode for additional +information. Link monitoring can be enabled via either the miimon or -arp_interval paramters (described in the module paramters section, +arp_interval parameters (described in the module parameters section, above). In general, miimon monitors the carrier state as sensed by the underlying network device, and the arp monitor (arp_interval) monitors connectivity to another host on the local network. @@ -1536,7 +1775,7 @@ If no link monitoring is configured, the bonding driver will be unable to detect link failures, and will assume that all links are always available. This will likely result in lost packets, and a -resulting degredation of performance. The precise performance loss +resulting degradation of performance. The precise performance loss depends upon the bonding mode and network configuration. 6. Can bonding be used for High Availability? @@ -1550,12 +1789,12 @@ In the basic balance modes (balance-rr and balance-xor), it works with any system that supports etherchannel (also called trunking). Most managed switches currently available have such -support, and many unmananged switches as well. +support, and many unmanaged switches as well. The advanced balance modes (balance-tlb and balance-alb) do not have special switch requirements, but do need device drivers that support specific features (described in the appropriate section under -module paramters, above). +module parameters, above). In 802.3ad mode, it works with with systems that support IEEE 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged @@ -1565,17 +1804,19 @@ 8. Where does a bonding device get its MAC address from? - If not explicitly configured with ifconfig, the MAC address of -the bonding device is taken from its first slave device. 
This MAC -address is then passed to all following slaves and remains persistent -(even if the the first slave is removed) until the bonding device is -brought down or reconfigured. + If not explicitly configured (with ifconfig or ip link), the +MAC address of the bonding device is taken from its first slave +device. This MAC address is then passed to all following slaves and +remains persistent (even if the the first slave is removed) until the +bonding device is brought down or reconfigured. If you wish to change the MAC address, you can set it with -ifconfig: +ifconfig or ip link: # ifconfig bond0 hw ether 00:11:22:33:44:55 +# ip link set bond0 address 66:77:88:99:aa:bb + The MAC address can be also changed by bringing down/up the device and then changing its slaves (or their order): @@ -1591,23 +1832,28 @@ then restore the MAC addresses that the slaves had before they were enslaved. -15. Resources and Links +16. Resources and Links ======================= The latest version of the bonding driver can be found in the latest version of the linux kernel, found on http://kernel.org +The latest version of this document can be found in either the latest +kernel source (named Documentation/networking/bonding.txt), or on the +bonding sourceforge site: + +http://www.sourceforge.net/projects/bonding + Discussions regarding the bonding driver take place primarily on the bonding-devel mailing list, hosted at sourceforge.net. If you have -questions or problems, post them to the list. +questions or problems, post them to the list. The list address is: bonding-devel@lists.sourceforge.net -https://lists.sourceforge.net/lists/listinfo/bonding-devel - -There is also a project site on sourceforge. 
+ The administrative interface (to subscribe or unsubscribe) can +be found at: -http://www.sourceforge.net/projects/bonding +https://lists.sourceforge.net/lists/listinfo/bonding-devel Donald Becker's Ethernet Drivers and diag programs may be found at : - http://www.scyld.com/network/ From greearb@candelatech.com Fri Jun 3 12:00:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 12:00:44 -0700 (PDT) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53J0fXq016275 for ; Fri, 3 Jun 2005 12:00:41 -0700 Received: from [71.112.207.80] (pool-71-112-207-80.sttlwa.dsl-w.verizon.net [71.112.207.80]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j53JX55I003518; Fri, 3 Jun 2005 12:33:05 -0700 Message-ID: <42A0A897.5080006@candelatech.com> Date: Fri, 03 Jun 2005 11:59:35 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: john.ronciak@intel.com, Robert.Olsson@data.slu.se, jdmason@us.ibm.com, shemminger@osdl.org, hadi@cyberus.ca, mitch.a.williams@intel.com, netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch References: <468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> <42A0A25C.8000503@candelatech.com> <20050603.114950.119242486.davem@davemloft.net> In-Reply-To: <20050603.114950.119242486.davem@davemloft.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2045 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 1039 Lines: 32 David S. 
Miller wrote: > From: Ben Greear >>Maybe the poll is disabling the IRQs on the NIC for too long, or something >>like that? > > > In a reply I just sent out to this thread, I postulate that the > jiffies check is hitting earlier with a lower weight value, a quick > look at /proc/net/softnet_stat during their testing will confirm or > deny this theory. That would basically just decrease the work done in the NAPI poll though, so I don't see how that could be the problem, since the 'solution' was to force less work to be done. > It could also just be a simple bug in the dev->quota accounting > somewhere. > > Note that, in all of this, I do not have any objections to providing > a way to configure the dev->weight values. I will be applying Stephen > Hemminger's patches. Good. The more knobs the merrier, so long as they are at least somewhat documented and default to good sane values :) Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From davem@davemloft.net Fri Jun 3 12:02:38 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 12:02:42 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53J2cXq016818 for ; Fri, 3 Jun 2005 12:02:38 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeHQ6-0001wa-TJ; Fri, 03 Jun 2005 12:01:26 -0700 Date: Fri, 03 Jun 2005 12:01:26 -0700 (PDT) Message-Id: <20050603.120126.41874584.davem@davemloft.net> To: hadi@cyberus.ca Cc: mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. 
Miller" In-Reply-To: <1117824150.6071.34.camel@localhost.localdomain> References: <1117765954.6095.49.camel@localhost.localdomain> <1117824150.6071.34.camel@localhost.localdomain> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2046 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1271 Lines: 34 From: jamal Date: Fri, 03 Jun 2005 14:42:30 -0400 > When you reduce the weight, the system is spending less time in the > softirq processing packets before softirq yields. If this gives more > opportunity to your app to run, then the performance will go up. > Is this what you are seeing? Jamal, this is my current theory as well, we hit the jiffies check. It is the only logical explanation I can come up with for the single adapter case. There are some ways we can mitigate this. Here is one idea off the top of my head. When the jiffies check is hit, lower the weight of the most recently polled device towards some minimum (perhaps divide by two). If we successfully poll without hitting the jiffies check, make a small increment of the weight up to some limit. It is Van Jacobson TCP congestion avoidance applied to NAPI :-) Just a simple AIMD (Additive Increase, Multiplicative Decrease). So, hitting the jiffies work limit is congestion, and the cause of the congestion is the most recently polled device. In this regime, what the driver currently specifies as "->weight" is actually the maximum we'll use in the congestion control algorithm. And we can choose some constant minimum, something like "8" ought to work well. Comments?
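The AIMD idea described above can be sketched in plain C, outside the kernel. This is only an illustration under stated assumptions: the struct and function names (napi_aimd, napi_aimd_init, napi_aimd_adjust), the additive step of 1, and the handling of drivers whose weight is already below 8 are all hypothetical and do not come from any posted patch. Only the halving on a jiffies hit, the small increment otherwise, the driver's ->weight as the ceiling, and the floor of 8 are taken from the message.

```c
#include <assert.h>

/* Hypothetical sketch; none of these names exist in the kernel. */
#define NAPI_WEIGHT_MIN  8	/* constant floor suggested in the mail */
#define NAPI_WEIGHT_STEP 1	/* assumed size of the additive increase */

struct napi_aimd {
	int max_weight;		/* what the driver specifies as ->weight */
	int cur_weight;		/* weight to use for the next poll */
};

static void napi_aimd_init(struct napi_aimd *w, int driver_weight)
{
	assert(driver_weight > 0);
	w->max_weight = driver_weight;
	w->cur_weight = driver_weight;
}

/* Assumption: never let the floor exceed the driver's own maximum. */
static int napi_aimd_floor(const struct napi_aimd *w)
{
	return w->max_weight < NAPI_WEIGHT_MIN ? w->max_weight
					       : NAPI_WEIGHT_MIN;
}

/*
 * Call once per softirq run for the most recently polled device.
 * hit_jiffies_limit is non-zero when the jiffies check ended the run.
 */
static void napi_aimd_adjust(struct napi_aimd *w, int hit_jiffies_limit)
{
	if (hit_jiffies_limit) {
		/* multiplicative decrease: treat the hit as congestion */
		w->cur_weight /= 2;
		if (w->cur_weight < napi_aimd_floor(w))
			w->cur_weight = napi_aimd_floor(w);
	} else {
		/* additive increase back towards the driver's maximum */
		w->cur_weight += NAPI_WEIGHT_STEP;
		if (w->cur_weight > w->max_weight)
			w->cur_weight = w->max_weight;
	}
}
```

With a driver weight of 64, repeated jiffies hits walk the weight down through 32 and 16 and then hold it at 8, and each clean poll afterwards raises it by one until it reaches 64 again. Whether a step of 1 recovers quickly enough in practice is an open question that would need measurement.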
From davem@davemloft.net Fri Jun 3 12:04:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 12:04:07 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53J44Xq017423 for ; Fri, 3 Jun 2005 12:04:04 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeHRZ-0001x7-H4; Fri, 03 Jun 2005 12:02:57 -0700 Date: Fri, 03 Jun 2005 12:02:57 -0700 (PDT) Message-Id: <20050603.120257.21929814.davem@davemloft.net> To: greearb@candelatech.com Cc: john.ronciak@intel.com, Robert.Olsson@data.slu.se, jdmason@us.ibm.com, shemminger@osdl.org, hadi@cyberus.ca, mitch.a.williams@intel.com, netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: <42A0A897.5080006@candelatech.com> References: <42A0A25C.8000503@candelatech.com> <20050603.114950.119242486.davem@davemloft.net> <42A0A897.5080006@candelatech.com> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2047 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 581 Lines: 14 From: Ben Greear Date: Fri, 03 Jun 2005 11:59:35 -0700 > David S. Miller wrote: > > In a reply I just sent out to this thread, I postulate that the > > jiffies check is hitting earlier with a lower weight value, a quick > > look at /proc/net/softnet_stat during their testing will confirm or > > deny this theory. > > That would basically just decrease the work done in the NAPI poll though, > so I don't see how that could be the problem, since the 'solution' was to > force less work to be done. 
It allows his application to get onto the CPU faster. From davem@davemloft.net Fri Jun 3 12:27:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 12:27:07 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53JR2Xq022784 for ; Fri, 3 Jun 2005 12:27:03 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeHnq-00028w-Pk; Fri, 03 Jun 2005 12:25:58 -0700 Date: Fri, 03 Jun 2005 12:25:58 -0700 (PDT) Message-Id: <20050603.122558.88474819.davem@davemloft.net> To: netdev@oss.sgi.com CC: mchan@broadcom.com Subject: [PATCH]: Tigon3 new NAPI locking v2 From: "David S. Miller" X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2048 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 24870 Lines: 915 This version incorporates two bug fixes from Michael. 1) Check the mailbox register for 0x1 while polling on the COMPLETE state bit. 2) Remove the BUG_ON() check in tg3_restart_ints(), it can legally and harmlessly occur. Point #2 may want some refinements, but this patch below is good enough for testing. If someone (please please, pretty please) could be adventurous enough to attempt this kind of change for e1000, that would be great. Thanks. [TG3]: Eliminate all hw IRQ handler spinlocks. Move all driver spinlocks to be taken at sw IRQ context only. This fixes the skb_copy() we were doing with hw IRQs disabled (which is illegal and triggers a BUG() with HIGHMEM enabled). It also simplifies the locking all over the driver tremendously. 
We accomplish this feat by creating a special sequence to synchronize with the hw IRQ handler using a 2-bit atomic state. Signed-off-by: David S. Miller --- 1/drivers/net/tg3.c.~1~ 2005-06-03 12:11:40.000000000 -0700 +++ 2/drivers/net/tg3.c 2005-06-03 12:15:34.000000000 -0700 @@ -337,12 +337,10 @@ static struct { static void tg3_write_indirect_reg32(struct tg3 *tp, u32 off, u32 val) { if ((tp->tg3_flags & TG3_FLAG_PCIX_TARGET_HWBUG) != 0) { - unsigned long flags; - - spin_lock_irqsave(&tp->indirect_lock, flags); + spin_lock_bh(&tp->indirect_lock); pci_write_config_dword(tp->pdev, TG3PCI_REG_BASE_ADDR, off); pci_write_config_dword(tp->pdev, TG3PCI_REG_DATA, val); - spin_unlock_irqrestore(&tp->indirect_lock, flags); + spin_unlock_bh(&tp->indirect_lock); } else { writel(val, tp->regs + off); if ((tp->tg3_flags & TG3_FLAG_5701_REG_WRITE_BUG) != 0) @@ -353,12 +351,10 @@ static void tg3_write_indirect_reg32(str static void _tw32_flush(struct tg3 *tp, u32 off, u32 val) { if ((tp->tg3_flags & TG3_FLAG_PCIX_TARGET_HWBUG) != 0) { - unsigned long flags; - - spin_lock_irqsave(&tp->indirect_lock, flags); + spin_lock_bh(&tp->indirect_lock); pci_write_config_dword(tp->pdev, TG3PCI_REG_BASE_ADDR, off); pci_write_config_dword(tp->pdev, TG3PCI_REG_DATA, val); - spin_unlock_irqrestore(&tp->indirect_lock, flags); + spin_unlock_bh(&tp->indirect_lock); } else { void __iomem *dest = tp->regs + off; writel(val, dest); @@ -398,28 +394,24 @@ static inline void _tw32_tx_mbox(struct static void tg3_write_mem(struct tg3 *tp, u32 off, u32 val) { - unsigned long flags; - - spin_lock_irqsave(&tp->indirect_lock, flags); + spin_lock_bh(&tp->indirect_lock); pci_write_config_dword(tp->pdev, TG3PCI_MEM_WIN_BASE_ADDR, off); pci_write_config_dword(tp->pdev, TG3PCI_MEM_WIN_DATA, val); /* Always leave this as zero. 
*/ pci_write_config_dword(tp->pdev, TG3PCI_MEM_WIN_BASE_ADDR, 0); - spin_unlock_irqrestore(&tp->indirect_lock, flags); + spin_unlock_bh(&tp->indirect_lock); } static void tg3_read_mem(struct tg3 *tp, u32 off, u32 *val) { - unsigned long flags; - - spin_lock_irqsave(&tp->indirect_lock, flags); + spin_lock_bh(&tp->indirect_lock); pci_write_config_dword(tp->pdev, TG3PCI_MEM_WIN_BASE_ADDR, off); pci_read_config_dword(tp->pdev, TG3PCI_MEM_WIN_DATA, val); /* Always leave this as zero. */ pci_write_config_dword(tp->pdev, TG3PCI_MEM_WIN_BASE_ADDR, 0); - spin_unlock_irqrestore(&tp->indirect_lock, flags); + spin_unlock_bh(&tp->indirect_lock); } static void tg3_disable_ints(struct tg3 *tp) @@ -443,7 +435,7 @@ static void tg3_enable_ints(struct tg3 * tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW, (tp->last_tag << 24)); tr32(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW); - + tp->irq_state = 0; tg3_cond_int(tp); } @@ -2578,7 +2570,7 @@ static void tg3_tx(struct tg3 *tp) sw_idx = NEXT_TX(sw_idx); } - dev_kfree_skb_irq(skb); + dev_kfree_skb(skb); } tp->tx_cons = sw_idx; @@ -2884,11 +2876,8 @@ static int tg3_poll(struct net_device *n { struct tg3 *tp = netdev_priv(netdev); struct tg3_hw_status *sblk = tp->hw_status; - unsigned long flags; int done; - spin_lock_irqsave(&tp->lock, flags); - /* handle link change and other phy events */ if (!(tp->tg3_flags & (TG3_FLAG_USE_LINKCHG_REG | @@ -2896,7 +2885,9 @@ static int tg3_poll(struct net_device *n if (sblk->status & SD_STATUS_LINK_CHG) { sblk->status = SD_STATUS_UPDATED | (sblk->status & ~SD_STATUS_LINK_CHG); + spin_lock(&tp->lock); tg3_setup_phy(tp, 0); + spin_unlock(&tp->lock); } } @@ -2907,8 +2898,6 @@ static int tg3_poll(struct net_device *n spin_unlock(&tp->tx_lock); } - spin_unlock_irqrestore(&tp->lock, flags); - /* run RX thread, within the bounds set by NAPI. 
* All RX "locking" is done by ensuring outside * code synchronizes with dev->poll() @@ -2933,15 +2922,62 @@ static int tg3_poll(struct net_device *n /* if no more work, tell net stack and NIC we're done */ done = !tg3_has_work(tp); if (done) { - spin_lock_irqsave(&tp->lock, flags); + spin_lock(&tp->lock); __netif_rx_complete(netdev); tg3_restart_ints(tp); - spin_unlock_irqrestore(&tp->lock, flags); + spin_unlock(&tp->lock); } return (done ? 0 : 1); } +static void tg3_irq_quiesce(struct tg3 *tp) +{ + BUG_ON(test_bit(TG3_IRQSTATE_SYNC, &tp->irq_state)); + + set_bit(TG3_IRQSTATE_SYNC, &tp->irq_state); + smp_mb(); + tw32(GRC_LOCAL_CTRL, + tp->grc_local_ctrl | GRC_LCLCTRL_SETINT); + + while (!test_bit(TG3_IRQSTATE_COMPLETE, &tp->irq_state)) { + u32 val = tr32(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW); + + if (val == 0x00000001) + break; + + cpu_relax(); + } +} + +static inline int tg3_irq_sync(struct tg3 *tp) +{ + if (test_bit(TG3_IRQSTATE_SYNC, &tp->irq_state)) { + set_bit(TG3_IRQSTATE_COMPLETE, &tp->irq_state); + return 1; + } + return 0; +} + +/* Fully shutdown all tg3 driver activity elsewhere in the system. + * If irq_sync is non-zero, then the IRQ handler must be synchronized + * with as well. Most of the time, this is not necessary except when + * shutting down the device. + */ +static inline void tg3_full_lock(struct tg3 *tp, int irq_sync) +{ + if (irq_sync) + tg3_irq_quiesce(tp); + spin_lock_bh(&tp->lock); + spin_lock(&tp->tx_lock); +} + +static inline void tg3_full_unlock(struct tg3 *tp) +{ + spin_unlock(&tp->tx_lock); + spin_unlock_bh(&tp->lock); +} + /* MSI ISR - No need to check for interrupt sharing and no need to * flush status block and interrupt mailbox. PCI ordering rules * guarantee that MSI will arrive after the status block. 
@@ -2951,9 +2987,6 @@ static irqreturn_t tg3_msi(int irq, void struct net_device *dev = dev_id; struct tg3 *tp = netdev_priv(dev); struct tg3_hw_status *sblk = tp->hw_status; - unsigned long flags; - - spin_lock_irqsave(&tp->lock, flags); /* * Writing any value to intr-mbox-0 clears PCI INTA# and @@ -2964,6 +2997,8 @@ static irqreturn_t tg3_msi(int irq, void */ tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW, 0x00000001); tp->last_tag = sblk->status_tag; + if (tg3_irq_sync(tp)) + goto out; sblk->status &= ~SD_STATUS_UPDATED; if (likely(tg3_has_work(tp))) netif_rx_schedule(dev); /* schedule NAPI poll */ @@ -2972,9 +3007,7 @@ static irqreturn_t tg3_msi(int irq, void tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW, tp->last_tag << 24); } - - spin_unlock_irqrestore(&tp->lock, flags); - +out: return IRQ_RETVAL(1); } @@ -2983,11 +3016,8 @@ static irqreturn_t tg3_interrupt(int irq struct net_device *dev = dev_id; struct tg3 *tp = netdev_priv(dev); struct tg3_hw_status *sblk = tp->hw_status; - unsigned long flags; unsigned int handled = 1; - spin_lock_irqsave(&tp->lock, flags); - /* In INTx mode, it is possible for the interrupt to arrive at * the CPU before the status block posted prior to the interrupt. 
* Reading the PCI State register will confirm whether the @@ -3004,6 +3034,8 @@ static irqreturn_t tg3_interrupt(int irq */ tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW, 0x00000001); + if (tg3_irq_sync(tp)) + goto out; sblk->status &= ~SD_STATUS_UPDATED; if (likely(tg3_has_work(tp))) netif_rx_schedule(dev); /* schedule NAPI poll */ @@ -3018,9 +3050,7 @@ static irqreturn_t tg3_interrupt(int irq } else { /* shared interrupt */ handled = 0; } - - spin_unlock_irqrestore(&tp->lock, flags); - +out: return IRQ_RETVAL(handled); } @@ -3029,11 +3059,8 @@ static irqreturn_t tg3_interrupt_tagged( struct net_device *dev = dev_id; struct tg3 *tp = netdev_priv(dev); struct tg3_hw_status *sblk = tp->hw_status; - unsigned long flags; unsigned int handled = 1; - spin_lock_irqsave(&tp->lock, flags); - /* In INTx mode, it is possible for the interrupt to arrive at * the CPU before the status block posted prior to the interrupt. * Reading the PCI State register will confirm whether the @@ -3051,6 +3078,8 @@ static irqreturn_t tg3_interrupt_tagged( tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW, 0x00000001); tp->last_tag = sblk->status_tag; + if (tg3_irq_sync(tp)) + goto out; sblk->status &= ~SD_STATUS_UPDATED; if (likely(tg3_has_work(tp))) netif_rx_schedule(dev); /* schedule NAPI poll */ @@ -3065,9 +3094,7 @@ static irqreturn_t tg3_interrupt_tagged( } else { /* shared interrupt */ handled = 0; } - - spin_unlock_irqrestore(&tp->lock, flags); - +out: return IRQ_RETVAL(handled); } @@ -3106,8 +3133,7 @@ static void tg3_reset_task(void *_data) tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 1); restart_timer = tp->tg3_flags2 & TG3_FLG2_RESTART_TIMER; tp->tg3_flags2 &= ~TG3_FLG2_RESTART_TIMER; @@ -3117,8 +3143,7 @@ static void tg3_reset_task(void *_data) tg3_netif_start(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); if (restart_timer) mod_timer(&tp->timer, jiffies + 1); @@ -3224,39 +3249,21 
@@ static int tg3_start_xmit(struct sk_buff unsigned int i; u32 len, entry, base_flags, mss; int would_hit_hwbug; - unsigned long flags; len = skb_headlen(skb); /* No BH disabling for tx_lock here. We are running in BH disabled * context and TX reclaim runs via tp->poll inside of a software - * interrupt. Rejoice! - * - * Actually, things are not so simple. If we are to take a hw - * IRQ here, we can deadlock, consider: - * - * CPU1 CPU2 - * tg3_start_xmit - * take tp->tx_lock - * tg3_timer - * take tp->lock - * tg3_interrupt - * spin on tp->lock - * spin on tp->tx_lock - * - * So we really do need to disable interrupts when taking - * tx_lock here. + * interrupt. Furthermore, IRQ processing runs lockless so we have + * no IRQ context deadlocks to worry about either. Rejoice! */ - local_irq_save(flags); - if (!spin_trylock(&tp->tx_lock)) { - local_irq_restore(flags); + if (!spin_trylock(&tp->tx_lock)) return NETDEV_TX_LOCKED; - } /* This is a hard error, log it. */ if (unlikely(TX_BUFFS_AVAIL(tp) <= (skb_shinfo(skb)->nr_frags + 1))) { netif_stop_queue(dev); - spin_unlock_irqrestore(&tp->tx_lock, flags); + spin_unlock(&tp->tx_lock); printk(KERN_ERR PFX "%s: BUG! 
Tx Ring full when queue awake!\n", dev->name); return NETDEV_TX_BUSY; @@ -3421,7 +3428,7 @@ static int tg3_start_xmit(struct sk_buff out_unlock: mmiowb(); - spin_unlock_irqrestore(&tp->tx_lock, flags); + spin_unlock(&tp->tx_lock); dev->trans_start = jiffies; @@ -3455,8 +3462,8 @@ static int tg3_change_mtu(struct net_dev } tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + + tg3_full_lock(tp, 1); tg3_halt(tp, RESET_KIND_SHUTDOWN, 1); @@ -3466,8 +3473,7 @@ static int tg3_change_mtu(struct net_dev tg3_netif_start(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); return 0; } @@ -5088,9 +5094,9 @@ static int tg3_set_mac_addr(struct net_d memcpy(dev->dev_addr, addr->sa_data, dev->addr_len); - spin_lock_irq(&tp->lock); + spin_lock_bh(&tp->lock); __tg3_set_mac_addr(tp); - spin_unlock_irq(&tp->lock); + spin_unlock_bh(&tp->lock); return 0; } @@ -5802,10 +5808,8 @@ static void tg3_periodic_fetch_stats(str static void tg3_timer(unsigned long __opaque) { struct tg3 *tp = (struct tg3 *) __opaque; - unsigned long flags; - spin_lock_irqsave(&tp->lock, flags); - spin_lock(&tp->tx_lock); + spin_lock(&tp->lock); if (!(tp->tg3_flags & TG3_FLAG_TAGGED_STATUS)) { /* All of this garbage is because when using non-tagged @@ -5822,8 +5826,7 @@ static void tg3_timer(unsigned long __op if (!(tr32(WDMAC_MODE) & WDMAC_MODE_ENABLE)) { tp->tg3_flags2 |= TG3_FLG2_RESTART_TIMER; - spin_unlock(&tp->tx_lock); - spin_unlock_irqrestore(&tp->lock, flags); + spin_unlock(&tp->lock); schedule_work(&tp->reset_task); return; } @@ -5891,8 +5894,7 @@ static void tg3_timer(unsigned long __op tp->asf_counter = tp->asf_multiplier; } - spin_unlock(&tp->tx_lock); - spin_unlock_irqrestore(&tp->lock, flags); + spin_unlock(&tp->lock); tp->timer.expires = jiffies + tp->timer_offset; add_timer(&tp->timer); @@ -6007,14 +6009,12 @@ static int tg3_test_msi(struct tg3 *tp) /* Need to reset the chip because the MSI cycle may have terminated * with Master Abort. 
*/ - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 1); tg3_halt(tp, RESET_KIND_SHUTDOWN, 1); err = tg3_init_hw(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); if (err) free_irq(tp->pdev->irq, dev); @@ -6027,14 +6027,12 @@ static int tg3_open(struct net_device *d struct tg3 *tp = netdev_priv(dev); int err; - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); tg3_disable_ints(tp); tp->tg3_flags &= ~TG3_FLAG_INIT_COMPLETE; - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); /* The placement of this call is tied * to the setup and use of Host TX descriptors. @@ -6081,8 +6079,7 @@ static int tg3_open(struct net_device *d return err; } - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); err = tg3_init_hw(tp); if (err) { @@ -6106,8 +6103,7 @@ static int tg3_open(struct net_device *d tp->timer.function = tg3_timer; } - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); if (err) { free_irq(tp->pdev->irq, dev); @@ -6123,8 +6119,7 @@ static int tg3_open(struct net_device *d err = tg3_test_msi(tp); if (err) { - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) { pci_disable_msi(tp->pdev); @@ -6134,22 +6129,19 @@ static int tg3_open(struct net_device *d tg3_free_rings(tp); tg3_free_consistent(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); return err; } } - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); add_timer(&tp->timer); tp->tg3_flags |= TG3_FLAG_INIT_COMPLETE; tg3_enable_ints(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); netif_start_queue(dev); @@ -6395,8 +6387,7 @@ static int tg3_close(struct net_device * del_timer_sync(&tp->timer); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 1); #if 0 
tg3_dump_state(tp); #endif @@ -6410,8 +6401,7 @@ static int tg3_close(struct net_device * TG3_FLAG_GOT_SERDES_FLOWCTL); netif_carrier_off(tp->dev); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); free_irq(tp->pdev->irq, dev); if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) { @@ -6448,16 +6438,15 @@ static unsigned long calc_crc_errors(str if (!(tp->tg3_flags2 & TG3_FLG2_PHY_SERDES) && (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5700 || GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701)) { - unsigned long flags; u32 val; - spin_lock_irqsave(&tp->lock, flags); + spin_lock_bh(&tp->lock); if (!tg3_readphy(tp, 0x1e, &val)) { tg3_writephy(tp, 0x1e, val | 0x8000); tg3_readphy(tp, 0x14, &val); } else val = 0; - spin_unlock_irqrestore(&tp->lock, flags); + spin_unlock_bh(&tp->lock); tp->phy_crc_errors += val; @@ -6719,11 +6708,9 @@ static void tg3_set_rx_mode(struct net_d { struct tg3 *tp = netdev_priv(dev); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); __tg3_set_rx_mode(dev); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); } #define TG3_REGDUMP_LEN (32 * 1024) @@ -6745,8 +6732,7 @@ static void tg3_get_regs(struct net_devi memset(p, 0, TG3_REGDUMP_LEN); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); #define __GET_REG32(reg) (*(p)++ = tr32(reg)) #define GET_REG32_LOOP(base,len) \ @@ -6796,8 +6782,7 @@ do { p = (u32 *)(orig_p + (reg)); \ #undef GET_REG32_LOOP #undef GET_REG32_1 - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); } static int tg3_get_eeprom_len(struct net_device *dev) @@ -6973,8 +6958,7 @@ static int tg3_set_settings(struct net_d return -EINVAL; } - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); tp->link_config.autoneg = cmd->autoneg; if (cmd->autoneg == AUTONEG_ENABLE) { @@ -6990,8 +6974,7 @@ static int tg3_set_settings(struct net_d if (netif_running(dev)) tg3_setup_phy(tp, 
1); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); return 0; } @@ -7027,12 +7010,12 @@ static int tg3_set_wol(struct net_device !(tp->tg3_flags & TG3_FLAG_SERDES_WOL_CAP)) return -EINVAL; - spin_lock_irq(&tp->lock); + spin_lock_bh(&tp->lock); if (wol->wolopts & WAKE_MAGIC) tp->tg3_flags |= TG3_FLAG_WOL_ENABLE; else tp->tg3_flags &= ~TG3_FLAG_WOL_ENABLE; - spin_unlock_irq(&tp->lock); + spin_unlock_bh(&tp->lock); return 0; } @@ -7072,7 +7055,7 @@ static int tg3_nway_reset(struct net_dev if (!netif_running(dev)) return -EAGAIN; - spin_lock_irq(&tp->lock); + spin_lock_bh(&tp->lock); r = -EINVAL; tg3_readphy(tp, MII_BMCR, &bmcr); if (!tg3_readphy(tp, MII_BMCR, &bmcr) && @@ -7080,7 +7063,7 @@ static int tg3_nway_reset(struct net_dev tg3_writephy(tp, MII_BMCR, bmcr | BMCR_ANRESTART); r = 0; } - spin_unlock_irq(&tp->lock); + spin_unlock_bh(&tp->lock); return r; } @@ -7111,8 +7094,7 @@ static int tg3_set_ringparam(struct net_ if (netif_running(dev)) tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); tp->rx_pending = ering->rx_pending; @@ -7128,8 +7110,7 @@ static int tg3_set_ringparam(struct net_ tg3_netif_start(tp); } - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); return 0; } @@ -7150,8 +7131,8 @@ static int tg3_set_pauseparam(struct net if (netif_running(dev)) tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 1); + if (epause->autoneg) tp->tg3_flags |= TG3_FLAG_PAUSE_AUTONEG; else @@ -7170,8 +7151,8 @@ static int tg3_set_pauseparam(struct net tg3_init_hw(tp); tg3_netif_start(tp); } - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + + tg3_full_unlock(tp); return 0; } @@ -7192,12 +7173,12 @@ static int tg3_set_rx_csum(struct net_de return 0; } - spin_lock_irq(&tp->lock); + spin_lock_bh(&tp->lock); if (data) tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS; else tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS; - 
spin_unlock_irq(&tp->lock); + spin_unlock_bh(&tp->lock); return 0; } @@ -7719,8 +7700,7 @@ static void tg3_self_test(struct net_dev if (netif_running(dev)) tg3_netif_stop(tp); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 1); tg3_halt(tp, RESET_KIND_SUSPEND, 1); tg3_nvram_lock(tp); @@ -7742,14 +7722,14 @@ static void tg3_self_test(struct net_dev data[4] = 1; } - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); + if (tg3_test_interrupt(tp) != 0) { etest->flags |= ETH_TEST_FL_FAILED; data[5] = 1; } - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + + tg3_full_lock(tp, 0); tg3_halt(tp, RESET_KIND_SHUTDOWN, 1); if (netif_running(dev)) { @@ -7757,8 +7737,8 @@ static void tg3_self_test(struct net_dev tg3_init_hw(tp); tg3_netif_start(tp); } - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + + tg3_full_unlock(tp); } } @@ -7779,9 +7759,9 @@ static int tg3_ioctl(struct net_device * if (tp->tg3_flags2 & TG3_FLG2_PHY_SERDES) break; /* We have no PHY */ - spin_lock_irq(&tp->lock); + spin_lock_bh(&tp->lock); err = tg3_readphy(tp, data->reg_num & 0x1f, &mii_regval); - spin_unlock_irq(&tp->lock); + spin_unlock_bh(&tp->lock); data->val_out = mii_regval; @@ -7795,9 +7775,9 @@ static int tg3_ioctl(struct net_device * if (!capable(CAP_NET_ADMIN)) return -EPERM; - spin_lock_irq(&tp->lock); + spin_lock_bh(&tp->lock); err = tg3_writephy(tp, data->reg_num & 0x1f, data->val_in); - spin_unlock_irq(&tp->lock); + spin_unlock_bh(&tp->lock); return err; @@ -7813,28 +7793,24 @@ static void tg3_vlan_rx_register(struct { struct tg3 *tp = netdev_priv(dev); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); tp->vlgrp = grp; /* Update RX_MODE_KEEP_VLAN_TAG bit in RX_MODE register. 
*/ __tg3_set_rx_mode(dev); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); } static void tg3_vlan_rx_kill_vid(struct net_device *dev, unsigned short vid) { struct tg3 *tp = netdev_priv(dev); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); if (tp->vlgrp) tp->vlgrp->vlan_devices[vid] = NULL; - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); } #endif @@ -10141,24 +10117,19 @@ static int tg3_suspend(struct pci_dev *p del_timer_sync(&tp->timer); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 1); tg3_disable_ints(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); netif_device_detach(dev); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); tg3_halt(tp, RESET_KIND_SHUTDOWN, 1); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); err = tg3_set_power_state(tp, pci_choose_state(pdev, state)); if (err) { - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); tg3_init_hw(tp); @@ -10168,8 +10139,7 @@ static int tg3_suspend(struct pci_dev *p netif_device_attach(dev); tg3_netif_start(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); } return err; @@ -10192,8 +10162,7 @@ static int tg3_resume(struct pci_dev *pd netif_device_attach(dev); - spin_lock_irq(&tp->lock); - spin_lock(&tp->tx_lock); + tg3_full_lock(tp, 0); tg3_init_hw(tp); @@ -10204,8 +10173,7 @@ static int tg3_resume(struct pci_dev *pd tg3_netif_start(tp); - spin_unlock(&tp->tx_lock); - spin_unlock_irq(&tp->lock); + tg3_full_unlock(tp); return 0; } --- 1/drivers/net/tg3.h.~1~ 2005-06-03 12:11:44.000000000 -0700 +++ 2/drivers/net/tg3.h 2005-06-03 12:12:03.000000000 -0700 @@ -2006,17 +2006,33 @@ struct tg3_ethtool_stats { struct tg3 { /* begin "general, frequently-used members" cacheline section */ + /* If the IRQ handler (which runs lockless) needs 
to be + * quiesced, the following bitmask state is used. The + * SYNC bit is set by non-IRQ context code to initiate + * the quiescence. The setter of this bit also forces + * an interrupt to run via the GRC misc host control + * register. + * + * The IRQ handler notes this, disables interrupts, and + * sets the COMPLETE bit. At this point the SYNC bit + * setter can be assured that interrupts will no longer + * get run. + * + * In this way all SMP driver locks are never acquired + * in hw IRQ context, only sw IRQ context or lower. + */ + unsigned long irq_state; +#define TG3_IRQSTATE_SYNC 0 +#define TG3_IRQSTATE_COMPLETE 1 + /* SMP locking strategy: * * lock: Held during all operations except TX packet * processing. * - * tx_lock: Held during tg3_start_xmit{,_4gbug} and tg3_tx + * tx_lock: Held during tg3_start_xmit and tg3_tx * - * If you want to shut up all asynchronous processing you must - * acquire both locks, 'lock' taken before 'tx_lock'. IRQs must - * be disabled to take 'lock' but only softirq disabling is - * necessary for acquisition of 'tx_lock'. + * Both of these locks are to be held with BH safety. 
*/ spinlock_t lock; spinlock_t indirect_lock; From mitch.a.williams@intel.com Fri Jun 3 12:30:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 12:30:29 -0700 (PDT) Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53JUOXq023660 for ; Fri, 3 Jun 2005 12:30:24 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j53JSBV5021340; Fri, 3 Jun 2005 19:28:12 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j53JSBSc001696; Fri, 3 Jun 2005 19:28:11 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.124]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j53JSASL003143; Fri, 3 Jun 2005 12:28:10 -0700 Date: Fri, 3 Jun 2005 12:28:10 -0700 From: Mitch Williams X-X-Sender: mawilli1@mawilli1-desk2.amr.corp.intel.com To: "David S. 
Miller" cc: hadi@cyberus.ca, mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch In-Reply-To: <20050603.120126.41874584.davem@davemloft.net> Message-ID: References: <1117765954.6095.49.camel@localhost.localdomain> <1117824150.6071.34.camel@localhost.localdomain> <20050603.120126.41874584.davem@davemloft.net> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2049 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 2334 Lines: 57 On Fri, 3 Jun 2005, David S. Miller wrote: > From: jamal > Date: Fri, 03 Jun 2005 14:42:30 -0400 > > > When you reduce the weight, the system is spending less time in the > > softirq processing packets before softirq yields. If this gives more > > opportunity to your app to run, then the performance will go up. > > Is this what you are seeing? > > Jamal, this is my current theory as well, we hit the jiffies > check. Well, I hate to mess up your guys' theories, but the real reason is simpler: hardware receive resources, specifically descriptors and buffers. In a typical NAPI polling loop, the driver processes receive packets until it either hits the quota or runs out of packets. Then, at the end of the loop, it returns all of those now-free receive resources back to the hardware. With a heavy receive load, the hardware will run out of receive descriptors in the time it takes the driver/NAPI/stack to process 64 packets. So it drops them on the floor. And, as we know, dropped packets are A Bad Thing. 
By reducing the driver weight, we cause the driver to give receive
resources back to the hardware more often, which prevents dropped packets.

As Ben Greer noticed, increasing the number of descriptors can help with
this issue. But it really can't eliminate the problem -- once the ring
is full, it doesn't matter how big it is, it's still full.

In my testing (Dual 2.8GHz Xeon, PCI-X bus, Gigabit network, 10 clients),
I was able to completely eliminate dropped packets in most cases by
reducing the driver weight down to about 20.

Now for some speculation:

Aside from dropped packets, I saw continued performance gain with even
lower weights, with the sweet spot (on a single adapter) being about 8 to
10. I don't have a definite answer for why this is happening, but my
theory is that it's latency. Packets are processed more often, meaning
they spend less time sitting in hardware-owned buffers, which means they
get to the stack quicker, which means less latency. But I'm happy to
admit I might be wrong with this theory.

Nevertheless, the effect exists, and I've seen it on drivers other than
just e1000. (And, no, I'm not allowed to say which other drivers I've
used, or give specific numbers, or our lawyers will string me up by my
toes.)

Anybody else got a theory?

-Mitch From hadi@cyberus.ca Fri Jun 3 12:42:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 12:42:25 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53JgJXq024746 for ; Fri, 3 Jun 2005 12:42:22 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1DeI2o-0000q7-UJ for netdev@oss.sgi.com; Fri, 03 Jun 2005 15:41:26 -0400 Received: from [216.209.86.2] (helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DeI2l-0007BV-3a; Fri, 03 Jun 2005 15:41:23 -0400 Subject: Re: RFC: NAPI packet weighting patch From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <20050603.120126.41874584.davem@davemloft.net> References: <1117765954.6095.49.camel@localhost.localdomain> <1117824150.6071.34.camel@localhost.localdomain> <20050603.120126.41874584.davem@davemloft.net> Content-Type: text/plain Organization: unknown Date: Fri, 03 Jun 2005 15:40:50 -0400 Message-Id: <1117827650.6071.59.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2050 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 1976 Lines: 55 On Fri, 2005-03-06 at 12:01 -0700, David S. Miller wrote: > From: jamal > Date: Fri, 03 Jun 2005 14:42:30 -0400 > > > When you reduce the weight, the system is spending less time in the > > softirq processing packets before softirq yields. If this gives more > > opportunity to your app to run, then the performance will go up. > > Is this what you are seeing? 
>
> Jamal, this is my current theory as well, we hit the jiffies
> check.
>

I think you are more than likely right. If we can instrument it Mitch
could check it out. Mitch would you like to try something that will
instrument this?
I know i have seen this behavior but it was when i was playing with some
system that had a real small HZ.

> It is the only logical explanation I can come up with for the
> single adapter case.
>
> There are some ways we can mitigate this. Here is one idea
> off the top of my head.
>
> When the jiffies check is hit, lower the weight of the most recently
> polled device towards some minimum (perhaps divide by two). If we
> successfully poll without hitting the jiffies check, make a small
> increment of the weight up to some limit.
>

You probably wanna start high up first until you hit congestion and then
start lowering.

> It is Van Jacobson TCP congestion avoidance applied to NAPI :-)
>
> Just a simple AIMD (Additive Increase, Multiplicative Decrease).
> So, hitting the jiffies work limit is congestion, and the cause
> of the congestion is the most recently polled device.
>
> In this regime, what the driver currently specifies as "->weight"
> is actually the maximum we'll use in the congestion control
> algorithm. And we can choose some constant minimum, something
> like "8" ought to work well.
>
> Comments?
>

In theory it looks good - but i think you end up defeating the fairness
factor. If you can narrow it down to which driver is causing congestion,
and only penalize that driver i think it would work well.

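The AIMD idea quoted above can be sketched roughly as follows. This is
only an illustration under stated assumptions: struct poll_dev,
NAPI_WEIGHT_MIN and both helper functions are hypothetical names that do
not exist in the kernel, the decrease step is the "divide by two" from
the quoted proposal, and the increase step of one is a guess at what "a
small increment" might mean. The weight starts at the driver maximum, as
suggested in the reply.

```c
/* Hypothetical sketch: none of these names exist in the kernel. */
#define NAPI_WEIGHT_MIN 8	/* constant floor suggested in the thread */

struct poll_dev {
	int max_weight;		/* what the driver specifies as ->weight */
	int cur_weight;		/* weight actually used for the next poll */
};

/* Multiplicative decrease: called when the jiffies check was hit and
 * this device was the most recently polled one. */
static void weight_on_congestion(struct poll_dev *d)
{
	d->cur_weight /= 2;
	if (d->cur_weight < NAPI_WEIGHT_MIN)
		d->cur_weight = NAPI_WEIGHT_MIN;
}

/* Additive increase: called after a poll round that completed without
 * hitting the jiffies check. Never exceeds the driver's maximum. */
static void weight_on_success(struct poll_dev *d)
{
	if (d->cur_weight < d->max_weight)
		d->cur_weight++;
}
```

Because only the most recently polled device is passed to
weight_on_congestion(), other devices keep their weights, which is one
possible reading of "only penalize that driver".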
cheers, jamal From hadi@cyberus.ca Fri Jun 3 13:01:00 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:01:05 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53K10Xq026708 for ; Fri, 3 Jun 2005 13:01:00 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1DeIKt-0007Az-K1 for netdev@oss.sgi.com; Fri, 03 Jun 2005 16:00:07 -0400 Received: from [216.209.86.2] (helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DeIKp-0002Aj-Tt; Fri, 03 Jun 2005 16:00:04 -0400 Subject: Re: RFC: NAPI packet weighting patch From: jamal Reply-To: hadi@cyberus.ca To: Mitch Williams Cc: "David S. Miller" , john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: References: <1117765954.6095.49.camel@localhost.localdomain> <1117824150.6071.34.camel@localhost.localdomain> <20050603.120126.41874584.davem@davemloft.net> Content-Type: text/plain Organization: unknown Date: Fri, 03 Jun 2005 15:59:31 -0400 Message-Id: <1117828771.6071.77.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2051 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 1995 Lines: 51 On Fri, 2005-03-06 at 12:28 -0700, Mitch Williams wrote: > > On Fri, 3 Jun 2005, David S. Miller wrote: > > > From: jamal > > Date: Fri, 03 Jun 2005 14:42:30 -0400 > > > > > When you reduce the weight, the system is spending less time in the > > > softirq processing packets before softirq yields. If this gives more > > > opportunity to your app to run, then the performance will go up. > > > Is this what you are seeing? 
> >
> > Jamal, this is my current theory as well, we hit the jiffies
> > check.
>
> Well, I hate to mess up your guys' theories, but the real reason is
> simpler: hardware receive resources, specifically descriptors and
> buffers.
>
> In a typical NAPI polling loop, the driver processes receive packets until
> it either hits the quota or runs out of packets. Then, at the end of the
> loop, it returns all of those now-free receive resources back to the
> hardware.
>
> With a heavy receive load, the hardware will run out of receive
> descriptors in the time it takes the driver/NAPI/stack to process 64
> packets. So it drops them on the floor. And, as we know, dropped packets
> are A Bad Thing.
>
> By reducing the driver weight, we cause the driver to give receive
> resources back to the hardware more often, which prevents dropped packets.
>
> As Ben Greer noticed, increasing the number of descriptors can help with
> this issue. But it really can't eliminate the problem -- once the ring
> is full, it doesn't matter how big it is, it's still full.
>
> In my testing (Dual 2.8GHz Xeon, PCI-X bus, Gigabit network, 10 clients),
> I was able to completely eliminate dropped packets in most cases by
> reducing the driver weight down to about 20.
>
> Now for some speculation:
>

What you said above is unfortunately also speculation ;-> But one that
you could validate by putting proper hooks. As an example, try to
restore a descriptor every time you pick one - for an example of this
look at the sb1250 driver.

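As a rough picture of what "restore a descriptor every time you pick
one" could look like, here is a toy model of a receive ring. It is not
taken from sb1250, e1000 or any other driver: struct rx_ring, RING_SIZE
and both functions are made up for illustration, and "replenish" is
reduced to flipping an ownership flag where a real driver would attach a
fresh buffer.

```c
#include <stddef.h>

#define RING_SIZE 8	/* illustrative; real rings are much larger */

/* Hypothetical toy ring, not any driver's real data structure. */
struct rx_ring {
	int hw_owned[RING_SIZE];	/* 1 = descriptor is available to the NIC */
	size_t next;			/* next descriptor the driver will examine */
};

/* Poll up to 'quota' packets, handing each descriptor back to the
 * hardware as soon as its packet has been picked up, instead of
 * returning them all in one batch at the end of the loop. */
static int rx_poll_refill_inline(struct rx_ring *r, int quota)
{
	int done = 0;

	while (done < quota && !r->hw_owned[r->next]) {
		/* ... pass the received packet up the stack here ... */
		r->hw_owned[r->next] = 1;	/* replenish immediately */
		r->next = (r->next + 1) % RING_SIZE;
		done++;
	}
	return done;
}

/* Count descriptors the hardware can currently receive into. */
static int rx_free_descriptors(const struct rx_ring *r)
{
	int n = 0;
	size_t i;

	for (i = 0; i < RING_SIZE; i++)
		n += r->hw_owned[i];
	return n;
}
```

With this shape the number of descriptors available to the hardware
grows with every packet processed, so it no longer depends on how large
the quota is.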
cheers, jamal From Robert.Olsson@data.slu.se Fri Jun 3 13:18:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:19:00 -0700 (PDT) Received: from mx1.slu.se (mx1.slu.se [130.238.96.70]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KIiXq027844 for ; Fri, 3 Jun 2005 13:18:45 -0700 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mx1.slu.se (8.13.1/8.13.1) with ESMTP id j53KHVFH031123; Fri, 3 Jun 2005 22:17:32 +0200 Received: by robur.slu.se (Postfix, from userid 1000) id 985F5EE3F0; Fri, 3 Jun 2005 22:17:31 +0200 (CEST) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17056.47835.583602.151291@robur.slu.se> Date: Fri, 3 Jun 2005 22:17:31 +0200 To: "Ronciak, John" Cc: "Robert Olsson" , "David S. Miller" , , , , "Williams, Mitch A" , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" Subject: RE: RFC: NAPI packet weighting patch In-Reply-To: <468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> References: <468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> X-Mailer: VM 7.18 under Emacs 21.4.1 X-Scanned-By: MIMEDefang 2.48 on 130.238.96.70 X-archive-position: 2052 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 815 Lines: 23 Ronciak, John writes: > With the same system (fairly high end with nothing major running on it) > we got rid of the dropped frames by just reducing the weight for 64. So > the weight did have something to do with the dropped frames. Maybe > other factors as well, but in static tests like this it sure looks like > the 64 value is wrong is some cases. It is possible that a lower weight forced your driver to disable interrupts and do packet reception w/o interrupts often this is more efficient as we get rid intr. latency etc. 
Again I think weight should only used for fairness and not control the threshold when to disable interrupts. You can test with a new policy in e1000_clean so you schedule for a new poll if work_done (any pkts received) or tx_cleaned is true. Cheers. --ro From davem@davemloft.net Fri Jun 3 13:24:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:24:14 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KO9Xq028589 for ; Fri, 3 Jun 2005 13:24:10 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeIh0-0002Ce-4d; Fri, 03 Jun 2005 13:22:58 -0700 Date: Fri, 03 Jun 2005 13:22:57 -0700 (PDT) Message-Id: <20050603.132257.23013342.davem@davemloft.net> To: mitch.a.williams@intel.com Cc: hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: References: <1117824150.6071.34.camel@localhost.localdomain> <20050603.120126.41874584.davem@davemloft.net> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2053 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 785 Lines: 20 From: Mitch Williams Date: Fri, 3 Jun 2005 12:28:10 -0700 > In a typical NAPI polling loop, the driver processes receive packets until > it either hits the quota or runs out of packets. Then, at the end of the > loop, it returns all of those now-free receive resources back to the > hardware. 
> > With a heavy receive load, the hardware will run out of receive > descriptors in the time it takes the driver/NAPI/stack to process 64 > packets. So it drops them on the floor. And, as we know, dropped packets > are A Bad Thing. This is why you should replenish RX packets _IN_ your RX packet receive processing, not via some tasklet or other seperate work processing context. No wonder I never see this on tg3. It is the only way to do this cleanly. From jgarzik@pobox.com Fri Jun 3 13:24:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:24:16 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KO9Xq028590 for ; Fri, 3 Jun 2005 13:24:10 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.51 #1 (Red Hat Linux)) id 1DeIhC-00022M-Vr; Fri, 03 Jun 2005 20:23:11 +0000 Message-ID: <42A0BC2B.4020409@pobox.com> Date: Fri, 03 Jun 2005 16:23:07 -0400 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050328 Fedora/1.7.6-1.2.5 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com, mchan@broadcom.com Subject: Re: [PATCH]: Tigon3 new NAPI locking v2 References: <20050603.122558.88474819.davem@davemloft.net> In-Reply-To: <20050603.122558.88474819.davem@davemloft.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2054 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 3007 Lines: 97 David S. Miller wrote: > [TG3]: Eliminate all hw IRQ handler spinlocks. > > Move all driver spinlocks to be taken at sw IRQ > context only. 
>
> This fixes the skb_copy() we were doing with hw
> IRQs disabled (which is illegal and triggers a
> BUG() with HIGHMEM enabled). It also simplifies
> the locking all over the driver tremendously.
>
> We accomplish this feat by creating a special
> sequence to synchronize with the hw IRQ handler
> using a 2-bit atomic state.
>
> Signed-off-by: David S. Miller

overall, pretty spiffy :)

As further work, I would like to see how much (a lot? all?) of the timer
code could be moved into a workqueue, where we could kill the last of
the horrible-udelay loops in the driver. Particularly awful is

	while (++tick < 195000) {
		status = tg3_fiber_aneg_smachine(tp, &aninfo);
		if (status == ANEG_DONE || status == ANEG_FAILED)
			break;
		udelay(1);
	}

where you could freeze a uniprocessor box (lock out everything but
interrupts) for over 1 second.

IOW, the slower the phy, the more these slow-path delays can affect the
overall system. This is a MINOR, low priority issue; but long delays are
uglies that should be fixed, if it's relatively painless.

> +static void tg3_irq_quiesce(struct tg3 *tp)
> +{
> +	BUG_ON(test_bit(TG3_IRQSTATE_SYNC, &tp->irq_state));
> +
> +	set_bit(TG3_IRQSTATE_SYNC, &tp->irq_state);
> +	smp_mb();
> +	tw32(GRC_LOCAL_CTRL,
> +	     tp->grc_local_ctrl | GRC_LCLCTRL_SETINT);
> +
> +	while (!test_bit(TG3_IRQSTATE_COMPLETE, &tp->irq_state)) {
> +		u32 val = tr32(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW);
> +
> +		if (val == 0x00000001)
> +			break;
> +
> +		cpu_relax();
> +	}
> +}

* This loop makes me nervous... If there's a fault on the PCI bus or
the hardware is unplugged, val will equal 0xffffffff.

* A few comments for normal humans like "force an interrupt" and "wait
for interrupt handler to complete" might be nice.

* a BUG_ON(if-interrupts-are-disabled) line might be nice > +static inline int tg3_irq_sync(struct tg3 *tp) > +{ > + if (test_bit(TG3_IRQSTATE_SYNC, &tp->irq_state)) { > + set_bit(TG3_IRQSTATE_COMPLETE, &tp->irq_state); > + return 1; > + } > + return 0; > +} > + > +/* Fully shutdown all tg3 driver activity elsewhere in the system. > + * If irq_sync is non-zero, then the IRQ handler must be synchronized > + * with as well. Most of the time, this is not necessary except when > + * shutting down the device. > + */ > +static inline void tg3_full_lock(struct tg3 *tp, int irq_sync) > +{ > + if (irq_sync) > + tg3_irq_quiesce(tp); > + spin_lock_bh(&tp->lock); > + spin_lock(&tp->tx_lock); > +} Rather than an 'irq_sync' arg, my instinct would have been to create tg3_full_lock() and tg3_full_lock_sync(). This makes the action -much- more obvious to the reader, and since its inline doesn't cost anything (compiler's optimizer even does a tiny bit less work my way). Jeff From hadi@cyberus.ca Fri Jun 3 13:24:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:25:00 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KOsXq029246 for ; Fri, 3 Jun 2005 13:24:54 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1DeIi2-0007K2-72 for netdev@oss.sgi.com; Fri, 03 Jun 2005 16:24:02 -0400 Received: from [216.209.86.2] (helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DeIhy-0006OD-Jw; Fri, 03 Jun 2005 16:23:58 -0400 Subject: Re: RFC: NAPI packet weighting patch From: jamal Reply-To: hadi@cyberus.ca To: "David S. 
Miller" Cc: mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <1117827650.6071.59.camel@localhost.localdomain> References: <1117765954.6095.49.camel@localhost.localdomain> <1117824150.6071.34.camel@localhost.localdomain> <20050603.120126.41874584.davem@davemloft.net> <1117827650.6071.59.camel@localhost.localdomain> Content-Type: text/plain Organization: unknown Date: Fri, 03 Jun 2005 16:23:25 -0400 Message-Id: <1117830205.6071.81.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2055 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 636 Lines: 22 On Fri, 2005-03-06 at 15:40 -0400, jamal wrote: > On Fri, 2005-03-06 at 12:01 -0700, David S. Miller wrote: > > I think you are more than likely right. If we can instrument it Mitch > could check it out. Mitch would you like to try something that will > instrument this? I know i have seen this behavior but it was when i was > playing with some system that had a real small HZ. > Sorry, Its already there as Dave said in his email. Look for time_squeeze. Its the column i labeled XXXX below. 
----- $ cat /proc/net/softnet_stat 0000f938 00000000 XXXXXXX 00000000 00000000 00000000 00000000 00000000 00000000 ------ cheers, jamal From davem@davemloft.net Fri Jun 3 13:30:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:30:45 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KUYXq030460 for ; Fri, 3 Jun 2005 13:30:34 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeInD-0002DW-4v; Fri, 03 Jun 2005 13:29:23 -0700 Date: Fri, 03 Jun 2005 13:29:22 -0700 (PDT) Message-Id: <20050603.132922.63997492.davem@davemloft.net> To: mitch.a.williams@intel.com Cc: hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: <20050603.132257.23013342.davem@davemloft.net> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2056 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1152 Lines: 33 From: "David S. Miller" Date: Fri, 03 Jun 2005 13:22:57 -0700 (PDT) > This is why you should replenish RX packets _IN_ your > RX packet receive processing, not via some tasklet > or other seperate work processing context. > > No wonder I never see this on tg3. Actually, the problem is slightly different. E1000 processes the full QUOTA of RX packets, _THEN_ replenishes with new RX buffers. 
No wonder the chip runs out of RX descriptors. You should replenish _AS_ you grab RX packets off the receive queue, just as tg3 does. This allows you to accomplish two things: 1) Keep up with the chip so that it does not starve, regardless of dev->weight setting or system load. 2) Make intelligent decisions when RX buffer allocation fails. When we look at a RX descriptor in tg3 we never leave the descriptor empty. If replacement RX buffer fails, we simply ignore the RX packet we're looking at and give it back to the chip. Every driver should implement this policy. Drivers that do not do things this way run into all kinds of RX ring chip starvation issues like the ones you are seeing here. From mitch.a.williams@intel.com Fri Jun 3 13:31:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:31:16 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KVCXq030621 for ; Fri, 3 Jun 2005 13:31:12 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j53KSvYu018304; Fri, 3 Jun 2005 20:28:57 GMT Received: from nwlxmail01.jf.intel.com (nwlxmail01.jf.intel.com [10.7.171.40]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with ESMTP id j53KSvdD000620; Fri, 3 Jun 2005 20:28:57 GMT Received: from mawilli1-desk2.amr.corp.intel.com (mawilli1-desk2.amr.corp.intel.com [134.134.3.124]) by nwlxmail01.jf.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id j53KSvSL006410; Fri, 3 Jun 2005 13:28:57 -0700 Date: Fri, 3 Jun 2005 13:28:57 -0700 From: Mitch Williams X-X-Sender: mawilli1@mawilli1-desk2.amr.corp.intel.com To: jamal cc: "David S. 
Miller" , "Williams, Mitch A" , "Ronciak, John" , jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, "Venkatesan, Ganesh" , "Brandeburg, Jesse" Subject: Re: RFC: NAPI packet weighting patch In-Reply-To: <1117830205.6071.81.camel@localhost.localdomain> Message-ID: References: <1117765954.6095.49.camel@localhost.localdomain> <1117824150.6071.34.camel@localhost.localdomain> <20050603.120126.41874584.davem@davemloft.net> <1117827650.6071.59.camel@localhost.localdomain> <1117830205.6071.81.camel@localhost.localdomain> ReplyTo: "Mitch Williams" MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2057 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch.a.williams@intel.com Precedence: bulk X-list: netdev Content-Length: 464 Lines: 21 On Fri, 3 Jun 2005, jamal wrote: > > Sorry, Its already there as Dave said in his email. > Look for time_squeeze. Its the column i labeled XXXX below. > > ----- > $ cat /proc/net/softnet_stat > 0000f938 00000000 XXXXXXX 00000000 00000000 00000000 00000000 00000000 > 00000000 > ------ I might not be able to get into the lab today (they keep making me do work!), but I should be able to pop in Monday and take a look. Shouldn't take too long. 
Thanks, Mitch From davem@davemloft.net Fri Jun 3 13:31:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:31:57 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KVoXq031106 for ; Fri, 3 Jun 2005 13:31:50 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeIoT-0002EE-9j; Fri, 03 Jun 2005 13:30:41 -0700 Date: Fri, 03 Jun 2005 13:30:41 -0700 (PDT) Message-Id: <20050603.133041.35664164.davem@davemloft.net> To: Robert.Olsson@data.slu.se Cc: john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, hadi@cyberus.ca, mitch.a.williams@intel.com, netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: <17056.47835.583602.151291@robur.slu.se> References: <468F3FDA28AA87429AD807992E22D07E0450BFE8@orsmsx408> <17056.47835.583602.151291@robur.slu.se> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2058 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 852 Lines: 22 From: Robert Olsson Date: Fri, 3 Jun 2005 22:17:31 +0200 > It is possible that a lower weight forced your driver to disable interrupts > and do packet reception w/o interrupts often this is more efficient as > we get rid intr. latency etc. > > Again I think weight should only used for fairness and not control the > threshold when to disable interrupts. > > You can test with a new policy in e1000_clean so you schedule for a new > poll if work_done (any pkts received) or tx_cleaned is true. I don't think this is it. 
What's happening is that E1000 pulls up to a full dev->quota of packets off the ring, and _THEN_ goes back and does RX buffer replenishing. It is very clear why E1000 runs out of RX descriptors with this kind of policy. I outlined a way to fix this in the E1000 driver in another email. From greearb@candelatech.com Fri Jun 3 13:32:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:32:08 -0700 (PDT) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KW1Xq031239 for ; Fri, 3 Jun 2005 13:32:01 -0700 Received: from [71.112.207.80] (pool-71-112-207-80.sttlwa.dsl-w.verizon.net [71.112.207.80]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id j53L4P5I004513; Fri, 3 Jun 2005 14:04:25 -0700 Message-ID: <42A0BDFE.1020607@candelatech.com> Date: Fri, 03 Jun 2005 13:30:54 -0700 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050513 Fedora/1.7.8-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mitch Williams CC: "David S. Miller" , hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch References: <1117765954.6095.49.camel@localhost.localdomain> <1117824150.6071.34.camel@localhost.localdomain> <20050603.120126.41874584.davem@davemloft.net> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2059 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 3639 Lines: 86 Mitch Williams wrote: > > On Fri, 3 Jun 2005, David S. 
Miller wrote:
>
>
>>From: jamal
>>Date: Fri, 03 Jun 2005 14:42:30 -0400
>>
>>
>>>When you reduce the weight, the system is spending less time in the
>>>softirq processing packets before softirq yields. If this gives more
>>>opportunity to your app to run, then the performance will go up.
>>>Is this what you are seeing?
>>
>>Jamal, this is my current theory as well, we hit the jiffies
>>check.
>
>
> Well, I hate to mess up your guys' theories, but the real reason is
> simpler: hardware receive resources, specifically descriptors and
> buffers.
>
> In a typical NAPI polling loop, the driver processes receive packets until
> it either hits the quota or runs out of packets. Then, at the end of the
> loop, it returns all of those now-free receive resources back to the
> hardware.
>
> With a heavy receive load, the hardware will run out of receive
> descriptors in the time it takes the driver/NAPI/stack to process 64
> packets. So it drops them on the floor. And, as we know, dropped packets
> are A Bad Thing.

If it can fill up more than 190 RX descriptors in the time it takes NAPI
to pull 64, then there is no possible way to not drop packets! How could
NAPI ever keep up if what you say is true?

> By reducing the driver weight, we cause the driver to give receive
> resources back to the hardware more often, which prevents dropped packets.
>
> As Ben Greer noticed, increasing the number of descriptors can help with
> this issue. But it really can't eliminate the problem -- once the ring
> is full, it doesn't matter how big it is, it's still full.

If you have 1024 rx descriptors, and the NAPI poll pulls off 64 at one
time, I do not see how pulling off 20 could be any more useful. Either
way, you have more than 900 other RX descriptors to be received.

Even if you only have the default of 256, the NIC should be able to
continue receiving packets with the other 190 or so descriptors while
NAPI is doing its receive poll.
If the buffers are often nearly used up, then the problem is that the
NAPI poll cannot pull the packets fast enough, and again, I do not see
how making it do more polls could make it able to pull packets from the
NIC more efficiently.

Maybe you could instrument the NAPI receive logic to see if there is
some horrible waste of CPU and/or time when it tries to pull larger
amounts of packets at once? A linear increase in work cannot explain
what you are describing.

> In my testing (Dual 2.8GHz Xeon, PCI-X bus, Gigabit network, 10 clients),
> I was able to completely eliminate dropped packets in most cases by
> reducing the driver weight down to about 20.

Could you at least tell us what type of traffic you are using? TCP with
MTU-sized packets, or a traffic generator with 60-byte packets? What
actual (aggregate) speed are you running? Is the traffic full-duplex or
mostly uni-directional? How many packets per second are you receiving
and transmitting when the drops occur?

On a dual 2.8GHz Xeon system with PCI-X bus, with a quad-port Intel
PRO/1000 NIC, I can run about 950Mbps of traffic, bi-directional, on two
ports at the same time, and drop few or no packets (MTU-sized packets
here). This is using a modified version of pktgen, btw.

So, if you are seeing any amount of dropped pkts on a single NIC,
especially if you are mostly doing uni-directional traffic, then I think
the problem might be elsewhere, because the stock 2.6.11 and similar
kernels can easily handle this amount of network traffic.
Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From davem@davemloft.net Fri Jun 3 13:32:45 2005
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:32:49 -0700 (PDT)
Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KWjXq032093 for ; Fri, 3 Jun 2005 13:32:45 -0700
Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DeIpJ-0002EY-N9; Fri, 03 Jun 2005 13:31:33 -0700
Date: Fri, 03 Jun 2005 13:31:33 -0700 (PDT)
Message-Id: <20050603.133133.38710501.davem@davemloft.net>
To: hadi@cyberus.ca
Cc: mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com
Subject: Re: RFC: NAPI packet weighting patch
From: "David S. Miller"
In-Reply-To: <1117828771.6071.77.camel@localhost.localdomain>
References: <20050603.120126.41874584.davem@davemloft.net> <1117828771.6071.77.camel@localhost.localdomain>
X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-archive-position: 2060
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev
Content-Length: 498
Lines: 13

From: jamal
Date: Fri, 03 Jun 2005 15:59:31 -0400

> But one that you could validate by putting proper hooks. As an example,
> try to restore a descriptor every time you pick one - for an example of
> this look at the sb1250 driver.

Yes, this in my mind is exactly the problem.

TG3 does this properly, as do several other drivers. You should never
defer RX buffer replenishment, you should always do it as you grab
packets off of the ring. You will starve the chip otherwise.
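
The replenish-as-you-go policy described in this thread can be sketched
roughly as follows. This is a minimal user-space illustration, not code
from tg3, e1000, or any other driver: struct rx_slot, struct rx_stats,
alloc_rx_buf() and rx_poll() are hypothetical names, the "chip" is just
a flag per slot, and free() stands in for handing the packet up the
stack. The point it tries to show is that every slot gets its buffer
back before the loop moves on, and that an allocation failure costs one
packet rather than one descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define RING_SIZE 8

/* One RX descriptor in a simulated ring (hypothetical layout). */
struct rx_slot {
    void *buf;   /* buffer currently posted to the simulated chip */
    bool  done;  /* set when the chip has written a packet into buf */
};

struct rx_stats {
    int delivered; /* packets handed up the stack */
    int recycled;  /* packets dropped so the slot could keep its buffer */
};

/* Hypothetical allocator hook: fails once *budget reaches zero. */
static void *alloc_rx_buf(int *budget)
{
    if (*budget <= 0)
        return NULL;
    (*budget)--;
    return malloc(64);
}

/* Process up to quota completed slots, refilling each one immediately. */
static int rx_poll(struct rx_slot *ring, int *head, int quota,
                   int *alloc_budget, struct rx_stats *st)
{
    int work = 0;

    while (work < quota && ring[*head].done) {
        struct rx_slot *slot = &ring[*head];
        void *fresh = alloc_rx_buf(alloc_budget);

        if (fresh) {
            /* Pass the filled buffer up and install the replacement now. */
            free(slot->buf); /* stands in for delivering the packet */
            slot->buf = fresh;
            st->delivered++;
        } else {
            /* Allocation failed: drop this packet and reuse its buffer. */
            st->recycled++;
        }

        assert(slot->buf != NULL); /* the descriptor is never left empty */
        slot->done = false;        /* slot goes straight back to the chip */
        *head = (*head + 1) % RING_SIZE;
        work++;
    }
    return work;
}
```

With five completed slots and an allocator that can only satisfy two
requests, this sketch delivers two packets, recycles three, and leaves
all eight slots with a buffer attached.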
From gwingerde@home.nl Fri Jun 3 13:39:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:39:34 -0700 (PDT) Received: from smtpq2.home.nl (smtpq2.home.nl [213.51.128.197]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KdTXq000881 for ; Fri, 3 Jun 2005 13:39:29 -0700 Received: from [213.51.128.134] (port=56790 helo=smtp3.home.nl) by smtpq2.home.nl with esmtp (Exim 4.30) id 1DeIw3-0007qn-85; Fri, 03 Jun 2005 22:38:31 +0200 Received: from cc10088-a.ensch1.ov.home.nl ([217.123.128.105]:47093 helo=[192.168.14.1]) by smtp3.home.nl with esmtp (Exim 4.30) id 1DeIw1-0006TR-5a; Fri, 03 Jun 2005 22:38:29 +0200 Message-ID: <42A0BE19.3060503@home.nl> Date: Fri, 03 Jun 2005 22:31:21 +0200 From: Gertjan van Wingerde User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050322) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com CC: jgarzik@pobox.com Subject: [PATCH 1/2] ieee80211: Update generic definitions to latest specs - take #2 Content-Type: multipart/mixed; boundary="------------060209080801050100020902" X-AtHome-MailScanner-Information: Neem contact op met support@home.nl voor meer informatie X-AtHome-MailScanner: Found to be clean X-archive-position: 2062 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gwingerde@home.nl Precedence: bulk X-list: netdev Content-Length: 10875 Lines: 296 This is a multi-part message in MIME format. --------------060209080801050100020902 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi, Attached patch updates the definitions of the generic ieee80211 stack to the latest versions of the published 802.11x specification suite. Please review and apply. 
Signed-off-by: Gertjan van Wingerde Thanks, Gertjan van Wingerde --------------060209080801050100020902 Content-Type: text/plain; name="ieee80211-new-definitions.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ieee80211-new-definitions.diff" Index: include/net/ieee80211.h =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/include/net/ieee80211.h (mode:100644) +++ uncommitted/include/net/ieee80211.h (mode:100644) @@ -103,7 +103,7 @@ #define MAX_FRAG_THRESHOLD 2346U /* Frame control field constants */ -#define IEEE80211_FCTL_VERS 0x0002 +#define IEEE80211_FCTL_VERS 0x0003 #define IEEE80211_FCTL_FTYPE 0x000c #define IEEE80211_FCTL_STYPE 0x00f0 #define IEEE80211_FCTL_TODS 0x0100 @@ -111,8 +111,8 @@ #define IEEE80211_FCTL_MOREFRAGS 0x0400 #define IEEE80211_FCTL_RETRY 0x0800 #define IEEE80211_FCTL_PM 0x1000 -#define IEEE80211_FCTL_MOREDATA 0x2000 -#define IEEE80211_FCTL_WEP 0x4000 +#define IEEE80211_FCTL_MOREDATA 0x2000 +#define IEEE80211_FCTL_PROTECTED 0x4000 #define IEEE80211_FCTL_ORDER 0x8000 #define IEEE80211_FTYPE_MGMT 0x0000 @@ -131,6 +131,7 @@ #define IEEE80211_STYPE_DISASSOC 0x00A0 #define IEEE80211_STYPE_AUTH 0x00B0 #define IEEE80211_STYPE_DEAUTH 0x00C0 +#define IEEE80211_STYPE_ACTION 0x00D0 /* control */ #define IEEE80211_STYPE_PSPOLL 0x00A0 @@ -251,6 +252,7 @@ #define SNAP_SIZE sizeof(struct ieee80211_snap_hdr) +#define WLAN_FC_GET_VERS(fc) ((fc) & IEEE80211_FCTL_VERS) #define WLAN_FC_GET_TYPE(fc) ((fc) & IEEE80211_FCTL_FTYPE) #define WLAN_FC_GET_STYPE(fc) ((fc) & IEEE80211_FCTL_STYPE) @@ -271,6 +273,9 @@ #define WLAN_CAPABILITY_SHORT_PREAMBLE (1<<5) #define WLAN_CAPABILITY_PBCC (1<<6) #define WLAN_CAPABILITY_CHANNEL_AGILITY (1<<7) +#define WLAN_CAPABILITY_SPECTRUM_MGMT (1<<8) +#define WLAN_CAPABILITY_SHORT_SLOT_TIME (1<<10) +#define WLAN_CAPABILITY_OSSS_OFDM (1<<13) /* Status codes */ #define WLAN_STATUS_SUCCESS 0 @@ -285,9 +290,24 @@ #define 
WLAN_STATUS_AP_UNABLE_TO_HANDLE_NEW_STA 17 #define WLAN_STATUS_ASSOC_DENIED_RATES 18 /* 802.11b */ -#define WLAN_STATUS_ASSOC_DENIED_NOSHORT 19 +#define WLAN_STATUS_ASSOC_DENIED_NOSHORTPREAMBLE 19 #define WLAN_STATUS_ASSOC_DENIED_NOPBCC 20 #define WLAN_STATUS_ASSOC_DENIED_NOAGILITY 21 +/* 802.11h */ +#define WLAN_STATUS_ASSOC_DENIED_NOSPECTRUM 22 +#define WLAN_STATUS_ASSOC_REJECTED_BAD_POWER 23 +#define WLAN_STATUS_ASSOC_REJECTED_BAD_SUPP_CHAN 24 +/* 802.11g */ +#define WLAN_STATUS_ASSOC_DENIED_NOSHORTTIME 25 +#define WLAN_STATUS_ASSOC_DENIED_NODSSSOFDM 26 +/* 802.11i */ +#define WLAN_STATUS_INVALID_IE 40 +#define WLAN_STATUS_INVALID_GROUP_CIPHER 41 +#define WLAN_STATUS_INVALID_PAIRWISE_CIPHER 42 +#define WLAN_STATUS_INVALID_AKMP 43 +#define WLAN_STATUS_UNSUPP_RSN_VERSION 44 +#define WLAN_STATUS_INVALID_RSN_IE_CAP 45 +#define WLAN_STATUS_CIPHER_SUITE_REJECTED 46 /* Reason codes */ #define WLAN_REASON_UNSPECIFIED 1 @@ -299,6 +319,22 @@ #define WLAN_REASON_CLASS3_FRAME_FROM_NONASSOC_STA 7 #define WLAN_REASON_DISASSOC_STA_HAS_LEFT 8 #define WLAN_REASON_STA_REQ_ASSOC_WITHOUT_AUTH 9 +/* 802.11h */ +#define WLAN_REASON_DISASSOC_BAD_POWER 10 +#define WLAN_REASON_DISASSOC_BAD_SUPP_CHAN 11 +/* 802.11i */ +#define WLAN_REASON_INVALID_IE 13 +#define WLAN_REASON_MIC_FAILURE 14 +#define WLAN_REASON_4WAY_HANDSHAKE_TIMEOUT 15 +#define WLAN_REASON_GROUP_KEY_HANDSHAKE_TIMEOUT 16 +#define WLAN_REASON_IE_DIFFERENT 17 +#define WLAN_REASON_INVALID_GROUP_CIPHER 18 +#define WLAN_REASON_INVALID_PAIRWISE_CIPHER 19 +#define WLAN_REASON_INVALID_AKMP 20 +#define WLAN_REASON_UNSUPP_RSN_VERSION 21 +#define WLAN_REASON_INVALID_RSN_IE_CAP 22 +#define WLAN_REASON_IEEE8021X_FAILED 23 +#define WLAN_REASON_CIPHER_SUITE_REJECTED 24 #define IEEE80211_STATMASK_SIGNAL (1<<0) @@ -477,17 +513,32 @@ #define BEACON_PROBE_SSID_ID_POSITION 12 /* Management Frame Information Element Types */ -#define MFIE_TYPE_SSID 0 -#define MFIE_TYPE_RATES 1 -#define MFIE_TYPE_FH_SET 2 -#define MFIE_TYPE_DS_SET 3 -#define 
MFIE_TYPE_CF_SET 4 -#define MFIE_TYPE_TIM 5 -#define MFIE_TYPE_IBSS_SET 6 -#define MFIE_TYPE_CHALLENGE 16 -#define MFIE_TYPE_RSN 48 -#define MFIE_TYPE_RATES_EX 50 -#define MFIE_TYPE_GENERIC 221 +#define MFIE_TYPE_SSID 0 +#define MFIE_TYPE_RATES 1 +#define MFIE_TYPE_FH_SET 2 +#define MFIE_TYPE_DS_SET 3 +#define MFIE_TYPE_CF_SET 4 +#define MFIE_TYPE_TIM 5 +#define MFIE_TYPE_IBSS_SET 6 +#define MFIE_TYPE_COUNTRY 7 +#define MFIE_TYPE_HOP_PARAMS 8 +#define MFIE_TYPE_HOP_TABLE 9 +#define MFIE_TYPE_REQUEST 10 +#define MFIE_TYPE_CHALLENGE 16 +#define MFIE_TYPE_POWER_CONSTRAINT 32 +#define MFIE_TYPE_POWER_CAPABILITY 33 +#define MFIE_TYPE_TPC_REQUEST 34 +#define MFIE_TYPE_TPC_REPORT 35 +#define MFIE_TYPE_SUPP_CHANNELS 36 +#define MFIE_TYPE_CSA 37 +#define MFIE_TYPE_MEASURE_REQUEST 38 +#define MFIE_TYPE_MEASURE_REPORT 39 +#define MFIE_TYPE_QUIET 40 +#define MFIE_TYPE_IBSS_DFS 41 +#define MFIE_TYPE_ERP_INFO 42 +#define MFIE_TYPE_RSN 48 +#define MFIE_TYPE_RATES_EX 50 +#define MFIE_TYPE_GENERIC 221 struct ieee80211_info_element_hdr { u8 id; Index: net/ieee80211/ieee80211_rx.c =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/net/ieee80211/ieee80211_rx.c (mode:100644) +++ uncommitted/net/ieee80211/ieee80211_rx.c (mode:100644) @@ -440,7 +440,7 @@ crypt->ops->decrypt_mpdu == NULL)) crypt = NULL; - if (!crypt && (fc & IEEE80211_FCTL_WEP)) { + if (!crypt && (fc & IEEE80211_FCTL_PROTECTED)) { /* This seems to be triggered by some (multicast?) 
* frames from other than current BSS, so just drop the * frames silently instead of filling system log with @@ -456,7 +456,7 @@ #ifdef NOT_YET if (type != WLAN_FC_TYPE_DATA) { if (type == WLAN_FC_TYPE_MGMT && stype == WLAN_FC_STYPE_AUTH && - fc & IEEE80211_FCTL_WEP && ieee->host_decrypt && + fc & IEEE80211_FCTL_PROTECTED && ieee->host_decrypt && (keyidx = hostap_rx_frame_decrypt(ieee, skb, crypt)) < 0) { printk(KERN_DEBUG "%s: failed to decrypt mgmt::auth " @@ -557,7 +557,7 @@ /* skb: hdr + (possibly fragmented, possibly encrypted) payload */ - if (ieee->host_decrypt && (fc & IEEE80211_FCTL_WEP) && + if (ieee->host_decrypt && (fc & IEEE80211_FCTL_PROTECTED) && (keyidx = ieee80211_rx_frame_decrypt(ieee, skb, crypt)) < 0) goto rx_dropped; @@ -565,7 +565,7 @@ /* skb: hdr + (possibly fragmented) plaintext payload */ // PR: FIXME: hostap has additional conditions in the "if" below: - // ieee->host_decrypt && (fc & IEEE80211_FCTL_WEP) && + // ieee->host_decrypt && (fc & IEEE80211_FCTL_PROTECTED) && if ((frag != 0 || (fc & IEEE80211_FCTL_MOREFRAGS))) { int flen; struct sk_buff *frag_skb = ieee80211_frag_cache_get(ieee, hdr); @@ -621,12 +621,12 @@ /* skb: hdr + (possible reassembled) full MSDU payload; possibly still * encrypted/authenticated */ - if (ieee->host_decrypt && (fc & IEEE80211_FCTL_WEP) && + if (ieee->host_decrypt && (fc & IEEE80211_FCTL_PROTECTED) && ieee80211_rx_frame_decrypt_msdu(ieee, skb, keyidx, crypt)) goto rx_dropped; hdr = (struct ieee80211_hdr *) skb->data; - if (crypt && !(fc & IEEE80211_FCTL_WEP) && !ieee->open_wep) { + if (crypt && !(fc & IEEE80211_FCTL_PROTECTED) && !ieee->open_wep) { if (/*ieee->ieee802_1x &&*/ ieee80211_is_eapol_frame(ieee, skb)) { #ifdef CONFIG_IEEE80211_DEBUG @@ -647,7 +647,7 @@ } #ifdef CONFIG_IEEE80211_DEBUG - if (crypt && !(fc & IEEE80211_FCTL_WEP) && + if (crypt && !(fc & IEEE80211_FCTL_PROTECTED) && ieee80211_is_eapol_frame(ieee, skb)) { struct eapol *eap = (struct eapol *)(skb->data + 24); @@ -656,7 +656,7 @@ } #endif - 
if (crypt && !(fc & IEEE80211_FCTL_WEP) && !ieee->open_wep && + if (crypt && !(fc & IEEE80211_FCTL_PROTECTED) && !ieee->open_wep && !ieee80211_is_eapol_frame(ieee, skb)) { IEEE80211_DEBUG_DROP( "dropped unencrypted RX data " Index: net/ieee80211/ieee80211_tx.c =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/net/ieee80211/ieee80211_tx.c (mode:100644) +++ uncommitted/net/ieee80211/ieee80211_tx.c (mode:100644) @@ -314,7 +314,7 @@ if (encrypt) fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA | - IEEE80211_FCTL_WEP; + IEEE80211_FCTL_PROTECTED; else fc = IEEE80211_FTYPE_DATA | IEEE80211_STYPE_DATA; Index: drivers/net/wireless/atmel.c =================================================================== --- 4b4ba76aa81b3627142787262fd2f8049dd3662d/drivers/net/wireless/atmel.c (mode:100644) +++ uncommitted/drivers/net/wireless/atmel.c (mode:100644) @@ -867,7 +867,7 @@ header.duration_id = 0; header.seq_ctl = 0; if (priv->wep_is_on) - frame_ctl |= IEEE80211_FCTL_WEP; + frame_ctl |= IEEE80211_FCTL_PROTECTED; if (priv->operating_mode == IW_MODE_ADHOC) { memcpy(&header.addr1, skb->data, 6); memcpy(&header.addr2, dev->dev_addr, 6); @@ -1117,7 +1117,7 @@ /* probe for CRC use here if needed once five packets have arrived with the same crc status, we assume we know what's happening and stop probing */ if (priv->probe_crc) { - if (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_WEP)) { + if (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_PROTECTED)) { priv->do_rx_crc = probe_crc(priv, rx_packet_loc, msdu_size); } else { priv->do_rx_crc = probe_crc(priv, rx_packet_loc + 24, msdu_size - 24); @@ -1132,7 +1132,7 @@ } /* don't CRC header when WEP in use */ - if (priv->do_rx_crc && (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_WEP))) { + if (priv->do_rx_crc && (!priv->wep_is_on || !(frame_ctl & IEEE80211_FCTL_PROTECTED))) { crc = crc32_le(0xffffffff, (unsigned char *)&header, 24); } msdu_size -= 24; /* header */ @@ 
-2677,7 +2677,7 @@ auth.alg = cpu_to_le16(C80211_MGMT_AAN_SHAREDKEY); /* no WEP for authentication frames with TrSeqNo 1 */ if (priv->CurrentAuthentTransactionSeqNum != 1) - header.frame_ctl |= cpu_to_le16(IEEE80211_FCTL_WEP); + header.frame_ctl |= cpu_to_le16(IEEE80211_FCTL_PROTECTED); } else { auth.alg = cpu_to_le16(C80211_MGMT_AAN_OPENSYSTEM); } --------------060209080801050100020902-- From gwingerde@home.nl Fri Jun 3 13:39:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:39:38 -0700 (PDT) Received: from smtpq3.home.nl (smtpq3.home.nl [213.51.128.198]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KdYXq000900 for ; Fri, 3 Jun 2005 13:39:34 -0700 Received: from [213.51.128.133] (port=52706 helo=smtp2.home.nl) by smtpq3.home.nl with esmtp (Exim 4.30) id 1DeIw6-0001k0-7U; Fri, 03 Jun 2005 22:38:34 +0200 Received: from cc10088-a.ensch1.ov.home.nl ([217.123.128.105]:47094 helo=[192.168.14.1]) by smtp2.home.nl with esmtp (Exim 4.30) id 1DeIw4-00051c-PM; Fri, 03 Jun 2005 22:38:32 +0200 Message-ID: <42A0BE1C.6080904@home.nl> Date: Fri, 03 Jun 2005 22:31:24 +0200 From: Gertjan van Wingerde User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050322) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com CC: jgarzik@pobox.com Subject: [PATCH 2/2] ieee80211: Update generic definitions to latest specs - take #2 Content-Type: multipart/mixed; boundary="------------080907000309050506010000" X-AtHome-MailScanner-Information: Neem contact op met support@home.nl voor meer informatie X-AtHome-MailScanner: Found to be clean X-archive-position: 2063 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gwingerde@home.nl Precedence: bulk X-list: netdev Content-Length: 7315 Lines: 213 This is a multi-part message in MIME format. 
--------------080907000309050506010000 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi, Attached patch cleans up the long lists of #defines for status codes, reason codes, and information elements. Signed-off-by: Gertjan van Wingerde Thanks, Gertjan van Wingerde --------------080907000309050506010000 Content-Type: text/plain; name="ieee80211-cleanup.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="ieee80211-cleanup.diff" Index: include/net/ieee80211.h =================================================================== --- eb77617da695526508e860d3775afc781de70dea/include/net/ieee80211.h (mode:100644) +++ uncommitted/include/net/ieee80211.h (mode:100644) @@ -278,63 +278,67 @@ #define WLAN_CAPABILITY_OSSS_OFDM (1<<13) /* Status codes */ -#define WLAN_STATUS_SUCCESS 0 -#define WLAN_STATUS_UNSPECIFIED_FAILURE 1 -#define WLAN_STATUS_CAPS_UNSUPPORTED 10 -#define WLAN_STATUS_REASSOC_NO_ASSOC 11 -#define WLAN_STATUS_ASSOC_DENIED_UNSPEC 12 -#define WLAN_STATUS_NOT_SUPPORTED_AUTH_ALG 13 -#define WLAN_STATUS_UNKNOWN_AUTH_TRANSACTION 14 -#define WLAN_STATUS_CHALLENGE_FAIL 15 -#define WLAN_STATUS_AUTH_TIMEOUT 16 -#define WLAN_STATUS_AP_UNABLE_TO_HANDLE_NEW_STA 17 -#define WLAN_STATUS_ASSOC_DENIED_RATES 18 -/* 802.11b */ -#define WLAN_STATUS_ASSOC_DENIED_NOSHORTPREAMBLE 19 -#define WLAN_STATUS_ASSOC_DENIED_NOPBCC 20 -#define WLAN_STATUS_ASSOC_DENIED_NOAGILITY 21 -/* 802.11h */ -#define WLAN_STATUS_ASSOC_DENIED_NOSPECTRUM 22 -#define WLAN_STATUS_ASSOC_REJECTED_BAD_POWER 23 -#define WLAN_STATUS_ASSOC_REJECTED_BAD_SUPP_CHAN 24 -/* 802.11g */ -#define WLAN_STATUS_ASSOC_DENIED_NOSHORTTIME 25 -#define WLAN_STATUS_ASSOC_DENIED_NODSSSOFDM 26 -/* 802.11i */ -#define WLAN_STATUS_INVALID_IE 40 -#define WLAN_STATUS_INVALID_GROUP_CIPHER 41 -#define WLAN_STATUS_INVALID_PAIRWISE_CIPHER 42 -#define WLAN_STATUS_INVALID_AKMP 43 -#define WLAN_STATUS_UNSUPP_RSN_VERSION 44 -#define WLAN_STATUS_INVALID_RSN_IE_CAP 45 -#define 
WLAN_STATUS_CIPHER_SUITE_REJECTED 46 +enum ieee80211_statuscode { + WLAN_STATUS_SUCCESS = 0, + WLAN_STATUS_UNSPECIFIED_FAILURE = 1, + WLAN_STATUS_CAPS_UNSUPPORTED = 10, + WLAN_STATUS_REASSOC_NO_ASSOC = 11, + WLAN_STATUS_ASSOC_DENIED_UNSPEC = 12, + WLAN_STATUS_NOT_SUPPORTED_AUTH_ALG = 13, + WLAN_STATUS_UNKNOWN_AUTH_TRANSACTION = 14, + WLAN_STATUS_CHALLENGE_FAIL = 15, + WLAN_STATUS_AUTH_TIMEOUT = 16, + WLAN_STATUS_AP_UNABLE_TO_HANDLE_NEW_STA = 17, + WLAN_STATUS_ASSOC_DENIED_RATES = 18, + /* 802.11b */ + WLAN_STATUS_ASSOC_DENIED_NOSHORTPREAMBLE = 19, + WLAN_STATUS_ASSOC_DENIED_NOPBCC = 20, + WLAN_STATUS_ASSOC_DENIED_NOAGILITY = 21, + /* 802.11h */ + WLAN_STATUS_ASSOC_DENIED_NOSPECTRUM = 22, + WLAN_STATUS_ASSOC_REJECTED_BAD_POWER = 23, + WLAN_STATUS_ASSOC_REJECTED_BAD_SUPP_CHAN = 24, + /* 802.11g */ + WLAN_STATUS_ASSOC_DENIED_NOSHORTTIME = 25, + WLAN_STATUS_ASSOC_DENIED_NODSSSOFDM = 26, + /* 802.11i */ + WLAN_STATUS_INVALID_IE = 40, + WLAN_STATUS_INVALID_GROUP_CIPHER = 41, + WLAN_STATUS_INVALID_PAIRWISE_CIPHER = 42, + WLAN_STATUS_INVALID_AKMP = 43, + WLAN_STATUS_UNSUPP_RSN_VERSION = 44, + WLAN_STATUS_INVALID_RSN_IE_CAP = 45, + WLAN_STATUS_CIPHER_SUITE_REJECTED = 46, +}; /* Reason codes */ -#define WLAN_REASON_UNSPECIFIED 1 -#define WLAN_REASON_PREV_AUTH_NOT_VALID 2 -#define WLAN_REASON_DEAUTH_LEAVING 3 -#define WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY 4 -#define WLAN_REASON_DISASSOC_AP_BUSY 5 -#define WLAN_REASON_CLASS2_FRAME_FROM_NONAUTH_STA 6 -#define WLAN_REASON_CLASS3_FRAME_FROM_NONASSOC_STA 7 -#define WLAN_REASON_DISASSOC_STA_HAS_LEFT 8 -#define WLAN_REASON_STA_REQ_ASSOC_WITHOUT_AUTH 9 -/* 802.11h */ -#define WLAN_REASON_DISASSOC_BAD_POWER 10 -#define WLAN_REASON_DISASSOC_BAD_SUPP_CHAN 11 -/* 802.11i */ -#define WLAN_REASON_INVALID_IE 13 -#define WLAN_REASON_MIC_FAILURE 14 -#define WLAN_REASON_4WAY_HANDSHAKE_TIMEOUT 15 -#define WLAN_REASON_GROUP_KEY_HANDSHAKE_TIMEOUT 16 -#define WLAN_REASON_IE_DIFFERENT 17 -#define WLAN_REASON_INVALID_GROUP_CIPHER 18 -#define 
WLAN_REASON_INVALID_PAIRWISE_CIPHER 19 -#define WLAN_REASON_INVALID_AKMP 20 -#define WLAN_REASON_UNSUPP_RSN_VERSION 21 -#define WLAN_REASON_INVALID_RSN_IE_CAP 22 -#define WLAN_REASON_IEEE8021X_FAILED 23 -#define WLAN_REASON_CIPHER_SUITE_REJECTED 24 +enum ieee80211_reasoncode { + WLAN_REASON_UNSPECIFIED = 1, + WLAN_REASON_PREV_AUTH_NOT_VALID = 2, + WLAN_REASON_DEAUTH_LEAVING = 3, + WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY = 4, + WLAN_REASON_DISASSOC_AP_BUSY = 5, + WLAN_REASON_CLASS2_FRAME_FROM_NONAUTH_STA = 6, + WLAN_REASON_CLASS3_FRAME_FROM_NONASSOC_STA = 7, + WLAN_REASON_DISASSOC_STA_HAS_LEFT = 8, + WLAN_REASON_STA_REQ_ASSOC_WITHOUT_AUTH = 9, + /* 802.11h */ + WLAN_REASON_DISASSOC_BAD_POWER = 10, + WLAN_REASON_DISASSOC_BAD_SUPP_CHAN = 11, + /* 802.11i */ + WLAN_REASON_INVALID_IE = 13, + WLAN_REASON_MIC_FAILURE = 14, + WLAN_REASON_4WAY_HANDSHAKE_TIMEOUT = 15, + WLAN_REASON_GROUP_KEY_HANDSHAKE_TIMEOUT = 16, + WLAN_REASON_IE_DIFFERENT = 17, + WLAN_REASON_INVALID_GROUP_CIPHER = 18, + WLAN_REASON_INVALID_PAIRWISE_CIPHER = 19, + WLAN_REASON_INVALID_AKMP = 20, + WLAN_REASON_UNSUPP_RSN_VERSION = 21, + WLAN_REASON_INVALID_RSN_IE_CAP = 22, + WLAN_REASON_IEEE8021X_FAILED = 23, + WLAN_REASON_CIPHER_SUITE_REJECTED = 24, +}; #define IEEE80211_STATMASK_SIGNAL (1<<0) @@ -513,32 +517,34 @@ #define BEACON_PROBE_SSID_ID_POSITION 12 /* Management Frame Information Element Types */ -#define MFIE_TYPE_SSID 0 -#define MFIE_TYPE_RATES 1 -#define MFIE_TYPE_FH_SET 2 -#define MFIE_TYPE_DS_SET 3 -#define MFIE_TYPE_CF_SET 4 -#define MFIE_TYPE_TIM 5 -#define MFIE_TYPE_IBSS_SET 6 -#define MFIE_TYPE_COUNTRY 7 -#define MFIE_TYPE_HOP_PARAMS 8 -#define MFIE_TYPE_HOP_TABLE 9 -#define MFIE_TYPE_REQUEST 10 -#define MFIE_TYPE_CHALLENGE 16 -#define MFIE_TYPE_POWER_CONSTRAINT 32 -#define MFIE_TYPE_POWER_CAPABILITY 33 -#define MFIE_TYPE_TPC_REQUEST 34 -#define MFIE_TYPE_TPC_REPORT 35 -#define MFIE_TYPE_SUPP_CHANNELS 36 -#define MFIE_TYPE_CSA 37 -#define MFIE_TYPE_MEASURE_REQUEST 38 -#define 
MFIE_TYPE_MEASURE_REPORT 39 -#define MFIE_TYPE_QUIET 40 -#define MFIE_TYPE_IBSS_DFS 41 -#define MFIE_TYPE_ERP_INFO 42 -#define MFIE_TYPE_RSN 48 -#define MFIE_TYPE_RATES_EX 50 -#define MFIE_TYPE_GENERIC 221 +enum ieee80211_mfie { + MFIE_TYPE_SSID = 0, + MFIE_TYPE_RATES = 1, + MFIE_TYPE_FH_SET = 2, + MFIE_TYPE_DS_SET = 3, + MFIE_TYPE_CF_SET = 4, + MFIE_TYPE_TIM = 5, + MFIE_TYPE_IBSS_SET = 6, + MFIE_TYPE_COUNTRY = 7, + MFIE_TYPE_HOP_PARAMS = 8, + MFIE_TYPE_HOP_TABLE = 9, + MFIE_TYPE_REQUEST = 10, + MFIE_TYPE_CHALLENGE = 16, + MFIE_TYPE_POWER_CONSTRAINT = 32, + MFIE_TYPE_POWER_CAPABILITY = 33, + MFIE_TYPE_TPC_REQUEST = 34, + MFIE_TYPE_TPC_REPORT = 35, + MFIE_TYPE_SUPP_CHANNELS = 36, + MFIE_TYPE_CSA = 37, + MFIE_TYPE_MEASURE_REQUEST = 38, + MFIE_TYPE_MEASURE_REPORT = 39, + MFIE_TYPE_QUIET = 40, + MFIE_TYPE_IBSS_DFS = 41, + MFIE_TYPE_ERP_INFO = 42, + MFIE_TYPE_RSN = 48, + MFIE_TYPE_RATES_EX = 50, + MFIE_TYPE_GENERIC = 221, +}; struct ieee80211_info_element_hdr { u8 id; --------------080907000309050506010000-- From gwingerde@home.nl Fri Jun 3 13:39:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:39:31 -0700 (PDT) Received: from smtpq2.home.nl (smtpq2.home.nl [213.51.128.197]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KdRXq000874 for ; Fri, 3 Jun 2005 13:39:28 -0700 Received: from [213.51.128.134] (port=56778 helo=smtp3.home.nl) by smtpq2.home.nl with esmtp (Exim 4.30) id 1DeIvx-0007q0-A0; Fri, 03 Jun 2005 22:38:25 +0200 Received: from cc10088-a.ensch1.ov.home.nl ([217.123.128.105]:47092 helo=[192.168.14.1]) by smtp3.home.nl with esmtp (Exim 4.30) id 1DeIvv-0006Rf-Qs; Fri, 03 Jun 2005 22:38:23 +0200 Message-ID: <42A0BE13.3060509@home.nl> Date: Fri, 03 Jun 2005 22:31:15 +0200 From: Gertjan van Wingerde User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050322) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com CC: jgarzik@pobox.com Subject: [PATCH 0/2] ieee80211: Update generic definitions to latest specs - take 
#2 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-AtHome-MailScanner-Information: Neem contact op met support@home.nl voor meer informatie X-AtHome-MailScanner: Found to be clean X-archive-position: 2061 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gwingerde@home.nl Precedence: bulk X-list: netdev Content-Length: 381 Lines: 17 Hi, Following patches update the definitions of the generic ieee80211 stack to the latest versions of the published 802.11x specification suite, and cleans up the long list of defines. The set of patches is a resubmittal of my earlier patch, with the comments of Jiri Benc and Stephen Hemminger fixed. The patches need to be applied in order. Thanks, Gertjan van Wingerde From mchan@broadcom.com Fri Jun 3 13:48:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 13:48:29 -0700 (PDT) Received: from MMS1.broadcom.com (mms1.broadcom.com [216.31.210.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53KmPXq002813 for ; Fri, 3 Jun 2005 13:48:26 -0700 Received: from 10.10.64.121 by MMS1.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Fri, 03 Jun 2005 13:47:07 -0700 X-Server-Uuid: 146C3151-C1DE-4F71-9D02-C3BE503878DD Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Fri, 3 Jun 2005 13:47:06 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BCA28027; Fri, 3 Jun 2005 13:47:04 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id NAA24325; Fri, 3 Jun 2005 13:47:03 -0700 (PDT) Received: from 10.7.18.177 ([10.7.18.177]) by NT-IRVA-0741.brcm.ad.broadcom.com 
([10.8.194.54]) with Microsoft Exchange Server HTTP-DAV ; Fri, 3 Jun 2005 20:47:03 +0000 Received: from rh4 by nt-irva-0741; 03 Jun 2005 12:49:29 -0700 Subject: Re: RFC: NAPI packet weighting patch From: "Michael Chan" To: "David S. Miller" cc: mitch.a.williams@intel.com, hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <20050603.132922.63997492.davem@davemloft.net> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> Date: Fri, 03 Jun 2005 12:49:29 -0700 Message-ID: <1117828169.4430.29.camel@rh4> MIME-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-WSS-ID: 6EBE1E412U45192377-01-01 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-archive-position: 2064 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev Content-Length: 590 Lines: 15 On Fri, 2005-06-03 at 13:29 -0700, David S. Miller wrote: > E1000 processes the full QUOTA of RX packets, > _THEN_ replenishes with new RX buffers. No wonder > the chip runs out of RX descriptors. > > You should replenish _AS_ you grab RX packets > off the receive queue, just as tg3 does. Yes, in tg3, rx buffers are replenished and put back into the ring as completed packets are taken off the ring. But we don't tell the chip about these new buffers until we get to the end of the loop, potentially after a full quota of packets. Doesn't this make the end result the same as e1000? 
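A minimal sketch of the loop shape described in the message above: each slot is refilled as its packet is taken off the ring, but the producer index is written to the chip only once, after the loop. The struct, its field names, RING_SIZE and ring_doorbell() are hypothetical stand-ins for illustration, not the tg3 code, and the MMIO write is simulated by a counter.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8 /* hypothetical ring size, for illustration only */

/* Hypothetical model of an RX ring; not the tg3 data structures. */
struct rx_ring {
    int filled[RING_SIZE];    /* 1 if the slot holds a completed packet */
    unsigned int cons;        /* next slot the driver will examine */
    unsigned int prod;        /* count of buffers the driver has put back */
    unsigned int hw_prod;     /* producer value last written to the "chip" */
    unsigned int mmio_writes; /* number of simulated doorbell writes */
};

/* Stand-in for the MMIO write that publishes new buffers to the chip. */
static void ring_doorbell(struct rx_ring *r)
{
    r->hw_prod = r->prod;
    r->mmio_writes++;
}

/*
 * Take up to `quota` completed packets off the ring. Every slot is
 * refilled immediately, but the chip is told about the new buffers
 * only once, after the loop.
 */
static int rx_poll(struct rx_ring *r, int quota)
{
    int done = 0;

    assert(r != NULL && quota >= 0);

    while (done < quota && r->filled[r->cons % RING_SIZE]) {
        r->filled[r->cons % RING_SIZE] = 0; /* packet handed to the stack */
        r->cons++;
        r->prod++;                          /* fresh buffer placed in the slot */
        done++;
    }
    if (done > 0)
        ring_doorbell(r);                   /* single write for the whole batch */
    return done;
}
```

In this model the chip sees no new buffers until the loop ends, however early each slot was refilled, which is the point the question above raises.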
From buytenh@wantstofly.org Fri Jun 3 14:00:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:01:04 -0700 (PDT) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53L0nXq004267 for ; Fri, 3 Jun 2005 14:00:50 -0700 Received: by xi.wantstofly.org (Postfix, from userid 500) id 5F5B5945C8; Fri, 3 Jun 2005 22:59:45 +0200 (MEST) Date: Fri, 3 Jun 2005 22:59:45 +0200 From: Lennert Buytenhek To: Michael Chan Cc: "David S. Miller" , mitch.a.williams@intel.com, hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch Message-ID: <20050603205944.GC20623@xi.wantstofly.org> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> <1117828169.4430.29.camel@rh4> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1117828169.4430.29.camel@rh4> User-Agent: Mutt/1.4.1i X-archive-position: 2065 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev Content-Length: 873 Lines: 22 On Fri, Jun 03, 2005 at 12:49:29PM -0700, Michael Chan wrote: > > E1000 processes the full QUOTA of RX packets, > > _THEN_ replenishes with new RX buffers. No wonder > > the chip runs out of RX descriptors. > > > > You should replenish _AS_ you grab RX packets > > off the receive queue, just as tg3 does. > > Yes, in tg3, rx buffers are replenished and put back into the ring > as completed packets are taken off the ring. 
But we don't tell the > chip about these new buffers until we get to the end of the loop, > potentially after a full quota of packets. Which makes a lot more sense, since you'd rather do one MMIO write at the end of the loop than one per iteration, especially if your MMIO read (flush) latency is high. (Any subsequent MMIO read will have to flush out all pending writes, which'll be slow if there's a lot of writes still in the queue.) --L From edgar@edgar.se.axis.com Fri Jun 3 14:08:07 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:08:12 -0700 (PDT) Received: from miranda.se.axis.com (miranda.se.axis.com [193.13.178.8]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53L86Xq005258 for ; Fri, 3 Jun 2005 14:08:07 -0700 Received: from edgar.se.axis.com (edgar.se.axis.com [10.92.151.1]) by miranda.se.axis.com (8.12.9/8.12.9/Debian-5local0.1) with ESMTP id j53L71Nc014427 for ; Fri, 3 Jun 2005 23:07:01 +0200 Received: (qmail 3313 invoked by uid 400); 3 Jun 2005 23:07:01 +0200 Date: Fri, 3 Jun 2005 23:07:01 +0200 From: Edgar E Iglesias To: Lennert Buytenhek Cc: Michael Chan , "David S. 
Miller" , mitch.a.williams@intel.com, hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch Message-ID: <20050603210701.GA3263@edgar.se.axis.com> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> <1117828169.4430.29.camel@rh4> <20050603205944.GC20623@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20050603205944.GC20623@xi.wantstofly.org> User-Agent: Mutt/1.5.8i X-archive-position: 2066 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: edgar.iglesias@axis.com Precedence: bulk X-list: netdev Content-Length: 1307 Lines: 33 On Fri, Jun 03, 2005 at 10:59:45PM +0200, Lennert Buytenhek wrote: > On Fri, Jun 03, 2005 at 12:49:29PM -0700, Michael Chan wrote: > > > > E1000 processes the full QUOTA of RX packets, > > > _THEN_ replenishes with new RX buffers. No wonder > > > the chip runs out of RX descriptors. > > > > > > You should replenish _AS_ you grab RX packets > > > off the receive queue, just as tg3 does. > > > > Yes, in tg3, rx buffers are replenished and put back into the ring > > as completed packets are taken off the ring. But we don't tell the > > chip about these new buffers until we get to the end of the loop, > > potentially after a full quota of packets. > > Which makes a lot more sense, since you'd rather do one MMIO write > at the end of the loop than one per iteration, especially if your > MMIO read (flush) latency is high. (Any subsequent MMIO read will > have to flush out all pending writes, which'll be slow if there's > a lot of writes still in the queue.) 
> > > --L Maybe it would be better to put a fixed weight at this level, return the descriptors to the HW after every X packets. That way you can keep the NAPI weight at 64 (or what ever) and still give back descriptors to HW more often. Best regards -- Programmer Edgar E Iglesias 46.46.272.1946 From jdmason@us.ibm.com Fri Jun 3 14:13:26 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:13:32 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53LDPXq005959 for ; Fri, 3 Jun 2005 14:13:26 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e34.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j53LCR8E032746 for ; Fri, 3 Jun 2005 17:12:27 -0400 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j53LCRBm239632 for ; Fri, 3 Jun 2005 15:12:27 -0600 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j53LCQKD008289 for ; Fri, 3 Jun 2005 15:12:27 -0600 Received: from dyn95390157.austin.ibm.com (dyn95390157.austin.ibm.com [9.53.90.157]) by d03av03.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j53LCQQY008283; Fri, 3 Jun 2005 15:12:26 -0600 From: Jon Mason Organization: IBM To: "David S. 
Miller" Subject: Re: RFC: NAPI packet weighting patch Date: Fri, 3 Jun 2005 16:12:10 -0500 User-Agent: KMail/1.7.2 Cc: hadi@cyberus.ca, mitch.a.williams@intel.com, john.ronciak@intel.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com References: <20050603.120126.41874584.davem@davemloft.net> <1117828771.6071.77.camel@localhost.localdomain> <20050603.133133.38710501.davem@davemloft.net> In-Reply-To: <20050603.133133.38710501.davem@davemloft.net> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200506031612.10456.jdmason@us.ibm.com> X-archive-position: 2067 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jdmason@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 767 Lines: 19 On Friday 03 June 2005 03:31 pm, David S. Miller wrote: > From: jamal > Date: Fri, 03 Jun 2005 15:59:31 -0400 > > > But one that you could validate by putting proper hooks. As an example, > > try to restore a descriptor every time you pick one - for an example of > > this look at the sb1250 driver. > > Yes, this in my mind is exactly the problem. TG3 does this > properly, as do several other drivers. > > You should never defer RX buffer replenishment, you should > always do it as you grab packets off of the ring. You will > starve the chip otherwise. e1000 isn't the only driver to do things this way. r8169, via-velocity, dl2k, and skge do it too (and I'm sure many more). It might be nice to perform a driver audit to see which drivers do this. 
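One suggestion made earlier in this thread was to return descriptors to the hardware after every X packets instead of only at the end of the poll loop. The following is a rough sketch of that idea under stated assumptions: the ring model, the `batch` parameter, the `max_unpublished` counter and ring_doorbell() are all hypothetical and exist only to make the trade-off measurable; no real driver is being quoted.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 16 /* hypothetical ring size, for illustration only */

/* Hypothetical model of an RX ring; not taken from any real driver. */
struct rx_ring {
    int filled[RING_SIZE];        /* 1 if the slot holds a completed packet */
    unsigned int cons;            /* next slot the driver will examine */
    unsigned int prod;            /* count of buffers the driver has put back */
    unsigned int hw_prod;         /* producer value last written to the "chip" */
    unsigned int mmio_writes;     /* number of simulated doorbell writes */
    unsigned int max_unpublished; /* most buffers ever refilled but not yet published */
};

/* Stand-in for the MMIO write that publishes new buffers to the chip. */
static void ring_doorbell(struct rx_ring *r)
{
    r->hw_prod = r->prod;
    r->mmio_writes++;
}

/*
 * Take up to `quota` completed packets off the ring, refilling each slot
 * immediately and publishing the refilled buffers every `batch` packets.
 * Anything left over is published once more after the loop.
 */
static int rx_poll_batched(struct rx_ring *r, int quota, int batch)
{
    int done = 0;

    assert(r != NULL && quota >= 0 && batch > 0);

    while (done < quota && r->filled[r->cons % RING_SIZE]) {
        r->filled[r->cons % RING_SIZE] = 0; /* packet handed to the stack */
        r->cons++;
        r->prod++;                          /* fresh buffer placed in the slot */
        done++;

        /* Track how far the chip's view lags behind the driver's. */
        if (r->prod - r->hw_prod > r->max_unpublished)
            r->max_unpublished = r->prod - r->hw_prod;

        if (done % batch == 0)
            ring_doorbell(r);
    }
    if (r->hw_prod != r->prod)
        ring_doorbell(r);                   /* publish the remainder */
    return done;
}
```

In this model a smaller `batch` bounds how many refilled buffers the chip is unaware of at any moment, at the cost of more doorbell writes per poll; the NAPI weight itself is left untouched.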
From tgr@postel.suug.ch Fri Jun 3 14:15:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:15:14 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53LFAXq006409 for ; Fri, 3 Jun 2005 14:15:10 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 6AE181C0EE; Fri, 3 Jun 2005 23:14:31 +0200 (CEST) Message-Id: <20050603211241.593114000@axs> Date: Fri, 03 Jun 2005 23:12:41 +0200 From: Thomas Graf To: davem@davemloft.net Cc: netdev@oss.sgi.com Subject: [PATCHSET] PKT_SCHED related fixes and a meta ematch completion X-archive-position: 2068 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 334 Lines: 10 Dave, The following patchset fixes some serious bugs that prevent the basic classifier and the meta ematch from working properly. Patch 2 adds a few new meta collectors for socket attributes which I'd like to have in 2.6.12 as well. If you think this is too intrusive (it isn't ;->) I'll resend patch 4 with offsets fixed. Thanks. 
From tgr@postel.suug.ch Fri Jun 3 14:15:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:15:19 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53LFCXq006459 for ; Fri, 3 Jun 2005 14:15:13 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 878551C0EE; Fri, 3 Jun 2005 23:14:36 +0200 (CEST) Message-Id: <20050603211315.521247000@axs> References: <20050603211241.593114000@axs> Date: Fri, 03 Jun 2005 23:12:42 +0200 From: Thomas Graf To: davem@davemloft.net Cc: netdev@oss.sgi.com Subject: [PATCH 1/4] [PKT_SCHED] Fix typo in NET_EMATCH_STACK help text Content-Disposition: inline; filename=fix_ematch_kconfig_typo X-archive-position: 2069 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 643 Lines: 20 Spotted by Geert Uytterhoeven . Signed-off-by: Thomas Graf Index: ematch/net/sched/Kconfig =================================================================== --- ematch.orig/net/sched/Kconfig +++ ematch/net/sched/Kconfig @@ -405,7 +405,7 @@ config NET_EMATCH_STACK ---help--- Size of the local stack variable used while evaluating the tree of ematches. Limits the depth of the tree, i.e. the number of - encapsulated precedences. Every level requires 4 bytes of addtional + encapsulated precedences. Every level requires 4 bytes of additional stack space. 
config NET_EMATCH_CMP From tgr@postel.suug.ch Fri Jun 3 14:15:19 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:15:22 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53LFIXq006541 for ; Fri, 3 Jun 2005 14:15:18 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 99C361C0EE; Fri, 3 Jun 2005 23:14:41 +0200 (CEST) Message-Id: <20050603211315.677553000@axs> References: <20050603211241.593114000@axs> Date: Fri, 03 Jun 2005 23:12:43 +0200 From: Thomas Graf To: davem@davemloft.net Cc: netdev@oss.sgi.com Subject: [PATCH 2/4] [PKT_SCHED] Allow socket attributes to be matched on via meta ematch Content-Disposition: inline; filename=ematch_meta_sk X-archive-position: 2070 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 10978 Lines: 385 Adds meta collectors for all socket attributes that make sense to be filtered upon. Some of them are only useful for debugging but having them doesn't hurt. Signed-off-by: Thomas Graf Index: ematch/net/sched/em_meta.c =================================================================== --- ematch.orig/net/sched/em_meta.c +++ ematch/net/sched/em_meta.c @@ -32,7 +32,7 @@ * +-----------+ +-----------+ * | | * ---> meta_ops[INT][INDEV](...) 
| - * | | + * | | * ----------- | * V V * +-----------+ +-----------+ @@ -70,6 +70,7 @@ #include #include #include +#include struct meta_obj { @@ -284,6 +285,214 @@ META_COLLECTOR(int_rtiif) } /************************************************************************** + * Socket Attributes + **************************************************************************/ + +#define SKIP_NONLOCAL(skb) \ + if (unlikely(skb->sk == NULL)) { \ + *err = -1; \ + return; \ + } + +META_COLLECTOR(int_sk_family) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_family; +} + +META_COLLECTOR(int_sk_state) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_state; +} + +META_COLLECTOR(int_sk_reuse) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_reuse; +} + +META_COLLECTOR(int_sk_bound_if) +{ + SKIP_NONLOCAL(skb); + /* No error if bound_dev_if is 0, legal userspace check */ + dst->value = skb->sk->sk_bound_dev_if; +} + +META_COLLECTOR(var_sk_bound_if) +{ + SKIP_NONLOCAL(skb); + + if (skb->sk->sk_bound_dev_if == 0) { + dst->value = (unsigned long) "any"; + dst->len = 3; + } else { + struct net_device *dev; + + dev = dev_get_by_index(skb->sk->sk_bound_dev_if); + *err = var_dev(dev, dst); + if (dev) + dev_put(dev); + } +} + +META_COLLECTOR(int_sk_refcnt) +{ + SKIP_NONLOCAL(skb); + dst->value = atomic_read(&skb->sk->sk_refcnt); +} + +META_COLLECTOR(int_sk_rcvbuf) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_rcvbuf; +} + +META_COLLECTOR(int_sk_shutdown) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_shutdown; +} + +META_COLLECTOR(int_sk_proto) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_protocol; +} + +META_COLLECTOR(int_sk_type) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_type; +} + +META_COLLECTOR(int_sk_rmem_alloc) +{ + SKIP_NONLOCAL(skb); + dst->value = atomic_read(&skb->sk->sk_rmem_alloc); +} + +META_COLLECTOR(int_sk_wmem_alloc) +{ + SKIP_NONLOCAL(skb); + dst->value = atomic_read(&skb->sk->sk_wmem_alloc); +} + +META_COLLECTOR(int_sk_omem_alloc) +{ + 
SKIP_NONLOCAL(skb); + dst->value = atomic_read(&skb->sk->sk_omem_alloc); +} + +META_COLLECTOR(int_sk_rcv_qlen) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_receive_queue.qlen; +} + +META_COLLECTOR(int_sk_snd_qlen) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_write_queue.qlen; +} + +META_COLLECTOR(int_sk_wmem_queued) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_wmem_queued; +} + +META_COLLECTOR(int_sk_fwd_alloc) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_forward_alloc; +} + +META_COLLECTOR(int_sk_sndbuf) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_sndbuf; +} + +META_COLLECTOR(int_sk_alloc) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_allocation; +} + +META_COLLECTOR(int_sk_route_caps) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_route_caps; +} + +META_COLLECTOR(int_sk_hashent) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_hashent; +} + +META_COLLECTOR(int_sk_lingertime) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_lingertime / HZ; +} + +META_COLLECTOR(int_sk_err_qlen) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_error_queue.qlen; +} + +META_COLLECTOR(int_sk_ack_bl) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_ack_backlog; +} + +META_COLLECTOR(int_sk_max_ack_bl) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_max_ack_backlog; +} + +META_COLLECTOR(int_sk_prio) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_priority; +} + +META_COLLECTOR(int_sk_rcvlowat) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_rcvlowat; +} + +META_COLLECTOR(int_sk_rcvtimeo) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_rcvtimeo / HZ; +} + +META_COLLECTOR(int_sk_sndtimeo) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_sndtimeo / HZ; +} + +META_COLLECTOR(int_sk_sendmsg_off) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_sndmsg_off; +} + +META_COLLECTOR(int_sk_write_pend) +{ + SKIP_NONLOCAL(skb); + dst->value = skb->sk->sk_write_pending; +} + 
+/************************************************************************** * Meta value collectors assignment table **************************************************************************/ @@ -293,41 +502,75 @@ struct meta_ops struct meta_value *, struct meta_obj *, int *); }; +#define META_ID(name) TCF_META_ID_##name +#define META_FUNC(name) { .get = meta_##name } + /* Meta value operations table listing all meta value collectors and * assigns them to a type and meta id. */ static struct meta_ops __meta_ops[TCF_META_TYPE_MAX+1][TCF_META_ID_MAX+1] = { [TCF_META_TYPE_VAR] = { - [TCF_META_ID_DEV] = { .get = meta_var_dev }, - [TCF_META_ID_INDEV] = { .get = meta_var_indev }, - [TCF_META_ID_REALDEV] = { .get = meta_var_realdev } + [META_ID(DEV)] = META_FUNC(var_dev), + [META_ID(INDEV)] = META_FUNC(var_indev), + [META_ID(REALDEV)] = META_FUNC(var_realdev), + [META_ID(SK_BOUND_IF)] = META_FUNC(var_sk_bound_if), }, [TCF_META_TYPE_INT] = { - [TCF_META_ID_RANDOM] = { .get = meta_int_random }, - [TCF_META_ID_LOADAVG_0] = { .get = meta_int_loadavg_0 }, - [TCF_META_ID_LOADAVG_1] = { .get = meta_int_loadavg_1 }, - [TCF_META_ID_LOADAVG_2] = { .get = meta_int_loadavg_2 }, - [TCF_META_ID_DEV] = { .get = meta_int_dev }, - [TCF_META_ID_INDEV] = { .get = meta_int_indev }, - [TCF_META_ID_REALDEV] = { .get = meta_int_realdev }, - [TCF_META_ID_PRIORITY] = { .get = meta_int_priority }, - [TCF_META_ID_PROTOCOL] = { .get = meta_int_protocol }, - [TCF_META_ID_SECURITY] = { .get = meta_int_security }, - [TCF_META_ID_PKTTYPE] = { .get = meta_int_pkttype }, - [TCF_META_ID_PKTLEN] = { .get = meta_int_pktlen }, - [TCF_META_ID_DATALEN] = { .get = meta_int_datalen }, - [TCF_META_ID_MACLEN] = { .get = meta_int_maclen }, + [META_ID(RANDOM)] = META_FUNC(int_random), + [META_ID(LOADAVG_0)] = META_FUNC(int_loadavg_0), + [META_ID(LOADAVG_1)] = META_FUNC(int_loadavg_1), + [META_ID(LOADAVG_2)] = META_FUNC(int_loadavg_2), + [META_ID(DEV)] = META_FUNC(int_dev), + [META_ID(INDEV)] = META_FUNC(int_indev), 
+ [META_ID(REALDEV)] = META_FUNC(int_realdev), + [META_ID(PRIORITY)] = META_FUNC(int_priority), + [META_ID(PROTOCOL)] = META_FUNC(int_protocol), + [META_ID(SECURITY)] = META_FUNC(int_security), + [META_ID(PKTTYPE)] = META_FUNC(int_pkttype), + [META_ID(PKTLEN)] = META_FUNC(int_pktlen), + [META_ID(DATALEN)] = META_FUNC(int_datalen), + [META_ID(MACLEN)] = META_FUNC(int_maclen), #ifdef CONFIG_NETFILTER - [TCF_META_ID_NFMARK] = { .get = meta_int_nfmark }, + [META_ID(NFMARK)] = META_FUNC(int_nfmark), #endif - [TCF_META_ID_TCINDEX] = { .get = meta_int_tcindex }, + [META_ID(TCINDEX)] = META_FUNC(int_tcindex), #ifdef CONFIG_NET_CLS_ACT - [TCF_META_ID_TCVERDICT] = { .get = meta_int_tcverd }, - [TCF_META_ID_TCCLASSID] = { .get = meta_int_tcclassid }, + [META_ID(TCVERDICT)] = META_FUNC(int_tcverd), + [META_ID(TCCLASSID)] = META_FUNC(int_tcclassid), #endif #ifdef CONFIG_NET_CLS_ROUTE - [TCF_META_ID_RTCLASSID] = { .get = meta_int_rtclassid }, + [META_ID(RTCLASSID)] = META_FUNC(int_rtclassid), #endif - [TCF_META_ID_RTIIF] = { .get = meta_int_rtiif } + [META_ID(RTIIF)] = META_FUNC(int_rtiif), + [META_ID(SK_FAMILY)] = META_FUNC(int_sk_family), + [META_ID(SK_STATE)] = META_FUNC(int_sk_state), + [META_ID(SK_REUSE)] = META_FUNC(int_sk_reuse), + [META_ID(SK_BOUND_IF)] = META_FUNC(int_sk_bound_if), + [META_ID(SK_REFCNT)] = META_FUNC(int_sk_refcnt), + [META_ID(SK_RCVBUF)] = META_FUNC(int_sk_rcvbuf), + [META_ID(SK_SNDBUF)] = META_FUNC(int_sk_sndbuf), + [META_ID(SK_SHUTDOWN)] = META_FUNC(int_sk_shutdown), + [META_ID(SK_PROTO)] = META_FUNC(int_sk_proto), + [META_ID(SK_TYPE)] = META_FUNC(int_sk_type), + [META_ID(SK_RMEM_ALLOC)] = META_FUNC(int_sk_rmem_alloc), + [META_ID(SK_WMEM_ALLOC)] = META_FUNC(int_sk_wmem_alloc), + [META_ID(SK_OMEM_ALLOC)] = META_FUNC(int_sk_omem_alloc), + [META_ID(SK_WMEM_QUEUED)] = META_FUNC(int_sk_wmem_queued), + [META_ID(SK_RCV_QLEN)] = META_FUNC(int_sk_rcv_qlen), + [META_ID(SK_SND_QLEN)] = META_FUNC(int_sk_snd_qlen), + [META_ID(SK_ERR_QLEN)] = 
META_FUNC(int_sk_err_qlen), + [META_ID(SK_FORWARD_ALLOCS)] = META_FUNC(int_sk_fwd_alloc), + [META_ID(SK_ALLOCS)] = META_FUNC(int_sk_alloc), + [META_ID(SK_ROUTE_CAPS)] = META_FUNC(int_sk_route_caps), + [META_ID(SK_HASHENT)] = META_FUNC(int_sk_hashent), + [META_ID(SK_LINGERTIME)] = META_FUNC(int_sk_lingertime), + [META_ID(SK_ACK_BACKLOG)] = META_FUNC(int_sk_ack_bl), + [META_ID(SK_MAX_ACK_BACKLOG)] = META_FUNC(int_sk_max_ack_bl), + [META_ID(SK_PRIO)] = META_FUNC(int_sk_prio), + [META_ID(SK_RCVLOWAT)] = META_FUNC(int_sk_rcvlowat), + [META_ID(SK_RCVTIMEO)] = META_FUNC(int_sk_rcvtimeo), + [META_ID(SK_SNDTIMEO)] = META_FUNC(int_sk_sndtimeo), + [META_ID(SK_SENDMSG_OFF)] = META_FUNC(int_sk_sendmsg_off), + [META_ID(SK_WRITE_PENDING)] = META_FUNC(int_sk_write_pend), } }; Index: ematch/include/linux/tc_ematch/tc_em_meta.h =================================================================== --- ematch.orig/include/linux/tc_ematch/tc_em_meta.h +++ ematch/include/linux/tc_ematch/tc_em_meta.h @@ -56,6 +56,36 @@ enum TCF_META_ID_TCCLASSID, TCF_META_ID_RTCLASSID, TCF_META_ID_RTIIF, + TCF_META_ID_SK_FAMILY, + TCF_META_ID_SK_STATE, + TCF_META_ID_SK_REUSE, + TCF_META_ID_SK_BOUND_IF, + TCF_META_ID_SK_REFCNT, + TCF_META_ID_SK_SHUTDOWN, + TCF_META_ID_SK_PROTO, + TCF_META_ID_SK_TYPE, + TCF_META_ID_SK_RCVBUF, + TCF_META_ID_SK_RMEM_ALLOC, + TCF_META_ID_SK_WMEM_ALLOC, + TCF_META_ID_SK_OMEM_ALLOC, + TCF_META_ID_SK_WMEM_QUEUED, + TCF_META_ID_SK_RCV_QLEN, + TCF_META_ID_SK_SND_QLEN, + TCF_META_ID_SK_ERR_QLEN, + TCF_META_ID_SK_FORWARD_ALLOCS, + TCF_META_ID_SK_SNDBUF, + TCF_META_ID_SK_ALLOCS, + TCF_META_ID_SK_ROUTE_CAPS, + TCF_META_ID_SK_HASHENT, + TCF_META_ID_SK_LINGERTIME, + TCF_META_ID_SK_ACK_BACKLOG, + TCF_META_ID_SK_MAX_ACK_BACKLOG, + TCF_META_ID_SK_PRIO, + TCF_META_ID_SK_RCVLOWAT, + TCF_META_ID_SK_RCVTIMEO, + TCF_META_ID_SK_SNDTIMEO, + TCF_META_ID_SK_SENDMSG_OFF, + TCF_META_ID_SK_WRITE_PENDING, __TCF_META_ID_MAX }; #define TCF_META_ID_MAX (__TCF_META_ID_MAX - 1) From tgr@postel.suug.ch Fri Jun 
3 14:15:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:15:31 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53LFNXq006648 for ; Fri, 3 Jun 2005 14:15:23 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id ABF7A1C0EE; Fri, 3 Jun 2005 23:14:46 +0200 (CEST) Message-Id: <20050603211315.818843000@axs> References: <20050603211241.593114000@axs> Date: Fri, 03 Jun 2005 23:12:44 +0200 From: Thomas Graf To: davem@davemloft.net Cc: netdev@oss.sgi.com Subject: [PATCH 3/4] [PKT_SCHED] Dump classification result for basic classifier Content-Disposition: inline; filename=cls_basic_dump_classid X-archive-position: 2071 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 581 Lines: 19 Signed-off-by: Thomas Graf Index: ematch/net/sched/cls_basic.c =================================================================== --- ematch.orig/net/sched/cls_basic.c +++ ematch/net/sched/cls_basic.c @@ -261,6 +261,9 @@ static int basic_dump(struct tcf_proto * rta = (struct rtattr *) b; RTA_PUT(skb, TCA_OPTIONS, 0, NULL); + if (f->res.classid) + RTA_PUT_U32(skb, TCA_BASIC_CLASSID, f->res.classid); + if (tcf_exts_dump(skb, &f->exts, &basic_ext_map) < 0 || tcf_em_tree_dump(skb, &f->ematches, TCA_BASIC_EMATCHES) < 0) goto rtattr_failure; From tgr@postel.suug.ch Fri Jun 3 14:15:29 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:15:34 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53LFSXq006744 for ; Fri, 3 Jun 2005 14:15:29 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id CA15F1C0EE; Fri, 3 Jun 2005 23:14:51 +0200 (CEST) Message-Id: <20050603211315.972265000@axs> References: <20050603211241.593114000@axs> Date: 
Fri, 03 Jun 2005 23:12:45 +0200 From: Thomas Graf To: davem@davemloft.net Cc: netdev@oss.sgi.com Subject: [PATCH 4/4] [PKT_SCHED] Fix numeric comparison in meta ematch Content-Disposition: inline; filename=meta_compare_fix X-archive-position: 2072 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 648 Lines: 23 This patch is brought to you by the department of applied stupidity. Signed-off-by: Thomas Graf Index: ematch/net/sched/em_meta.c =================================================================== --- ematch.orig/net/sched/em_meta.c +++ ematch/net/sched/em_meta.c @@ -639,9 +639,9 @@ static int meta_int_compare(struct meta_ /* Let gcc optimize it, the unlikely is not really based on * some numbers but jump free code for mismatches seems * more logical. */ - if (unlikely(a == b)) + if (unlikely(a->value == b->value)) return 0; - else if (a < b) + else if (a->value < b->value) return -1; else return 1; From mchan@broadcom.com Fri Jun 3 14:34:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:34:36 -0700 (PDT) Received: from MMS2.broadcom.com (mms2.broadcom.com [216.31.210.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53LYUXq009935 for ; Fri, 3 Jun 2005 14:34:30 -0700 Received: from 10.10.64.121 by MMS2.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Fri, 03 Jun 2005 14:33:11 -0700 X-Server-Uuid: 1F20ACF3-9CAF-44F7-AB47-F294E2D5B4EA Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Fri, 3 Jun 2005 14:33:10 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BCB13804; Fri, 3 Jun 2005 14:32:57 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com ( 
nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id OAA10245; Fri, 3 Jun 2005 14:32:57 -0700 (PDT) Received: from 10.7.18.177 ([10.7.18.177]) by NT-IRVA-0741.brcm.ad.broadcom.com ([10.8.194.54]) with Microsoft Exchange Server HTTP-DAV ; Fri, 3 Jun 2005 21:32:56 +0000 Received: from rh4 by nt-irva-0741; 03 Jun 2005 13:35:22 -0700 Subject: Re: RFC: NAPI packet weighting patch From: "Michael Chan" To: "Lennert Buytenhek" cc: "David S. Miller" , mitch.a.williams@intel.com, hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <20050603205944.GC20623@xi.wantstofly.org> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> <1117828169.4430.29.camel@rh4> <20050603205944.GC20623@xi.wantstofly.org> Date: Fri, 03 Jun 2005 13:35:22 -0700 Message-ID: <1117830922.4430.44.camel@rh4> MIME-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-WSS-ID: 6EBE131D1VO5004533-01-01 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-archive-position: 2073 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev Content-Length: 1119 Lines: 22 On Fri, 2005-06-03 at 22:59 +0200, Lennert Buytenhek wrote: > On Fri, Jun 03, 2005 at 12:49:29PM -0700, Michael Chan wrote: > > > Yes, in tg3, rx buffers are replenished and put back into the ring > > as completed packets are taken off the ring. But we don't tell the > > chip about these new buffers until we get to the end of the loop, > > potentially after a full quota of packets. 
> > Which makes a lot more sense, since you'd rather do one MMIO write > at the end of the loop than one per iteration, especially if your > MMIO read (flush) latency is high. (Any subsequent MMIO read will > have to flush out all pending writes, which'll be slow if there's > a lot of writes still in the queue.) > I agree on the merit of issuing only one IO at the end. What I'm saying is that doing so will make it similar to e1000 where all the buffers are replenished at the end. Isn't that so or am I missing something? By the way, in tg3 there is a buffer replenishment threshold programmed to the chip and is currently set at rx_pending / 8 (200/8 = 25). This means that the chip will replenish 25 rx buffers at a time. From shemminger@osdl.org Fri Jun 3 14:38:05 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 14:38:07 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53Lc5Xq010567 for ; Fri, 3 Jun 2005 14:38:05 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j53Lb3jA000331 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 3 Jun 2005 14:37:03 -0700 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [10.8.0.74]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id j53Lb2gH015308; Fri, 3 Jun 2005 14:37:02 -0700 Date: Fri, 3 Jun 2005 14:37:02 -0700 From: Stephen Hemminger To: Adrian Bunk , Baruch Even Cc: Andrew Morton , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.6.12-rc5-mm2: "bic unavailable using TCP reno" messages Message-ID: <20050603143702.0422101d@dxpl.pdx.osdl.net> In-Reply-To: <20050602203823.GI4992@stusta.de> References: <20050601022824.33c8206e.akpm@osdl.org> <20050602121511.GE4992@stusta.de> <429F1079.5070701@ev-en.org> <20050602103805.6beb4f4e@dxpl.pdx.osdl.net> <20050602203823.GI4992@stusta.de> Organization: Open Source Development Lab 
X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; x86_64-unknown-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 2074 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 6680 Lines: 232 Here is what I am working on as better way to make the sysctl selection. I am not totally happy with the way the default congestion control value is determined by the load order. But it does seem good that if you load "tcp_xxx" module and it registers it becomes the default. Index: 2.6.12-rc5-tcp3/include/net/tcp.h =================================================================== --- 2.6.12-rc5-tcp3.orig/include/net/tcp.h +++ 2.6.12-rc5-tcp3/include/net/tcp.h @@ -1242,6 +1242,8 @@ extern int tcp_register_congestion_contr extern void tcp_unregister_congestion_control(struct tcp_congestion_ops *type); extern void tcp_init_congestion_control(struct tcp_sock *tp); extern void tcp_release_congestion_control(struct tcp_sock *tp); +extern int tcp_set_congestion_control(const char *name); +extern void tcp_get_congestion_control(char *name); extern struct tcp_congestion_ops tcp_reno; extern u32 tcp_reno_ssthresh(struct tcp_sock *tp); Index: 2.6.12-rc5-tcp3/net/ipv4/tcp_cong.c =================================================================== --- 2.6.12-rc5-tcp3.orig/net/ipv4/tcp_cong.c +++ 2.6.12-rc5-tcp3/net/ipv4/tcp_cong.c @@ -13,8 +13,6 @@ #include #include -char sysctl_tcp_congestion_control[TCP_CA_NAME_MAX] = "bic"; - static DEFINE_SPINLOCK(tcp_cong_list_lock); static LIST_HEAD(tcp_cong_list); @@ -23,7 +21,7 @@ static struct tcp_congestion_ops *tcp_ca { struct 
tcp_congestion_ops *e; - list_for_each_entry_rcu(e, &tcp_cong_list, list) { + list_for_each_entry(e, &tcp_cong_list, list) { if (strcmp(e->name, name) == 0) return e; } @@ -46,7 +44,7 @@ int tcp_register_congestion_control(stru return -EINVAL; } - spin_lock_irq(&tcp_cong_list_lock); + spin_lock(&tcp_cong_list_lock); if (tcp_ca_find(ca->name)) { printk(KERN_NOTICE "TCP %s already registered\n", ca->name); ret = -EEXIST; @@ -54,7 +52,7 @@ int tcp_register_congestion_control(stru list_add_rcu(&ca->list, &tcp_cong_list); printk(KERN_INFO "TCP %s registered\n", ca->name); } - spin_unlock_irq(&tcp_cong_list_lock); + spin_unlock(&tcp_cong_list_lock); return ret; } @@ -69,7 +67,6 @@ EXPORT_SYMBOL_GPL(tcp_register_congestio void tcp_unregister_congestion_control(struct tcp_congestion_ops *ca) { spin_lock(&tcp_cong_list_lock); - BUG_ON(!tcp_ca_find(ca->name)); list_del_rcu(&ca->list); spin_unlock(&tcp_cong_list_lock); } @@ -78,34 +75,22 @@ EXPORT_SYMBOL_GPL(tcp_unregister_congest /* Assign choice of congestion control. */ void tcp_init_congestion_control(struct tcp_sock *tp) { - const char *cong_proto = sysctl_tcp_congestion_control; struct tcp_congestion_ops *ca; rcu_read_lock(); - ca = tcp_ca_find(cong_proto); -#ifdef CONFIG_KMOD - if (!ca) { - /* autoload and try again */ - rcu_read_unlock(); - request_module("tcp_%s", cong_proto); - rcu_read_lock(); - - ca = tcp_ca_find(cong_proto); - } -#endif - - /* If selection doesn't exist or is being removed use Reno */ - if (!ca || !try_module_get(ca->owner)) { - if (net_ratelimit()) - printk(KERN_WARNING "%s unavailable using TCP reno\n", - cong_proto); - ca = &tcp_reno; - } - tp->ca_ops = ca; - rcu_read_unlock(); + tp->ca_ops = NULL; + list_for_each_entry_rcu(ca, &tcp_cong_list, list) { + if (try_module_get(ca->owner)) { + tp->ca_ops = ca; + break; + } - if (ca->init) - ca->init(tp); + } + + /* We will always have reno to fallback on. 
*/ + if (tp->ca_ops->init) + tp->ca_ops->init(tp); + rcu_read_unlock(); } EXPORT_SYMBOL(tcp_init_congestion_control); @@ -122,6 +107,36 @@ void tcp_release_congestion_control(stru } } +/* Used by sysctl to change default congestion control */ +int tcp_set_congestion_control(const char *name) +{ + struct tcp_congestion_ops *ca; + int ret = -ENOENT; + + spin_lock(&tcp_cong_list_lock); + ca = tcp_ca_find(name); + if (ca) { + list_move(&ca->list, &tcp_cong_list); + ret = 0; + } + spin_unlock(&tcp_cong_list_lock); + + return ret; +} + +/* Get current default congestion control */ +void tcp_get_congestion_control(char *name) +{ + struct tcp_congestion_ops *ca; + /* We will always have reno... */ + BUG_ON(list_empty(&tcp_cong_list)); + + rcu_read_lock(); + ca = list_entry(tcp_cong_list.next, struct tcp_congestion_ops, list); + strncpy(name, ca->name, TCP_CA_NAME_MAX); + rcu_read_unlock(); +} + /* * TCP Reno congestion control * This is special case used for fallback as well. Index: 2.6.12-rc5-tcp3/net/ipv4/sysctl_net_ipv4.c =================================================================== --- 2.6.12-rc5-tcp3.orig/net/ipv4/sysctl_net_ipv4.c +++ 2.6.12-rc5-tcp3/net/ipv4/sysctl_net_ipv4.c @@ -48,9 +48,6 @@ extern int inet_peer_maxttl; extern int inet_peer_gc_mintime; extern int inet_peer_gc_maxtime; -/* From tcp_input.c */ -extern char sysctl_tcp_congestion_control[TCP_CA_NAME_MAX]; - #ifdef CONFIG_SYSCTL static int tcp_retr1_max = 255; static int ip_local_port_range_min[] = { 1, 1 }; @@ -120,6 +117,52 @@ static int ipv4_sysctl_forward_strategy( return 1; } +static int proc_tcp_congestion_control(ctl_table *ctl, int write, struct file * filp, + void __user *buffer, size_t *lenp, loff_t *ppos) +{ + char val[TCP_CA_NAME_MAX]; + ctl_table tbl = { + .data = val, + .maxlen = TCP_CA_NAME_MAX, + }; + int ret; + + tcp_get_congestion_control(val); + + ret = proc_dostring(&tbl, write, filp, buffer, lenp, ppos); + if (write && ret == 0) { + ret = tcp_set_congestion_control(val);
+#ifdef CONFIG_KMOD + if (ret == -ENOENT) { + request_module("tcp_%s", val); + ret = tcp_set_congestion_control(val); + } +#endif + } + return ret; +} + +int sysctl_tcp_congestion_control(ctl_table *table, int __user *name, int nlen, + void __user *oldval, size_t __user *oldlenp, + void __user *newval, size_t newlen, + void **context) +{ + char val[TCP_CA_NAME_MAX]; + ctl_table tbl = { + .data = val, + .maxlen = TCP_CA_NAME_MAX, + }; + int ret; + + tcp_get_congestion_control(val); + ret = sysctl_string(&tbl, name, nlen, oldval, oldlenp, newval, newlen, + context); + if (ret == 0 && newval && newlen) + ret = tcp_set_congestion_control(val); + return ret; +} + + ctl_table ipv4_table[] = { { .ctl_name = NET_IPV4_TCP_TIMESTAMPS, @@ -624,11 +667,10 @@ ctl_table ipv4_table[] = { { .ctl_name = NET_TCP_CONG_CONTROL, .procname = "tcp_congestion_control", - .data = &sysctl_tcp_congestion_control, - .maxlen = TCP_CA_NAME_MAX, .mode = 0644, - .proc_handler = &proc_dostring, - .strategy = &sysctl_string, + .maxlen = TCP_CA_NAME_MAX, + .proc_handler = &proc_tcp_congestion_control, + .strategy = &sysctl_tcp_congestion_control, }, { .ctl_name = 0 } From hadi@cyberus.ca Fri Jun 3 15:31:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 15:31:42 -0700 (PDT) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53MVUXq013872 for ; Fri, 3 Jun 2005 15:31:31 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1DeKgW-0004KQ-O0 for netdev@oss.sgi.com; Fri, 03 Jun 2005 18:30:36 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DeKgR-0007c5-Gc; Fri, 03 Jun 2005 18:30:31 -0400 Subject: Re: RFC: NAPI packet weighting patch From: jamal Reply-To: hadi@cyberus.ca To: Michael Chan Cc: Lennert Buytenhek , "David S. 
Miller" , mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <1117830922.4430.44.camel@rh4> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> <1117828169.4430.29.camel@rh4> <20050603205944.GC20623@xi.wantstofly.org> <1117830922.4430.44.camel@rh4> Content-Type: text/plain Organization: unknown Date: Fri, 03 Jun 2005 18:29:58 -0400 Message-Id: <1117837798.6266.25.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2075 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 1139 Lines: 29 On Fri, 2005-03-06 at 13:35 -0700, Michael Chan wrote: > On Fri, 2005-06-03 at 22:59 +0200, Lennert Buytenhek wrote: > > Which makes a lot more sense, since you'd rather do one MMIO write > > at the end of the loop than one per iteration, especially if your > > MMIO read (flush) latency is high. (Any subsequent MMIO read will > > have to flush out all pending writes, which'll be slow if there's > > a lot of writes still in the queue.) > > > I agree on the merit of issuing only one IO at the end. What I'm saying > is that doing so will make it similar to e1000 where all the buffers are > replenished at the end. Isn't that so or am I missing something? > I think the main issue would be a lot less CPU used in your case (because of the single MMIO). > By the way, in tg3 there is a buffer replenishment threshold programmed > to the chip and is currently set at rx_pending / 8 (200/8 = 25). This > means that the chip will replenish 25 rx buffers at a time. 
> So when you write the MMIO, 25 buffers are replenished or is this auto magically happening in the background? Sounds like a neat feature either way. cheers, jamal From baruch@ev-en.org Fri Jun 3 15:33:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 15:33:33 -0700 (PDT) Received: from galon.ev-en.org (rrcs-24-123-59-149.central.biz.rr.com [24.123.59.149]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53MXUXq014198 for ; Fri, 3 Jun 2005 15:33:30 -0700 Received: by galon.ev-en.org (Postfix, from userid 105) id 176AF11A953; Sat, 4 Jun 2005 01:32:30 +0300 (IDT) Received: from [10.220.3.66] (hamilton.nuim.ie [149.157.192.252]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by galon.ev-en.org (Postfix) with ESMTP id 52CA111A951; Sat, 4 Jun 2005 01:32:25 +0300 (IDT) Message-ID: <42A0DA78.2040804@ev-en.org> Date: Fri, 03 Jun 2005 23:32:24 +0100 From: Baruch Even User-Agent: Debian Thunderbird 1.0.2 (X11/20050331) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Stephen Hemminger Cc: Adrian Bunk , Andrew Morton , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.6.12-rc5-mm2: "bic unavailable using TCP reno" messages References: <20050601022824.33c8206e.akpm@osdl.org> <20050602121511.GE4992@stusta.de> <429F1079.5070701@ev-en.org> <20050602103805.6beb4f4e@dxpl.pdx.osdl.net> <20050602203823.GI4992@stusta.de> <20050603143702.0422101d@dxpl.pdx.osdl.net> In-Reply-To: <20050603143702.0422101d@dxpl.pdx.osdl.net> X-Enigmail-Version: 0.91.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 2076 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: baruch@ev-en.org Precedence: bulk X-list: netdev Content-Length: 1091 Lines: 33 Stephen Hemminger wrote: > Here is what I am working on as better way to make the sysctl selection. 
> I am not totally happy with the way the default congestion control value is determined > by the load order. But it does seem good that if you load "tcp_xxx" module and it > registers it becomes the default. Looks good. > @@ -120,6 +117,52 @@ static int ipv4_sysctl_forward_strategy( > return 1; > } > > +static int proc_tcp_congestion_control(ctl_table *ctl, int write, struct file * filp, > + void __user *buffer, size_t *lenp, loff_t *ppos) > +{ > + char val[TCP_CA_NAME_MAX]; > + ctl_table tbl = { > + .data = val, > + .maxlen = TCP_CA_NAME_MAX, > + }; > + int ret; > + > + tcp_get_congestion_control(val); Maybe we should call this tcp_get_current_congestion_control(); the current name implies (to me) that you give it a name and it returns the ca struct. get_current might also just return the current one and the strcpy can be done here. Otherwise you probably should document the tcp_get_congestion_control() to say what size of string it accepts. Baruch From mmporter@cox.net Fri Jun 3 15:44:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 15:44:32 -0700 (PDT) Received: from fed1rmmtao04.cox.net (fed1rmmtao04.cox.net [68.230.241.35]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53MiSXq015431 for ; Fri, 3 Jun 2005 15:44:28 -0700 Received: from liberty.homelinux.org ([68.2.41.86]) by fed1rmmtao04.cox.net (InterMail vM.6.01.04.00 201-2131-118-20041027) with ESMTP id <20050603224327.ZYDV23392.fed1rmmtao04.cox.net@liberty.homelinux.org>; Fri, 3 Jun 2005 18:43:27 -0400 Received: (from mmporter@localhost) by liberty.homelinux.org (8.9.3/8.9.3/Debian 8.9.3-21) id PAA01451; Fri, 3 Jun 2005 15:43:25 -0700 Date: Fri, 3 Jun 2005 15:43:25 -0700 From: Matt Porter To: Stephen Hemminger Cc: torvalds@osdl.org, akpm@osdl.org, jgarzik@pobox.com, linux-kernel@vger.kernel.org, linuxppc-embedded@ozlabs.org, netdev@oss.sgi.com Subject: Re: [PATCH][5/5] RapidIO support: net driver over messaging Message-ID: <20050603154324.I32392@cox.net>
References: <20050602140359.B24818@cox.net> <20050602141247.C24818@cox.net> <20050602141946.D24818@cox.net> <20050602142509.E24818@cox.net> <20050602143404.F24818@cox.net> <20050602150543.7e4326b6@dxpl.pdx.osdl.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20050602150543.7e4326b6@dxpl.pdx.osdl.net>; from shemminger@osdl.org on Thu, Jun 02, 2005 at 03:05:43PM -0700 X-archive-position: 2077 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mporter@kernel.crashing.org Precedence: bulk X-list: netdev Content-Length: 5146 Lines: 155 On Thu, Jun 02, 2005 at 03:05:43PM -0700, Stephen Hemminger wrote: > How much is this like ethernet? does it still do ARP? It's nothing like Ethernet; the only relation is that an Ethernet network driver is easy to implement on top of raw message ports on a switched fabric network. It gives easy access to RIO messaging from userspace without inventing a new interface. ARP works by the driver emulating a broadcast over RIO by sending the same ARP packet to each node that is participating in the rionet. Nodes join/leave the rionet by sending RIO-specific doorbell messages to potential participants on the switched fabric. A table is kept to flag active participants such that a fast lookup can be made to translate the dst MAC address to a RIO device struct that is used to actually send the Ethernet packet encapsulated into a standard RIO message to the appropriate node(s). > Can it do promiscuous receive? No. > > +LIST_HEAD(rionet_peers); > > Does this have to be global? Nope, should be static. Fixing. > Not sure about the locking of this stuff, are you > relying on the RTNL? Yes, last I looked that was sufficient for all the entry points. I protect the driver-specific data (tx skb rings, etc.) with a private lock.
> > + > > +static int rionet_change_mtu(struct net_device *ndev, int new_mtu) > > +{ > > + struct rionet_private *rnet = ndev->priv; > > + > > + if (netif_msg_drv(rnet)) > > + printk(KERN_WARNING > > + "%s: rionet_change_mtu(): not implemented\n", DRV_NAME); > > + > > + return 0; > > +} > > If you can allow any mtu then don't need this at all. > Or if you are limited then better return an error for bad values. Ok, I do have an upper limit of 4082 as the RIO messages have a max 4096 byte payload. That's the default on open as well. I'll fix this up. > > +static void rionet_set_multicast_list(struct net_device *ndev) > > +{ > > + struct rionet_private *rnet = ndev->priv; > > + > > + if (netif_msg_drv(rnet)) > > + printk(KERN_WARNING > > + "%s: rionet_set_multicast_list(): not implemented\n", > > + DRV_NAME); > > +} > > If you can't handle it then just leave dev->set_multicast_list > as NULL and all attempts to add or delete will get -EINVAL Will do. It was a placeholder at one point when I thought I might emulate multicast in the driver...it's fallen down my priority list. > > + > > +static int rionet_open(struct net_device *ndev) > > +{ > > > > + /* Initialize inbound message ring */ > > + for (i = 0; i < RIONET_RX_RING_SIZE; i++) > > + rnet->rx_skb[i] = NULL; > > + rnet->rx_slot = 0; > > + rionet_rx_fill(ndev, 0); > > + > > + rnet->tx_slot = 0; > > + rnet->tx_cnt = 0; > > + rnet->ack_slot = 0; > > + > > + spin_lock_init(&rnet->lock); > > + > > + rnet->msg_enable = RIONET_DEFAULT_MSGLEVEL; > > Better to do all initialization of the per device data > in the place it is allocated (rio_setup_netdev) Right, will do. > > +static int rionet_ioctl(struct net_device *ndev, struct ifreq *rq, int cmd) > > +{ > > + return -EOPNOTSUPP; > > +} > > Unneeded, if dev->do_ioctl is NULL, then all private ioctl's will > return -EINVAL that is what you want. Ah, ok. Good, none of the MII stuff applies in this case.
> > +static u32 rionet_get_link(struct net_device *ndev) > > +{ > > + return netif_carrier_ok(ndev); > > +} > > Use ethtool_op_get_link Ok > > + /* Fill in the driver function table */ > > + ndev->open = &rionet_open; > > + ndev->hard_start_xmit = &rionet_start_xmit; > > + ndev->stop = &rionet_close; > > + ndev->get_stats = &rionet_stats; > > + ndev->change_mtu = &rionet_change_mtu; > > + ndev->set_mac_address = &rionet_set_mac_address; > > + ndev->set_multicast_list = &rionet_set_multicast_list; > > + ndev->do_ioctl = &rionet_ioctl; > > + SET_ETHTOOL_OPS(ndev, &rionet_ethtool_ops); > > + > > + ndev->mtu = RIO_MAX_MSG_SIZE - 14; > > + > > + SET_MODULE_OWNER(ndev); > > Can you set any ndev->features to get better performance. > Can you take >32bit data addresses? then set HIGHDMA > You are doing your own locking, can you use LLTX? > Does the hardware support scatter gather? Some of these get tricky. In general, rionet could support SG and with driver help we can flag IP_CSUM. In practice, the current generation MPC85xx HW on my development system have some problems with their message port dma queues. In short, their implementation is such that the arch-specific code is forced to do a copy of the skb on both tx and rx. Because of this, adding SG/IP_CSUM doesn't have any value yet...it'll make sense to add the additional features once we get a platform with better messaging hardware. HIGHDMA may not be suitable on all platforms. Since rionet sits on top of a hardware abstraction, it doesn't have full knowledge of the DMA capabilities of the hardware. We can eventually have some interfaces to the arch code to learn that info, but it's not there yet. I have to look into LLTX; I know what it stands for, but I'm not sure of the details. Do you have a good LLTX example reference? That said, my goal is to enable as many features as possible when we have hw to take advantage of them.
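The MTU point raised earlier in this message (return an error for bad values rather than silently returning 0) can be sketched as below. This is a minimal userspace-compilable sketch, not the rionet code: the struct, the function name, and the 68-byte lower bound are assumptions made for illustration. Only the 4082-byte upper limit (4096-byte RIO payload minus the 14-byte Ethernet header) comes from the discussion.

```c
#include <errno.h>

/* Figures from the thread: RIO messages carry at most 4096 bytes, and
 * the default MTU is that size minus the 14-byte Ethernet header. */
#define SKETCH_RIO_MAX_MSG_SIZE 4096
#define SKETCH_ETH_HLEN         14

/* Hypothetical lower bound, not taken from the thread. */
#define SKETCH_MIN_MTU          68

/* Stand-in for struct net_device; only the field the sketch needs. */
struct sketch_netdev {
	int mtu;
};

/* Reject out-of-range values instead of accepting everything. */
static int sketch_change_mtu(struct sketch_netdev *ndev, int new_mtu)
{
	if (new_mtu < SKETCH_MIN_MTU ||
	    new_mtu > SKETCH_RIO_MAX_MSG_SIZE - SKETCH_ETH_HLEN)
		return -EINVAL;

	ndev->mtu = new_mtu;
	return 0;
}
```

With these assumed bounds, a request for 4083 is refused and the previous MTU is left untouched, while 4082 is accepted.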
-Matt From kernel@linuxace.com Fri Jun 3 16:25:13 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 16:25:17 -0700 (PDT) Received: from linuxace.com (adsl-67-120-171-161.dsl.lsan03.pacbell.net [67.120.171.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j53NPCXq021135 for ; Fri, 3 Jun 2005 16:25:12 -0700 Received: (qmail 29381 invoked by uid 0); 3 Jun 2005 23:24:13 -0000 Date: Fri, 3 Jun 2005 16:24:13 -0700 From: Phil Oester To: netdev@oss.sgi.com Subject: Unitialized queue_lock oops? Message-ID: <20050603232413.GA29308@linuxace.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 2078 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@linuxace.com Precedence: bulk X-list: netdev Content-Length: 2797 Lines: 73 In my ongoing attempts to migrate to anything higher than 2.6.10, I decided to retest 2.6.11-rc2 but back out the problematic LLTX patch. I also enabled spinlock debugging, and hit an odd BUG. Full oops output below, but the summary is: kernel BUG at include/asm/spinlock.h:92! which is here: BUG_ON(lock->magic != SPINLOCK_MAGIC); And we got there via dev_queue_xmit: /* Grab device queue */ spin_lock(&dev->queue_lock); -- no complaints yet, so queue_lock must be initialized here rc = q->enqueue(skb, q); qdisc_run(dev); -- qdisc_run drops queue_lock briefly - does it get mangled while it's dropped? spin_unlock(&dev->queue_lock); -- now we hit the BUG - queue_lock->magic != SPINLOCK_MAGIC. I know the proposed LLTX changes were meant to address a race while the queue_lock was dropped - is the above another illustration of the race potential? Phil kernel BUG at include/asm/spinlock.h:92!
invalid operand: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 1 EIP: 0060:[] Not tainted VLI EFLAGS: 00010217 (2.6.11-rc2) EIP is at _spin_unlock+0x24/0x30 eax: f7ae7ec0 ebx: f6d5ff00 ecx: f6d5ffbc edx: f7ae7ec0 esi: f7ae3800 edi: c4a45f50 ebp: c0333d64 esp: c0333d64 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c0333000 task=c198aaf0) Stack: c0333d88 c023168a c0272eea f7ae3800 f7ae35bc 00000000 f590c89c f590c888 c63cc020 c0333da8 c0249873 c02497c0 f590c888 c4a45f50 00000000 00000004 00000002 c0333ddc c023b61e 00000000 f7ae3800 c0333dcc c02497c0 80000000 Call Trace: [] show_stack+0x7a/0x90 [] show_registers+0x14d/0x1b0 [] die+0xf9/0x180 [] do_invalid_op+0xa9/0xc0 [] error_code+0x2b/0x30 [] dev_queue_xmit+0x20a/0x290 [] ip_finish_output2+0xb3/0x1c0 [] nf_hook_slow+0xae/0xe0 [] ip_finish_output+0x1ee/0x200 [] ip_forward_finish+0x2c/0x50 [] nf_hook_slow+0xae/0xe0 [] ip_forward+0x19c/0x230 [] ip_rcv_finish+0x1b8/0x230 [] nf_hook_slow+0xae/0xe0 [] ip_rcv+0x3b5/0x470 [] netif_receive_skb+0x13a/0x190 [] e1000_clean_rx_irq+0x156/0x480 [] e1000_clean+0x45/0xf0 [] net_rx_action+0x90/0x130 [] __do_softirq+0xb8/0xd0 [] do_softirq+0x4d/0x60 ======================= [] do_IRQ+0x68/0xa0 [] common_interrupt+0x1a/0x20 [] cpu_idle+0x5f/0x70 [<00000000>] 0x0 [] 0xc198bfbc Code: 8d bc 27 00 00 00 00 55 89 c2 89 e5 81 78 04 ad 4e ad de 75 16 0f b6 02 84 c0 7f 05 c6 02 01 5d c3 0f 0b 5d 00 08 9b 29 c0 eb f1 <0f> 0b 5c 00 08 9b 29 c0 eb e0 89 f6 55 89 e5 f0 81 00 00 00 00 From buytenh@wantstofly.org Fri Jun 3 16:28:03 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 16:28:10 -0700 (PDT) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53NS2Xq021612 for ; Fri, 3 Jun 2005 16:28:03 -0700 Received: by xi.wantstofly.org (Postfix, from userid 500) id EA0A1945C8; Sat, 4 Jun 2005 01:26:56 +0200 (MEST) Date: Sat, 4 Jun 2005 01:26:56 +0200 From: Lennert Buytenhek To: Michael Chan Cc: 
"David S. Miller" , mitch.a.williams@intel.com, hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch Message-ID: <20050603232656.GB21125@xi.wantstofly.org> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> <1117828169.4430.29.camel@rh4> <20050603205944.GC20623@xi.wantstofly.org> <1117830922.4430.44.camel@rh4> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1117830922.4430.44.camel@rh4> User-Agent: Mutt/1.4.1i X-archive-position: 2079 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev Content-Length: 1520 Lines: 32 On Fri, Jun 03, 2005 at 01:35:22PM -0700, Michael Chan wrote: > > > Yes, in tg3, rx buffers are replenished and put back into the ring > > > as completed packets are taken off the ring. But we don't tell the > > > chip about these new buffers until we get to the end of the loop, > > > potentially after a full quota of packets. > > > > Which makes a lot more sense, since you'd rather do one MMIO write > > at the end of the loop than one per iteration, especially if your > > MMIO read (flush) latency is high. (Any subsequent MMIO read will > > have to flush out all pending writes, which'll be slow if there's > > a lot of writes still in the queue.) > > I agree on the merit of issuing only one IO at the end. What I'm saying > is that doing so will make it similar to e1000 where all the buffers are > replenished at the end. Isn't that so or am I missing something? I think you're right: for e1000 as well as tg3, the NIC cannot use the new RX buffers until the CPU breaks out of the poll loop. 
I don't understand why reducing the weight apparently makes the e1000 go faster. Perhaps as Robert said, the RX ring is not big enough and that's why feeding RX buffers back to the chip more aggressively might help prevent overruns? I would say that running with an N+64-entry RX ring and a weight of 64 should not show any worse behavior than running with an N+16-entry RX ring with a weight of 16. If anything, weight=64 should show _better_ performance than weight=16. Something else must be going on. --L From buytenh@wantstofly.org Fri Jun 3 16:31:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 16:31:25 -0700 (PDT) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53NVLXq022336 for ; Fri, 3 Jun 2005 16:31:21 -0700 Received: by xi.wantstofly.org (Postfix, from userid 500) id 004CA945C8; Sat, 4 Jun 2005 01:30:21 +0200 (MEST) Date: Sat, 4 Jun 2005 01:30:21 +0200 From: Lennert Buytenhek To: Edgar E Iglesias Cc: Michael Chan , "David S.
Miller" , mitch.a.williams@intel.com, hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch Message-ID: <20050603233021.GC21125@xi.wantstofly.org> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> <1117828169.4430.29.camel@rh4> <20050603205944.GC20623@xi.wantstofly.org> <20050603210701.GA3263@edgar.se.axis.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20050603210701.GA3263@edgar.se.axis.com> User-Agent: Mutt/1.4.1i X-archive-position: 2080 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev Content-Length: 1295 Lines: 30 On Fri, Jun 03, 2005 at 11:07:01PM +0200, Edgar E Iglesias wrote: > > > Yes, in tg3, rx buffers are replenished and put back into the ring > > > as completed packets are taken off the ring. But we don't tell the > > > chip about these new buffers until we get to the end of the loop, > > > potentially after a full quota of packets. > > > > Which makes a lot more sense, since you'd rather do one MMIO write > > at the end of the loop than one per iteration, especially if your > > MMIO read (flush) latency is high. (Any subsequent MMIO read will > > have to flush out all pending writes, which'll be slow if there's > > a lot of writes still in the queue.) > > Maybe it would be better to put a fixed weight at this level, return > the descriptors to the HW after every X packets. That way you > can keep the NAPI weight at 64 (or what ever) and still give back > descriptors to HW more often. 
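The scheme suggested in the quoted text (hand descriptors back to the hardware every X packets while keeping the NAPI weight unchanged) might look roughly like the following. This is a simulation-only sketch under stated assumptions: the ring structure, the function names, and the batch size of 16 are hypothetical, and the "doorbell" merely counts what would be an MMIO write in a real driver such as tg3 or e1000.

```c
/* Hypothetical value for "X": tell the NIC about new buffers this often. */
#define SKETCH_REFILL_BATCH 16

/* Simulated RX ring state; a real driver would write a hardware register. */
struct sketch_ring {
	unsigned int pending;   /* buffers refilled but not yet announced */
	unsigned int posted;    /* buffers the NIC has been told about */
	unsigned int doorbells; /* number of simulated MMIO writes */
};

/* Announce all pending buffers with a single (simulated) MMIO write. */
static void sketch_ring_doorbell(struct sketch_ring *r)
{
	if (r->pending == 0)
		return;
	r->posted += r->pending;
	r->pending = 0;
	r->doorbells++;
}

/* Poll up to `weight` packets out of `avail`, refilling one buffer per
 * packet and ringing the doorbell once per SKETCH_REFILL_BATCH buffers. */
static unsigned int sketch_poll(struct sketch_ring *r, unsigned int avail,
				unsigned int weight)
{
	unsigned int done = 0;

	while (done < weight && done < avail) {
		/* ... packet would be passed up the stack here ... */
		done++;
		r->pending++;
		if (r->pending == SKETCH_REFILL_BATCH)
			sketch_ring_doorbell(r);
	}

	/* Flush whatever is left so no buffer waits for the next poll. */
	sketch_ring_doorbell(r);
	return done;
}
```

With a weight of 64 and a batch of 16, a full poll issues four doorbell writes instead of one, which is the trade-off between MMIO cost and refill latency that the thread is debating.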
For this scheme to make any difference at all, the RX ring must be overflowing in the case where we refill the RX ring only once every 64 packets. If the RX ring _is_ overflowing but the system is otherwise capable of keeping up with the receive rate (i.e. the packet service times as seen by the NIC have a high variance), simply make the RX ring bigger. I don't see what's going on. --L From herbert@gondor.apana.org.au Fri Jun 3 16:47:43 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 16:47:51 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53NleXq023265 for ; Fri, 3 Jun 2005 16:47:42 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeLru-00089e-00; Sat, 04 Jun 2005 09:46:26 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeLrr-0005Fc-00; Sat, 04 Jun 2005 09:46:23 +1000 Date: Sat, 4 Jun 2005 09:46:23 +1000 To: "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050603234623.GA20088@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2081 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1730 Lines: 45 Hi: I was looking at how we can move the IPsec input/output processing out of the critical section protected by the spin locks on the xfrm_state. This is useful because it would allow concurrent processing of IPsec packets for the same SA. It is also necessary if we're ever going to add support for asynchronous crypto to IPsec. 
The first requirement for this is that we need to stop using data that is shared across a single SA in the IPsec input/output routines. The biggest hurdle there as it stands is sgbuf in esp_data. This was introduced to reduce stack usage in esp_input/esp_output as sgbuf would consume up to 64 bytes of space. In order to move it back onto the stack (so we can run these things in parallel), I'm thinking of reducing the size of the scatterlist structure itself. The Crypto API doesn't need all the data contained in a scatterlist structure. For instance, it has no need for anything to do with DMA. When we implement hardware crypto (which might do DMA), they're going to have their own lists of descriptors so they can't use the scatterlist as is anyway. The skb_frag_t structure on the other hand is much more suited for our purpose. It is only half the size of scatterlist on i386. So what do you think about introducing a new crypto_frag structure which looks like this: struct crypto_frag { struct page *page; u16 offset; u16 length; }; We could then move sgbuf back into esp_input/esp_output at the cost of 32 bytes of stack. Is this stack cost acceptable? 
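To make the size argument concrete, here is a small sketch of the proposed structure with fixed-width types standing in for the kernel's u16. The `_sketch` names and the four-entry array are assumptions: four entries is inferred from the figures in this message (32 bytes of stack at 8 bytes per entry on i386), not stated in it.

```c
#include <stddef.h>
#include <stdint.h>

struct page; /* opaque here; the sketch only stores a pointer to it */

/* Userspace model of the proposed crypto_frag layout. */
struct crypto_frag_sketch {
	struct page *page;
	uint16_t offset;
	uint16_t length;
};

/* Assumed entry count, inferred from 32 bytes / 8 bytes per entry. */
#define SKETCH_SG_ENTRIES 4

/* Stack cost of keeping the array local to an input/output routine. */
static size_t sketch_sgbuf_stack_bytes(void)
{
	struct crypto_frag_sketch sgbuf[SKETCH_SG_ENTRIES];

	return sizeof(sgbuf);
}
```

On a 32-bit ABI each entry is a 4-byte pointer plus two 16-bit fields, which gives the 8 bytes per entry behind the 32-byte figure; on common 64-bit ABIs alignment padding brings each entry to 16 bytes, so the stack cost doubles there.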
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From herbert@gondor.apana.org.au Fri Jun 3 16:52:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 16:52:38 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j53NqXXq023918 for ; Fri, 3 Jun 2005 16:52:34 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeLwq-0008BS-00; Sat, 04 Jun 2005 09:51:32 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeLwo-0005GI-00; Sat, 04 Jun 2005 09:51:30 +1000 From: Herbert Xu To: kernel@linuxace.com (Phil Oester) Subject: Re: Unitialized queue_lock oops? Cc: netdev@oss.sgi.com Organization: Core In-Reply-To: <20050603232413.GA29308@linuxace.com> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Sat, 04 Jun 2005 09:51:30 +1000 X-archive-position: 2082 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 634 Lines: 16 Phil Oester wrote: > > I know the proposed LLTX changes were meant to address a race while > the queue_lock was dropped - is the above another illustration of the > race potential? I'd say that either you're using a dodgy qdisc, or your hardware is just stuffed. That is, if you are using the default qdisc, you should start looking at replacing pieces of the hardware to find the problem. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From kernel@linuxace.com Fri Jun 3 17:01:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 17:01:47 -0700 (PDT) Received: from linuxace.com (adsl-67-120-171-161.dsl.lsan03.pacbell.net [67.120.171.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j5401iXq024784 for ; Fri, 3 Jun 2005 17:01:44 -0700 Received: (qmail 29514 invoked by uid 0); 4 Jun 2005 00:00:46 -0000 Date: Fri, 3 Jun 2005 17:00:46 -0700 From: Phil Oester To: Herbert Xu Cc: netdev@oss.sgi.com Subject: Re: Unitialized queue_lock oops? Message-ID: <20050604000046.GA29438@linuxace.com> References: <20050603232413.GA29308@linuxace.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 2083 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@linuxace.com Precedence: bulk X-list: netdev Content-Length: 374 Lines: 9 On Sat, Jun 04, 2005 at 09:51:30AM +1000, Herbert Xu wrote: > I'd say that either you're using a dodgy qdisc, or your hardware is > just stuffed. That is, if you are using the default qdisc, you should > start looking at replacing pieces of the hardware to find the problem. Yes, default qdisc. Interesting that 2.6.10 is rock solid on the same hardware...oh well. 
Phil From jgarzik@pobox.com Fri Jun 3 17:03:58 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 17:04:01 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j5403wXq025385 for ; Fri, 3 Jun 2005 17:03:58 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.51 #1 (Red Hat Linux)) id 1DeM7q-00026Z-Hb; Sat, 04 Jun 2005 00:02:54 +0000 Message-ID: <42A0EFAC.7070609@pobox.com> Date: Fri, 03 Jun 2005 20:02:52 -0400 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050328 Fedora/1.7.6-1.2.5 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Herbert Xu CC: "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag References: <20050603234623.GA20088@gondor.apana.org.au> In-Reply-To: <20050603234623.GA20088@gondor.apana.org.au> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2084 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 954 Lines: 25 Herbert Xu wrote: > The Crypto API doesn't need all the data contained in a scatterlist > structure. For instance, it has no need for anything to do with DMA. > When we implement hardware crypto (which might do DMA), they're going > to have their own lists of descriptors so they can't use the scatterlist > as is anyway. I'm not sure I agree with this. A standard feature of struct scatterlist is having the DMA mappings right next to the kernel virtual address/length info. Drivers use the arch-specific DMA-mapped part of struct scatterlist to fill the hardware-specific descriptions with addresses and other info. 
Since you -will- have to DMA map buffers before passing them to hardware, it seems like struct scatterlist is much more appropriate than crypto_frag when dealing with hardware. For pure software implementations, I don't see why you can't just ignore the extra fields that each arch puts into struct scatterlist. Jeff From niv@us.ibm.com Fri Jun 3 17:21:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 17:21:19 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j540L5Xq028640 for ; Fri, 3 Jun 2005 17:21:11 -0700 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e34.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j540K68E421694 for ; Fri, 3 Jun 2005 20:20:06 -0400 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j540K66g124576 for ; Fri, 3 Jun 2005 18:20:06 -0600 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j540K5tw020413 for ; Fri, 3 Jun 2005 18:20:05 -0600 Received: from [9.47.22.158] (dyn9047022158.beaverton.ibm.com [9.47.22.158]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j540K5mB020403 for ; Fri, 3 Jun 2005 18:20:05 -0600 Message-ID: <42A0F3B4.1060601@us.ibm.com> Date: Fri, 03 Jun 2005 17:20:04 -0700 From: Nivedita Singhvi User-Agent: Mozilla Thunderbird 0.8 (X11/20041020) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: Automated linux kernel testing results Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2085 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1458 Lines: 39 For those who don't read lkml, I 
thought I'd point to Martin Bligh's post regarding automated testing being set up, since some people on this list were interested. http://marc.theaimsgroup.com/?l=linux-kernel&m=111775021327595&w=2 Networking tests are in plan... thanks, Nivedita -------------------------- OK, I've finally got this to the point where I can publish it. http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/regression_matrix.html Currently it builds and boots any mainline, -mjb, -mm kernel within about 15 minutes of release. runs dbench, tbench, kernbench, reaim and fsx. Currently I'm using a 4x AMD64 box, a 16x NUMA-Q, 4x NUMA-Q, 32x x440 (ia32) PPC64 Power 5 LPAR, PPC64 Power 4 LPAR, and PPC64 Power 4 bare metal system. The config files it uses are linked by the machine names in the column headers. Thanks to all the other IBM people who've worked on the ABAT test system that this stuff relies on - too many to list, but especially Andy, Adam, and Enrique, who have fixed endless bugs, and put up with my incessant bitching about it all not working as it should ;-) Clicking on the failure ones error codes should take you to somewhere vaguely helpful to diagnose it. Clicking on the job number just below that takes you to the info I'm publishing right now, which should include perf results and profiles, etc. I'll add graphs, etc later, comparing performance across kernels (I have them ... just not automated). 
From herbert@gondor.apana.org.au Fri Jun 3 17:35:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 17:35:51 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j540ZjXq029730 for ; Fri, 3 Jun 2005 17:35:46 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeMcd-0008PT-00; Sat, 04 Jun 2005 10:34:43 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeMcb-0005KZ-00; Sat, 04 Jun 2005 10:34:41 +1000 Date: Sat, 4 Jun 2005 10:34:41 +1000 To: Phil Oester Cc: netdev@oss.sgi.com Subject: Re: Unitialized queue_lock oops? Message-ID: <20050604003441.GA20471@gondor.apana.org.au> References: <20050603232413.GA29308@linuxace.com> <20050604000046.GA29438@linuxace.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604000046.GA29438@linuxace.com> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2086 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 817 Lines: 20 On Fri, Jun 03, 2005 at 05:00:46PM -0700, Phil Oester wrote: > > Yes, default qdisc. Interesting that 2.6.10 is rock solid on the same > hardware...oh well. Well if you do have the time feel free to keep searching back to 2.6.10. Even though I'd say that this is most likely to turn out to be a hardware problem, there is no telling what you might find along the way. At least it might tell us what sort of hardware problems would result in only networking crashes :) If this were your average hardware problem I'd have expected to see crashes all over the place, especially under fs/ and mm/. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From kernel@linuxace.com Fri Jun 3 17:39:34 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 17:39:36 -0700 (PDT) Received: from linuxace.com (adsl-67-120-171-161.dsl.lsan03.pacbell.net [67.120.171.161]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j540dXXq030385 for ; Fri, 3 Jun 2005 17:39:33 -0700 Received: (qmail 29677 invoked by uid 0); 4 Jun 2005 00:38:35 -0000 Date: Fri, 3 Jun 2005 17:38:35 -0700 From: Phil Oester To: Herbert Xu Cc: netdev@oss.sgi.com Subject: Re: Unitialized queue_lock oops? Message-ID: <20050604003835.GA29635@linuxace.com> References: <20050603232413.GA29308@linuxace.com> <20050604000046.GA29438@linuxace.com> <20050604003441.GA20471@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604003441.GA20471@gondor.apana.org.au> User-Agent: Mutt/1.4.1i X-archive-position: 2087 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@linuxace.com Precedence: bulk X-list: netdev Content-Length: 874 Lines: 20 On Sat, Jun 04, 2005 at 10:34:41AM +1000, Herbert Xu wrote: > On Fri, Jun 03, 2005 at 05:00:46PM -0700, Phil Oester wrote: > > > > Yes, default qdisc. Interesting that 2.6.10 is rock solid on the same > > hardware...oh well. > > Well if you do have the time feel free to keep searching back to 2.6.10. > Even though I'd say that this is most likely to turn out to be a hardware > problem, there is no telling what you might find along the way. > > At least it might tell us what sort of hardware problems would result in > only networking crashes :) If this were your average hardware problem > I'd have expected to see crashes all over the place, especially under > fs/ and mm/. 
Ok, how bout next week I adjust OSPF costs to make my secondary firewall primary, and see if I still have problems? At least then we can put the hardware problem theory behind us... Phil From herbert@gondor.apana.org.au Fri Jun 3 17:43:25 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 17:43:30 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j540hNXq031001 for ; Fri, 3 Jun 2005 17:43:24 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeMjm-0008VR-00; Sat, 04 Jun 2005 10:42:06 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeMjh-0005Lr-00; Sat, 04 Jun 2005 10:42:01 +1000 Date: Sat, 4 Jun 2005 10:42:01 +1000 To: Jeff Garzik Cc: "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604004201.GB20471@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <42A0EFAC.7070609@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42A0EFAC.7070609@pobox.com> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2088 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1837 Lines: 42 Hi Jeff: On Fri, Jun 03, 2005 at 08:02:52PM -0400, Jeff Garzik wrote: > > A standard feature of struct scatterlist is having the DMA mappings > right next to the kernel virtual address/length info. Drivers use the > arch-specific DMA-mapped part of struct scatterlist to fill the > hardware-specific descriptions with addresses and other info. Agreed. 
> Since you -will- have to DMA map buffers before passing them to > hardware, it seems like struct scatterlist is much more appropriate than > crypto_frag when dealing with hardware. > > For pure software implementations, I don't see why you can't just ignore > the extra fields that each arch puts into struct scatterlist. It depends on who is going to do the mapping. When we implement hardware crypto, the DMA mapping will be done either by the crypto layer or under it by the driver itself. So the crypto layer is certainly going to need the scatterlist structure. However, the users of the crypto layer (such as IPsec/dmcrypt) don't have to know about DMA at all. Therefore the data structure between the users and the crypto layer itself doesn't have to carry DMA information. Compare this with the block layer. Between the users of the block layer and the block layer itself you have the bio_vec structure which carries no DMA information. The scatterlist structure only comes into play after DMA mapping has been carried out under the block layer. So this is really a sort of bio_vec for crypto structures. The objective here is to make the structure as compact as possible to allow users to allocate it on the stack most of the time. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From mchan@broadcom.com Fri Jun 3 18:24:39 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 18:24:49 -0700 (PDT) Received: from MMS2.broadcom.com (mms2.broadcom.com [216.31.210.18]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j541OdXq000881 for ; Fri, 3 Jun 2005 18:24:39 -0700 Received: from 10.10.64.121 by MMS2.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Fri, 03 Jun 2005 18:23:23 -0700 X-Server-Uuid: 1F20ACF3-9CAF-44F7-AB47-F294E2D5B4EA Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Fri, 3 Jun 2005 18:23:22 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BCC34449; Fri, 3 Jun 2005 18:23:11 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id SAA12263; Fri, 3 Jun 2005 18:23:11 -0700 (PDT) Received: from 10.7.18.177 ([10.7.18.177]) by NT-IRVA-0741.brcm.ad.broadcom.com ([10.8.194.54]) with Microsoft Exchange Server HTTP-DAV ; Sat, 4 Jun 2005 01:23:04 +0000 Received: from rh4 by nt-irva-0741; 03 Jun 2005 17:25:36 -0700 Subject: Re: RFC: NAPI packet weighting patch From: "Michael Chan" To: hadi@cyberus.ca cc: "Lennert Buytenhek" , "David S. 
Miller" , mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com In-Reply-To: <1117837798.6266.25.camel@localhost.localdomain> References: <20050603.120126.41874584.davem@davemloft.net> <20050603.132257.23013342.davem@davemloft.net> <20050603.132922.63997492.davem@davemloft.net> <1117828169.4430.29.camel@rh4> <20050603205944.GC20623@xi.wantstofly.org> <1117830922.4430.44.camel@rh4> <1117837798.6266.25.camel@localhost.localdomain> Date: Fri, 03 Jun 2005 17:25:36 -0700 Message-ID: <1117844736.4430.51.camel@rh4> MIME-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-WSS-ID: 6EBFDD011VO5052745-01-01 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-archive-position: 2089 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev Content-Length: 693 Lines: 17 On Fri, 2005-06-03 at 18:29 -0400, jamal wrote: > On Fri, 2005-03-06 at 13:35 -0700, Michael Chan wrote: > > > By the way, in tg3 there is a buffer replenishment threshold programmed > > to the chip and is currently set at rx_pending / 8 (200/8 = 25). This > > means that the chip will replenish 25 rx buffers at a time. > > > > So when you write the MMIO, 25 buffers are replenished or is this auto > magically happening in the background? Sounds like a neat feature either > way. > The MMIO writes a cumulative producer index of new rx descriptors in the ring. As the chip requires new buffers for rx packets, it will DMA 25 of these rx descriptors at a time up to the producer index. 
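The cumulative producer index described above can be sketched as a small model in C. This is only an illustration under stated assumptions, not the tg3 driver or the chip's actual logic: the names `rx_ring_model`, `driver_post_buffers`, `chip_fetch_batch` and `ring_slot` and the ring size of 512 are hypothetical, and the partial final batch (fetching fewer than 25 descriptors when fewer remain below the producer index) is a guess at what "up to the producer index" means.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE        512u               /* hypothetical ring size */
#define RX_PENDING       200u               /* rx_pending value quoted in the thread */
#define REPLENISH_THRESH (RX_PENDING / 8u)  /* 200 / 8 = 25 descriptors per batch */

/* Toy model of the state shared between driver and chip (hypothetical). */
struct rx_ring_model {
    uint32_t producer;  /* cumulative index last written by the driver via MMIO */
    uint32_t fetched;   /* cumulative count of descriptors the chip has DMAed */
};

/* Driver side: make n new rx descriptors visible with one index write. */
static void driver_post_buffers(struct rx_ring_model *r, uint32_t n)
{
    uint32_t outstanding = r->producer - r->fetched;

    /* Never post more descriptors than the ring has free slots. */
    assert(n <= RING_SIZE - outstanding);
    r->producer += n;
}

/*
 * Chip side: fetch one batch of descriptors, at most REPLENISH_THRESH
 * and never past the producer index. Returns how many were fetched.
 * Assumption: a short final batch is allowed.
 */
static uint32_t chip_fetch_batch(struct rx_ring_model *r)
{
    uint32_t avail = r->producer - r->fetched;  /* wrap-safe unsigned difference */
    uint32_t n = avail < REPLENISH_THRESH ? avail : REPLENISH_THRESH;

    r->fetched += n;
    return n;
}

/* Map a cumulative index onto a slot in the descriptor ring. */
static uint32_t ring_slot(uint32_t index)
{
    return index % RING_SIZE;
}
```

In this model a single index write covering 60 new buffers is consumed by the chip in batches of 25, 25 and 10, so the driver's MMIO write rate is decoupled from the chip's DMA batch size.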
From jmorris@redhat.com Fri Jun 3 21:40:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 21:40:53 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j544eoXq013446 for ; Fri, 3 Jun 2005 21:40:50 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id j544dhgU003834; Sat, 4 Jun 2005 00:39:43 -0400 Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id j544dgO20153; Sat, 4 Jun 2005 00:39:42 -0400 Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id j544dfDj029248; Sat, 4 Jun 2005 00:39:42 -0400 Date: Sat, 4 Jun 2005 00:39:41 -0400 (EDT) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Herbert Xu cc: Jeff Garzik , "David S. Miller" , Linux Crypto Mailing List , Subject: Re: [RFC] Replace scatterlist with crypto_frag In-Reply-To: <20050604004201.GB20471@gondor.apana.org.au> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2090 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev Content-Length: 314 Lines: 15 On Sat, 4 Jun 2005, Herbert Xu wrote: > So this is really a sort of bio_vec for crypto structures. The objective > here is to make the structure as compact as possible to allow users to > allocate it on the stack most of the time. Seems like a good idea to me. 
- James -- James Morris From herbert@gondor.apana.org.au Fri Jun 3 21:53:06 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 21:53:19 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j544r4Xq014289 for ; Fri, 3 Jun 2005 21:53:05 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeQd5-00012e-00; Sat, 04 Jun 2005 14:51:27 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeQcx-0006a6-00; Sat, 04 Jun 2005 14:51:19 +1000 Date: Sat, 4 Jun 2005 14:51:19 +1000 To: James Morris Cc: Jeff Garzik , "David S. Miller" , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604045119.GA25270@gondor.apana.org.au> References: <20050604004201.GB20471@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2091 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 782 Lines: 21 On Sat, Jun 04, 2005 at 12:39:41AM -0400, James Morris wrote: > On Sat, 4 Jun 2005, Herbert Xu wrote: > > > So this is really a sort of bio_vec for crypto structures. The objective > > here is to make the structure as compact as possible to allow users to > > allocate it on the stack most of the time. > > Seems like a good idea to me. Thanks James. What do you think about eating up 32 bytes on the stack in esp_input/esp_output? In fact, how did we come up with the number of four frags? Why wouldn't say two frags do for most users or perhaps even one? 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From imipak@yahoo.com Fri Jun 3 22:02:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 22:02:24 -0700 (PDT) Received: from web31504.mail.mud.yahoo.com (web31504.mail.mud.yahoo.com [68.142.198.133]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id j5452LXq015391 for ; Fri, 3 Jun 2005 22:02:22 -0700 Received: (qmail 9899 invoked by uid 60001); 4 Jun 2005 05:01:23 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:Received:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=UfBP4q3u6+tXOFdEMbW/rZmEjhIHiB6oK1XIRbqtuX5qLY2x0UEgNnJf5SY9iQFYWttGt+hrGCnrKC9WCszTo3YJphiSTY9d8PguECy22Uv9ARW68Qrfxb+pCOiTaFmymFcPdUEpPhoTv5+/Dlg7JvB+zZljnXp8dMyCU3uOql8= ; Message-ID: <20050604050123.9897.qmail@web31504.mail.mud.yahoo.com> Received: from [70.59.136.169] by web31504.mail.mud.yahoo.com via HTTP; Fri, 03 Jun 2005 22:01:23 PDT Date: Fri, 3 Jun 2005 22:01:23 -0700 (PDT) From: Jonathan Day Subject: Re: Automated linux kernel testing results To: Nivedita Singhvi , netdev@oss.sgi.com In-Reply-To: <42A0F3B4.1060601@us.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-archive-position: 2092 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: imipak@yahoo.com Precedence: bulk X-list: netdev Content-Length: 3157 Lines: 105 I am very impressed, especially as it sounds as though a lot more tests exist (he talks of only pushing small amounts of data to kernel.org) and a lot more are going to be added. 
It seems to me that there are a lot of disparate test suites out there - some test the APIs, some benchmark the performance, some validate the state at the end, some verify that the source obeys expected rules. What I have not (yet) seen is any work on relating the results. Is a bug in the design? The implementation? Some combination thereof? Is something correctly written but not functioning because something it depends on isn't working correctly? It would even be useful if we could cross-reference some of the benchmarks with the Linux graphing project, so that we could see how the complexity of the tested component differs between versions and variants. (A small degradation in performance, if related to a large increase in necessary sophistication, is not necessarily that bad. The same performance drop, if related to a massive simplification of the design, is an indication of a serious problem.) Test suites are necessary. Test suites are great. Anyone working on a test suite deserves many kudos and much praise. Test suites that are relatable enough that you can see the same problem from different angles -- those are worth their printout weight in gold. --- Nivedita Singhvi wrote: > For those who don't read lkml, I thought I'd point > to > Martin Bligh's post regarding automated testing > being > set up, since some people on this list were > interested. > > http://marc.theaimsgroup.com/?l=linux-kernel&m=111775021327595&w=2 > > Networking tests are in plan... > > thanks, > Nivedita > > -------------------------- > > OK, I've finally got this to the point where I can > publish it. > > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/regression_matrix.html > > > Currently it builds and boots any mainline, -mjb, > -mm kernel within > about 15 minutes of release. runs dbench, tbench, > kernbench, reaim and fsx. 
> Currently I'm using a 4x AMD64 box, a 16x NUMA-Q, 4x > NUMA-Q, 32x x440 > (ia32) > PPC64 Power 5 LPAR, PPC64 Power 4 LPAR, and PPC64 > Power 4 bare metal > system. > The config files it uses are linked by the machine > names in the column > headers. > > Thanks to all the other IBM people who've worked on > the ABAT test system > that this stuff relies on - too many to list, but > especially Andy, Adam, > and Enrique, who have fixed endless bugs, and put up > with my incessant > bitching about it all not working as it should ;-) > > Clicking on the failure ones error codes should take > you to somewhere > vaguely helpful to diagnose it. Clicking on the job > number just below > that takes you to the info I'm publishing right now, > which should > include perf results and profiles, etc. I'll add > graphs, etc later, > comparing performance across kernels (I have them > ... just not automated). 
From jmorris@redhat.com Fri Jun 3 22:25:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 22:25:14 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j545P7Xq016566 for ; Fri, 3 Jun 2005 22:25:07 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id j545NvbQ010698; Sat, 4 Jun 2005 01:23:57 -0400 Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id j545NuO23822; Sat, 4 Jun 2005 01:23:56 -0400 Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id j545NtDj031527; Sat, 4 Jun 2005 01:23:55 -0400 Date: Sat, 4 Jun 2005 01:23:55 -0400 (EDT) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Herbert Xu cc: Jeff Garzik , "David S. Miller" , Linux Crypto Mailing List , Subject: Re: [RFC] Replace scatterlist with crypto_frag In-Reply-To: <20050604045119.GA25270@gondor.apana.org.au> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2093 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev Content-Length: 425 Lines: 19 On Sat, 4 Jun 2005, Herbert Xu wrote: > Thanks James. What do you think about eating up 32 bytes on the > stack in esp_input/esp_output? Sounds like a low price to pay, given the general overhead of ipsec. > In fact, how did we come up with the number of four frags? Why wouldn't > say two frags do for most users or perhaps even one? I don't know where that came from. 
- James -- James Morris From jm@jm.kir.nu Fri Jun 3 22:34:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 22:34:15 -0700 (PDT) Received: from jm.kir.nu (dsl017-049-110.sfo4.dsl.speakeasy.net [69.17.49.110]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j545YAXq017326 for ; Fri, 3 Jun 2005 22:34:11 -0700 Received: from jm by jm.kir.nu with local (Exim 4.43) id 1DeRDV-0002DE-Es; Fri, 03 Jun 2005 22:29:05 -0700 Date: Fri, 3 Jun 2005 22:29:05 -0700 From: Jouni Malinen To: Jiri Benc Cc: gwingerde@home.nl, netdev@oss.sgi.com, jbohac@suse.cz Subject: Re: [PATCH] ieee80211: Update generic definitions to latest specs. Message-ID: <20050604052905.GA8130@jm.kir.nu> References: <20050602190232.340996282D7@mail.suse.cz> <20050603113343.55d19cfc@griffin.suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050603113343.55d19cfc@griffin.suse.cz> User-Agent: Mutt/1.5.8i X-archive-position: 2094 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkmaline@cc.hut.fi Precedence: bulk X-list: netdev Content-Length: 557 Lines: 13 On Fri, Jun 03, 2005 at 11:33:43AM +0200, Jiri Benc wrote: > and so on. Also WLAN_STATUS_ASSOC_DENIED_NOSHORT seems to be acceptable > for me. That would be just asking for problems. IEEE 802.11 uses "short" in a number of terms, and two of them (short preamble and short slot time) are already part of capabilities negotiation, and both have status codes for rejecting association. In other words, the constants/enums had better include PREAMBLE and SLOTTIME in the name. 
-- Jouni Malinen PGP id EFC895FA From herbert@gondor.apana.org.au Fri Jun 3 22:35:12 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 22:35:22 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j545ZAXq017615 for ; Fri, 3 Jun 2005 22:35:11 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeRI9-0001JB-00; Sat, 04 Jun 2005 15:33:53 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeRI4-0001XZ-00; Sat, 04 Jun 2005 15:33:48 +1000 Date: Sat, 4 Jun 2005 15:33:48 +1000 To: James Morris Cc: Jeff Garzik , "David S. Miller" , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604053348.GA5877@gondor.apana.org.au> References: <20050604045119.GA25270@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2095 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 708 Lines: 19 On Sat, Jun 04, 2005 at 01:23:55AM -0400, James Morris wrote: > On Sat, 4 Jun 2005, Herbert Xu wrote: > > > Thanks James. What do you think about eating up 32 bytes on the > > stack in esp_input/esp_output? > > Sounds like a low price to pay, given the general overhead of ipsec. I agree with you on the stack usage. 
BTW, we can now pump 5Gb/s through the Crypto API using a 1GHz VIA CPU with the Padlock so encryption is no longer necessarily the slowest piece along the pipeline :) Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From jm@jm.kir.nu Fri Jun 3 22:50:09 2005 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Jun 2005 22:50:14 -0700 (PDT) Received: from jm.kir.nu (dsl017-049-110.sfo4.dsl.speakeasy.net [69.17.49.110]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j545o9Xq018724 for ; Fri, 3 Jun 2005 22:50:09 -0700 Received: from jm by jm.kir.nu with local (Exim 4.43) id 1DeRT3-0002EJ-MY; Fri, 03 Jun 2005 22:45:09 -0700 Date: Fri, 3 Jun 2005 22:45:09 -0700 From: Jouni Malinen To: Jiri Benc Cc: NetDev , Jeff Garzik , Jirka Bohac Subject: Re: [6/9] ieee80211: ethernet independency Message-ID: <20050604054509.GB8130@jm.kir.nu> References: <20050603182625.64d33be3@griffin.suse.cz> <20050603183418.58c47b0c@griffin.suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050603183418.58c47b0c@griffin.suse.cz> User-Agent: Mutt/1.5.8i X-archive-position: 2096 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkmaline@cc.hut.fi Precedence: bulk X-list: netdev Content-Length: 1446 Lines: 29 On Fri, Jun 03, 2005 at 06:34:18PM +0200, Jiri Benc wrote: > Makes the 802.11 layer independent of ethernet. (The previous implementation > had the ethernet headers built by the ethernet layer and then parsed them and > rebuilt them into 802.11 headers.) Many (most?) parts of this change seem to be only for client (managed and ad-hoc) modes. Has anyone had a chance to go through what would be needed for AP (master mode) and WDS links? What about extra bytes added for QoS information (IEEE 802.11e/WMM)? 
Are there places here that will not handle a variable-length header nicely? I haven't yet looked into details, but could someone explain what a user space program needs to do when receiving or sending packets with a packet socket from an 802.11 netdev (e.g., ethertype=EAPOL)? Let's say in the "worst case" scenario: QoS enabled and pairwise keys configured and a 4-address WDS link (i.e., 32-byte IEEE 802.11 header). Will the user space program need to parse (and generate) the IEEE 802.11 header, including the knowledge of four addresses and QoS data, and SNAP header? Packet socket with SOCK_DGRAM could otherwise be one way of doing this, but sockaddr_ll does not have places for many parameters. Many of these questions are not really specifically related to this patch, but I haven't seen a good answer to these open areas (well, at least to me) so far. -- Jouni Malinen PGP id EFC895FA From jgarzik@pobox.com Sat Jun 4 01:39:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 01:39:34 -0700 (PDT) Received: from mail.dvmed.net (mail.dvmed.net [216.237.124.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j548dQXq029891 for ; Sat, 4 Jun 2005 01:39:28 -0700 Received: from cpe-065-184-065-144.nc.res.rr.com ([65.184.65.144] helo=[10.10.10.88]) by mail.dvmed.net with esmtpsa (Exim 4.51 #1 (Red Hat Linux)) id 1DeUAk-0002Ly-RQ; Sat, 04 Jun 2005 08:38:27 +0000 Message-ID: <42A16880.4030802@pobox.com> Date: Sat, 04 Jun 2005 04:38:24 -0400 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050328 Fedora/1.7.6-1.2.5 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Linus Torvalds , Andrew Morton CC: Netdev , Linux Kernel Subject: [git patches] 2.6.x net driver fixes Content-Type: multipart/mixed; boundary="------------050200070305000207090100" X-archive-position: 2097 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com 
Precedence: bulk
X-list: netdev
Content-Length: 3914
Lines: 138

This is a multi-part message in MIME format.
--------------050200070305000207090100
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Please pull from the 'misc-fixes' branch of
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git

to obtain r8169 and 3c574_cs fixes. diffstat/shortlog/patch attached.

	Jeff

--------------050200070305000207090100
Content-Type: text/plain; name="netdev-2.6.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="netdev-2.6.txt"

 drivers/net/pcmcia/3c574_cs.c |    3 +++
 drivers/net/r8169.c           |   31 +++++++++++++++++++++++++------
 2 files changed, 28 insertions(+), 6 deletions(-)

:
  Automatic merge of /spare/repo/netdev-2.6 branch r8169-fix
  Automatic merge of rsync://rsync.kernel.org/.../torvalds/linux-2.6.git branch HEAD

Daniel Ritz :
  3c574_cs: disable interrupts in el3_close

Francois Romieu :
  [PATCH] r8169: incoming frame length check

diff --git a/drivers/net/pcmcia/3c574_cs.c b/drivers/net/pcmcia/3c574_cs.c
--- a/drivers/net/pcmcia/3c574_cs.c
+++ b/drivers/net/pcmcia/3c574_cs.c
@@ -1274,6 +1274,9 @@ static int el3_close(struct net_device *
 		spin_lock_irqsave(&lp->window_lock, flags);
 		update_stats(dev);
 		spin_unlock_irqrestore(&lp->window_lock, flags);
+
+		/* force interrupts off */
+		outw(SetIntrEnb | 0x0000, ioaddr + EL3_CMD);
 	}
 
 	link->open--;
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1585,8 +1585,8 @@ rtl8169_hw_start(struct net_device *dev)
 	RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
 	RTL_W8(EarlyTxThres, EarlyTxThld);
 
-	/* For gigabit rtl8169, MTU + header + CRC + VLAN */
-	RTL_W16(RxMaxSize, tp->rx_buf_sz);
+	/* Low hurts. Let's disable the filtering. */
+	RTL_W16(RxMaxSize, 16383);
 
 	/* Set Rx Config register */
 	i = rtl8169_rx_config |
@@ -2127,6 +2127,11 @@ rtl8169_tx_interrupt(struct net_device *
 	}
 }
 
+static inline int rtl8169_fragmented_frame(u32 status)
+{
+	return (status & (FirstFrag | LastFrag)) != (FirstFrag | LastFrag);
+}
+
 static inline void rtl8169_rx_csum(struct sk_buff *skb, struct RxDesc *desc)
 {
 	u32 opts1 = le32_to_cpu(desc->opts1);
@@ -2177,27 +2182,41 @@ rtl8169_rx_interrupt(struct net_device *
 
 	while (rx_left > 0) {
 		unsigned int entry = cur_rx % NUM_RX_DESC;
+		struct RxDesc *desc = tp->RxDescArray + entry;
 		u32 status;
 
 		rmb();
-		status = le32_to_cpu(tp->RxDescArray[entry].opts1);
+		status = le32_to_cpu(desc->opts1);
 
 		if (status & DescOwn)
 			break;
 		if (status & RxRES) {
-			printk(KERN_INFO "%s: Rx ERROR!!!\n", dev->name);
+			printk(KERN_INFO "%s: Rx ERROR. status = %08x\n",
+			       dev->name, status);
 			tp->stats.rx_errors++;
 			if (status & (RxRWT | RxRUNT))
 				tp->stats.rx_length_errors++;
 			if (status & RxCRC)
 				tp->stats.rx_crc_errors++;
+			rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
 		} else {
-			struct RxDesc *desc = tp->RxDescArray + entry;
 			struct sk_buff *skb = tp->Rx_skbuff[entry];
 			int pkt_size = (status & 0x00001FFF) - 4;
 			void (*pci_action)(struct pci_dev *, dma_addr_t,
 				size_t, int) = pci_dma_sync_single_for_device;
 
+			/*
+			 * The driver does not support incoming fragmented
+			 * frames. They are seen as a symptom of over-mtu
+			 * sized frames.
+			 */
+			if (unlikely(rtl8169_fragmented_frame(status))) {
+				tp->stats.rx_dropped++;
+				tp->stats.rx_length_errors++;
+				rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
+				goto move_on;
+			}
+
 			rtl8169_rx_csum(skb, desc);
 
 			pci_dma_sync_single_for_cpu(tp->pci_dev,
@@ -2224,7 +2243,7 @@ rtl8169_rx_interrupt(struct net_device *
 			tp->stats.rx_bytes += pkt_size;
 			tp->stats.rx_packets++;
 		}
-
+move_on:
 		cur_rx++;
 		rx_left--;
 	}

--------------050200070305000207090100--

From johnpol@2ka.mipt.ru Sat Jun 4 02:56:55 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 02:56:59 -0700 (PDT)
Received: from 2ka.mipt.ru (relay.2ka.mipt.ru [194.85.82.65])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j549urXq032618
	for ; Sat, 4 Jun 2005 02:56:54 -0700
Received: from zanzibar.2ka.mipt.ru (zanzibar.2ka.mipt.ru [194.85.82.77])
	by 2ka.mipt.ru (8.12.11/8.12.11) with ESMTP id j549u8I5018923;
	Sat, 4 Jun 2005 13:56:08 +0400
Date: Sat, 4 Jun 2005 13:55:35 +0400
From: Evgeniy Polyakov
To: Herbert Xu
Cc: "David S.
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> In-Reply-To: <20050603234623.GA20088@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> Reply-To: johnpol@2ka.mipt.ru Organization: MIPT X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; i386-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.7.5 (2ka.mipt.ru [194.85.82.65]); Sat, 04 Jun 2005 13:56:09 +0400 (MSD) X-archive-position: 2098 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev Content-Length: 1547 Lines: 40 On Sat, 4 Jun 2005 09:46:23 +1000 Herbert Xu wrote: > Hi: > > I was looking at how we can move the IPsec input/output processing out > of the critical section protected by the spin locks on the xfrm_state. > This is useful because it would allow concurrent processing of IPsec > packets for the same SA. It is also necessary if we're ever going to > add support for asynchronous crypto to IPsec. Asynchronous schemas already works without any changes to scaterlist processing code. And you can not easily move away of SA lock due to synchronous problems with the same tfm. Existing asynchronous schemas do not use any shared object in SA, only skb. > The first requirement for this is that we need to stop using data that > is shared across a single SA in the IPsec input/output routines. The > biggest hurdle there as it stands is sgbuf in esp_data. This was > introduced to reduce stack usage in esp_input/esp_output as sgbuf > would consume up to 64 bytes of space. No need to have it at all, I think. 
> Cheers, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt > - > To unsubscribe from this list: send the line "unsubscribe linux-crypto" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Evgeniy Polyakov Only failure makes us experts. -- Theo de Raadt From herbert@gondor.apana.org.au Sat Jun 4 03:00:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 03:00:31 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54A0OXq000783 for ; Sat, 4 Jun 2005 03:00:27 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeVQf-0002ay-00; Sat, 04 Jun 2005 19:58:57 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeVQc-0000Gp-00; Sat, 04 Jun 2005 19:58:54 +1000 Date: Sat, 4 Jun 2005 19:58:54 +1000 To: Evgeniy Polyakov Cc: "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604095854.GA1003@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2099 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 681 Lines: 16 On Sat, Jun 04, 2005 at 01:55:35PM +0400, Evgeniy Polyakov wrote: > > processing code. 
And you can not easily move away of SA lock due to > synchronous problems with the same tfm. This is not true. The tfm context contains no shared state apart from the IV. As the IV can be specified through the *_iv functions, this allows crypto API users to process the same cipher tfm on two CPUs in parallel. If you don't believe me just wait for my upcoming patches to IPsec. -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From johnpol@2ka.mipt.ru Sat Jun 4 03:01:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 03:01:36 -0700 (PDT) Received: from 2ka.mipt.ru (relay.2ka.mipt.ru [194.85.82.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54A1WXq001174 for ; Sat, 4 Jun 2005 03:01:33 -0700 Received: from zanzibar.2ka.mipt.ru (zanzibar.2ka.mipt.ru [194.85.82.77]) by 2ka.mipt.ru (8.12.11/8.12.11) with ESMTP id j54A0scj023143; Sat, 4 Jun 2005 14:00:54 +0400 Date: Sat, 4 Jun 2005 14:00:21 +0400 From: Evgeniy Polyakov To: Herbert Xu Cc: Jeff Garzik , "David S. 
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604140021.62259ad3@zanzibar.2ka.mipt.ru> In-Reply-To: <20050604004201.GB20471@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <42A0EFAC.7070609@pobox.com> <20050604004201.GB20471@gondor.apana.org.au> Reply-To: johnpol@2ka.mipt.ru Organization: MIPT X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; i386-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.7.5 (2ka.mipt.ru [194.85.82.65]); Sat, 04 Jun 2005 14:00:54 +0400 (MSD) X-archive-position: 2100 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev Content-Length: 2582 Lines: 61 On Sat, 4 Jun 2005 10:42:01 +1000 Herbert Xu wrote: > Hi Jeff: > > On Fri, Jun 03, 2005 at 08:02:52PM -0400, Jeff Garzik wrote: > > > > A standard feature of struct scatterlist is having the DMA mappings > > right next to the kernel virtual address/length info. Drivers use the > > arch-specific DMA-mapped part of struct scatterlist to fill the > > hardware-specific descriptions with addresses and other info. > > Agreed. > > > Since you -will- have to DMA map buffers before passing them to > > hardware, it seems like struct scatterlist is much more appropriate than > > crypto_frag when dealing with hardware. > > > > For pure software implementations, I don't see why you can't just ignore > > the extra fields that each arch puts into struct scatterlist. > > It depends on who is going to do the mapping. When we implement hardware > crypto, the DMA mapping will be done either by the crypto layer or under > it by the driver itself. So the crypto layer is certainly going to need > the scatterlist structure. 
>
> However, the users of the crypto layer (such as IPsec/dmcrypt) don't have
> to know about DMA at all. Therefore the data structure between the users
> and the crypto layer itself doesn't have to carry DMA information.
>
> Compare this with the block layer. Between the users of the block layer
> and the block layer itself you have the bio_vec structure which carries
> no DMA information. The scatterlist structure only comes into play after
> DMA mapping has been carried out under the block layer.
>
> So this is really a sort of bio_vec for crypto structures. The objective
> here is to make the structure as compact as possible to allow users to
> allocate it on the stack most of the time.

As far as I remember, IPsec uses scatterlists specifically to _not_ have
to remap from some inner structure to a scatterlist later.
The block layer was not designed that way because there is no easy
mapping from the block cache into a scatterlist, and bio_vec is used much
more widely than an SA, so removing the DMA address is suitable there.

> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~}
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> -
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

	Evgeniy Polyakov

Only failure makes us experts.
-- Theo de Raadt From johnpol@2ka.mipt.ru Sat Jun 4 03:18:45 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 03:18:53 -0700 (PDT) Received: from 2ka.mipt.ru (relay.2ka.mipt.ru [194.85.82.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54AIhXq002296 for ; Sat, 4 Jun 2005 03:18:44 -0700 Received: from zanzibar.2ka.mipt.ru (zanzibar.2ka.mipt.ru [194.85.82.77]) by 2ka.mipt.ru (8.12.11/8.12.11) with ESMTP id j54AI5Op007818; Sat, 4 Jun 2005 14:18:05 +0400 Date: Sat, 4 Jun 2005 14:17:31 +0400 From: Evgeniy Polyakov To: Herbert Xu Cc: "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604141731.37479347@zanzibar.2ka.mipt.ru> In-Reply-To: <20050604095854.GA1003@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> <20050604095854.GA1003@gondor.apana.org.au> Reply-To: johnpol@2ka.mipt.ru Organization: MIPT X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; i386-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.7.5 (2ka.mipt.ru [194.85.82.65]); Sat, 04 Jun 2005 14:18:05 +0400 (MSD) X-archive-position: 2101 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev Content-Length: 3462 Lines: 127 On Sat, 4 Jun 2005 19:58:54 +1000 Herbert Xu wrote: > On Sat, Jun 04, 2005 at 01:55:35PM +0400, Evgeniy Polyakov wrote: > > > > processing code. And you can not easily move away of SA lock due to > > synchronous problems with the same tfm. > > This is not true. The tfm context contains no shared state apart > from the IV. 
> As the IV can be specified through the *_iv functions,
> this allows crypto API users to process the same cipher tfm on two
> CPUs in parallel.
>
> If you don't believe me just wait for my upcoming patches to IPsec.

Sure, I believe you: in the tfm there are no shared objects except data.
But can we catch the situation where we are encrypting the same skb?
As far as I can see, skb_cow_data() must take care of it.
You are right, encrypting is safe.

Here is the part of esp_output() I use for acrypto.
Static scatterlists are not used; new ones are dynamically allocated.

@@ -95,7 +239,90 @@
 	esph->spi = x->id.spi;
 	esph->seq_no = htonl(++x->replay.oseq);
+
+#ifdef CONFIG_ACRYPTO
+	do {
+		struct crypto_session_initializer ci;
+		struct crypto_data data;
+		struct scatterlist *sg;
+		struct crypto_session *s;
+		u8 *key, *iv;
+
+		nfrags++; /* key */
+
+		if (esp->conf.ivlen)
+			nfrags++;
+		memset(&ci, 0, sizeof(ci));
+		memset(&data, 0, sizeof(data));
+
+		ci.operation = CRYPTO_OP_ENCRYPT;
+		ci.mode = crypto_tfm_get_mode(tfm);
+		ci.type = crypto_tfm_get_type(tfm);
+		ci.priority = 0;
+		ci.callback = &esp4_async_callback;
+
+		if (ci.mode == 0xffff || ci.type == 0xffff)
+			goto sync_crypto;
+
+		sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC);
+		if (!sg)
+			goto error;
+		skb_to_sgvec(skb, sg, esph->enc_data+esp->conf.ivlen-skb->data, clen);
+		data.sg_src = data.sg_dst = sg;
+
+		key = kmalloc(crypto_tfm_alg_ivsize(tfm) + esp->conf.key_len, GFP_ATOMIC);
+		if (!key)
+			goto err_out_free_sg;
+
+		iv = key + esp->conf.key_len;
+
+		if (esp->conf.ivlen) {
+			data.sg_key = &sg[nfrags - 2];
+			data.sg_iv = &sg[nfrags - 1];
+			data.sg_key_num = data.sg_iv_num = 1;
+		} else {
+			data.sg_key = &sg[nfrags - 1];
+			data.sg_iv = NULL;
+			data.sg_key_num = 1;
+			data.sg_iv_num = 0;
+		}
+
+		data.sg_src_num = data.sg_dst_num = nfrags - data.sg_key_num - data.sg_iv_num;
+
+		memcpy(key, esp->conf.key, esp->conf.key_len);
+		data.sg_key[0].offset = offset_in_page(key);
+		data.sg_key[0].length = esp->conf.key_len;
+		data.sg_key[0].page = virt_to_page(key);
+
+		if (esp->conf.ivlen) {
+			memcpy(iv, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm));
+			data.sg_iv[0].offset = offset_in_page(iv);
+			data.sg_iv[0].length = crypto_tfm_alg_ivsize(tfm);
+			data.sg_iv[0].page = virt_to_page(iv);
+		}
+
+		data.priv = esp_output_async_prepare(x, skb);
+		if (!data.priv)
+			goto err_out_free_key;
+
+		s = crypto_session_alloc(&ci, &data);
+		if (!s)
+			goto err_out_free_ea;
+
+		return 0;
+
+err_out_free_ea:
+		kfree(data.priv);
+err_out_free_key:
+		kfree(key);
+err_out_free_sg:
+		kfree(sg);
+		goto sync_crypto;
+	} while (0);
+
+sync_crypto:
+#endif
 	if (esp->conf.ivlen)
 		crypto_cipher_set_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm));

> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~}
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

	Evgeniy Polyakov

Only failure makes us experts. -- Theo de Raadt

From herbert@gondor.apana.org.au Sat Jun 4 03:23:28 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 03:23:31 -0700 (PDT)
Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54ANRXq002965
	for ; Sat, 4 Jun 2005 03:23:28 -0700
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail)
	by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian))
	id 1DeVn4-0002jP-00; Sat, 04 Jun 2005 20:22:06 +1000
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian))
	id 1DeVn2-0000Ki-00; Sat, 04 Jun 2005 20:22:04 +1000
Date: Sat, 4 Jun 2005 20:22:04 +1000
To: Evgeniy Polyakov
Cc: "David S.
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604102204.GA1214@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> <20050604095854.GA1003@gondor.apana.org.au> <20050604141731.37479347@zanzibar.2ka.mipt.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604141731.37479347@zanzibar.2ka.mipt.ru> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2102 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 713 Lines: 19 On Sat, Jun 04, 2005 at 02:17:31PM +0400, Evgeniy Polyakov wrote: > > Static scaterlists are not used and new are dinamically allocated. That's precisely why we're having this discussion. We can now encrypt/decrypt a 1500 byte packet in 2us so the last thing we want is to impose additional latencies on the common case unless it's absolutely required. If we can shrink the structure used between IPsec and the crypto layer then we can allocate the sgbuf off the stack for 99% of the users. 
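
The "stack for the common case" idea can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions: `struct crypto_frag` is the layout proposed in the RFC, while `FRAGS_ON_STACK`, `frags_need_heap()`, `with_frags()` and `count_frags()` are hypothetical names invented for this sketch, and `calloc()` stands in for whatever allocator the kernel code would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct page;				/* opaque stand-in for the kernel type */

/* Fragment descriptor as proposed in the RFC. */
struct crypto_frag {
	struct page *page;
	uint16_t offset;
	uint16_t length;
};

#define FRAGS_ON_STACK 4		/* assumed common-case bound */

/* Hypothetical consumer of a fragment array. */
typedef int (*frag_fn)(struct crypto_frag *frags, unsigned int nfrags);

/* True when the request is too large for the on-stack array. */
bool frags_need_heap(unsigned int nfrags)
{
	return nfrags > FRAGS_ON_STACK;
}

/* Run fn on a fragment array, touching the allocator only in the rare case. */
int with_frags(unsigned int nfrags, frag_fn fn)
{
	struct crypto_frag stack_frags[FRAGS_ON_STACK];
	struct crypto_frag *frags = stack_frags;
	int ret;

	assert(fn != NULL);

	if (frags_need_heap(nfrags)) {
		frags = calloc(nfrags, sizeof(*frags));
		if (!frags)
			return -1;
	} else {
		memset(stack_frags, 0, sizeof(stack_frags));
	}

	ret = fn(frags, nfrags);

	if (frags != stack_frags)
		free(frags);
	return ret;
}

/* Trivial example consumer: reports how many fragments it was given. */
int count_frags(struct crypto_frag *frags, unsigned int nfrags)
{
	(void)frags;
	return (int)nfrags;
}
```

With this shape, a call such as `with_frags(2, count_frags)` never allocates, and only requests with more than four fragments pay for a heap allocation.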
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From johnpol@2ka.mipt.ru Sat Jun 4 03:30:53 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 03:30:59 -0700 (PDT) Received: from 2ka.mipt.ru (relay.2ka.mipt.ru [194.85.82.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54AUqXq003664 for ; Sat, 4 Jun 2005 03:30:52 -0700 Received: from zanzibar.2ka.mipt.ru (zanzibar.2ka.mipt.ru [194.85.82.77]) by 2ka.mipt.ru (8.12.11/8.12.11) with ESMTP id j54AUE9g019299; Sat, 4 Jun 2005 14:30:14 +0400 Date: Sat, 4 Jun 2005 14:29:39 +0400 From: Evgeniy Polyakov To: Herbert Xu Cc: "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604142939.4e2efc55@zanzibar.2ka.mipt.ru> In-Reply-To: <20050604102204.GA1214@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> <20050604095854.GA1003@gondor.apana.org.au> <20050604141731.37479347@zanzibar.2ka.mipt.ru> <20050604102204.GA1214@gondor.apana.org.au> Reply-To: johnpol@2ka.mipt.ru Organization: MIPT X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; i386-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.7.5 (2ka.mipt.ru [194.85.82.65]); Sat, 04 Jun 2005 14:30:14 +0400 (MSD) X-archive-position: 2103 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev Content-Length: 1157 Lines: 32 On Sat, 4 Jun 2005 20:22:04 +1000 Herbert Xu wrote: > On Sat, Jun 04, 2005 at 02:17:31PM +0400, Evgeniy Polyakov wrote: > > > > Static scaterlists are not used and new are dinamically 
> > allocated.
>
> That's precisely why we're having this discussion. We can now
> encrypt/decrypt a 1500 byte packet in 2us so the last thing we
> want is to impose additional latencies on the common case unless
> it's absolutely required.
>
> If we can shrink the structure used between IPsec and the crypto
> layer then we can allocate the sgbuf off the stack for 99% of
> the users.

I do see that 4 sg are enough for 99% of the users, I even think 2 is
enough - it will be 8 KB, almost the maximum seen 9 KB jumbo frame.
But without sg we still save 4*sizeof(dma addr) - is it really a price?
For hardware we will need to remap it later...

> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~}
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

	Evgeniy Polyakov

Only failure makes us experts. -- Theo de Raadt

From herbert@gondor.apana.org.au Sat Jun 4 03:34:19 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 03:34:22 -0700 (PDT)
Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54AYHXq004328
	for ; Sat, 4 Jun 2005 03:34:18 -0700
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail)
	by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian))
	id 1DeVxU-0002ur-00; Sat, 04 Jun 2005 20:32:52 +1000
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian))
	id 1DeVxR-0000Mb-00; Sat, 04 Jun 2005 20:32:49 +1000
Date: Sat, 4 Jun 2005 20:32:49 +1000
To: Evgeniy Polyakov
Cc: "David S.
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604103249.GA1378@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> <20050604095854.GA1003@gondor.apana.org.au> <20050604141731.37479347@zanzibar.2ka.mipt.ru> <20050604102204.GA1214@gondor.apana.org.au> <20050604142939.4e2efc55@zanzibar.2ka.mipt.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604142939.4e2efc55@zanzibar.2ka.mipt.ru> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2104 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 738 Lines: 20 On Sat, Jun 04, 2005 at 02:29:39PM +0400, Evgeniy Polyakov wrote: > > But without sg we sill save 4*sizeof(dma addr) - is it really a price? We're also reducing the offset/length to 16 bits from 32 bits so we're shaving off half the size. > For hardware we will need to remap it later... Well we can't modify the supplied scatterlist structure in the crypto API anyway since we don't have exclusive ownership of it. Since we can't expect the user of the crypto API to do the mapping this space is basically wasted. 
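
To make the size argument concrete, here is a minimal user-space sketch. `struct crypto_frag` follows the layout proposed in the RFC; `struct sg_like` is a hypothetical stand-in that merely resembles a scatterlist (the real `struct scatterlist` is defined per architecture and may differ), and the two helper functions are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct page;				/* opaque stand-in for the kernel type */

/* Fragment descriptor as proposed in the RFC: 16-bit offset and length. */
struct crypto_frag {
	struct page *page;
	uint16_t offset;
	uint16_t length;
};

/* Hypothetical scatterlist-like layout, NOT the real per-arch definition. */
struct sg_like {
	struct page *page;
	unsigned int offset;
	uintptr_t dma_address;		/* stand-in for dma_addr_t */
	unsigned int length;
};

/* The compact descriptor must be strictly smaller than the DMA-carrying one. */
static_assert(sizeof(struct crypto_frag) < sizeof(struct sg_like),
	      "crypto_frag should be smaller than the scatterlist-like layout");

/* Bytes needed for an array of n compact fragment descriptors. */
size_t frag_array_bytes(size_t n)
{
	return n * sizeof(struct crypto_frag);
}

/* Bytes needed for an array of n scatterlist-like descriptors. */
size_t sg_array_bytes(size_t n)
{
	return n * sizeof(struct sg_like);
}
```

Comparing `frag_array_bytes(4)` with `sg_array_bytes(4)` shows how much stack four descriptors would need in each representation on the platform the sketch is compiled for.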
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From johnpol@2ka.mipt.ru Sat Jun 4 03:42:17 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 03:42:22 -0700 (PDT) Received: from 2ka.mipt.ru (relay.2ka.mipt.ru [194.85.82.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54AgGXq005163 for ; Sat, 4 Jun 2005 03:42:16 -0700 Received: from zanzibar.2ka.mipt.ru (zanzibar.2ka.mipt.ru [194.85.82.77]) by 2ka.mipt.ru (8.12.11/8.12.11) with ESMTP id j54AfYAh031712; Sat, 4 Jun 2005 14:41:35 +0400 Date: Sat, 4 Jun 2005 14:40:59 +0400 From: Evgeniy Polyakov To: Herbert Xu Cc: "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604144059.2be84671@zanzibar.2ka.mipt.ru> In-Reply-To: <20050604103249.GA1378@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604135535.3cfb631f@zanzibar.2ka.mipt.ru> <20050604095854.GA1003@gondor.apana.org.au> <20050604141731.37479347@zanzibar.2ka.mipt.ru> <20050604102204.GA1214@gondor.apana.org.au> <20050604142939.4e2efc55@zanzibar.2ka.mipt.ru> <20050604103249.GA1378@gondor.apana.org.au> Reply-To: johnpol@2ka.mipt.ru Organization: MIPT X-Mailer: Sylpheed-Claws 1.0.4 (GTK+ 1.2.10; i386-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.7.5 (2ka.mipt.ru [194.85.82.65]); Sat, 04 Jun 2005 14:41:35 +0400 (MSD) X-archive-position: 2105 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev Content-Length: 1417 Lines: 39 On Sat, 4 Jun 2005 20:32:49 +1000 Herbert Xu wrote: > On Sat, Jun 04, 2005 at 02:29:39PM 
> +0400, Evgeniy Polyakov wrote:
> >
> > But without sg we still save 4*sizeof(dma addr) - is it really a price?
>
> We're also reducing the offset/length to 16 bits from 32 bits so we're
> shaving off half the size.
>
> > For hardware we will need to remap it later...
>
> Well we can't modify the supplied scatterlist structure in the
> crypto API anyway since we don't have exclusive ownership of it.
> Since we can't expect the user of the crypto API to do the mapping
> this space is basically wasted.

So why not remove it completely?
Synchronous hardware (like VIA/Freescale processors) does not use
scatterlists at all, so it is not needed there.
CryptoAPI does not use half of the scatterlist structure.
The CryptoAPI design cannot be used with "interruptible" hardware like HIFN,
so for asynchronous hardware we need some kind of remapping anyway.
So why not move to the new fragments Herbert introduced
everywhere in CryptoAPI?

But please do not remove skb_to_sgvec() :)

> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~}
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

	Evgeniy Polyakov

Only failure makes us experts. -- Theo de Raadt

From SRS0+ca62117fdd73a480a370+650+infradead.org+hch@pentafluge.srs.infradead.org Sat Jun 4 04:24:27 2005
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 04:24:38 -0700 (PDT)
Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54BOKXq011188
	for ; Sat, 4 Jun 2005 04:24:27 -0700
Received: from hch by pentafluge.infradead.org with local (Exim 4.43 #1 (Red Hat Linux))
	id 1DeWkE-0005BW-Ip; Sat, 04 Jun 2005 12:23:14 +0100
Date: Sat, 4 Jun 2005 12:23:14 +0100
From: Christoph Hellwig
To: Herbert Xu
Cc: "David S.
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604112314.GA19819@infradead.org> References: <20050603234623.GA20088@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050603234623.GA20088@gondor.apana.org.au> User-Agent: Mutt/1.4.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2106 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 273 Lines: 10 On Sat, Jun 04, 2005 at 09:46:23AM +1000, Herbert Xu wrote: > struct crypto_frag { > struct page *page; > u16 offset; > u16 length; > }; we have this structure as skb_frag_struct and bio_vec already, care to use the same structure with a generic name for all of them? From herbert@gondor.apana.org.au Sat Jun 4 04:27:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 04:27:42 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54BRYXq011662 for ; Sat, 4 Jun 2005 04:27:35 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DeWn4-0003EA-00; Sat, 04 Jun 2005 21:26:10 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DeWn0-0000TP-00; Sat, 04 Jun 2005 21:26:06 +1000 Date: Sat, 4 Jun 2005 21:26:06 +1000 To: Christoph Hellwig Cc: "David S. 
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604112606.GA1799@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604112314.GA19819@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604112314.GA19819@infradead.org> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2107 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 936 Lines: 25 On Sat, Jun 04, 2005 at 12:23:14PM +0100, Christoph Hellwig wrote: > On Sat, Jun 04, 2005 at 09:46:23AM +1000, Herbert Xu wrote: > > struct crypto_frag { > > struct page *page; > > u16 offset; > > u16 length; > > }; > > we have this structure as skb_frag_struct and bio_vec already, care > to use the same structure with a generic name for all of them? I certainly would have no problems merging with skb_frag_struct. However, merging with bio_vec would mean that either bio_vec would have to drop down to 16-bit counters, or crypto_frag would have to move up to 32-bit counters. The latter is problematic because I'm trying to shrink the size enough so that we can squeeze four of these things onto the stack. 
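
The trade-off of 16-bit counters can be sketched as below: a fragment that covers a 4096-byte page fits, but a length of 65536 (one full 64 KiB page) does not. `frag_fits_u16()` and `frag_set()` are hypothetical helpers written for this illustration, not part of any proposed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct page;				/* opaque stand-in for the kernel type */

/* Fragment descriptor as proposed in the RFC. */
struct crypto_frag {
	struct page *page;
	uint16_t offset;
	uint16_t length;
};

/* True if both values can be stored in the 16-bit fields without truncation. */
bool frag_fits_u16(uint32_t offset, uint32_t length)
{
	return offset <= UINT16_MAX && length <= UINT16_MAX;
}

/* Fill in a descriptor, refusing values that would silently wrap. */
void frag_set(struct crypto_frag *frag, struct page *page,
	      uint32_t offset, uint32_t length)
{
	assert(frag_fits_u16(offset, length));
	frag->page = page;
	frag->offset = (uint16_t)offset;
	frag->length = (uint16_t)length;
}
```

A 32-bit variant would lift this limit at the cost of a larger descriptor on 32-bit platforms, which is the tension described above.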
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From SRS0+ca62117fdd73a480a370+650+infradead.org+hch@pentafluge.srs.infradead.org Sat Jun 4 04:59:57 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 05:00:00 -0700 (PDT) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54BxuXq013796 for ; Sat, 4 Jun 2005 04:59:57 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.43 #1 (Red Hat Linux)) id 1DeXIj-0005IC-Ht; Sat, 04 Jun 2005 12:58:53 +0100 Date: Sat, 4 Jun 2005 12:58:53 +0100 From: Christoph Hellwig To: Herbert Xu Cc: Christoph Hellwig , "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050604115853.GA20335@infradead.org> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604112314.GA19819@infradead.org> <20050604112606.GA1799@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604112606.GA1799@gondor.apana.org.au> User-Agent: Mutt/1.4.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2108 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 870 Lines: 21 On Sat, Jun 04, 2005 at 09:26:06PM +1000, Herbert Xu wrote: > On Sat, Jun 04, 2005 at 12:23:14PM +0100, Christoph Hellwig wrote: > > On Sat, Jun 04, 2005 at 09:46:23AM +1000, Herbert Xu wrote: > > > struct crypto_frag { > > > struct page *page; > > > u16 offset; > > > u16 length; > > > }; > > > > we have this structure as skb_frag_struct 
and bio_vec already, care > > to use the same structure with a generic name for all of them? > > I certainly would have no problems merging with skb_frag_struct. > However, merging with bio_vec would mean that either bio_vec would > have to drop down to 16-bit counters, or crypto_frag would have to > move up to 32-bit counters. The use of 16-bit counters in bio_vec doesn't make sense, and if it did, all the others would have to move to 32-bit as well (in case we started supporting page sizes that aren't addressable with 16 bits) From kaber@trash.net Sat Jun 4 09:41:41 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 09:41:44 -0700 (PDT) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j54GfeXq028725 for ; Sat, 4 Jun 2005 09:41:40 -0700 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.50) id 1DebhT-00058G-BD; Sat, 04 Jun 2005 18:40:43 +0200 Message-ID: <42A1D98B.7030400@trash.net> Date: Sat, 04 Jun 2005 18:40:43 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.7) Gecko/20050420 Debian/1.7.7-2 X-Accept-Language: en MIME-Version: 1.0 To: mailinglist.chris@gmail.com CC: Andrew Morton , netdev@oss.sgi.com Subject: Re: Fw: kernel 2.6 libipq kernel hang References: <20050406155828.1584d7cd.akpm@osdl.org> In-Reply-To: <20050406155828.1584d7cd.akpm@osdl.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2109 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Content-Length: 688 Lines: 21 Andrew Morton wrote: > > Begin forwarded message: > > Date: Wed, 6 Apr 2005 15:12:05 -0400 > From: Mailing List > To: linux-kernel@vger.kernel.org > Subject: kernel 2.6 libipq kernel hang > > /sbin/iptables -t mangle -A POSTROUTING -d 192.168.3.0/24 -j QUEUE > /sbin/iptables -t
mangle -A PREROUTING -s 192.168.3.0/24 -j QUEUE > > If anyone has any suggestions about what I am doing wrong in either > the libipq program or the client or server programs, or any ideas > about what is going on with netlink, please let me know. Please try latest -git, Harald fixed a bug that could cause a deadlock when ip_queue was used in PRE_ROUTING. Regards Patrick From akpm@osdl.org Sat Jun 4 19:52:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Jun 2005 19:52:47 -0700 (PDT) Received: from smtp.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j552qdXq025446 for ; Sat, 4 Jun 2005 19:52:39 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id j552pPjA030229 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Sat, 4 Jun 2005 19:51:25 -0700 Received: from bix (shell0.pdx.osdl.net [10.9.0.31]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with SMTP id j552pOHd016709; Sat, 4 Jun 2005 19:51:24 -0700 Date: Sat, 4 Jun 2005 19:51:22 -0700 From: Andrew Morton To: netdev@oss.sgi.com Cc: Rommer Subject: Fw: PROBLEM: tcp_output.c bug Message-Id: <20050604195122.6a07abc7.akpm@osdl.org> X-Mailer: Sylpheed version 1.0.4 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="Multipart=_Sat__4_Jun_2005_19_51_22_-0700_Kp/TSOvd/GHsKqPd" X-MIMEDefang-Filter: osdl$Revision: 1.109 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 2110 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Content-Length: 59528 Lines: 2610 This is a multi-part message in MIME format. 
--Multipart=_Sat__4_Jun_2005_19_51_22_-0700_Kp/TSOvd/GHsKqPd Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Begin forwarded message: Date: Sun, 05 Jun 2005 04:25:43 +0300 From: Rommer To: linux-kernel@vger.kernel.org Subject: PROBLEM: tcp_output.c bug [1.] My server reboots about once every 2 weeks because of a kernel bug in tcp_output.c [2.] The server reboots because /proc/sys/kernel/panic is set to 1, but I identified the problem using the netconsole module. It is a "kernel BUG at net/ipv4/tcp_output.c:919!" I looked at the code on line 919 of tcp_output.c and found a BUG_ON macro in the function tcp_retrans_try_collapse(...). I disabled calls to this function by running: echo 0 >/proc/sys/net/ipv4/tcp_retrans_collapse, and the server has now worked fine for about 4 weeks. I also looked at the code of this function in tcp_output.c from the kernel 2.6.11.8 sources and it is the same. [3.] sh scripts/ver_linux Linux us401.activeby.net 2.6.9 #4 SMP Fri Apr 22 16:46:30 EEST 2005 i686 i686 i386 GNU/Linux Gnu C 3.3.2 Gnu make 3.79.1 binutils 2.14.90.0.6 util-linux 2.12 mount 2.12 module-init-tools 2.4.26 e2fsprogs 1.35 jfsutils 1.1.3 reiserfsprogs 2003-------------> reiser4progs line pcmcia-cs 3.1.31 quota-tools 3.06. PPP 2.4.1 isdn4k-utils 3.3 nfs-utils 1.0.6 Linux C Library 2.3.3 Dynamic linker (ldd) 2.3.3 Procps 3.2.0 Net-tools 1.60 Kbd 1.08 Sh-utils 5.2.1 Modules Loaded netconsole ipv6 ipt_TOS iptable_mangle ip_conntrack_ftp ip_conntrack_irc ipt_LOG ipt_limit ipt_multiport autofs ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables e100 mii ohci1394 ieee1394 sg scsi_mod parport_pc parport microcode loop thermal processor fan button battery ac ext3 jbd raid1 [4.] part of the netconsole log ------------[ cut here ]------------ kernel BUG at net/ipv4/tcp_output.c:919!
invalid operand: 0000 [#1] SMP Modules linked in: netconsole ipv6 ipt_TOS iptable_mangle ip_conntrack_ftp ip_conntrack_irc ipt_LOG ipt_limit ipt_multiport autofs ipt_REJECT ipt_state iptable_filter ip_conntrack ip_tables e100 mii ohci1394 ieee1394 sg scsi_mod parport_pc microcode parport thermal fan loop processor button battery ext3 tcp_v4_rcv+0x71 c/0x980 nf_hook_slow+0xc9/0x100 [] ip_rcv_finish+0x0/0x2a0 [] ip_rcv+0x41c/0x4e0 [] ip_rcv_finish+0x0/0x2a0 [] [] do_gettimeofday+0x20/0xc0 netif_receive_skb+0x1df/0x2d0 e100_poll+0x5ac/0x620 [e100] [] [] net_rx_action+0x81/0x110 [] __do_softirq+0xba/0xd0 [] do_softirq+0x2d/0x30 [] do_IRQ+0x105/0x130 [] unknown_bootoption+0x0/0x180 [] common_interrupt+0x18/0x20 [] default_idle+0x0/0x40 [] unknown_bootoption+0x0/0x180 [] default_idle+0x2c/0x40 [] cpu_idle+0x3b/0x50 [] [] start_kernel+0x19d/0x1e0 Code: fe unknown_bootoption+0x0/0x180e9 7f ff ff c7 44 24 08 e1 72 28 c0 54 89 24 04 24 e8 89 1c b3 7e fc ff fe 3a 0f e9 ff 0b ff c9 02 d7 c0 ca 2d 0a e9 fe ff ff 97 03 c0 8b 83 This log may be damaged because of UDP [6.] I don't know what causes the kernel panic [7.] [7.2.]
cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Xeon(TM) CPU 2.80GHz stepping : 5 cpu MHz : 2807.502 cache size : 512 KB physical id : 0 siblings : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr bogomips : 5537.79 processor : 1 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Xeon(TM) CPU 2.80GHz stepping : 5 cpu MHz : 2807.502 cache size : 512 KB physical id : 0 siblings : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr bogomips : 5603.32 processor : 2 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Xeon(TM) CPU 2.80GHz stepping : 5 cpu MHz : 2807.502 cache size : 512 KB physical id : 3 siblings : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr bogomips : 5603.32 processor : 3 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Xeon(TM) CPU 2.80GHz stepping : 5 cpu MHz : 2807.502 cache size : 512 KB physical id : 3 siblings : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr bogomips : 5603.32 [7.3.] 
cat /proc/modules netconsole 3040 - - Live 0xf8cce000 ipv6 258208 - - Live 0xf8e05000 ipt_TOS 2216 - - Live 0xf8d0c000 iptable_mangle 2536 - - Live 0xf8ca4000 ip_conntrack_ftp 72628 - - Live 0xf8d47000 ip_conntrack_irc 71636 - - Live 0xf8d34000 ipt_LOG 6856 - - Live 0xf8ce7000 ipt_limit 2248 - - Live 0xf8cd0000 ipt_multiport 1736 - - Live 0xf8ca6000 autofs 17096 - - Live 0xf8d0e000 ipt_REJECT 6792 - - Live 0xf8ce4000 ipt_state 1640 - - Live 0xf8cde000 ip_conntrack 47300 - - Live 0xf8d16000 iptable_filter 2632 - - Live 0xf8821000 ip_tables 17120 - - Live 0xf8c94000 e100 34664 - - Live 0xf8cd4000 mii 4744 - - Live 0xf8c91000 ohci1394 35564 - - Live 0xf8c9a000 ieee1394 114680 - - Live 0xf8cea000 sg 38408 - - Live 0xf8c81000 scsi_mod 124780 - - Live 0xf8ca8000 parport_pc 26208 - - Live 0xf8c79000 parport 41544 - - Live 0xf885f000 microcode 7200 - - Live 0xf884e000 loop 15696 - - Live 0xf882b000 thermal 13008 - - Live 0xf8830000 processor 17824 - - Live 0xf8848000 fan 3692 - - Live 0xf8829000 button 6328 - - Live 0xf8802000 battery 9260 - - Live 0xf8825000 ac 4524 - - Live 0xf8805000 ext3 126024 - - Live 0xf886d000 jbd 65760 - - Live 0xf8836000 raid1 16936 - - Live 0xf881b000 [7.4.] 
cat /proc/ioports 0000-001f : dma1 0020-0021 : pic1 0040-0043 : timer0 0050-0053 : timer1 0060-006f : keyboard 0070-0077 : rtc 0080-008f : dma page reg 00a0-00a1 : pic2 00c0-00df : dma2 00f0-00ff : fpu 0170-0177 : ide1 01f0-01f7 : ide0 0376-0376 : ide1 03c0-03df : vga+ 03f6-03f6 : ide0 0400-047f : 0000:00:1f.0 0400-0403 : PM1a_EVT_BLK 0404-0405 : PM1a_CNT_BLK 0408-040b : PM_TMR 0428-042f : GPE0_BLK 0480-04bf : 0000:00:1f.0 0500-051f : 0000:00:1f.3 0cf8-0cff : PCI conf1 a000-afff : PCI Bus #01 a000-a07f : 0000:01:00.0 b000-b03f : 0000:02:0b.0 b000-b03f : e100 f000-f00f : 0000:00:1f.1 f000-f007 : ide0 f008-f00f : ide1 cat /proc/iomem 00000000-0009f7ff : System RAM 0009f800-0009ffff : reserved 000a0000-000bffff : Video RAM area 000c0000-000cffff : Video ROM 000d0000-000d17ff : Adapter ROM 000f0000-000fffff : System ROM 00100000-7fedffff : System RAM 00100000-002c15ac : Kernel code 002c15ad-0038721f : Kernel data 7fee0000-7fee2fff : ACPI Non-volatile Storage 7fee3000-7feeffff : ACPI Tables 7fef0000-7fefffff : reserved 7ff00000-7ff003ff : 0000:00:1f.1 e0000000-efffffff : PCI Bus #01 e0000000-efffffff : 0000:01:00.0 f0000000-f1ffffff : PCI Bus #01 f1000000-f103ffff : 0000:01:00.0 f3000000-f301ffff : 0000:02:0b.0 f3000000-f301ffff : e100 f3020000-f3020fff : 0000:02:0b.0 f3020000-f3020fff : e100 f4000000-f43fffff : 0000:00:00.0 fec00000-ffffffff : reserved [7.5.] /sbin/lspci -vvv 00:00.0 Host bridge: Intel Corp. 82875P Memory Controller Hub (rev 02) Subsystem: Asustek Computer, Inc.: Unknown device 80f6 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 00:01.0 PCI bridge: Intel Corp. 82875P Processor to AGP Controller (rev 02) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B- 00:1e.0 PCI bridge: Intel Corp. 
82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev c2) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B- 00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Bridge (rev 02) Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- Region 1: I/O ports at Region 2: I/O ports at Region 3: I/O ports at Region 4: I/O ports at f000 [size=16] Region 5: Memory at 7ff00000 (32-bit, non-prefetchable) [size=1K] 00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02) Subsystem: Asustek Computer, Inc. P4P800 Mainboard Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- [disabled] [size=64K] Capabilities: [40] Power Management version 2 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] AGP version 2.0 Status: RQ=15 SBA- 64bit- FW- Rate=x1,x2,x4 Command: RQ=0 SBA- AGP- 64bit- FW- Rate= 02:0b.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c) Subsystem: Intel Corp. EtherExpress PRO/100 S Desktop Adapter Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- [disabled] [size=64K] Capabilities: [dc] Power Management version 2 Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 PME-Enable+ DSel=0 DScale=2 PME- [7.6.] 
cat /proc/scsi/scsi Attached devices: Kernel config attached -- Best regards, Roman --Multipart=_Sat__4_Jun_2005_19_51_22_-0700_Kp/TSOvd/GHsKqPd Content-Type: text/plain; name="config" Content-Disposition: attachment; filename="config" Content-Transfer-Encoding: 7bit # # Automatically generated make config: don't edit # Linux kernel version: 2.6.9 # Mon Nov 22 12:11:25 2004 # CONFIG_X86=y CONFIG_MMU=y CONFIG_UID16=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y # # Code maturity level options # CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y # # General setup # CONFIG_LOCALVERSION="" CONFIG_SWAP=y CONFIG_SYSVIPC=y # CONFIG_POSIX_MQUEUE is not set CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set CONFIG_SYSCTL=y # CONFIG_AUDIT is not set CONFIG_LOG_BUF_SHIFT=15 CONFIG_HOTPLUG=y # CONFIG_IKCONFIG is not set # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SHMEM=y # CONFIG_TINY_SHMEM is not set # # Loadable module support # CONFIG_MODULES=y # CONFIG_MODULE_UNLOAD is not set CONFIG_OBSOLETE_MODPARM=y CONFIG_MODVERSIONS=y CONFIG_KMOD=y # # Processor type and features # CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set CONFIG_M686=y # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set # CONFIG_MPENTIUM4 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MWINCHIPC6 is not set 
# CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_L1_CACHE_SHIFT=5 CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_X86_PPRO_FENCE=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_USE_PPRO_CHECKSUM=y # CONFIG_HPET_TIMER is not set CONFIG_SMP=y CONFIG_NR_CPUS=8 # CONFIG_SCHED_SMT is not set # CONFIG_PREEMPT is not set CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_TSC=y CONFIG_X86_MCE=y # CONFIG_X86_MCE_NONFATAL is not set # CONFIG_X86_MCE_P4THERMAL is not set CONFIG_TOSHIBA=m CONFIG_I8K=m CONFIG_MICROCODE=m CONFIG_X86_MSR=m CONFIG_X86_CPUID=m # # Firmware Drivers # CONFIG_EDD=m # CONFIG_NOHIGHMEM is not set # CONFIG_HIGHMEM4G is not set CONFIG_HIGHMEM64G=y CONFIG_HIGHMEM=y CONFIG_X86_PAE=y # CONFIG_HIGHPTE is not set # CONFIG_MATH_EMULATION is not set CONFIG_MTRR=y # CONFIG_EFI is not set CONFIG_IRQBALANCE=y CONFIG_HAVE_DEC_LOCK=y # CONFIG_REGPARM is not set # # Power management options (ACPI, APM) # CONFIG_PM=y # CONFIG_PM_DEBUG is not set # CONFIG_SOFTWARE_SUSPEND is not set # # ACPI (Advanced Configuration and Power Interface) Support # CONFIG_ACPI=y CONFIG_ACPI_BOOT=y CONFIG_ACPI_INTERPRETER=y CONFIG_ACPI_SLEEP=y CONFIG_ACPI_SLEEP_PROC_FS=y CONFIG_ACPI_AC=m CONFIG_ACPI_BATTERY=m CONFIG_ACPI_BUTTON=m CONFIG_ACPI_FAN=m CONFIG_ACPI_PROCESSOR=m CONFIG_ACPI_THERMAL=m CONFIG_ACPI_ASUS=m CONFIG_ACPI_TOSHIBA=m CONFIG_ACPI_BLACKLIST_YEAR=0 # CONFIG_ACPI_DEBUG is not set CONFIG_ACPI_BUS=y CONFIG_ACPI_EC=y CONFIG_ACPI_POWER=y CONFIG_ACPI_PCI=y CONFIG_ACPI_SYSTEM=y # CONFIG_X86_PM_TIMER is not set # # APM (Advanced Power Management) BIOS Support # CONFIG_APM=y # CONFIG_APM_IGNORE_USER_SUSPEND is not set # CONFIG_APM_DO_ENABLE is not set CONFIG_APM_CPU_IDLE=y # CONFIG_APM_DISPLAY_BLANK is not set CONFIG_APM_RTC_IS_GMT=y # CONFIG_APM_ALLOW_INTS is not 
set # CONFIG_APM_REAL_MODE_POWER_OFF is not set # # CPU Frequency scaling # CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_PROC_INTF=y CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y # CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set CONFIG_CPU_FREQ_GOV_PERFORMANCE=y # CONFIG_CPU_FREQ_GOV_POWERSAVE is not set CONFIG_CPU_FREQ_GOV_USERSPACE=y # CONFIG_CPU_FREQ_GOV_ONDEMAND is not set CONFIG_CPU_FREQ_24_API=y CONFIG_CPU_FREQ_TABLE=y # # CPUFreq processor drivers # # CONFIG_X86_ACPI_CPUFREQ is not set CONFIG_X86_POWERNOW_K6=m CONFIG_X86_POWERNOW_K7=m CONFIG_X86_POWERNOW_K7_ACPI=y # CONFIG_X86_POWERNOW_K8 is not set # CONFIG_X86_GX_SUSPMOD is not set CONFIG_X86_SPEEDSTEP_CENTRINO=m CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y # CONFIG_X86_SPEEDSTEP_CENTRINO_ACPI is not set CONFIG_X86_SPEEDSTEP_ICH=m # CONFIG_X86_SPEEDSTEP_SMI is not set CONFIG_X86_P4_CLOCKMOD=m CONFIG_X86_SPEEDSTEP_LIB=m # CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK is not set CONFIG_X86_LONGRUN=m CONFIG_X86_LONGHAUL=m # # Bus options (PCI, PCMCIA, EISA, MCA, ISA) # CONFIG_PCI=y # CONFIG_PCI_GOBIOS is not set # CONFIG_PCI_GOMMCONFIG is not set # CONFIG_PCI_GODIRECT is not set CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_MMCONFIG=y # CONFIG_PCI_MSI is not set # CONFIG_PCI_LEGACY_PROC is not set CONFIG_PCI_NAMES=y CONFIG_ISA=y CONFIG_EISA=y # CONFIG_EISA_VLB_PRIMING is not set CONFIG_EISA_PCI_EISA=y CONFIG_EISA_VIRTUAL_ROOT=y CONFIG_EISA_NAMES=y # CONFIG_MCA is not set # CONFIG_SCx200 is not set # # PCMCIA/CardBus support # CONFIG_PCMCIA=m # CONFIG_PCMCIA_DEBUG is not set # CONFIG_YENTA is not set # CONFIG_PD6729 is not set CONFIG_I82092=m CONFIG_I82365=m CONFIG_TCIC=m CONFIG_PCMCIA_PROBE=y # # PCI Hotplug Support # CONFIG_HOTPLUG_PCI=y # CONFIG_HOTPLUG_PCI_FAKE is not set CONFIG_HOTPLUG_PCI_COMPAQ=m # CONFIG_HOTPLUG_PCI_COMPAQ_NVRAM is not set CONFIG_HOTPLUG_PCI_IBM=m CONFIG_HOTPLUG_PCI_ACPI=m # CONFIG_HOTPLUG_PCI_ACPI_IBM is not set # CONFIG_HOTPLUG_PCI_CPCI is not set # CONFIG_HOTPLUG_PCI_PCIE is not set # 
CONFIG_HOTPLUG_PCI_SHPC is not set # # Executable file formats # CONFIG_BINFMT_ELF=y CONFIG_BINFMT_AOUT=m CONFIG_BINFMT_MISC=m # # Device Drivers # # # Generic Driver Options # CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=m # CONFIG_DEBUG_DRIVER is not set # # Memory Technology Devices (MTD) # CONFIG_MTD=m # CONFIG_MTD_DEBUG is not set # CONFIG_MTD_PARTITIONS is not set CONFIG_MTD_CONCAT=m # # User Modules And Translation Layers # CONFIG_MTD_CHAR=m CONFIG_MTD_BLOCK=m CONFIG_MTD_BLOCK_RO=m CONFIG_FTL=m CONFIG_NFTL=m CONFIG_NFTL_RW=y CONFIG_INFTL=m # # RAM/ROM/Flash chip drivers # CONFIG_MTD_CFI=m CONFIG_MTD_JEDECPROBE=m CONFIG_MTD_GEN_PROBE=m # CONFIG_MTD_CFI_ADV_OPTIONS is not set CONFIG_MTD_MAP_BANK_WIDTH_1=y CONFIG_MTD_MAP_BANK_WIDTH_2=y CONFIG_MTD_MAP_BANK_WIDTH_4=y # CONFIG_MTD_MAP_BANK_WIDTH_8 is not set # CONFIG_MTD_MAP_BANK_WIDTH_16 is not set # CONFIG_MTD_MAP_BANK_WIDTH_32 is not set CONFIG_MTD_CFI_I1=y CONFIG_MTD_CFI_I2=y # CONFIG_MTD_CFI_I4 is not set # CONFIG_MTD_CFI_I8 is not set CONFIG_MTD_CFI_INTELEXT=m CONFIG_MTD_CFI_AMDSTD=m CONFIG_MTD_CFI_AMDSTD_RETRY=0 CONFIG_MTD_CFI_STAA=m CONFIG_MTD_CFI_UTIL=m CONFIG_MTD_RAM=m CONFIG_MTD_ROM=m CONFIG_MTD_ABSENT=m # # Mapping drivers for chip access # CONFIG_MTD_COMPLEX_MAPPINGS=y # CONFIG_MTD_PHYSMAP is not set CONFIG_MTD_SC520CDP=m CONFIG_MTD_SCx200_DOCFLASH=m CONFIG_MTD_AMD76XROM=m # CONFIG_MTD_ICHXROM is not set CONFIG_MTD_SCB2_FLASH=m CONFIG_MTD_L440GX=m CONFIG_MTD_PCI=m # # Self-contained MTD device drivers # CONFIG_MTD_PMC551=m # CONFIG_MTD_PMC551_BUGFIX is not set # CONFIG_MTD_PMC551_DEBUG is not set # CONFIG_MTD_SLRAM is not set # CONFIG_MTD_PHRAM is not set CONFIG_MTD_MTDRAM=m CONFIG_MTDRAM_TOTAL_SIZE=4096 CONFIG_MTDRAM_ERASE_SIZE=128 # CONFIG_MTD_BLKMTD is not set # # Disk-On-Chip Device Drivers # CONFIG_MTD_DOC2000=m # CONFIG_MTD_DOC2001 is not set CONFIG_MTD_DOC2001PLUS=m CONFIG_MTD_DOCPROBE=m CONFIG_MTD_DOCECC=m # CONFIG_MTD_DOCPROBE_ADVANCED is not set 
CONFIG_MTD_DOCPROBE_ADDRESS=0 # # NAND Flash Device Drivers # CONFIG_MTD_NAND=m # CONFIG_MTD_NAND_VERIFY_WRITE is not set CONFIG_MTD_NAND_IDS=m # CONFIG_MTD_NAND_DISKONCHIP is not set # # Parallel port support # CONFIG_PARPORT=m CONFIG_PARPORT_PC=m CONFIG_PARPORT_PC_CML1=m CONFIG_PARPORT_SERIAL=m # CONFIG_PARPORT_PC_FIFO is not set # CONFIG_PARPORT_PC_SUPERIO is not set CONFIG_PARPORT_PC_PCMCIA=m # CONFIG_PARPORT_OTHER is not set CONFIG_PARPORT_1284=y # # Plug and Play support # CONFIG_PNP=y # CONFIG_PNP_DEBUG is not set # # Protocols # CONFIG_ISAPNP=y # CONFIG_PNPBIOS is not set # # Block devices # CONFIG_BLK_DEV_FD=m CONFIG_BLK_DEV_XD=m CONFIG_PARIDE=m CONFIG_PARIDE_PARPORT=m # # Parallel IDE high-level drivers # CONFIG_PARIDE_PD=m CONFIG_PARIDE_PCD=m CONFIG_PARIDE_PF=m CONFIG_PARIDE_PT=m CONFIG_PARIDE_PG=m # # Parallel IDE protocol modules # CONFIG_PARIDE_ATEN=m CONFIG_PARIDE_BPCK=m CONFIG_PARIDE_BPCK6=m CONFIG_PARIDE_COMM=m CONFIG_PARIDE_DSTR=m CONFIG_PARIDE_FIT2=m CONFIG_PARIDE_FIT3=m CONFIG_PARIDE_EPAT=m CONFIG_PARIDE_EPATC8=y CONFIG_PARIDE_EPIA=m CONFIG_PARIDE_FRIQ=m CONFIG_PARIDE_FRPW=m CONFIG_PARIDE_KBIC=m CONFIG_PARIDE_KTTI=m CONFIG_PARIDE_ON20=m CONFIG_PARIDE_ON26=m CONFIG_BLK_CPQ_DA=m CONFIG_BLK_CPQ_CISS_DA=m CONFIG_CISS_SCSI_TAPE=y CONFIG_BLK_DEV_DAC960=m CONFIG_BLK_DEV_UMEM=m CONFIG_BLK_DEV_LOOP=m # CONFIG_BLK_DEV_CRYPTOLOOP is not set CONFIG_BLK_DEV_NBD=m # CONFIG_BLK_DEV_SX8 is not set # CONFIG_BLK_DEV_UB is not set CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_SIZE=8192 CONFIG_BLK_DEV_INITRD=y # CONFIG_LBD is not set # # ATA/ATAPI/MFM/RLL support # CONFIG_IDE=y CONFIG_BLK_DEV_IDE=y # # Please see Documentation/ide.txt for help/info on IDE drives # # CONFIG_BLK_DEV_IDE_SATA is not set # CONFIG_BLK_DEV_HD_IDE is not set CONFIG_BLK_DEV_IDEDISK=y CONFIG_IDEDISK_MULTI_MODE=y CONFIG_BLK_DEV_IDECS=m CONFIG_BLK_DEV_IDECD=m CONFIG_BLK_DEV_IDETAPE=m CONFIG_BLK_DEV_IDEFLOPPY=y CONFIG_BLK_DEV_IDESCSI=m # CONFIG_IDE_TASK_IOCTL is not set # CONFIG_IDE_TASKFILE_IO is 
not set # # IDE chipset support/bugfixes # CONFIG_IDE_GENERIC=y CONFIG_BLK_DEV_CMD640=y # CONFIG_BLK_DEV_CMD640_ENHANCED is not set # CONFIG_BLK_DEV_IDEPNP is not set CONFIG_BLK_DEV_IDEPCI=y CONFIG_IDEPCI_SHARE_IRQ=y # CONFIG_BLK_DEV_OFFBOARD is not set CONFIG_BLK_DEV_GENERIC=y # CONFIG_BLK_DEV_OPTI621 is not set CONFIG_BLK_DEV_RZ1000=y CONFIG_BLK_DEV_IDEDMA_PCI=y # CONFIG_BLK_DEV_IDEDMA_FORCED is not set CONFIG_IDEDMA_PCI_AUTO=y # CONFIG_IDEDMA_ONLYDISK is not set CONFIG_BLK_DEV_AEC62XX=y CONFIG_BLK_DEV_ALI15X3=y # CONFIG_WDC_ALI15X3 is not set CONFIG_BLK_DEV_AMD74XX=y # CONFIG_BLK_DEV_ATIIXP is not set CONFIG_BLK_DEV_CMD64X=y CONFIG_BLK_DEV_TRIFLEX=y CONFIG_BLK_DEV_CY82C693=y # CONFIG_BLK_DEV_CS5520 is not set CONFIG_BLK_DEV_CS5530=y CONFIG_BLK_DEV_HPT34X=y # CONFIG_HPT34X_AUTODMA is not set CONFIG_BLK_DEV_HPT366=y # CONFIG_BLK_DEV_SC1200 is not set CONFIG_BLK_DEV_PIIX=y # CONFIG_BLK_DEV_NS87415 is not set CONFIG_BLK_DEV_PDC202XX_OLD=y # CONFIG_PDC202XX_BURST is not set CONFIG_BLK_DEV_PDC202XX_NEW=y CONFIG_PDC202XX_FORCE=y CONFIG_BLK_DEV_SVWKS=y CONFIG_BLK_DEV_SIIMAGE=y CONFIG_BLK_DEV_SIS5513=y CONFIG_BLK_DEV_SLC90E66=y # CONFIG_BLK_DEV_TRM290 is not set CONFIG_BLK_DEV_VIA82CXXX=y # CONFIG_IDE_ARM is not set # CONFIG_IDE_CHIPSETS is not set CONFIG_BLK_DEV_IDEDMA=y # CONFIG_IDEDMA_IVB is not set CONFIG_IDEDMA_AUTO=y # CONFIG_BLK_DEV_HD is not set # # SCSI device support # CONFIG_SCSI=m CONFIG_SCSI_PROC_FS=y # # SCSI support type (disk, tape, CD-ROM) # CONFIG_BLK_DEV_SD=m CONFIG_CHR_DEV_ST=m CONFIG_CHR_DEV_OSST=m CONFIG_BLK_DEV_SR=m CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=m # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs # # CONFIG_SCSI_MULTI_LUN is not set CONFIG_SCSI_CONSTANTS=y CONFIG_SCSI_LOGGING=y # # SCSI Transport Attributes # CONFIG_SCSI_SPI_ATTRS=m # CONFIG_SCSI_FC_ATTRS is not set # # SCSI low-level drivers # CONFIG_BLK_DEV_3W_XXXX_RAID=m # CONFIG_SCSI_3W_9XXX is not set CONFIG_SCSI_7000FASST=m CONFIG_SCSI_ACARD=m CONFIG_SCSI_AHA152X=m CONFIG_SCSI_AHA1542=m CONFIG_SCSI_AHA1740=m CONFIG_SCSI_AACRAID=m CONFIG_SCSI_AIC7XXX=m CONFIG_AIC7XXX_CMDS_PER_DEVICE=32 CONFIG_AIC7XXX_RESET_DELAY_MS=15000 # CONFIG_AIC7XXX_PROBE_EISA_VL is not set # CONFIG_AIC7XXX_DEBUG_ENABLE is not set CONFIG_AIC7XXX_DEBUG_MASK=0 # CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set CONFIG_SCSI_AIC7XXX_OLD=m CONFIG_SCSI_AIC79XX=m CONFIG_AIC79XX_CMDS_PER_DEVICE=32 CONFIG_AIC79XX_RESET_DELAY_MS=15000 # CONFIG_AIC79XX_ENABLE_RD_STRM is not set # CONFIG_AIC79XX_DEBUG_ENABLE is not set CONFIG_AIC79XX_DEBUG_MASK=0 # CONFIG_AIC79XX_REG_PRETTY_PRINT is not set CONFIG_SCSI_DPT_I2O=m CONFIG_SCSI_IN2000=m # CONFIG_MEGARAID_NEWGEN is not set # CONFIG_MEGARAID_LEGACY is not set CONFIG_SCSI_SATA=y CONFIG_SCSI_SATA_SVW=m CONFIG_SCSI_ATA_PIIX=m # CONFIG_SCSI_SATA_NV is not set CONFIG_SCSI_SATA_PROMISE=m # CONFIG_SCSI_SATA_SX4 is not set CONFIG_SCSI_SATA_SIL=m # CONFIG_SCSI_SATA_SIS is not set CONFIG_SCSI_SATA_VIA=m # CONFIG_SCSI_SATA_VITESSE is not set CONFIG_SCSI_BUSLOGIC=m # CONFIG_SCSI_OMIT_FLASHPOINT is not set CONFIG_SCSI_DMX3191D=m CONFIG_SCSI_DTC3280=m CONFIG_SCSI_EATA=m CONFIG_SCSI_EATA_TAGGED_QUEUE=y # CONFIG_SCSI_EATA_LINKED_COMMANDS is not set CONFIG_SCSI_EATA_MAX_TAGS=16 CONFIG_SCSI_EATA_PIO=m CONFIG_SCSI_FUTURE_DOMAIN=m CONFIG_SCSI_GDTH=m CONFIG_SCSI_GENERIC_NCR5380=m # CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set # CONFIG_SCSI_GENERIC_NCR53C400 is not set CONFIG_SCSI_IPS=m CONFIG_SCSI_INIA100=m CONFIG_SCSI_PPA=m CONFIG_SCSI_IMM=m # CONFIG_SCSI_IZIP_EPP16 is not set # CONFIG_SCSI_IZIP_SLOW_CTR is not set CONFIG_SCSI_NCR53C406A=m CONFIG_53C700_IO_MAPPED=y CONFIG_SCSI_SYM53C8XX_2=m 
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set # CONFIG_SCSI_IPR is not set CONFIG_SCSI_PAS16=m CONFIG_SCSI_PSI240I=m CONFIG_SCSI_QLOGIC_FAS=m CONFIG_SCSI_QLOGIC_ISP=m CONFIG_SCSI_QLOGIC_FC=m # CONFIG_SCSI_QLOGIC_FC_FIRMWARE is not set CONFIG_SCSI_QLOGIC_1280=m CONFIG_SCSI_QLA2XXX=m # CONFIG_SCSI_QLA21XX is not set # CONFIG_SCSI_QLA22XX is not set # CONFIG_SCSI_QLA2300 is not set # CONFIG_SCSI_QLA2322 is not set # CONFIG_SCSI_QLA6312 is not set # CONFIG_SCSI_QLA6322 is not set CONFIG_SCSI_SIM710=m CONFIG_SCSI_SYM53C416=m # CONFIG_SCSI_DC395x is not set CONFIG_SCSI_DC390T=m CONFIG_SCSI_T128=m CONFIG_SCSI_U14_34F=m # CONFIG_SCSI_U14_34F_TAGGED_QUEUE is not set # CONFIG_SCSI_U14_34F_LINKED_COMMANDS is not set CONFIG_SCSI_U14_34F_MAX_TAGS=8 CONFIG_SCSI_ULTRASTOR=m CONFIG_SCSI_NSP32=m CONFIG_SCSI_DEBUG=m # # PCMCIA SCSI adapter support # CONFIG_PCMCIA_AHA152X=m CONFIG_PCMCIA_FDOMAIN=m CONFIG_PCMCIA_NINJA_SCSI=m CONFIG_PCMCIA_QLOGIC=m # CONFIG_PCMCIA_SYM53C500 is not set # # Old CD-ROM drivers (not SCSI, not IDE) # # CONFIG_CD_NO_IDESCSI is not set # # Multi-device support (RAID and LVM) # CONFIG_MD=y CONFIG_BLK_DEV_MD=y CONFIG_MD_LINEAR=m CONFIG_MD_RAID0=m CONFIG_MD_RAID1=m # CONFIG_MD_RAID10 is not set CONFIG_MD_RAID5=m # CONFIG_MD_RAID6 is not set CONFIG_MD_MULTIPATH=m CONFIG_BLK_DEV_DM=m # CONFIG_DM_CRYPT is not set # CONFIG_DM_SNAPSHOT is not set # CONFIG_DM_MIRROR is not set # CONFIG_DM_ZERO is not set # # Fusion MPT device support # CONFIG_FUSION=m CONFIG_FUSION_MAX_SGE=40 CONFIG_FUSION_CTL=m CONFIG_FUSION_LAN=m # # IEEE 1394 (FireWire) support # CONFIG_IEEE1394=m # # Subsystem Options # # CONFIG_IEEE1394_VERBOSEDEBUG is not set # CONFIG_IEEE1394_OUI_DB is not set CONFIG_IEEE1394_EXTRA_CONFIG_ROMS=y CONFIG_IEEE1394_CONFIG_ROM_IP1394=y # # Device Drivers # # CONFIG_IEEE1394_PCILYNX is not set CONFIG_IEEE1394_OHCI1394=m # # Protocol Drivers # 
CONFIG_IEEE1394_VIDEO1394=m CONFIG_IEEE1394_SBP2=m CONFIG_IEEE1394_SBP2_PHYS_DMA=y CONFIG_IEEE1394_ETH1394=m CONFIG_IEEE1394_DV1394=m CONFIG_IEEE1394_RAWIO=m CONFIG_IEEE1394_CMP=m CONFIG_IEEE1394_AMDTP=m # # I2O device support # CONFIG_I2O=m # CONFIG_I2O_CONFIG is not set CONFIG_I2O_BLOCK=m CONFIG_I2O_SCSI=m CONFIG_I2O_PROC=m # # Networking support # CONFIG_NET=y # # Networking options # CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_NETLINK_DEV=y CONFIG_UNIX=y CONFIG_NET_KEY=m CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_IP_MULTIPLE_TABLES=y CONFIG_IP_ROUTE_FWMARK=y CONFIG_IP_ROUTE_MULTIPATH=y CONFIG_IP_ROUTE_VERBOSE=y # CONFIG_IP_PNP is not set CONFIG_NET_IPIP=m CONFIG_NET_IPGRE=m CONFIG_NET_IPGRE_BROADCAST=y CONFIG_IP_MROUTE=y CONFIG_IP_PIMSM_V1=y CONFIG_IP_PIMSM_V2=y # CONFIG_ARPD is not set CONFIG_SYN_COOKIES=y CONFIG_INET_AH=m CONFIG_INET_ESP=m CONFIG_INET_IPCOMP=m CONFIG_INET_TUNNEL=m # # IP: Virtual Server Configuration # CONFIG_IP_VS=m # CONFIG_IP_VS_DEBUG is not set CONFIG_IP_VS_TAB_BITS=16 # # IPVS transport protocol load balancing support # # CONFIG_IP_VS_PROTO_TCP is not set # CONFIG_IP_VS_PROTO_UDP is not set # CONFIG_IP_VS_PROTO_ESP is not set # CONFIG_IP_VS_PROTO_AH is not set # # IPVS scheduler # CONFIG_IP_VS_RR=m CONFIG_IP_VS_WRR=m CONFIG_IP_VS_LC=m CONFIG_IP_VS_WLC=m CONFIG_IP_VS_LBLC=m CONFIG_IP_VS_LBLCR=m CONFIG_IP_VS_DH=m CONFIG_IP_VS_SH=m # CONFIG_IP_VS_SED is not set # CONFIG_IP_VS_NQ is not set # # IPVS application helper # CONFIG_IPV6=m # CONFIG_IPV6_PRIVACY is not set CONFIG_INET6_AH=m CONFIG_INET6_ESP=m CONFIG_INET6_IPCOMP=m CONFIG_INET6_TUNNEL=m # CONFIG_IPV6_TUNNEL is not set CONFIG_NETFILTER=y # CONFIG_NETFILTER_DEBUG is not set CONFIG_BRIDGE_NETFILTER=y # # IP: Netfilter Configuration # CONFIG_IP_NF_CONNTRACK=m # CONFIG_IP_NF_CT_ACCT is not set # CONFIG_IP_NF_CT_PROTO_SCTP is not set CONFIG_IP_NF_FTP=m CONFIG_IP_NF_IRC=m CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m CONFIG_IP_NF_QUEUE=m CONFIG_IP_NF_IPTABLES=m 
CONFIG_IP_NF_MATCH_LIMIT=m # CONFIG_IP_NF_MATCH_IPRANGE is not set CONFIG_IP_NF_MATCH_MAC=m CONFIG_IP_NF_MATCH_PKTTYPE=m CONFIG_IP_NF_MATCH_MARK=m CONFIG_IP_NF_MATCH_MULTIPORT=m CONFIG_IP_NF_MATCH_TOS=m CONFIG_IP_NF_MATCH_RECENT=m CONFIG_IP_NF_MATCH_ECN=m CONFIG_IP_NF_MATCH_DSCP=m CONFIG_IP_NF_MATCH_AH_ESP=m CONFIG_IP_NF_MATCH_LENGTH=m CONFIG_IP_NF_MATCH_TTL=m CONFIG_IP_NF_MATCH_TCPMSS=m CONFIG_IP_NF_MATCH_HELPER=m CONFIG_IP_NF_MATCH_STATE=m CONFIG_IP_NF_MATCH_CONNTRACK=m CONFIG_IP_NF_MATCH_OWNER=m # CONFIG_IP_NF_MATCH_PHYSDEV is not set # CONFIG_IP_NF_MATCH_ADDRTYPE is not set # CONFIG_IP_NF_MATCH_REALM is not set # CONFIG_IP_NF_MATCH_SCTP is not set # CONFIG_IP_NF_MATCH_COMMENT is not set CONFIG_IP_NF_FILTER=m CONFIG_IP_NF_TARGET_REJECT=m CONFIG_IP_NF_TARGET_LOG=m CONFIG_IP_NF_TARGET_ULOG=m CONFIG_IP_NF_TARGET_TCPMSS=m CONFIG_IP_NF_NAT=m CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=m CONFIG_IP_NF_TARGET_REDIRECT=m # CONFIG_IP_NF_TARGET_NETMAP is not set # CONFIG_IP_NF_TARGET_SAME is not set # CONFIG_IP_NF_NAT_LOCAL is not set CONFIG_IP_NF_NAT_SNMP_BASIC=m CONFIG_IP_NF_NAT_IRC=m CONFIG_IP_NF_NAT_FTP=m CONFIG_IP_NF_NAT_TFTP=m CONFIG_IP_NF_NAT_AMANDA=m CONFIG_IP_NF_MANGLE=m CONFIG_IP_NF_TARGET_TOS=m CONFIG_IP_NF_TARGET_ECN=m CONFIG_IP_NF_TARGET_DSCP=m CONFIG_IP_NF_TARGET_MARK=m # CONFIG_IP_NF_TARGET_CLASSIFY is not set # CONFIG_IP_NF_RAW is not set CONFIG_IP_NF_ARPTABLES=m CONFIG_IP_NF_ARPFILTER=m CONFIG_IP_NF_ARP_MANGLE=m CONFIG_IP_NF_COMPAT_IPCHAINS=m CONFIG_IP_NF_COMPAT_IPFWADM=m # # IPv6: Netfilter Configuration # # CONFIG_IP6_NF_QUEUE is not set CONFIG_IP6_NF_IPTABLES=m CONFIG_IP6_NF_MATCH_LIMIT=m CONFIG_IP6_NF_MATCH_MAC=m CONFIG_IP6_NF_MATCH_RT=m CONFIG_IP6_NF_MATCH_OPTS=m CONFIG_IP6_NF_MATCH_FRAG=m CONFIG_IP6_NF_MATCH_HL=m CONFIG_IP6_NF_MATCH_MULTIPORT=m CONFIG_IP6_NF_MATCH_OWNER=m CONFIG_IP6_NF_MATCH_MARK=m CONFIG_IP6_NF_MATCH_IPV6HEADER=m CONFIG_IP6_NF_MATCH_AHESP=m CONFIG_IP6_NF_MATCH_LENGTH=m CONFIG_IP6_NF_MATCH_EUI64=m # 
CONFIG_IP6_NF_MATCH_PHYSDEV is not set CONFIG_IP6_NF_FILTER=m CONFIG_IP6_NF_TARGET_LOG=m CONFIG_IP6_NF_MANGLE=m CONFIG_IP6_NF_TARGET_MARK=m # CONFIG_IP6_NF_RAW is not set # # Bridge: Netfilter Configuration # # CONFIG_BRIDGE_NF_EBTABLES is not set CONFIG_XFRM=y CONFIG_XFRM_USER=m # # SCTP Configuration (EXPERIMENTAL) # # CONFIG_IP_SCTP is not set CONFIG_ATM=y CONFIG_ATM_CLIP=y # CONFIG_ATM_CLIP_NO_ICMP is not set CONFIG_ATM_LANE=m CONFIG_ATM_MPOA=m CONFIG_ATM_BR2684=m CONFIG_ATM_BR2684_IPFILTER=y CONFIG_BRIDGE=m CONFIG_VLAN_8021Q=m # CONFIG_DECNET is not set CONFIG_LLC=y # CONFIG_LLC2 is not set CONFIG_IPX=m # CONFIG_IPX_INTERN is not set CONFIG_ATALK=m CONFIG_DEV_APPLETALK=y CONFIG_LTPC=m CONFIG_COPS=m CONFIG_COPS_DAYNA=y CONFIG_COPS_TANGENT=y CONFIG_IPDDP=m CONFIG_IPDDP_ENCAP=y CONFIG_IPDDP_DECAP=y # CONFIG_X25 is not set # CONFIG_LAPB is not set CONFIG_NET_DIVERT=y # CONFIG_ECONET is not set CONFIG_WAN_ROUTER=m # CONFIG_NET_HW_FLOWCONTROL is not set # # QoS and/or fair queueing # CONFIG_NET_SCHED=y CONFIG_NET_SCH_CLK_JIFFIES=y # CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set # CONFIG_NET_SCH_CLK_CPU is not set CONFIG_NET_SCH_CBQ=m CONFIG_NET_SCH_HTB=m # CONFIG_NET_SCH_HFSC is not set # CONFIG_NET_SCH_ATM is not set CONFIG_NET_SCH_PRIO=m CONFIG_NET_SCH_RED=m CONFIG_NET_SCH_SFQ=m CONFIG_NET_SCH_TEQL=m CONFIG_NET_SCH_TBF=m CONFIG_NET_SCH_GRED=m CONFIG_NET_SCH_DSMARK=m # CONFIG_NET_SCH_NETEM is not set CONFIG_NET_SCH_INGRESS=m CONFIG_NET_QOS=y CONFIG_NET_ESTIMATOR=y CONFIG_NET_CLS=y CONFIG_NET_CLS_TCINDEX=m CONFIG_NET_CLS_ROUTE4=m CONFIG_NET_CLS_ROUTE=y CONFIG_NET_CLS_FW=m CONFIG_NET_CLS_U32=m # CONFIG_CLS_U32_PERF is not set # CONFIG_NET_CLS_IND is not set CONFIG_NET_CLS_RSVP=m CONFIG_NET_CLS_RSVP6=m # CONFIG_NET_CLS_ACT is not set CONFIG_NET_CLS_POLICE=y # # Network testing # # CONFIG_NET_PKTGEN is not set CONFIG_NETPOLL=y # CONFIG_NETPOLL_RX is not set # CONFIG_NETPOLL_TRAP is not set CONFIG_NET_POLL_CONTROLLER=y # CONFIG_HAMRADIO is not set CONFIG_IRDA=m # # IrDA 
protocols # CONFIG_IRLAN=m CONFIG_IRNET=m CONFIG_IRCOMM=m CONFIG_IRDA_ULTRA=y # # IrDA options # CONFIG_IRDA_CACHE_LAST_LSAP=y CONFIG_IRDA_FAST_RR=y # CONFIG_IRDA_DEBUG is not set # # Infrared-port device drivers # # # SIR device drivers # CONFIG_IRTTY_SIR=m # # Dongle support # CONFIG_DONGLE=y CONFIG_ESI_DONGLE=m CONFIG_ACTISYS_DONGLE=m CONFIG_TEKRAM_DONGLE=m CONFIG_LITELINK_DONGLE=m CONFIG_MA600_DONGLE=m CONFIG_GIRBIL_DONGLE=m CONFIG_MCP2120_DONGLE=m CONFIG_OLD_BELKIN_DONGLE=m CONFIG_ACT200L_DONGLE=m # # Old SIR device drivers # # # Old Serial dongle support # # # FIR device drivers # CONFIG_USB_IRDA=m # CONFIG_SIGMATEL_FIR is not set CONFIG_NSC_FIR=m CONFIG_WINBOND_FIR=m CONFIG_TOSHIBA_FIR=m CONFIG_SMC_IRCC_FIR=m CONFIG_ALI_FIR=m CONFIG_VLSI_FIR=m # CONFIG_VIA_FIR is not set # CONFIG_BT is not set CONFIG_NETDEVICES=y CONFIG_DUMMY=m CONFIG_BONDING=m CONFIG_EQUALIZER=m CONFIG_TUN=m CONFIG_ETHERTAP=m CONFIG_NET_SB1000=m # # ARCnet devices # # CONFIG_ARCNET is not set # # Ethernet (10 or 100Mbit) # CONFIG_NET_ETHERNET=y CONFIG_MII=m CONFIG_HAPPYMEAL=m CONFIG_SUNGEM=m CONFIG_NET_VENDOR_3COM=y CONFIG_EL1=m CONFIG_EL2=m CONFIG_ELPLUS=m CONFIG_EL16=m CONFIG_EL3=m CONFIG_3C515=m CONFIG_VORTEX=m CONFIG_TYPHOON=m CONFIG_LANCE=m CONFIG_NET_VENDOR_SMC=y CONFIG_WD80x3=m CONFIG_ULTRA=m CONFIG_ULTRA32=m CONFIG_SMC9194=m CONFIG_NET_VENDOR_RACAL=y CONFIG_NI52=m CONFIG_NI65=m # # Tulip family network device support # # CONFIG_NET_TULIP is not set CONFIG_AT1700=m CONFIG_DEPCA=m CONFIG_HP100=m CONFIG_NET_ISA=y CONFIG_E2100=m # CONFIG_EWRK3 is not set CONFIG_EEXPRESS=m CONFIG_EEXPRESS_PRO=m CONFIG_HPLAN_PLUS=m CONFIG_HPLAN=m CONFIG_LP486E=m CONFIG_ETH16I=m CONFIG_NE2000=m # CONFIG_ZNET is not set # CONFIG_SEEQ8005 is not set CONFIG_NET_PCI=y CONFIG_PCNET32=m CONFIG_AMD8111_ETH=m # CONFIG_AMD8111E_NAPI is not set CONFIG_ADAPTEC_STARFIRE=m # CONFIG_ADAPTEC_STARFIRE_NAPI is not set CONFIG_AC3200=m CONFIG_APRICOT=m CONFIG_B44=m # CONFIG_FORCEDETH is not set CONFIG_CS89x0=m CONFIG_DGRS=m 
CONFIG_EEPRO100=m # CONFIG_EEPRO100_PIO is not set CONFIG_E100=m # CONFIG_E100_NAPI is not set CONFIG_LNE390=m CONFIG_FEALNX=m CONFIG_NATSEMI=m CONFIG_NE2K_PCI=m CONFIG_NE3210=m CONFIG_ES3210=m CONFIG_8139CP=m CONFIG_8139TOO=m CONFIG_8139TOO_PIO=y # CONFIG_8139TOO_TUNE_TWISTER is not set CONFIG_8139TOO_8129=y # CONFIG_8139_OLD_RX_RESET is not set CONFIG_SIS900=m CONFIG_EPIC100=m CONFIG_SUNDANCE=m # CONFIG_SUNDANCE_MMIO is not set CONFIG_TLAN=m CONFIG_VIA_RHINE=m # CONFIG_VIA_RHINE_MMIO is not set # CONFIG_VIA_VELOCITY is not set CONFIG_NET_POCKET=y CONFIG_ATP=m CONFIG_DE600=m CONFIG_DE620=m # # Ethernet (1000 Mbit) # CONFIG_ACENIC=m # CONFIG_ACENIC_OMIT_TIGON_I is not set CONFIG_DL2K=m CONFIG_E1000=m CONFIG_E1000_NAPI=y CONFIG_NS83820=m CONFIG_HAMACHI=m CONFIG_YELLOWFIN=m CONFIG_R8169=m # CONFIG_R8169_NAPI is not set CONFIG_SK98LIN=m CONFIG_TIGON3=m # # Ethernet (10000 Mbit) # # CONFIG_IXGB is not set # CONFIG_S2IO is not set # # Token Ring devices # CONFIG_TR=y CONFIG_IBMTR=m CONFIG_IBMOL=m CONFIG_IBMLS=m CONFIG_3C359=m CONFIG_TMS380TR=m CONFIG_TMSPCI=m # CONFIG_SKISA is not set # CONFIG_PROTEON is not set CONFIG_ABYSS=m CONFIG_SMCTR=m # # Wireless LAN (non-hamradio) # CONFIG_NET_RADIO=y # # Obsolete Wireless cards support (pre-802.11) # CONFIG_STRIP=m CONFIG_ARLAN=m CONFIG_WAVELAN=m CONFIG_PCMCIA_WAVELAN=m CONFIG_PCMCIA_NETWAVE=m # # Wireless 802.11 Frequency Hopping cards support # CONFIG_PCMCIA_RAYCS=m # # Wireless 802.11b ISA/PCI cards support # CONFIG_AIRO=m CONFIG_HERMES=m CONFIG_PLX_HERMES=m # CONFIG_TMD_HERMES is not set CONFIG_PCI_HERMES=m # CONFIG_ATMEL is not set # # Wireless 802.11b Pcmcia/Cardbus cards support # CONFIG_PCMCIA_HERMES=m CONFIG_AIRO_CS=m # CONFIG_PCMCIA_WL3501 is not set # # Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support # # CONFIG_PRISM54 is not set CONFIG_NET_WIRELESS=y # # PCMCIA network device support # CONFIG_NET_PCMCIA=y CONFIG_PCMCIA_3C589=m CONFIG_PCMCIA_3C574=m CONFIG_PCMCIA_FMVJ18X=m CONFIG_PCMCIA_PCNET=m 
CONFIG_PCMCIA_NMCLAN=m CONFIG_PCMCIA_SMC91C92=m CONFIG_PCMCIA_XIRC2PS=m CONFIG_PCMCIA_AXNET=m CONFIG_PCMCIA_IBMTR=m # # Wan interfaces # CONFIG_WAN=y CONFIG_HOSTESS_SV11=m CONFIG_COSA=m # CONFIG_DSCC4 is not set # CONFIG_LANMEDIA is not set CONFIG_SEALEVEL_4021=m # CONFIG_SYNCLINK_SYNCPPP is not set # CONFIG_HDLC is not set CONFIG_DLCI=m CONFIG_DLCI_COUNT=24 CONFIG_DLCI_MAX=8 CONFIG_SDLA=m CONFIG_WAN_ROUTER_DRIVERS=y CONFIG_CYCLADES_SYNC=m CONFIG_CYCLOMX_X25=y CONFIG_SBNI=m CONFIG_SBNI_MULTILINE=y # # ATM drivers # CONFIG_ATM_TCP=m CONFIG_ATM_LANAI=m CONFIG_ATM_ENI=m # CONFIG_ATM_ENI_DEBUG is not set # CONFIG_ATM_ENI_TUNE_BURST is not set CONFIG_ATM_FIRESTREAM=m CONFIG_ATM_ZATM=m # CONFIG_ATM_ZATM_DEBUG is not set CONFIG_ATM_NICSTAR=m CONFIG_ATM_NICSTAR_USE_SUNI=y CONFIG_ATM_NICSTAR_USE_IDT77105=y CONFIG_ATM_IDT77252=m # CONFIG_ATM_IDT77252_DEBUG is not set # CONFIG_ATM_IDT77252_RCV_ALL is not set CONFIG_ATM_IDT77252_USE_SUNI=y CONFIG_ATM_AMBASSADOR=m # CONFIG_ATM_AMBASSADOR_DEBUG is not set CONFIG_ATM_HORIZON=m # CONFIG_ATM_HORIZON_DEBUG is not set CONFIG_ATM_IA=m # CONFIG_ATM_IA_DEBUG is not set CONFIG_ATM_FORE200E_MAYBE=m CONFIG_ATM_FORE200E_PCA=y CONFIG_ATM_FORE200E_PCA_DEFAULT_FW=y # CONFIG_ATM_FORE200E_USE_TASKLET is not set CONFIG_ATM_FORE200E_TX_RETRY=16 CONFIG_ATM_FORE200E_DEBUG=0 CONFIG_ATM_FORE200E=m CONFIG_ATM_HE=m # CONFIG_ATM_HE_USE_SUNI is not set CONFIG_FDDI=y CONFIG_DEFXX=m CONFIG_SKFP=m # CONFIG_HIPPI is not set CONFIG_PLIP=m CONFIG_PPP=m CONFIG_PPP_MULTILINK=y CONFIG_PPP_FILTER=y CONFIG_PPP_ASYNC=m CONFIG_PPP_SYNC_TTY=m CONFIG_PPP_DEFLATE=m # CONFIG_PPP_BSDCOMP is not set CONFIG_PPPOE=m CONFIG_PPPOATM=m CONFIG_SLIP=m CONFIG_SLIP_COMPRESSED=y CONFIG_SLIP_SMART=y CONFIG_SLIP_MODE_SLIP6=y CONFIG_NET_FC=y CONFIG_SHAPER=m CONFIG_NETCONSOLE=m # # ISDN subsystem # CONFIG_ISDN=m # # Old ISDN4Linux # # CONFIG_ISDN_I4L is not set # # CAPI subsystem # CONFIG_ISDN_CAPI=m CONFIG_ISDN_DRV_AVMB1_VERBOSE_REASON=y CONFIG_ISDN_CAPI_MIDDLEWARE=y 
CONFIG_ISDN_CAPI_CAPI20=m CONFIG_ISDN_CAPI_CAPIFS_BOOL=y CONFIG_ISDN_CAPI_CAPIFS=m # # CAPI hardware drivers # # # Active AVM cards # # CONFIG_CAPI_AVM is not set # # Active Eicon DIVA Server cards # # CONFIG_CAPI_EICON is not set # # Telephony Support # CONFIG_PHONE=m CONFIG_PHONE_IXJ=m CONFIG_PHONE_IXJ_PCMCIA=m # # Input device support # CONFIG_INPUT=y # # Userland interfaces # CONFIG_INPUT_MOUSEDEV=y CONFIG_INPUT_MOUSEDEV_PSAUX=y CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 CONFIG_INPUT_JOYDEV=m # CONFIG_INPUT_TSDEV is not set CONFIG_INPUT_EVDEV=m # CONFIG_INPUT_EVBUG is not set # # Input I/O drivers # # CONFIG_GAMEPORT is not set CONFIG_SOUND_GAMEPORT=y CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_SERIO_SERPORT=y # CONFIG_SERIO_CT82C710 is not set # CONFIG_SERIO_PARKBD is not set # CONFIG_SERIO_PCIPS2 is not set # CONFIG_SERIO_RAW is not set # # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y # CONFIG_KEYBOARD_SUNKBD is not set # CONFIG_KEYBOARD_LKKBD is not set # CONFIG_KEYBOARD_XTKBD is not set # CONFIG_KEYBOARD_NEWTON is not set CONFIG_INPUT_MOUSE=y CONFIG_MOUSE_PS2=y # CONFIG_MOUSE_SERIAL is not set # CONFIG_MOUSE_INPORT is not set # CONFIG_MOUSE_LOGIBM is not set # CONFIG_MOUSE_PC110PAD is not set # CONFIG_MOUSE_VSXXXAA is not set # CONFIG_INPUT_JOYSTICK is not set # CONFIG_INPUT_TOUCHSCREEN is not set # CONFIG_INPUT_MISC is not set # # Character devices # CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_SERIAL_NONSTANDARD=y CONFIG_ROCKETPORT=m CONFIG_CYCLADES=m # CONFIG_CYZ_INTR is not set CONFIG_SYNCLINK=m # CONFIG_SYNCLINKMP is not set CONFIG_N_HDLC=m CONFIG_STALDRV=y # # Serial drivers # CONFIG_SERIAL_8250=m # CONFIG_SERIAL_8250_CS is not set # CONFIG_SERIAL_8250_ACPI is not set CONFIG_SERIAL_8250_NR_UARTS=4 # CONFIG_SERIAL_8250_EXTENDED is not set # # Non-8250 serial port support # CONFIG_SERIAL_CORE=m CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 CONFIG_PRINTER=m 
CONFIG_LP_CONSOLE=y CONFIG_PPDEV=m CONFIG_TIPAR=m # # IPMI # CONFIG_IPMI_HANDLER=m # CONFIG_IPMI_PANIC_EVENT is not set CONFIG_IPMI_DEVICE_INTERFACE=m # CONFIG_IPMI_SI is not set CONFIG_IPMI_WATCHDOG=m # CONFIG_IPMI_POWEROFF is not set # # Watchdog Cards # CONFIG_WATCHDOG=y # CONFIG_WATCHDOG_NOWAYOUT is not set # # Watchdog Device Drivers # CONFIG_SOFT_WATCHDOG=m CONFIG_ACQUIRE_WDT=m CONFIG_ADVANTECH_WDT=m CONFIG_ALIM1535_WDT=m CONFIG_ALIM7101_WDT=m CONFIG_SC520_WDT=m CONFIG_EUROTECH_WDT=m CONFIG_IB700_WDT=m CONFIG_WAFER_WDT=m # CONFIG_I8XX_TCO is not set CONFIG_SC1200_WDT=m # CONFIG_SCx200_WDT is not set # CONFIG_60XX_WDT is not set # CONFIG_CPU5_WDT is not set # CONFIG_W83627HF_WDT is not set CONFIG_W83877F_WDT=m CONFIG_MACHZ_WDT=m # # ISA-based Watchdog Cards # CONFIG_PCWATCHDOG=m # CONFIG_MIXCOMWD is not set CONFIG_WDT=m # CONFIG_WDT_501 is not set # # PCI-based Watchdog Cards # # CONFIG_PCIPCWATCHDOG is not set CONFIG_WDTPCI=m # CONFIG_WDT_501_PCI is not set # # USB-based Watchdog Cards # # CONFIG_USBPCWATCHDOG is not set # CONFIG_HW_RANDOM is not set CONFIG_NVRAM=m CONFIG_RTC=y CONFIG_DTLK=m CONFIG_R3964=m # CONFIG_APPLICOM is not set CONFIG_SONYPI=m # # Ftape, the floppy tape device driver # CONFIG_AGP=m CONFIG_AGP_ALI=m CONFIG_AGP_ATI=m CONFIG_AGP_AMD=m # CONFIG_AGP_AMD64 is not set CONFIG_AGP_INTEL=m # CONFIG_AGP_INTEL_MCH is not set CONFIG_AGP_NVIDIA=m CONFIG_AGP_SIS=m CONFIG_AGP_SWORKS=m CONFIG_AGP_VIA=m # CONFIG_AGP_EFFICEON is not set CONFIG_DRM=y CONFIG_DRM_TDFX=m CONFIG_DRM_R128=m CONFIG_DRM_RADEON=m CONFIG_DRM_I810=m CONFIG_DRM_I830=m # CONFIG_DRM_I915 is not set CONFIG_DRM_MGA=m CONFIG_DRM_SIS=m # # PCMCIA character devices # CONFIG_SYNCLINK_CS=m CONFIG_MWAVE=m # CONFIG_RAW_DRIVER is not set # CONFIG_HPET is not set # CONFIG_HANGCHECK_TIMER is not set # # I2C support # CONFIG_I2C=m CONFIG_I2C_CHARDEV=m # # I2C Algorithms # CONFIG_I2C_ALGOBIT=m CONFIG_I2C_ALGOPCF=m # CONFIG_I2C_ALGOPCA is not set # # I2C Hardware Bus support # CONFIG_I2C_ALI1535=m # 
CONFIG_I2C_ALI1563 is not set CONFIG_I2C_ALI15X3=m CONFIG_I2C_AMD756=m # CONFIG_I2C_AMD8111 is not set CONFIG_I2C_I801=m CONFIG_I2C_I810=m CONFIG_I2C_ISA=m # CONFIG_I2C_NFORCE2 is not set CONFIG_I2C_PARPORT=m # CONFIG_I2C_PARPORT_LIGHT is not set CONFIG_I2C_PIIX4=m # CONFIG_I2C_PROSAVAGE is not set # CONFIG_I2C_SAVAGE4 is not set # CONFIG_SCx200_ACB is not set CONFIG_I2C_SIS5595=m # CONFIG_I2C_SIS630 is not set # CONFIG_I2C_SIS96X is not set CONFIG_I2C_VIA=m CONFIG_I2C_VIAPRO=m CONFIG_I2C_VOODOO3=m # CONFIG_I2C_PCA_ISA is not set # # Hardware Sensors Chip support # CONFIG_I2C_SENSOR=m CONFIG_SENSORS_ADM1021=m CONFIG_SENSORS_ADM1025=m # CONFIG_SENSORS_ADM1031 is not set # CONFIG_SENSORS_ASB100 is not set CONFIG_SENSORS_DS1621=m # CONFIG_SENSORS_FSCHER is not set CONFIG_SENSORS_GL518SM=m CONFIG_SENSORS_IT87=m CONFIG_SENSORS_LM75=m # CONFIG_SENSORS_LM77 is not set CONFIG_SENSORS_LM78=m CONFIG_SENSORS_LM80=m # CONFIG_SENSORS_LM83 is not set # CONFIG_SENSORS_LM85 is not set # CONFIG_SENSORS_LM90 is not set # CONFIG_SENSORS_MAX1619 is not set CONFIG_SENSORS_SMSC47M1=m CONFIG_SENSORS_VIA686A=m CONFIG_SENSORS_W83781D=m # CONFIG_SENSORS_W83L785TS is not set # CONFIG_SENSORS_W83627HF is not set # # Other I2C Chip support # CONFIG_SENSORS_EEPROM=m CONFIG_SENSORS_PCF8574=m CONFIG_SENSORS_PCF8591=m # CONFIG_SENSORS_RTC8564 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set # CONFIG_I2C_DEBUG_CHIP is not set # # Dallas's 1-wire bus # # CONFIG_W1 is not set # # Misc devices # # CONFIG_IBM_ASM is not set # # Multimedia devices # CONFIG_VIDEO_DEV=m # # Video For Linux # # # Video Adapters # CONFIG_VIDEO_BT848=m CONFIG_VIDEO_PMS=m CONFIG_VIDEO_BWQCAM=m CONFIG_VIDEO_CQCAM=m CONFIG_VIDEO_W9966=m CONFIG_VIDEO_CPIA=m CONFIG_VIDEO_CPIA_PP=m CONFIG_VIDEO_CPIA_USB=m # CONFIG_VIDEO_SAA5246A is not set CONFIG_VIDEO_SAA5249=m CONFIG_TUNER_3036=m CONFIG_VIDEO_STRADIS=m CONFIG_VIDEO_ZORAN=m CONFIG_VIDEO_ZORAN_BUZ=m 
CONFIG_VIDEO_ZORAN_DC10=m # CONFIG_VIDEO_ZORAN_DC30 is not set CONFIG_VIDEO_ZORAN_LML33=m # CONFIG_VIDEO_ZORAN_LML33R10 is not set # CONFIG_VIDEO_SAA7134 is not set # CONFIG_VIDEO_MXB is not set # CONFIG_VIDEO_DPC is not set # CONFIG_VIDEO_HEXIUM_ORION is not set # CONFIG_VIDEO_HEXIUM_GEMINI is not set # CONFIG_VIDEO_CX88 is not set # CONFIG_VIDEO_OVCAMCHIP is not set # # Radio Adapters # CONFIG_RADIO_CADET=m CONFIG_RADIO_RTRACK=m CONFIG_RADIO_RTRACK2=m CONFIG_RADIO_AZTECH=m CONFIG_RADIO_GEMTEK=m CONFIG_RADIO_GEMTEK_PCI=m CONFIG_RADIO_MAXIRADIO=m CONFIG_RADIO_MAESTRO=m CONFIG_RADIO_SF16FMI=m CONFIG_RADIO_SF16FMR2=m CONFIG_RADIO_TERRATEC=m CONFIG_RADIO_TRUST=m CONFIG_RADIO_TYPHOON=m CONFIG_RADIO_TYPHOON_PROC_FS=y CONFIG_RADIO_ZOLTRIX=m # # Digital Video Broadcasting Devices # # CONFIG_DVB is not set CONFIG_VIDEO_TUNER=m CONFIG_VIDEO_BUF=m CONFIG_VIDEO_BTCX=m CONFIG_VIDEO_IR=m # # Graphics support # CONFIG_FB=y CONFIG_FB_MODE_HELPERS=y # CONFIG_FB_CIRRUS is not set CONFIG_FB_PM2=m # CONFIG_FB_PM2_FIFO_DISCONNECT is not set # CONFIG_FB_CYBER2000 is not set # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set CONFIG_FB_VGA16=m CONFIG_FB_VESA=y CONFIG_VIDEO_SELECT=y CONFIG_FB_HGA=m # CONFIG_FB_HGA_ACCEL is not set CONFIG_FB_RIVA=m # CONFIG_FB_RIVA_I2C is not set # CONFIG_FB_RIVA_DEBUG is not set # CONFIG_FB_I810 is not set CONFIG_FB_MATROX=m CONFIG_FB_MATROX_MILLENIUM=y CONFIG_FB_MATROX_MYSTIQUE=y CONFIG_FB_MATROX_G450=y CONFIG_FB_MATROX_G100=y CONFIG_FB_MATROX_I2C=m CONFIG_FB_MATROX_MAVEN=m CONFIG_FB_MATROX_MULTIHEAD=y # CONFIG_FB_RADEON_OLD is not set CONFIG_FB_RADEON=m CONFIG_FB_RADEON_I2C=y # CONFIG_FB_RADEON_DEBUG is not set CONFIG_FB_ATY128=m CONFIG_FB_ATY=m CONFIG_FB_ATY_CT=y CONFIG_FB_ATY_GX=y # CONFIG_FB_ATY_XL_INIT is not set CONFIG_FB_SIS=m CONFIG_FB_SIS_300=y CONFIG_FB_SIS_315=y CONFIG_FB_NEOMAGIC=m # CONFIG_FB_KYRO is not set CONFIG_FB_3DFX=m # CONFIG_FB_3DFX_ACCEL is not set CONFIG_FB_VOODOO1=m # CONFIG_FB_TRIDENT is not set # CONFIG_FB_VIRTUAL is 
not set # # Console display driver support # CONFIG_VGA_CONSOLE=y CONFIG_MDA_CONSOLE=m CONFIG_DUMMY_CONSOLE=y # CONFIG_FRAMEBUFFER_CONSOLE is not set # # Logo configuration # # CONFIG_LOGO is not set # # Sound # CONFIG_SOUND=m # # Advanced Linux Sound Architecture # # CONFIG_SND is not set # # Open Sound System # # CONFIG_SOUND_PRIME is not set # # USB support # CONFIG_USB=m # CONFIG_USB_DEBUG is not set # # Miscellaneous USB options # CONFIG_USB_DEVICEFS=y # CONFIG_USB_BANDWIDTH is not set # CONFIG_USB_DYNAMIC_MINORS is not set # CONFIG_USB_SUSPEND is not set # CONFIG_USB_OTG is not set # # USB Host Controller Drivers # CONFIG_USB_EHCI_HCD=m # CONFIG_USB_EHCI_SPLIT_ISO is not set # CONFIG_USB_EHCI_ROOT_HUB_TT is not set # CONFIG_USB_OHCI_HCD is not set # CONFIG_USB_UHCI_HCD is not set # # USB Device Class drivers # CONFIG_USB_AUDIO=m # CONFIG_USB_BLUETOOTH_TTY is not set CONFIG_USB_MIDI=m CONFIG_USB_ACM=m CONFIG_USB_PRINTER=m CONFIG_USB_STORAGE=m # CONFIG_USB_STORAGE_DEBUG is not set # CONFIG_USB_STORAGE_RW_DETECT is not set CONFIG_USB_STORAGE_DATAFAB=y CONFIG_USB_STORAGE_FREECOM=y CONFIG_USB_STORAGE_ISD200=y CONFIG_USB_STORAGE_DPCM=y CONFIG_USB_STORAGE_HP8200e=y CONFIG_USB_STORAGE_SDDR09=y CONFIG_USB_STORAGE_SDDR55=y CONFIG_USB_STORAGE_JUMPSHOT=y # # USB Human Interface Devices (HID) # CONFIG_USB_HID=m CONFIG_USB_HIDINPUT=y # CONFIG_HID_FF is not set # CONFIG_USB_HIDDEV is not set # # USB HID Boot Protocol drivers # # CONFIG_USB_KBD is not set # CONFIG_USB_MOUSE is not set CONFIG_USB_AIPTEK=m CONFIG_USB_WACOM=m CONFIG_USB_KBTAB=m CONFIG_USB_POWERMATE=m # CONFIG_USB_MTOUCH is not set # CONFIG_USB_EGALAX is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set # # USB Imaging devices # CONFIG_USB_MDC800=m CONFIG_USB_MICROTEK=m CONFIG_USB_HPUSBSCSI=m # # USB Multimedia devices # CONFIG_USB_DABUSB=m CONFIG_USB_VICAM=m CONFIG_USB_DSBR=m CONFIG_USB_IBMCAM=m CONFIG_USB_KONICAWC=m CONFIG_USB_OV511=m CONFIG_USB_SE401=m # CONFIG_USB_SN9C102 is not set 
CONFIG_USB_STV680=m # # USB Network adaptors # CONFIG_USB_CATC=m CONFIG_USB_KAWETH=m CONFIG_USB_PEGASUS=m CONFIG_USB_RTL8150=m CONFIG_USB_USBNET=m # # USB Host-to-Host Cables # CONFIG_USB_ALI_M5632=y CONFIG_USB_AN2720=y CONFIG_USB_BELKIN=y CONFIG_USB_GENESYS=y CONFIG_USB_NET1080=y CONFIG_USB_PL2301=y # # Intelligent USB Devices/Gadgets # CONFIG_USB_ARMLINUX=y CONFIG_USB_EPSON2888=y CONFIG_USB_ZAURUS=y CONFIG_USB_CDCETHER=y # # USB Network Adapters # CONFIG_USB_AX8817X=y # # USB port drivers # CONFIG_USB_USS720=m # # USB Serial Converter support # CONFIG_USB_SERIAL=m CONFIG_USB_SERIAL_GENERIC=y CONFIG_USB_SERIAL_BELKIN=m CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m CONFIG_USB_SERIAL_EMPEG=m CONFIG_USB_SERIAL_FTDI_SIO=m CONFIG_USB_SERIAL_VISOR=m CONFIG_USB_SERIAL_IPAQ=m CONFIG_USB_SERIAL_IR=m CONFIG_USB_SERIAL_EDGEPORT=m CONFIG_USB_SERIAL_EDGEPORT_TI=m CONFIG_USB_SERIAL_KEYSPAN_PDA=m CONFIG_USB_SERIAL_KEYSPAN=m # CONFIG_USB_SERIAL_KEYSPAN_MPR is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set CONFIG_USB_SERIAL_KEYSPAN_USA28X=y CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y # CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set # CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set CONFIG_USB_SERIAL_KEYSPAN_USA19W=y CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y CONFIG_USB_SERIAL_KEYSPAN_USA49W=y CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y CONFIG_USB_SERIAL_KLSI=m CONFIG_USB_SERIAL_KOBIL_SCT=m CONFIG_USB_SERIAL_MCT_U232=m CONFIG_USB_SERIAL_PL2303=m # CONFIG_USB_SERIAL_SAFE is not set CONFIG_USB_SERIAL_CYBERJACK=m CONFIG_USB_SERIAL_XIRCOM=m CONFIG_USB_SERIAL_OMNINET=m CONFIG_USB_EZUSB=y # # USB Miscellaneous drivers # # CONFIG_USB_EMI62 is not set # CONFIG_USB_EMI26 is not set CONFIG_USB_TIGL=m CONFIG_USB_AUERSWALD=m CONFIG_USB_RIO500=m # CONFIG_USB_LEGOTOWER is not set CONFIG_USB_LCD=m # CONFIG_USB_LED is not set # CONFIG_USB_CYTHERM is not set CONFIG_USB_SPEEDTOUCH=m # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_TEST is not set # # USB Gadget 
Support # # CONFIG_USB_GADGET is not set # # File systems # CONFIG_EXT2_FS=y # CONFIG_EXT2_FS_XATTR is not set CONFIG_EXT3_FS=m CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y # CONFIG_EXT3_FS_SECURITY is not set CONFIG_JBD=m # CONFIG_JBD_DEBUG is not set CONFIG_FS_MBCACHE=y CONFIG_REISERFS_FS=m # CONFIG_REISERFS_CHECK is not set CONFIG_REISERFS_PROC_INFO=y # CONFIG_REISERFS_FS_XATTR is not set CONFIG_JFS_FS=m # CONFIG_JFS_POSIX_ACL is not set CONFIG_JFS_DEBUG=y # CONFIG_JFS_STATISTICS is not set CONFIG_FS_POSIX_ACL=y # CONFIG_XFS_FS is not set CONFIG_MINIX_FS=m CONFIG_ROMFS_FS=m CONFIG_QUOTA=y # CONFIG_QFMT_V1 is not set CONFIG_QFMT_V2=y CONFIG_QUOTACTL=y CONFIG_AUTOFS_FS=m CONFIG_AUTOFS4_FS=m # # CD-ROM/DVD Filesystems # CONFIG_ISO9660_FS=y CONFIG_JOLIET=y CONFIG_ZISOFS=y CONFIG_ZISOFS_FS=y CONFIG_UDF_FS=m CONFIG_UDF_NLS=y # # DOS/FAT/NT Filesystems # CONFIG_FAT_FS=m CONFIG_MSDOS_FS=m CONFIG_VFAT_FS=m CONFIG_FAT_DEFAULT_CODEPAGE=437 CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" # CONFIG_NTFS_FS is not set # # Pseudo filesystems # CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y # CONFIG_DEVFS_FS is not set # CONFIG_DEVPTS_FS_XATTR is not set CONFIG_TMPFS=y # CONFIG_HUGETLBFS is not set # CONFIG_HUGETLB_PAGE is not set CONFIG_RAMFS=y # # Miscellaneous filesystems # # CONFIG_ADFS_FS is not set # CONFIG_AFFS_FS is not set CONFIG_HFS_FS=m CONFIG_HFSPLUS_FS=m CONFIG_BEFS_FS=m # CONFIG_BEFS_DEBUG is not set CONFIG_BFS_FS=m # CONFIG_EFS_FS is not set CONFIG_JFFS_FS=m CONFIG_JFFS_FS_VERBOSE=0 CONFIG_JFFS_PROC_FS=y CONFIG_JFFS2_FS=m CONFIG_JFFS2_FS_DEBUG=0 # CONFIG_JFFS2_FS_NAND is not set # CONFIG_JFFS2_COMPRESSION_OPTIONS is not set CONFIG_JFFS2_ZLIB=y CONFIG_JFFS2_RTIME=y # CONFIG_JFFS2_RUBIN is not set CONFIG_CRAMFS=m CONFIG_VXFS_FS=m # CONFIG_HPFS_FS is not set # CONFIG_QNX4FS_FS is not set CONFIG_SYSV_FS=m CONFIG_UFS_FS=m # CONFIG_UFS_FS_WRITE is not set # # Network File Systems # CONFIG_NFS_FS=m CONFIG_NFS_V3=y # CONFIG_NFS_V4 is not set CONFIG_NFS_DIRECTIO=y 
CONFIG_NFSD=m CONFIG_NFSD_V3=y # CONFIG_NFSD_V4 is not set CONFIG_NFSD_TCP=y CONFIG_LOCKD=m CONFIG_LOCKD_V4=y CONFIG_EXPORTFS=m CONFIG_SUNRPC=m # CONFIG_RPCSEC_GSS_KRB5 is not set # CONFIG_RPCSEC_GSS_SPKM3 is not set CONFIG_SMB_FS=m # CONFIG_SMB_NLS_DEFAULT is not set # CONFIG_CIFS is not set CONFIG_NCP_FS=m CONFIG_NCPFS_PACKET_SIGNING=y CONFIG_NCPFS_IOCTL_LOCKING=y CONFIG_NCPFS_STRONG=y CONFIG_NCPFS_NFS_NS=y CONFIG_NCPFS_OS2_NS=y CONFIG_NCPFS_SMALLDOS=y CONFIG_NCPFS_NLS=y CONFIG_NCPFS_EXTRAS=y CONFIG_CODA_FS=m # CONFIG_CODA_FS_OLD_API is not set CONFIG_AFS_FS=m CONFIG_RXRPC=m # # Partition Types # CONFIG_PARTITION_ADVANCED=y # CONFIG_ACORN_PARTITION is not set CONFIG_OSF_PARTITION=y # CONFIG_AMIGA_PARTITION is not set # CONFIG_ATARI_PARTITION is not set CONFIG_MAC_PARTITION=y CONFIG_MSDOS_PARTITION=y CONFIG_BSD_DISKLABEL=y CONFIG_MINIX_SUBPARTITION=y CONFIG_SOLARIS_X86_PARTITION=y CONFIG_UNIXWARE_DISKLABEL=y # CONFIG_LDM_PARTITION is not set CONFIG_SGI_PARTITION=y # CONFIG_ULTRIX_PARTITION is not set CONFIG_SUN_PARTITION=y # CONFIG_EFI_PARTITION is not set # # Native Language Support # CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" CONFIG_NLS_CODEPAGE_437=m CONFIG_NLS_CODEPAGE_737=m CONFIG_NLS_CODEPAGE_775=m CONFIG_NLS_CODEPAGE_850=m CONFIG_NLS_CODEPAGE_852=m CONFIG_NLS_CODEPAGE_855=m CONFIG_NLS_CODEPAGE_857=m CONFIG_NLS_CODEPAGE_860=m CONFIG_NLS_CODEPAGE_861=m CONFIG_NLS_CODEPAGE_862=m CONFIG_NLS_CODEPAGE_863=m CONFIG_NLS_CODEPAGE_864=m CONFIG_NLS_CODEPAGE_865=m CONFIG_NLS_CODEPAGE_866=m CONFIG_NLS_CODEPAGE_869=m CONFIG_NLS_CODEPAGE_936=m CONFIG_NLS_CODEPAGE_950=m CONFIG_NLS_CODEPAGE_932=m CONFIG_NLS_CODEPAGE_949=m CONFIG_NLS_CODEPAGE_874=m CONFIG_NLS_ISO8859_8=m CONFIG_NLS_CODEPAGE_1250=m CONFIG_NLS_CODEPAGE_1251=m # CONFIG_NLS_ASCII is not set CONFIG_NLS_ISO8859_1=m CONFIG_NLS_ISO8859_2=m CONFIG_NLS_ISO8859_3=m CONFIG_NLS_ISO8859_4=m CONFIG_NLS_ISO8859_5=m CONFIG_NLS_ISO8859_6=m CONFIG_NLS_ISO8859_7=m CONFIG_NLS_ISO8859_9=m CONFIG_NLS_ISO8859_13=m 
CONFIG_NLS_ISO8859_14=m CONFIG_NLS_ISO8859_15=m CONFIG_NLS_KOI8_R=m CONFIG_NLS_KOI8_U=m CONFIG_NLS_UTF8=m # # Profiling support # CONFIG_PROFILING=y CONFIG_OPROFILE=m # # Kernel hacking # CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y # CONFIG_DEBUG_SLAB is not set # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_HIGHMEM is not set # CONFIG_DEBUG_INFO is not set # CONFIG_FRAME_POINTER is not set CONFIG_EARLY_PRINTK=y # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_KPROBES is not set # CONFIG_DEBUG_STACK_USAGE is not set # CONFIG_DEBUG_PAGEALLOC is not set # CONFIG_4KSTACKS is not set # CONFIG_SCHEDSTATS is not set CONFIG_X86_FIND_SMP_CONFIG=y CONFIG_X86_MPPARSE=y # # Security options # # CONFIG_SECURITY is not set # # Cryptographic options # CONFIG_CRYPTO=y CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_NULL=m CONFIG_CRYPTO_MD4=m CONFIG_CRYPTO_MD5=m CONFIG_CRYPTO_SHA1=m CONFIG_CRYPTO_SHA256=m CONFIG_CRYPTO_SHA512=m # CONFIG_CRYPTO_WP512 is not set CONFIG_CRYPTO_DES=m CONFIG_CRYPTO_BLOWFISH=m # CONFIG_CRYPTO_TWOFISH is not set CONFIG_CRYPTO_SERPENT=m # CONFIG_CRYPTO_AES_586 is not set CONFIG_CRYPTO_CAST5=m # CONFIG_CRYPTO_CAST6 is not set # CONFIG_CRYPTO_TEA is not set # CONFIG_CRYPTO_ARC4 is not set # CONFIG_CRYPTO_KHAZAD is not set CONFIG_CRYPTO_DEFLATE=m # CONFIG_CRYPTO_MICHAEL_MIC is not set # CONFIG_CRYPTO_CRC32C is not set # CONFIG_CRYPTO_TEST is not set # # Library routines # CONFIG_CRC_CCITT=m CONFIG_CRC32=y # CONFIG_LIBCRC32C is not set CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m CONFIG_X86_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y CONFIG_PC=y --Multipart=_Sat__4_Jun_2005_19_51_22_-0700_Kp/TSOvd/GHsKqPd-- From herbert@gondor.apana.org.au Sun Jun 5 01:03:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 01:03:29 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j5583LXq014420 for ; Sun, 5 
Jun 2005 01:03:22 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1Deq5A-0001D7-00; Sun, 05 Jun 2005 18:02:08 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1Deq54-00013u-00; Sun, 05 Jun 2005 18:02:02 +1000 From: Herbert Xu To: akpm@osdl.org (Andrew Morton) Subject: Re: Fw: PROBLEM: tcp_output.c bug Cc: netdev@oss.sgi.com, rommer@active.by Organization: Core In-Reply-To: <20050604195122.6a07abc7.akpm@osdl.org> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Sun, 05 Jun 2005 18:02:02 +1000 X-archive-position: 2111 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 406 Lines: 12 Andrew Morton wrote: > > [3.] sh scripts/ver_linux > Linux us401.activeby.net 2.6.9 #4 SMP Fri Apr 22 16:46:30 EEST 2005 i686 i686 > i386 GNU/Linux This bug was fixed in 2.6.11. 
-- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu 许志壬 Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From manfred@colorfullife.com Sun Jun 5 08:37:29 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 08:37:33 -0700 (PDT) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j55FbRXq016711 for ; Sun, 5 Jun 2005 08:37:29 -0700 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.13.3/8.13.3/Debian-6) with ESMTP id j55Fc30h032241; Sun, 5 Jun 2005 17:38:04 +0200 Message-ID: <42A31BEB.7030900@colorfullife.com> Date: Sun, 05 Jun 2005 17:36:11 +0200 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.7) Gecko/20050417 Fedora/1.7.7-1.3.1 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: AAbdulla@nvidia.com, Netdev Subject: [PATCH] forcedeth: add two new pci ids Content-Type: multipart/mixed; boundary="------------060003070105090106070501" X-archive-position: 2112 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev Content-Length: 2536 Lines: 77 This is a multi-part message in MIME format. --------------060003070105090106070501 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi Jeff, Ayaz wrote a patch that adds two new PCI IDs to the forcedeth driver. Could you add it to your tree? I'm not sure whether it's worth sneaking it into 2.6.12, but it looks to be obviously correct (tm). 
-- Manfred Signed-Off-By: Manfred Spraul --------------060003070105090106070501 Content-Type: text/plain; name="patch-forcedeth-mcp51" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-forcedeth-mcp51" --- 2.6/drivers/net/forcedeth.c 2005-05-16 19:45:54.000000000 +0200 +++ build-2.6/drivers/net/forcedeth.c 2005-05-16 19:52:59.000000000 +0200 @@ -82,6 +82,7 @@ * 0.31: 14 Nov 2004: ethtool support for getting/setting link * capabilities. * 0.32: 16 Apr 2005: RX_ERROR4 handling added. + * 0.33: 16 Mai 2005: Support for MCP51 added. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -93,7 +94,7 @@ * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few * superfluous timer interrupts from the nic. */ -#define FORCEDETH_VERSION "0.32" +#define FORCEDETH_VERSION "0.33" #define DRV_NAME "forcedeth" #include @@ -1998,7 +1999,9 @@ /* handle different descriptor versions */ if (pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_1 || pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_2 || - pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3) + pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_3 || + pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_12 || + pci_dev->device == PCI_DEVICE_ID_NVIDIA_NVENET_13) np->desc_ver = DESC_VER_1; else np->desc_ver = DESC_VER_2; @@ -2256,6 +2259,20 @@ .subdevice = PCI_ANY_ID, .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ, }, + { /* MCP51 Ethernet Controller */ + .vendor = PCI_VENDOR_ID_NVIDIA, + .device = PCI_DEVICE_ID_NVIDIA_NVENET_12, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ, + }, + { /* MCP51 Ethernet Controller */ + .vendor = PCI_VENDOR_ID_NVIDIA, + .device = PCI_DEVICE_ID_NVIDIA_NVENET_13, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ, + }, {0,}, }; 
--------------060003070105090106070501-- From davem@davemloft.net Sun Jun 5 13:13:28 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 13:13:38 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j55KDRXq001717 for ; Sun, 5 Jun 2005 13:13:27 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Df1T8-0000e7-JU; Sun, 05 Jun 2005 13:11:38 -0700 Date: Sun, 05 Jun 2005 13:11:38 -0700 (PDT) Message-Id: <20050605.131138.21611278.davem@davemloft.net> To: mchan@broadcom.com Cc: buytenh@wantstofly.org, mitch.a.williams@intel.com, hadi@cyberus.ca, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: <1117830922.4430.44.camel@rh4> References: <1117828169.4430.29.camel@rh4> <20050603205944.GC20623@xi.wantstofly.org> <1117830922.4430.44.camel@rh4> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2113 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1265 Lines: 30 From: "Michael Chan" Date: Fri, 03 Jun 2005 13:35:22 -0700 > I agree on the merit of issuing only one IO at the end. What I'm saying > is that doing so will make it similar to e1000 where all the buffers are > replenished at the end. Isn't that so or am I missing something? You're totally right. 
I guess we don't see the e1000 behavior due to any of the following:

1) we set the RX ring sizes larger by default
2) we set it larger than what the e1000 tests were done with
3) we process the RX ring faster and thus the chip can't catch up
   and exhaust the ring

We use a default of 200 in tg3, and e1000 seems to use a default of 256.

This actually points more to the fact that what you're actually doing to process the packet has a huge influence on whether the chip can catch up and exhaust the RX ring.

How much software work does the netif_receive_skb() call entail, on average, for the given workload? That is why the exact test being run is important in analyzing reports such as these.

If you're doing a TCP transfer, then netif_receive_skb() can be _VERY_ expensive per-call. If, on the other hand, you're routing tiny 64-byte packets or responding to simple ICMP echo requests, the per-call cost can be significantly lower.

From glen.turner@aarnet.edu.au Sun Jun 5 13:30:42 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 13:30:46 -0700 (PDT) Received: from clix.aarnet.edu.au (clix.aarnet.edu.au [192.94.63.10]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j55KUfXq002968 for ; Sun, 5 Jun 2005 13:30:42 -0700 Received: from [202.158.193.5] (andromache.adelaide.aarnet.edu.au [202.158.193.5]) (authenticated bits=0) by clix.aarnet.edu.au (8.12.8/8.12.8) with ESMTP id j55KTUpg008271 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Mon, 6 Jun 2005 06:29:31 +1000 Message-ID: <42A360A0.1040902@aarnet.edu.au> Date: Mon, 06 Jun 2005 05:59:20 +0930 From: Glen Turner Organization: Australia's Academic & Research Network User-Agent: Mozilla Thunderbird 1.0.2-1.3.3 (X11/20050513) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Andy Fleming CC: Stephen Hemminger , Netdev , Kumar Gala Subject: Re: RFC: PHY Abstraction Layer II References: <1107b64b01fb8e9a6c84359bb56881a6@freescale.com> <20050531105939.7486e071@dxpl.pdx.osdl.net>
<92F1428A-0B26-428B-8C06-35C7E5B9EEE3@freescale.com> <20050601144123.2bc11c06@dxpl.pdx.osdl.net> <9A2D608A-D818-455B-96F4-ED42413556C0@freescale.com> In-Reply-To: <9A2D608A-D818-455B-96F4-ED42413556C0@freescale.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-MDSA: Yes X-Scanned-By: MIMEDefang 2.39 X-archive-position: 2114 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: glen.turner@aarnet.edu.au Precedence: bulk X-list: netdev Content-Length: 405 Lines: 12 Operationally, it would be very useful if the PHY printed the physical interface detail when detected (1000Base-LX, etc). Also, it would be nice to be able to retrieve PHY data independent of the interface status (eg, to retrieve asset serial numbers, GBIC make/models, etc). -- Glen Turner Tel: (08) 8303 3936 or +61 8 8303 3936 Australia's Academic & Research Network www.aarnet.edu.au From davem@davemloft.net Sun Jun 5 14:38:08 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 14:38:17 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j55Lc7Xq005931 for ; Sun, 5 Jun 2005 14:38:08 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1Df2nd-0001Uq-UL; Sun, 05 Jun 2005 14:36:53 -0700 Date: Sun, 05 Jun 2005 14:36:53 -0700 (PDT) Message-Id: <20050605.143653.75191476.davem@davemloft.net> To: mchan@broadcom.com Cc: hadi@cyberus.ca, buytenh@wantstofly.org, mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. 
Miller" In-Reply-To: <1117844736.4430.51.camel@rh4> References: <1117830922.4430.44.camel@rh4> <1117837798.6266.25.camel@localhost.localdomain> <1117844736.4430.51.camel@rh4> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2115 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 4378 Lines: 126 To illustrate my most recent point (that packet processing cost on RX is variable, and at times highly so) I made some hacks to the tg3 driver to record how many system clock ticks each netif_receive_skb() call consumed. This clock on my sparc64 box updates at a rate of 12MHZ and is used for system time keeping. Anyways, here is a log from a stream transfer to this system. So the packet trace is heavily TCP receive bound. Here is a sample from this. 
I take a tick sample before the netif_receive_skb() call, take one afterwards, and record the difference between the two:

[ 52 73 41 65 38 61 58 63 37 62 36 62 50 74 38 64 ]
[ 37 63 39 62 36 64 36 61 50 75 38 64 39 65 37 62 ]
[ 36 60 36 62 50 76 39 67 38 63 35 62 35 64 35 62 ]
[ 62 74 41 65 37 62 37 63 36 61 39 62 52 75 38 66 ]
[ 37 63 35 61 38 62 36 60 49 75 38 64 37 62 36 66 ]
[ 42 62 36 62 48 76 38 64 35 62 40 63 36 60 36 63 ]
[ 49 76 36 64 35 64 38 64 37 61 36 62 60 74 37 80 ]
[ 43 69 36 65 36 62 37 62 54 77 42 66 37 64 35 60 ]
[ 36 61 38 62 51 75 40 64 35 62 36 61 37 61 39 61 ]
[ 51 76 38 64 35 63 36 63 38 62 37 63 49 76 39 64 ]
[ 35 64 35 64 38 62 36 62 61 85 42 65 38 79 38 62 ]
[ 36 61 35 64 49 77 37 63 38 64 36 60 37 62 36 60 ]
[ 51 76 38 66 38 62 37 63 36 62 37 60 50 77 41 64 ]
[ 36 60 36 60 36 61 37 61 50 78 39 66 37 63 36 62 ]
[ 36 61 39 63 60 74 38 66 37 61 35 63 37 65 36 65 ]
[ 48 76 38 65 36 64 41 64 36 60 35 61 49 76 39 66 ]
[ 36 64 39 60 37 60 36 59 51 73 37 64 40 64 36 62 ]
[ 37 61 35 62 50 78 39 67 38 63 35 61 36 63 36 61 ]
[ 66 75 41 66 37 65 36 61 36 62 38 63 50 75 38 65 ]
[ 37 63 36 62 38 63 36 63 49 76 38 64 38 63 40 64 ]
[ 35 63 36 60 50 74 39 65 37 65 38 62 36 62 36 60 ]
[ 51 75 37 66 39 65 37 62 37 62 38 61 67 72 39 65 ]
[ 37 62 35 61 37 61 54 63 53 75 42 67 35 63 36 61 ]
[ 36 65 39 62 53 75 38 64 36 63 35 62 38 63 36 61 ]
[ 49 77 39 66 38 62 36 62 38 61 35 59 83 91 77 25 ]
[ 22 22 22 24 21 21 21 20 21 35 67 24 50 47 67 39 ]
[ 65 34 65 36 63 65 74 38 64 35 64 37 63 37 62 36 ]
[ 61 51 75 38 67 39 63 35 64 37 62 36 61 50 74 37 ]
[ 66 37 62 35 63 35 61 36 65 52 76 40 65 38 61 37 ]
[ 62 36 61 40 64 63 71 40 62 36 64 36 63 36 61 39 ]
[ 62 49 76 37 65 36 62 36 61 38 65 41 64 50 75 39 ]
[ 67 37 62 37 63 36 62 38 61 69 153 70 140 200 737 67 ]

Notice how the packet trail seems to bounce back and forth between taking ~30 ticks to taking ~60 ticks? The ~60 tick packets are the TCP data packets that make us output an ACK packet.
So this makes it cost double what it takes to process a TCP data packet for which we do not immediately generate an ACK.

It pretty much shows that we need to have something other than a blank "COUNT" to represent the NAPI weight, and we should instead try to measure the real "work" actually consumed, via some time measurement and limit, to implement this stuff properly.

BTW, here is the patch implementing this stuff.

--- ./drivers/net/tg3.c.~1~	2005-06-03 11:13:14.000000000 -0700
+++ ./drivers/net/tg3.c	2005-06-05 14:16:32.000000000 -0700
@@ -2836,7 +2836,17 @@ static int tg3_rx(struct tg3 *tp, int bu
 				desc->err_vlan & RXD_VLAN_MASK);
 		} else
 #endif
+		{
+			unsigned long t = get_cycles();
+			unsigned int ent;
+
 			netif_receive_skb(skb);
+			t = get_cycles() - t;
+
+			ent = tp->rx_log_ent;
+			tp->rx_log[ent] = (u32) t;
+			tp->rx_log_ent = ((ent + 1) & RX_LOG_MASK);
+		}

 		tp->dev->last_rx = jiffies;
 		received++;
@@ -6609,6 +6619,28 @@ static struct net_device_stats *tg3_get_
 	stats->rx_crc_errors = old_stats->rx_crc_errors +
 		calc_crc_errors(tp);

+	/* XXX Yes, I know, do this right. :-) */
+	{
+		unsigned int ent, pos;
+
+		printk("TG3: RX LOG, current ent[%d]\n", tp->rx_log_ent);
+		ent = (tp->rx_log_ent - 512) & RX_LOG_MASK;
+		pos = 0;
+		while (ent != tp->rx_log_ent) {
+			if (!pos) printk("[ ");
+
+			printk("%u ", tp->rx_log[ent]);
+
+			if (++pos >= 16) {
+				printk("]\n");
+				pos = 0;
+			}
+			ent = (ent + 1) & RX_LOG_MASK;
+		}
+		if (pos != 0)
+			printk("]\n");
+	}
+
 	return stats;
 }

--- ./drivers/net/tg3.h.~1~	2005-06-03 11:13:14.000000000 -0700
+++ ./drivers/net/tg3.h	2005-06-05 14:16:00.000000000 -0700
@@ -2232,6 +2232,11 @@ struct tg3 {
 #define SST_25VF0X0_PAGE_SIZE 4098

 	struct ethtool_coalesce coal;
+
+#define RX_LOG_SIZE (1 << 14)
+#define RX_LOG_MASK (RX_LOG_SIZE - 1)
+	unsigned int rx_log_ent;
+	u32 rx_log[RX_LOG_SIZE];
 };

 #endif /* !(_T3_H) */

From davem@davemloft.net Sun Jun 5 23:02:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 23:02:39 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j5662ZXq017763 for ; Sun, 5 Jun 2005 23:02:35 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DfAfz-0000qY-5z; Sun, 05 Jun 2005 23:01:31 -0700 Date: Sun, 05 Jun 2005 23:01:31 -0700 (PDT) Message-Id: <20050605.230131.78711491.davem@davemloft.net> To: jgarzik@pobox.com Cc: netdev@oss.sgi.com, mchan@broadcom.com Subject: Re: [PATCH]: Tigon3 new NAPI locking v2 From: "David S.
Miller" In-Reply-To: <42A0BC2B.4020409@pobox.com> References: <20050603.122558.88474819.davem@davemloft.net> <42A0BC2B.4020409@pobox.com> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2116 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1525 Lines: 45

From: Jeff Garzik
Date: Fri, 03 Jun 2005 16:23:07 -0400

> overall, pretty spiffy :)

Thanks.

> As further work, I would like to see how much (a lot? all?) of the timer
> code could be moved into a workqueue, where we could kill the last of
> the horrible-udelay loops in the driver. Particularly awful is
>
> 	while (++tick < 195000) {
> 		status = tg3_fiber_aneg_smachine(tp, &aninfo);
> 		if (status == ANEG_DONE || status == ANEG_FAILED)
> 			break;
>
> 		udelay(1);
> 	}

I know :).

> * This loop makes me nervous... If there's a fault on the PCI bus or
> the hardware is unplugged, val will equal 0xffffffff.

I agree, if the chip wedges for whatever reason and stops receiving interrupts, we will totally lock up here. I'll add a timeout to the final version. Remind me if I don't :)

> * A few comments for normal humans like "force an interrupt" and "wait
> for interrupt handler to complete" might be nice.

Ok.

> * a BUG_ON(if-interrupts-are-disabled) line might be nice

Which interrupts? Local cpu interrupts? Tigon3 chip interrupts?

> Rather than an 'irq_sync' arg, my instinct would have been to create
> tg3_full_lock() and tg3_full_lock_sync(). This makes the action -much-
> more obvious to the reader, and since it's inline doesn't cost anything
> (compiler's optimizer even does a tiny bit less work my way).

This doesn't sound like a bad idea either.

Thanks for the feedback Jeff.
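
The wrapper idea from the last quoted paragraph can be sketched in user space as follows. This is only an illustration under stated assumptions: struct demo_dev and every demo_* name are hypothetical stand-ins, not the tg3 driver's code, and the real functions would take the driver's spinlocks and wait for the interrupt handler instead of flipping fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; NOT the real struct tg3. */
struct demo_dev {
	bool locked;       /* models "the driver's locks are held" */
	int irq_quiesced;  /* counts waits for the interrupt handler */
};

/* The flag-taking form that the two named wrappers hide. */
static void demo_lock_common(struct demo_dev *dev, bool irq_sync)
{
	assert(!dev->locked);        /* this model forbids recursive locking */
	if (irq_sync)
		dev->irq_quiesced++; /* real code would wait for the handler here */
	dev->locked = true;
}

/* Take the lock without waiting for the interrupt handler. */
static inline void demo_full_lock(struct demo_dev *dev)
{
	demo_lock_common(dev, false);
}

/* Take the lock and also wait for the interrupt handler to finish. */
static inline void demo_full_lock_sync(struct demo_dev *dev)
{
	demo_lock_common(dev, true);
}

static inline void demo_full_unlock(struct demo_dev *dev)
{
	assert(dev->locked);
	dev->locked = false;
}
```

A call site then reads demo_full_lock_sync(&dev) rather than passing a bare 0/1 argument, which is the readability gain being argued for; since the wrappers are inline, they should compile down to the same code as the flag-taking form.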
From yi.zhu@intel.com Sun Jun 5 23:33:50 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 23:33:53 -0700 (PDT) Received: from fmsfmr002.fm.intel.com (fmr14.intel.com [192.55.52.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j566XoXq020272 for ; Sun, 5 Jun 2005 23:33:50 -0700 Received: from fmsfmr100.fm.intel.com (fmsfmr100.fm.intel.com [10.1.192.58]) by fmsfmr002.fm.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j566WmVj029556; Mon, 6 Jun 2005 06:32:48 GMT Received: from fmsmsxvs043.fm.intel.com (fmsmsxvs043.fm.intel.com [132.233.42.129]) by fmsfmr100.fm.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j566WSbt023446; Mon, 6 Jun 2005 06:32:48 GMT Received: from debian.sh.intel.com ([172.16.219.38]) by fmsmsxvs043.fm.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060523324620993 ; Sun, 05 Jun 2005 23:32:47 -0700 Subject: Re: [3/9] ieee80211: fix ipw 64bit compilation warnings From: Zhu Yi To: Jiri Benc Cc: NetDev , Jeff Garzik , Jirka Bohac In-Reply-To: <20050603183048.7786f98b@griffin.suse.cz> References: <20050603182625.64d33be3@griffin.suse.cz> <20050603183048.7786f98b@griffin.suse.cz> Content-Type: text/plain Organization: Intel Corp. 
Date: Mon, 06 Jun 2005 14:29:52 +0800 Message-Id: <1118039392.5702.30.camel@debian.sh.intel.com> Mime-Version: 1.0 X-Mailer: Evolution 2.2.2 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.44 X-archive-position: 2118 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yi.zhu@intel.com Precedence: bulk X-list: netdev Content-Length: 354 Lines: 13 On Fri, 2005-06-03 at 18:30 +0200, Jiri Benc wrote: > @@ -508,7 +508,7 @@ > /* verify we have enough room to store the value */ > if (*len < sizeof(u32)) { > IPW_DEBUG_ORD("ordinal buffer length too small, " > - "need %d\n", sizeof(u32)); > + "need %d\n", (int)sizeof(u32)); ("%zd", sizeof()) should be better. Thanks, -yi From davem@davemloft.net Sun Jun 5 23:44:22 2005 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Jun 2005 23:44:32 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j566iMXq021171 for ; Sun, 5 Jun 2005 23:44:22 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DfBKG-0001OG-0R; Sun, 05 Jun 2005 23:43:08 -0700 Date: Sun, 05 Jun 2005 23:43:07 -0700 (PDT) Message-Id: <20050605.234307.92584592.davem@davemloft.net> To: mchan@broadcom.com Cc: hadi@cyberus.ca, buytenh@wantstofly.org, mitch.a.williams@intel.com, john.ronciak@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. 
Miller" In-Reply-To: <20050605.143653.75191476.davem@davemloft.net> References: <1117837798.6266.25.camel@localhost.localdomain> <1117844736.4430.51.camel@rh4> <20050605.143653.75191476.davem@davemloft.net> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2119 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 5307 Lines: 190

From: "David S. Miller"
Date: Sun, 05 Jun 2005 14:36:53 -0700 (PDT)

> BTW, here is the patch implementing this stuff.

A new patch and some more data.

When we go to gigabit, and NAPI kicks in, the first RX packet costs a lot (cache misses etc.) but the rest are very efficient to process. I suspect this only holds for the single socket case, and on a real system processing many connections the cost drop might not be so clean.

The log output format is:

(TX_TICKS:RX_TICKS[ RX_TICK1 RX_TICK2 RX_TICK3 ... ])

Here is an example trace from a single socket TCP stream send over gigabit:

(9:112[ 26 8 7 8 7 ])
(6:110[ 23 8 8 8 7 ])
(7:57[ 26 8 ])
(6:117[ 25 8 9 7 7 ])
(5:37[ 26 ])
(6:113[ 28 8 7 8 7 ])
(0:20[ 9 ])
(8:111[ 27 7 7 8 7 ])
(5:109[ 25 8 8 8 7 ])
(8:113[ 25 7 8 9 7 ])
(6:108[ 25 8 7 7 7 ])
(8:88[ 26 8 8 7 ])
(6:109[ 25 7 7 7 7 ])
(6:111[ 25 9 8 7 7 ])
(0:48[ 9 5 ])

This kind of trace reiterates some things we already know. For example, mitigation (HW, SW, or a combination of both) helps because processing multiple packets lets us "reuse" the cpu cache priming that the handling of the first packet achieves for us.

It would be great to stick something like this into the e1000 driver, and get some output from it with Intel's single NIC performance degradation test case.
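
For anyone wanting to try the same logging in another driver, the log format described above can be sketched in user space as follows. The field names follow the patch in this message, but the plain C types, the poll_log_add_rx() and poll_log_format() helpers, and the use of snprintf() in place of printk() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define POLL_RX_SIZE 8

/* Same fields as the patch's struct tg3_poll_log_ent, with plain C types. */
struct poll_log_ent {
	unsigned short tx_ticks;
	unsigned short rx_ticks;
	unsigned short rx_cur_ent;            /* number of valid rx_ents[] */
	unsigned short rx_ents[POLL_RX_SIZE];
};

/* Record one per-packet sample; samples beyond POLL_RX_SIZE are dropped. */
static void poll_log_add_rx(struct poll_log_ent *lp, unsigned short ticks)
{
	if (lp->rx_cur_ent < POLL_RX_SIZE)
		lp->rx_ents[lp->rx_cur_ent++] = ticks;
}

/*
 * Format one entry as "(TX:RX[ t1 t2 ... ])".
 * Returns the string length, or -1 if buf is too small.
 */
static int poll_log_format(const struct poll_log_ent *lp, char *buf, size_t len)
{
	size_t off;
	int n, i;

	assert(lp != NULL && buf != NULL);

	n = snprintf(buf, len, "(%u:%u[ ",
		     (unsigned)lp->tx_ticks, (unsigned)lp->rx_ticks);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (i = 0; i < lp->rx_cur_ent; i++) {
		n = snprintf(buf + off, len - off, "%u ",
			     (unsigned)lp->rx_ents[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, "])");
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	return (int)(off + (size_t)n);
}
```

Feeding it tx_ticks 9, rx_ticks 112 and the five samples 26 8 7 8 7 gives back the first entry of the trace above. Capping the per-poll samples at POLL_RX_SIZE keeps each entry small and fixed-size, at the cost of dropping samples from longer polls.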
It is also necessary for the Intel folks to say whether the NIC is running out of RX descriptors in the single NIC case with dev->weight set to the default of 64. If so, does increasing the RX ring size to a larger value via ethtool help? If not, then why in the world are things running more slowly? I've got a crappy 1.5GHZ sparc64 box in my tg3 tests here, and it can handle gigabit line rate with much CPU to spare. So either Intel is doing something other than TCP stream tests, or something else is out of whack. I even tried to do things like having a memory touching program run in parallel with the TCP stream test, and this did not make the timing numbers in the logs increase much at all. --- ./drivers/net/tg3.c.~1~ 2005-06-03 11:13:14.000000000 -0700 +++ ./drivers/net/tg3.c 2005-06-05 23:21:11.000000000 -0700 @@ -2836,7 +2836,22 @@ static int tg3_rx(struct tg3 *tp, int bu desc->err_vlan & RXD_VLAN_MASK); } else #endif + { + unsigned long t = get_cycles(); + struct tg3_poll_log_ent *lp; + unsigned int ent; + netif_receive_skb(skb); + t = get_cycles() - t; + + ent = tp->poll_log_ent; + lp = &tp->poll_log[ent]; + ent = lp->rx_cur_ent; + if (ent < POLL_RX_SIZE) { + lp->rx_ents[ent] = (u16) t; + lp->rx_cur_ent = ent + 1; + } + } tp->dev->last_rx = jiffies; received++; @@ -2897,9 +2912,15 @@ static int tg3_poll(struct net_device *n /* run TX completion thread */ if (sblk->idx[0].tx_consumer != tp->tx_cons) { + unsigned long t; + spin_lock(&tp->tx_lock); + t = get_cycles(); tg3_tx(tp); + t = get_cycles() - t; spin_unlock(&tp->tx_lock); + + tp->poll_log[tp->poll_log_ent].tx_ticks = (u16) t; } spin_unlock_irqrestore(&tp->lock, flags); @@ -2911,16 +2932,28 @@ static int tg3_poll(struct net_device *n if (sblk->idx[0].rx_producer != tp->rx_rcb_ptr) { int orig_budget = *budget; int work_done; + unsigned long t; + unsigned int ent; if (orig_budget > netdev->quota) orig_budget = netdev->quota; + t = get_cycles(); work_done = tg3_rx(tp, orig_budget); + t = get_cycles() - t; + + 
ent = tp->poll_log_ent; + tp->poll_log[ent].rx_ticks = (u16) t; *budget -= work_done; netdev->quota -= work_done; } + tp->poll_log_ent = (tp->poll_log_ent + 1) & POLL_LOG_MASK; + tp->poll_log[tp->poll_log_ent].tx_ticks = 0; + tp->poll_log[tp->poll_log_ent].rx_ticks = 0; + tp->poll_log[tp->poll_log_ent].rx_cur_ent = 0; + if (tp->tg3_flags & TG3_FLAG_TAGGED_STATUS) tp->last_tag = sblk->status_tag; rmb(); @@ -6609,6 +6642,27 @@ static struct net_device_stats *tg3_get_ stats->rx_crc_errors = old_stats->rx_crc_errors + calc_crc_errors(tp); + /* XXX Yes, I know, do this right. :-) */ + { + unsigned int ent; + + printk("TG3: POLL LOG, current ent[%d]\n", tp->poll_log_ent); + ent = tp->poll_log_ent - (POLL_LOG_SIZE - 1); + ent &= POLL_LOG_MASK; + while (ent != tp->poll_log_ent) { + struct tg3_poll_log_ent *lp = &tp->poll_log[ent]; + int i; + + printk("(%u:%u[ ", + lp->tx_ticks, lp->rx_ticks); + for (i = 0; i < lp->rx_cur_ent; i++) + printk("%d ", lp->rx_ents[i]); + printk("])\n"); + + ent = (ent + 1) & POLL_LOG_MASK; + } + } + return stats; } --- ./drivers/net/tg3.h.~1~ 2005-06-03 11:13:14.000000000 -0700 +++ ./drivers/net/tg3.h 2005-06-05 23:21:05.000000000 -0700 @@ -2003,6 +2003,15 @@ struct tg3_ethtool_stats { u64 nic_tx_threshold_hit; }; +struct tg3_poll_log_ent { + u16 tx_ticks; + u16 rx_ticks; +#define POLL_RX_SIZE 8 +#define POLL_RX_MASK (POLL_RX_SIZE - 1) + u16 rx_cur_ent; + u16 rx_ents[POLL_RX_SIZE]; +}; + struct tg3 { /* begin "general, frequently-used members" cacheline section */ @@ -2232,6 +2241,11 @@ struct tg3 { #define SST_25VF0X0_PAGE_SIZE 4098 struct ethtool_coalesce coal; + +#define POLL_LOG_SIZE (1 << 7) +#define POLL_LOG_MASK (POLL_LOG_SIZE - 1) + unsigned int poll_log_ent; + struct tg3_poll_log_ent poll_log[POLL_LOG_SIZE]; }; #endif /* !(_T3_H) */ From hhh@imada.sdu.dk Mon Jun 6 02:36:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 02:36:14 -0700 (PDT) Received: from berlioz.imada.sdu.dk (berlioz.imada.sdu.dk [130.225.128.12]) 
by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j569a9Xq001968 for ; Mon, 6 Jun 2005 02:36:11 -0700 Received: from localhost (localhost [127.0.0.1]) by localhost.imada.sdu.dk (Postfix) with ESMTP id 4C3F262728 for ; Mon, 6 Jun 2005 11:35:07 +0200 (CEST) Received: from berlioz.imada.sdu.dk ([127.0.0.1]) by localhost (berlioz.imada.sdu.dk [127.0.0.1]) (amavisd-new, port 10025) with ESMTP id 28588-07 for ; Mon, 6 Jun 2005 09:35:06 +0000 (UTC) Received: from [139.91.76.186] (unknown [139.91.76.186]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) by berlioz.imada.sdu.dk (Postfix) with ESMTP id 183A462745 for ; Mon, 6 Jun 2005 11:35:06 +0200 (CEST) From: Hans Henrik Happe Subject: PROBLEM: High TCP latency User-Agent: KMail/1.7.2 MIME-Version: 1.0 To: netdev@oss.sgi.com Date: Mon, 6 Jun 2005 11:35:09 +0200 Content-Type: Multipart/Mixed; boundary="Boundary-00=_NjBpCAIVJaMD5eg" Message-Id: <200506061135.09869.hhh@imada.sdu.dk> X-archive-position: 2120 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hhh@imada.sdu.dk Precedence: bulk X-list: netdev Content-Length: 21992 Lines: 1011

--Boundary-00=_NjBpCAIVJaMD5eg
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Short: TCP puts the system into the idle state even though there is data in transit.

While coding a distributed application I discovered a TCP latency issue. The application does a lot of request forwarding, as P2P protocols do. I have tried to track down the problem and have written a small program (random-tcp.c) that shows the long latencies. In this program one message is passed around between a number of processes. Each time a process receives the message it randomly chooses a process to forward it to next. I have compared this to a program that doesn't show long latencies (ring-tcp.c). In this program each process always forwards to the same process (ring topology). I have also written the same programs using SCTP, and that protocol has no issue in the random case.

The following is a test with 16 processes forwarding the message 100000 times. The average forwarding time from process to process is measured.

$ ./random-tcp 16 100000
avg forwarding time: 0.000326
$ ./ring-tcp 16 100000
avg forwarding time: 0.000044
$ ./random-sctp 16 100000
avg forwarding time: 0.000068
$ ./ring-sctp 16 100000
avg forwarding time: 0.000067

Using 'top' I have observed that the system spends time in the idle state when running 'random-tcp'. I have observed this with just 3 processes. With 16 processes the CPU is only 20% loaded on my Mobile Intel(R) Celeron(R) CPU 1.60GHz. I have also tried with socketpair()s, which didn't have the problem. Therefore my conclusion is that it must be a TCP issue.

Now, this local use of TCP is not that useful. Therefore, I tried an MPI version and tested it in a 16 node cluster. Here the random case is 5 times slower than the ring.

I have tested many kernel versions from 2.4.25 up to 2.6.12-rc5 and all had this issue. A few people on lkml also confirmed it, but I have not had a reply from anyone with greater knowledge of the inner workings of Linux TCP (at least they didn't tell me that they had this knowledge :-).

I hope this is helpful.
Regards Hans Henrik Happe --Boundary-00=_NjBpCAIVJaMD5eg Content-Type: text/x-csrc; charset="us-ascii"; name="random-sctp.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="random-sctp.c" /* By Hans Henrik Happe * * compile: gcc -o random-sctp random-sctp.c -lsctp * * usage: random-sctp <# processes> <# forwards> */ #include #include #include #include #include #include #include #include #include #include double second() { struct timeval tv; struct timezone tz; double t; gettimeofday(&tv,&tz); t= (double)(tv.tv_sec)+(double)(tv.tv_usec/1.0e6); return t; } typedef struct { struct sockaddr sockadr; int len; } adr_t; int get_adr(adr_t *adr, int port) { int n; struct addrinfo hints, *res; char str[6]; memset(&hints, 0, sizeof(struct addrinfo)); hints.ai_flags = AI_PASSIVE; hints.ai_family = PF_UNSPEC; hints.ai_socktype = SOCK_STREAM; sprintf(str, "%d", port); n = getaddrinfo("localhost", str, &hints, &res); if (n != 0) { fprintf(stderr, "getaddrinfo error: [%s]\n", gai_strerror(n)); return -1; } memcpy(&adr->sockadr, res->ai_addr, sizeof(*res->ai_addr)); adr->len = sizeof(*res->ai_addr); freeaddrinfo(res); return 0; } int init_listen(int port) { int n, on=1; int sock; struct sockaddr_in name; sock = socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP); if (sock == -1) { perror("socket"); return -1; } name.sin_family = PF_INET; name.sin_port = htons (port); name.sin_addr.s_addr = htonl (INADDR_ANY); if (bind (sock, (struct sockaddr *) &name, sizeof (name)) == -1) { perror("bind"); return -1; } if (listen(sock, 10) == -1) { perror("listen"); return -1; } return sock; } int do_recv(int sock, void *buf, int n) { struct sockaddr sa; struct sctp_sndrcvinfo info; int slen, flags, res; slen = sizeof(sa); res = sctp_recvmsg(sock, buf, n, &sa, &slen, &info, &flags); if (res == -1) { perror("recv"); } if (res != n) { fprintf(stderr, "recv incomplete\n"); } return res; } int do_send(int sock, adr_t *adr, void *buf, int n) { int res; res = sctp_sendmsg(sock, buf, 
n, &adr->sockadr, adr->len, 666, MSG_ADDR_OVER, 0, 0, 444); if (res == -1) { perror("send"); } if (res != n) { fprintf(stderr, "send incomplete\n"); } return res; } int main(int argc, char *argv[]) { int i, cnt, pid, src, dest, its; int lsock; char id, rank, data; int port = 11100; double t0, t1; /* # processes */ cnt = atoi(argv[1]); /* # forwards */ its = atoi(argv[2]); { adr_t dests[cnt]; /* Create processes */ rank = 0; for (i=1; i <# forwards> */ #include #include #include #include #include #include #include #include #include #include double second() { struct timeval tv; struct timezone tz; double t; gettimeofday(&tv,&tz); t= (double)(tv.tv_sec)+(double)(tv.tv_usec/1.0e6); return t; } int do_connect(int port) { int n, sock, on=1; struct addrinfo hints, *res; char str[6]; void *adr; memset(&hints, 0, sizeof(struct addrinfo)); hints.ai_flags = AI_PASSIVE; hints.ai_family = PF_UNSPEC; hints.ai_socktype = SOCK_STREAM; sprintf(str, "%d", port); n = getaddrinfo("localhost", str, &hints, &res); if (n != 0) { fprintf(stderr, "getaddrinfo error: [%s]\n", gai_strerror(n)); return -1; } sock = socket(AF_INET, SOCK_STREAM, 0); if (sock == -1) { perror("socket"); return -1; } if (setsockopt(sock, SOL_TCP, TCP_NODELAY, &on, sizeof(on)) == -1) { perror("setsockopt"); return -1; } if (connect(sock, (struct sockaddr *)res->ai_addr, sizeof(*res->ai_addr)) == -1) { perror("connect"); return -1; } freeaddrinfo(res); return sock; } int start_listen(int port) { int n, on=1; int sock; struct sockaddr_in name; sock = socket(AF_INET, SOCK_STREAM, 0); if (sock == -1) { perror("socket"); return -1; } if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1) { perror("setsockopt"); return -1; } name.sin_family = AF_INET; name.sin_port = htons (port); name.sin_addr.s_addr = htonl (INADDR_ANY); if (bind (sock, (struct sockaddr *) &name, sizeof (name)) == -1) { perror("bind"); return -1; } if (listen(sock, 10) == -1) { perror("listen"); return -1; } return sock; } int 
do_accept(int lsock) { struct sockaddr addr; socklen_t len = sizeof(addr); int sock, on=1; if ((sock = accept(lsock, &addr, &len)) == -1) { perror("accept"); return -1; } if (setsockopt(sock, SOL_TCP, TCP_NODELAY, &on, sizeof(on)) == -1) { perror("setsockopt"); return -1; } return sock; } int do_read(int fd, void *buf, int n) { int res; res = read(fd, buf, n); if (res == -1) { perror("read"); } if (res != n) { fprintf(stderr, "read incomplete\n"); } return res; } int do_write(int fd, void *buf, int n) { int res; res = write(fd, buf, n); if (res == -1) { perror("write"); } if (res != n) { fprintf(stderr, "write incomplete\n"); } return res; } int main(int argc, char *argv[]) { int i, cnt, pid, dest, src, its; int lsock, sock; char id, rank, data; int port = 11100; double t0, t1; /* # processes */ cnt = atoi(argv[1]); /* # forwards */ its = atoi(argv[2]); { int socks[cnt]; /* Create processes */ rank = 0; for (i=1; i <# forwards> */ #include #include #include #include #include #include #include #include #include #include double second() { struct timeval tv; struct timezone tz; double t; gettimeofday(&tv,&tz); t= (double)(tv.tv_sec)+(double)(tv.tv_usec/1.0e6); return t; } typedef struct { struct sockaddr sockadr; int len; } adr_t; int get_adr(adr_t *adr, int port) { int n; struct addrinfo hints, *res; char str[6]; memset(&hints, 0, sizeof(struct addrinfo)); hints.ai_flags = AI_PASSIVE; hints.ai_family = PF_UNSPEC; hints.ai_socktype = SOCK_STREAM; sprintf(str, "%d", port); n = getaddrinfo("localhost", str, &hints, &res); if (n != 0) { fprintf(stderr, "getaddrinfo error: [%s]\n", gai_strerror(n)); return -1; } memcpy(&adr->sockadr, res->ai_addr, sizeof(*res->ai_addr)); adr->len = sizeof(*res->ai_addr); freeaddrinfo(res); return 0; } int init_listen(int port) { int n, on=1; int sock; struct sockaddr_in name; sock = socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP); if (sock == -1) { perror("socket"); return -1; } name.sin_family = PF_INET; name.sin_port = htons (port); 
name.sin_addr.s_addr = htonl (INADDR_ANY); if (bind (sock, (struct sockaddr *) &name, sizeof (name)) == -1) { perror("bind"); return -1; } if (listen(sock, 10) == -1) { perror("listen"); return -1; } return sock; } int do_recv(int sock, void *buf, int n) { struct sockaddr sa; struct sctp_sndrcvinfo info; int slen, flags, res; slen = sizeof(sa); res = sctp_recvmsg(sock, buf, n, &sa, &slen, &info, &flags); if (res == -1) { perror("recv"); } if (res != n) { fprintf(stderr, "recv incomplete\n"); } return res; } int do_send(int sock, adr_t *adr, void *buf, int n) { int res; res = sctp_sendmsg(sock, buf, n, &adr->sockadr, adr->len, 666, MSG_ADDR_OVER, 0, 0, 444); if (res == -1) { perror("send"); } if (res != n) { fprintf(stderr, "send incomplete\n"); } return res; } int main(int argc, char *argv[]) { int i, cnt, pid, src, dest, its; int lsock; char id, rank, data; int port = 11100; double t0, t1; /* # processes */ cnt = atoi(argv[1]); /* # forwards */ its = atoi(argv[2]); { adr_t dests[cnt]; /* Create processes */ rank = 0; for (i=1; i <# forwards> */ #include #include #include #include #include #include #include #include #include #include double second() { struct timeval tv; struct timezone tz; double t; gettimeofday(&tv,&tz); t= (double)(tv.tv_sec)+(double)(tv.tv_usec/1.0e6); return t; } int do_connect(int port) { int n, sock, on=1; struct addrinfo hints, *res; char str[6]; void *adr; memset(&hints, 0, sizeof(struct addrinfo)); hints.ai_flags = AI_PASSIVE; hints.ai_family = PF_UNSPEC; hints.ai_socktype = SOCK_STREAM; sprintf(str, "%d", port); n = getaddrinfo("localhost", str, &hints, &res); if (n != 0) { fprintf(stderr, "getaddrinfo error: [%s]\n", gai_strerror(n)); return -1; } sock = socket(AF_INET, SOCK_STREAM, 0); if (sock == -1) { perror("socket"); return -1; } if (setsockopt(sock, SOL_TCP, TCP_NODELAY, &on, sizeof(on)) == -1) { perror("setsockopt"); return -1; } if (connect(sock, (struct sockaddr *)res->ai_addr, sizeof(*res->ai_addr)) == -1) { perror("connect"); 
return -1; } freeaddrinfo(res); return sock; } int start_listen(int port) { int n, on=1; int sock; struct sockaddr_in name; sock = socket(AF_INET, SOCK_STREAM, 0); if (sock == -1) { perror("socket"); return -1; } if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1) { perror("setsockopt"); return -1; } name.sin_family = AF_INET; name.sin_port = htons (port); name.sin_addr.s_addr = htonl (INADDR_ANY); if (bind (sock, (struct sockaddr *) &name, sizeof (name)) == -1) { perror("bind"); return -1; } if (listen(sock, 10) == -1) { perror("listen"); return -1; } return sock; } int do_accept(int lsock) { struct sockaddr addr; socklen_t len = sizeof(addr); int sock, on=1; if ((sock = accept(lsock, &addr, &len)) == -1) { perror("accept"); return -1; } if (setsockopt(sock, SOL_TCP, TCP_NODELAY, &on, sizeof(on)) == -1) { perror("setsockopt"); return -1; } return sock; } int do_read(int fd, void *buf, int n) { int res; res = read(fd, buf, n); if (res == -1) { perror("read"); } if (res != n) { fprintf(stderr, "read incomplete\n"); } return res; } int do_write(int fd, void *buf, int n) { int res; res = write(fd, buf, n); if (res == -1) { perror("write"); } if (res != n) { fprintf(stderr, "write incomplete\n"); } return res; } int main(int argc, char *argv[]) { int i, cnt, pid, dest, src, its; int lsock, sock; char id, rank, data; int port = 11100; double t0, t1; /* # processes */ cnt = atoi(argv[1]); /* # forwards */ its = atoi(argv[2]); { int socks[cnt]; /* Create processes */ rank = 0; for (i=1; i; Mon, 6 Jun 2005 03:40:04 -0700 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Mon, 6 Jun 2005 03:38:56 -0700 Message-ID: Received: from 144.16.64.4 by by24fd.bay24.hotmail.msn.com with HTTP; Mon, 06 Jun 2005 10:38:56 GMT X-Originating-IP: [144.16.64.4] X-Originating-Email: [rahulhsaxena@hotmail.com] X-Sender: rahulhsaxena@hotmail.com In-Reply-To: <20050605221106.GB15391@postel.suug.ch> From: "rahul hari" To: tgraf@suug.ch Cc: 
diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c Date: Mon, 06 Jun 2005 16:08:56 +0530 Mime-Version: 1.0 Content-Type: text/plain; format=flowed X-OriginalArrivalTime: 06 Jun 2005 10:38:56.0611 (UTC) FILETIME=[F0FA1B30:01C56A83] X-archive-position: 2121 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rahulhsaxena@hotmail.com Precedence: bulk X-list: netdev Content-Length: 1388 Lines: 34 Dear Thomas, Thanks for the reply. Actually in my experiment, I am implementing 2 queues, in one of the queues, I use the prio scheme of tc and in another I define 3 virtual queues, out of which I want to provide absolute priority to one of the queue over the others (ie, if there is any packet in this queue, it should be dispatched immediately regardless of whatever happens to the other two virtual queues). For the other two virtual queues, I want to apply individual REDs (with different parameters but the average queue length should be equal to the total qave of these two virtual queues) on each but the dequeuing priority should be equal (the dequeuing takes place alternately). Can the current implementations somehow help me with this , or I would have to design this from scratch. Regards, Rahul ------- "The fear you let build up in your mind is worse than the situation that actually exists" taken from "who moved my cheese" ----------------------------------------------------------------------------- Rahul Hari Senior Undergraduate Student, Department of CSE, ITBHU, Varanasi. Ph: +91-9845347020 ----------------------------------------------------------------------------- _________________________________________________________________ Don’t just search. Find. Check out the new MSN Search! 
http://search.msn.click-url.com/go/onm00200636ave/direct/01/ From aharon.abramson@intel.com Mon Jun 6 04:15:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 04:15:42 -0700 (PDT) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56BFaXq007585 for ; Mon, 6 Jun 2005 04:15:39 -0700 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.10/d: large-outer.mc,v 1.2 2004/09/17 18:04:59 root Exp $) with ESMTP id j56BMm0W003358 for ; Mon, 6 Jun 2005 11:22:48 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.10/d: large-inner.mc,v 1.2 2004/09/17 18:04:31 root Exp $) with SMTP id j56BQ6vH031616 for ; Mon, 6 Jun 2005 11:26:12 GMT Received: from hasmsx331.ger.corp.intel.com ([143.185.63.144]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060614143131764 for ; Mon, 06 Jun 2005 14:14:31 +0300 Received: from hasmsx402.ger.corp.intel.com ([143.185.63.156]) by hasmsx331.ger.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Mon, 6 Jun 2005 14:14:32 +0300 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C56A88.E972F0D2" Subject: constructing struct sk_buff objects from a pre-allocated buffer Date: Mon, 6 Jun 2005 14:14:31 +0300 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: constructing struct sk_buff objects from a pre-allocated buffer thread-index: AcVqiOkzUm0R5PdXRKi+rrqFWxuSMg== From: "Abramson, Aharon" To: X-OriginalArrivalTime: 06 Jun 2005 11:14:32.0167 (UTC) FILETIME=[E9DE1770:01C56A88] X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2122 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aharon.abramson@intel.com Precedence: bulk X-list: netdev Content-Length: 1603 Lines: 49 This is a multi-part message in MIME format. ------_=_NextPart_001_01C56A88.E972F0D2 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hello, all. I'm developing a network device driver. This device may deliver multiple frames in single pre-allocated receive buffer. How do I construct struct sk_buff objects for these frames, since alloc_skb allocates the object's data by itself? =20 Thanks, Aharon Abramson =20 ------_=_NextPart_001_01C56A88.E972F0D2 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
------_=_NextPart_001_01C56A88.E972F0D2-- From tgraf@suug.ch Mon Jun 6 04:39:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 04:39:58 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56BdqXq013578 for ; Mon, 6 Jun 2005 04:39:55 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 7FF5F1C0EE; Mon, 6 Jun 2005 13:39:07 +0200 (CEST) Date: Mon, 6 Jun 2005 13:39:07 +0200 From: Thomas Graf To: rahul hari Cc: diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c Message-ID: <20050606113907.GC15391@postel.suug.ch> References: <20050605221106.GB15391@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-archive-position: 2123 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1639 Lines: 36 Rahul, * rahul hari 2005-06-06 16:08 > Thanks for the reply. Actually in my experiment, I am implementing 2 > queues, in one of the queues, I use the prio scheme of tc and in another I > define 3 virtual queues, out of which I want to provide absolute priority > to one of the queue over the others (ie, if there is any packet in this > queue, it should be dispatched immediately regardless of whatever happens > to the other two virtual queues). Use a prio qdisc with RED leaf qdiscs. RED and GREDs purpose is to calculate a marking probability and not to provide any prioritizing schemes. RIO mode is a small exception from this but the used priority only describes the weight of the VQ and has no influence on the actual queue position later on. 
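
For reference, the "marking probability" mentioned above can be sketched with the textbook RED formula. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, it uses floating point, and it is not the fixed-point code in sch_red.c or sch_gred.c. It shows why RED/GRED alone cannot give absolute priority: the result only decides whether an arriving packet is marked or dropped, not when a queued packet is sent.

```c
#include <assert.h>

/*
 * Textbook RED base marking probability (illustrative only; the kernel
 * uses fixed-point arithmetic and extra state such as the idle time).
 *
 *   qavg   - average queue length
 *   min_th - below this threshold nothing is marked
 *   max_th - at or above this threshold everything is marked
 *   max_p  - probability reached just below max_th
 */
static double red_mark_prob(double qavg, double min_th, double max_th,
			    double max_p)
{
	assert(min_th < max_th);
	assert(max_p >= 0.0 && max_p <= 1.0);

	if (qavg < min_th)
		return 0.0;
	if (qavg >= max_th)
		return 1.0;
	/* Linear ramp from 0 to max_p between the two thresholds. */
	return max_p * (qavg - min_th) / (max_th - min_th);
}
```

With min_th = 10, max_th = 20 and max_p = 0.1, an average queue of 15 sits halfway up the ramp, so roughly one packet in twenty would be marked.
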
> For the other two virtual queues, I want to apply individual REDs (with
> different parameters but the average queue length should be equal to the
> total qave of these two virtual queues) on each but the dequeuing priority
> should be equal (the dequeuing takes place alternately).

Use a GRED qdisc, give both VQs the same prio (so they go into equalize
mode) and enable RIO mode. The VQ you select as default will be used to
store qavg and the idle time.

                     CBQ
      cbq:queue_1           cbq:queue_2
           |                     |
         prio             GRED (rio mode)
      |    |    |          |           |
  RED_1  RED_2  RED_3  VQ1(prio=1)  VQ2(prio=1)

You did not say how to separate the two initial queues, so I assumed
CBQ, but it doesn't really matter as long as it's a classful qdisc.

From hadi@cyberus.ca Mon Jun 6 04:47:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 04:47:32 -0700 (PDT) Received: from mx04.cybersurf.com (mx04.cybersurf.com [209.197.145.108]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56BlTXq014374 for ; Mon, 6 Jun 2005 04:47:29 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx04.cybersurf.com with esmtp (Exim 4.30) id 1DfG3p-000700-Cv for netdev@oss.sgi.com; Mon, 06 Jun 2005 07:46:29 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DfG3n-0003ge-HV; Mon, 06 Jun 2005 07:46:27 -0400 Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c From: jamal Reply-To: hadi@cyberus.ca To: rahul hari Cc: tgraf@suug.ch, diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: unknown Date: Mon, 06 Jun 2005 07:45:51 -0400 Message-Id: <1118058351.6266.119.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2124 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender:
hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 1505 Lines: 37

On Mon, 2005-06-06 at 16:08 +0530, rahul hari wrote:
> Dear Thomas,
> Thanks for the reply. Actually in my experiment, I am implementing 2 queues,
> in one of the queues, I use the prio scheme of tc and in another I define 3
> virtual queues, out of which I want to provide absolute priority to one of
> the queue over the others (ie, if there is any packet in this queue, it
> should be dispatched immediately regardless of whatever happens to the other
> two virtual queues).
> For the other two virtual queues, I want to apply individual REDs (with
> different parameters but the average queue length should be equal to the
> total qave of these two virtual queues) on each but the dequeuing priority
> should be equal (the dequeuing takes place alternately).
> Can the current implementations somehow help me with this , or I would have
> to design this from scratch.
>

It is not clear what your requirements are. You are stating what your
solution is ;->
Assuming that you need the first queue to have the utmost priority,
followed by the first RED queue, and then the last two, you need a
prio qdisc with three bands:

+---- pfifo
|
+---- RED
|
+---- GRED

The pfifo will starve the lower two. The RED will starve the GRED if it
can, and the GRED virtual queues will need to be set in (CISCO) WRED
mode, i.e. select GRIO but give them equal priority. Make sure those two
VQs have exactly the same drop priorities and queue parameters.
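
The starvation behaviour of the three bands can be sketched as below. This is a minimal illustration, not the sch_prio.c code: `struct band`, `NBANDS` and `prio_dequeue()` are hypothetical names, and each band is reduced to a packet counter.

```c
#include <assert.h>
#include <stddef.h>

#define NBANDS 3	/* band 0 = pfifo, 1 = RED, 2 = GRED in the layout above */

struct band {
	int backlog;	/* number of packets currently queued in this band */
};

/*
 * Strict priority dequeue: always serve the lowest-numbered band that
 * has a packet. Returns the band that was served, or -1 if all bands
 * are empty. A higher-numbered band is only reached when every band
 * before it is empty, which is why it can be starved.
 */
static int prio_dequeue(struct band bands[NBANDS])
{
	int i;

	assert(bands != NULL);

	for (i = 0; i < NBANDS; i++) {
		if (bands[i].backlog > 0) {
			bands[i].backlog--;
			return i;
		}
	}
	return -1;
}
```

As long as band 0 keeps receiving packets, repeated calls never return 1 or 2. That is the absolute priority asked for, and also the reason the lower bands can starve.
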
cheers, jamal From hadi@cyberus.ca Mon Jun 6 04:55:56 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 04:55:58 -0700 (PDT) Received: from mx04.cybersurf.com (mx04.cybersurf.com [209.197.145.108]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56BtuXq015367 for ; Mon, 6 Jun 2005 04:55:56 -0700 Received: from mail.cyberus.ca ([209.197.145.21]) by mx04.cybersurf.com with esmtp (Exim 4.30) id 1DfGBw-0000uc-J9 for netdev@oss.sgi.com; Mon, 06 Jun 2005 07:54:52 -0400 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.229]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1DfGBv-0004uq-VZ; Mon, 06 Jun 2005 07:54:52 -0400 Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: rahul hari , diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com In-Reply-To: <20050606113907.GC15391@postel.suug.ch> References: <20050605221106.GB15391@postel.suug.ch> <20050606113907.GC15391@postel.suug.ch> Content-Type: text/plain Organization: unknown Date: Mon, 06 Jun 2005 07:54:18 -0400 Message-Id: <1118058859.6266.126.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-archive-position: 2125 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 606 Lines: 18 On Mon, 2005-06-06 at 13:39 +0200, Thomas Graf wrote: > Use a prio qdisc with RED leaf qdiscs. RED and GREDs purpose is to > calculate a marking probability and not to provide any prioritizing > schemes. Prioritization is still implicitly provided if you vary the queue lengths or the drop probabilities. 
For example, if you set everything to be exactly the same, and varied only the drop probability - the VQ with the highest drop probability will be less important (i.e relatively more of its packets will be dropped; recall: the drop decision is made before the packet is queued). cheers, jamal From herbert@gondor.apana.org.au Mon Jun 6 05:01:33 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 05:01:39 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56C1VXq016397 for ; Mon, 6 Jun 2005 05:01:32 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DfGGm-0002Mz-00; Mon, 06 Jun 2005 21:59:52 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DfGGZ-00007C-00; Mon, 06 Jun 2005 21:59:39 +1000 Date: Mon, 6 Jun 2005 21:59:39 +1000 To: Christoph Hellwig Cc: "David S. 
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050606115939.GA399@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604112314.GA19819@infradead.org> <20050604112606.GA1799@gondor.apana.org.au> <20050604115853.GA20335@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050604115853.GA20335@infradead.org> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2126 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1179 Lines: 37 On Sat, Jun 04, 2005 at 12:58:53PM +0100, Christoph Hellwig wrote: > > the usage of 16bit counters in bio_vec doesn't make sense, and if did > all others would have to move to 32bit aswell (in case we started > supporting page sizes that aren't addressable by 16bits) You know what? The more I think about this the more I think that your idea is brilliant. The reason is that the two main users of crypto API happen to be in possession of bio_vec and skb_frag_t respectively. Had we merged the three structures, they would not have to copy the structures as they do now or even worse, process the buffers one-by-one as dmcrypt is doing. Back to the topic of 16-bit vs. 32-bit counters. Could we do something like this? 
#if (PAGE_SHIFT > 16) || (BITS_PER_LONG > 32)
typedef unsigned int page_offset_t;
#else
typedef unsigned short page_offset_t;
#endif

And then define

struct foovec {
	struct page *page;
	page_offset_t offset;
	page_offset_t length;
};

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

From SRS0+26fa12ab9fa0d64ac01b+652+infradead.org+hch@pentafluge.srs.infradead.org Mon Jun 6 05:10:23 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 05:10:26 -0700 (PDT) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56CALXq017189 for ; Mon, 6 Jun 2005 05:10:23 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.43 #1 (Red Hat Linux)) id 1DfGPq-0002Az-Mv; Mon, 06 Jun 2005 13:09:14 +0100 Date: Mon, 6 Jun 2005 13:09:14 +0100 From: Christoph Hellwig To: Herbert Xu Cc: Christoph Hellwig , "David S.
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050606120914.GA8317@infradead.org> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604112314.GA19819@infradead.org> <20050604112606.GA1799@gondor.apana.org.au> <20050604115853.GA20335@infradead.org> <20050606115939.GA399@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050606115939.GA399@gondor.apana.org.au> User-Agent: Mutt/1.4.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2127 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 1235 Lines: 33 On Mon, Jun 06, 2005 at 09:59:39PM +1000, Herbert Xu wrote: > On Sat, Jun 04, 2005 at 12:58:53PM +0100, Christoph Hellwig wrote: > > > > the usage of 16bit counters in bio_vec doesn't make sense, and if did > > all others would have to move to 32bit aswell (in case we started > > supporting page sizes that aren't addressable by 16bits) > > You know what? The more I think about this the more I think that your > idea is brilliant. The reason is that the two main users of crypto API > happen to be in possession of bio_vec and skb_frag_t respectively. > > Had we merged the three structures, they would not have to copy the > structures as they do now or even worse, process the buffers one-by-one > as dmcrypt is doing. > > Back to the topic of 16-bit vs. 32-bit counters. Could we do something > like this? > > #if (PAGE_SHIFT > 16) || (BITS_PER_LONG > 32) what is the BITS_PER_LONG check for? > typedef unsigned int page_offset_t > #else > typedef unsigned short page_offset_t > #endif the name is a) a little long and b) easy to confuse with pgoff_t as used in the pagecache. 
I'm not sure what a better name would be. We probably shouldn't care about this as the networking code didn't handle larger offsets either. From tgraf@suug.ch Mon Jun 6 05:16:07 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 05:16:12 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56CG7Xq021094 for ; Mon, 6 Jun 2005 05:16:07 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 4DE881C0EE; Mon, 6 Jun 2005 14:15:27 +0200 (CEST) Date: Mon, 6 Jun 2005 14:15:27 +0200 From: Thomas Graf To: jamal Cc: rahul hari , diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c Message-ID: <20050606121527.GE15391@postel.suug.ch> References: <20050605221106.GB15391@postel.suug.ch> <20050606113907.GC15391@postel.suug.ch> <1118058859.6266.126.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1118058859.6266.126.camel@localhost.localdomain> X-archive-position: 2128 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1083 Lines: 20 * jamal <1118058859.6266.126.camel@localhost.localdomain> 2005-06-06 07:54 > On Mon, 2005-06-06 at 13:39 +0200, Thomas Graf wrote: > > > Use a prio qdisc with RED leaf qdiscs. RED and GREDs purpose is to > > calculate a marking probability and not to provide any prioritizing > > schemes. > > Prioritization is still implicitly provided if you vary the queue > lengths or the drop probabilities. 
> For example, if you set everything to be exactly the same, and varied
> only the drop probability - the VQ with the highest drop probability
> will be less important (i.e relatively more of its packets will be
> dropped; recall: the drop decision is made before the packet is queued).

Absolutely. What I meant is that GRED does not influence the actual
ordering of packets that are not dropped. The priority, together with
the qavg parameters and the thresholds, only influences the probability
that a packet gets marked/dropped. Sure, this is prioritization as well,
but Rahul wanted to have one VQ starve out another VQ completely. My
point is that this is not possible with GRED.

From herbert@gondor.apana.org.au Mon Jun 6 05:42:10 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 05:42:16 -0700 (PDT) Received: from arnor.apana.org.au (arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56Cg8Xq022566 for ; Mon, 6 Jun 2005 05:42:09 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1DfGuM-0002cN-00; Mon, 06 Jun 2005 22:40:46 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1DfGuJ-0000BQ-00; Mon, 06 Jun 2005 22:40:43 +1000 Date: Mon, 6 Jun 2005 22:40:43 +1000 To: Christoph Hellwig Cc: "David S.
Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050606124043.GA625@gondor.apana.org.au> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604112314.GA19819@infradead.org> <20050604112606.GA1799@gondor.apana.org.au> <20050604115853.GA20335@infradead.org> <20050606115939.GA399@gondor.apana.org.au> <20050606120914.GA8317@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050606120914.GA8317@infradead.org> User-Agent: Mutt/1.5.9i From: Herbert Xu X-archive-position: 2129 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Content-Length: 1223 Lines: 34 On Mon, Jun 06, 2005 at 01:09:14PM +0100, Christoph Hellwig wrote: > > > #if (PAGE_SHIFT > 16) || (BITS_PER_LONG > 32) > > what is the BITS_PER_LONG check for? These structures are normally used in arrays. On a 64-bit machine the alignment requirement means that the 16-bit version will be padded to have the same length as the 32-bit version. Since 32-bit access is usually faster we might as well get it for free. > > typedef unsigned int page_offset_t > > the name is a) a little long and b) easy to confuse with pgoff_t as used in > the pagecache. I'm not sure what a better name would be. Alternatively we can put the ifdef around (or inside) the struct definition. > We probably shouldn't care about this as the networking code didn't handle > larger offsets either. I'm not sure what you mean here. However, for skb_frag_t at least going to the 32-bit version on i386 means at least 72 bytes extra for every skb->data allocation. Dave, what are your views on making skb_frag_t bigger? 
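
The size argument above can be checked with a small sketch. The struct names and the helper are hypothetical, and `void *` stands in for `struct page *`. The figure of 18 fragments per skb in the note below is an assumption (it corresponds to MAX_SKB_FRAGS with 4 KiB pages, which the mail does not state), chosen because 18 * 4 matches the 72 bytes quoted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit variant: pointer plus two 16-bit counters. */
struct foovec16 {
	void *page;		/* stands in for struct page * */
	uint16_t offset;
	uint16_t length;
};

/* Hypothetical 32-bit variant: pointer plus two 32-bit counters. */
struct foovec32 {
	void *page;
	uint32_t offset;
	uint32_t length;
};

/*
 * Extra bytes needed by an array of nr_frags entries when moving from
 * the 16-bit to the 32-bit layout on the machine this is compiled for.
 */
static size_t frag_array_growth(size_t nr_frags)
{
	assert(nr_frags > 0);
	return nr_frags * (sizeof(struct foovec32) - sizeof(struct foovec16));
}
```

On a typical 64-bit ABI both structs are padded to 16 bytes, so `frag_array_growth()` returns 0 and the wider counters come for free. On a typical 32-bit ABI the sizes are 8 and 12 bytes, so `frag_array_growth(18)` gives 72.
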
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From SRS0+26fa12ab9fa0d64ac01b+652+infradead.org+hch@pentafluge.srs.infradead.org Mon Jun 6 06:31:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 06:31:33 -0700 (PDT) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56DVTXq026084 for ; Mon, 6 Jun 2005 06:31:29 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.43 #1 (Red Hat Linux)) id 1DfHgM-0002kw-Av; Mon, 06 Jun 2005 14:30:22 +0100 Date: Mon, 6 Jun 2005 14:30:22 +0100 From: Christoph Hellwig To: Herbert Xu Cc: Christoph Hellwig , "David S. Miller" , James Morris , Linux Crypto Mailing List , netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag Message-ID: <20050606133022.GA10566@infradead.org> References: <20050603234623.GA20088@gondor.apana.org.au> <20050604112314.GA19819@infradead.org> <20050604112606.GA1799@gondor.apana.org.au> <20050604115853.GA20335@infradead.org> <20050606115939.GA399@gondor.apana.org.au> <20050606120914.GA8317@infradead.org> <20050606124043.GA625@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050606124043.GA625@gondor.apana.org.au> User-Agent: Mutt/1.4.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 2130 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev Content-Length: 602 Lines: 15 On Mon, Jun 06, 2005 at 10:40:43PM +1000, Herbert Xu wrote: > On Mon, Jun 06, 2005 at 01:09:14PM +0100, Christoph Hellwig wrote: > > > > > #if (PAGE_SHIFT > 16) || (BITS_PER_LONG 
> 32) > > > > what is the BITS_PER_LONG check for? > > These structures are normally used in arrays. On a 64-bit machine > the alignment requirement means that the 16-bit version will be > padded to have the same length as the 32-bit version. Since 32-bit > access is usually faster we might as well get it for free. At this point it might be easiest to just say the architecture must declare the type in asm/types.h From john.ronciak@intel.com Mon Jun 6 08:37:55 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 08:38:08 -0700 (PDT) Received: from orsfmr005.jf.intel.com (fmr20.intel.com [134.134.136.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56FbsXq004055 for ; Mon, 6 Jun 2005 08:37:55 -0700 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr005.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j56FZYGO028451; Mon, 6 Jun 2005 15:35:34 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j56FXTRl025229; Mon, 6 Jun 2005 15:35:29 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060608352714440 ; Mon, 06 Jun 2005 08:35:27 -0700 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Mon, 6 Jun 2005 08:35:27 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: RFC: NAPI packet weighting patch Date: Mon, 6 Jun 2005 08:35:26 -0700 Message-ID: <468F3FDA28AA87429AD807992E22D07E0450C002@orsmsx408> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: RFC: NAPI packet weighting patch Thread-Index: 
AcVqYwuQH8C8q1ZUSvOxuU2JvA44ugASN1rQ From: "Ronciak, John" To: "David S. Miller" , Cc: , , "Williams, Mitch A" , , , , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" X-OriginalArrivalTime: 06 Jun 2005 15:35:27.0765 (UTC) FILETIME=[5D54C450:01C56AAD] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j56FbsXq004055 X-archive-position: 2131 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: john.ronciak@intel.com Precedence: bulk X-list: netdev Content-Length: 6820 Lines: 222

We are dropping packets at the HW level (FIFO errors) with 256
descriptors and the default weight of 64. As we said, reducing the
weight eliminates this, which is understandable since the driver is
being serviced more frequently. We also hacked the driver to do a buffer
allocation per packet sent up the stack. This reduced the number of
dropped packets by about 80%, but it was still a significant number of
drops (190K down to 39K). So I don't think this is where the problem
is. This is also confirmed by the tg3 driver doing the buffer update to
the HW every 25 descriptors.

We did not increase the descriptor ring size with the default weight,
but will try this today and report back.

Cheers,
John

> -----Original Message-----
> From: David S. Miller [mailto:davem@davemloft.net]
> Sent: Sunday, June 05, 2005 11:43 PM
> To: mchan@broadcom.com
> Cc: hadi@cyberus.ca; buytenh@wantstofly.org; Williams, Mitch
> A; Ronciak, John; jdmason@us.ibm.com; shemminger@osdl.org;
> netdev@oss.sgi.com; Robert.Olsson@data.slu.se; Venkatesan,
> Ganesh; Brandeburg, Jesse
> Subject: Re: RFC: NAPI packet weighting patch
>
>
> From: "David S. Miller"
> Date: Sun, 05 Jun 2005 14:36:53 -0700 (PDT)
>
> > BTW, here is the patch implementing this stuff.
>
> A new patch and some more data.
> > When we go to gigabit, and NAPI kicks in, the first RX > packet costs a lot (cache misses etc.) but the rest are > very efficient to process. I suspect this only holds > for the single socket case, and on a real system processing > many connections the cost drop might not be so clean. > > The log output format is: > > (TX_TICKS:RX_TICKS[ RX_TICK1 RX_TICK2 RX_TICK3 ... ]) > > Here is an example trace from a single socket TCP stream > send over gigabit: > > (9:112[ 26 8 7 8 7 ]) > (6:110[ 23 8 8 8 7 ]) > (7:57[ 26 8 ]) > (6:117[ 25 8 9 7 7 ]) > (5:37[ 26 ]) > (6:113[ 28 8 7 8 7 ]) > (0:20[ 9 ]) > (8:111[ 27 7 7 8 7 ]) > (5:109[ 25 8 8 8 7 ]) > (8:113[ 25 7 8 9 7 ]) > (6:108[ 25 8 7 7 7 ]) > (8:88[ 26 8 8 7 ]) > (6:109[ 25 7 7 7 7 ]) > (6:111[ 25 9 8 7 7 ]) > (0:48[ 9 5 ]) > > This kind of trace reiterates some things we already know. > For example, mitigation (HW, SW, or a combination of both) > helps because processing multiple packets let's us "reuse" > the cpu cache priming the handling of the first packet > achieves for us. > > It would be great to stick something like this into the e1000 > driver, and get some output from it with Intel's single NIC > performance degradation test case. > > It is also necessary for the Intel folks to say whether the > NIC is running out of RX descriptors in the single NIC > case with dev->weight set to the default of 64. If so, does > increasing the RX ring size to a larger value via ethtool > help? If not, then why in the world are things running more > slowly? > > I've got a crappy 1.5GHZ sparc64 box in my tg3 tests here, and it can > handle gigabit line rate with much CPU to spare. So either Intel is > doing something other than TCP stream tests, or something else is out > of whack. > > I even tried to do things like having a memory touching program > run in parallel with the TCP stream test, and this did not make > the timing numbers in the logs increase much at all. 
> > --- ./drivers/net/tg3.c.~1~ 2005-06-03 11:13:14.000000000 -0700 > +++ ./drivers/net/tg3.c 2005-06-05 23:21:11.000000000 -0700 > @@ -2836,7 +2836,22 @@ static int tg3_rx(struct tg3 *tp, int bu > desc->err_vlan & RXD_VLAN_MASK); > } else > #endif > + { > + unsigned long t = get_cycles(); > + struct tg3_poll_log_ent *lp; > + unsigned int ent; > + > netif_receive_skb(skb); > + t = get_cycles() - t; > + > + ent = tp->poll_log_ent; > + lp = &tp->poll_log[ent]; > + ent = lp->rx_cur_ent; > + if (ent < POLL_RX_SIZE) { > + lp->rx_ents[ent] = (u16) t; > + lp->rx_cur_ent = ent + 1; > + } > + } > > tp->dev->last_rx = jiffies; > received++; > @@ -2897,9 +2912,15 @@ static int tg3_poll(struct net_device *n > > /* run TX completion thread */ > if (sblk->idx[0].tx_consumer != tp->tx_cons) { > + unsigned long t; > + > spin_lock(&tp->tx_lock); > + t = get_cycles(); > tg3_tx(tp); > + t = get_cycles() - t; > spin_unlock(&tp->tx_lock); > + > + tp->poll_log[tp->poll_log_ent].tx_ticks = (u16) t; > } > > spin_unlock_irqrestore(&tp->lock, flags); > @@ -2911,16 +2932,28 @@ static int tg3_poll(struct net_device *n > if (sblk->idx[0].rx_producer != tp->rx_rcb_ptr) { > int orig_budget = *budget; > int work_done; > + unsigned long t; > + unsigned int ent; > > if (orig_budget > netdev->quota) > orig_budget = netdev->quota; > > + t = get_cycles(); > work_done = tg3_rx(tp, orig_budget); > + t = get_cycles() - t; > + > + ent = tp->poll_log_ent; > + tp->poll_log[ent].rx_ticks = (u16) t; > > *budget -= work_done; > netdev->quota -= work_done; > } > > + tp->poll_log_ent = (tp->poll_log_ent + 1) & POLL_LOG_MASK; > + tp->poll_log[tp->poll_log_ent].tx_ticks = 0; > + tp->poll_log[tp->poll_log_ent].rx_ticks = 0; > + tp->poll_log[tp->poll_log_ent].rx_cur_ent = 0; > + > if (tp->tg3_flags & TG3_FLAG_TAGGED_STATUS) > tp->last_tag = sblk->status_tag; > rmb(); > @@ -6609,6 +6642,27 @@ static struct net_device_stats *tg3_get_ > stats->rx_crc_errors = old_stats->rx_crc_errors + > calc_crc_errors(tp); > > + /* 
XXX Yes, I know, do this right. :-) */ > + { > + unsigned int ent; > + > + printk("TG3: POLL LOG, current ent[%d]\n", > tp->poll_log_ent); > + ent = tp->poll_log_ent - (POLL_LOG_SIZE - 1); > + ent &= POLL_LOG_MASK; > + while (ent != tp->poll_log_ent) { > + struct tg3_poll_log_ent *lp = > &tp->poll_log[ent]; > + int i; > + > + printk("(%u:%u[ ", > + lp->tx_ticks, lp->rx_ticks); > + for (i = 0; i < lp->rx_cur_ent; i++) > + printk("%d ", lp->rx_ents[i]); > + printk("])\n"); > + > + ent = (ent + 1) & POLL_LOG_MASK; > + } > + } > + > return stats; > } > > --- ./drivers/net/tg3.h.~1~ 2005-06-03 11:13:14.000000000 -0700 > +++ ./drivers/net/tg3.h 2005-06-05 23:21:05.000000000 -0700 > @@ -2003,6 +2003,15 @@ struct tg3_ethtool_stats { > u64 nic_tx_threshold_hit; > }; > > +struct tg3_poll_log_ent { > + u16 tx_ticks; > + u16 rx_ticks; > +#define POLL_RX_SIZE 8 > +#define POLL_RX_MASK (POLL_RX_SIZE - 1) > + u16 rx_cur_ent; > + u16 rx_ents[POLL_RX_SIZE]; > +}; > + > struct tg3 { > /* begin "general, frequently-used members" cacheline section */ > > @@ -2232,6 +2241,11 @@ struct tg3 { > #define SST_25VF0X0_PAGE_SIZE 4098 > > struct ethtool_coalesce coal; > + > +#define POLL_LOG_SIZE (1 << 7) > +#define POLL_LOG_MASK (POLL_LOG_SIZE - 1) > + unsigned int poll_log_ent; > + struct tg3_poll_log_ent poll_log[POLL_LOG_SIZE]; > }; > > #endif /* !(_T3_H) */ > From rahulhsaxena@gmail.com Mon Jun 6 10:49:40 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 10:49:48 -0700 (PDT) Received: from zproxy.gmail.com (zproxy.gmail.com [64.233.162.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56HndXq011644 for ; Mon, 6 Jun 2005 10:49:40 -0700 Received: by zproxy.gmail.com with SMTP id 34so1127690nzf for ; Mon, 06 Jun 2005 10:48:37 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:mime-version:content-type:content-transfer-encoding:content-disposition; 
b=HZuJJgd/+Kw9B+FY4TxM2zsPry53N0HoCIkATJkrNTAWFJd6gpnhq5RDQcRkVaKSqj1gMznfgf7WWhcfrbUaATqYdTCg+1moFFTYDtO/bcWFfMLTT7qmZv7W1+X1Ag1BN/hJxze0y3gIF77nY4tSUUyOp83DhlaT8P4QBo/BKe4= Received: by 10.36.220.9 with SMTP id s9mr730301nzg; Mon, 06 Jun 2005 10:48:37 -0700 (PDT) Received: by 10.36.4.6 with HTTP; Mon, 6 Jun 2005 10:48:37 -0700 (PDT) Message-ID: <4532f31705060610486ef106a1@mail.gmail.com> Date: Mon, 6 Jun 2005 23:18:37 +0530 From: Rahul Hari Reply-To: rahul.hari@cse06.itbhu.org To: hadi@cyberus.ca, tgraf@suug.ch Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c Cc: diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j56HndXq011644 X-archive-position: 2132 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rahulhsaxena@gmail.com Precedence: bulk X-list: netdev Content-Length: 2970 Lines: 72 Thanks for all the suggestions Jamal and Thomas. From what you people have been suggesting, I feel that I should be giving a brief explanation of the problem I am currently working on. I have divided all the traffic on a network into 5 categories: real-time video (UDP1), real-time audio (UDP2), TCP not requiring any QoS (TCP1), TCP requiring QoS but with the size of the entire transaction very low (TCP2), and TCP requiring QoS with the size of the transaction in several MBs (TCP3). Now I am putting UDP1 and TCP1 in one particular queue (say q1) and giving priority to UDP1 (for dequeuing, not caring if TCP1 is getting starved). I am putting UDP2, TCP2 and TCP3 in a different queue (thus keeping the average queue length almost constant) (say q2) and applying RED on each of TCP2 and TCP3 (the application of the two REDs being independent of each other).
Here also I am providing priority to UDP2 (without caring if TCP2 or TCP3 is getting starved ). To schedule between q1 and q2, I am using WRR and to schedule between UDP1 and TCP1, I am using prio. For implementing q2, I am currently putting UDP2,TCP2 and TCP3 in 3 different virtual queues and applying GRED with grio. I am providing UDP2 the highest priority and providing TCP2 and TCP3 equal priorities. To ensure that RED does not apply on the UDP2, I have set Tmax=Tmin so that Pbmax=1. But the results I am getting with this configuration do not match with the results that I have got from the simulations. So I want to implement this stuff such that the UDP2 gets highest priority among the three, is not included while calculating the total average queue length and the qave used for the application of REDs on TCP2 and TCP3 should be equal to the qave of tcp2+ qave of tcp3. To schedule between TCP2 and TCP3, I want to use WRR or something that gives equal priority and prevents the starvation of any of these. Regards, Rahul -- ---------------------- "The fear you let build up in your mind is worse than the situation that actually exists" from "who moved my cheese" --------------------------------------------------------------------------------- Rahul Hari Senior Under Grad. Student, Department of CSE, ITBHU, Varanasi. Ph: +91-9845347020 rahul.hari@cse06.itbhu.org ------------------------------------------------------------------------------------------ > >On Mon, 2005-06-06 at 13:39 +0200, Thomas Graf wrote: > > > Use a prio qdisc with RED leaf qdiscs. RED and GREDs purpose is to > > calculate a marking probability and not to provide any prioritizing > > schemes. > >Prioritization is still implicitly provided if you vary the queue >lengths or the drop probabilities. 
>For example, if you set everything to be exactly the same, and varied >only the drop probability - the VQ with the highest drop probability >will be less important (i.e relatively more of its packets will be >dropped; recall: the drop decision is made before the packet is queued). > >cheers, >jamal > > > From romieu@fr.zoreil.com Mon Jun 6 11:12:35 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 11:12:40 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56ICXXq013378 for ; Mon, 6 Jun 2005 11:12:34 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.1/8.12.1) with ESMTP id j56I8JdY029814; Mon, 6 Jun 2005 20:08:19 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.1/8.13.1/Submit) id j56I8EKp029813; Mon, 6 Jun 2005 20:08:14 +0200 Date: Mon, 6 Jun 2005 20:08:13 +0200 From: Francois Romieu To: Wolfgang Empacher Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Kernel 2.4.31 - netdriver r8169 Message-ID: <20050606180813.GA29537@electric-eye.fr.zoreil.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 2133 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Content-Length: 376 Lines: 12 Wolfgang Empacher : [...] > in kernel 2.4.31 there is version 1.2 of r8169 driver in use. this version > doesn't work well (RESETS of the device many and all the times). using > version 1.6 of this driver performs smooth and well. Where did you get your 1.6 version from ? 
(netdev added to Cc: as per the r8169 entry in the MAINTAINERS file) -- Ueimor From tgraf@suug.ch Mon Jun 6 11:28:59 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 11:29:02 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56ISwXq014464 for ; Mon, 6 Jun 2005 11:28:58 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 84B771C0EE; Mon, 6 Jun 2005 20:28:14 +0200 (CEST) Date: Mon, 6 Jun 2005 20:28:14 +0200 From: Thomas Graf To: rahul.hari@cse06.itbhu.org Cc: hadi@cyberus.ca, diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c Message-ID: <20050606182814.GI15391@postel.suug.ch> References: <4532f31705060610486ef106a1@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4532f31705060610486ef106a1@mail.gmail.com> X-archive-position: 2134 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1015 Lines: 17 * Rahul Hari <4532f31705060610486ef106a1@mail.gmail.com> 2005-06-06 23:18 > UDP1 and TCP1, I am using prio. For implementing q2, I am currently > putting UDP2,TCP2 and TCP3 in 3 different virtual queues and applying > GRED with grio. I am providing UDP2 the highest priority and providing > TCP2 and TCP3 equal priorities. To ensure that RED does not apply on > the UDP2, I have set Tmax=Tmin so that Pbmax=1. But the results I am > getting with this configuration do not match with the results that I > have got from the simulations. I assume Tmax is qth_max, so you basically disable probabilistic drops, which is the main point of RED. What you end up with is roughly equivalent to a simple FIFO with a hard queue limit that is compared against an EWMA-based queue length.
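To illustrate why equal thresholds collapse RED into a hard limit on the averaged queue length, here is a minimal floating-point sketch of the textbook RED calculation in C. It is not the fixed-point code in sch_red.c or sch_gred.c; struct red_sketch and both function names are hypothetical, and the count-based probability correction of full RED is left out.

```c
/* Hypothetical textbook RED state; field names are illustrative only. */
struct red_sketch {
    double qavg;     /* EWMA of the queue length */
    double w;        /* EWMA weight, 0 < w <= 1 */
    double qth_min;  /* below this average nothing is marked */
    double qth_max;  /* at or above this average everything is marked */
    double max_p;    /* marking probability reached just below qth_max */
};

/* Fold one queue-length sample into the moving average. */
static void red_update_avg(struct red_sketch *r, double qlen)
{
    r->qavg = (1.0 - r->w) * r->qavg + r->w * qlen;
}

/* Marking/drop probability for the current average queue length. */
static double red_mark_prob(const struct red_sketch *r)
{
    if (r->qavg < r->qth_min)
        return 0.0;
    if (r->qavg >= r->qth_max)
        return 1.0;
    /* Linear ramp; never reached when qth_min == qth_max. */
    return r->max_p * (r->qavg - r->qth_min) / (r->qth_max - r->qth_min);
}
```

With qth_min equal to qth_max the linear branch is unreachable, so the result is either 0 or 1 depending only on where the average sits relative to the single threshold, which is the hard-limit behaviour described in the reply.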
Depending on whether you want UDP2 to starve out the others use either prio or cbq/htb and a GRED in rio mode with equal vq prios for TCP2 and TCP3. The drops should be roughly proportional to their bandwidth share but I'm not sure if this is fair enough for you. From niv@us.ibm.com Mon Jun 6 11:32:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 11:32:16 -0700 (PDT) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56IWBXq015155 for ; Mon, 6 Jun 2005 11:32:11 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e32.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j56IUr9q707826 for ; Mon, 6 Jun 2005 14:30:57 -0400 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j56IUrXR238564 for ; Mon, 6 Jun 2005 12:30:53 -0600 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j56IUn4O005941 for ; Mon, 6 Jun 2005 12:30:49 -0600 Received: from [9.47.22.158] (dyn9047022158.beaverton.ibm.com [9.47.22.158]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j56IUnMK005880; Mon, 6 Jun 2005 12:30:49 -0600 Message-ID: <42A49658.7060608@us.ibm.com> Date: Mon, 06 Jun 2005 11:30:48 -0700 From: Nivedita Singhvi User-Agent: Mozilla Thunderbird 0.8 (X11/20041020) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jonathan Day CC: netdev@oss.sgi.com Subject: Re: Automated linux kernel testing results References: <20050604050123.9897.qmail@web31504.mail.mud.yahoo.com> In-Reply-To: <20050604050123.9897.qmail@web31504.mail.mud.yahoo.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2135 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
niv@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1152 Lines: 35 Jonathan Day wrote: > What I have not (yet) seen is any work on relating the > results. Is a bug in the design? The implementation? > Some combination thereof? Is something correctly > written but not functioning because something it > depends on isn't working correctly? Currently, you can get some idea (kernel didn't build, machine couldn't reboot, or, if the system crashes during the tests, crash info, etc.). Looking into whether the cause is a design bug or an implementation bug is likely beyond automation. > It would even be useful if we could cross-reference > some of the benchmarks with the Linux graphing > project, so that we could see how the complexity of I believe they do (ping Martin for details) have some plans to graph stuff, and possibly info could be sucked out of the data/results provided to feed other people's needs. > Test suites are necessary. Test suites are great. > Anyone working on a test suite deserves many kudos and > much praise. Test suites that are relatable enough > that you can see the same problem from different > angles -- those are worth their printout weight in > gold. Yeah. :).
thanks, Nivedita From tkoponen@iki.fi Mon Jun 6 12:00:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 12:00:50 -0700 (PDT) Received: from twilight.cs.hut.fi (twilight.cs.hut.fi [130.233.40.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56J0kXq017956 for ; Mon, 6 Jun 2005 12:00:47 -0700 Received: by twilight.cs.hut.fi (Postfix, from userid 60001) id 245452DD3; Mon, 6 Jun 2005 21:59:44 +0300 (EEST) Received: from [127.0.0.1] (kekkonen.cs.hut.fi [130.233.41.50]) by twilight.cs.hut.fi (Postfix) with ESMTP id 2CB182DBF for ; Mon, 6 Jun 2005 21:59:42 +0300 (EEST) Mime-Version: 1.0 (Apple Message framework v622) Content-Transfer-Encoding: 7bit Message-Id: Content-Type: text/plain; charset=US-ASCII; format=flowed To: netdev@oss.sgi.com From: Teemu Koponen Subject: New address announcements in RTMGRP_IPV4_IFADDR netlink group Date: Mon, 6 Jun 2005 11:59:38 -0700 X-Mailer: Apple Mail (2.622) X-archive-position: 2136 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tkoponen@iki.fi Precedence: bulk X-list: netdev Content-Length: 1061 Lines: 29 Netlink developers and gurus, While fine-tuning the handover speed for a certain L3 mobility daemon under Linux 2.6.11.10, I stumbled into the following behavior which intuitively does not follow the semantics of the RTMGRP_IPV4_IFADDR group: 0) A userspace daemon process is running and listening to the broadcast group. 1) Address is inserted to an interface (ip addr add ... at shell). 2) The daemon receives a NEWADDR message, just as it should, but the daemon is unable to bind to the address *immediately* (actually in the function that processes the netlink message). The result is "cannot assign an address" from the bind call. However, if I do insert a single nanosleep, even with an arbitrarily low sleep value, before the bind call, the bind then succeeds. So, what are the semantics of NEWADDR?
Should the address be bindable right after receiving the message? Or is there a race condition between userspace and kernel that the inserted sleep helps to overcome by letting the kernel run again before the bind call? TIA, Teemu -- From kamenzky@inf.fu-berlin.de Mon Jun 6 12:09:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 12:09:35 -0700 (PDT) Received: from math.fu-berlin.de (leibniz.math.fu-berlin.de [160.45.40.10]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56J9TXq018797 for ; Mon, 6 Jun 2005 12:09:30 -0700 Received: (qmail 12495 invoked from network); 6 Jun 2005 21:08:26 +0200 Received: from lusin.mi.fu-berlin.de (HELO mi.fu-berlin.de) (160.45.113.91) by leibniz.math.fu-berlin.de with SMTP; 6 Jun 2005 21:08:26 +0200 Received: (qmail 10674 invoked by uid 9804); 6 Jun 2005 21:08:25 +0200 Received: from localhost (HELO mi.fu-berlin.de) (127.0.0.1) by localhost with SMTP; 6 Jun 2005 21:08:23 +0200 Received: (qmail 10575 invoked by uid 9804); 6 Jun 2005 21:08:23 +0200 Received: from leibniz.math.fu-berlin.de (HELO math.fu-berlin.de) (160.45.40.10) by lusin.mi.fu-berlin.de with SMTP; 6 Jun 2005 21:08:23 +0200 Received: (Qmail 12464 invoked from network); 6 Jun 2005 21:08:23 +0200 Received: From rosine141.inf.fu-berlin.de (HELO ?160.45.116.141?)
(160.45.116.141) by leibniz.math.fu-berlin.de with SMTP; 6 Jun 2005 19:08:23 -0000 X-Envelope-Sender: kamenzky@inf.fu-berlin.de X-Remote-IP: 160.45.116.141 Mime-Version: 1.0 (Apple Message framework v622) Content-Type: text/plain; charset=US-ASCII; format=flowed Message-Id: Content-Transfer-Encoding: 7bit From: Nico Subject: OT: Survey facing design patterns and communication Date: Mon, 6 Jun 2005 21:10:41 +0200 To: netdev@oss.sgi.com X-Mailer: Apple Mail (2.622) X-archive-position: 2137 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kamenzky@inf.fu-berlin.de Precedence: bulk X-list: netdev Content-Length: 631 Lines: 22 Hello everybody! We are a group of students at "Freie Universitaet Berlin". As part of our computer science studies we are going to do a survey facing the use of design patterns in communication. Examples of design patterns are "Abstract Factory", "Singleton", "Composite", "Iterator" and "Listener". If you know what we are talking about, you are welcome to take part in our survey. It takes about 5 minutes to fill out the form. Just jump to: http://study.beatdepot.de If you agree, we will send you the results of our survey. Thanks in advance for your participation! And sorry for the interruption of your discussion. 
From rahulhsaxena@hotmail.com Mon Jun 6 12:13:11 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 12:13:15 -0700 (PDT) Received: from hotmail.com (bay24-f18.bay24.hotmail.com [64.4.18.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56JDBXq020281 for ; Mon, 6 Jun 2005 12:13:11 -0700 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Mon, 6 Jun 2005 12:12:09 -0700 Message-ID: Received: from 144.16.64.4 by by24fd.bay24.hotmail.msn.com with HTTP; Mon, 06 Jun 2005 19:12:09 GMT X-Originating-IP: [144.16.64.4] X-Originating-Email: [rahulhsaxena@hotmail.com] X-Sender: rahulhsaxena@hotmail.com In-Reply-To: <20050606121527.GE15391@postel.suug.ch> From: "rahul hari" To: tgraf@suug.ch, hadi@cyberus.ca Cc: diffserv-general@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Linux Diffserv] GRED queueing discipline and the file sch_gred.c Date: Tue, 07 Jun 2005 00:42:09 +0530 Mime-Version: 1.0 Content-Type: text/plain; format=flowed X-OriginalArrivalTime: 06 Jun 2005 19:12:09.0587 (UTC) FILETIME=[A3055C30:01C56ACB] X-archive-position: 2138 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rahulhsaxena@hotmail.com Precedence: bulk X-list: netdev Content-Length: 3656 Lines: 80 Thanks for all the suggestions Jamal and Thomas. From what you people have been suggesting, I feel that I should be giving a detailed explanation of the problem I am currently working on. I have divided all the traffic on a network into 5 categories: real-time video (UDP1), real-time audio (UDP2), TCP not requiring any QoS (TCP1), TCP requiring QoS but with the size of the entire transaction very low (TCP2), and TCP requiring QoS with the size of the transaction in several MBs (TCP3). Now I am putting UDP1 and TCP1 in one particular queue (say q1) and giving priority to UDP1 (for dequeuing, not caring if TCP1 is getting starved).
I am putting UDP2 ,TCP2 and TCP3 in a different queue (thus keeping the average queue length almost constant) (say q2)and applying RED on each of TCP2 and TCP3 (the application of the two REDs being independent of each other). Here also I am providing priority to UDP2 (without caring if TCP2 or TCP3 is getting starved ). To schedule between q1 and q2, I am using WRR and to schedule between UDP1 and TCP1, I am using prio. For implementing q2, I am currently putting UDP2,TCP2 and TCP3 in 3 different virtual queues and applying GRED with grio. I am providing UDP2 the highest priority and providing TCP2 and TCP3 equal priorities. To ensure that RED does not apply on the UDP2, I have set Tmax=Tmin so that Pbmax=1. But the results I am getting with this configuration do not match with the results that I have got from the simulations. So I want to implement this stuff such that the UDP2 gets highest priority among the three, is not included while calculating the total average queue length and the qave used for the application of REDs on TCP2 and TCP3 should be equal to the qave of tcp2+ qave of tcp3. To schedule between TCP2 and TCP3, I want to use WRR or something that gives equal priority and prevents the starvation of any of these. PS: please send any further replies to rahul.hari@cse06.itbhu.org instead of this account Regards, Rahul ------- "The fear you let build up in your mind is worse than the situation that actually exists" taken from "who moved my cheese" ----------------------------------------------------------------------------- Rahul Hari Senior Undergraduate Student, Department of CSE, ITBHU, Varanasi. Ph: +91-9845347020 ----------------------------------------------------------------------------- > >* jamal <1118058859.6266.126.camel@localhost.localdomain> 2005-06-06 07:54 > > On Mon, 2005-06-06 at 13:39 +0200, Thomas Graf wrote: > > > > > Use a prio qdisc with RED leaf qdiscs. 
RED and GREDs purpose is to > > > calculate a marking probability and not to provide any prioritizing > > > schemes. > > > > Prioritization is still implicitly provided if you vary the queue > > lengths or the drop probabilities. > > For example, if you set everything to be exactly the same, and varied > > only the drop probability - the VQ with the highest drop probability > > will be less important (i.e relatively more of its packets will be > > dropped; recall: the drop decision is made before the packet is queued). > >Absolutely, what I meant is that GRED has no influence on the >actual ordering of packets not dropped. The priority together with >the qavg parameters and the thresholds only have influence on the >probability a packet gets marked/dropped. Sure, this is prioritization >as well, but Rahul wanted to have one VQ starve out another VQ >completely. My point is that this is not possible with GRED.
From davem@davemloft.net Mon Jun 6 12:48:48 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 12:48:58 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56JmmXq025972 for ; Mon, 6 Jun 2005 12:48:48 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DfNZF-0003gi-LX; Mon, 06 Jun 2005 12:47:25 -0700 Date: Mon, 06 Jun 2005 12:47:25 -0700 (PDT) Message-Id: <20050606.124725.85409439.davem@davemloft.net> To: john.ronciak@intel.com Cc: mchan@broadcom.com, hadi@cyberus.ca, buytenh@wantstofly.org, mitch.a.williams@intel.com, jdmason@us.ibm.com, shemminger@osdl.org, netdev@oss.sgi.com, Robert.Olsson@data.slu.se, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com Subject: Re: RFC: NAPI packet weighting patch From: "David S. Miller" In-Reply-To: <468F3FDA28AA87429AD807992E22D07E0450C002@orsmsx408> References: <468F3FDA28AA87429AD807992E22D07E0450C002@orsmsx408> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2139 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1304 Lines: 31 From: "Ronciak, John" Date: Mon, 6 Jun 2005 08:35:26 -0700 > We are dropping packets at the HW level (FIFO errors) with 256 > descriptors and the default weight of 64. As we said reducing the > weight eliminates this which is understandable since the driver is being > serviced more frequently. We also hacked the driver to do a buffer > allocation per packet sent up the stack. This reduced the number of > dropped packets by about 80% but it was still a significant number of > drops (190K to 39K dropped).
So I don't think this is where the problem > is. This is also confirmed with the tg3 driver doing the buffer update > to the HW every 25 descriptors. I reach a different conclusion, sorry. :-) Here is the invariant: If you force the e1000 driver to do RX replenishment every N packets it should reduce the packet drops by the same amount (in the single NIC case) as reducing dev->weight to that same value N. You have two test cases, single NIC and multi-NIC, so you should be very clear about which case your drop numbers apply to. They are two totally different problems. > We did not up the descriptor ring size with the default weight but will > try this today and report back. Thanks for all of your test data and hard work so far. It's very valuable. From dlstevens@us.ibm.com Mon Jun 6 12:49:46 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 12:49:50 -0700 (PDT) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56JnjXq026248 for ; Mon, 6 Jun 2005 12:49:46 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e31.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id j56Jmgua512976 for ; Mon, 6 Jun 2005 15:48:42 -0400 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id j56JmgXR189852 for ; Mon, 6 Jun 2005 13:48:42 -0600 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j56JmSqr017129 for ; Mon, 6 Jun 2005 13:48:29 -0600 Received: from d03nm121.boulder.ibm.com (d03nm121.boulder.ibm.com [9.17.195.147]) by d03av04.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id j56JmSxk017110; Mon, 6 Jun 2005 13:48:28 -0600 To: davem@davemloft.net, yoshfuji@linux-ipv6.org Cc: netdev@oss.sgi.com MIME-Version: 1.0 Subject: IPV6 RFC3542 compliance [PATCH] X-Mailer: Lotus Notes Release 6.0.2CF1 June 9,
2003 Message-ID: From: David Stevens Date: Mon, 6 Jun 2005 13:48:26 -0600 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.53HF546 | May 23, 2005) at 06/06/2005 13:48:27 Content-Type: multipart/mixed; boundary="=_mixed 006CCE2C88257018_=" X-archive-position: 2140 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dlstevens@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 38257 Lines: 787

--=_mixed 006CCE2C88257018_=
Content-Type: text/plain; charset="US-ASCII"

I've been looking at RFC 3542 (Advanced Sockets API) compliance, and
found the following:
("x" is one of {PKTINFO, HOPLIMIT, RTHDR, DSTOPTS, TCLASS })

What RFC 3542 says:
1) IPV6_x as socket options specify "sticky" option values;
   getsockopt() returns the current values of the sticky options
   setsockopt() sets the values for future sends
2) IPV6_RECVx are boolean socket options indicating whether the
   particular field will be returned in ancillary data on a recvmsg()
   getsockopt() gets the current value (1 or 0)
   setsockopt() sets or clears the boolean value
3) Ancillary data (send and receive) use IPV6_x for the corresponding
   data item

What current kernel does:
1) IPV6_x are boolean options
2) the sticky versions are not implemented
3) TCLASS is not implemented

The patch below adds sending and receiving of traffic class, the
definitions for IPV6_RECVx and changes the boolean socket options to
their RFC 3542 names. The original names are still there for use with
sticky options in the future (not included here), and as the ancillary
data message types.

The bad news:
This patch changes the argument lists of ip6_append_data() and
datagram_send_ctl(). This is because traffic class is not an extension
header, but part of the IPv6 header. This is analogous to the hop limit,
which is an explicit argument to these functions.
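As a rough userspace illustration of the RFC 3542 model summarized above (a boolean IPV6_RECVTCLASS option, with IPV6_TCLASS carried as an int in ancillary data), a minimal C sketch might look like this. The helper names enable_recv_tclass(), put_tclass_cmsg() and get_tclass_cmsg() are hypothetical, and the code relies on whatever values the system headers assign to IPV6_TCLASS and IPV6_RECVTCLASS rather than on the option numbers proposed in this patch.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Ask the stack to deliver the traffic class as ancillary data on receive. */
static int enable_recv_tclass(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS, &on, sizeof(on));
}

/*
 * Attach one IPV6_TCLASS item to msg for use with sendmsg().
 * buf must be suitably aligned and at least CMSG_SPACE(sizeof(int)) bytes.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int put_tclass_cmsg(struct msghdr *msg, void *buf, size_t buflen,
                           int tclass)
{
    struct cmsghdr *cmsg;

    if (buflen < CMSG_SPACE(sizeof(int)))
        return -1;
    memset(buf, 0, buflen);
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(int));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IPV6;
    cmsg->cmsg_type = IPV6_TCLASS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &tclass, sizeof(int));
    return 0;
}

/* Return the traffic class found in msg's ancillary data, or -1 if absent. */
static int get_tclass_cmsg(struct msghdr *msg)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IPV6 &&
            cmsg->cmsg_type == IPV6_TCLASS &&
            cmsg->cmsg_len == CMSG_LEN(sizeof(int))) {
            int v;

            memcpy(&v, CMSG_DATA(cmsg), sizeof(v));
            return v;
        }
    }
    return -1;
}
```

The parser only accepts an int-sized payload, which is what RFC 3542 specifies for IPV6_TCLASS; a receiver that must cope with other payload sizes would need to relax that length check.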
I've tested these pieces, but I have a couple open questions which may be relevant (will continue looking myself...): 1) In ipv6_pinfo, there is a "hop_limit" field at the top level and another "cork.hop_limit". Why aren't these the same? 2) The (old name) IPV6_RTHDR socket option allows a value of "2", used by TCP. Still need to see what that's about for relevance to other options (but this code leaves that unchanged, except the name). +-DLS in-line for view, attached for applying Signed-off-by: David L Stevens diff -ruNp linux-2.6.11.10/include/linux/in6.h linux-2.6.11.10T2/include/linux/in6.h --- linux-2.6.11.10/include/linux/in6.h 2005-05-16 10:51:43.000000000 -0700 +++ linux-2.6.11.10T2/include/linux/in6.h 2005-05-23 14:12:59.000000000 -0700 @@ -172,6 +172,7 @@ struct in6_flowlabel_req #define IPV6_V6ONLY 26 #define IPV6_JOIN_ANYCAST 27 #define IPV6_LEAVE_ANYCAST 28 +#define IPV6_TCLASS 30 /* IPV6_MTU_DISCOVER values */ #define IPV6_PMTUDISC_DONT 0 @@ -184,6 +185,12 @@ struct in6_flowlabel_req #define IPV6_IPSEC_POLICY 34 #define IPV6_XFRM_POLICY 35 +#define IPV6_RTHDRDSTOPTS 36 +#define IPV6_RECVPKTINFO 37 +#define IPV6_RECVHOPLIMIT 38 +#define IPV6_RECVRTHDR 39 +#define IPV6_RECVHOPOPTS 40 +#define IPV6_RECVDSTOPTS 41 /* * Multicast: @@ -198,4 +205,6 @@ struct in6_flowlabel_req * MCAST_MSFILTER 48 */ +#define IPV6_RECVTCLASS 49 + #endif diff -ruNp linux-2.6.11.10/include/linux/ipv6.h linux-2.6.11.10T2/include/linux/ipv6.h --- linux-2.6.11.10/include/linux/ipv6.h 2005-05-16 10:51:43.000000000 -0700 +++ linux-2.6.11.10T2/include/linux/ipv6.h 2005-05-24 13:18:27.000000000 -0700 @@ -221,7 +221,8 @@ struct ipv6_pinfo { rxhlim:1, hopopts:1, dstopts:1, - rxflow:1; + rxflow:1, + rxtclass:1; } bits; __u8 all; } rxopt; @@ -244,6 +245,7 @@ struct ipv6_pinfo { struct ipv6_txoptions *opt; struct rt6_info *rt; int hop_limit; + int tclass; } cork; }; diff -ruNp linux-2.6.11.10/include/net/ipv6.h linux-2.6.11.10T2/include/net/ipv6.h --- linux-2.6.11.10/include/net/ipv6.h 
2005-05-16 10:51:49.000000000 -0700 +++ linux-2.6.11.10T2/include/net/ipv6.h 2005-05-24 14:57:23.000000000 -0700 @@ -347,6 +347,7 @@ extern int ip6_append_data(struct sock int length, int transhdrlen, int hlimit, + int tclass, struct ipv6_txoptions *opt, struct flowi *fl, struct rt6_info *rt, diff -ruNp linux-2.6.11.10/include/net/transp_v6.h linux-2.6.11.10T2/include/net/transp_v6.h --- linux-2.6.11.10/include/net/transp_v6.h 2005-05-16 10:51:51.000000000 -0700 +++ linux-2.6.11.10T2/include/net/transp_v6.h 2005-05-24 14:04:11.000000000 -0700 @@ -37,7 +37,7 @@ extern int datagram_recv_ctl(struct so extern int datagram_send_ctl(struct msghdr *msg, struct flowi *fl, struct ipv6_txoptions *opt, - int *hlimit); + int *hlimit, int *tclass); #define LOOPBACK4_IPV6 __constant_htonl(0x7f000006) diff -ruNp linux-2.6.11.10/net/ipv6/datagram.c linux-2.6.11.10T2/net/ipv6/datagram.c --- linux-2.6.11.10/net/ipv6/datagram.c 2005-05-16 10:52:00.000000000 -0700 +++ linux-2.6.11.10T2/net/ipv6/datagram.c 2005-05-24 14:03:56.000000000 -0700 @@ -388,6 +388,11 @@ int datagram_recv_ctl(struct sock *sk, s int hlim = skb->nh.ipv6h->hop_limit; put_cmsg(msg, SOL_IPV6, IPV6_HOPLIMIT, sizeof(hlim), &hlim); } + if (np->rxopt.bits.rxtclass) { + u8 tclass = (skb->nh.ipv6h->priority << 4) | + ((skb->nh.ipv6h->flow_lbl[0]>>4) & 0xf); + put_cmsg(msg, SOL_IPV6, IPV6_TCLASS, sizeof(tclass), &tclass); + } if (np->rxopt.bits.rxflow && (*(u32*)skb->nh.raw & IPV6_FLOWINFO_MASK)) { u32 flowinfo = *(u32*)skb->nh.raw & IPV6_FLOWINFO_MASK; @@ -414,7 +419,7 @@ int datagram_recv_ctl(struct sock *sk, s int datagram_send_ctl(struct msghdr *msg, struct flowi *fl, struct ipv6_txoptions *opt, - int *hlimit) + int *hlimit, int *tclass) { struct in6_pktinfo *src_info; struct cmsghdr *cmsg; @@ -587,6 +592,15 @@ int datagram_send_ctl(struct msghdr *msg *hlimit = *(int *)CMSG_DATA(cmsg); break; + case IPV6_TCLASS: + if (cmsg->cmsg_len != CMSG_LEN(sizeof(int))) { + err = -EINVAL; + goto exit_f; + } + + *tclass = *(int 
*)CMSG_DATA(cmsg); + break; + default: LIMIT_NETDEBUG( printk(KERN_DEBUG "invalid cmsg type: %d\n", cmsg->cmsg_type)); diff -ruNp linux-2.6.11.10/net/ipv6/icmp.c linux-2.6.11.10T2/net/ipv6/icmp.c --- linux-2.6.11.10/net/ipv6/icmp.c 2005-05-16 10:52:00.000000000 -0700 +++ linux-2.6.11.10T2/net/ipv6/icmp.c 2005-05-24 15:05:14.000000000 -0700 @@ -287,7 +287,7 @@ void icmpv6_send(struct sk_buff *skb, in int iif = 0; int addr_type = 0; int len; - int hlimit; + int hlimit, tclass; int err = 0; if ((u8*)hdr < skb->head || (u8*)(hdr+1) > skb->tail) @@ -381,6 +381,9 @@ void icmpv6_send(struct sk_buff *skb, in hlimit = np->hop_limit; if (hlimit < 0) hlimit = dst_metric(dst, RTAX_HOPLIMIT); + tclass = np->cork.tclass; + if (tclass < 0) + tclass = 0; msg.skb = skb; msg.offset = skb->nh.raw - skb->data; @@ -398,7 +401,7 @@ void icmpv6_send(struct sk_buff *skb, in err = ip6_append_data(sk, icmpv6_getfrag, &msg, len + sizeof(struct icmp6hdr), sizeof(struct icmp6hdr), - hlimit, NULL, &fl, (struct rt6_info*)dst, + hlimit, tclass, NULL, &fl, (struct rt6_info*)dst, MSG_DONTWAIT); if (err) { ip6_flush_pending_frames(sk); @@ -432,6 +435,7 @@ static void icmpv6_echo_reply(struct sk_ struct dst_entry *dst; int err = 0; int hlimit; + int tclass; saddr = &skb->nh.ipv6h->daddr; @@ -467,15 +471,18 @@ static void icmpv6_echo_reply(struct sk_ hlimit = np->hop_limit; if (hlimit < 0) hlimit = dst_metric(dst, RTAX_HOPLIMIT); + tclass = np->cork.tclass; + if (tclass < 0) + tclass = 0; idev = in6_dev_get(skb->dev); msg.skb = skb; msg.offset = 0; - err = ip6_append_data(sk, icmpv6_getfrag, &msg, skb->len + sizeof(struct icmp6hdr), - sizeof(struct icmp6hdr), hlimit, NULL, &fl, - (struct rt6_info*)dst, MSG_DONTWAIT); + err = ip6_append_data(sk, icmpv6_getfrag, &msg, skb->len + + sizeof(struct icmp6hdr), sizeof(struct icmp6hdr), hlimit, + tclass, NULL, &fl, (struct rt6_info*)dst, MSG_DONTWAIT); if (err) { ip6_flush_pending_frames(sk); diff -ruNp linux-2.6.11.10/net/ipv6/ip6_flowlabel.c 
linux-2.6.11.10T2/net/ipv6/ip6_flowlabel.c --- linux-2.6.11.10/net/ipv6/ip6_flowlabel.c 2005-05-16 10:52:00.000000000 -0700 +++ linux-2.6.11.10T2/net/ipv6/ip6_flowlabel.c 2005-05-24 14:04:28.000000000 -0700 @@ -311,7 +311,7 @@ fl_create(struct in6_flowlabel_req *freq msg.msg_control = (void*)(fl->opt+1); flowi.oif = 0; - err = datagram_send_ctl(&msg, &flowi, fl->opt, &junk); + err = datagram_send_ctl(&msg, &flowi, fl->opt, &junk, &junk); if (err) goto done; err = -EINVAL; diff -ruNp linux-2.6.11.10/net/ipv6/ip6_output.c linux-2.6.11.10T2/net/ipv6/ip6_output.c --- linux-2.6.11.10/net/ipv6/ip6_output.c 2005-05-16 10:52:00.000000000 -0700 +++ linux-2.6.11.10T2/net/ipv6/ip6_output.c 2005-05-24 14:58:51.000000000 -0700 @@ -211,7 +211,7 @@ int ip6_xmit(struct sock *sk, struct sk_ struct ipv6hdr *hdr; u8 proto = fl->proto; int seg_len = skb->len; - int hlimit; + int hlimit, tclass; u32 mtu; if (opt) { @@ -253,6 +253,13 @@ int ip6_xmit(struct sock *sk, struct sk_ hlimit = np->hop_limit; if (hlimit < 0) hlimit = dst_metric(dst, RTAX_HOPLIMIT); + tclass = -1; + if (np) + tclass = np->cork.tclass; + if (tclass < 0) + tclass = 0; + hdr->priority = (np->cork.tclass>>4) &0xf; + hdr->flow_lbl[0] |= (np->cork.tclass & 0xf)<<4; hdr->payload_len = htons(seg_len); hdr->nexthdr = proto; @@ -806,10 +813,11 @@ out_err_release: return err; } -int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb), - void *from, int length, int transhdrlen, - int hlimit, struct ipv6_txoptions *opt, struct flowi *fl, struct rt6_info *rt, - unsigned int flags) +int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to, + int offset, int len, int odd, struct sk_buff *skb), + void *from, int length, int transhdrlen, + int hlimit, int tclass, struct ipv6_txoptions *opt, struct flowi *fl, + struct rt6_info *rt, unsigned int flags) { struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); @@ -847,6 +855,7 @@ int 
ip6_append_data(struct sock *sk, int np->cork.rt = rt; inet->cork.fl = *fl; np->cork.hop_limit = hlimit; + np->cork.tclass = tclass; inet->cork.fragsize = mtu = dst_pmtu(&rt->u.dst); inet->cork.length = 0; sk->sk_sndmsg_page = NULL; @@ -1130,6 +1139,10 @@ int ip6_push_pending_frames(struct sock *(u32*)hdr = fl->fl6_flowlabel | htonl(0x60000000); + /* traffic class */ + hdr->priority = (np->cork.tclass>>4) & 0xf; + hdr->flow_lbl[0] |= (np->cork.tclass & 0xf)<<4; + if (skb->len <= sizeof(struct ipv6hdr) + IPV6_MAXPLEN) hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); else diff -ruNp linux-2.6.11.10/net/ipv6/ipv6_sockglue.c linux-2.6.11.10T2/net/ipv6/ipv6_sockglue.c --- linux-2.6.11.10/net/ipv6/ipv6_sockglue.c 2005-05-16 10:52:00.000000000 -0700 +++ linux-2.6.11.10T2/net/ipv6/ipv6_sockglue.c 2005-06-06 11:52:15.000000000 -0700 @@ -208,33 +208,38 @@ int ipv6_setsockopt(struct sock *sk, int retv = 0; break; - case IPV6_PKTINFO: + case IPV6_RECVPKTINFO: np->rxopt.bits.rxinfo = valbool; retv = 0; break; - case IPV6_HOPLIMIT: + case IPV6_RECVHOPLIMIT: np->rxopt.bits.rxhlim = valbool; retv = 0; break; - case IPV6_RTHDR: + case IPV6_RECVRTHDR: if (val < 0 || val > 2) goto e_inval; np->rxopt.bits.srcrt = val; retv = 0; break; - case IPV6_HOPOPTS: + case IPV6_RECVHOPOPTS: np->rxopt.bits.hopopts = valbool; retv = 0; break; - case IPV6_DSTOPTS: + case IPV6_RECVDSTOPTS: np->rxopt.bits.dstopts = valbool; retv = 0; break; + case IPV6_RECVTCLASS: + np->rxopt.bits.rxtclass = valbool; + retv = 0; + break; + case IPV6_FLOWINFO: np->rxopt.bits.rxflow = valbool; retv = 0; @@ -274,7 +279,7 @@ int ipv6_setsockopt(struct sock *sk, int msg.msg_controllen = optlen; msg.msg_control = (void*)(opt+1); - retv = datagram_send_ctl(&msg, &fl, opt, &junk); + retv = datagram_send_ctl(&msg, &fl, opt, &junk, &junk); if (retv) goto done; update: @@ -620,26 +625,30 @@ int ipv6_getsockopt(struct sock *sk, int val = np->ipv6only; break; - case IPV6_PKTINFO: + case IPV6_RECVPKTINFO: val = 
np->rxopt.bits.rxinfo; break; - case IPV6_HOPLIMIT: + case IPV6_RECVHOPLIMIT: val = np->rxopt.bits.rxhlim; break; - case IPV6_RTHDR: + case IPV6_RECVRTHDR: val = np->rxopt.bits.srcrt; break; - case IPV6_HOPOPTS: + case IPV6_RECVHOPOPTS: val = np->rxopt.bits.hopopts; break; - case IPV6_DSTOPTS: + case IPV6_RECVDSTOPTS: val = np->rxopt.bits.dstopts; break; + case IPV6_RECVTCLASS: + val = np->rxopt.bits.rxtclass; + break; + case IPV6_FLOWINFO: val = np->rxopt.bits.rxflow; break; diff -ruNp linux-2.6.11.10/net/ipv6/raw.c linux-2.6.11.10T2/net/ipv6/raw.c --- linux-2.6.11.10/net/ipv6/raw.c 2005-05-16 10:52:00.000000000 -0700 +++ linux-2.6.11.10T2/net/ipv6/raw.c 2005-05-24 15:09:42.000000000 -0700 @@ -617,6 +617,7 @@ static int rawv6_sendmsg(struct kiocb *i struct flowi fl; int addr_len = msg->msg_namelen; int hlimit = -1; + int tclass = -1; u16 proto; int err; @@ -702,7 +703,7 @@ static int rawv6_sendmsg(struct kiocb *i memset(opt, 0, sizeof(struct ipv6_txoptions)); opt->tot_len = sizeof(struct ipv6_txoptions); - err = datagram_send_ctl(msg, &fl, opt, &hlimit); + err = datagram_send_ctl(msg, &fl, opt, &hlimit, &tclass); if (err < 0) { fl6_sock_release(flowlabel); return err; @@ -758,6 +759,12 @@ static int rawv6_sendmsg(struct kiocb *i hlimit = dst_metric(dst, RTAX_HOPLIMIT); } + if (tclass < 0) { + tclass = np->cork.tclass; + if (tclass < 0) + tclass = 0; + } + if (msg->msg_flags&MSG_CONFIRM) goto do_confirm; @@ -766,8 +773,9 @@ back_from_confirm: err = rawv6_send_hdrinc(sk, msg->msg_iov, len, &fl, (struct rt6_info*)dst, msg->msg_flags); } else { lock_sock(sk); - err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov, len, 0, - hlimit, opt, &fl, (struct rt6_info*)dst, msg->msg_flags); + err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov, + len, 0, hlimit, tclass, opt, &fl, (struct rt6_info*)dst, + msg->msg_flags); if (err) ip6_flush_pending_frames(sk); diff -ruNp linux-2.6.11.10/net/ipv6/udp.c linux-2.6.11.10T2/net/ipv6/udp.c --- 
linux-2.6.11.10/net/ipv6/udp.c 2005-05-16 10:52:00.000000000 -0700 +++ linux-2.6.11.10T2/net/ipv6/udp.c 2005-05-24 15:11:58.000000000 -0700 @@ -637,6 +637,7 @@ static int udpv6_sendmsg(struct kiocb *i int addr_len = msg->msg_namelen; int ulen = len; int hlimit = -1; + int tclass = -1; int corkreq = up->corkflag || msg->msg_flags&MSG_MORE; int err; @@ -758,7 +759,7 @@ do_udp_sendmsg: memset(opt, 0, sizeof(struct ipv6_txoptions)); opt->tot_len = sizeof(*opt); - err = datagram_send_ctl(msg, fl, opt, &hlimit); + err = datagram_send_ctl(msg, fl, opt, &hlimit, &tclass); if (err < 0) { fl6_sock_release(flowlabel); return err; @@ -812,6 +813,11 @@ do_udp_sendmsg: if (hlimit < 0) hlimit = dst_metric(dst, RTAX_HOPLIMIT); } + if (tclass < 0) { + tclass = np->cork.tclass; + if (tclass < 0) + tclass = 0; + } if (msg->msg_flags&MSG_CONFIRM) goto do_confirm; @@ -832,9 +838,10 @@ back_from_confirm: do_append_data: up->len += ulen; - err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov, ulen, sizeof(struct udphdr), - hlimit, opt, fl, (struct rt6_info*)dst, - corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags); + err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov, ulen, + sizeof(struct udphdr), hlimit, tclass, opt, fl, + (struct rt6_info*)dst, + corkreq ? 
msg->msg_flags|MSG_MORE : msg->msg_flags); if (err) udp_v6_flush_pending_frames(sk); else if (!corkreq) --=_mixed 006CCE2C88257018_=--
From tgraf@suug.ch Mon Jun 6 12:55:04 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 12:55:08 -0700 (PDT) Received: from postel.suug.ch (postel.suug.ch [195.134.158.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56Jt3Xq027261 for ; Mon, 6 Jun 2005 12:55:04 -0700 Received: by postel.suug.ch (Postfix, from userid 10001) id 41DBE1C0EF; Mon, 6 Jun 2005 21:54:22 +0200 (CEST) Date: Mon, 6 Jun 2005 21:54:22 +0200 From: Thomas Graf To: Teemu Koponen Cc: netdev@oss.sgi.com Subject: Re: New address announcements in RTMGRP_IPV4_IFADDR netlink group
Message-ID: <20050606195422.GJ15391@postel.suug.ch> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-archive-position: 2141 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Content-Length: 1179 Lines: 23 * Teemu Koponen 2005-06-06 11:59 > 0) A userspace daemon process is running and listening to the broadcast > group. > > 1) Address is inserted to an interface (ip addr add ... at shell). > > 2) The daemon receives a NEWADDR message, just as it should, but the > daemon is unable to bind to the address *immediately* (actually in the > function that processes the netlink message). The result is "cannot > assign an address" from the bind call. However, if I do insert a single > nanosleep, even with an arbitrarily low sleep value, before the bind > call, the bind then succeeds. > > So, what are the semantics of NEWADDR? Should the address be bindable > right after receiving the message? The bind() call doesn't fail because the address is nonexistent; it fails because the route has not been created for it. The netlink message is generated before we notify the other subsystems about the addition of a new address, so you try to bind to an address for which no route has been generated yet. The best solution is probably to wait until the route addition notification message has been received and then bind to that address.
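A minimal sketch of that approach, under stated assumptions: the daemon also subscribes to RTMGRP_IPV4_ROUTE and treats an RTM_NEWROUTE for an RTN_LOCAL route whose RTA_DST equals the new address as the signal that bind() can proceed. The names open_addr_route_listener() and local_route_announced() are made up for illustration, and whether the local-table route is always announced this way should be checked against the kernel in use.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: open a netlink socket subscribed to both the
 * IPv4 address group and the IPv4 route group.  Returns the socket
 * descriptor, or -1 on failure.
 */
int open_addr_route_listener(void)
{
	struct sockaddr_nl sa;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.nl_family = AF_NETLINK;
	sa.nl_groups = RTMGRP_IPV4_IFADDR | RTMGRP_IPV4_ROUTE;

	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: return 1 if the received netlink buffer holds an
 * RTM_NEWROUTE message for a local (RTN_LOCAL) IPv4 route whose
 * destination equals addr (network byte order), otherwise 0.
 * Assumption: the kernel announces the local route for a new address
 * on RTMGRP_IPV4_ROUTE.
 */
int local_route_announced(void *buf, int len, uint32_t addr)
{
	struct nlmsghdr *nlh;

	assert(buf != NULL);

	for (nlh = buf; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
		struct rtmsg *rtm;
		struct rtattr *rta;
		int alen;

		if (nlh->nlmsg_type != RTM_NEWROUTE)
			continue;
		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*rtm)))
			continue;	/* too short to carry a struct rtmsg */

		rtm = NLMSG_DATA(nlh);
		if (rtm->rtm_family != AF_INET || rtm->rtm_type != RTN_LOCAL)
			continue;

		/* Walk the route attributes looking for the destination. */
		alen = RTM_PAYLOAD(nlh);
		for (rta = RTM_RTA(rtm); RTA_OK(rta, alen);
		     rta = RTA_NEXT(rta, alen)) {
			if (rta->rta_type == RTA_DST &&
			    RTA_PAYLOAD(rta) == sizeof(addr) &&
			    memcmp(RTA_DATA(rta), &addr, sizeof(addr)) == 0)
				return 1;
		}
	}
	return 0;
}
```

The daemon would remember the address from the NEWADDR message, keep calling recv() on the socket, pass each buffer to local_route_announced(), and only call bind() once it returns 1.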
From john.ronciak@intel.com Mon Jun 6 13:32:27 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 13:32:38 -0700 (PDT) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56KWRXq030063 for ; Mon, 6 Jun 2005 13:32:27 -0700 Received: from orsfmr101.jf.intel.com (orsfmr101.jf.intel.com [10.7.209.17]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id j56KTtBi012037; Mon, 6 Jun 2005 20:29:55 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr101.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id j56KTLUB028765; Mon, 6 Jun 2005 20:29:52 GMT Received: from orsmsx331.amr.corp.intel.com ([192.168.65.56]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.7.47) with SMTP id M2005060613295112784 ; Mon, 06 Jun 2005 13:29:51 -0700 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx331.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.211); Mon, 6 Jun 2005 13:29:51 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: RFC: NAPI packet weighting patch Date: Mon, 6 Jun 2005 13:29:50 -0700 Message-ID: <468F3FDA28AA87429AD807992E22D07E0450C00B@orsmsx408> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: RFC: NAPI packet weighting patch Thread-Index: AcVq0KInJf0GW6tXQUu/BgSh9BGIAQABPK1A From: "Ronciak, John" To: "David S. 
Miller" Cc: , , , "Williams, Mitch A" , , , , , "Venkatesan, Ganesh" , "Brandeburg, Jesse" X-OriginalArrivalTime: 06 Jun 2005 20:29:51.0695 (UTC) FILETIME=[7DDA95F0:01C56AD6] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id j56KWRXq030063 X-archive-position: 2142 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: john.ronciak@intel.com Precedence: bulk X-list: netdev Content-Length: 757 Lines: 22 > If you force the e1000 driver to do RX replenishment every N > packets it should reduce the packet drops the same (in the > single NIC case) as if you reduced the dev->weight to that > same value N. But this isn't what we are seeing. Even if we just reduce the weight value to 32 from 64, all of the drops go away. So there seem to be other things affecting this. We are just talking about single NIC testing at this point. I agree that single- and multi-NIC testing raise different issues, and we will need to test the multi-NIC case as well with whatever we come up with out of this. I also like your idea about the weight value being adjusted based on real work done using some measurable metric. This seems like a good path to explore as well.
Cheers,
John

From mchan@broadcom.com Mon Jun 6 13:41:20 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 13:41:24 -0700 (PDT) Received: from MMS1.broadcom.com (mms1.broadcom.com [216.31.210.17]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56KfJXq030916 for ; Mon, 6 Jun 2005 13:41:20 -0700 Received: from 10.10.64.121 by MMS1.broadcom.com with SMTP (Broadcom SMTP Relay (Email Firewall v6.1.0)); Mon, 06 Jun 2005 13:39:58 -0700 X-Server-Uuid: 146C3151-C1DE-4F71-9D02-C3BE503878DD Received: from mail-irva-8.broadcom.com ([10.10.64.221]) by mail-irva-1.broadcom.com (Post.Office MTA v3.5.3 release 223 ID# 0-72233U7200L2200S0V35) with ESMTP id com; Mon, 6 Jun 2005 13:39:56 -0700 Received: from mon-irva-10.broadcom.com (mon-irva-10.broadcom.com [10.10.64.171]) by mail-irva-8.broadcom.com (MOS 3.5.6-GR) with ESMTP id BCM96590; Mon, 6 Jun 2005 13:39:53 -0700 (PDT) Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id NAA27549; Mon, 6 Jun 2005 13:39:53 -0700 (PDT) Received: from 10.7.18.143 ([10.7.18.143]) by NT-IRVA-0741.brcm.ad.broadcom.com ([10.8.194.54]) with Microsoft Exchange Server HTTP-DAV ; Mon, 6 Jun 2005 20:39:52 +0000 Received: from rh4 by nt-irva-0741; 06 Jun 2005 12:42:22 -0700 Subject: [PATCH] tg3: Fix link failure in 5701 From: "Michael Chan" To: davem@davemloft.net cc: iod00d@hp.com, peterc@gelato.unsw.edu.au, netdev@oss.sgi.com Date: Mon, 06 Jun 2005 12:42:22 -0700 Message-ID: <1118086942.5008.14.camel@rh4> MIME-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-WSS-ID: 6EBA6B172U46020352-01-01 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-archive-position: 2143 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev Content-Length: 1309 Lines: 41

On some 5701 devices with older bootcode, the LED configuration bits in
SRAM may be invalid with value zero. The fix is to check for invalid
bits (0) and default to PHY 1 mode. Incorrect LED mode will lead to
error in programming the PHY.

Thanks to Grant Grundler for debugging the problem.

>From Grant:

| In May, 2004, tg3 v3.4 changed how MAC_LED_CTRL (0x40c) was getting
| programmed and how to determine what to program into LED_CTRL. The new
| code trusted NIC_SRAM_DATA_CFG (0x00000b58) to indicate what to write
| to LED_CTRL and MII EXT_CTRL registers. On "IOX Core Lan", SRAM was
| saying MODE_MAC (0x0) and that doesn't work.

Signed-off-by: Michael Chan

diff -Nru led1/drivers/net/tg3.c led2/drivers/net/tg3.c
--- led1/drivers/net/tg3.c	2005-06-06 10:19:56.692541944 -0700
+++ led2/drivers/net/tg3.c	2005-06-06 10:34:49.251852304 -0700
@@ -8555,6 +8555,16 @@
 
 		case NIC_SRAM_DATA_CFG_LED_MODE_MAC:
 			tp->led_ctrl = LED_CTRL_MODE_MAC;
+
+			/* Default to PHY_1_MODE if 0 (MAC_MODE) is
+			 * read on some older 5700/5701 bootcode.
+			 */
+			if (GET_ASIC_REV(tp->pci_chip_rev_id) ==
+			    ASIC_REV_5700 ||
+			    GET_ASIC_REV(tp->pci_chip_rev_id) ==
+			    ASIC_REV_5701)
+				tp->led_ctrl = LED_CTRL_MODE_PHY_1;
+
 			break;
 
 		case SHASTA_EXT_LED_SHARED:

From rmk+netdev=oss.sgi.com@arm.linux.org.uk Mon Jun 6 14:48:49 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 14:48:55 -0700 (PDT) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56LmjXq001817 for ; Mon, 6 Jun 2005 14:48:48 -0700 Received: from flint.arm.linux.org.uk ([2002:d412:e8ba:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.41) id 1DfPRS-0000AV-TP for netdev@oss.sgi.com; Mon, 06 Jun 2005 22:47:31 +0100 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.41) id 1DfPRR-0000Zt-S2 for netdev@oss.sgi.com; Mon, 06 Jun 2005 22:47:29 +0100 Date: Mon, 6 Jun 2005 22:47:29 +0100 From: Russell King To: netdev@oss.sgi.com Subject: Fwd:
[Bug 4615] Modem connection stalls out. Message-ID: <20050606224729.B12034@flint.arm.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i X-archive-position: 2144 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 3648 Lines: 116

Anyone have any ideas on this bug? The "No buffer space available"
looks like the system is running low on memory. Would networking folk
concur with that?

----- Forwarded message from bugme-daemon@kernel-bugs.osdl.org -----

Date: Mon, 6 Jun 2005 09:19:47 -0700
From: bugme-daemon@kernel-bugs.osdl.org
To: rmk@arm.linux.org.uk
Subject: [Bug 4615] Modem connection stalls out.

http://bugzilla.kernel.org/show_bug.cgi?id=4615

------- Additional Comments From alangrimes@starpower.net 2005-06-06 09:19 -------
The only reliable feedback I get from the bug, aside from its obvious symptoms,
is through ping... Here is a typical output:

64 bytes from 10.65.28.26: icmp_seq=296 ttl=255 time=2552 ms
64 bytes from 10.65.28.26: icmp_seq=297 ttl=255 time=1561 ms
64 bytes from 10.65.28.26: icmp_seq=298 ttl=255 time=567 ms
64 bytes from 10.65.28.26: icmp_seq=299 ttl=255 time=137 ms
64 bytes from 10.65.28.26: icmp_seq=300 ttl=255 time=484 ms # Hmm, exactly 5
## minutes, though I've seen it quit after only 10 seconds...)
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
## It would have continued repeating this message indefinitely...
## Note: many more iterations have been removed from this report!!! =P
## Below is what happens when I manually disconnect the modem by sending
## the break signal to the dialer.
ping: sendmsg: Network is unreachable
ping: sendmsg: Network is unreachable
ping: sendmsg: Network is unreachable
ping: sendmsg: Network is unreachable

--- 10.65.28.26 ping statistics ---
337 packets transmitted, 300 received, 10% packet loss, time 355295ms
rtt min/avg/max/mdev = 122.973/687.037/6298.163/1093.624 ms, pipe 7

###################################

After power cycling the modem here's what the dialer does:

leenooks ~ # wvdial
--> WvDial: Internet dialer version 1.54.0
--> Initializing modem.
--> Sending: ATZ
--> Sending: ATQ0
--> Re-Sending: ATZ
### the dialer is hung and will report that the modem is not responding in a
### few seconds...
### At this point I could ctrl-break the dialer and try again,
### However, this would be entirely unproductive as I'd get the same message
### each and every time.
### Only by allowing it to complete its cycle, will it return the modem to
### functionality. I suspect that the dialer sends an IOCTL or something to the
### driver which clears the fault...
--> Modem not responding.
leenooks ~ # wvdial
--> WvDial: Internet dialer version 1.54.0
--> Initializing modem.
--> Sending: ATZ
ATZ
OK
--> Sending: AT&F&D2&C1X4V1Q0S7=70W2\N3&K3S11=60
AT&F&D2&C1X4V1Q0S7=70W2\N3&K3S11=60
OK
--> Modem initialized.
--> Sending: ATDT7038298111
--> Waiting for carrier.
ATDT7038298111
CONNECT 49333
--> Carrier detected. Waiting for prompt.
** Ascend TNT2.LNHVA.MD.RCN.NET Terminal Server **
Login:
--> Looks like a login prompt.
--> Sending: alangrimes
alangrimes
Password:
--> Looks like a password prompt.
--> Sending: (password)
Entering PPP Session.
IP address is 66.44.56.212
MTU is 1006.
--> Looks like a welcome message.
--> Starting pppd at Tue Jun 7 04:58:30 2005
--> pid of pppd: 19733
--> Using interface ppp0
--> local IP address 66.44.56.212
--> remote IP address 10.65.28.27
--> primary DNS address 207.172.3.10
--> secondary DNS address 207.172.3.11

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

----- End forwarded message -----

-- 
Russell King

From davem@davemloft.net Mon Jun 6 15:18:01 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 15:18:04 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56MHuXq004137 for ; Mon, 6 Jun 2005 15:18:01 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DfPth-0007N4-Rl; Mon, 06 Jun 2005 15:16:41 -0700 Date: Mon, 06 Jun 2005 15:16:41 -0700 (PDT) Message-Id: <20050606.151641.95895557.davem@davemloft.net> To: mchan@broadcom.com Cc: iod00d@hp.com, peterc@gelato.unsw.edu.au, netdev@oss.sgi.com Subject: Re: [PATCH] tg3: Fix link failure in 5701 From: "David S. Miller" In-Reply-To: <1118086942.5008.14.camel@rh4> References: <1118086942.5008.14.camel@rh4> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2145 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 409 Lines: 11

From: "Michael Chan"
Date: Mon, 06 Jun 2005 12:42:22 -0700

> On some 5701 devices with older bootcode, the LED configuration bits in
> SRAM may be invalid with value zero. The fix is to check for invalid
> bits (0) and default to PHY 1 mode. Incorrect LED mode will lead to
> error in programming the PHY.
>
> Thanks to Grant Grundler for debugging the problem.

Applied, thanks a lot.

From davem@davemloft.net Mon Jun 6 15:28:36 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 15:28:40 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56MSaXq005215 for ; Mon, 6 Jun 2005 15:28:36 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DfQ42-0007Qq-SG; Mon, 06 Jun 2005 15:27:22 -0700 Date: Mon, 06 Jun 2005 15:27:22 -0700 (PDT) Message-Id: <20050606.152722.31644561.davem@davemloft.net> To: iod00d@hp.com Cc: mchan@broadcom.com, peterc@gelato.unsw.edu.au, netdev@oss.sgi.com Subject: Re: [PATCH] tg3: Fix link failure in 5701 From: "David S. Miller" In-Reply-To: <20050606222631.GE12068@esmail.cup.hp.com> References: <1118086942.5008.14.camel@rh4> <20050606.151641.95895557.davem@davemloft.net> <20050606222631.GE12068@esmail.cup.hp.com> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2146 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 479 Lines: 15

From: Grant Grundler
Date: Mon, 6 Jun 2005 15:26:31 -0700

> Btw, where can I see which version of tg3 will get this fix?
>
> I'm certainly I'll be asked the question "which tg3 version
> is required" more than the few times.

It will be version "3.30" with release date "June 6, 2005".

I will push it to Linus as soon as the kernel.org mirror system
picks it up from my GIT tree at:

rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/tg3-2.6.git/

From davem@davemloft.net Mon Jun 6 15:30:24 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 15:30:30 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56MUOXq005664 for ; Mon, 6 Jun 2005 15:30:24 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DfQ5p-0007RL-B8; Mon, 06 Jun 2005 15:29:13 -0700 Date: Mon, 06 Jun 2005 15:29:13 -0700 (PDT) Message-Id: <20050606.152913.88479223.davem@davemloft.net> To: tgraf@suug.ch Cc: netdev@oss.sgi.com Subject: Re: [PATCHSET] PKT_SCHED related fixes and a meta ematch completion From: "David S. Miller" In-Reply-To: <20050603211241.593114000@axs> References: <20050603211241.593114000@axs> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2147 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 509 Lines: 13

From: Thomas Graf
Date: Fri, 03 Jun 2005 23:12:41 +0200

> The following patchset fixes some serious bugs that prevent
> the basic classifier and the meta ematch from working properly.
> Patch 2 adds a few new meta collectors for socket attribtues which
> I'd like to have in 2.6.12 as well. If you think this is too
> intrusive (it isn't ;->) I'll resend patch 4 with offsets fixed.

I'll try to get these 4 patches into 2.6.12; they all look
straightforward and sane to me.

Thanks Thomas.

From davem@davemloft.net Mon Jun 6 15:37:30 2005 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Jun 2005 15:37:38 -0700 (PDT) Received: from sunset.davemloft.net (dsl027-180-168.sfo1.dsl.speakeasy.net [216.27.180.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j56MbUXq006468 for ; Mon, 6 Jun 2005 15:37:30 -0700 Received: from localhost ([127.0.0.1] ident=davem) by sunset.davemloft.net with esmtp (Exim 4.50) id 1DfQCQ-0007Rs-D4; Mon, 06 Jun 2005 15:36:02 -0700 Date: Mon, 06 Jun 2005 15:36:02 -0700 (PDT) Message-Id: <20050606.153602.23015220.davem@davemloft.net> To: herbert@gondor.apana.org.au Cc: johnpol@2ka.mipt.ru, jmorris@redhat.com, linux-crypto@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [RFC] Replace scatterlist with crypto_frag From: "David S. Miller" In-Reply-To: <20050604103249.GA1378@gondor.apana.org.au> References: <20050604102204.GA1214@gondor.apana.org.au> <20050604142939.4e2efc55@zanzibar.2ka.mipt.ru> <20050604103249.GA1378@gondor.apana.org.au> X-Mailer: Mew version 3.3 on Emacs 21.4 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-as