From: Robert Olsson
To: Pasi Sjoholm
Cc: Francois Romieu, Héctor Martín, Linux-Kernel
Subject: Re: ksoftirqd uses 99% CPU triggered by network traffic (maybe RLT-8139 related)
Date: Mon, 26 Jul 2004 18:37:26 +0200
Message-ID: <16645.13126.52445.630789@robur.slu.se>
References: <20040725235927.B30025@electric-eye.fr.zoreil.com>
X-list: netdev

Pasi Sjoholm writes:
 > Pid: 2, comm: ksoftirqd/0
 > EIP: 0060:[] CPU: 0
 > EIP is at rtl8139_poll+0xb4/0x100 [8139too]
 > EFLAGS: 00000247 Not tainted (2.6.7-mm7)
 > EAX: ffffe000 EBX: 00000040 ECX: df4824f8 EDX: c0441978
 > ESI: df482400 EDI: e0868000 EBP: dff85f80 DS: 007b ES: 007b
 > CR0: 8005003b CR2: b7c5a000 CR3: 1fafd000 CR4: 000006d0
 > [] ksoftirqd+0x0/0xc0
 > [] net_rx_action+0x6a/0x110
 > [] __do_softirq+0xa9/0xb0
 > [] do_softirq+0x27/0x30
 > [] ksoftirqd+0x68/0xc0
 > [] kthread+0xa5/0xb0
 > [] kthread+0x0/0xb0
 > [] kernel_thread_helper+0x5/0x14
 > --
 >
 > I'm not a kernel expert but it would seem that ksoftirqd is in some sort a
 > loop because I didn't get any "printk("%s wakes ksoftirqd\n",
 > current->comm);"-lines.

Hello!

This looks very much like the problem we see when doing route DoS testing
with Alexey.

In summary: high softirq loads can totally kill userland. The reason is that
do_softirq() is run from many places (hard interrupts, local_bh_enable(),
etc.) and so bypasses the ksoftirqd protection. This has just been discussed
at OLS with Andrea, Dipankar and others. Current RCU suffers from this
problem as well.

I've experimented with some code to defer softirqs to ksoftirqd after a
time, as well as with deferring all softirqs to ksoftirqd. Andrea had some
ideas, as did Ingo.

Cheers.
					--ro
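
The "defer after a time" idea can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code and not the experimental patch mentioned above: the function simulate(), the struct result fields and all parameters are hypothetical names. Each tick has cap units of CPU; softirq work run inline (as from do_softirq() at interrupt exit) takes CPU before anything else, up to budget units; whatever CPU is left is split evenly between a ksoftirqd-like thread and userland, as a fair scheduler might do.

```c
/* Toy model of inline softirq processing versus deferral to a thread.
 * All names here are hypothetical; nothing below is a kernel API. */

struct result {
    long inline_units;  /* work done in the unschedulable inline path */
    long thread_units;  /* work done by the ksoftirqd-like thread */
    long user_units;    /* CPU left over for userland */
    long backlog;       /* work still pending when the run ends */
};

/* ticks:    number of time slices to simulate
 * arrivals: new work items raised per tick (e.g. received packets)
 * cap:      CPU units available per tick
 * budget:   max units the inline path may consume per tick */
static struct result simulate(int ticks, int arrivals, int cap, int budget)
{
    struct result r = {0, 0, 0, 0};
    long pending = 0;

    for (int t = 0; t < ticks; t++) {
        int left = cap;

        pending += arrivals;

        /* Inline path: runs first and cannot be preempted by the
         * scheduler, but stops once its budget is used up. */
        int n = budget < left ? budget : left;
        if (pending < n)
            n = (int)pending;
        pending -= n;
        left -= n;
        r.inline_units += n;

        /* Deferred path: the thread competes with userland, so it
         * gets at most half of the remaining CPU in this model. */
        int share = left / 2;
        if (pending < share)
            share = (int)pending;
        pending -= share;
        left -= share;
        r.thread_units += share;

        /* Whatever remains goes to userland. */
        r.user_units += left;
    }

    r.backlog = pending;
    return r;
}
```

With arrivals equal to cap, an inline budget equal to cap leaves userland with no CPU at all, which is the starvation described above. A small budget pushes most of the work to the schedulable thread and userland makes progress again, at the price of a growing backlog (in a real driver that would show up as dropped packets).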