From: Robert Olsson
To: Pasi Sjoholm
Cc: Robert Olsson, Francois Romieu, Héctor Martín, Linux-Kernel
Subject: Re: ksoftirqd uses 99% CPU triggered by network traffic (maybe RLT-8139 related)
Date: Tue, 27 Jul 2004 00:38:12 +0200
Message-ID: <16645.34772.760879.146784@robur.slu.se>
References: <16645.13126.52445.630789@robur.slu.se>

Pasi Sjoholm writes:

 > How are things, Robert?-)

Fine, thanks!

 > > In summary: High softirq loads can totally kill userland. The reason is that
 > > do_softirq() is run from many places (hard interrupts, local_bh_enable, etc.)
 > > and bypasses the ksoftirqd protection. It has just been discussed at OLS with
 > > Andrea and Dipankar and others. Current RCU suffers from this problem as well.
 >
 > Ok, this explanation makes sense, and from my point of view this is
 > quite a critical problem if you can "crash" the Linux kernel just by sending
 > enough packets to a network interface, for example.

Well, the packets also have to create a heavy softirq load; in practice this
means route lookups or something similar. For normal traffic the RX_SOFTIRQ is
very well behaved and reschedules itself to give other softirqs a chance to
run. I would also guess that you have softirqs from sources other than the
network. If you decrease your load a bit, do you get back to more normal
operation?

 > I would be more than glad to help you in testing if you want to publish
 > some patches.

Well, maybe we should start by verifying this problem. Alexey wrote a little
program that calls gettimeofday() in a loop to see how long user mode can be
starved. Also note that we are adding new sources of softirqs.

#include <stdio.h>
#include <sys/time.h>
#include <linux/types.h>	/* __s64, __u32, __s32 */

int main(void)
{
	struct timeval tv, prev_tv;
	__s64 diff;
	__u32 i;
	__s32 maxlat = 50;	/* ignore gaps below 50 us */

	gettimeofday(&prev_tv, NULL);
	printf("time control loop starting\n");

	for (i = 0; ; i++) {
		gettimeofday(&tv, NULL);

		/* Microseconds since the previous sample. */
		diff = (__s64)(tv.tv_sec - prev_tv.tv_sec) * 1000000 +
		       (tv.tv_usec - prev_tv.tv_usec);

		/* Flag any stall longer than one second. */
		if (diff > 1000000)
			printf("**%lld\n", (long long)diff);

		prev_tv = tv;

		/* Track the worst gap seen so far. */
		if (diff > maxlat) {
			maxlat = diff;
			printf("new maxlat = %d\n", maxlat);
		}

		/* Periodic progress line. */
		if (!(i % 1000000))
			printf("timestamp diff = %lld, maxlat = %d\n",
			       (long long)diff, maxlat);
	}
}

Cheers.

						--ro