From: Robert Olsson
Date: Sun, 16 Jan 2005 13:32:49 +0100
To: jeremy.guthrie@berbee.com
Cc: netdev@oss.sgi.com, Robert Olsson
Subject: Re: V2.4 policy router operates faster/better than V2.6
Message-ID: <16874.24305.461492.48668@robur.slu.se>

Jeremy M. Guthrie writes:
 > I actually upped the buffer count to 8192 buffers instead of 10k.
 > Of the 74 samples I have thus far, 57 have been clean of errors.
 > Most of the sample errors appear to be shortly after the cache flush.

I don't really believe in increasing RX buffers to this extent. We verified that you have CPU available and that the drops occur when the timer-based GC happens. Increasing buffers decreases overall performance and adds jitter. We also saw that the timer-based GC was taking the dst entries from about 600k to 40k in one shot.
I think this is what we should look into. The GC itself is "work", and after GC a lot of flows have to be recreated, which means doing a fib lookup and creating new entries. We want to smooth out the GC process so that it happens more frequently and does less work each time.

Some time ago an "in-flow" GC (as opposed to timer-based) was added to the routing code; look for cand in route.c. In a setup like yours (and ours) it would be better to rely on this process to a greater extent.

Anyway, in /proc/sys/net/ipv4/route/ you have the files gc_elasticity, gc_interval, gc_thresh, etc. I would avoid gc_min_interval. You can play with these on your running system and watch for drops without causing your users too much pain.

We will save the patch for routing without route hash and GC until later.

--ro
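
Before changing any of the GC files mentioned above, it may help to record their current values. The following is only a minimal sketch: it assumes Python is available on the router and that each file holds a single integer. The helper name read_gc_tunables is made up for illustration and is not part of any kernel tooling.

```python
from pathlib import Path

# Directory and file names as cited in the message; everything else
# in this sketch is an illustrative assumption.
ROUTE_SYSCTL_DIR = Path("/proc/sys/net/ipv4/route")
GC_TUNABLES = ("gc_elasticity", "gc_interval", "gc_thresh")


def read_gc_tunables(base_dir=ROUTE_SYSCTL_DIR, names=GC_TUNABLES):
    """Return {name: int} for each tunable file that exists and parses."""
    values = {}
    for name in names:
        path = Path(base_dir) / name
        try:
            # Assumption: the file contains one integer, possibly
            # followed by whitespace.
            values[name] = int(path.read_text().split()[0])
        except (OSError, ValueError, IndexError):
            # File missing, unreadable, or not an integer: skip it.
            continue
    return values


if __name__ == "__main__":
    for name, value in sorted(read_gc_tunables().items()):
        print(f"{name} = {value}")
```

Run as is, it prints whichever of the three files it can read. Passing a different base_dir makes it possible to try the function against a scratch directory instead of the live /proc tree.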