netdev

Re: "dst cache overflow"

To: Robert Olsson <Robert.Olsson@xxxxxxxxxxx>
Subject: Re: "dst cache overflow"
From: Harald Welte <laforge@xxxxxxxxxxxx>
Date: Tue, 21 Sep 2004 23:55:02 +0200
Cc: cd@xxxxxxxxxx, netdev@xxxxxxxxxxx
In-reply-to: <200409201907.43317.cd@xxxxxxxxxx>
References: <16719.13095.369830.547715@xxxxxxxxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.6+20040818i
On Mon, Sep 20, 2004 at 09:44:39PM +0200, Robert Olsson wrote:
> Hello!
> 
>  size  IN: hit  tot  mc  no_rt  bcast  madst  masrc  OUT: hit  tot  mc  GC:tot  ignore  gmiss  dstof  HLS:in  out
> 10501      593  339   0      0      0      0      0       250   24   0     362     360      2      0     229   48
> 
> 
> The number of packets that hit the route hash is about the same as the
> number of packets that miss it. (hit/tot)
> 
> Either your route hash is too small, in which case increase
> rhash_entries, or you are receiving a DoS attack.

Neither is the case.  I have now logged into that box and done some
further analysis.  There is definitely a dst_entry leak somewhere in
the kernel.  The number of entries in the ip_dst_cache slab is constantly
increasing; just before I rebooted the box it had reached about 65k.

At least when you manually flush the cache, the number of allocated
ip_dst_entries from slab should decrease...

After the reboot (uptime 4 hours at this point):

According to /proc/slabinfo, there are 7530 entries allocated.
If you cat /proc/net/rt_cache, you see about 1990 entries.
I've added a patch to export the number of dst_entries that are sitting
in the dst_garbage_list; it's 5549.  This value is increasing constantly
over time.  Coincidentally, if we subtract 7530-1990 we get 5540, almost
exactly this number.
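
The comparison above can be scripted so it is easy to repeat over time.
What follows is only a minimal sketch in Python, not the patch mentioned
above.  It assumes that the second column of the ip_dst_cache line in
/proc/slabinfo is the active object count, and that /proc/net/rt_cache
has one header line followed by one line per cached route.  The function
names are made up for illustration.  The dst_garbage_list count is not
read here, since it is only available with the extra patch.

```python
def slab_active_objects(slabinfo_text, cache_name="ip_dst_cache"):
    """Return the active object count for cache_name, or None if absent.

    Assumes the slabinfo layout "name active_objs num_objs ...".
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache_name:
            return int(fields[1])
    return None


def rt_cache_entries(rt_cache_text):
    """Count route cache entries: every non-empty line after the header."""
    lines = [l for l in rt_cache_text.splitlines() if l.strip()]
    return max(len(lines) - 1, 0)


def unaccounted(slab_count, cache_count):
    """Entries allocated from the slab but not visible in the route cache."""
    return slab_count - cache_count


if __name__ == "__main__":
    # Reading /proc/slabinfo may require root.
    with open("/proc/slabinfo") as f:
        allocated = slab_active_objects(f.read())
    with open("/proc/net/rt_cache") as f:
        cached = rt_cache_entries(f.read())
    if allocated is None:
        print("ip_dst_cache not found in /proc/slabinfo")
    else:
        print("allocated=%d cached=%d unaccounted=%d"
              % (allocated, cached, unaccounted(allocated, cached)))
```

With the numbers quoted above, unaccounted(7530, 1990) gives 5540.  If
that difference keeps growing across cache flushes, it points at entries
stuck outside the route cache rather than at a busy cache.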

I bet that something inside the kernel forgets dst_release().  IMQ is
only compiled in, not used (so I don't see how it could come from that).

Any comments, suggestions?

If nothing else helps, I will add a seq_file interface to
dst_garbage_list and try to find some similarity between the stale
entries in order to get a clue about what's going on.

>                                               --ro

btw: The system was running 2.4.x until about two weeks ago... with no
dst_cache problems.

-- 
- Harald Welte <laforge@xxxxxxxxxxxx>               http://www.gnumonks.org/
============================================================================
Programming is like sex: One mistake and you have to support it your lifetime

