Re: dst cache cleared on netdev down?

To: Richard Guy Briggs <rgb@xxxxxxxxxxxxxxxxxxxxx>
Subject: Re: dst cache cleared on netdev down?
From: Andi Kleen <ak@xxxxxx>
Date: Wed, 13 Jun 2001 03:30:19 +0200
Cc: netdev@xxxxxxxxxxx
In-reply-to: <20010606135149.I31244@grendel.conscoop.ottawa.on.ca>; from rgb@conscoop.ottawa.on.ca on Wed, Jun 06, 2001 at 07:51:49PM +0200
References: <20010606135149.I31244@grendel.conscoop.ottawa.on.ca>
Sender: owner-netdev@xxxxxxxxxxx
On Wed, Jun 06, 2001 at 07:51:49PM +0200, Richard Guy Briggs wrote:
> Hi all,
> 
> I'm seeing oopses possibly coming from the attempted use of a dst cache
> entry after a device has been downed.
> 
> Can someone affirm that when a device goes down, it takes out all the
> routing table entries for that device and it also takes out all the dst
> cache entries for that device?

When an IP address is deleted, the routing cache is flushed after some delay.
The flush removes every dst_entry in the cache that has a zero reference count.
If you're relying on a dst_entry with a zero reference count, that's probably
a bug.
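
To make the flush rule concrete, here is a minimal user-space sketch, not
kernel code. The names toy_dst, toy_dst_hold(), toy_dst_release(),
toy_cache_add() and toy_cache_flush() are hypothetical stand-ins invented for
this illustration; they only model the idea that a flush frees entries nobody
holds and leaves referenced entries alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a cached route entry; NOT the kernel's dst_entry. */
struct toy_dst {
    int refcnt;              /* number of users currently holding the entry */
    struct toy_dst *next;    /* singly linked cache list */
};

/* Hypothetical helpers: take and drop a reference. */
static void toy_dst_hold(struct toy_dst *d)    { d->refcnt++; }
static void toy_dst_release(struct toy_dst *d) { d->refcnt--; }

/* Prepend a new, unreferenced entry; returns NULL on allocation failure. */
static struct toy_dst *toy_cache_add(struct toy_dst **head)
{
    struct toy_dst *d = malloc(sizeof(*d));
    if (!d)
        return NULL;
    d->refcnt = 0;
    d->next = *head;
    *head = d;
    return d;
}

/* Flush: free every entry with refcnt == 0, keep the rest.
 * Returns the number of entries freed. */
static int toy_cache_flush(struct toy_dst **head)
{
    int freed = 0;
    struct toy_dst **pp = head;

    while (*pp) {
        struct toy_dst *d = *pp;
        if (d->refcnt == 0) {
            *pp = d->next;   /* unlink, then free */
            free(d);
            freed++;
        } else {
            pp = &d->next;   /* still in use: skip it */
        }
    }
    return freed;
}

/* Demo: three entries, one of them held across the first flush.
 * Returns the total number of entries freed, or -1 on allocation failure. */
static int toy_demo(void)
{
    struct toy_dst *cache = NULL;
    struct toy_dst *a = toy_cache_add(&cache);
    struct toy_dst *b = toy_cache_add(&cache);
    struct toy_dst *c = toy_cache_add(&cache);
    int freed;

    if (!a || !b || !c) {
        toy_cache_flush(&cache);     /* nothing is held yet, frees all */
        return -1;
    }

    toy_dst_hold(b);                 /* b has a user, a and c do not */
    freed = toy_cache_flush(&cache); /* frees a and c; both now dangle */

    /* Only the held entry survives the flush. */
    assert(cache == b && b->next == NULL && b->refcnt == 1);

    toy_dst_release(b);              /* last user is done */
    freed += toy_cache_flush(&cache);
    return freed;
}
```

After the first flush, a and c are dangling pointers: any code that kept one
of them without taking a reference would be touching freed memory, which is
the kind of bug described above.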

> I am now getting oopses
> <http://west.toad.com/barfs/2001-06june-05/001.008.ksymopps> in
> neigh_connected_output() at 6f/b0, which could be dev->hard_header() or
> neigh->ops->queue_xmit().  In fact, I suspect neigh->ha, but don't know
> for certain.  Is it possible that neigh->ha is bogus when it tries to
> evaluate it before calling dev->hard_header?  I assume that the three
> assignments in the variable declarations are protected by the compiler
> and don't need to be in the body of the code to be checked before
> assignment?  If any one of the variables from which they point are null,
> it will not cause an oops?

When a physical device goes down, its neighbours with a zero reference count
get deleted. If you have a virtual interface, though, the neighbours it sees
should belong to the virtual interface.

>
> Is this a bug in neigh_connected_output(), the way we are using it, or
> the way we are attempting to clean up after the physical device goes
> down?

I guess you're messing up reference counts somewhere, so data structures
get deleted under you.
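
As a rough illustration of the discipline that avoids this, the user-space
sketch below pairs every use with a hold and a put, and frees the object only
on the last put. Everything here is hypothetical (toy_neigh, toy_neigh_hold(),
toy_neigh_put(), toy_dev_down(), toy_output(), and the "dead" flag); in
particular, the sketch assumes the table keeps its own reference to each
entry, which is a simplification chosen for the example and not a description
of the kernel's neighbour code.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy neighbour entry; NOT the kernel's struct neighbour. */
struct toy_neigh {
    int refcnt;   /* table reference plus one per active user */
    int dead;     /* set once the owning device has gone down */
};

static int toy_frees; /* counts frees so the demo can observe them */

/* Allocate an entry owned by the table (refcnt starts at 1). */
static struct toy_neigh *toy_neigh_alloc(void)
{
    struct toy_neigh *n = malloc(sizeof(*n));
    if (!n)
        return NULL;
    n->refcnt = 1;
    n->dead = 0;
    return n;
}

/* Take a reference before using the entry outside the table. */
static struct toy_neigh *toy_neigh_hold(struct toy_neigh *n)
{
    n->refcnt++;
    return n;
}

/* Drop a reference; the last put frees the entry. */
static void toy_neigh_put(struct toy_neigh *n)
{
    if (--n->refcnt == 0) {
        free(n);
        toy_frees++;
    }
}

/* Device goes down: mark the entry dead and drop the table's reference. */
static void toy_dev_down(struct toy_neigh *n)
{
    n->dead = 1;
    toy_neigh_put(n);
}

/* Output path: the caller must hold a reference. Returns -1 if dead. */
static int toy_output(const struct toy_neigh *n)
{
    return n->dead ? -1 : 0;
}

/* Demo: a user holds the entry while the device goes down.
 * Returns 0 if the entry stayed valid until the user's put. */
static int toy_neigh_demo(void)
{
    struct toy_neigh *n, *mine;
    int rc;

    toy_frees = 0;
    n = toy_neigh_alloc();
    if (!n)
        return -1;

    mine = toy_neigh_hold(n);   /* refcnt 2: table + this user */
    toy_dev_down(n);            /* refcnt 1: marked dead, not freed */
    assert(toy_frees == 0);

    rc = toy_output(mine);      /* safe to look at: we hold a reference */
    toy_neigh_put(mine);        /* last reference gone, entry is freed */

    return (rc == -1 && toy_frees == 1) ? 0 : -1;
}
```

If the toy_neigh_hold() call were missing, toy_dev_down() would free the entry
and the later toy_output() would read freed memory, i.e. the structure would
get deleted under its user.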


-Andi

-- 
Life would be so much easier if we could just look at the source code.
