dst cache cleared on netdev down?

To: netdev@xxxxxxxxxxx
Subject: dst cache cleared on netdev down?
From: Richard Guy Briggs <rgb@xxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 6 Jun 2001 13:51:49 -0400
Sender: owner-netdev@xxxxxxxxxxx
User-agent: Mutt/1.2.5i

Hi all,

I'm seeing oopses possibly coming from the attempted use of a dst cache
entry after a device has been downed.

Can someone affirm that when a device goes down, it takes out all the
routing table entries for that device and it also takes out all the dst
cache entries for that device?


The problem stems from the current kludgy way that FreeS/WAN gets
packets.

FreeS/WAN currently "attaches" a physical device to an ipsec virtual
device, so that when a packet is routed to that virtual device it
eventually comes out after encryption, being sent to the physical
device.  This was actually the way that it happened in 2.0 kernels, but
with the advent of dst cache, it now does a routing table lookup again,
attempting to use the physical device if a valid route exists.

When that physical device goes down for any reason, we would simply take
down the corresponding virtual device.  This would have the effect of
clearing all the routes that had been used to direct packets through the
ipsec device.  When the physical device came back up (in this case, ppp,
using the roaring penguin userspace driver) packets for which secure
tunnels had been set up were now being sent in the clear.

The code was changed so that if the physical device went down, the
virtual device would stay up but simply drop the packets until the
physical device was re-attached, ensuring that packets were dropped
rather than being sent in the clear.
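
To illustrate the behaviour described above, here is a minimal userspace
sketch of "stay up, but drop while detached".  The struct and function
names (phys_dev, ipsec_vdev, ipsec_vdev_xmit and friends) are hypothetical
stand-ins invented for this example; they are not the FreeS/WAN or kernel
definitions, and the encryption step is only a comment.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real device structures. */
struct phys_dev {
    int up;        /* non-zero while the link is usable */
    int tx_count;  /* packets handed to the physical device */
};

struct ipsec_vdev {
    struct phys_dev *phys;  /* NULL while no physical device is attached */
    int dropped;            /* packets discarded while detached or down */
};

enum xmit_result { XMIT_SENT, XMIT_DROPPED };

void ipsec_vdev_attach(struct ipsec_vdev *v, struct phys_dev *p)
{
    v->phys = p;
}

void ipsec_vdev_detach(struct ipsec_vdev *v)
{
    v->phys = NULL;
}

/* The virtual device stays up; a packet is dropped, never sent in the
 * clear, when there is no usable physical device behind it. */
enum xmit_result ipsec_vdev_xmit(struct ipsec_vdev *v)
{
    if (v->phys == NULL || !v->phys->up) {
        v->dropped++;
        return XMIT_DROPPED;
    }
    /* encryption would happen here, before handing off */
    v->phys->tx_count++;
    return XMIT_SENT;
}
```

The point of the design is that the routes through the virtual device are
never removed, so there is no window in which traffic can fall through to
a cleartext route.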

I am now getting oopses
<http://west.toad.com/barfs/2001-06june-05/001.008.ksymopps> in
neigh_connected_output() at 6f/b0, which could be dev->hard_header() or
neigh->ops->queue_xmit().  In fact, I suspect neigh->ha, but don't know
for certain.  Is it possible that neigh->ha is bogus when it tries to
evaluate it before calling dev->hard_header()?  I assume that the three
assignments in the variable declarations are protected by the compiler
and don't need to be moved into the body of the code so that each
pointer can be checked before it is assigned from?  That is, if any one
of the pointers they are read through is null, it will not cause an oops?
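
To make the question concrete: in C, an initializer in a declaration is
evaluated at run time exactly like an ordinary assignment, so a chain of
initializers dereferences each pointer in turn with no check in between.
Below is a minimal userspace sketch of what checking in the body would
look like.  The fake_* structs are hypothetical stand-ins, not the kernel
definitions, and checked_dev() is an invented helper, not an existing
function.

```c
#include <stddef.h>

/* Hypothetical stand-ins for skb -> dst -> neighbour -> dev. */
struct fake_dev   { int mtu; };
struct fake_neigh { struct fake_dev *dev; };
struct fake_dst   { struct fake_neigh *neighbour; };
struct fake_skb   { struct fake_dst *dst; };

/* Returns the device, or NULL if any link in the chain is missing. */
struct fake_dev *checked_dev(const struct fake_skb *skb)
{
    /*
     * Written as declaration initializers, i.e.
     *     struct fake_dst   *dst   = skb->dst;
     *     struct fake_neigh *neigh = dst->neighbour;
     *     struct fake_dev   *dev   = neigh->dev;
     * dst and neigh would be dereferenced unconditionally.
     */
    struct fake_dst *dst;
    struct fake_neigh *neigh;

    if (skb == NULL)
        return NULL;
    dst = skb->dst;
    if (dst == NULL)
        return NULL;
    neigh = dst->neighbour;
    if (neigh == NULL)
        return NULL;
    return neigh->dev;
}
```

Note that a check like this only catches NULL.  It does not help if a
pointer is stale (pointing at memory that has already been freed) rather
than NULL.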


Is this a bug in neigh_connected_output(), the way we are using it, or
the way we are attempting to clean up after the physical device goes
down?

        slainte mhath, RGB
-- 
Richard Guy Briggs -- PGP key available            Auto-Free Ottawa! Canada
<www.conscoop.ottawa.on.ca/rgb/>                       <www.flora.org/afo/>
Prevent Internet Wiretapping!        --        FreeS/WAN:<www.freeswan.org>
Thanks for voting Green! -- <green.ca>      Marillion:<www.marillion.co.uk>
