netdev

Re: static routes and dead gateway detection

To: Alexey Kuznetsov <kuznet@xxxxxxxxxxxxx>
Subject: Re: static routes and dead gateway detection
From: Julian Anastasov <ja@xxxxxx>
Date: Mon, 23 Jul 2001 23:20:53 +0000 (GMT)
Cc: <netdev@xxxxxxxxxxx>
In-reply-to: <200107222115.BAA07086@xxxxxxxxxxxxxx>
Resent-date: Mon, 23 Jul 2001 16:09:40 -0700
Resent-from: root@xxxxxxxxxxx
Resent-message-id: <200107232309.f6NN9eq22618@xxxxxxxxxxx>
Resent-to: netdev-outgoing@xxxxxxxxxxx
Sender: owner-netdev@xxxxxxxxxxx
        Hello Alexey,

On Mon, 23 Jul 2001, Alexey Kuznetsov wrote:

> >                                              I can't find a place
> > where the proto static routes are used.
>
> proto is not used by kernel (except for marking routes created
> by itself with proto "kernel"), it is used by routing daemons,
> namely, gated.

        Yes, while writing the patch I found it in gated. I looked in
zebra too, and I see that these daemons do not change the static routes.

> > So, I implemented a way to make the proto static routes permanent.
>
> Not so bad idea. Only pretty useless one, gated and brothers do this nicely.

        I have setups with many rules and routes in different tables,
so I would have to check whether all known daemons support multiple
tables... But with such a patch the kernel becomes a nice healthchecking
daemon :))

> Implementation is wrong, but you will get this effect using code under
> #ifdef CONFIG_IP_ROUTE_MULTIPATH for normal routes.

        This patch works nicely, but maybe I'm missing something ...
When I delete devices the routes are flushed successfully (the dead ones
in particular). So far I haven't found any problems (a month or so).

> > kernel(s). It is for 2.2 and can be ported to 2.4 too. How these RTPROT
> > codes are really used in the routing daemons and do they use static
> > routes too?
>
> Look into gated manual, it explains difference of routes with
> RTF_STATIC (BSD term).

        Yes, I have read some docs on this issue.

> >     What I see as problem even in the plain 2.2.19 kernel is that
> > when one device for one of the nexthops (when the prefsrc is not from
> > this device) is removed and
>
>
> Sorry, you did something wrong here. On unregister you must destroy all the
> references to this device. Being unregistered, the device disappears
> forever and cannot return.

        In this patch the dead gateways are flushed at exactly the same
time as all other (non-static) dead routes, i.e. when the device is
removed (see fib_num_down_nh_devs; sorry for the bad function name, it is
incorrect, and maybe the comments are too). This ugly function tries to
distinguish whether a prefsrc was deleted or the device was marked down.
We try to preserve the dead routes only when the device is marked down,
and not only for multipath routes.
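
        A minimal sketch of that decision, as a user-space model rather
than the patch itself. The event names, struct route_model and route_down()
are hypothetical and exist only for illustration; the assumption that only
proto static routes are preserved comes from the earlier description of the
patch.

```c
#include <stdbool.h>

/* Hypothetical event kinds; illustrative names, not kernel symbols. */
enum fib_event {
    EV_PREFSRC_DELETED,   /* the preferred source address was removed */
    EV_DEV_DOWN,          /* the device was marked down but still exists */
    EV_DEV_UNREGISTERED   /* the device was removed for good */
};

/* Simplified stand-in for a route entry. */
struct route_model {
    bool proto_static;    /* route was installed with proto static */
    bool dead;            /* route is currently marked dead */
};

/*
 * Mark the route dead and report what to do with it.
 * Returns true to keep the dead route, false to flush it.
 * Assumption: only static routes survive, and only a "device down" event.
 */
static bool route_down(struct route_model *rt, enum fib_event ev)
{
    rt->dead = true;
    return ev == EV_DEV_DOWN && rt->proto_static;
}
```

With this model a static route survives a link going down, while a removed
device or a deleted prefsrc still flushes it.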

        The real problem is that multipath routes can contain dead
paths. This does not come from my patch; I see it as a 2.2 (maybe 2.4 too)
problem with non-permanent devices. When the device is removed I can see
that the route contains a dead nexthop that looks like:

default
        nexthop via A.A.A.A  dev ifXXX weight 1 dead
        nexthop via B.B.B.B  dev eth0 weight 1

These "ifXXX" device names, printed by iproute's ll_idx_n2a function,
mean that the device does not exist, and this is true. Later (after adding
the new device) tcpdump shows these ifXXX names :) This is funny and
I have to check the reason. Maybe the old dev index is inherited from
the route. In any case, recreating the multipath route is currently
mandatory after a device used by this route is removed. I do this in my
setup: the routes are recreated after the devices are recreated.
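
        How such a name can arise is sketched below. This is not
iproute's code: idx_to_name() is a hypothetical helper that uses the POSIX
if_indextoname() call as a stand-in for the lookup and falls back to a
synthetic "ifN" label when no device has the given index.

```c
#include <net/if.h>
#include <stdio.h>

/*
 * Resolve an interface index to a name. When the index is unknown,
 * fall back to a synthetic "ifN" label, the kind of name that shows
 * up for a nexthop whose device no longer exists.
 */
static const char *idx_to_name(unsigned int idx, char buf[IF_NAMESIZE])
{
    if (if_indextoname(idx, buf) == NULL)
        snprintf(buf, IF_NAMESIZE, "if%u", idx);
    return buf;
}
```

Index 0 is never a valid interface index, so idx_to_name(0, buf) always
takes the fallback path.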

> > added again it can receive another dev index
>
> Full non-sense. "This" device cannot get another index, index
> is the only thing distinguishing devices.

        Yes, the device is deleted but the nexthops remain, because
the multipath route is autodeleted only when all nexthops are dead. I don't
claim that when the device is created again the "ifXXX" is replaced with
the previous name. It remains "ifXXX". This is the reason I'm talking
about nh_ifname or a similar solution. Then the new device could replace
the old one, by name, and the route would not need to be recreated. But
this may open another discussion.
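
        The nh_ifname idea could look roughly like the following
user-space model. Everything here is an assumption made for illustration:
struct named_dev, struct named_nh and nh_rebind_by_name() do not exist in
the kernel, and a real implementation would also have to handle locking and
reference counting.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_NAMSIZ 16   /* illustrative name buffer size */

/* Simplified stand-in for a network device. */
struct named_dev {
    char name[MODEL_NAMSIZ];
    int  ifindex;
};

/* Simplified stand-in for a nexthop that remembers its device by name. */
struct named_nh {
    char ifname[MODEL_NAMSIZ];  /* name saved when the nexthop was created */
    int  oif;                   /* ifindex of the currently bound device */
    bool dead;
};

/*
 * When a device appears, revive a dead nexthop that was bound to a
 * device of the same name and point it at the new ifindex.
 * Returns true if the nexthop was rebound.
 */
static bool nh_rebind_by_name(struct named_nh *nh, const struct named_dev *dev)
{
    if (!nh->dead || strcmp(nh->ifname, dev->name) != 0)
        return false;
    nh->oif  = dev->ifindex;
    nh->dead = false;
    return true;
}
```

The route then stays in place across a device being removed and recreated,
at the cost of keeping a name in every nexthop.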

> > The patch contains a fix in fib_sync_up() about similar problem, i.e.
> > not to touch nh_dev for DEAD routes.
>
> Do not leave undefined crap there, that's answer. :-)

        You'll correct me if I'm wrong :) For such devs, nh_dev points to
crap (once the device is removed). This is a multipath route.
If the stale nh_dev->flags&IFF_UP happens to be true, we reach the
nh->nh_dev != dev check. That check can give the wrong answer only when a
new device is allocated in the same memory. It should not happen, but
0.0001% is possible :) This is the reason I'm comparing the if indexes,
though you can argue that the ifindex can wrap :)
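
        A minimal model of the ifindex comparison, assuming the nexthop
keeps the index it was bound to. The struct and function names are
hypothetical and much simpler than the kernel's; the point is only that a
dead nexthop is matched without touching its possibly stale device pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a network device. */
struct dev_model {
    int ifindex;
};

/* Simplified stand-in for a nexthop. */
struct nh_model {
    struct dev_model *dev;  /* may be stale once the device is freed */
    int  oif;               /* ifindex saved when the nexthop was bound */
    bool dead;
};

/*
 * Decide whether a nexthop belongs to the device that just came up.
 * For a dead nexthop the saved ifindex is compared, so nh->dev is
 * never dereferenced or trusted; a live nexthop uses the pointer.
 */
static bool nh_matches_dev(const struct nh_model *nh,
                           const struct dev_model *dev)
{
    if (nh->dead)
        return nh->oif == dev->ifindex;
    return nh->dev == dev;
}
```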

> Alexey


Regards

--
Julian Anastasov <ja@xxxxxx>

