
DHCP failover failure when peer crashed

To: "dhcp-server@xxxxxxx" <dhcp-server@xxxxxxx>
Subject: DHCP failover failure when peer crashed
From: "Bernhard R. Erdmann" <be@xxxxxxxxxxx>
Date: Tue, 28 May 2002 23:31:23 +0200
Cc: Linux XFS Mailing List <linux-xfs@xxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx

Hi,

I ran into an interesting scenario using dhcp-3.0.1rc9 with DHCP failover.

One of the peers lost its one and only / filesystem (XFS on Linux, shut
down due to filesystem errors; applications just get an I/O error after
the filesystem has been unmounted by the kernel). So the system (kernel,
daemons, network) is still running, but no application can complete disk
I/O.

The other peer could not hand out leases because the dhcpd still running
on the crashed machine interfered and did not give up control.

These are some log lines from the working peer, which could not hand out
leases because 192.168.9.50 (the crashed peer) was still alive on the
network:

May 24 09:08:56 hermes dhcpd: DHCPREQUEST for 192.168.9.173
(192.168.9.50) from 00:01:02:c8:cd:8a via eth0: lease in transition
state expired
May 24 09:08:56 hermes dhcpd: DHCPREQUEST for 192.168.9.173
(192.168.9.50) from 00:01:02:c8:cd:8a via 192.168.9.6: lease in
transition state expired
May 24 09:09:00 hermes dhcpd: DHCPREQUEST for 192.168.9.173
(192.168.9.50) from 00:01:02:c8:cd:8a via eth0: lease in transition
state expired
May 24 09:09:00 hermes dhcpd: DHCPREQUEST for 192.168.9.173
(192.168.9.50) from 00:01:02:c8:cd:8a via 192.168.9.6: lease in
transition state expired
May 24 09:09:08 hermes dhcpd: DHCPREQUEST for 192.168.9.173
(192.168.9.50) from 00:01:02:c8:cd:8a via eth0: lease in transition
state expired
May 24 09:09:08 hermes dhcpd: DHCPREQUEST for 192.168.9.173
(192.168.9.50) from 00:01:02:c8:cd:8a via 192.168.9.6: lease in
transition state expired

My opinion is: dhcpd does not gracefully handle disk I/O errors when
trying to write dhcpd.leases. If an important filesystem is shut down
due to errors, the dhcpd on the crashed host disrupts failover and does
not completely hand control to the surviving peer.
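
To illustrate what "handling it gracefully" could mean, here is a minimal
sketch in Python. It is not dhcpd's actual code (dhcpd is written in C),
and the names LeaseStore, commit and handle_request are hypothetical. The
assumption is simply that a server should acknowledge a lease only after
the record has reached stable storage, and should stop answering once a
write fails, so that the failover partner can take over:

```python
import os


class LeaseStore:
    """Hypothetical append-only lease journal (not dhcpd's real code)."""

    def __init__(self, path):
        self.path = path
        self.healthy = True  # becomes False once a write to disk fails

    def commit(self, record):
        """Append one record and force it to disk; return True on success."""
        if not self.healthy:
            return False
        try:
            fd = os.open(self.path,
                         os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
            try:
                os.write(fd, record.encode() + b"\n")
                os.fsync(fd)  # surfaces I/O errors instead of hiding them
            finally:
                os.close(fd)
        except OSError:
            # Disk I/O failed: remember it and refuse further commits.
            self.healthy = False
            return False
        return True


def handle_request(store, record):
    """Acknowledge a lease only if it was committed; otherwise stay silent."""
    return "DHCPACK" if store.commit(record) else None
```

In a real failover pair, the unhealthy server would presumably also have
to close its failover connection (or exit), so that the partner sees the
communication loss and can start serving on its own. That part is not
shown here.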

