Re: [LinuxFailSafe] regarding Redundancy in TCP/IP Stack

To: <sndtrn27@xxxxxxxxxxx>
Subject: Re: [LinuxFailSafe] regarding Redundancy in TCP/IP Stack
From: Eric Lee Green <eric@xxxxxxxxxx>
Date: Sun, 3 Jun 2001 22:35:33 -0700 (MST)
Cc: <failsafe@xxxxxxxxxxx>
In-reply-to: <65256A61.0014C7B5.00@xxxxxxxxxxxxxxxxxxx>
Sender: owner-failsafe@xxxxxxxxxxx
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 4 Jun 2001 sndtrn27@xxxxxxxxxxx wrote:
> Hi all,
> We are novices to the Linux TCP/IP stack architecture.
> At present we want to implement redundancy
> at the socket level in the stack. Can you please
> help us with some docs and information in this regard?
>
> we want to know
> 1. The Data structures that are kept by the system for maintaining the
> Connection.
> 2. Kernel related data structures that are part of the TCP / IP stack.
> 3. Any documents or links that can help us with the procedure, i.e.
> how it can be implemented efficiently.
> 4. Pros & cons of implementing such redundancy.
> 5. Other kernel-related information, such as which modules are
> interdependent with this (if any).
> 6. If any work is going on in this regard, what its present
> status is, and whom we should refer to for more detail.

This is an interesting topic. My gut feel is that the current design of
the network stack is not conducive to redundancy, but I have nothing
behind that gut feel other than a general knowledge of how monolithic
operating systems behave and how the Linux network stack has become
entangled with various other pieces of the system such as the disk
buffer cache (part of the optimizations for web serving so that the
Linux guys can now claim to have the world's fastest web server, sigh).
This doesn't bode well for either seamlessly replacing the network stack
with one that incorporates socket migration and redundancy, or patching
the current network stack to incorporate them. In any event, this is
more of an issue for fault-tolerant systems than for high-availability
systems, in my opinion.

Note that FailSafe is a high-availability system, not a fault-tolerant
system. The difference is that high-availability systems tolerate
momentary disruptions of service during the failover process, whereas
fault-tolerant systems are expected to mask the failure entirely. Most
modern-day protocols are fairly tolerant of random disconnects, and most
people are not going to get upset if they have to click on a "Try Again?"
button once or twice a year when hardware fails and the failover takes
place. Similarly, if a web page display gets interrupted due to a server
falling over, most people are fine with hitting the "Refresh" button to
get the whole page again.
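
To make the "Try Again?" point concrete, here is a minimal sketch of a
client that simply retries after a dropped connection. It has nothing to
do with FailSafe itself; call_with_retry, operation, attempts, and delay
are hypothetical names made up for this illustration.

```python
import time


def call_with_retry(operation, attempts=3, delay=1.0, sleep=time.sleep):
    """Run operation(); if the connection drops, wait and try again.

    All names here are hypothetical. 'operation' is any callable that
    talks to the server and raises ConnectionError when it goes away.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError as exc:  # reset, refused, aborted, ...
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # give the failover time to complete
    # Every attempt failed; report the last error to the caller.
    raise last_error
```

A client written this way sees a failover as a short delay rather than
an error, which is all a high-availability system asks of it.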

I'm personally more interested in redundant distributed filesystems, and
especially filesystems that can handle the case of multiple master-slave
relationships. DRBD is a nice proof of concept, but it is much too
low-level. What is needed is something like what CODA does, which
incorporates versioning in order to do faster syncs after a reconnect.
Unfortunately CODA is very, very far away from being a
production-quality system, with some serious design limitations (like
the utter lack of any kind of locking protocol). Of course, file locking
on distributed filesystems is itself a research topic :-}. And without
the low-level socket migration stuff that you're talking about, file
locking is only useful with connectionless protocols such as NFS anyhow.
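
As a rough illustration of why versioning makes a resync faster, the
sketch below compares per-file version counters and picks out only the
files that changed while the slave was disconnected. This is not how
CODA actually implements it; files_to_resync and the plain integer
counters are assumptions made up for this example, and it only covers
the single-master case.

```python
def files_to_resync(master_versions, slave_versions):
    """Return the names of files whose copy on the slave is out of date.

    Both arguments are hypothetical dicts mapping a file name to an
    integer version that the master bumps on every write. A file the
    slave has never seen counts as version 0.
    """
    stale = []
    for name, version in master_versions.items():
        if slave_versions.get(name, 0) < version:
            stale.append(name)
    return sorted(stale)
```

Only the stale files need to be copied after the reconnect, instead of
comparing or re-sending everything at the block level. Handling more
than one master would need something richer than a single counter per
file.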

- -- 
Eric Lee Green                             mailto:eric@xxxxxxxxxx
  GnuPG public key at http://badtux.org/eric/eric.gpg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.5 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7Gx4q3DrrK1kMA04RAvwzAKCNbkBmi0iTQL8yHfuLSAwlGeViAgCfUv1/
dFK31ZOtdY9/2iT07v82sss=
=jI+3
-----END PGP SIGNATURE-----