
Revision 1.1, Wed Nov 29 22:01:12 2000 UTC by vasa
Branch: MAIN
CVS Tags: HEAD

New documentation files for the Programmers' Guide.

<!-- Fragment document type declaration subset:
ArborText, Inc., 1988-1997, v.4001
<!DOCTYPE SET PUBLIC "-//Davenport//DTD DocBook V3.0//EN" [
<!ENTITY scriptlib.sgml SYSTEM "scriptlib.sgml">
<!ENTITY scriptlibapp.sgml SYSTEM "scriptlibapp.sgml">
<!ENTITY startgui.sgml SYSTEM "startgui.sgml">
<!ENTITY preface.sgml SYSTEM "preface.sgml">
<!ENTITY overview.sgml SYSTEM "overview.sgml">
<!ENTITY action.sgml SYSTEM "action.sgml">
<!ENTITY database.sgml SYSTEM "database.sgml">
<!ENTITY install.sgml SYSTEM "install.sgml">
<!ENTITY gloss.sgml SYSTEM "gloss.sgml">
<!ENTITY index.sgml SYSTEM "index.sgml">
<!ENTITY monitor SYSTEM "figures/monitor.eps" NDATA eps>
<!ENTITY resource.ai SYSTEM "figures/resource.ai.eps" NDATA eps>
<!ENTITY optional.ai SYSTEM "figures/optional.ai.eps" NDATA eps>
<!ENTITY manager.ai SYSTEM "figures/manager.ai.eps" NDATA eps>
<!ENTITY depend.ai SYSTEM "figures/depend.ai.eps" NDATA eps>
<!ENTITY type.ai SYSTEM "figures/type.ai.eps" NDATA eps>
<!ENTITY attrib.ai SYSTEM "figures/attrib.ai.eps" NDATA eps>
<!ENTITY action.ai SYSTEM "figures/action.ai.eps" NDATA eps>
<!ENTITY star.configuration SYSTEM "figures/star.configuration.eps" NDATA eps>
<!ENTITY n.plus.2.configuration SYSTEM "figures/n.plus.2.configuration.eps" NDATA eps>
<!ENTITY square.configuration SYSTEM "figures/square.configuration.eps" NDATA eps>
]>
-->
<chapter id="LE16529-PARENT">
<title id="LE16529-TITLE">Creating a Failover Policy</title>
<para>This chapter tells you how to create a failover policy. It describes
the following topics:<itemizedlist>
<listitem><para><xref linkend="fpcontent"></para>
</listitem>
<listitem><para><xref linkend="fpinterface"></para>
</listitem>
<listitem><para><xref linkend="fpexample"></para>
</listitem>
</itemizedlist></para>
<sect1 id="fpcontent">
<title>Contents of a Failover Policy</title>
<para>A <firstterm>failover policy</firstterm> is the method by which
a resource group is failed over from one node to another. A failover policy
consists of the following:<indexterm id="ITfailover-0"><primary>failover
policy</primary><secondary>contents</secondary></indexterm></para>
<itemizedlist>
<listitem><para>Failover domain</para>
</listitem>
<listitem><para>Failover attributes</para>
</listitem>
<listitem><para>Failover scripts</para>
</listitem>
</itemizedlist>
<para>Linux FailSafe uses the failover domain output from a failover script
along with failover attributes to determine on which node a resource group
should reside. </para>
<para>The administrator must configure a failover policy for each resource
group. The name of the failover policy must be unique within the pool.
</para>
<sect2>
<title>Failover Domain</title>
<para>A <firstterm>failover domain</firstterm> is the <literal>ordered
</literal> list of nodes on which a given resource group can be allocated.
The nodes listed in the failover domain <literal>must</literal> be within
the same cluster; however, the failover domain does not have to include
every node in the cluster. The failover domain can also be used to statically
load balance the resource groups in a cluster.</para>
<para>Examples:<itemizedlist>
<listitem><para>In a four-node cluster, a set of two nodes that
have access to a particular XLV volume may be the failover domain of the
resource group containing that XLV volume.</para>
</listitem>
<listitem><para>In a cluster of nodes named venus, mercury, and pluto,
you could configure the following initial failover domains for resource
groups RG1 and RG2:<indexterm id="ITfailover-1"><primary>failover policy
</primary><secondary>failover domain</secondary></indexterm> <indexterm
id="ITfailover-2"><primary>domain</primary></indexterm> <indexterm id="ITfailover-3">
<primary>failover domain</primary></indexterm><itemizedlist>
<listitem><para>mercury, venus, pluto for RG1</para>
</listitem>
<listitem><para>pluto, mercury for RG2</para>
</listitem>
</itemizedlist></para>
</listitem>
</itemizedlist></para>
<para>The administrator defines the initial failover domain<indexterm
id="ITfailover-4"><primary>initial failover domain</primary></indexterm>
when configuring a failover policy. The initial failover domain is used
when a cluster is first booted. The ordered list specified by the initial
failover domain is transformed into a run-time failover domain<indexterm
id="ITfailover-5"><primary>run-time failover domain</primary></indexterm>
by the failover script. With each failure, the failover script takes the
current run-time failover domain and potentially modifies it; the initial
failover domain is never used again. Depending on the run-time conditions
 and contents of the failover script, the initial and run-time failover
domains may be identical.</para>
<para>For example, suppose the initial failover domain is: <literallayout>
N1 N2 N3</literallayout></para>
<para>The run-time failover domain will vary based on the failover script: <itemizedlist>
<listitem><para>If <literal>ordered</literal>:  <literallayout>N1 N2 N3
</literallayout></para>
</listitem>
<listitem><para>If <literal>round-robin</literal>:  <literallayout>N2 N3 N1
</literallayout></para>
</listitem>
<listitem><para>If a customized failover script, the order could be any
permutation, based on the contents of the script: <literallayout>N1 N2 N3
N1 N3 N2
N2 N3 N1
N2 N1 N3
N3 N2 N1
N3 N1 N2</literallayout></para>
</listitem>
</itemizedlist></para>
<para>Linux FailSafe stores the run-time failover domain and uses it as
input to the next failover script invocation.</para>
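<para>The rotation described above can be sketched in shell, the language
used by the shipped policy scripts. The following is an illustrative
sketch only, not a shipped script: it shows how a round-robin style
policy could rotate a stored run-time domain on each invocation, so that
each failover hands the resource group to the next node in the list. The
node names are assumptions for the demonstration.</para>

```shell
#!/bin/sh
# Sketch only (not a shipped policy): rotate a stored run-time failover
# domain so the head node moves to the end on each invocation.
rotate() {
    set -- $1           # split the domain string into positional words
    first=$1
    shift
    echo "$* $first"    # emit the remaining nodes, then the old head
}

domain="N1 N2 N3"       # initial failover domain
domain=`rotate "$domain"`
echo "$domain"          # run-time domain after the first failover
domain=`rotate "$domain"`
echo "$domain"          # run-time domain after the next failover
```

Each invocation consumes the domain stored by the previous one, which is
exactly the contract described above for the stored run-time domain.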
</sect2>
<sect2>
<title>Failover Attributes</title>
<para>A <firstterm>failover attribute</firstterm> is a value that is passed
to the failover script and used by Linux FailSafe to modify
the run-time failover domain for a specific resource group. There are
required and optional failover attributes, and you can also specify your
own strings as attributes.<indexterm id="ITfailover-6"><primary>failover
policy</primary><secondary>failover attributes</secondary></indexterm> <indexterm
id="ITfailover-7"><primary>attributes</primary></indexterm> <indexterm
id="ITfailover-8"><primary>failover attributes</primary></indexterm></para>
<para><xref linkend="LE45720-PARENT"> shows the required failover attributes.
You must specify one and only one of these attributes. Note that the
starting conditions for the attributes differ: for the required attributes,
the starting condition is that a node joins the cluster membership when
the cluster is already providing HA services; for the optional attributes,
the starting condition is that HA services are started and the resource
group is running on only one node in the cluster.</para>
<table frame="topbot" pgwide="1" id="LE45720-PARENT">
<title id="LE45720-TITLE">Required Failover Attributes (mutually exclusive)
</title>
<tgroup cols="2" colsep="0" rowsep="1">
<colspec colwidth="135*">
<colspec colwidth="261*">
<thead>
<row><entry align="left" valign="bottom"><para>Name</para></entry><entry
align="left" valign="bottom"><para>Description</para></entry></row></thead>
<tbody>
<row rowsep="0">
<entry align="left" valign="top"><para><indexterm id="ITfailover-9"><primary>
Auto_Failback failover attribute</primary></indexterm><literal>Auto_Failback
</literal></para></entry>
<entry align="left" valign="top"><para>Specifies that the resource group
is made online based on the failover policy when a node joins the cluster.
This attribute is best used when some type of load balancing is required.
You must specify either this attribute or the <literal>Controlled_Failback
</literal> attribute.</para></entry></row>
<row>
<entry align="left" valign="top"><para><indexterm id="ITfailover-10">
<primary>Controlled_Failback failover attribute</primary></indexterm><literal>
Controlled_Failback</literal></para></entry>
<entry align="left" valign="top"><para>Specifies that the resource group
remains on the same node when a node joins the cluster. This attribute
is best used when client/server applications have expensive recovery mechanisms,
such as databases or any application that uses <literal>tcp</literal>
to communicate. You must specify either this attribute or the <literal>
Auto_Failback</literal> attribute.</para></entry></row></tbody></tgroup>
</table>
<para>When defining a failover policy, you can optionally also choose
one and only one of the recovery attributes shown in <xref linkend="LE27762-PARENT">.
The recovery attribute determines the node on which a resource group will
be allocated when its state changes to online and a member of the group
is already allocated (such as when volumes are present).</para>
<table frame="topbot" pgwide="1" id="LE27762-PARENT">
<title id="LE27762-TITLE">Optional Failover Attributes (mutually exclusive)
</title>
<tgroup cols="2" colsep="0" rowsep="1">
<colspec colwidth="136*">
<colspec colwidth="260*">
<thead>
<row><entry align="left" valign="bottom"><para>Name</para></entry><entry
align="left" valign="bottom"><para>Description</para></entry></row></thead>
<tbody>
<row rowsep="0">
<entry align="left" valign="top"><para><indexterm id="ITfailover-11">
<primary>Auto_Recovery failover attribute</primary></indexterm><literal>
Auto_Recovery</literal></para></entry>
<entry align="left" valign="top"><para>Specifies that the resource group
is made online based on the failover policy even when an exclusivity check
shows that the resource group is running on a node. This attribute is
optional and is mutually exclusive with the <literal>InPlace_Recovery
</literal> attribute. If you specify neither of these attributes, Linux
FailSafe will use this attribute by default if you have specified the <literal>
Auto_Failback</literal> attribute.</para></entry></row>
<row>
<entry align="left" valign="top"><para><indexterm id="ITfailover-12">
<primary>InPlace_Recovery failover attribute</primary></indexterm><literal>
InPlace_Recovery</literal></para></entry>
<entry align="left" valign="top"><para>Specifies that the resource group
is made online on the same node where it is already running.
This attribute is the default and is mutually exclusive with the <literal>
Auto_Recovery</literal> attribute. If you specify neither of these attributes,
Linux FailSafe will use this attribute by default if you have specified
the <literal>Controlled_Failback</literal> attribute.</para></entry></row>
</tbody></tgroup></table>
</sect2>
<sect2>
<title>Failover Scripts</title>
<para><indexterm id="ITfailover-13"><primary>failover policy</primary>
<secondary>failover script</secondary></indexterm> <indexterm id="ITfailover-14">
<primary>failover script</primary><secondary>description</secondary></indexterm>A
failover script generates the run-time failover domain and returns it
to the Linux FailSafe process. The Linux FailSafe process applies the
failover attributes and then selects the first node in the returned failover
domain that is also in the current node membership.</para>
<note>
<para>The failover script must complete within a system-defined time
limit; therefore, any external calls it makes must be guaranteed to
return quickly. If the failover script takes too long to return, Linux
FailSafe kills the script process and uses the previous run-time failover
domain.</para>
</note>
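<para>Because of this time limit, any external command a policy script
runs needs a guard. The following is a minimal sketch, not part of
FailSafe, of one portable way to bound an external call from within a
policy script; the 5-second limit and the guarded command are
illustrative assumptions.</para>

```shell
#!/bin/sh
# Sketch only: run a command with a deadline so a policy script cannot
# block past its time limit. Uses only portable sh constructs.
run_with_deadline() {
    limit=$1; shift
    "$@" &                                      # external call in background
    cmd=$!
    ( sleep "$limit"; kill $cmd 2>/dev/null ) & # watchdog to enforce limit
    dog=$!
    wait $cmd                                   # command's exit status, or a
    status=$?                                   # signal status if killed
    kill $dog 2>/dev/null                       # cancel the watchdog
    return $status
}

run_with_deadline 5 sleep 1   # finishes well under the limit
echo "status=$status"
```

A script using this pattern can fall back to the previous run-time
domain itself when the guarded call is killed, instead of being killed
by Linux FailSafe.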
<para>Failover scripts are stored in the  <filename>/usr/lib/failsafe/policies
</filename>  directory.<indexterm id="ITfailover-15"><primary><literal>
/usr/lib/failsafe/policies</literal> directory</primary></indexterm></para>
<sect3>
<title>The <filename>ordered</filename> Failover Script</title>
<para>The<indexterm id="ITfailover-16"><primary><literal>ordered</literal>
 failover script</primary></indexterm> <literal>ordered</literal> failover
script is provided with the release. The <literal>ordered</literal> script
never changes the initial domain; when using this script, the initial
and run-time domains are equivalent. The script reads six lines from the
input file; if an error occurs, it logs the input parameters and the
error to the script log.</para>
<para>The following example shows the contents of the <filename>ordered
</filename> failover script. (Line breaks added for readability.)</para>
<programlisting>#!/bin/sh
#
# Copyright (c) 2000 Silicon Graphics, Inc.  All Rights Reserved.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#
# Contact information: Silicon Graphics, Inc., 1600 Amphitheatre Pkwy,
# Mountain View, CA 94043, or:
#
# http://www.sgi.com
#
# For further information regarding this notice, see:
#
# http://oss.sgi.com/projects/GenInfo/NoticeExplan

#
# $1 - input file
# $2 - output file
#
# line 1 input file - version
# line 2 input file - name
# line 3 input file - owner field
# line 4 input file - attributes
# line 5 input file - list of possible owners
# line 6 input file - application failover domain


DIR=/usr/lib/failsafe/bin
LOG="${DIR}/ha_cilog -g ha_script -s script"
FILE=/usr/lib/failsafe/policies/ordered

input=$1
output=$2

{
  read version
  read name
  read owner
  read attr
  read mem1 mem2 mem3 mem4 mem5 mem6 mem7 mem8
  read afd1 afd2 afd3 afd4 afd5 afd6 afd7 afd8
} &lt; ${input}


${LOG} -l 1 "${FILE}:" `/bin/cat ${input}`

if [ "${version}" -ne 1 ] ; then
    ${LOG} -l 1 "ERROR: ${FILE}: Different version no. Should be (1) rather than (${version})" ;
    exit 1;
elif [ -z "${name}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: Failover script not defined";
    exit 1;
elif [ -z "${attr}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: Attributes not defined";
    exit 1;
elif [ -z "${mem1}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: No node membership defined";
    exit 1;
elif [ -z "${afd1}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: No failover domain defined";
    exit 1;
fi

found=0
for i in $afd1 $afd2 $afd3 $afd4 $afd5 $afd6 $afd7 $afd8; do
    for j in $mem1 $mem2 $mem3 $mem4 $mem5 $mem6 $mem7 $mem8; do
        if [ "X${j}" = "X${i}" ]; then
            found=1;
            break;
        fi
    done
done


if [ ${found} -eq 0 ]; then
    mem="("$mem1")"" ""("$mem2")"" ""("$mem3")"" ""("$mem4")"" \
    ""("$mem5")"" ""("$mem6")"" ""("$mem7")"" ""("$mem8")";
    afd="("$afd1")"" ""("$afd2")"" ""("$afd3")"" ""("$afd4")"" \
    ""("$afd5")"" ""("$afd6")"" ""("$afd7")"" ""("$afd8")";
    ${LOG} -l 1 "ERROR: ${FILE}: Policy script failed"
    ${LOG} -l 1 "ERROR: ${FILE}: " `/bin/cat ${input}`
    ${LOG} -l 1 "ERROR: ${FILE}: Nodes defined in membership do not match \
    the ones in failure domain"
    ${LOG} -l 1 "ERROR: ${FILE}: Parameters read from input file: \
    version = $version, name = $name, owner = $owner,  attribute = $attr, \
    nodes = $mem, afd = $afd"
    exit 1;
fi


if [ ${found} -eq 1 ]; then
    rm -f ${output}
    echo $afd1 $afd2 $afd3 $afd4 $afd5 $afd6 $afd7 $afd8 > ${output}
    exit 0
fi
exit 1</programlisting>
</sect3>
<sect3>
<title>The <filename>round-robin</filename> Failover Script</title>
<para>The <filename>round-robin</filename> script selects the resource
group owner in a round-robin (circular) fashion. This policy can be used
for resource groups that can be run in any node in the cluster.</para>
<para>The following example shows the contents of the <filename>round-robin</filename>
failover script.</para>
<programlisting>#!/bin/sh
#
# Copyright (c) 2000 Silicon Graphics, Inc.  All Rights Reserved.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#
# Contact information: Silicon Graphics, Inc., 1600 Amphitheatre Pkwy,
# Mountain View, CA 94043, or:
#
# http://www.sgi.com
#
# For further information regarding this notice, see:
#
# http://oss.sgi.com/projects/GenInfo/NoticeExplan


#
# $1 - input file
# $2 - output file
#
# line 1 input file - version
# line 2 input file - name
# line 3 input file - owner field
# line 4 input file - attributes
# line 5 input file - Possible list of owners
# line 6 input file - application failover domain

DIR=/usr/lib/failsafe/bin
LOG="${DIR}/ha_cilog -g ha_script -s script"
FILE=/usr/lib/failsafe/policies/round-robin

# Read input file
input=$1
output=$2

{
  read version
  read name
  read owner
  read attr
  read mem1 mem2 mem3 mem4 mem5 mem6 mem7 mem8
  read afd1 afd2 afd3 afd4 afd5 afd6 afd7 afd8
} &lt; ${input}

# Validate input file
${LOG} -l 1 "${FILE}:" `/bin/cat ${input}`

if [ "${version}" -ne 1 ] ; then
    ${LOG} -l 1 "ERROR: ${FILE}: Different version no. Should be (1) \
    rather than (${version})" ;
    exit 1;
elif [ -z "${name}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: Failover script not defined";
    exit 1;
elif [ -z "${attr}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: Attributes not defined";
    exit 1;
elif [ -z "${mem1}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: No node membership defined";
    exit 1;
elif [ -z "${afd1}" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: No failover domain defined";
    exit 1;
fi



# Return 0 if $1 is in the membership and return 1 otherwise.
check_in_mem()
{
    for j in $mem1 $mem2 $mem3 $mem4 $mem5 $mem6 $mem7 $mem8; do
        if [ "X${j}" = "X$1" ]; then
            return 0;
        fi
    done
    return 1;
}

# Check if owner has to be changed. There is no need to change owner if
# owner node is in the possible list of owners.
check_in_mem ${owner}
if [ $? -eq 0 ]; then
    nextowner=${owner};
fi

# Search for the next owner
if [ "X${nextowner}" = "X" ]; then
    next=0;
    for i in $afd1 $afd2 $afd3 $afd4 $afd5 $afd6 $afd7 $afd8; do
        if [ "X${i}" = "X${owner}" ]; then
            next=1;
            continue;
        fi

        if [ "X${owner}" = "XNO ONE" ]; then
            next=1;
        fi

        if [ ${next} -eq 1 ]; then
            # Check if ${i} is in membership
            check_in_mem ${i};
            if [ $? -eq 0 ]; then
                # found next owner
                nextowner=${i};
                next=0;
                break;
            fi
        fi
    done
fi

if [ "X${nextowner}" = "X" ]; then
    # wrap round the afd list.
    for i in $afd1 $afd2 $afd3 $afd4 $afd5 $afd6 $afd7 $afd8; do
        if [ "X${i}" = "X${owner}" ]; then
            # Search for next owner complete
            break;
        fi

        # Previous loop should have found new owner
        if [ "X${owner}" = "XNO ONE" ]; then
            break;
        fi

        if [ ${next} -eq 1 ]; then
            check_in_mem ${i};
            if [ $? -eq 0 ]; then
                # found next owner
                nextowner=${i};
                next=0;
                break;
            fi
        fi
    done
fi

if [ "X${nextowner}" = "X" ]; then
    ${LOG} -l 1 "ERROR: ${FILE}: Policy script failed"
    ${LOG} -l 1 "ERROR: ${FILE}: " `/bin/cat ${input}`
    ${LOG} -l 1 "ERROR: ${FILE}: Could not find new owner"
    exit 1;
fi



# nextowner is the new owner
print=0;
rm -f ${output};

# Print the new afd to the output file
echo -n "${nextowner} " > ${output};
for i in $afd1 $afd2 $afd3 $afd4 $afd5 $afd6 $afd7 $afd8;
do
    if [ "X${nextowner}" = "X${i}" ]; then
        print=1;
    elif [ ${print} -eq 1 ]; then
        echo -n "${i} " >> ${output}
    fi
done

print=1;
for i in $afd1 $afd2 $afd3 $afd4 $afd5 $afd6 $afd7 $afd8; do
    if [ "X${nextowner}" = "X${i}" ]; then
        print=0;
    elif [ ${print} -eq 1 ]; then
        echo -n "${i} " >> ${output}
    fi
done

echo >> ${output};
exit 0;</programlisting>
</sect3>
<sect3>
<title>Creating a New Failover Script</title>
<para>If the <literal>ordered</literal> or <filename>round-robin</filename>
scripts do not meet your needs, you can create a new failover script and
place it in the <filename>/usr/lib/failsafe/policies</filename> directory.
 You can then configure the cluster configuration database to use your
new failover script for the required resource groups.</para>
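<para>As a starting point, the following is an illustrative skeleton
only, not a shipped policy: it follows the documented six-line input
interface and writes a run-time domain (here, simply the reverse of the
input domain) to the output file. The fixed file paths and node names
are assumptions for the demonstration; a real policy script receives the
input and output file names as its first and second arguments.</para>

```shell
#!/bin/sh
# Illustrative custom policy skeleton: emit the failover domain in
# reverse order. A real policy script receives the input and output
# file names as $1 and $2; fixed paths are used here for the demo.
input=/tmp/reverse_input
output=/tmp/reverse_output

cat > ${input} <<'EOF'
1
reverse
N1
Controlled_Failback
N1 N2 N3
N1 N2 N3
EOF

{
  read version
  read name
  read owner
  read attr
  read mem1 mem2 mem3 mem4 mem5 mem6 mem7 mem8
  read afd1 afd2 afd3 afd4 afd5 afd6 afd7 afd8
} < ${input}

# Validate the fields a real policy must check before producing output.
[ "${version}" -eq 1 ] || exit 1
[ -n "${afd1}" ] || exit 1

# Build the run-time domain: here, the input domain reversed.
reversed=
for i in $afd1 $afd2 $afd3 $afd4 $afd5 $afd6 $afd7 $afd8; do
    reversed="${i} ${reversed}"
done

rm -f ${output}
echo ${reversed} > ${output}
cat ${output}
```

A real script would also log errors with <literal>ha_cilog</literal> as
the shipped scripts do, and must return quickly, as noted earlier in
this chapter.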
</sect3>
</sect2>
</sect1>
<sect1 id="fpinterface">
<title>Failover Script Interface</title>
<para>The following is passed to the failover script:<indexterm id="ITfailover-17">
<primary>failover policy</primary><secondary>failover script interface
</secondary></indexterm> <indexterm id="ITfailover-18"><primary>failover
script</primary><secondary>interface</secondary></indexterm></para>
<programlisting>function(<replaceable>version</replaceable>, <replaceable>
name</replaceable>, <replaceable>owner</replaceable>, <replaceable>attributes
</replaceable>, <replaceable>possibleowners</replaceable>, <replaceable>
domain</replaceable>)</programlisting>
<variablelist>
<varlistentry><term><replaceable>version</replaceable></term>
<listitem>
<para>Linux FailSafe version. The Linux FailSafe release uses version
number 1. </para>
</listitem>
</varlistentry>
<varlistentry><term><replaceable>name</replaceable></term>
<listitem>
<para>Name of the failover script (used for error validations and logging
purposes).</para>
</listitem>
</varlistentry>
<varlistentry><term><replaceable>owner</replaceable></term>
<listitem>
<para>Logical name of the node that has the
resource group allocated.</para>
</listitem>
</varlistentry>
<varlistentry><term><replaceable>attributes</replaceable></term>
<listitem>
<para>Failover attributes (<literal>Auto_Failback</literal> or <literal>
Controlled_Failback</literal> must be included).</para>
</listitem>
</varlistentry>
<varlistentry><term><replaceable>possibleowners</replaceable></term>
<listitem>
<para>List of possible owners for the resource group. This list can be
a subset of the current node membership.</para>
</listitem>
</varlistentry>
<varlistentry><term><replaceable>domain</replaceable></term>
<listitem>
<para>Ordered list of nodes used at the last failover. (At the first failover,
the initial failover domain is used.)</para>
</listitem>
</varlistentry>
</variablelist>
<para>The failover script returns the newly generated run-time failover
domain to Linux FailSafe, which then chooses the node on which the resource
group should be allocated by applying the failover attributes and node
membership to the run-time failover domain.</para>
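<para>To see this interface concretely, the following sketch builds the
six-line input file by hand and parses it the same way the shipped
policy scripts do. The file name and node names are illustrative
assumptions.</para>

```shell
#!/bin/sh
# Sketch: the six-line input a policy script receives on $1, and how
# the shipped scripts parse it. Names here are demo assumptions.
input=/tmp/fp_input
cat > ${input} <<'EOF'
1
ordered
N1
Auto_Failback
N1 N2 N3
N1 N2 N3
EOF

{
  read version    # line 1: interface version (1 in this release)
  read name       # line 2: failover script name
  read owner      # line 3: node that currently owns the resource group
  read attr       # line 4: failover attributes
  read mem1 mem2 mem3 mem4 mem5 mem6 mem7 mem8   # line 5: possible owners
  read afd1 afd2 afd3 afd4 afd5 afd6 afd7 afd8   # line 6: failover domain
} < ${input}

echo "owner=${owner} domain=${afd1} ${afd2} ${afd3}"
rm -f ${input}
```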
</sect1>
<sect1 id="fpexample">
<title>Example Failover Policies for Linux FailSafe</title>
<para>There are two general types of configuration, each of which can
have from 2 through 8 nodes:</para>
<itemizedlist>
<listitem><para><replaceable>N</replaceable> nodes that can potentially
fail over their applications to any of the other nodes in the cluster.</para>
</listitem>
<listitem><para><replaceable>N</replaceable> primary nodes that can fail over
to <replaceable>M</replaceable> backup nodes. For example, you could have
3 primary nodes and 1 backup node.</para>
</listitem>
</itemizedlist>
<para>This section shows examples of failover policies for the following
types of configuration, each of which can have from 2 through 8 nodes:
</para>
<itemizedlist>
<listitem><para><replaceable>N</replaceable> primary nodes and one backup
node (<replaceable>N</replaceable>+1)</para>
</listitem>
<listitem><para><replaceable>N</replaceable> primary nodes and two backup
nodes (<replaceable>N</replaceable>+2)</para>
</listitem>
<listitem><para><replaceable>N</replaceable> primary nodes and <replaceable>M</replaceable>
backup nodes (<replaceable>N</replaceable>+<replaceable>M</replaceable>)</para>
<note>
<para>The diagrams in the following sections illustrate the configuration
concepts discussed here, but they do not address all required or supported
elements, such as reset hubs. For configuration details, see the <citetitle>
Linux FailSafe Installation and Maintenance Instructions</citetitle>.
</para>
</note>
</listitem>
</itemizedlist>
<sect2>
<title>N+1 Configuration for Linux FailSafe</title>
<para><xref linkend="LE70211-PARENT"> shows a specific instance of an
<replaceable>N</replaceable>+1 configuration in which there are three primary nodes
and one backup node. (This is also known as a <firstterm>star configuration
</firstterm>.) The disks shown could each be disk farms.<indexterm id="ITfailover-19">
<primary>failover policy</primary><secondary>examples</secondary><tertiary>
N+1</tertiary></indexterm> <indexterm id="ITfailover-20"><primary>configurations
</primary><secondary>N+1</secondary></indexterm></para>
<para><figure id="LE70211-PARENT">
<title id="LE70211-TITLE"><replaceable>N</replaceable>+1 Configuration
Concept</title>
<graphic entityref="star.configuration"></graphic>
</figure></para>
<para>You could configure the following failover policies for load balancing:
</para>
<itemizedlist>
<listitem><para>Failover policy for RG1:</para>
<itemizedlist>
<listitem><para>Initial failover domain = A, D</para>
</listitem>
<listitem><para>Failover attribute = <literal>Auto_Failback</literal></para>
</listitem>
<listitem><para>Failover script = <literal>ordered</literal></para>
</listitem>
</itemizedlist>
</listitem>
<listitem><para>Failover policy for RG2:</para>
<itemizedlist>
<listitem><para>Initial failover domain = B, D</para>
</listitem>
<listitem><para>Failover attribute = <literal>Auto_Failback</literal></para>
</listitem>
<listitem><para>Failover script = <literal>ordered</literal></para>
</listitem>
</itemizedlist>
</listitem>
<listitem><para>Failover policy for RG3:</para>
<itemizedlist>
<listitem><para>Initial failover domain = C, D</para>
</listitem>
<listitem><para>Failover attribute = <literal>Auto_Failback</literal></para>
</listitem>
<listitem><para>Failover script = <literal>ordered</literal></para>
</listitem>
</itemizedlist>
</listitem>
</itemizedlist>
<para>If node A fails, RG1 will fail over to node D. As soon as node A
reboots, RG1 will be moved back to node A.</para>
<para>If you change the failover attribute to <literal>Controlled_Failback
</literal> for RG1 and node A fails, RG1 will fail over to node D and
will remain running on node D even if node A reboots.</para>
</sect2>
<sect2>
<title>N+2 Configuration</title>
<para><xref linkend="LE15098-PARENT"> shows a specific instance of an
<replaceable>N</replaceable>+2 configuration in which there are four
primary nodes and two backup nodes. The disks shown could each be disk
farms. <indexterm id="ITfailover-21"><primary>failover policy</primary>
<secondary>examples</secondary><tertiary>N+2</tertiary></indexterm>  <indexterm
id="ITfailover-22"><primary>configurations</primary><secondary>N+2</secondary>
</indexterm></para>
<para><figure id="LE15098-PARENT">
<title id="LE15098-TITLE"><replaceable>N</replaceable>+2 Configuration
Concept</title>
<graphic entityref="n.plus.2.configuration"></graphic>
</figure></para>
<para>You could configure the following failover policy for resource groups
RG7 and RG8:</para>
<itemizedlist>
<listitem><para>Failover policy for RG7:</para>
<itemizedlist>
<listitem><para>Initial failover domain = A, E, F</para>
</listitem>
<listitem><para>Failover attribute = <literal>Controlled_Failback</literal></para>
</listitem>
<listitem><para>Failover script = <literal>ordered</literal></para>
</listitem>
</itemizedlist>
</listitem>
<listitem><para>Failover policy for RG8:</para>
<itemizedlist>
<listitem><para>Initial failover domain = B, F, E</para>
</listitem>
<listitem><para>Failover attribute = <literal>Auto_Failback</literal></para>
</listitem>
<listitem><para>Failover script = <literal>ordered</literal></para>
</listitem>
</itemizedlist>
</listitem>
</itemizedlist>
<para>If node A fails, RG7 will fail over to node E. If node E also fails,
RG7 will fail over to node F. If A is rebooted, RG7 will remain on node
F.</para>
<para>If node B fails, RG8 will fail over to node F. If B is rebooted,
RG8 will return to node B.</para>
</sect2>
<sect2>
<title>N+M Configuration for Linux FailSafe</title>
<para><xref linkend="LE66765-PARENT"> shows a specific instance of an
<replaceable>N</replaceable>+<replaceable>M</replaceable> configuration in which there are four primary nodes
and each can serve as a backup node. The disk shown could be a disk farm. <indexterm
id="ITfailover-23"><primary>configurations</primary><secondary>N+M</secondary>
</indexterm>  <indexterm id="ITfailover-24"><primary>failover policy</primary>
<secondary>examples</secondary><tertiary>N+M</tertiary></indexterm></para>
<para><figure id="LE66765-PARENT">
<title id="LE66765-TITLE"><replaceable>N</replaceable>+<replaceable>M
</replaceable> Configuration Concept</title>
<graphic entityref="square.configuration"></graphic>
</figure></para>
<para>You could configure the following failover policy for resource groups
RG5 and RG6:</para>
<itemizedlist>
<listitem><para>Failover policy for RG5:</para>
<itemizedlist>
<listitem><para>Initial failover domain = A, B, C, D</para>
</listitem>
<listitem><para>Failover attribute = <literal>Controlled_Failback</literal></para>
</listitem>
<listitem><para>Failover script = <literal>ordered</literal></para>
</listitem>
</itemizedlist>
</listitem>
<listitem><para>Failover policy for RG6:</para>
<itemizedlist>
<listitem><para>Initial failover domain = C, A, D</para>
</listitem>
<listitem><para>Failover attribute = <literal>Controlled_Failback</literal></para>
</listitem>
<listitem><para>Failover script = <literal>ordered</literal></para>
</listitem>
</itemizedlist>
</listitem>
</itemizedlist>
<para>If node C fails, RG6 will fail over to node A. When node C reboots,
RG6 will remain running on node A. If node A then fails, RG6 will return
to node C and RG5 will move to node B. If node B then fails, RG5 moves
to node C.</para>
</sect2>
</sect1>
</chapter>