File: [Development] / failsafe / FailSafe-books / LnxFailSafe_AG / nodeconfig.sgml (download)

Revision 1.1, Wed Nov 29 21:58:28 2000 UTC (16 years, 10 months ago) by vasa
Branch: MAIN
CVS Tags: HEAD

New documentation files for the Admin Guide.

<!-- Fragment document type declaration subset:
ArborText, Inc., 1988-1997, v.4001
<!DOCTYPE SET PUBLIC "-//Davenport//DTD DocBook V3.0//EN" [
<!ENTITY ha.cluster.messages SYSTEM "figures/ha.cluster.messages.eps" NDATA eps>
<!ENTITY machine.not.in.ha.cluster SYSTEM "figures/machine.not.in.ha.cluster.eps" NDATA eps>
<!ENTITY ha.cluster.config.info.flow SYSTEM "figures/ha.cluster.config.info.flow.eps" NDATA eps>
<!ENTITY software.layers SYSTEM "figures/software.layers.eps" NDATA eps>
<!ENTITY n1n4 SYSTEM "figures/n1n4.eps" NDATA eps>
<!ENTITY example.sgml SYSTEM "example.sgml">
<!ENTITY appupgrade.sgml SYSTEM "appupgrade.sgml">
<!ENTITY a1-1.failsafe.components SYSTEM "figures/a1-1.failsafe.components.eps" NDATA eps>
<!ENTITY a1-6.disk.storage.takeover SYSTEM "figures/a1-6.disk.storage.takeover.eps" NDATA eps>
<!ENTITY a2-3.non.shared.disk.config SYSTEM "figures/a2-3.non.shared.disk.config.eps" NDATA eps>
<!ENTITY a2-4.shared.disk.config SYSTEM "figures/a2-4.shared.disk.config.eps" NDATA eps>
<!ENTITY a2-5.shred.disk.2active.cnfig SYSTEM "figures/a2-5.shred.disk.2active.cnfig.eps" NDATA eps>
<!ENTITY a2-1.examp.interface.config SYSTEM "figures/a2-1.examp.interface.config.eps" NDATA eps>
<!ENTITY intro.sgml SYSTEM "intro.sgml">
<!ENTITY overview.sgml SYSTEM "overview.sgml">
<!ENTITY planning.sgml SYSTEM "planning.sgml">
<!ENTITY admintools.sgml SYSTEM "admintools.sgml">
<!ENTITY config.sgml SYSTEM "config.sgml">
<!ENTITY operate.sgml SYSTEM "operate.sgml">
<!ENTITY diag.sgml SYSTEM "diag.sgml">
<!ENTITY recover.sgml SYSTEM "recover.sgml">
<!ENTITY clustproc.sgml SYSTEM "clustproc.sgml">
<!ENTITY appfiles.sgml SYSTEM "appfiles.sgml">
<!ENTITY gloss.sgml SYSTEM "gloss.sgml">
<!ENTITY preface.sgml SYSTEM "preface.sgml">
<!ENTITY index.sgml SYSTEM "index.sgml">
]>
-->
<chapter id="LE32854-PARENT">
<title id="LE32854-TITLE">Installing Linux FailSafe Software and Preparing
the System</title>
<para>This chapter describes several system administration procedures that
must be performed on the nodes in a cluster to prepare and configure them
for Linux FailSafe. These procedures assume that you have done the planning
described in <xref linkend="LE88622-PARENT">.</para>
<para>The major sections in this chapter are as follows:</para>
<itemizedlist><?Pub Dtl>
<listitem><para><xref linkend="LE29006-PARENT"></para>
</listitem>
<listitem><para><xref linkend="LE97755-PARENT"></para>
</listitem>
<listitem><para><xref linkend="LE23103-PARENT"></para>
</listitem>
<listitem><para><xref linkend="LE13651-PARENT"></para>
</listitem>
<listitem><para><xref linkend="LE39637-PARENT"></para>
</listitem>
<listitem><para><xref linkend="LE97738-PARENT"></para>
</listitem>
<listitem><para><xref linkend="LE90681-PARENT"></para>
</listitem>
</itemizedlist>
<sect1 id="LE29006-PARENT">
<title id="LE29006-TITLE">Overview of Configuring Nodes for Linux FailSafe
</title>
<para>Performing the system administration procedures required to prepare
nodes for Linux FailSafe involves these steps:</para>
<orderedlist>
<listitem><para>Install required software as described in <xref linkend="LE97755-PARENT">.
</para>
</listitem>
<listitem><para>Configure the system files on each node, as described in <xref
linkend="LE23103-PARENT">.</para>
</listitem>
<listitem><para>Check two important configuration settings on
each node, as described in <xref linkend="LE13651-PARENT">.</para>
</listitem>
<listitem><para>Create the devices and filesystems required by the highly
available applications you plan to run on the cluster. See <xref linkend="LE39637-PARENT">.
</para>
</listitem>
<listitem><para>Configure the network interfaces on the nodes using the procedure
in <xref linkend="LE97738-PARENT">.</para>
</listitem>
<listitem><para>Configure the serial ports used on each node for the serial
connection to the other nodes by following the procedure in <xref linkend="LE90681-PARENT">.
</para>
</listitem>
<listitem><para>When you are ready, configure the nodes so that Linux FailSafe
software starts when they are rebooted.</para>
</listitem>
</orderedlist>
<para>To complete the configuration of nodes for Linux FailSafe, you must
configure the components of the Linux FailSafe system, as described in <xref
linkend="LE94219-PARENT">.</para>
</sect1>
<sect1 id="LE97755-PARENT">
<title id="LE97755-TITLE">Installing Required Software</title>
<note>
<para>The Linux FailSafe base CD requires about 25 MB.</para>
</note>
<para><indexterm id="ITnodeconfig-0"><primary>installing Linux FailSafe software
</primary></indexterm> <indexterm id="ITnodeconfig-1"><primary>Linux FailSafe
</primary><secondary>installation</secondary></indexterm>To install the software,
follow these steps:</para>
<orderedlist>
<listitem><para>Make sure all servers in the cluster are running a supported
release of Linux.</para>
</listitem>
<listitem><para>Depending on the servers and storage in the configuration
and the Linux revision level, install the latest patches that are
required for the platform and applications.</para>
</listitem>
<listitem><para>On each system in the pool, install the version of the multiplexer
driver that is appropriate to the operating system. Use the CD that accompanies
the multiplexer. Reboot the system after installation.</para>
</listitem>
<listitem><para>On each node that is part of the pool, install the following
software, in order:<orderedlist>
<listitem><para><literal>sysadm_base-tcpmux</literal></para>
</listitem>
<listitem><para><literal>sysadm_base-lib</literal></para>
</listitem>
<listitem><para><literal>sysadm_base-server</literal></para>
</listitem>
<listitem><para><literal>cluster_admin</literal></para>
</listitem>
<listitem><para><literal>cluster_services</literal></para>
</listitem>
<listitem><para><literal>failsafe</literal></para>
</listitem>
<listitem><para><literal>sysadm_failsafe-server</literal></para>
</listitem>
</orderedlist></para>
<note>
<title>Note</title>
<para>You must install <literal>sysadm_base-tcpmux</literal>, <?Pub _nolinebreak><literal>
sysadm_base-server</literal><?Pub /_nolinebreak>, and <literal>sysadm_failsafe
</literal> packages on those nodes from which you want to run the FailSafe
GUI. If you do not want to run the GUI on a specific node, you do not need
to install these software packages on that node.</para>
</note>
</listitem>
<listitem><para>If the pool nodes are to be administered by a Web-based version
of the Linux FailSafe Cluster Manager GUI, install the following subsystems,
in order: <orderedlist>
<listitem><para><literal>IBMJava118-JRE</literal></para>
</listitem>
<listitem><para><literal>sysadm_base-client</literal></para>
</listitem>
<listitem><para><literal>sysadm_failsafe-web</literal></para>
<para>If the workstation launches the GUI client from a Web browser that supports
Java&trade;, install <filename>java_plugin</filename> from the Linux FailSafe
CD.<comment>Is there going to be an actual CD?</comment></para>
<para>For a non-Linux workstation, download the Java plug-in from <?Pub _nolinebreak><filename>
http://java.sun.com/products/plugin/1.1/plugin-install.html</filename><?Pub /_nolinebreak>.</para>
<para>If the Java plug-in is not installed when the Linux FailSafe Cluster Manager
GUI is run from a browser, the browser is redirected to this site.</para>
<para>After installing the Java plug-in, you must close all browser windows
and restart the browser.</para>
</listitem>
<listitem><para><literal>sysadm_failsafe-client</literal></para>
</listitem>
</orderedlist></para>
</listitem>
<listitem><para>Install software on the administrative workstation (GUI client).
</para>
<para>If the workstation runs the GUI client from a Linux desktop, install
these subsystems:</para>
<orderedlist>
<listitem><para><literal>IBMJava118-JRE</literal></para>
</listitem>
<listitem><para><literal>sysadm_base-client</literal></para>
</listitem>
</orderedlist>
</listitem>
<listitem><para>On the appropriate servers, install other optional software,
such as storage management or network board software.</para>
</listitem>
<listitem><para>Install patches that are required for the platform and applications.
</para>
</listitem>
</orderedlist>
</sect1>
<sect1 id="LE23103-PARENT">
<title id="LE23103-TITLE">Configuring System Files</title>
<para><indexterm id="ITnodeconfig-2"><primary>system files</primary></indexterm>When
you install the Linux FailSafe software, you must review several system files.
This section describes the required and optional
changes you make to the following files on every node in the pool:</para>
<itemizedlist>
<listitem><para><filename><indexterm id="ITnodeconfig-3"><primary>/etc/services
file</primary></indexterm>/etc/services</filename></para>
</listitem>
<listitem><para><filename>/etc/failsafe/config/cad.options</filename></para>
</listitem>
<listitem><para><filename>/etc/failsafe/config/cdbd.options</filename></para>
</listitem>
<listitem><para><filename>/etc/failsafe/config/cmond.options</filename></para>
</listitem>
</itemizedlist>
<sect2>
<title>Configuring /etc/services for Linux FailSafe</title>
<para>The <filename>/etc/services</filename> file must contain entries for <literal>
sgi-cmsd</literal>, <literal>sgi-crsd</literal>, <literal>sgi-gcd</literal>,
and <literal>sgi-cad</literal> on each node before you start HA services on
the node. The port numbers assigned to these processes must be the same on
all nodes in the cluster. Note that <literal>sgi-cad</literal> requires a
TCP port.</para>
<para>The following shows an example of <filename>/etc/services</filename>
entries for <literal>sgi-cmsd</literal>, <literal>sgi-crsd</literal>, <literal>
sgi-gcd</literal>, and <literal>sgi-cad</literal>:</para>
<programlisting>sgi-cmsd   7000/udp           # SGI Cluster Membership Daemon
sgi-crsd   17001/udp           # Cluster reset services daemon
sgi-gcd    17002/udp           # SGI Group Communication Daemon
sgi-cad    17003/tcp           # Cluster Admin daemon</programlisting>
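<para>As a quick check before starting HA services, a small shell function
along the following lines can report which of the four entries are missing
from a services file. This is a minimal sketch and not part of Linux FailSafe:
the function name <literal>check_services</literal> is hypothetical, and the
function tests only that each service name is present, not that the port
numbers agree across nodes.</para>
<programlisting># check_services [FILE]: report FailSafe entries missing from FILE.
# FILE defaults to /etc/services. Returns 1 if any entry is missing.
check_services() {
    file=${1:-/etc/services}
    missing=0
    for svc in sgi-cmsd sgi-crsd sgi-gcd sgi-cad; do
        # An entry is a line that starts with the service name
        # followed by whitespace.
        if ! grep -q "^${svc}[[:space:]]" "$file"; then
            echo "missing: $svc"
            missing=1
        fi
    done
    return $missing
}</programlisting>
<para>Running <literal>check_services</literal> on each node prints one line
for every absent entry and returns a nonzero status if any entry is missing.</para>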
</sect2>
<sect2>
<title>Configuring /etc/failsafe/config/cad.options for Linux FailSafe</title>
<para>The <filename>/etc/failsafe/config/cad.options</filename> file contains
the list of parameters that the cluster administration daemon (CAD) reads
when the process is started. The CAD provides cluster information to the Linux
FailSafe Cluster Manager GUI.<indexterm id="ITnodeconfig-4"><primary>/etc/failsafe/config/cad.options
file</primary></indexterm> <indexterm id="ITnodeconfig-5"><primary>CAD options
file</primary></indexterm></para>
<para>The following options can be set in the <filename>cad.options</filename>
file:</para>
<variablelist>
<varlistentry><term><literal>--append_log</literal></term>
<listitem>
<para>Append CAD logging information to the CAD log file instead of overwriting
it.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>--log_file </literal><replaceable>filename</replaceable></term>
<listitem>
<para>CAD log file name. Alternatively, this can be specified as <literal>-lf 
</literal><replaceable>filename</replaceable>.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-vvvv</literal></term>
<listitem>
<para>Verbosity level. The number of &ldquo;<literal>v</literal>&rdquo;s indicates
the level of logging. Setting <literal>-v</literal> logs the fewest messages.
Setting <literal>-vvvv</literal> logs the highest number of messages.</para>
</listitem>
</varlistentry>
</variablelist>
<para>The following example shows an <filename>/etc/failsafe/config/cad.options
</filename> file:</para>
<programlisting>-vv -lf /var/log/failsafe/cad_nodename --append_log</programlisting>
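<para>As a further illustration that uses only the options described above,
the following line raises logging to the most verbose level and uses the long
form of the log file option. The log file name is carried over from the
previous example; treat the line as a sketch, not a recommended setting.</para>
<programlisting>-vvvv --log_file /var/log/failsafe/cad_nodename --append_log</programlisting>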
<para>When you change the <filename>cad.options</filename> file, you must
restart the CAD processes with the <?Pub _nolinebreak><command>/etc/rc.d/init.d/fs_cluster
restart</command><?Pub /_nolinebreak> command for those changes to take effect.
</para>
</sect2>
<sect2>
<title>Configuring /etc/failsafe/config/cdbd.options for Linux FailSafe</title>
<para>The <filename>/etc/failsafe/config/cdbd.options</filename> file contains
the list of parameters that the cdbd daemon reads when the process is started.
The cdbd daemon is the configuration database daemon that manages the distribution
of the cluster configuration database (CDB) across the nodes in the pool.<indexterm
id="ITnodeconfig-6"><primary>/etc/failsafe/config/cdbd.options file</primary>
</indexterm> <indexterm id="ITnodeconfig-7"><primary>cdbd options file</primary>
</indexterm></para>
<para>The following options can be set in the <filename>cdbd.options</filename>
file:</para>
<variablelist>
<varlistentry><term><literal>-logevents</literal> <replaceable>eventname</replaceable></term>
<listitem>
<para>Log selected events. These event names may be used: <command>all</command>, <command>
internal</command>, <command>args</command>, <command>attach</command>, <command>
chandle</command>, <command>node</command>, <command>tree</command>, <command>
lock</command>, <command>datacon</command>, <command>trap</command>, <command>
notify</command>, <command>access</command>, <command>storage</command>.
</para>
<para>The default value for this option is <command>all</command>.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-logdest</literal> <replaceable>log_destination
</replaceable></term>
<listitem>
<para>Set log destination. These log destinations may be used: <command>all
</command>, <command>stdout</command>, <command>stderr</command>, <command>
syslog</command>, <command>logfile</command>. If multiple destinations are
specified, the log messages are written to all of them. If <command>logfile
</command> is specified, it has no effect unless the <literal>-logfile</literal> option is also
specified. If the log destination is <literal>stderr</literal> or <literal>
stdout</literal>, logging is disabled when <literal>cdbd</literal> runs
as a daemon, because <literal>stdout</literal> and <literal>stderr</literal>
are closed in that case.</para>
<para>The default value for this option is <command>logfile</command>.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-logfile</literal> <replaceable>filename</replaceable></term>
<listitem>
<para>Set log file name.</para>
<para>The default value is <filename>/var/log/failsafe/cdbd_log</filename>.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-logfilemax</literal> <replaceable>maximum_size
</replaceable></term>
<listitem>
<para>Set log file maximum size (in bytes). If the file exceeds the maximum
size, any preexisting <filename>filename.old</filename> will be deleted, the
current file will be renamed to <filename>filename.old</filename>, and a new
file will be created. A single message will not be split across files.</para>
<para>If <literal>-logfile</literal> is set, the default value for this option
is 10000000.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-loglevel</literal> <replaceable>log level</replaceable></term>
<listitem>
<para>Set log level. These log levels may be used: <command>always</command>, <command>
critical</command>, <command>error</command>, <command>warning</command>, <command>
info</command>, <command>moreinfo</command>, <command>freq</command>, <command>
morefreq</command>, <command>trace</command>, <command>busy</command>.</para>
<para>The default value for this option is <command>info</command>.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-trace</literal> <replaceable>trace class</replaceable></term>
<listitem>
<para>Trace selected events. These trace classes may be used: <command>all
</command>, <command>rpcs</command>, <command>updates</command>, <command>
transactions</command>, <command>monitor</command>. No tracing is done, even
if it is requested for one or more classes of events, unless either or both
of <literal>-tracefile</literal> or <literal>-tracelog</literal> is specified.
</para>
<para>The default value for this option is <command>transactions</command>.
</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-tracefile</literal> <replaceable>filename</replaceable></term>
<listitem>
<para>Set trace file name.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-tracefilemax</literal> <replaceable>maximum
size</replaceable></term>
<listitem>
<para>Set trace file maximum size (in bytes). If the file exceeds the maximum
size, any preexisting <filename>filename.old</filename> will be deleted,
the current file will be renamed to <filename>filename.old</filename>, and a new
file will be created.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-[no]tracelog</literal></term>
<listitem>
<para>[Do not] trace to log destination. When this option is set, tracing
messages are directed to the log destination or destinations. If there is
also a trace file, the tracing messages are written there as well.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-[no]parent_timer</literal></term>
<listitem>
<para>[Do not] exit when parent exits.</para>
<para>The default value for this option is <literal>-noparent_timer</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-[no]daemonize</literal></term>
<listitem>
<para>[Do not] run as a daemon.</para>
<para>The default value for this option is <literal>-daemonize</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-l</literal></term>
<listitem>
<para>Do not run as a daemon.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-h</literal></term>
<listitem>
<para>Print usage message.</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-o help</literal></term>
<listitem>
<para>Print usage message.</para>
</listitem>
</varlistentry>
</variablelist>
<para>Note that if you use the default values for these options, the system
is configured so that all log messages of level <command>info</command>
or less, and all trace messages for transaction events, are written to the file <?Pub _nolinebreak><filename>
/var/log/failsafe/cdbd_log</filename><?Pub /_nolinebreak>. When the file size
reaches 10 MB, this file is renamed with the <filename>
.old</filename> extension, and logging rolls over to a new file of the
original name. A single message is not split across files.</para>
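<para>For reference, the defaults described above could be written out
explicitly as in the following sketch. It is assembled from the default values
listed in this section and is not a shipped file. In particular, including <literal>
-tracelog</literal> is an assumption based on the statement that trace
messages for transaction events go to the log file.</para>
<programlisting>-logevents all -logdest logfile -logfile /var/log/failsafe/cdbd_log 
-logfilemax 10000000 -loglevel info -trace transactions -tracelog</programlisting>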
<para>The following example shows an <?Pub _nolinebreak><filename>/etc/failsafe/config/cdbd.options
</filename><?Pub /_nolinebreak> file that directs all <literal>cdbd</literal>
logging information to <?Pub _nolinebreak><filename>/var/log/messages</filename><?Pub /_nolinebreak>,
and all <literal>cdbd</literal> tracing information to <?Pub _nolinebreak><filename>
/var/log/failsafe/cdbd_ops1</filename><?Pub /_nolinebreak>. All log events
are being logged, and the following trace events are being logged: RPCs, updates
and transactions. When the size of the tracefile <?Pub _nolinebreak><filename>
/var/log/failsafe/cdbd_ops1</filename><?Pub /_nolinebreak> exceeds 100000000 bytes,
this file is renamed to <?Pub _nolinebreak><filename>/var/log/failsafe/cdbd_ops1.old
</filename><?Pub /_nolinebreak> and a new file <?Pub _nolinebreak><filename>
/var/log/failsafe/cdbd_ops1</filename><?Pub /_nolinebreak> is created. A single
message is not split across files.</para>
<programlisting>-logevents all -loglevel trace -logdest syslog -trace rpcs -trace 
updates -trace transactions -tracefile /var/log/failsafe/cdbd_ops1 
-tracefilemax 100000000</programlisting>
<para>The following example shows an <filename>/etc/failsafe/config/cdbd.options
</filename> file that directs all log and trace messages into one file, <filename>
/var/log/failsafe/cdbd_chaos6</filename>, for which a maximum size of 100000000 bytes
is specified. <literal>-tracelog</literal> directs the tracing to the log
file.</para>
<programlisting>-logevents all -loglevel trace -trace rpcs -trace updates -trace 
transactions -tracelog -logfile /var/log/failsafe/cdbd_chaos6 
-logfilemax 100000000 -logdest logfile</programlisting>
<para>When you change the <filename>cdbd.options</filename> file, you must
restart the <literal>cdbd</literal> processes with the <?Pub _nolinebreak><command>
/etc/rc.d/init.d/fs_cluster restart</command><?Pub /_nolinebreak> command
for those changes to take effect.</para>
</sect2>
<sect2 id="LE32812-PARENT">
<title id="LE32812-TITLE">Configuring /etc/failsafe/config/cmond.options for
Linux FailSafe</title>
<para>The <indexterm id="ITnodeconfig-8"><primary>/etc/failsafe/config/cmond.options
</primary></indexterm><filename>/etc/failsafe/config/cmond.options</filename>
file contains the list of parameters that the cluster monitor daemon (<command>
cmond</command>) reads when the process is started. It also specifies the
name of the file that logs <command>cmond</command> events. The cluster monitor
daemon provides a framework for starting, stopping, and monitoring process
groups. See the <command>cmond</command> man page for information on the cluster
monitor daemon.</para>
<para>The following options can be set in the <filename>cmond.options</filename>
file:</para>
<variablelist>
<varlistentry><term><literal>-L</literal> <replaceable>loglevel</replaceable></term>
<listitem>
<para>Set log level to <replaceable>loglevel</replaceable></para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-d</literal></term>
<listitem>
<para>Run in debug mode</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-l</literal></term>
<listitem>
<para>Lazy mode, where <command>cmond</command> does not validate its connection
to the cluster database</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-t </literal><replaceable>napinterval</replaceable></term>
<listitem>
<para>The time interval in milliseconds after which <command>cmond</command>
checks the liveness of the process groups it is monitoring</para>
</listitem>
</varlistentry>
<varlistentry><term><literal>-s</literal> [<replaceable>eventname</replaceable>]
</term>
<listitem>
<para>Log messages to <filename>stderr</filename></para>
</listitem>
</varlistentry>
</variablelist>
<para>A default <filename>cmond.options</filename> file is shipped with the
following options. This default options file logs <command>cmond</command>
events to the <filename>/var/log/failsafe/cmond_log</filename> file.</para>
<programlisting>-L info -f /var/log/failsafe/cmond_log</programlisting>
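<para>To change how often <command>cmond</command> checks its process groups,
the <literal>-t</literal> option can be added to the default line. In this
sketch, the interval of 5000 milliseconds is an illustrative value only; it is
not a documented default or a recommendation.</para>
<programlisting>-L info -f /var/log/failsafe/cmond_log -t 5000</programlisting>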
</sect2>
</sect1>
<sect1 id="LE13651-PARENT">
<title id="LE13651-TITLE">Additional Configuration Issues</title>
<para><indexterm id="ITnodeconfig-9"><primary>additional configuration issues
</primary></indexterm>During the hardware installation of Linux FailSafe nodes,
two additional issues must be considered:</para>
<itemizedlist>
<listitem><para><indexterm id="ITnodeconfig-10"><primary>Automatic booting
</primary></indexterm> The Linux FailSafe software requires that each node
boot automatically when it is reset or powered on. On x86 systems, this
behavior depends on the BIOS configuration. Some PC BIOSes hang indefinitely
when they encounter an error, which is unacceptable in a high-availability
configuration. On other platforms, such as PowerPC and Alpha, the necessary
steps vary.</para>
<para>A related, but not identical, issue is rebooting after a kernel panic.
To ensure that the system reboots even after a kernel failure, set
the panic value in a system boot file, such as <filename>init.d/boot.local
</filename>:</para>
<programlisting>echo "<replaceable>number</replaceable>" > /proc/sys/kernel/panic
</programlisting>
<para><replaceable>number</replaceable> is the number of seconds after a panic
before the system will reset.</para>
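<para>For instance, to reboot 30 seconds after a panic (the value 30 is only
illustrative), the boot file would contain the first line below. On
distributions that apply <filename>/etc/sysctl.conf</filename> at boot, the <literal>
kernel.panic</literal> entry shown second is an equivalent setting; whether
your distribution reads that file at boot is an assumption to verify.</para>
<programlisting># In a boot script such as init.d/boot.local:
echo "30" > /proc/sys/kernel/panic

# Or, as an entry in /etc/sysctl.conf:
kernel.panic = 30</programlisting>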
<para>If you prefer to require administrator intervention after
a hardware or kernel failure, you may leave this setting disabled.</para>
</listitem>
<listitem><para><indexterm id="ITnodeconfig-11"><primary>SCSI ID parameter
</primary></indexterm>The host IDs of the SCSI controllers on the nodes in a Linux
FailSafe cluster that uses physically shared storage must be different. If a cluster
has no shared storage or uses shared Fibre Channel storage, the value
of the SCSI host ID is not important.</para>
</listitem>
</itemizedlist>
<para>You can check the ID of most Linux controllers in the logged kernel
messages from boot time:</para>
<programlisting># <userinput>grep ID= /var/log/messages</userinput>
&lt;6>(scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs</programlisting>
<para>Changing the SCSI host ID is specific to the SCSI controller in use.
Refer to the controller documentation.</para>
<para>A controller uses its SCSI ID on all buses attached to it. Therefore,
you must make sure that no device attached to a node has a SCSI unit number
that is the same as the SCSI ID of that node's controller.</para>
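<para>To compare host IDs across nodes, the boot messages can be reduced to
controller and ID pairs. The following sketch assumes that the driver logs
lines in the format shown in the example above; other SCSI drivers report the
ID differently. The function name <literal>scsi_ids</literal> is hypothetical.</para>
<programlisting># scsi_ids [FILE]: print "controller ID" for each matching boot message.
# FILE defaults to /var/log/messages.
scsi_ids() {
    sed -n 's/.*(\(scsi[0-9]*\)).*SCSI ID=\([0-9]*\).*/\1 \2/p' "${1:-/var/log/messages}"
}</programlisting>
<para>Run the function on each node that shares a SCSI bus and confirm that
no two nodes report the same ID for the shared controller.</para>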
</sect1>
<sect1 id="LE39637-PARENT">
<title id="LE39637-TITLE">Choosing and Configuring Devices and Filesystems
</title>
<para><indexterm id="ITnodeconfig-12"><primary>filesystem creation</primary>
</indexterm> <indexterm id="ITnodeconfig-13"><primary>logical volume creation
</primary></indexterm> <indexterm id="ITnodeconfig-14"><primary>logical volume
</primary><secondary>creation</secondary></indexterm> The steps for creating devices, logical
volumes, and filesystems depend on the filesystems
and other tools you select. Documenting these steps is outside the scope of this
guide. Refer to your system and distribution-specific documentation
for assistance.</para>
<para>When you create the volumes and filesystems for use with Linux FailSafe,
remember these important points:</para>
<itemizedlist>
<listitem><para>If the shared disks are not in a RAID storage system, mirrored
volumes should be used.</para>
</listitem>
<listitem><para><indexterm id="ITnodeconfig-15"><primary>logical volume</primary>
<secondary>owner</secondary></indexterm>  Each device used must be owned by
the same node that is the primary node for the highly available applications
that use the logical volume. To simplify the management of the <replaceable>
nodenames</replaceable> (owners) of volumes on shared disks, follow these
recommendations:</para>
<itemizedlist>
<listitem><para>Work with the volumes on a shared disk from only one node
in the cluster.</para>
</listitem>
<listitem><para>After you create all the volumes on one node, you can selectively
shift the ownership to the other node.</para>
</listitem>
</itemizedlist>
</listitem>
<listitem><para>If the volumes you create are used as raw volumes (no filesystem)
for storing database data, the database system may require that the device
names have specific owners, groups, and modes. If this is the case (see the
documentation provided by the database vendor), use the <command>chown</command>
and <command>chmod </command> commands (see the <command>chown</command> and <command>
chmod</command> reference pages) to set the owner, group, and mode as required.
</para>
</listitem>
<listitem><para>No filesystem entries are made in <?Pub _nolinebreak><filename>
/etc/fstab</filename><?Pub /_nolinebreak> for filesystems on shared disks;
Linux FailSafe software mounts the filesystems on shared disks. However, to
simplify system administration, consider adding comments to <?Pub _nolinebreak><filename>
/etc/fstab</filename><?Pub /_nolinebreak> that list the filesystems configured
for Linux FailSafe. Thus, a system administrator who sees mounted Linux FailSafe
filesystems in the output of the <command>df</command> command and looks for
the filesystems in the <?Pub _nolinebreak><filename>/etc/fstab</filename><?Pub /_nolinebreak> file
will learn that they are filesystems managed by Linux FailSafe.</para>
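<para>Such comments might look like the following sketch. The device names and
mount points are hypothetical and only illustrate the idea; every line is a
comment, so <filename>/etc/fstab</filename> gains no active entries.</para>
<programlisting># Filesystems on shared disks are mounted by Linux FailSafe.
# Do not add fstab entries for them.
#   /dev/sdb1   /shared/stocks
#   /dev/sdc1   /shared/bonds</programlisting>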
</listitem>
<listitem><para>Be sure to create the mount point directory for each filesystem
on all nodes in the failover domain.</para>
</listitem>
</itemizedlist>
</sect1>
<sect1 id="LE97738-PARENT">
<title id="LE97738-TITLE">Configuring Network Interfaces</title>
<para><indexterm id="ITnodeconfig-16"><primary>network interface</primary>
<secondary>configuration</secondary></indexterm>The procedure in this section
describes how to configure the network interfaces on the nodes in a Linux
FailSafe cluster. The example shown in <xref linkend="LE47532-PARENT"> is
used in the procedure.</para>
<para><figure id="LE47532-PARENT">
<title id="LE47532-TITLE">Example Interface Configuration</title>
<graphic entityref="a2-1.examp.interface.config"></graphic>
</figure> <orderedlist>
<listitem><para>If possible, add every IP address, IP name, and IP alias for
the nodes to <filename>/etc/hosts</filename> on one node.</para>
<programlisting>190.0.2.1 xfs-ha1.company.com xfs-ha1
190.0.2.3 stocks
190.0.3.1 priv-xfs-ha1
190.0.2.2 xfs-ha2.company.com xfs-ha2
190.0.2.4 bonds
190.0.3.2 priv-xfs-ha2</programlisting>
<note>
<para>IP aliases that are used exclusively by highly available services should
not be added to system configuration files. These aliases will be added and
removed by Linux FailSafe.</para>
</note>
</listitem>
<listitem><para>Add all of the IP addresses from Step&nbsp;1
<!-- This hardcoded numeric reference should be updated.-->
to <filename>/etc/hosts</filename> on the other nodes in the cluster.
</para>
</listitem>
<listitem><para>If there are IP addresses, IP names, or IP aliases that you
did not add to <filename>/etc/hosts</filename> in Steps 1 and 2, verify that
NIS is configured on all nodes in the cluster.</para>
<para>If <literal>ypbind</literal> is <option>off</option>, you must start
NIS. See your distribution's documentation for details.</para>
</listitem>
<listitem><para>For IP addresses, IP names, and IP aliases that you did not
add to <filename>/etc/hosts</filename> on the nodes in Steps 1 and 2, verify
that they are in the NIS database by entering this command for each address:
</para>
<programlisting># <userinput>ypmatch <replaceable>address mapname</replaceable></userinput>
190.0.2.1 xfs-ha1.company.com xfs-ha1</programlisting>
<para><replaceable>address</replaceable> is an IP address, IP name, or IP
alias. <replaceable>mapname</replaceable> is <literal>hosts.byaddr</literal>
if <replaceable>address</replaceable> is an IP address; otherwise, it is <literal>hosts</literal>. If <command>
ypmatch</command> reports that <replaceable>address</replaceable> does not
match, you must add it to the NIS database. See your distribution's documentation
for details.</para>
</listitem>
<listitem><para>On one node, statically configure that node's interface and
IP address with the provided distribution tools. </para>
<para>For the example in <xref linkend="LE47532-PARENT">, on a SuSE system,
the public interface name and IP address lines are configured into <filename>
/etc/rc.config</filename> in the following variables.  Please note that YaST
is the preferred method for modifying these variables.  In any event, you
should refer to the documentation of your distribution for help here:</para>
<programlisting>NETDEV_0=eth0
IPADDR_0=$HOSTNAME</programlisting>
<para><literal>$HOSTNAME</literal> is an alias for an IP address that appears
in <filename>/etc/hosts</filename>.</para>
<para>If there are additional public interfaces, their interface names and
IP addresses appear on lines like these:</para>
<programlisting>NETDEV_1=
IPADDR_1=</programlisting>
<para>In the example, the control network name and IP address are:</para>
<programlisting>NETDEV_2=eth3
IPADDR_2=priv-$HOSTNAME</programlisting>
<para>The control network IP address in this example, <literal>priv-$HOSTNAME
</literal>, is an alias for an IP address that appears in <filename>/etc/hosts
</filename>.</para>
</listitem>
<listitem><para>Repeat Steps 5 and 6 on the other nodes.</para>
</listitem>
<listitem><para>Verify that Linux FailSafe is <option>off</option>
on each node:</para>
<programlisting># <userinput>/usr/lib/failsafe/bin/fsconfig failsafe</userinput>
# <userinput>if [ $? -eq 1 ]; then echo off; else echo on; fi</userinput></programlisting>
<para>If <literal>failsafe</literal> is <option>on</option> on any node, enter
this command on that node:</para>
<programlisting># <userinput>/usr/lib/failsafe/bin/fsconfig failsafe off</userinput></programlisting>
</listitem>
<listitem><para>Configure an e-mail alias on each node that sends Linux
FailSafe e-mail notifications of cluster transitions to a user outside the
Linux FailSafe cluster and to a user on each of the other nodes in the cluster.
For example, if there are two nodes called <literal>xfs-ha1</literal> and
<literal>xfs-ha2</literal>, add this line to <filename>/etc/aliases</filename>
on <literal>xfs-ha1</literal>:</para>
<programlisting>fsafe_admin:operations@console.xyz.com,admin_user@xfs-ha2.xyz.com</programlisting>
<para>On <literal>xfs-ha2</literal>, add this line to <filename>/etc/aliases</filename>:</para>
<programlisting>fsafe_admin:operations@console.xyz.com,admin_user@xfs-ha1.xyz.com</programlisting>
<para>The alias you choose, <literal>fsafe_admin</literal> in this case, is
the value you will use for the mail destination address when you configure
your system. In this example, <literal>operations</literal> is the user outside
the cluster and <literal>admin_user</literal> is a user on each node.</para>
</listitem>
<listitem><para>If the nodes use NIS (<literal>ypbind</literal> is enabled
to start at boot time) or the BIND domain name server (DNS), switching to
local name resolution is recommended. Additionally, modify the <filename>/etc/nsswitch.conf</filename>
file so that the <literal>hosts</literal> entry reads as follows:</para>
<programlisting>hosts:                  files nis dns</programlisting>
<note>
<para>Exclusive use of NIS or DNS for IP address lookup for the cluster nodes
has been shown to reduce availability in situations where the NIS service
becomes unreliable.</para>
</note>
</listitem>
<listitem><para>Reboot all nodes to put the new network configuration into
effect.</para>
</listitem>
</orderedlist></para>
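<para>Before rebooting, it can be useful to confirm the name-lookup order and
the mail alias on each node. The following is a minimal sketch only: the
function names <literal>check_hosts_order</literal> and <literal>check_mail_alias</literal>
are hypothetical helpers, not part of Linux FailSafe, and the default paths
assume the standard locations of <filename>/etc/nsswitch.conf</filename> and
<filename>/etc/aliases</filename>. Both functions only read files.</para>

```shell
#!/bin/sh
# Hypothetical helpers (not part of Linux FailSafe) for checking a node
# before the reboot step. Both only read files; nothing is modified.

# Succeeds if the first source on the "hosts:" line is "files".
# Optional argument: path to an nsswitch.conf-style file.
check_hosts_order() {
    conf=${1:-/etc/nsswitch.conf}
    # awk splits on whitespace, so the first source is the second field
    first=$(awk '$1 == "hosts:" { print $2; exit }' "$conf")
    if [ "$first" = "files" ]; then
        echo "ok: hosts lookups consult files first"
        return 0
    fi
    echo "warning: first hosts source is '${first:-none}', expected files"
    return 1
}

# Succeeds if the aliases file defines the given alias name.
# Arguments: alias name, optional path to an aliases-style file.
check_mail_alias() {
    name=$1
    aliases=${2:-/etc/aliases}
    grep -q "^${name}[[:space:]]*:" "$aliases"
}
```

<para>After sourcing the functions, <userinput>check_hosts_order</userinput>
and <userinput>check_mail_alias fsafe_admin</userinput> can be run on each
node; a non-zero exit status indicates that the corresponding step above still
needs attention.</para>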
</sect1>
<sect1 id="LE90681-PARENT">
<title id="LE90681-TITLE">Configuration for Reset</title>
<para>You can use one of the following methods for reset:<itemizedlist>
<listitem><para>EMP, which requires the following:<itemizedlist>
<listitem><para>Verify that the <command>getty</command> processes for serial
ports <literal>/dev/ttyS0</literal> and <literal>/dev/ttyS1</literal> are
turned off (this is normally the default).</para>
</listitem>
<listitem><para>Configure the BIOS.</para>
</listitem>
</itemizedlist></para>
</listitem>
<listitem><para>A serial port PCI board to supply additional serial ports.
</para>
</listitem>
<listitem><para>A USB serial port adapter to supply additional serial ports.
</para>
</listitem>
<listitem><para>A STONITH network-attached power switch, which requires that
you enable a <literal>getty</literal> on <?Pub _nolinebreak><literal>/dev/ttyS0</literal><?Pub /_nolinebreak>.</para>
</listitem>
</itemizedlist></para>
<sect2>
<title>Changing the getty Process</title>
<para>The <command>getty</command> process for serial ports <literal>/dev/ttyS0
</literal> and <literal>/dev/ttyS1</literal> should be off if you are using
the EMP port for reset. The <command>getty</command> process for serial port <literal>
/dev/ttyS0</literal> <indexterm id="ITnodeconfig-17"><primary>serial port
configuration</primary></indexterm> should be on if you are using STONITH.
</para>
<para>To change the setting, perform these steps on each node:</para>
<orderedlist>
<listitem><para>Open the file <filename>/etc/inittab</filename> for editing.
</para>
</listitem>
<listitem><para>Find the line for the port by looking at the comments on the
right for the port number.</para>
</listitem>
<listitem><para>Change the third field of this line to <option>off</option>
or <option>on</option>, as required. For example:</para>
<programlisting>t2:23:off:/sbin/getty -N ttyd2 co_9600          # port 2</programlisting>
</listitem>
<listitem><para>Save the file.</para>
</listitem>
<listitem><para>Enter these commands to make the change take effect:</para>
<programlisting># <userinput>killall getty</userinput>
# <userinput>init q</userinput></programlisting>
</listitem>
</orderedlist>
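<para>The edit in the steps above can be previewed on a copy of the file before
touching <filename>/etc/inittab</filename>. The following is a minimal sketch
under stated assumptions: <literal>set_getty_action</literal> is a hypothetical
helper, not a FailSafe or <command>init</command> tool; it assumes colon-separated
entries whose process field runs a <command>getty</command> and names the port
(for example <literal>ttyS0</literal>). It writes the modified text to standard
output and leaves the input file unchanged.</para>

```shell
#!/bin/sh
# Hypothetical helper: print an inittab-style file with the action
# (third field) changed for getty entries that name the given port.
# Arguments: port (e.g. ttyS0), action (e.g. off), path to a COPY of
# the file. The input file itself is never modified.
set_getty_action() {
    port=$1
    action=$2
    file=$3
    awk -F: -v port="$port" -v action="$action" '
        BEGIN { OFS = ":" }
        # Leave comment lines exactly as they are
        /^[ \t]*#/ { print; next }
        # Entries have at least four fields: id:runlevels:action:process
        NF >= 4 {
            if ($0 ~ /getty/) {
                if (index($0, port) > 0) {
                    $3 = action
                }
            }
        }
        { print }
    ' "$file"
}
```

<para>For example, <userinput>set_getty_action ttyS0 off /tmp/inittab.copy</userinput>
prints the edited text, which can be compared with the original before the real
file is changed and <command>init q</command> is run.</para>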
</sect2>
<sect2>
<title>Configuring the BIOS</title>
<para>To use the EMP for reset, you must enable the EMP port in the BIOS (server
systems shipped by SGI have it enabled by default). If you do not need a
serial console, you can also use the remaining serial port for reset. To do
so, enter the BIOS setup and disable the console redirection option.</para>
</sect2>
</sect1>
</chapter>
<?Pub *0000036802>