Installing Linux FailSafe Software and Preparing the System

This chapter describes several system administration procedures that must be performed on the nodes in a cluster to prepare and configure them for Linux FailSafe. These procedures assume that you have done the planning described elsewhere in this guide. The major sections in this chapter are as follows:

    Overview of Configuring Nodes for Linux FailSafe
    Installing Required Software
    Configuring System Files
    Additional Configuration Issues
    Choosing and Configuring Devices and Filesystems
    Configuring Network Interfaces
    Configuration for Reset

Overview of Configuring Nodes for Linux FailSafe

Performing the system administration procedures required to prepare nodes for Linux FailSafe involves these steps:

Install the required software, as described in "Installing Required Software".

Configure the system files on each node, as described in "Configuring System Files".

Check the setting of two important configuration issues on each node, as described in "Additional Configuration Issues".

Create the devices and filesystems required by the highly available applications you plan to run on the cluster. See "Choosing and Configuring Devices and Filesystems".

Configure the network interfaces on the nodes using the procedure in "Configuring Network Interfaces".

Configure the serial ports used on each node for the serial connection to the other nodes by following the procedure in "Configuration for Reset".

When you are ready, configure the nodes so that Linux FailSafe software starts up when they are rebooted.

To complete the configuration of nodes for Linux FailSafe, you must configure the components of the Linux FailSafe system, as described elsewhere in this guide.

Installing Required Software

The Linux FailSafe base CD requires about 25 MB.

To install the software, follow these steps:

Make sure all servers in the cluster are running a supported release of Linux.

Depending on the servers and storage in the configuration and the Linux revision level, install the latest patches that are required for the platform and applications.

On each system in the pool, install the version of the multiplexer driver that is appropriate to the operating system. Use the CD that accompanies the multiplexer. Reboot the system after installation.
On each node that is part of the pool, install the following software, in order:

    sysadm_base-tcpmux
    sysadm_base-lib
    sysadm_base-server
    cluster_admin
    cluster_services
    failsafe
    sysadm_failsafe-server

Note: You must install the sysadm_base-tcpmux, sysadm_base-server, and sysadm_failsafe packages on those nodes from which you want to run the FailSafe GUI. If you do not want to run the GUI on a specific node, you do not need to install these software packages on that node.

If the pool nodes are to be administered by a Web-based version of the Linux FailSafe Cluster Manager GUI, install the following subsystems, in order:

    IBMJava118-JRE
    sysadm_base-client
    sysadm_failsafe-web

If the workstation launches the GUI client from a Web browser that supports Java™, install:

    java_plugin from the Linux FailSafe CD

If the Java plug-in is not installed when the Linux FailSafe Manager GUI is run from a browser, the browser is redirected to http://java.sun.com/products/plugin/1.1/plugin-install.html

After installing the Java plug-in, you must close all browser windows and restart the browser.

For a non-Linux workstation, download the Java Plug-in from http://java.sun.com/products/plugin/1.1/plugin-install.html (the same site to which the browser is redirected when the plug-in is missing).

    sysadm_failsafe-client

Install software on the administrative workstation (GUI client). If the workstation runs the GUI client from a Linux desktop, install these subsystems:

    IBMJava118-JRE
    sysadm_base-client

On the appropriate servers, install other optional software, such as storage management or network board software.

Install patches that are required for the platform and applications.

Configuring System Files

When you install the Linux FailSafe software, there are some system file considerations you must take into account.
This section describes the required and optional changes you make to the following files for every node in the pool:

    /etc/services
    /etc/failsafe/config/cad.options
    /etc/failsafe/config/cdbd.options
    /etc/failsafe/config/cmond.options

Configuring /etc/services for Linux FailSafe

The /etc/services file must contain entries for sgi-cmsd, sgi-crsd, sgi-gcd, and sgi-cad on each node before HA services are started on the node. The port numbers assigned for these processes must be the same on all nodes in the cluster. Note that sgi-cad requires a TCP port.

The following shows an example of /etc/services entries for sgi-cmsd, sgi-crsd, sgi-gcd, and sgi-cad:

    sgi-cmsd 7000/udp # SGI Cluster Membership Daemon
    sgi-crsd 17001/udp # Cluster reset services daemon
    sgi-gcd 17002/udp # SGI Group Communication Daemon
    sgi-cad 17003/tcp # Cluster Admin daemon

Configuring /etc/failsafe/config/cad.options for Linux FailSafe

The /etc/failsafe/config/cad.options file contains the list of parameters that the cluster administration daemon (CAD) reads when the process is started. The CAD provides cluster information to the Linux FailSafe Cluster Manager GUI.

The following options can be set in the cad.options file:

--append_log
    Append CAD logging information to the CAD log file instead of overwriting it.

--log_file filename
    CAD log file name. Alternatively, this can be specified as -lf filename.

-vvvv
    Verbosity level. The number of "v"s indicates the level of logging. Setting -v logs the fewest messages. Setting -vvvv logs the highest number of messages.

The following example shows an /etc/failsafe/config/cad.options file:

    -vv -lf /var/log/failsafe/cad_nodename --append_log

When you change the cad.options file, you must restart the CAD processes with the /etc/rc.d/init.d/fs_cluster restart command for those changes to take effect.
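The /etc/services requirements described earlier in this section (all four entries present, sgi-cad on a TCP port, identical port numbers on every node) can be checked mechanically. The following is a minimal Python sketch under stated assumptions: it works on the text of each node's /etc/services file, which you are assumed to have collected yourself, and the function names are hypothetical helpers, not part of Linux FailSafe.

```python
"""Sketch: check the FailSafe entries in /etc/services text from several nodes."""

# Service names required by the text above.
REQUIRED_SERVICES = ("sgi-cmsd", "sgi-crsd", "sgi-gcd", "sgi-cad")


def parse_services(text):
    """Return {service_name: {(port, protocol), ...}} for the required services."""
    found = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2 or fields[0] not in REQUIRED_SERVICES:
            continue
        port, _, proto = fields[1].partition("/")
        found.setdefault(fields[0], set()).add((int(port), proto))
    return found


def check_node(text):
    """Return a list of problems found in one node's /etc/services text."""
    entries = parse_services(text)
    problems = [f"missing entry: {name}"
                for name in REQUIRED_SERVICES if name not in entries]
    cad = entries.get("sgi-cad", set())
    if cad and not any(proto == "tcp" for _, proto in cad):
        problems.append("sgi-cad must use a TCP port")
    return problems


def ports_consistent(texts_by_node):
    """True if every node assigns the same ports to the required services."""
    parsed = [parse_services(text) for text in texts_by_node.values()]
    return all(entry == parsed[0] for entry in parsed)
```

A possible use is to read /etc/services on each node, pass each text to check_node, and then pass a dictionary keyed by node name to ports_consistent before starting HA services.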
Configuring /etc/failsafe/config/cdbd.options for Linux FailSafe

The /etc/failsafe/config/cdbd.options file contains the list of parameters that the cdbd daemon reads when the process is started. The cdbd daemon is the configuration database daemon that manages the distribution of the cluster configuration database (CDB) across the nodes in the pool.

The following options can be set in the cdbd.options file:

-logevents eventname
    Log selected events. These event names may be used: all, internal, args, attach, chandle, node, tree, lock, datacon, trap, notify, access, storage. The default value for this option is all.

-logdest log_destination
    Set log destination. These log destinations may be used: all, stdout, stderr, syslog, logfile. If multiple destinations are specified, the log messages are written to all of them. If logfile is specified, it has no effect unless the -logfile option is also specified. If the log destination is stderr or stdout, logging is disabled when cdbd runs as a daemon, because stdout and stderr are closed when cdbd is running as a daemon. The default value for this option is logfile.

-logfile filename
    Set log file name. The default value is /var/log/failsafe/cdbd_log.

-logfilemax maximum_size
    Set log file maximum size (in bytes). If the file exceeds the maximum size, any preexisting filename.old will be deleted, the current file will be renamed to filename.old, and a new file will be created. A single message will not be split across files. If -logfile is set, the default value for this option is 10000000.

-loglevel log level
    Set log level. These log levels may be used: always, critical, error, warning, info, moreinfo, freq, morefreq, trace, busy. The default value for this option is info.

-trace trace class
    Trace selected events. These trace classes may be used: all, rpcs, updates, transactions, monitor.
    No tracing is done, even if it is requested for one or more classes of events, unless either or both of -tracefile or -tracelog is specified. The default value for this option is transactions.

-tracefile filename
    Set trace file name.

-tracefilemax maximum size
    Set trace file maximum size (in bytes). If the file exceeds the maximum size, any preexisting filename.old will be deleted and the current file will be renamed to filename.old.

-[no]tracelog
    [Do not] trace to log destination. When this option is set, tracing messages are directed to the log destination or destinations. If there is also a trace file, the tracing messages are written there as well.

-[no]parent_timer
    [Do not] exit when parent exits. The default value for this option is -noparent_timer.

-[no]daemonize
    [Do not] run as a daemon. The default value for this option is -daemonize.

-l
    Do not run as a daemon.

-h
    Print usage message.

-o help
    Print usage message.

Note that if you use the default values for these options, the system will be configured so that all log messages of level info or less, and all trace messages for transaction events, are sent to the file /var/log/failsafe/cdbd_log. When the file size reaches 10 MB, this file will be moved to its namesake with the .old extension, and logging will roll over to a new file of the same name. A single message will not be split across files.

The following example shows an /etc/failsafe/config/cdbd.options file that directs all cdbd logging information to /var/log/messages, and all cdbd tracing information to /var/log/failsafe/cdbd_ops1. All log events are being logged, and the following trace events are being logged: RPCs, updates, and transactions. When the size of the trace file /var/log/failsafe/cdbd_ops1 exceeds 100000000, this file is renamed to /var/log/failsafe/cdbd_ops1.old and a new file /var/log/failsafe/cdbd_ops1 is created. A single message is not split across files.
    -logevents all -loglevel trace -logdest syslog -trace rpcs -trace updates -trace transactions -tracefile /var/log/failsafe/cdbd_ops1 -tracefilemax 100000000

The following example shows an /etc/failsafe/config/cdbd.options file that directs all log and trace messages into one file, /var/log/failsafe/cdbd_chaos6, for which a maximum size of 100000000 is specified. The -tracelog option directs the tracing to the log file.

    -logevents all -loglevel trace -trace rpcs -trace updates -trace transactions -tracelog -logfile /var/log/failsafe/cdbd_chaos6 -logfilemax 100000000 -logdest logfile

When you change the cdbd.options file, you must restart the cdbd processes with the /etc/rc.d/init.d/fs_cluster restart command for those changes to take effect.

Configuring /etc/failsafe/config/cmond.options for Linux FailSafe

The /etc/failsafe/config/cmond.options file contains the list of parameters that the cluster monitor daemon (cmond) reads when the process is started. It also specifies the name of the file that logs cmond events. The cluster monitor daemon provides a framework for starting, stopping, and monitoring process groups. See the cmond man page for information on the cluster monitor daemon.

The following options can be set in the cmond.options file:

-L loglevel
    Set log level to loglevel.

-d
    Run in debug mode.

-l
    Lazy mode, where cmond does not validate its connection to the cluster database.

-t napinterval
    The time interval, in milliseconds, after which cmond checks for liveliness of the process groups it is monitoring.

-s [eventname]
    Log messages to stderr.

A default cmond.options file is shipped with the following options. This default options file logs cmond events to the /var/log/failsafe/cmond_log file.
    -L info -f /var/log/failsafe/cmond_log

Additional Configuration Issues

During the hardware installation of Linux FailSafe nodes, two additional issues must be considered:

Automatic booting

The Linux FailSafe software requires the nodes to boot automatically when they are reset or powered on. Linux on x86 depends on the BIOS configuration to ensure this. Some PC BIOSes hang indefinitely upon error, which is not acceptable in a high availability configuration. On other platforms, such as PowerPC and Alpha, the necessary steps will vary.

A related, but not identical, issue is that of reboots on kernel panics. To ensure that the system reboots even in the case of a kernel failure, set the panic value in a system boot file, such as init.d/boot.local:

    echo "number" > /proc/sys/kernel/panic

number is the number of seconds after a panic before the system will reset.

If you would prefer administrator intervention to be required during a hardware or kernel failure, you may leave this disabled.

SCSI ID parameter

The SCSI controllers' host IDs of the nodes in a Linux FailSafe cluster using physically shared storage must be different. If a cluster has no shared storage or is using shared Fibre Channel storage, the value of the SCSI host ID is not important.

You can check the ID of most Linux controllers in the logged kernel messages from boot time:

    # grep ID= /var/log/messages
    <6>(scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs

The procedure for changing the SCSI host ID is specific to the SCSI controller in use. Refer to the controller documentation.

A controller uses its SCSI ID on all buses attached to it. Therefore, you must make sure that no device attached to a node uses the controller's SCSI ID as its own SCSI unit number.
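The SCSI host ID check described above can be sketched in Python. This is an illustration only: it assumes the kernel log lines look like the single example shown in this section (the format is driver-specific and may differ on your controllers), and scsi_host_ids and conflicting_ids are hypothetical helper names. It also does not know which controllers are attached to shared buses, so any conflict it reports still needs to be judged by hand.

```python
import re

# Matches lines such as: <6>(scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs
# Assumption: the controller name appears in parentheses before "SCSI ID=".
SCSI_ID_PATTERN = re.compile(r"\((scsi\d+)\).*?SCSI ID=(\d+)")


def scsi_host_ids(log_text):
    """Return {controller: host_id} found in kernel log text from one node."""
    ids = {}
    for line in log_text.splitlines():
        match = SCSI_ID_PATTERN.search(line)
        if match:
            ids[match.group(1)] = int(match.group(2))
    return ids


def conflicting_ids(ids_by_node):
    """Return {host_id: [nodes]} for host IDs that appear on more than one node."""
    seen = {}
    for node, ids in ids_by_node.items():
        for host_id in sorted(set(ids.values())):
            seen.setdefault(host_id, []).append(node)
    return {host_id: nodes for host_id, nodes in seen.items() if len(nodes) > 1}
```

For example, you might feed the output of the grep command above from each node into scsi_host_ids and then pass the per-node results to conflicting_ids; an empty result means no two nodes reported the same host ID.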
Choosing and Configuring Devices and Filesystems

Creating devices, logical volumes, and filesystems involves a variety of steps specific to the filesystems and other tools selected. Documenting these is outside the scope of this guide. Please refer to the system and distribution-specific documentation for more assistance in this area.

When you create the volumes and filesystems for use with Linux FailSafe, remember these important points:

If the shared disks are not in a RAID storage system, mirrored volumes should be used.

Each device used must be owned by the same node that is the primary node for the highly available applications that use the logical volume. To simplify the management of the nodenames (owners) of volumes on shared disks, follow these recommendations:

    Work with the volumes on a shared disk from only one node in the cluster.

    After you create all the volumes on one node, you can selectively shift the ownership to the other node.

If the volumes you create are used as raw volumes (no filesystem) for storing database data, the database system may require that the device names have specific owners, groups, and modes. If this is the case (see the documentation provided by the database vendor), use the chown and chmod commands (see the chown and chmod reference pages) to set the owner, group, and mode as required.

No filesystem entries are made in /etc/fstab for filesystems on shared disks; Linux FailSafe software mounts the filesystems on shared disks. However, to simplify system administration, consider adding comments to /etc/fstab that list the filesystems configured for Linux FailSafe. Thus, a system administrator who sees mounted Linux FailSafe filesystems in the output of the df command and looks for the filesystems in the /etc/fstab file will learn that they are filesystems managed by Linux FailSafe.
Be sure to create the mount point directory for each filesystem on all nodes in the failover domain.

Configuring Network Interfaces

The procedure in this section describes how to configure the network interfaces on the nodes in a Linux FailSafe cluster. The example shown in the following figure is used in the procedure.
Example Interface Configuration
If possible, add every IP address, IP name, and IP alias for the nodes to /etc/hosts on one node.

    190.0.2.1 xfs-ha1.company.com xfs-ha1
    190.0.2.3 stocks
    190.0.3.1 priv-xfs-ha1
    190.0.2.2 xfs-ha2.company.com xfs-ha2
    190.0.2.4 bonds
    190.0.3.2 priv-xfs-ha2

IP aliases that are used exclusively by highly available services should not be added to system configuration files. These aliases will be added and removed by Linux FailSafe.

Add all of the IP addresses from Step 1 to /etc/hosts on the other nodes in the cluster.

If there are IP addresses, IP names, or IP aliases that you did not add to /etc/hosts in Steps 1 and 2, verify that NIS is configured on all nodes in the cluster. If ypbind is off, you must start NIS. See your distribution's documentation for details.

For IP addresses, IP names, and IP aliases that you did not add to /etc/hosts on the nodes in Steps 1 and 2, verify that they are in the NIS database by entering this command for each address:

    # ypmatch address mapname
    190.0.2.1 xfs-ha1.company.com xfs-ha1

address is an IP address, IP name, or IP alias. mapname is hosts.byaddr if address is an IP address; otherwise, it is hosts. If ypmatch reports that address does not match, it must be added to the NIS database. See your distribution's documentation for details.

On one node, statically configure that node's interface and IP address with the tools provided by the distribution.

For the example in the figure "Example Interface Configuration", on a SuSE system, the public interface name and IP address are configured in /etc/rc.config in the following variables. Note that YaST is the preferred method for modifying these variables. In any event, refer to the documentation of your distribution for help:

    NETDEV_0=eth0
    IPADDR_0=$HOSTNAME

$HOSTNAME is an alias for an IP address that appears in /etc/hosts.
If there are additional public interfaces, their interface names and IP addresses appear on lines like these:

    NETDEV_1=
    IPADDR_1=

In the example, the control network name and IP address are

    NETDEV_2=eth3
    IPADDR_2=priv-$HOSTNAME

The control network IP address in this example, priv-$HOSTNAME, is an alias for an IP address that appears in /etc/hosts.

Repeat Steps 5 and 6 on the other nodes.

Verify that Linux FailSafe is off on each node:

    # /usr/lib/failsafe/bin/fsconfig failsafe
    # if [ $? -eq 1 ]; then echo off; else echo on; fi

If failsafe is on for any node, enter this command on that node:

    # /usr/lib/failsafe/bin/fsconfig failsafe off

Configure an e-mail alias on each node that sends the Linux FailSafe e-mail notifications of cluster transitions to a user outside the Linux FailSafe cluster and to a user on the other nodes in the cluster. For example, if there are two nodes called xfs-ha1 and xfs-ha2, add this line to /etc/aliases on xfs-ha1:

    fsafe_admin:operations@console.xyz.com,admin_user@xfs-ha2.xyz.com

On xfs-ha2, add this line to /etc/aliases:

    fsafe_admin:operations@console.xyz.com,admin_user@xfs-ha1.xyz.com

The alias you choose, fsafe_admin in this case, is the value you will use for the mail destination address when you configure your system. In this example, operations is the user outside the cluster and admin_user is a user on each node.

If the nodes use NIS (ypbind is enabled to start at boot time) or the BIND domain name server (DNS), switching to local name resolution is recommended. Additionally, you should modify the /etc/nsswitch.conf file so that it reads as follows:

    hosts: files nis dns

Exclusive use of NIS or DNS for IP address lookup for the cluster nodes has been shown to reduce availability in situations where the NIS or DNS service becomes unreliable.

Reboot both nodes to put the new network configuration into effect.
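The name-resolution recommendation above (local files consulted before NIS or DNS) can be verified with a short check. The following Python sketch is illustrative only: hosts_sources and files_first are hypothetical helper names, and the code inspects nsswitch.conf text that you supply rather than changing anything on the system.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove bracketed action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def files_first(nsswitch_text):
    """True if local files are consulted before any other source for hosts."""
    sources = hosts_sources(nsswitch_text)
    return bool(sources) and sources[0] == "files"
```

A possible use is to read /etc/nsswitch.conf on each node and confirm that files_first returns True before rebooting the nodes.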
Configuration for Reset

You can use one of the following methods for reset:

EMP, which requires the following:

    Verify that the getty processes for serial ports /dev/ttyS0 and /dev/ttyS1 are turned off (this is normally the default)

    Configure the BIOS

A serial port PCI board to supply additional serial ports

A USB serial port adapter to supply additional serial ports

STONITH network-attached power switch, which requires that you enable a getty on /dev/ttyS0.

Changing the getty Process

The getty process for serial ports /dev/ttyS0 and /dev/ttyS1 should be off if you are using the EMP port for reset. The getty process for serial port /dev/ttyS0 should be on if you are using STONITH.

To change the setting, perform these steps on each node:

Open the file /etc/inittab for editing.

Find the line for the port by looking at the comments on the right for the port number.

Change the third field of this line to off or on, as required. For example:

    t2:23:off:/sbin/getty -N ttyd2 co_9600 # port 2

Save the file.

Enter these commands to make the change take effect:

    # killall getty
    # init q

Configuring the BIOS

To use the EMP for reset, you must enable the EMP port in the BIOS (server systems shipped by SGI have it enabled by default). If you are comfortable not having a serial console available, then the remaining serial port can be used for reset purposes. This involves going into the BIOS and disabling the console redirection option.
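The /etc/inittab edit described under "Changing the getty Process" amounts to rewriting the third colon-separated field of one entry. The following Python sketch shows that transformation on inittab text; set_inittab_action is a hypothetical helper name, the entry id t2 is taken from the example line in this section, and the sketch neither writes the file nor runs killall getty or init q, which you would still do yourself.

```python
def set_inittab_action(inittab_text, entry_id, action):
    """Return inittab text with the third field of entry `entry_id` set to `action`.

    Each inittab entry has the form id:runlevels:action:process.
    Comment lines and entries with other ids are left unchanged.
    """
    out = []
    for line in inittab_text.splitlines():
        fields = line.split(":", 3)  # keep any colons inside the process field
        is_comment = line.lstrip().startswith("#")
        if not is_comment and len(fields) == 4 and fields[0] == entry_id:
            fields[2] = action
            line = ":".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"
```

For example, calling set_inittab_action(text, "t2", "off") on the contents of /etc/inittab returns text in which the t2 entry is turned off, matching the example line shown in the procedure.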