Overview of the Linux FailSafe System

This chapter provides an overview of the components and operation of the Linux FailSafe system.

High Availability and Linux FailSafe

In the world of mission-critical computing, the availability of information and computing resources is extremely important. The availability of a system is affected by how long it is unavailable after a failure in any of its components. Different degrees of availability are provided by different types of systems:

- Fault-tolerant systems (continuous availability). These systems use redundant components and specialized logic to ensure continuous operation and to provide complete data integrity. On these systems the degree of availability is extremely high. Some of these systems can also tolerate outages due to hardware or software upgrades. This solution is very expensive and requires specialized hardware and software.

- Highly available systems. These systems survive single points of failure by using redundant off-the-shelf components and specialized software. They provide a lower degree of availability than fault-tolerant systems, but at much lower cost. Typically these systems provide high availability only for client/server applications, and base their redundancy on cluster architectures with shared resources.

The Silicon Graphics® Linux FailSafe product provides a general facility for making services highly available. Linux FailSafe provides highly available services for a cluster that contains multiple nodes (an N-node configuration). Using Linux FailSafe, you can configure a highly available system in any of the following topologies:

- Basic two-node configuration
- Ring configuration
- Star configuration, in which multiple applications running on multiple nodes are backed up by one node
- Symmetric pool configuration

These configurations provide redundancy of processors and I/O controllers. Redundancy of storage can be obtained either through the use of multi-hosted RAID disk devices and mirrored disks, or with redundant disk systems that are kept in synchronization.

If one of the nodes in the cluster or one of the nodes' components fails, a different node in the cluster restarts the highly available services of the failed node. To clients, the services on the replacement node are indistinguishable from the original services before the failure occurred. It appears as if the original node has crashed and rebooted quickly; clients notice only a brief interruption in the highly available service.

In a Linux FailSafe highly available system, nodes can serve as backup for other nodes. Unlike the backup resources in a fault-tolerant system, which serve purely as redundant hardware for backup in case of failure, the resources of each node in a highly available system can be used during normal operation to run other applications that are not necessarily highly available services. All highly available services are owned and accessed by one node at a time.

Highly available services are monitored by the Linux FailSafe software. During normal operation, if a failure is detected on any of these components, a failover process is initiated. Using Linux FailSafe, you can define a failover policy to establish which node takes over the services under what conditions.
This process consists of resetting the failed node (to ensure data consistency), doing any recovery required by the failed-over services, and quickly restarting the services on the node that will take them over. Linux FailSafe supports selective failover, in which individual highly available applications can be failed over to a backup node independently of the other highly available applications on that node.

Linux FailSafe highly available services fall into two groups: highly available resources and highly available applications. Highly available resources include network interfaces, logical volumes, and filesystems such as ext2 or ReiserFS that have been configured for Linux FailSafe. Silicon Graphics has also developed Linux FailSafe NFS. Highly available applications can include applications such as NFS and Apache.

Linux FailSafe provides a framework for making additional applications into highly available services. If you want to add highly available applications on a Linux FailSafe cluster, you must write scripts to handle application monitoring functions. Information on developing these scripts is described in the Linux FailSafe Programmer's Guide. If you need assistance in this regard, contact SGI Global Services, which offers custom Linux FailSafe agent development and HA integration services.

Concepts

In order to use Linux FailSafe, you must understand the concepts in this section.

Cluster Node (or Node)

A cluster node is a single Linux execution environment; in other words, a single physical or virtual machine. In current Linux environments this will always be an individual computer. For brevity, this guide uses the term node to indicate this meaning, as opposed to any other meaning such as a network node.

Pool

A pool is the entire set of nodes having membership in a group of clusters. The clusters are usually close together and should always serve a common purpose. A replicated cluster configuration database is stored on each node in the pool.

Cluster

A cluster is a collection of one or more nodes coupled to each other by networks or other similar interconnections. A cluster belongs to one and only one pool. A cluster is identified by a simple name; this name must be unique within the pool. A particular node may be a member of only one cluster. All nodes in a cluster are also in the pool; however, not all nodes in the pool are necessarily in the cluster.

Node Membership

A node membership is the list of nodes in a cluster on which Linux FailSafe can allocate resource groups.

Process Membership

A process membership is the list of process instances in a cluster that form a process group. There can be multiple process groups per node.

Resource

A resource is a single physical or logical entity that provides a service to clients or other resources. For example, a resource can be a single disk volume, a particular network address, or an application such as a web server. A resource is generally available for use over time on two or more nodes in a cluster, although it can be allocated to only one node at any given time.

Resources are identified by a resource name and a resource type. One resource can be dependent on one or more other resources; if so, it will not be able to start (that is, be made available for use) unless the resources upon which it depends are also started.
Dependent resources must be part of the same resource group and are identified in a resource dependency list.

Resource Type

A resource type is a particular class of resource. All of the resources in a particular resource type can be handled in the same way for the purposes of failover. Every resource is an instance of exactly one resource type.

A resource type is identified by a simple name; this name should be unique within the cluster. A resource type can be defined for a specific node, or it can be defined for an entire cluster. A resource type definition for a specific node overrides a clusterwide resource type definition with the same name; this allows an individual node to override global settings from a clusterwide resource type definition.

Like resources, a resource type can be dependent on one or more other resource types. If such a dependency exists, at least one instance of each of the dependent resource types must be defined. For example, a resource type named Netscape_web might have resource type dependencies on resource types named IP_address and volume. If a resource named web1 is defined with the Netscape_web resource type, then the resource group containing web1 must also contain at least one resource of the type IP_address and one resource of the type volume.

The Linux FailSafe software includes some predefined resource types. If these types fit the application you want to make highly available, you can reuse them. If none fits, you can create additional resource types by using the instructions in the Linux FailSafe Programmer's Guide.

Resource Name

A resource name identifies a specific instance of a resource type. A resource name must be unique for a given resource type.

Resource Group

A resource group is a collection of interdependent resources. A resource group is identified by a simple name; this name must be unique within a cluster. The following table shows an example of the resources and their corresponding resource types for a resource group named WebGroup.

Example Resource Group

    Resource      Resource Type
    10.10.48.22   IP_address
    /fs1          filesystem
    vol1          volume
    web1          Netscape_web
If any individual resource in a resource group becomes unavailable for its intended use, then the entire resource group is considered unavailable. Therefore, a resource group is the unit of failover. Resource groups cannot overlap; that is, two resource groups cannot contain the same resource.
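To make the WebGroup example concrete, the following is a minimal sketch of the kind of checks a monitor for its resources might perform. The address, mount point, and URL are taken from the example table above; the commands and the simple exit convention are illustrative assumptions, not the FailSafe-supplied monitor scripts, which are written per resource type.

    #!/bin/sh
    # Illustrative checks only; real FailSafe monitor scripts are written
    # per resource type and follow the FailSafe script conventions.

    # Is the highly available IP address configured on some interface?
    /sbin/ifconfig -a | grep -qw "10.10.48.22" || exit 1

    # Is the filesystem /fs1 (on volume vol1) currently mounted?
    grep -q " /fs1 " /proc/mounts || exit 1

    # Does the web server answer on the highly available address?
    wget -q -O /dev/null http://10.10.48.22/ || exit 1

    exit 0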
Resource Dependency List

A resource dependency list is a list of resources upon which a resource depends. Each resource instance must have resource dependencies that satisfy its resource type dependencies before it can be added to a resource group.

Resource Type Dependency List

A resource type dependency list is a list of resource types upon which a resource type depends. For example, the filesystem resource type depends upon the volume resource type, and the Netscape_web resource type depends upon the filesystem and IP_address resource types.

For example, suppose a filesystem instance fs1 is mounted on volume vol1. Before fs1 can be added to a resource group, fs1 must be defined to depend on vol1. Linux FailSafe knows only that a filesystem instance must have one volume instance in its dependency list; this requirement is inferred from the resource type dependency list.

Failover

A failover is the process of allocating a resource group (or application) to another node, according to a failover policy. A failover may be triggered by the failure of a resource, a change in the node membership (such as when a node fails or starts), or a manual request by the administrator.

Failover Policy

A failover policy is the method used by Linux FailSafe to determine the destination node of a failover. A failover policy consists of the following:

- Failover domain
- Failover attributes
- Failover script

Linux FailSafe uses the failover domain output from a failover script along with failover attributes to determine on which node a resource group should reside. The administrator must configure a failover policy for each resource group. A failover policy name must be unique within the pool. Linux FailSafe includes predefined failover policies, but you can define your own failover algorithms as well.

Failover Domain

A failover domain is the ordered list of nodes on which a given resource group can be allocated. The nodes listed in the failover domain must be within the same cluster; however, the failover domain does not have to include every node in the cluster.

The administrator defines the initial failover domain when creating a failover policy. This list is transformed into a run-time failover domain by the failover script; Linux FailSafe uses the run-time failover domain along with failover attributes and the node membership to determine the node on which a resource group should reside. Linux FailSafe stores the run-time failover domain and uses it as input to the next failover script invocation. Depending on the run-time conditions and contents of the failover script, the initial and run-time failover domains may be identical.

In general, Linux FailSafe allocates a given resource group to the first node listed in the run-time failover domain that is also in the node membership; the point at which this allocation takes place is affected by the failover attributes.

Failover Attribute

A failover attribute is a string that affects the allocation of a resource group in a cluster. The administrator must specify system attributes (such as Auto_Failback or Controlled_Failback), and can optionally supply site-specific attributes.

Failover Scripts

A failover script is a shell script that generates a run-time failover domain and returns it to the Linux FailSafe process.
The Linux FailSafe process ha_fsd applies the failover attributes and then selects the first node in the returned failover domain that is also in the current node membership. The following failover scripts are provided with the Linux FailSafe release:

- ordered, which never changes the initial failover domain. When using this script, the initial and run-time failover domains are equivalent.
- round-robin, which selects the resource group owner in a round-robin (circular) fashion. This policy can be used for resource groups that can run on any node in the cluster.

If these scripts do not meet your needs, you can create a new failover script using the information in this guide.

Action Scripts

The action scripts are the set of scripts that determine how a resource is started, monitored, and stopped. There must be a set of action scripts specified for each resource type. The following is the complete set of action scripts that can be specified for each resource type:

- exclusive, which verifies that a resource is not already running
- start, which starts a resource
- stop, which stops a resource
- monitor, which monitors a resource
- restart, which restarts a resource on the same server after a monitoring failure occurs

The release includes action scripts for predefined resource types. If these scripts fit the resource type that you want to make highly available, you can reuse them by copying them and modifying them as needed. If none fits, you can create additional action scripts by using the instructions in the Linux FailSafe Programmer's Guide.
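As a rough illustration of the shape of an action script, the following is a minimal sketch of a start script for a hypothetical resource type named myapp. The daemon path, the way the resource name is passed, and the exit codes are assumptions made for this sketch; real action scripts should use the script library and conventions documented in the Linux FailSafe Programmer's Guide.

    #!/bin/sh
    # Hypothetical start script for a resource type named "myapp".
    # Assumption for this sketch: the resource name is passed as $1.

    RESOURCE="$1"

    # Start the application daemon for this resource instance.
    /usr/local/myapp/bin/myappd --instance "$RESOURCE" &

    # Give it a moment to initialize, then verify that it is running.
    sleep 2
    if pidof myappd > /dev/null; then
        exit 0      # success: the resource has started
    else
        exit 1      # failure: the start is treated as failed
    fi

The stop, monitor, exclusive, and restart scripts for the same resource type follow the same pattern: perform the action, then exit with a status that tells FailSafe whether it succeeded.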
Additional Linux FailSafe Features

Linux FailSafe provides the following features to increase the flexibility and ease of operation of a highly available system:

- Dynamic management
- Fine-grain failover
- Local restarts

These features are summarized in the following sections.

Dynamic Management

Linux FailSafe allows you to perform a variety of administrative tasks while the system is running:

- Dynamically managed application monitoring. Linux FailSafe allows you to turn monitoring of an application on and off while other highly available applications continue to run. This allows you to perform online application upgrades without bringing down the Linux FailSafe system.
- Dynamically managed Linux FailSafe resources. Linux FailSafe allows you to add resources while the Linux FailSafe system is online.
- Dynamically managed Linux FailSafe upgrades. Linux FailSafe allows you to upgrade Linux FailSafe software on one node at a time without taking down the entire Linux FailSafe cluster.

Fine-Grain Failover

Using Linux FailSafe, you can specify fine-grain failover. Fine-grain failover is a process in which a specific resource group is failed over from one node to another node while other resource groups continue to run on the first node, where possible. Fine-grain failover is possible in Linux FailSafe because the unit of failover is the resource group, not the entire node.

Local Restarts

Linux FailSafe allows you to fail over a resource group onto the same node. This feature enables you to configure a single-node system, where backup for a particular application is provided on the same machine, if possible. It also enables you to indicate that a specified number of local restarts be attempted before the resource group fails over to a different node.

Linux FailSafe Administration

You can perform all Linux FailSafe administrative tasks by means of the Linux FailSafe Cluster Manager Graphical User Interface (GUI). The Linux FailSafe GUI provides a guided interface to configure, administer, and monitor a Linux FailSafe-controlled highly available cluster. The Linux FailSafe GUI also provides screen-by-screen help text. If you wish, you can perform Linux FailSafe administrative tasks directly by means of the Linux FailSafe Cluster Manager CLI, which provides a command-line interface for the administration tasks. For information on the Linux FailSafe Cluster Manager tools and on Linux FailSafe configuration and administration tasks, see the relevant chapters of this guide.

Hardware Components of a Linux FailSafe Cluster

The following figure shows an example of Linux FailSafe hardware components, in this case for a two-node system.
Sample Linux FailSafe System Components
The hardware components of the Linux FailSafe system are as follows:

- Up to eight Linux nodes.
- Two or more interfaces on each node connected to control networks (Ethernet, FDDI, or any other available network interface). At least two network interfaces on each node are required for the control network heartbeat connection, by which each node monitors the state of the other nodes. The Linux FailSafe software also uses this connection to pass control messages between nodes. These interfaces have distinct IP addresses.
- A mechanism for remote reset of nodes. A reset ensures that the failed node is not using the shared disks when the replacement node takes them over.
- Disk storage and a SCSI bus shared by the nodes in the cluster. The nodes in the Linux FailSafe system can share dual-hosted disk storage over a shared fast-and-wide SCSI bus where this is supported by the SCSI controller and Linux driver. Note that few Linux drivers are currently known to implement this correctly; please check hardware compatibility lists if this is a configuration you plan to use. Fibre Channel solutions should universally support this.

The Linux FailSafe system is designed to survive a single point of failure. Therefore, when a system component fails, it must be restarted, repaired, or replaced as soon as possible to avoid the possibility of two or more failed components.
Linux FailSafe Disk Connections

A Linux FailSafe system supports the following disk connections:

- RAID support
  - Single controller or dual controllers
  - Single or dual hubs
  - Single or dual pathing
- JBOD support
  - Single or dual vaults
  - Single or dual hubs
- Network-mirrored support
  - Clustered filesystems such as GFS
  - Network-mirrored block devices, such as with DRBD

Network-mirrored devices are not discussed in the examples within this guide. However, the Linux FailSafe configuration items that are set for shared storage apply equally to network-mirrored storage. SCSI disks can be connected to two machines only. Fibre Channel disks can be connected to multiple machines.

Linux FailSafe Supported Configurations

Linux FailSafe supports the following highly available configurations:

- Basic two-node configuration
- Star configuration of multiple primary nodes and one backup node
- Ring configuration

You can use the following reset models when configuring a Linux FailSafe system:

- Server-to-server. Each server is directly connected to another for reset. May be unidirectional.
- Network. Each server can reset any other by sending a signal over the control network to a multiplexer.

The following sections provide descriptions of the different Linux FailSafe configurations.

Basic Two-Node Configuration

In a basic two-node configuration, the following arrangements are possible:

- All highly available services run on one node. The other node is the backup node. After failover, the services run on the backup node. In this case, the backup node is a hot standby for failover purposes only. The backup node can run other applications that are not highly available services.
- Highly available services run concurrently on both nodes. For each service, the other node serves as a backup node. For example, both nodes can be exporting different NFS filesystems. If a failover occurs, one node then exports all of the NFS filesystems.

Highly Available Resources

This section discusses the highly available resources that are provided on a Linux FailSafe system.

Nodes

If a node crashes or hangs (for example, due to a parity error or bus error), the Linux FailSafe software detects this. A different node, determined by the failover policy, takes over the failed node's services after resetting the failed node. If a node fails, its interfaces, access to storage, and services also become unavailable. See the succeeding sections for descriptions of how the Linux FailSafe system handles or eliminates these points of failure.

Network Interfaces and IP Addresses

Clients access the highly available services provided by the Linux FailSafe cluster using IP addresses. Each highly available service can use multiple IP addresses. The IP addresses are not tied to a particular highly available service; they can be shared by all the highly available services in the cluster.

Linux FailSafe uses the IP aliasing mechanism to support multiple IP addresses on a single network interface. Clients can use a highly available service that uses multiple IP addresses even when there is only one network interface in the server node. The IP aliasing mechanism allows a Linux FailSafe configuration that has a node with multiple network interfaces to be backed up by a node with a single network interface. IP addresses configured on multiple network interfaces are moved to the single interface on the other node in case of a failure.
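As an illustration of the IP aliasing mechanism, the following commands show how a highly available IP address can be added to and removed from an interface as an alias. The interface name, alias number, and netmask are assumptions for this sketch (the address comes from the WebGroup example); on a FailSafe system these operations are performed by the FailSafe software itself, not by hand.

    # Bring up the highly available address as an alias on eth0.
    /sbin/ifconfig eth0:1 10.10.48.22 netmask 255.255.255.0 up

    # Verify that the alias is configured.
    /sbin/ifconfig eth0:1

    # During a failover, the alias is removed here and configured on the
    # corresponding interface of the backup node.
    /sbin/ifconfig eth0:1 down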
Linux FailSafe requires that each network interface in a cluster have an IP address that does not fail over. These IP addresses, called fixed IP addresses, are used to monitor network interfaces. Each fixed IP address must be configured to a network interface at system boot time. All other IP addresses in the cluster are configured as highly available IP addresses.

Highly available IP addresses are configured on a network interface. During failover and recovery processes they are moved to another network interface in the other node by Linux FailSafe. Highly available IP addresses are specified when you configure the Linux FailSafe system. Linux FailSafe uses the ifconfig command to configure an IP address on a network interface and to move IP addresses from one interface to another.

In some networking implementations, IP addresses cannot be moved from one interface to another by using only the ifconfig command. Linux FailSafe uses re-MACing (MAC address impersonation) to support these networking implementations. Re-MACing moves the physical (MAC) address of a network interface to another interface. It is done by using the macconfig command. Re-MACing is done in addition to the standard ifconfig process that Linux FailSafe uses to move IP addresses. To do re-MACing in Linux FailSafe, a resource of type MAC_Address is used. Re-MACing can be used only on Ethernet networks; it cannot be used on FDDI networks.

Re-MACing is required when packets called gratuitous ARP packets are not passed through the network. These packets are generated automatically when an IP address is added to an interface (as in a failover process). They announce a new mapping of an IP address to a MAC address. This tells clients on the local subnet that a particular interface now has a particular IP address. Clients then update their internal ARP caches with the new MAC address for the IP address. (The IP address has just moved from interface to interface.) When gratuitous ARP packets are not passed through the network, the internal ARP caches of subnet clients cannot be updated. In these cases, re-MACing is used. This moves the MAC address of the original interface to the new interface. Thus, both the IP address and the MAC address are moved to the new interface, and the internal ARP caches of clients do not need updating.

Re-MACing is not done by default; you must specify that it be done for each pair of primary and secondary interfaces that requires it. A procedure elsewhere in this guide describes how you can determine whether re-MACing is required. In general, routers and PC/NFS clients may require re-MACing interfaces.

A side effect of re-MACing is that the original MAC address of an interface that has received a new MAC address is no longer available for use. Because of this, each network interface has to be backed up by a dedicated backup interface. This backup interface cannot be used by clients as a primary interface. (After a failover to this interface, packets sent to the original MAC address are ignored by every node on the network.) Each backup interface backs up only one network interface.

Disks

The Linux FailSafe cluster can include shared SCSI-based storage in the form of individual disks, RAID systems, or Fibre Channel storage systems. With mirrored volumes on the disks in a RAID or Fibre Channel system, the storage system should provide redundancy; no participation of the Linux FailSafe system software is required for a disk failure. If a disk controller fails, the Linux FailSafe system software initiates the failover process.
The following figure shows disk storage takeover on a two-node system. The surviving node takes over the shared disks and recovers the logical volumes and filesystems on the disks. This process is expedited by a filesystem such as ReiserFS or XFS, because journaling technology does not require the use of the fsck command for filesystem consistency checking.
Disk Storage Failover on a Two-Node System
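As a rough sketch of what disk storage takeover involves for a journaled filesystem, the following commands approximate the steps a filesystem start action performs on the surviving node. The volume group name is an assumption (LVM is used here purely for illustration), and the device and mount point come from the WebGroup example; the actual recovery steps depend on the volume manager and filesystem in use.

    # Activate the shared volume on the surviving node (volume-manager
    # specific; LVM is assumed for this sketch).
    vgchange -a y vg01

    # Mount the filesystem; a journaled filesystem (ReiserFS, XFS) replays
    # its log at mount time, so no full fsck is needed.
    mount -t reiserfs /dev/vg01/vol1 /fs1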
Highly Available Applications

Each application has a primary node and up to seven additional nodes that you can use as backup nodes, according to the failover policy you define. The primary node is the node on which the application runs when Linux FailSafe is in normal state. When a failure of any highly available resource or highly available application is detected by the Linux FailSafe software, all highly available resources in the affected resource group on the failed node are failed over to a different node and the highly available applications on the failed node are stopped. When these operations are complete, the highly available applications are started on the backup node.

All information about highly available applications, including the primary node, the components of the resource group, and the failover policy for the application and monitoring, is specified when you configure your Linux FailSafe system with the Cluster Manager GUI or with the Cluster Manager CLI. Information on configuring the system is provided later in this guide. Monitoring scripts detect the failure of a highly available application.

The Linux FailSafe software provides a framework for making applications highly available services. By writing scripts and configuring the system in accordance with those scripts, you can turn client/server applications into highly available applications. For information, see the Linux FailSafe Programmer's Guide.

Failover and Recovery Processes

When a failure is detected on one node (the node has crashed, hung, or been shut down, or a highly available service is no longer operating), a different node performs a failover of the highly available services that are being provided on the node with the failure (called the failed node). Failover allows all of the highly available services, including those provided by the failed node, to remain available within the cluster.

A failure in a highly available service can be detected by Linux FailSafe processes running on another node. Depending on which node detects the failure, the sequence of actions following the failure is different.

If the failure is detected by the Linux FailSafe software running on the same node, the failed node performs these operations:

- Stops the highly available resource group running on the node
- Moves the highly available resource group to a different node, according to the defined failover policy for the resource group
- Sends a message to the node that will take over the services to start providing all resource group services previously provided by the failed node

When it receives the message, the node that is taking over the resource group performs these operations:

- Transfers ownership of the resource group from the failed node to itself
- Starts offering the resource group services that were running on the failed node

If the failure is detected by Linux FailSafe software running on a different node, the node detecting the failure performs these operations:

- Using the serial connection between the nodes, reboots the failed node to prevent corruption of data
- Transfers ownership of the resource group from the failed node to the other nodes in the cluster, based on the resource group failover policy
- Starts offering the resource group services that were running on the failed node

When a failed node comes back up, whether the node automatically starts to provide highly available services again depends on the failover policy you define, as described elsewhere in this guide.
Normally, a node that experiences a failure automatically reboots and resumes providing highly available services. This scenario works well for transient errors (as well as for planned outages for equipment and software upgrades). However, if there are persistent errors, automatic reboot can cause recovery and then an immediate failover again. To prevent this, the Linux FailSafe software checks how long the rebooted node has been up since the last time it was started. If the interval is less than five minutes (by default), the Linux FailSafe software automatically disables Linux FailSafe from starting at boot on the failed node and does not start up the Linux FailSafe software on this node. It also writes error messages to /var/log/failsafe and to the appropriate log file.

Overview of Configuring and Testing a New Linux FailSafe Cluster

After the Linux FailSafe cluster hardware has been installed, follow this general procedure to configure and test the Linux FailSafe system:

1. Become familiar with Linux FailSafe terms by reviewing this chapter.
2. Plan the configuration of highly available applications and services on the cluster.
3. Perform various administrative tasks, including the installation of prerequisite software, that are required by Linux FailSafe.
4. Define the Linux FailSafe configuration.
5. Test the Linux FailSafe system in three phases: test individual components prior to starting Linux FailSafe software, test normal operation of the Linux FailSafe system, and simulate failures to test the operation of the system after a failure occurs.

These tasks are described in the remaining chapters of this guide.

Linux FailSafe System Software

This section describes the software layers, communication paths, and cluster configuration database.

Layers

A Linux FailSafe system has the following software layers:

- Plug-ins, which create highly available services. If the application plug-in you want is not available, you can hire the Silicon Graphics Global Services group to develop the required software, or you can use the Linux FailSafe Programmer's Guide to write the software yourself.
- Linux FailSafe base, which includes the ability to define resource groups and failover policies.
- High-availability cluster infrastructure, which lets you define clusters, resources, and resource types (this consists of the cluster_services installation package).
- Cluster software infrastructure, which lets you do the following: perform node logging, administer the cluster, and define nodes. The cluster software infrastructure consists of the cluster_admin and cluster_control subsystems.

The following figure shows a graphic representation of these layers, and the table after it describes the daemons in each layer, which are located in the /usr/lib/failsafe/bin directory.
Software Layers
Contents of /usr/lib/failsafe/bin

Linux FailSafe base:
  failsafe2 subsystem
    ha_fsd    Linux FailSafe daemon. Provides the basic component of the Linux FailSafe software.

High-availability cluster infrastructure:
  cluster_ha subsystem
    ha_cmsd   Cluster membership daemon. Provides the list of nodes, called the node membership, available to the cluster.
    ha_gcd    Group membership daemon. Provides group membership and reliable communication services to Linux FailSafe processes in the presence of failures.
    ha_srmd   System resource manager daemon. Manages resources, resource groups, and resource types. Executes action scripts for resources.
    ha_ifd    Interface agent daemon. Monitors the local node's network interfaces.

Cluster software infrastructure:
  cluster_admin subsystem
    cad       Cluster administration daemon. Provides administration services.
  cluster_control subsystem
    crsd      Node control daemon. Monitors the serial connection to other nodes. Has the ability to reset other nodes.
    cmond     Daemon that manages all other daemons. This process starts other processes on all nodes in the cluster and restarts them on failures.
    cdbd      Cluster configuration database daemon. Manages the configuration database and keeps each copy in sync on all nodes in the pool.
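As a quick way to see these layers on a running node, you can look for the daemons listed above with standard process tools; which processes are present depends on which subsystems are installed and started on that node.

    # List the FailSafe and cluster infrastructure daemons running on this node.
    ps -ef | egrep 'ha_fsd|ha_cmsd|ha_gcd|ha_srmd|ha_ifd|cad|crsd|cmond|cdbd' | grep -v egrep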
Communication Paths

The following figures show communication paths in Linux FailSafe. Note that they do not represent cmond.
Read/Write Actions to the Cluster Configuration Database
The following figure shows the communication path for a node that is in the pool but not in a cluster.
Communication Path for a Node that is Not in a Cluster
Conditions Under Which Action Scripts Are Executed

Action scripts are executed under the following conditions:

- exclusive: the resource group is made online by the user or HA processes are started
- start: the resource group is made online by the user, HA processes are started, or there is a resource group failover
- stop: the resource group is made offline, HA processes are stopped, the resource group fails over, or the node is shut down
- monitor: the resource group is online
- restart: the monitor script fails

When Does FailSafe Execute Action and Failover Scripts

The order of execution is as follows:

1. Linux FailSafe is started, usually at node boot or manually, and reads the resource group information from the cluster configuration database.
2. Linux FailSafe asks the system resource manager (SRM) to run exclusive scripts for all resource groups that are in the Online ready state.
3. SRM returns one of the following states for each resource group:
   - running
   - partially running
   - not running
4. If a resource group has a state of not running on a node where HA services have been started, the following occurs:
   - Linux FailSafe runs the failover policy script associated with the resource group. The failover policy script takes the list of nodes that are capable of running the resource group (the failover domain) as a parameter.
   - The failover policy script returns an ordered list of nodes, in descending order of priority (the run-time failover domain), where the resource group can be placed.
   - Linux FailSafe sends a request to SRM to move the resource group to the first node in the run-time failover domain.
   - SRM executes the start action script for all resources in the resource group. If the start script fails, the resource group is marked online on that node with an srmd executable error. If the start script is successful, SRM automatically starts monitoring those resources; after the specified start monitoring time passes, SRM executes the monitor action script for the resources in the resource group.
5. If the state of the resource group is running or partially running on only one node in the cluster, Linux FailSafe runs the associated failover policy script:
   - If the highest priority node is the same node where the resource group is partially running or running, the resource group is made online on that node. In the partially running case, Linux FailSafe asks SRM to execute start scripts for the resources in the resource group that are not running.
   - If the highest priority node is another node in the cluster, Linux FailSafe asks SRM to execute stop action scripts for the resources in the resource group, and then makes the resource group online on the highest priority node in the cluster.
6. If the state of the resource group is running or partially running on multiple nodes in the cluster, the resource group is marked with an exclusivity error. These resource groups require operator intervention to become online in the cluster.

The following figure shows the message paths for action scripts and failover policy scripts.
Message Paths for Action Scripts and Failover Policy Scripts
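To make the role of the failover policy script more concrete, here is a minimal sketch of an "ordered"-style policy script. The calling convention shown (the failover domain passed as command-line arguments, the run-time failover domain written to standard output, one node per line) is a simplifying assumption for this sketch; the real interface between ha_fsd and failover scripts is defined in the Linux FailSafe Programmer's Guide.

    #!/bin/sh
    # Hypothetical failover policy script that behaves like the "ordered"
    # policy: it returns the failover domain unchanged, highest priority
    # node first.

    if [ $# -eq 0 ]; then
        echo "usage: $0 node [node ...]" >&2
        exit 1
    fi

    # An "ordered" policy never reorders the domain.
    for node in "$@"; do
        echo "$node"
    done

    exit 0

A round-robin-style script would instead rotate the list so that ownership cycles through the nodes; the surrounding logic stays the same.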
Components

The cluster configuration database is a key component of the Linux FailSafe software. It contains all information about the following:

- Resources
- Resource types
- Resource groups
- Failover policies
- Nodes
- Clusters

The cluster configuration database daemon (cdbd) maintains identical databases on each node in the cluster.

The following are the contents of the failsafe directories under the /usr/lib and /var hierarchies:

/var/run/failsafe/comm/
    Directory that contains files used for communication between the various daemons.
/usr/lib/failsafe/common_scripts/
    Directory that contains the script library (the common functions that may be used in action scripts).
/var/log/failsafe/
    Directory that contains the logs of all scripts and daemons executed by Linux FailSafe. The outputs and errors from the commands within the scripts are logged in the script_nodename file.
/usr/lib/failsafe/policies/
    Directory that contains the failover scripts used for resource groups.
/usr/lib/failsafe/resource_types/template
    Directory that contains the template action scripts.
/usr/lib/failsafe/resource_types/rt_name
    Directory that contains the action scripts for the rt_name resource type. For example, /usr/lib/failsafe/resource_types/filesystem.
resource_types/rt_name/exclusive
    Script that verifies that a resource of this resource type is not already running. For example, resource_types/filesystem/exclusive.
resource_types/rt_name/monitor
    Script that monitors a resource of this resource type.
resource_types/rt_name/restart
    Script that restarts a resource of this resource type on the same node after a monitoring failure.
resource_types/rt_name/start
    Script that starts a resource of this resource type.
resource_types/rt_name/stop
    Script that stops a resource of this resource type.

The following administrative commands are available for use in scripts:

    Command           Purpose
    ha_cilog          Logs messages to the script_nodename log files.
    ha_execute_lock   Executes a command with a file lock, so that command execution can be serialized.
    ha_exec2          Executes a command and retries the command on failure or timeout.
    ha_filelock       Locks a file.
    ha_fileunlock     Unlocks a file.
    ha_ifdadmin       Communicates with the ha_ifd network interface agent daemon.
    ha_http_ping2     Checks whether a web server is running.
    ha_macconfig2     Displays or modifies the MAC address of a network interface.
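For illustration, the following sketch shows how the kind of serialization that ha_execute_lock and ha_filelock provide could be expressed with the standard Linux flock utility. This is a conceptual sketch only; within FailSafe action scripts you would use the commands listed above rather than flock, and the lock file path shown is an assumption.

    #!/bin/sh
    # Conceptual equivalent of serialized command execution: only one
    # script at a time runs the critical section, provided all scripts
    # agree on the same lock file.

    LOCKFILE=/var/lock/myresource.lock    # hypothetical lock file

    (
        flock -x 9 || exit 1    # take an exclusive lock on fd 9
        # ... commands that must not run concurrently, for example
        # starting or stopping a shared resource ...
    ) 9>"$LOCKFILE"

    exit 0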