This section describes the software layers, communication paths, and cluster configuration database.
A Linux FailSafe system has the following software layers:
Plug-ins, which create highly available services. If the application plug-in you want is not available, you can hire the Silicon Graphics Global Services group to develop the required software, or you can use the Linux FailSafe Programmer's Guide to write it yourself.
Linux FailSafe base, which includes the ability to define resource groups and failover policies
High-availability cluster infrastructure that lets you define clusters, resources, and resource types (this consists of the cluster_services installation package)
Cluster software infrastructure, which lets you do the following:
Perform node logging
Administer the cluster
Define nodes
The cluster software infrastructure consists of the cluster_admin and cluster_control subsystems.
Figure 1-3 shows a graphic representation of these layers. Table 1-2 describes the layers for Linux FailSafe, which are located in the /usr/lib/failsafe/bin directory.
Table 1-2. Contents of /usr/lib/failsafe/bin
| Layer | Subsystem | Process | Description |
|---|---|---|---|
| Linux FailSafe base | failsafe2 | ha_fsd | Linux FailSafe daemon. Provides the basic component of the Linux FailSafe software. |
| High-availability cluster infrastructure | cluster_ha | ha_cmsd | Cluster membership daemon. Provides the list of nodes, called node membership, available to the cluster. |
| | | ha_gcd | Group membership daemon. Provides group membership and reliable communication services to Linux FailSafe processes in the presence of failures. |
| | | ha_srmd | System resource manager daemon. Manages resources, resource groups, and resource types. Executes action scripts for resources. |
| | | ha_ifd | Interface agent daemon. Monitors the local node's network interfaces. |
| Cluster software infrastructure | cluster_admin | cad | Cluster administration daemon. Provides administration services. |
| | cluster_control | crsd | Node control daemon. Monitors the serial connection to other nodes. Has the ability to reset other nodes. |
| | | cmond | Daemon that manages all other daemons. Starts the other processes on all nodes in the cluster and restarts them on failure. |
| | | cdbd | Configuration database daemon. Manages the configuration database and keeps each copy in sync on all nodes in the pool. |
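A quick way to see which of these daemons are present on a node is to scan the process table for the names in Table 1-2. The sketch below is a generic ps/grep probe, not an official FailSafe status command:

```shell
# Report which of the Table 1-2 daemons are running on this node.
# The daemon names come from the table; the ps probe is an approximation.
failsafe_daemon_status() {
    for d in ha_fsd ha_cmsd ha_gcd ha_srmd ha_ifd cad crsd cmond cdbd; do
        if ps -eo comm= | grep -qx "$d"; then
            echo "$d: running"
        else
            echo "$d: not running"
        fi
    done
}
```

On a node where HA services have not been started, every daemon reports not running.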
The following figures show communication paths in Linux FailSafe. Note that they do not represent cmond.
Figure 1-5 shows the communication path for a node that is in the pool but not in a cluster.
Action scripts are executed under the following conditions:
exclusive: the resource group is made online by the user or HA processes are started
start: the resource group is made online by the user, HA processes are started, or there is a resource group failover
stop: the resource group is made offline, HA processes are stopped, the resource group fails over, or the node is shut down
monitor: the resource group is online
restart: the monitor script fails
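The contract for a monitor action script is simple: exit 0 while the resource is healthy, nonzero on a monitoring failure (which triggers the restart script, as listed above). The following is an illustrative sketch for a hypothetical process-based resource type; the process-name probe is an assumption for demonstration, and a real script would source the library in /usr/lib/failsafe/common_scripts/:

```shell
# Illustrative monitor action for a hypothetical process-backed resource.
# Assumption: the resource is healthy if a process with its name exists.
monitor_resource() {
    resource_name=$1
    if ps -eo comm= | grep -qx "$resource_name"; then
        return 0    # resource is running; monitoring succeeds
    fi
    return 1        # monitoring failure; FailSafe would run restart next
}
```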
The order of execution is as follows:
Linux FailSafe is started, usually at node boot or manually, and reads the resource group information from the cluster configuration database.
Linux FailSafe asks the system resource manager (SRM) to run exclusive scripts for all resource groups that are in the Online ready state.
SRM returns one of the following states for each resource group:
running
partially running
not running
If a resource group has a state of not running in a node where HA services have been started, the following occurs:
Linux FailSafe runs the failover policy script associated with the resource group. The failover policy scripts take the list of nodes that are capable of running the resource group (the failover domain) as a parameter.
The failover policy script returns an ordered list of nodes in descending order of priority (the run-time failover domain) where the resource group can be placed.
Linux FailSafe sends a request to SRM to move the resource group to the first node in the run-time failover domain.
SRM executes the start action script for all resources in the resource group:
If the start script fails, the resource group is marked online on that node with an srmd executable error.
If the start script is successful, SRM automatically starts monitoring those resources. After the specified start monitoring time passes, SRM executes the monitor action script for the resource in the resource group.
If the state of the resource group is running or partially running on only one node in the cluster, Linux FailSafe runs the associated failover policy script:
If the highest priority node is the same node where the resource group is partially running or running, the resource group is made online on the same node. In the partially running case, Linux FailSafe asks SRM to execute start scripts for resources in the resource group that are not running.
If the highest priority node is another node in the cluster, Linux FailSafe asks SRM to execute stop action scripts for resources in the resource group. Linux FailSafe then makes the resource group online on the highest priority node in the cluster.
If the state of the resource group is running or partially running on multiple nodes in the cluster, the resource group is marked with an exclusivity error. These resource groups require operator intervention to become online in the cluster.
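The failover policy step above can be sketched in shell. Under the simplest ordered policy, the run-time failover domain is the configured failover domain with unavailable nodes filtered out. Here node availability is faked with a NODES_UP variable; a real policy script would query cluster membership:

```shell
# Sketch of an ordered failover policy script. Input: the configured
# failover domain, highest priority first. Output: the run-time failover
# domain, i.e. the same list restricted to currently available nodes.
NODES_UP="node1 node3"          # assumption: nodes currently available

ordered_policy() {
    result=""
    for node in "$@"; do
        for up in $NODES_UP; do
            if [ "$node" = "$up" ]; then
                result="$result${result:+ }$node"
            fi
        done
    done
    echo "$result"
}
```

With the assumed membership, `ordered_policy node1 node2 node3` yields `node1 node3`, and Linux FailSafe would ask SRM to place the resource group on node1.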
Figure 1-6 shows the message paths for action scripts and failover policy scripts.
The cluster configuration database is a key component of Linux FailSafe software. It contains all information about the following:
Resources
Resource types
Resource groups
Failover policies
Nodes
Clusters
The cluster configuration database daemon (cdbd) maintains identical databases on each node in the cluster.
The following are the contents of the failsafe directories under the /usr/lib and /var hierarchies:
/var/run/failsafe/comm/
Directory that contains files used for communication between the various daemons.
/usr/lib/failsafe/common_scripts/
Directory that contains the script library (the common functions that may be used in action scripts).
/var/log/failsafe/
Directory that contains the logs of all scripts and daemons executed by Linux FailSafe. The outputs and errors from the commands within the scripts are logged in the script_nodename file.
/usr/lib/failsafe/policies/
Directory that contains the failover scripts used for resource groups.
/usr/lib/failsafe/resource_types/template
Directory that contains the template action scripts.
/usr/lib/failsafe/resource_types/rt_name
Directory that contains the action scripts for the rt_name resource type. For example, /usr/lib/failsafe/resource_types/filesystem.
resource_types/rt_name/exclusive
Script that verifies that a resource of this resource type is not already running. For example, resource_types/filesystem/exclusive.
resource_types/rt_name/monitor
Script that monitors a resource of this type.
resource_types/rt_name/restart
Script that restarts a resource of this resource type on the same node after a monitoring failure.
resource_types/rt_name/start
Script that starts a resource of this resource type.
resource_types/rt_name/stop
Script that stops a resource of this resource type.
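The relationship between the exclusive, start, monitor, and stop scripts can be demonstrated with a toy resource type whose "resource" is a flag file. The flag-file mechanism and paths are assumptions for illustration only; real scripts for a type such as filesystem would mount and unmount the volume instead:

```shell
# Toy resource type illustrating the action-script contract: each action
# exits 0 on success, nonzero on failure. The resource is "up" while its
# flag file exists (a stand-in for a real resource such as a filesystem).
STATE_DIR=${STATE_DIR:-/tmp/failsafe-demo}

start_action()   { mkdir -p "$STATE_DIR" && : > "$STATE_DIR/$1.up"; }
stop_action()    { rm -f "$STATE_DIR/$1.up"; }
monitor_action() { [ -f "$STATE_DIR/$1.up" ]; }
# exclusive succeeds only when the resource is NOT already running
exclusive_action() { ! monitor_action "$1"; }
```

Running the scripts in order mirrors the sequence described earlier: exclusive succeeds before start, monitor succeeds after start, and both revert after stop.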
Table 1-3 shows the administrative commands available for use in scripts.
Table 1-3. Administrative Commands for Use in Scripts
| Command | Purpose |
|---|---|
| ha_cilog | Logs messages to the script_nodename log files. |
| ha_execute_lock | Executes a command with a file lock, allowing command execution to be serialized. |
| ha_exec2 | Executes a command and retries the command on failure or timeout. |
| ha_filelock | Locks a file. |
| ha_fileunlock | Unlocks a file. |
| ha_ifdadmin | Communicates with the ha_ifd network interface agent daemon. |
| ha_http_ping2 | Checks whether a web server is running. |
| ha_macconfig2 | Displays or modifies the MAC address of a network interface. |
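The serialization that ha_execute_lock provides can be illustrated with portable shell alone. The sketch below is a stand-in, not the FailSafe implementation: it uses mkdir as an atomic lock primitive, whereas the real ha_execute_lock and ha_filelock commands use their own lock files:

```shell
# Stand-in for the ha_execute_lock pattern: run a command while holding a
# lock, so concurrent invocations serialize. mkdir either creates the lock
# directory atomically or fails, making it a portable mutual-exclusion test.
run_with_lock() {
    lockdir=$1; shift
    until mkdir "$lockdir" 2>/dev/null; do
        sleep 1                 # another invocation holds the lock; wait
    done
    "$@"                        # run the command under the lock
    status=$?
    rmdir "$lockdir"            # release the lock
    return $status
}
```

For example, `run_with_lock /tmp/vol.lock mount /dev/xvg/vol1 /mnt/vol1` would ensure only one such mount attempt runs at a time.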