Linux FailSafe Cluster Configuration

This chapter describes administrative tasks you perform to configure the components of a Linux FailSafe system. It describes how to perform tasks using the FailSafe Cluster Manager Graphical User Interface (GUI) and the FailSafe Cluster Manager Command Line Interface (CLI). The major sections in this chapter cover setting configuration defaults, name restrictions, timeout values and monitoring intervals, cluster configuration (nodes, clusters, and HA parameters), resource configuration, resource types, and failover policies.

Setting Configuration Defaults

Before you configure the components of a FailSafe system, you can set default values for some of the components; Linux FailSafe will use these defaults when defining the components.

Default cluster: Certain cluster manager commands require you to specify a cluster. You can specify a default cluster to be used when you do not specify a cluster explicitly.

Default node: Certain cluster manager commands require you to specify a node. With the Cluster Manager CLI, you can specify a default node to be used when you do not specify a node explicitly.

Default resource type: Certain cluster manager commands require you to specify a resource type. With the Cluster Manager CLI, you can specify a default resource type to be used when you do not specify a resource type explicitly.

Setting Default Cluster with the Cluster Manager GUI

The GUI prompts you to enter the name of the default cluster when you have not specified one. Alternately, you can set the default cluster by clicking the “Select Cluster...” button at the bottom of the FailSafe Manager window. When using the GUI, there is no need to set a default node or resource type.

Setting and Viewing Configuration Defaults with the Cluster Manager CLI

When you are using the Cluster Manager CLI, you can use the following commands to specify default values. The default values are in effect only for the current session of the Cluster Manager CLI.

Use the following command to specify a default cluster:

cmgr> set cluster A

Use the following command to specify a default node:

cmgr> set node A

Use the following command to specify a default resource type:

cmgr> set resource_type A

You can view the current default configuration values of the Cluster Manager CLI with the following command:

cmgr> show set defaults

Name Restrictions

When you specify the names of the various components of a Linux FailSafe system, a name cannot begin with an underscore (_) or include any whitespace characters. In addition, the name of any Linux FailSafe component cannot contain a space, an unprintable character, or a *, ?, \, or # character. The following characters are permitted in the name of a Linux FailSafe component:

- alphanumeric characters
- / (slash)
- . (period)
- - (hyphen)
- _ (underscore)
- : (colon)
- = (equals sign)
- @ (at sign)
- ' (apostrophe)

These character restrictions hold true whether you are configuring your system with the Cluster Manager GUI or the Cluster Manager CLI.

Configuring Timeout Values and Monitoring Intervals

When you configure the components of a Linux FailSafe system, you configure various timeout values and monitoring intervals that determine the application downtime of a highly available system when there is a failure. To determine reasonable values to set for your system, consider the following equation:

application downtime = failure detection + time to handle failure + failure recovery

Failure detection depends on the type of failure that is detected:

When a node goes down, the node failure is detected after the node timeout; this is an HA parameter that you can modify.
All failures that translate into a node failure (such as heartbeat failure and OS failure) fall into this failure category. Node timeout has a default value of 15 seconds. For information on modifying the node timeout value, see “Linux FailSafe HA Parameters” later in this chapter.

When there is a resource failure, a monitor failure of the resource is detected. The amount of time this takes is determined by the following:

- The monitoring interval for the resource type
- The monitor timeout for the resource type
- The number of restarts defined for the resource type, if the restart mode is enabled

For information on setting these values for a resource type, see “Defining a Resource Type” later in this chapter. Reducing these values results in a shorter failover time, but reducing them too far can significantly increase the Linux FailSafe overhead on system performance and can also lead to false failovers.

The time to handle a failure is something that the user cannot control. In general, this should take a few seconds.

The failure recovery time is determined by the total time it takes for Linux FailSafe to perform the following:

- Execute the failover policy script (approximately five seconds).
- Run the stop action script for all resources in the resource group. This is not required for a node failure; the failing node is reset instead.
- Run the start action script for all resources in the resource group.

As an illustration only: for a failover triggered by a node failure with the default 15-second node timeout, a few seconds to handle the failure, and roughly five seconds of failover script execution, total downtime is on the order of 25 seconds plus the time the start action scripts take for the resources in the group.

Cluster Configuration

To set up a Linux FailSafe system, you configure the cluster that will support the highly available services. This requires the following steps:

1. Defining the local host
2. Defining any additional nodes that are eligible to be included in the cluster
3. Defining the cluster

The following subsections describe these tasks.

Defining Cluster Nodes

A cluster node is a single Linux image. Usually, a cluster node is an individual computer. The term node is also used in this guide for brevity. The pool is the entire set of nodes available for clustering.

The first node you define must be the local host, which is the host you have logged into to perform cluster administration.

When you are defining multiple nodes, it is advisable to wait a minute or so between node definitions. When a node is added to the configuration database, the contents of the configuration database are also copied to the node being added. The node definition operation is completed when the new node configuration is added to the database, at which point the database configuration is synchronized. If you define two nodes one after another, the second operation might fail because the first database synchronization is not complete.

To add a logical node definition to the pool of nodes that are eligible to be included in a cluster, you must provide the following information about the node:

Logical name: This name can contain letters and numbers but not spaces or pound signs. The name must be composed of no more than 255 characters. Any legal hostname is also a legal node name. For example, for a node whose hostname is “venus.eng.company.com” you can use a node name of “venus”, “node1”, or whatever is most convenient.

Hostname: The fully qualified name of the host, such as “server1.company.com”. Hostnames cannot begin with an underscore, include any whitespace, or be longer than 255 characters. This hostname should be the same as the output of the hostname command on the node you are defining.
The IP address associated with this hostname must not be the same as any IP address you define as highly available when you define a Linux FailSafe IP address resource. Linux FailSafe will not accept an IP address (such as “192.0.2.22”) for this input.

Node ID: This number must be unique for each node in the pool and must be in the range 1 through 32767.

System controller information: If the node has a system controller and you want Linux FailSafe to use the controller to reset the node, you must provide the following information about the system controller:

- Type of system controller: chal, msc, or mmsc
- System controller port password (optional)
- Administrative status, which you can set to determine whether Linux FailSafe can use the port: enabled or disabled
- Logical node name of the system controller owner (that is, the system that is physically attached to the system controller)
- Device name of the port on the owner node that is attached to the system controller
- Type of owner device: tty

Control networks: A list of control networks, which are the networks used for heartbeats, reset messages, and other Linux FailSafe messages. For each network, provide the following:

- Hostname or IP address. This address must not be the same as any IP address you define as highly available when you define a Linux FailSafe IP address resource, and it must be resolvable in the /etc/hosts file.
- Flags (hb for heartbeats, ctrl for control messages, and priority). At least two control networks must use heartbeats, and at least one must use control messages.

Linux FailSafe requires multiple heartbeat networks. Usually a node sends heartbeat messages to another node on only one network at a time. However, there are times when a node might send heartbeat messages to another node on multiple networks simultaneously. This happens when the sender node does not know which networks are up and which are down. This is a transient state, and eventually the heartbeat traffic converges on the highest priority network that is up. Note that at any time different pairs of nodes might be using different networks for heartbeats.

Although all nodes in a Linux FailSafe cluster should have two control networks, it is possible to define a node to add to the pool with only one control network.

Defining a Node with the Cluster Manager GUI

To define a node with the Cluster Manager GUI, perform the following steps:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Nodes & Cluster” category.
3. On the right side of the display, click on the “Define a Node” task link to launch the task.
4. Enter the selected inputs on this screen.
5. Click on “Next” at the bottom of the screen and continue entering information on the second screen.
6. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

Defining a Node with the Cluster Manager CLI

Use the following command to add a logical node definition:

cmgr> define node A

Entering this command specifies the name of the node you are defining and puts you in a mode that enables you to define the parameters of the node. These parameters correspond to the items described in “Defining Cluster Nodes” above. The following prompts appear:

Enter commands, when finished enter either "done" or "cancel"

A?
When the node name prompt appears, you enter the node parameters in the following format:

set hostname to B
set nodeid to C
set sysctrl_type to D
set sysctrl_password to E
set sysctrl_status to F
set sysctrl_owner to G
set sysctrl_device to H
set sysctrl_owner_type to I
add nic J

You use the add nic J command to define the network interfaces; use this command once for each network interface you are defining. When you enter this command, the following prompt appears:

Enter network interface commands, when finished enter "done" or "cancel"

NIC - J?

When this prompt appears, you use the following commands to specify the flags for the control network:

set heartbeat to K
set ctrl_msgs to L
set priority to M

After you have defined a network interface, you can use the following command from the node name prompt to remove it:

cmgr> remove nic N

When you have finished defining a node, enter done.

The following example defines a node called cm1a with one network interface:

cmgr> define node cm1a
Enter commands, when finished enter either "done" or "cancel"

cm1a? set hostname to cm1a
cm1a? set nodeid to 1
cm1a? set sysctrl_type to msc
cm1a? set sysctrl_password to [ ]
cm1a? set sysctrl_status to enabled
cm1a? set sysctrl_owner to cm2
cm1a? set sysctrl_device to /dev/ttyd2
cm1a? set sysctrl_owner_type to tty
cm1a? add nic cm1
Enter network interface commands, when finished enter "done" or "cancel"

NIC - cm1 > set heartbeat to true
NIC - cm1 > set ctrl_msgs to true
NIC - cm1 > set priority to 0
NIC - cm1 > done
cm1a? done
cmgr>

If you have invoked the Cluster Manager CLI with the -p option, or you have entered the set prompting on command, the display appears as in the following example:

cmgr> define node cm1a
Enter commands, when finished enter either "done" or "cancel"

Nodename [optional]? cm1a
Node ID? 1
Do you wish to define system controller info[y/n]:y
Sysctrl Type <null>? (null)
Sysctrl Password[optional]? ( )
Sysctrl Status <enabled|disabled>? enabled
Sysctrl Owner? cm2
Sysctrl Device? /dev/ttyd2
Sysctrl Owner Type <tty>? (tty)
Number of Network Interfaces ? (1)
NIC 1 - IP Address? cm1
NIC 1 - Heartbeat HB (use network for heartbeats) <true|false>? true
NIC 1 - (use network for control messages) <true|false>? true
NIC 1 - Priority <1,2,...>? 0
NIC 2 - IP Address? cm2
NIC 2 - Heartbeat HB (use network for heartbeats) <true|false>? true
NIC 2 - (use network for control messages) <true|false>? false
NIC 2 - Priority <1,2,...>? 1

Modifying and Deleting Cluster Nodes

After you have defined a cluster node, you can modify or delete the node with the Cluster Manager GUI or the Cluster Manager CLI. You must remove a node from a cluster before you can delete the node.

Modifying a Node with the Cluster Manager GUI

To modify a node with the Cluster Manager GUI, perform the following steps:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Nodes & Cluster” category.
3. On the right side of the display, click on the “Modify a Node Definition” task link to launch the task.
4. Modify the node parameters.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

Modifying a Node with the Cluster Manager CLI

You can use the following command to modify an existing node. After entering this command, you can execute any of the commands you use to define a node.

cmgr> modify node A

Deleting a Node with the Cluster Manager GUI

To delete a node with the Cluster Manager GUI, perform the following steps:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Nodes & Cluster” category.
3. On the right side of the display, click on the “Delete a Node” task link to launch the task.
4. Enter the name of the node to delete.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

Deleting a Node with the Cluster Manager CLI

After defining a node, you can delete it with the following command:

cmgr> delete node A

You can delete a node only if the node is not currently part of a cluster. This means that you must first modify any cluster that contains the node so that it no longer contains that node; then you can delete the node.

Displaying Cluster Nodes

After you define cluster nodes, you can perform the following display tasks:

- Display the attributes of a node
- Display the nodes that are members of a specific cluster
- Display all the nodes that have been defined

You can perform any of these tasks with the FailSafe Cluster Manager GUI or the Linux FailSafe Cluster Manager CLI.

Displaying Nodes with the Cluster Manager GUI

The Cluster Manager GUI provides a convenient graphic display of the defined nodes of a cluster and the attributes of those nodes through the FailSafe Cluster View. You can launch the FailSafe Cluster View directly, or you can bring it up at any time by clicking on “FailSafe Cluster View” at the bottom of the “FailSafe Manager” display.

From the View menu of the FailSafe Cluster View, you can select “Nodes in Pool” to view all nodes defined in the Linux FailSafe pool. You can also select “Nodes in Cluster” to view all nodes that belong to the default cluster. Click any node's name or icon to view detailed status and configuration information about the node.

Displaying Nodes with the Cluster Manager CLI

After you have defined a node, you can display the node's parameters with the following command:

cmgr> show node A

A show node command on node cm1 would yield the following display:

cmgr> show node cm1
Logical Node Name: cm1
Hostname: cm1
Nodeid: 1
Reset type: reset
System Controller: msc
System Controller status: enabled
System Controller owner: cm2
System Controller owner device: /dev/ttyd2
System Controller owner type: tty
ControlNet Ipaddr: cm1
ControlNet HB: true
ControlNet Control: true
ControlNet Priority: 0

You can see a list of all of the nodes that have been defined with the following command:

cmgr> show nodes in pool

You can see a list of all of the nodes that have been defined for a specified cluster with the following command:

cmgr> show nodes [in cluster A]

If you have specified a default cluster, you do not need to specify a cluster when you use this command; it will display the nodes defined in the default cluster.

Linux FailSafe HA Parameters

Several parameters determine the behavior of the nodes in a cluster of a Linux FailSafe system. The Linux FailSafe parameters are as follows:

Tie-breaker node: The logical name of a machine used to compute node membership in situations where exactly 50% of the nodes in a cluster can talk to each other. If you do not specify a tie-breaker node, the node with the lowest node ID number is used. The tie-breaker node is a cluster-wide parameter. It is recommended that you configure a tie-breaker node even if there is an odd number of nodes in the cluster, since one node may be deactivated, leaving an even number of nodes to determine membership.
In a heterogeneous cluster, where the nodes are of different sizes and capabilities, the largest node in the cluster (the node with the most important application or the maximum number of resource groups) should be configured as the tie-breaker node.

Node timeout: The timeout period, in milliseconds. If no heartbeat is received from a node within this period, the node is considered dead and is not included in the cluster membership. The node timeout must be at least 5 seconds. In addition, the node timeout must be at least 10 times the heartbeat interval for proper Linux FailSafe operation; otherwise, false failovers may be triggered. Node timeout is a cluster-wide parameter.

Heartbeat interval: The interval, in milliseconds, between heartbeat messages. This interval must be greater than 500 milliseconds, and it must not be greater than one-tenth the value of the node timeout period. This interval is set to one second by default. Heartbeat interval is a cluster-wide parameter. The higher the number of heartbeats (smaller heartbeat interval), the greater the potential for slowing down the network. Conversely, the fewer the number of heartbeats (larger heartbeat interval), the greater the potential for reducing availability of resources.

Node wait time: The time, in milliseconds, that a node waits for other nodes to join the cluster before declaring a new cluster membership. If the value is not set for the cluster, Linux FailSafe assumes the value to be the node timeout times the number of nodes.

Powerfail mode: Indicates whether a special power failure algorithm should be run when no response is received from a system controller after a reset request. This can be set to ON or OFF. Powerfail is a node-specific parameter and should be defined for the machine that performs the reset operation.

Resetting Linux FailSafe Parameters with the Cluster Manager GUI

To set Linux FailSafe parameters with the Cluster Manager GUI, perform the following steps:

1. Launch the FailSafe Manager from a menu or the command line.
2. On the left side of the display, click on the “Nodes & Cluster” category.
3. On the right side of the display, click on the “Set Linux FailSafe HA Parameters” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

Resetting Linux FailSafe Parameters with the Cluster Manager CLI

You can modify the Linux FailSafe parameters with the following command:

cmgr> modify ha_parameters [on node A] [in cluster B]

If you have specified a default node or a default cluster, you do not have to specify a node or a cluster in this command; Linux FailSafe will use the default. The following prompt appears:

Enter commands, when finished enter either "done" or "cancel"

A?

When this prompt of the node name appears, you enter the Linux FailSafe parameters you wish to modify in the following format:

set node_timeout to B
set heartbeat to C
set run_pwrfail to D
set tie_breaker to E

Defining a Cluster

A cluster is a collection of one or more nodes coupled with each other by networks or other similar interconnects. In Linux FailSafe, a cluster is identified by a simple name. A given node may be a member of only one cluster.

To define a cluster, you must provide the following information:

The logical name of the cluster, with a maximum length of 255 characters.

The mode of operation: normal (the default) or experimental.
Experimental mode allows you to configure a Linux FailSafe cluster in which resource groups do not fail over when a node failure is detected. This mode can be useful when you are tuning node timeouts or heartbeat values. When a cluster is configured in normal mode, Linux FailSafe fails over resource groups when it detects a failure in a node or resource group.

(Optional) The email address to use to notify the system administrator when problems occur in the cluster (for example, root@system).

(Optional) The email program to use to notify the system administrator when problems occur in the cluster (for example, /usr/bin/mail). Specifying the email program is optional; you can specify only the notification address and still receive notifications by mail. If an address is not specified, notification will not be sent.

Adding Nodes to a Cluster

After you have added nodes to the pool and defined a cluster, you must provide the names of the nodes to include in the cluster.

Defining a Cluster with the Cluster Manager GUI

To define a cluster with the Cluster Manager GUI, perform the following steps:

1. Launch the Linux FailSafe Manager.
2. On the left side of the display, click on “Guided Configuration”.
3. On the right side of the display, click on “Set Up a New Cluster” to launch the task link.
4. In the resulting window, click each task link in turn, as it becomes available. Enter the selected inputs for each task.
5. When finished, click “OK” to close the taskset window.

Defining a Cluster with the Cluster Manager CLI

When you define a cluster with the CLI, you define a cluster and add nodes to it with the same command. Use the following Cluster Manager CLI command to define a cluster:

cmgr> define cluster A

Entering this command specifies the name of the cluster you are defining and puts you in a mode that allows you to add nodes to the cluster. The following prompt appears:

cluster A?

When this prompt appears during cluster creation, you can specify nodes to include in the cluster and you can specify an email address to which to direct messages that originate in this cluster.

You specify nodes to include in the cluster with the following command; you can add as many nodes as you want to include in the cluster:

cluster A? add node C

You specify an email program to use to direct messages with the following command:

cluster A? set notify_cmd to B

You specify an email address to direct messages to with the following command:

cluster A? set notify_addr to B

You specify a mode for the cluster (normal or experimental) with the following command:

cluster A? set ha_mode to D

When you are finished defining the cluster, enter done to return to the cmgr prompt. (A complete hypothetical session appears later in this section.)

Modifying and Deleting Clusters

After you have defined a cluster, you can modify the attributes of the cluster or you can delete the cluster. You cannot delete a cluster that contains nodes; you must move those nodes out of the cluster first.

Modifying and Deleting a Cluster with the Cluster Manager GUI

To modify a cluster with the Cluster Manager GUI, perform the following procedure:

1. Launch the Linux FailSafe Manager.
2. On the left side of the display, click on the “Nodes & Cluster” category.
3. On the right side of the display, click on the “Modify a Cluster Definition” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.
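The following hypothetical session pulls together the cluster-definition commands described above. It assumes a pool that already contains nodes named cm1 and cm2 and defines a cluster named test-cluster; all names and values are for illustration only, and output is abbreviated:

cmgr> define cluster test-cluster

cluster test-cluster? add node cm1
cluster test-cluster? add node cm2
cluster test-cluster? set notify_addr to root@cm1
cluster test-cluster? set notify_cmd to /usr/bin/mail
cluster test-cluster? set ha_mode to normal
cluster test-cluster? done
cmgr>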
To delete a cluster with the Cluster Manager GUI, perform the following procedure:

1. Launch the Linux FailSafe Manager.
2. On the left side of the display, click on the “Nodes & Cluster” category.
3. On the right side of the display, click on the “Delete a Cluster” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

Modifying and Deleting a Cluster with the Cluster Manager CLI

To modify an existing cluster, enter the following command:

cmgr> modify cluster A

Entering this command specifies the name of the cluster you are modifying and puts you in a mode that allows you to modify the cluster. The following prompt appears:

cluster A?

When this prompt appears, you can modify the cluster definition with the following commands:

cluster A? set notify_addr to B
cluster A? set notify_cmd to B
cluster A? add node C
cluster A? remove node D

When you are finished modifying the cluster, enter done to return to the cmgr prompt.

You can delete a defined cluster with the following command:

cmgr> delete cluster A

Displaying Clusters

You can display defined clusters with the Cluster Manager GUI or the Cluster Manager CLI.

Displaying a Cluster with the Cluster Manager GUI

The Cluster Manager GUI provides a convenient display of a cluster and its components through the FailSafe Cluster View. You can launch the FailSafe Cluster View directly, or you can bring it up at any time by clicking on the “FailSafe Cluster View” button at the bottom of the “FailSafe Manager” display.

From the View menu of the FailSafe Cluster View, you can choose elements within the cluster to examine. To view details of the cluster, click on the cluster name or icon. Status and configuration information will appear in a new window. To view this information within the FailSafe Cluster View window, select Options. When you then click on the Show Details option, the status details will appear in the right side of the window.

Displaying a Cluster with the Cluster Manager CLI

After you have defined a cluster, you can display the nodes in that cluster with the following command:

cmgr> show cluster A

You can see a list of the clusters that have been defined with the following command:

cmgr> show clusters

Resource Configuration

A resource is a single physical or logical entity that provides a service to clients or other resources. A resource is generally available for use on two or more nodes in a cluster, although only one node controls the resource at any given time. For example, a resource can be a single disk volume, a particular network address, or an application such as a web server.

Defining Resources

Resources are identified by a resource name and a resource type. A resource name identifies a specific instance of a resource type. A resource type is a particular class of resource. All of the resources in a given resource type can be handled in the same way for the purposes of failover. Every resource is an instance of exactly one resource type. A resource type is identified by a simple name.

A resource type can be defined for a specific logical node, or it can be defined for an entire cluster. A resource type that is defined for a node overrides a clusterwide resource type definition of the same name; this allows an individual node to override global settings from a clusterwide resource type definition.

The Linux FailSafe software includes many predefined resource types.
If these types fit the application you want to make into a highly available service, you can reuse them. If none fits, you can define additional resource types.

To define a resource, you provide the following information:

- The name of the resource to define, with a maximum length of 255 characters.
- The type of resource to define. The Linux FailSafe system contains some predefined resource types (such as template and IP_address). You can define your own resource types as well.
- The name of the cluster that contains the resource.
- The logical name of the node that contains the resource (optional). If you specify a node, a local version of the resource will be defined on that node.
- Resource type-specific attributes for the resource. Each resource type may require specific parameters, as described in the following subsections.

You can define up to 100 resources in a Linux FailSafe configuration.

IP Address Resource Attributes

The IP Address resources are the IP addresses used by clients to access the highly available services within the resource group. These IP addresses are moved from one node to another along with the other resources in the resource group when a failure is detected.

You specify the resource name of an IP address in dotted decimal notation; IP names that require name resolution should not be used. For example, 192.26.50.1 is a valid resource name of the IP_address resource type. The IP address you define as a Linux FailSafe resource must not be the same as the IP address of a node hostname or the IP address of a node's control network.

When you define an IP address, you can optionally specify the following parameters. If you specify any of these parameters, you must specify all of them.

- The broadcast address for the IP address.
- The network mask of the IP address.
- A comma-separated list of interfaces on which the IP address can be configured. This ordered list is a superset of all the interfaces on all nodes where this IP address might be allocated. Hence, in a mixed cluster with different ethernet drivers, an IP address might be placed on eth0 on one system and ln0 on another; in this case the interfaces field would be eth0,ln0 or ln0,eth0. The order of the list determines the priority order of the interfaces used when the IP address is restarted locally on a node.

Adding Dependencies to a Resource

One resource can be dependent on one or more other resources; if so, it will not be able to start (that is, be made available for use) unless the resources it depends on are started as well. Dependent resources must be part of the same resource group.

Like resources, a resource type can be dependent on one or more other resource types. If such a dependency exists, at least one instance of each of the depended-on resource types must be defined. For example, a resource type named Netscape_web might have resource type dependencies on resource types named IP_address and volume. If a resource named ws1 is defined with the Netscape_web resource type, then the resource group containing ws1 must also contain at least one resource of the type IP_address and one resource of the type volume.

You cannot make resources mutually dependent. For example, if resource A is dependent on resource B, then you cannot make resource B dependent on resource A. In addition, you cannot define cyclic dependencies.
For example, if resource A is dependent on resource B, and resource B is dependent on resource C, then resource C cannot be dependent on resource A.

When you add a dependency to a resource definition, you provide the following information:

- The name of the existing resource to which you are adding the dependency.
- The resource type of the existing resource to which you are adding the dependency.
- The name of the cluster that contains the resource.
- Optionally, the logical node name of the node in the cluster that contains the resource. If specified, the resource dependencies are added to the node's definition of the resource. If this is not specified, the resource dependencies are added to the clusterwide resource definition.
- The resource name of the resource dependency.
- The resource type of the resource dependency.

Defining a Resource with the Cluster Manager GUI

To define a resource with the Cluster Manager GUI, perform the following steps:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Define a New Resource” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task.
6. On the right side of the display, click on the “Add/Remove Dependencies for a Resource Definition” task link to launch the task.
7. Enter the selected inputs.
8. Click on “OK” at the bottom of the screen to complete the task.

When you use this task to define a resource, you define a clusterwide resource that is not specific to a node. For information on defining a node-specific resource, see “Defining a Node-Specific Resource” later in this chapter.

Defining a Resource with the Cluster Manager CLI

Use the following CLI command to define a clusterwide resource:

cmgr> define resource A [of resource_type B] [in cluster C]

Entering this command specifies the name and resource type of the resource you are defining within a specified cluster. If you have specified a default cluster or a default resource type, you do not need to specify a resource type or a cluster in this command; the CLI will use the default.

When you use this command to define a resource, you define a clusterwide resource that is not specific to a node. For information on defining a node-specific resource, see “Defining a Node-Specific Resource” later in this chapter.

The following prompt appears:

resource A?

When this prompt appears during resource creation, you can enter the following commands to specify the attributes of the resource you are defining and to add and remove dependencies from the resource:

resource A? set key to value
resource A? add dependency E of type F
resource A? remove dependency E of type F

The attributes you define with the set key to value command depend on the type of resource you are defining. For detailed information on how to determine the format for defining resource attributes, see “Specifying Resource Attributes with Cluster Manager CLI” below.

When you are finished defining the resource and its dependencies, enter done to return to the cmgr prompt.
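For example, the following hypothetical session defines the IP address resource 192.26.50.1 in a cluster named test-cluster and sets the three IP_address attributes described earlier in this chapter (the cluster name and attribute values are illustrative only):

cmgr> define resource 192.26.50.1 of resource_type IP_address in cluster test-cluster

resource 192.26.50.1? set NetworkMask to 255.255.255.0
resource 192.26.50.1? set BroadcastAddress to 192.26.50.255
resource 192.26.50.1? set interfaces to eth0,eth1
resource 192.26.50.1? done
cmgr>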
Specifying Resource Attributes with Cluster Manager CLI

To see the format in which you can specify the user-specific attributes that you need to set for a particular resource type, you can enter the following command to see the full definition of that resource type:

cmgr> show resource_type A in cluster B

For example, to see the key attributes you define for a resource of the defined resource type IP_address, you would enter the following command:

cmgr> show resource_type IP_address in cluster nfs-cluster

Name: IP_address
Predefined: true
Order: 401
Restart mode: 1
Restart count: 2

Action name: stop
 Executable: /usr/lib/failsafe/resource_types/IP_address/stop
 Maximum execution time: 80000ms
 Monitoring interval: 0ms
 Start monitoring time: 0ms
Action name: exclusive
 Executable: /usr/lib/failsafe/resource_types/IP_address/exclusive
 Maximum execution time: 100000ms
 Monitoring interval: 0ms
 Start monitoring time: 0ms
Action name: start
 Executable: /usr/lib/failsafe/resource_types/IP_address/start
 Maximum execution time: 80000ms
 Monitoring interval: 0ms
 Start monitoring time: 0ms
Action name: restart
 Executable: /usr/lib/failsafe/resource_types/IP_address/restart
 Maximum execution time: 80000ms
 Monitoring interval: 0ms
 Start monitoring time: 0ms
Action name: monitor
 Executable: /usr/lib/failsafe/resource_types/IP_address/monitor
 Maximum execution time: 40000ms
 Monitoring interval: 20000ms
 Start monitoring time: 50000ms

Type specific attribute: NetworkMask
 Data type: string
Type specific attribute: interfaces
 Data type: string
Type specific attribute: BroadcastAddress
 Data type: string

No resource type dependencies

The display reflects the format in which you can specify the type-specific attributes of the resource. In this case, the NetworkMask key specifies the network mask of the IP address, the interfaces key specifies the list of interfaces on which the IP address can be configured, and the BroadcastAddress key specifies the broadcast address. For example, to set the network mask, enter the following command:

resource A? set NetworkMask to 255.255.255.0

The remainder of this section summarizes the attributes you specify for the predefined Linux FailSafe resource types with the set key to value command of the Cluster Manager CLI.

IP address resource: When you define an IP address resource, you specify the following attributes:

NetworkMask: The subnet mask of the IP address
interfaces: A comma-separated list of interfaces on which the IP address can be configured
BroadcastAddress: The broadcast address for the IP address

Defining a Node-Specific Resource

You can redefine an existing resource with a resource definition that applies only to a particular node. Only existing clusterwide resources can be redefined; resources already defined for a specific cluster node cannot be redefined.

You use this feature when you configure heterogeneous clusters for an IP_address resource. For example, the IP address 192.26.50.2 might be configured on eth0 on most nodes in the cluster but on eth1 on one node whose first network interface is dedicated to another purpose. The clusterwide resource definition for 192.26.50.2 will have the interfaces field set to eth0, and the node-specific definition for that one node will have eth1 as the interfaces field.
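The following sketch shows how this example might be entered (the cluster name test-cluster and node name node3 are hypothetical; the node-specific form of the define resource command is described in the next subsection):

cmgr> define resource 192.26.50.2 of resource_type IP_address in cluster test-cluster
resource 192.26.50.2? set interfaces to eth0
resource 192.26.50.2? done
cmgr> define resource 192.26.50.2 of resource_type IP_address on node node3 in cluster test-cluster
resource 192.26.50.2? set interfaces to eth1
resource 192.26.50.2? done
cmgr>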
Defining a Node-Specific Resource with the Cluster Manager GUI

Using the Cluster Manager GUI, you can take an existing clusterwide resource definition and redefine it for use on a specific node in the cluster:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Redefine a Resource For a Specific Node” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task.

Defining a Node-Specific Resource with the Cluster Manager CLI

Using the Cluster Manager CLI, you redefine a clusterwide resource to be specific to a node just as you define a clusterwide resource, except that you specify a node in the define resource command. Use the following CLI command to define a node-specific resource:

cmgr> define resource A of resource_type B on node C [in cluster D]

If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default.

Modifying and Deleting Resources

After you have defined resources, you can modify and delete them. You can modify only the type-specific attributes of a resource. You cannot rename a resource once it has been defined.

There are some resource attributes whose modification does not take effect until the resource group containing that resource is brought online again. For example, if you modify the export options of a resource of type NFS, the modifications do not take effect immediately; they take effect when the resource is brought online.

Modifying and Deleting Resources with the Cluster Manager GUI

To modify a resource with the Cluster Manager GUI, perform the following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Modify a Resource Definition” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

To delete a resource with the Cluster Manager GUI, perform the following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Delete a Resource” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

Modifying and Deleting Resources with the Cluster Manager CLI

Use the following CLI command to modify a resource:

cmgr> modify resource A of resource_type B [in cluster C]

Entering this command specifies the name and resource type of the resource you are modifying within a specified cluster. If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default. You modify a resource using the same commands you use to define a resource.

You can use the following command to delete a resource definition:

cmgr> delete resource A of resource_type B [in cluster D]

Displaying Resources

You can display resources in various ways. You can display the attributes of a particular defined resource, you can display all of the defined resources in a specified resource group, or you can display all the defined resources of a specified resource type.
Displaying Resources with the Cluster Manager GUI

The Cluster Manager GUI provides a convenient display of resources through the FailSafe Cluster View. You can launch the FailSafe Cluster View directly, or you can bring it up at any time by clicking on the “FailSafe Cluster View” button at the bottom of the “FailSafe Manager” display.

From the View menu of the FailSafe Cluster View, select Resources to see all defined resources. The status of these resources will be shown in the icon (green indicates online, grey indicates offline). Alternately, you can select “Resources of Type” from the View menu to see resources organized by resource type, or you can select “Resources by Group” to see resources organized by resource group.

Displaying Resources with the Cluster Manager CLI

Use the following command to view the parameters of a defined resource:

cmgr> show resource A of resource_type B

Use the following command to view all of the defined resources in a resource group:

cmgr> show resources in resource_group A [in cluster B]

If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default.

Use the following command to view all of the defined resources of a particular resource type in a specified cluster:

cmgr> show resources of resource_type A [in cluster B]

If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default.

Defining a Resource Type

The Linux FailSafe software includes many predefined resource types. If these types fit the application you want to make into a highly available service, you can reuse them. If none fits, you can define additional resource types. Complete information on defining resource types is provided in the Linux FailSafe Programmer's Guide; this manual provides a summary of that information.

To define a new resource type, you must have the following information:

- Name of the resource type, with a maximum length of 255 characters.
- Name of the cluster to which the resource type will apply.
- Node on which the resource type will apply, if the resource type is to be restricted to a specific node.
- Order of performing the action scripts for resources of this type in relation to resources of other types: resources are started in increasing order of this value and stopped in decreasing order of this value. See the Linux FailSafe Programmer's Guide for a full description of the order ranges available.
- Restart mode, which can be one of the following values: 0 = do not restart on monitoring failures; 1 = restart a fixed number of times.
- Number of local restarts (when restart mode is 1).
- Location of the executable script. This is always /usr/lib/failsafe/resource_types/rtname, where rtname is the resource type name.
- Monitoring interval, which is the time period (in milliseconds) between successive executions of the monitor action script; this is valid only for the monitor action script.
- Starting time for monitoring. When the resource group is brought online on a cluster node, Linux FailSafe will start monitoring the resources after the specified time period (in milliseconds).
- Action scripts to be defined for this resource type. You must specify scripts for start, stop, exclusive, and monitor, although the monitor script may contain only a return-success function if you wish. If you specify 1 for the restart mode, you must also specify a restart script.
- Type-specific attributes to be defined for this resource type.
The action scripts use this information to start, stop, and monitor a resource of this resource type. For example, NFS requires the following resource keys:

export-point, which takes a value that defines the export disk name. This name is used as input to the exportfs command. For example:

export-point = /this_disk

export-info, which takes a value that defines the export options for the filesystem. These options are used by the exportfs command. For example:

export-info = rw,sync,no_root_squash

filesystem, which takes a value that defines the raw filesystem. This name is used as input to the mount command. For example:

filesystem = /dev/sda1

To define a new resource type, you use the Cluster Manager GUI or the Cluster Manager CLI.

Defining a Resource Type with the Cluster Manager GUI

To define a resource type with the Cluster Manager GUI, perform the following steps:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Define a Resource Type” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task.

Defining a Resource Type with the Cluster Manager CLI

The following steps show the use of cluster_mgr interactively to define a resource type called test_rt.

1. Log in as root.

2. Execute the cluster_mgr command using the -p option to prompt you for information (the command name can be abbreviated to cmgr):

# /usr/lib/failsafe/bin/cluster_mgr -p
Welcome to Linux FailSafe Cluster Manager Command-Line Interface

cmgr>

3. Use the set subcommand to specify the default cluster used for cluster_mgr operations. In this example, we use a cluster named test:

cmgr> set cluster test

If you prefer, you can specify the cluster name as needed with each subcommand.

4. Use the define resource_type subcommand. By default, the resource type will apply across the cluster; if you wish to limit the resource type to a specific node, enter the node name when prompted. If you wish to enable restart mode, enter 1 when prompted. The following example shows the prompts and answers for only two action scripts (start and stop) for a new resource type named test_rt:

cmgr> define resource_type test_rt
(Enter "cancel" at any time to abort)
Node[optional]?
Order ? 300
Restart Mode ? (0)

DEFINE RESOURCE TYPE OPTIONS

  0) Modify Action Script.
  1) Add Action Script.
  2) Remove Action Script.
  3) Add Type Specific Attribute.
  4) Remove Type Specific Attribute.
  5) Add Dependency.
  6) Remove Dependency.
  7) Show Current Information.
  8) Cancel. (Aborts command)
  9) Done. (Exits and runs command)

Enter option:1

No current resource type actions

Action name ? start
Executable Time? 40000
Monitoring Interval? 0
Start Monitoring Time? 0

  0) Modify Action Script.
  1) Add Action Script.
  2) Remove Action Script.
  3) Add Type Specific Attribute.
  4) Remove Type Specific Attribute.
  5) Add Dependency.
  6) Remove Dependency.
  7) Show Current Information.
  8) Cancel. (Aborts command)
  9) Done. (Exits and runs command)

Enter option:1

Current resource type actions:
  Action - 1: start

Action name ? stop
Executable Time? 40000
Monitoring Interval? 0
Start Monitoring Time? 0

  0) Modify Action Script.
  1) Add Action Script.
  2) Remove Action Script.
  3) Add Type Specific Attribute.
  4) Remove Type Specific Attribute.
  5) Add Dependency.
  6) Remove Dependency.
  7) Show Current Information.
  8) Cancel. (Aborts command)
  9) Done. (Exits and runs command)
Enter option:3

No current type specific attributes

Type Specific Attribute ? integer-att
Datatype ? integer
Default value[optional] ? 33

  0) Modify Action Script.
  1) Add Action Script.
  2) Remove Action Script.
  3) Add Type Specific Attribute.
  4) Remove Type Specific Attribute.
  5) Add Dependency.
  6) Remove Dependency.
  7) Show Current Information.
  8) Cancel. (Aborts command)
  9) Done. (Exits and runs command)

Enter option:3

Current type specific attributes:
  Type Specific Attribute - 1: integer-att

Type Specific Attribute ? string-att
Datatype ? string
Default value[optional] ? rw

  0) Modify Action Script.
  1) Add Action Script.
  2) Remove Action Script.
  3) Add Type Specific Attribute.
  4) Remove Type Specific Attribute.
  5) Add Dependency.
  6) Remove Dependency.
  7) Show Current Information.
  8) Cancel. (Aborts command)
  9) Done. (Exits and runs command)

Enter option:5

No current resource type dependencies

Dependency name ? filesystem

  0) Modify Action Script.
  1) Add Action Script.
  2) Remove Action Script.
  3) Add Type Specific Attribute.
  4) Remove Type Specific Attribute.
  5) Add Dependency.
  6) Remove Dependency.
  7) Show Current Information.
  8) Cancel. (Aborts command)
  9) Done. (Exits and runs command)

Enter option:7

Current resource type actions:
  Action - 1: start
  Action - 2: stop
Current type specific attributes:
  Type Specific Attribute - 1: integer-att
  Type Specific Attribute - 2: string-att
No current resource type dependencies
Resource dependencies to be added:
  Resource dependency - 1: filesystem

  0) Modify Action Script.
  1) Add Action Script.
  2) Remove Action Script.
  3) Add Type Specific Attribute.
  4) Remove Type Specific Attribute.
  5) Add Dependency.
  6) Remove Dependency.
  7) Show Current Information.
  8) Cancel. (Aborts command)
  9) Done. (Exits and runs command)

Enter option:9
Successfully created resource_type test_rt

cmgr> show resource_types
NFS
template
Netscape_web
test_rt
statd
Oracle_DB
MAC_address
IP_address
INFORMIX_DB
filesystem
volume

cmgr> exit
#

Defining a Node-Specific Resource Type

You can redefine an existing resource type with a definition that applies only to a particular node. Only existing clusterwide resource types can be redefined; resource types already defined for a specific cluster node cannot be redefined. A resource type that is defined for a node overrides a clusterwide resource type definition with the same name; this allows an individual node to override global settings from a clusterwide resource type definition.

You can use this feature if you want to have different script timeouts for a node or you want to restart a resource on only one node in the cluster. For example, the IP_address resource has local restart enabled by default. If you would like to have an IP address type without local restart for a particular node, you can make a copy of the IP_address clusterwide resource type with all of the parameters the same except for restart mode, which you set to 0.

Defining a Node-Specific Resource Type with the Cluster Manager GUI

Using the Cluster Manager GUI, you can take an existing clusterwide resource type definition and redefine it for use on a specific node in the cluster. Perform the following tasks:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Redefine a Resource Type For a Specific Node” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task.

Defining a Node-Specific Resource Type with the Cluster Manager CLI

With the Cluster Manager CLI, you define a node-specific resource type just as you define a clusterwide resource type, except that you specify a node in the define resource_type command. Use the following CLI command to define a node-specific resource type:

cmgr> define resource_type A on node B [in cluster C]

If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default.

Adding Dependencies to a Resource Type

Like resources, a resource type can be dependent on one or more other resource types. If such a dependency exists, at least one instance of each of the depended-on resource types must be defined. For example, a resource type named Netscape_web might have resource type dependencies on resource types named IP_address and volume. If a resource named ws1 is defined with the Netscape_web resource type, then the resource group containing ws1 must also contain at least one resource of the type IP_address and one resource of the type volume.

When using the Cluster Manager GUI, you add or remove dependencies for a resource type by selecting the “Add/Remove Dependencies for a Resource Type” task from the “Resources & Resource Types” display and providing the indicated input. When using the Cluster Manager CLI, you add or remove dependencies when you define or modify the resource type.

Modifying and Deleting Resource Types

After you have defined resource types, you can modify and delete them.

Modifying and Deleting Resource Types with the Cluster Manager GUI

To modify a resource type with the Cluster Manager GUI, perform the following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Modify a Resource Type Definition” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

To delete a resource type with the Cluster Manager GUI, perform the following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Resources & Resource Types” category.
3. On the right side of the display, click on the “Delete a Resource Type” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

Modifying and Deleting Resource Types with the Cluster Manager CLI

Use the following CLI command to modify a resource type:

cmgr> modify resource_type A [in cluster B]

Entering this command specifies the resource type you are modifying within a specified cluster. If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default. You modify a resource type using the same commands you use to define a resource type.
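For example, to add another type-specific attribute to the test_rt resource type defined earlier, a session might look like the following sketch (the attribute name and values are invented for illustration, and the option menu, which matches the one shown in the define example above, is omitted here):

cmgr> modify resource_type test_rt in cluster test
Enter option:3
Type Specific Attribute ? retry-count
Datatype ? integer
Default value[optional] ? 3
Enter option:9
cmgr>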
You can use the following command to delete a resource type:

cmgr> delete resource_type A [in cluster B]

Installing (Loading) a Resource Type on a Cluster

When you define a cluster, Linux FailSafe installs a set of resource type definitions, with default values, that you can use. If you need to install additional standard Silicon Graphics-supplied resource type definitions on the cluster, or if you delete a standard resource type definition and wish to reinstall it, you can load that resource type definition on the cluster. The resource type definition you are installing cannot already exist on the cluster.

Installing a Resource Type with the Cluster Manager GUI

To install a resource type using the GUI, select the “Load a Resource” task from the “Resources & Resource Types” task page and enter the resource type to load.

Installing a Resource Type with the Cluster Manager CLI

Use the following CLI command to install a resource type on a cluster:

cmgr> install resource_type A [in cluster B]

If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default.

Displaying Resource Types

After you have defined resource types, you can display them.

Displaying Resource Types with the Cluster Manager GUI

The Cluster Manager GUI provides a convenient display of resource types through the FailSafe Cluster View. You can launch the FailSafe Cluster View directly, or you can bring it up at any time by clicking on the “FailSafe Cluster View” button at the bottom of the “FailSafe Manager” display. From the View menu of the FailSafe Cluster View, select Types to see all defined resource types. You can then click on any of the resource type icons to view the parameters of the resource type.

Displaying Resource Types with the Cluster Manager CLI

Use the following command to view the parameters of a defined resource type in a specified cluster:

cmgr> show resource_type A [in cluster B]

If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default.

Use the following command to view all of the defined resource types in a cluster:

cmgr> show resource_types [in cluster A]

If you have specified a default cluster, you do not need to specify a cluster in this command; the CLI will use the default.

Use the following command to view all of the defined resource types that have been installed:

cmgr> show resource_types installed

Defining a Failover Policy

Before you can configure your resources into a resource group, you must determine which failover policy to apply to the resource group. To define a failover policy, you provide the following information:

- The name of the failover policy, with a maximum length of 63 characters; the name must be unique within the pool.
- The name of an existing failover script.
- The initial failover domain, which is an ordered list of the nodes on which the resource group may execute. The administrator supplies the initial failover domain when configuring the failover policy; this is input to the failover script, which generates the run-time failover domain.
- The failover attributes, which modify the behavior of the failover script.

Complete information on failover policies and failover scripts, with an emphasis on writing your own failover policies and scripts, is provided in the Linux FailSafe Programmer's Guide.
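To make these components concrete, here is a hypothetical policy for a three-node cluster, shown as the values you would supply (the policy name is invented; the script, attribute, and node names are drawn from the discussion that follows, and the CLI commands for entering these values appear later in this section):

Failover policy name: fp1
Failover script: ordered
Failover attribute: Auto_Failback
Initial failover domain: venus mercury pluto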
Defining a Failover Policy Before you can configure your resources into a resource group, you must determine which failover policy to apply to the resource group. To define a failover policy, you provide the following information: The name of the failover policy, with a maximum length of 63 characters, which must be unique within the pool. The name of an existing failover script. The initial failover domain, which is an ordered list of the nodes on which the resource group may execute. The administrator supplies the initial failover domain when configuring the failover policy; this is input to the failover script, which generates the runtime failover domain. The failover attributes, which modify the behavior of the failover script. Complete information on failover policies and failover scripts, with an emphasis on writing your own failover policies and scripts, is provided in the Linux FailSafe Programmer's Guide. Failover Scripts A failover script helps determine the node that is chosen for a failed resource group. The failover script takes the initial failover domain and transforms it into the runtime failover domain. Depending upon the contents of the script, the initial and the runtime domains may be identical. The ordered failover script is provided with the Linux FailSafe release. The ordered script never changes the initial domain; when using this script, the initial and runtime domains are equivalent. The round-robin failover script is also provided with the Linux FailSafe release. The round-robin script selects the resource group owner in a round-robin (circular) fashion. This policy can be used for resource groups that can run on any node in the cluster. Failover scripts are stored in the /usr/lib/failsafe/policies directory. If the provided scripts do not meet your needs, you can define a new failover script and place it in the /usr/lib/failsafe/policies directory. When you are using the FailSafe GUI, the GUI automatically detects your script and presents it to you as a choice for you to use. You can configure the Linux FailSafe database to use your new failover script for the required resource groups. For information on defining failover scripts, see the Linux FailSafe Programmer's Guide.
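For example, a site-specific failover script could be installed alongside the supplied ones as follows. The script name mypolicy and its source location are hypothetical; the script itself must implement the interface described in the Linux FailSafe Programmer's Guide:

# Install a custom failover script where FailSafe (and the GUI) will find it
cp /tmp/mypolicy /usr/lib/failsafe/policies/mypolicy
chmod 755 /usr/lib/failsafe/policies/mypolicy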
Failover Domain A failover domain is the ordered list of nodes on which a given resource group can be allocated. The nodes listed in the failover domain must be within the same cluster; however, the failover domain does not have to include every node in the cluster. The failover domain can be used to statically load balance the resource groups in a cluster. Examples: In a four-node cluster, two nodes might share a volume. The failover domain of the resource group containing the volume will be the two nodes that share the volume. If you have a cluster of nodes named venus, mercury, and pluto, you could configure the following initial failover domains for resource groups RG1 and RG2: venus, mercury, pluto for RG1 pluto, mercury for RG2 When you define a failover policy, you specify the initial failover domain. The initial failover domain is used when a cluster is first booted. The ordered list specified by the initial failover domain is transformed into a runtime failover domain by the failover script. With each failure, the failover script takes the current runtime failover domain and potentially modifies it; the initial failover domain is never used again. Depending on the runtime conditions and the contents of the failover script, the initial and runtime failover domains may be identical. Linux FailSafe stores the runtime failover domain and uses it as input to the next failover script invocation. Failover Attributes A failover attribute is a value that is passed to the failover script and used by Linux FailSafe for the purpose of modifying the runtime failover domain used for a specific resource group. You can specify a failover attribute of Auto_Failback, Controlled_Failback, Auto_Recovery, or InPlace_Recovery. Auto_Failback and Controlled_Failback are mutually exclusive, and you must specify one or the other. Auto_Recovery and InPlace_Recovery are mutually exclusive, but whether you specify one of them is optional. A failover attribute of Auto_Failback specifies that the resource group will be run on the first available node in the runtime failover domain. If the first node fails, the next available node will be used; when the first node reboots, the resource group will return to it. This attribute is best used when some type of load balancing is required. A failover attribute of Controlled_Failback specifies that the resource group will be run on the first available node in the runtime failover domain, and will remain running on that node until it fails. If the first node fails, the next available node will be used; the resource group will remain on this new node even after the first node reboots. This attribute is best used when client/server applications have expensive recovery mechanisms, such as databases or any application that uses TCP to communicate. The recovery attributes Auto_Recovery and InPlace_Recovery determine the node on which a resource group will be allocated when its state changes to online and a member of the group is already allocated (such as when volumes are present). Auto_Recovery specifies that the failover policy will be used to allocate the resource group; this is the default recovery attribute if you have specified the Auto_Failback attribute. InPlace_Recovery specifies that the resource group will be allocated on the node that already contains part of the resource group; this is the default recovery attribute if you have specified the Controlled_Failback attribute. See the Linux FailSafe Programmer's Guide for a full discussion of example failover policies. Defining a Failover Policy with the Cluster Manager GUI To define a failover policy using the GUI, perform the following steps: Launch the FailSafe Manager. On the left side of the display, click on the “Failover Policies & Resource Groups” category. On the right side of the display click on the “Define a Failover Policy” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete the task. Defining a Failover Policy with the Cluster Manager CLI To define a failover policy, enter the following command at the cmgr prompt to specify the name of the failover policy: cmgr> define failover_policy A The following prompt appears: failover_policy A? When this prompt appears you can use the following commands to specify the components of a failover policy: failover_policy A? set attribute to B failover_policy A? set script to C failover_policy A? set domain to D When you define a failover policy, you can set as many attributes and domain nodes as your setup requires by executing the set attribute and set domain commands with different values. The CLI also allows you to specify multiple domain nodes in one command of the following format: failover_policy A? set domain to B C D ... The components of a failover policy are described in detail in the Linux FailSafe Programmer's Guide and in summary in . When you are finished defining the failover policy, enter done to return to the cmgr prompt.
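The following illustrative session defines a failover policy for the three-node domain discussed above; the policy name fp-web and the node names are examples only:

cmgr> define failover_policy fp-web
failover_policy fp-web? set attribute to Controlled_Failback
failover_policy fp-web? set attribute to InPlace_Recovery
failover_policy fp-web? set script to ordered
failover_policy fp-web? set domain to venus mercury pluto
failover_policy fp-web? done
cmgr>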
Modifying and Deleting Failover Policies After you have defined a failover policy, you can modify or delete it. Modifying and Deleting Failover Policies with the Cluster Manager GUI To modify a failover policy with the Cluster Manager GUI, perform the following procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Failover Policies & Resource Groups” category. On the right side of the display click on the “Modify a Failover Policy Definition” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel. To delete a failover policy with the Cluster Manager GUI, perform the following procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Failover Policies & Resource Groups” category. On the right side of the display click on the “Delete a Failover Policy” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel. Modifying and Deleting Failover Policies with the Cluster Manager CLI Use the following CLI command to modify a failover policy: cmgr> modify failover_policy A You modify a failover policy using the same commands you use to define a failover policy. You can use the following command to delete a failover policy definition: cmgr> delete failover_policy A Displaying Failover Policies You can use Linux FailSafe to display any of the following: The components of a specified failover policy All of the failover policies that have been defined All of the failover policy attributes that have been defined All of the failover policy scripts that have been defined Displaying Failover Policies with the Cluster Manager GUI The Cluster Manager GUI provides a convenient display of failover policies through the FailSafe Cluster View. You can launch the FailSafe Cluster View directly, or you can bring it up at any time by clicking on the “FailSafe Cluster View” prompt at the bottom of the “FailSafe Manager” display. From the View menu of the FailSafe Cluster View, select Failover Policies to see all defined failover policies. Displaying Failover Policies with the Cluster Manager CLI Use the following command to view the parameters of a defined failover policy: cmgr> show failover_policy A Use the following command to view all of the defined failover policies: cmgr> show failover_policies Use the following command to view all of the defined failover policy attributes: cmgr> show failover_policy attributes Use the following command to view all of the defined failover policy scripts: cmgr> show failover_policy scripts Defining Resource Groups Resources are configured together into resource groups. A resource group is a collection of interdependent resources. If any individual resource in a resource group becomes unavailable for its intended use, then the entire resource group is considered unavailable. Therefore, a resource group is the unit of failover for Linux FailSafe. For example, a resource group could contain all of the resources that are required for the operation of a web server, such as the web server itself, the IP address with which it communicates to the outside world, and the disk volumes containing the content that it serves. When you define a resource group, you specify a failover policy. A failover policy controls the behavior of a resource group in failure situations. To define a resource group, you provide the following information: The name of the resource group, with a maximum length of 63 characters.
The name of the cluster to which the resource group is available The resources to include in the resource group, and their resource types The name of the failover policy that determines which node will take over the services of the resource group on failure Linux FailSafe does not allow resource groups that do not contain any resources to be brought online. You can define up to 100 resources configured in any number of resource groups. Defining a Resource Group with the Cluster Manager GUI To define a resource group with the Cluster Manager GUI, perform the following steps: Launch the FailSafe Manager. On the left side of the display, click on “Guided Configuration”. On the right side of the display click on the “Set Up Highly Available Resource Groups” task link to launch the task. In the resulting window, click each task link in turn, as it becomes available. Enter the selected inputs for each task. When finished, click “OK” to close the taskset window. Defining a Resource Group with the Cluster Manager CLI To configure a resource group, enter the following command at the cmgr prompt to specify the name of a resource group and the cluster to which the resource group is available: cmgr> define resource_group A [in cluster B] Entering this command specifies the name of the resource group you are defining within a specified cluster. If you have specified a default cluster, you do not need to specify a cluster in this command and the CLI will use the default. The following prompt appears: Enter commands, when finished enter either "done" or "cancel" resource_group A? When this prompt appears you can use the following commands to specify the resources to include in the resource group and the failover policy to apply to the resource group: resource_group A? add resource B of resource_type C resource_group A? set failover_policy to D After you have set the failover policy and you have finished adding resources to the resource group, enter done to return to the cmgr prompt. For a full example of resource group creation using the Cluster Manager CLI, see .
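For instance, the following sketch defines a resource group that pairs the hypothetical fp-web failover policy from the earlier example with an IP_address resource and a Netscape_web resource named ws1 (all names are illustrative):

cmgr> define resource_group web-group in cluster test-cluster
Enter commands, when finished enter either "done" or "cancel"
resource_group web-group? set failover_policy to fp-web
resource_group web-group? add resource 192.0.2.34 of resource_type IP_address
resource_group web-group? add resource ws1 of resource_type Netscape_web
resource_group web-group? done
cmgr>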
Modifying and Deleting Resource Groups After you have defined resource groups, you can modify and delete the resource groups. You can change the failover policy of a resource group by specifying a new failover policy associated with that resource group, and you can add resources to or delete resources from the existing resource group. Note, however, that since you cannot have a resource group online that does not contain any resources, Linux FailSafe does not allow you to delete all resources from a resource group once the resource group is online. Likewise, Linux FailSafe does not allow you to bring a resource group online if it has no resources. Also, resources must be added and deleted in atomic units; this means that resources which are interdependent must be added and deleted together. Modifying and Deleting Resource Groups with the Cluster Manager GUI To modify a resource group with the Cluster Manager GUI, perform the following procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Failover Policies & Resource Groups” category. On the right side of the display click on the “Modify a Resource Group Definition” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel. To add or delete resources in a resource group definition with the Cluster Manager GUI, perform the following procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Failover Policies & Resource Groups” category. On the right side of the display click on the “Add/Remove Resources in Resource Group” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel. To delete a resource group with the Cluster Manager GUI, perform the following procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Failover Policies & Resource Groups” category. On the right side of the display click on the “Delete a Resource Group” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel. Modifying and Deleting Resource Groups with the Cluster Manager CLI Use the following CLI command to modify a resource group: cmgr> modify resource_group A [in cluster B] If you have specified a default cluster, you do not need to specify a cluster in this command and the CLI will use the default. You modify a resource group using the same commands you use to define a resource group: resource_group A? add resource B of resource_type C resource_group A? set failover_policy to D
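As an illustration, this hypothetical session switches the web-group resource group defined earlier to a different (assumed) failover policy named fp-standby:

cmgr> modify resource_group web-group in cluster test-cluster
Enter commands, when finished enter either "done" or "cancel"
resource_group web-group? set failover_policy to fp-standby
resource_group web-group? done
cmgr>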
You can use the following command to delete a resource group definition: cmgr> delete resource_group A [in cluster B] If you have specified a default cluster, you do not need to specify a cluster in this command and the CLI will use the default. Displaying Resource Groups You can display the parameters of a defined resource group, and you can display all of the resource groups defined for a cluster. Displaying Resource Groups with the Cluster Manager GUI The Cluster Manager GUI provides a convenient display of resource groups through the FailSafe Cluster View. You can launch the FailSafe Cluster View directly, or you can bring it up at any time by clicking on the “FailSafe Cluster View” prompt at the bottom of the “FailSafe Manager” display. From the View menu of the FailSafe Cluster View, select Groups to see all defined resource groups. To display which nodes are currently running which groups, select “Groups owned by Nodes.” To display which groups are running which failover policies, select “Groups by Failover Policies.” Displaying Resource Groups with the Cluster Manager CLI Use the following command to view the parameters of a defined resource group: cmgr> show resource_group A [in cluster B] If you have specified a default cluster, you do not need to specify a cluster in this command and the CLI will use the default. Use the following command to view all of the defined resource groups: cmgr> show resource_groups [in cluster A] Linux FailSafe System Log Configuration Linux FailSafe maintains system logs for each of the Linux FailSafe daemons. You can customize the system logs according to the level of logging you wish to maintain. A log group is a set of processes that log to the same log file according to the same logging configuration. All Linux FailSafe daemons make one log group each. Linux FailSafe maintains the following log groups:
cli - Commands
crsd - Cluster reset services (crsd)
diags - Diagnostics
ha_agent - HA monitoring agents (ha_ifmx2)
ha_cmsd - Cluster membership daemon (ha_cmsd)
ha_fsd - Linux FailSafe daemon (ha_fsd)
ha_gcd - Group communication daemon (ha_gcd)
ha_ifd - Network interface monitoring daemon (ha_ifd)
ha_script - Action and failover policy scripts
ha_srmd - System resource manager (ha_srmd)
Log group configuration information is maintained for all nodes in the pool for the cli and crsd log groups, and for all nodes in the cluster for all other log groups. You can also customize the log group configuration for a specific node in the cluster or pool. When you configure a log group, you specify the following information: The log level, specified as a character string with the GUI and numerically (0 to 19) with the CLI, as described below The log file to log to The node whose specified log group you are customizing (optional) The log level specifies the verbosity of the logging, controlling the amount of log messages that Linux FailSafe will write into an associated log group's file. There are 10 debug levels. The following table shows the logging levels as you specify them with the GUI and the CLI.
Log Levels:
GUI level   CLI level   Meaning
Off         0           No logging
Minimal     1           Logs notifications of critical errors and normal operation
Info        2           Logs Minimal notifications plus warnings
Default     5           Logs all Info messages plus additional notifications
Debug0      10          Debug0 through Debug9 (10 through 19 in the CLI) log
...                     increasingly more debug information, including data
Debug9      19          structures. Many megabytes of disk space can be consumed
                        on the server when debug levels are used in a log
                        configuration.
Notifications of critical errors and normal operations are always sent to /var/log/failsafe/. Changes you make to the log level for a log group do not affect SYSLOG. The Linux FailSafe software appends the node name to the name of the log file you specify. For example, when you specify the log file name for a log group as /var/log/failsafe/cli, the file name will be /var/log/failsafe/cli_nodename. The default log file names are as follows:
/var/log/failsafe/cmsd_nodename - log file for the cluster membership services daemon on node nodename
/var/log/failsafe/gcd_nodename - log file for the group communication daemon on node nodename
/var/log/failsafe/srmd_nodename - log file for the system resource manager daemon on node nodename
/var/log/failsafe/failsafe_nodename - log file for the Linux FailSafe daemon, a policy implementor for resource groups, on node nodename
/var/log/failsafe/agent_nodename - log file for the monitoring agent named agent on node nodename. For example, ifd_nodename is the log file for the interface daemon monitoring agent that monitors interfaces and IP addresses and performs local failover of IP addresses.
/var/log/failsafe/crsd_nodename - log file for the reset daemon on node nodename
/var/log/failsafe/script_nodename - log file for scripts on node nodename
/var/log/failsafe/cli_nodename - log file for internal administrative commands on node nodename invoked by the Cluster Manager GUI and Cluster Manager CLI
For information on using log groups in system recovery, see . Configuring Log Groups with the Cluster Manager GUI To configure a log group with the Cluster Manager GUI, perform the following steps: Launch the FailSafe Manager. On the left side of the display, click on the “Nodes & Clusters” category. On the right side of the display click on the “Set Log Configuration” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete the task. Configuring Log Groups with the Cluster Manager CLI You can configure a log group with the following CLI command: cmgr> define log_group A [on node B] [in cluster C] You specify the node if you wish to customize the log group configuration for a specific node only. If you have specified a default cluster, you do not have to specify a cluster in this command; Linux FailSafe will use the default. The following prompt appears: Enter commands, when finished enter either "done" or "cancel" log_group A? When this prompt appears, you enter the log group parameters you wish to modify in the following format: log_group A? set log_level to B log_group A? add log_file C log_group A? remove log_file C When you are finished configuring the log group, enter done to return to the cmgr prompt.
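For example, the following illustrative session raises the ha_script log group to the first debug level on a hypothetical node named venus (the cluster name and log file path are examples only; FailSafe appends the node name to the file name, as described above):

cmgr> define log_group ha_script on node venus in cluster test-cluster
Enter commands, when finished enter either "done" or "cancel"
log_group ha_script? set log_level to 10
log_group ha_script? add log_file /var/log/failsafe/script
log_group ha_script? done
cmgr>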
Modifying Log Groups with the Cluster Manager CLI Use the following CLI command to modify a log group: cmgr> modify log_group A [on node B] [in cluster C] You modify a log group using the same commands you use to define a log group. Displaying Log Group Definitions with the Cluster Manager GUI To display log group definitions with the Cluster Manager GUI, run “Set Log Configuration” and choose the log group to display from the rollover menu. The current log level and log file for that log group will be displayed in the task window, where you can change those settings if you desire. Displaying Log Group Definitions with the Cluster Manager CLI Use the following command to view the parameters of the defined log groups: cmgr> show log_groups This command shows all of the log groups currently defined, with the log group name, the logging levels, and the log files. For information on viewing the contents of the log file, see .
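Because the node name is appended to each log file name, a quick way to watch FailSafe activity on a given node is to follow the corresponding file directly; for a hypothetical node named venus:

# Follow script execution logging on node venus as it happens
tail -f /var/log/failsafe/script_venus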
Resource Group Creation Example Use the following procedure to create a resource group using the Cluster Manager CLI: Determine the list of resources that belong to the resource group you are defining. The list of resources that belong to a resource group consists of the resources that move from one node to another as one unit. A resource group that provides NFS services would contain a resource of each of the following types: IP_address volume filesystem NFS All resource and resource type dependencies of resources in a resource group must be satisfied. For example, the NFS resource type depends on the filesystem resource type, so a resource group containing a resource of the NFS resource type must also contain a resource of the filesystem resource type. Determine the failover policy to be used by the resource group. Use the template cluster_mgr script available in the /usr/lib/failsafe/cmgr-templates/cmgr-create-resource_group file. This example shows a script that creates a resource group with the following characteristics: The resource group is named nfs-group The resource group is in cluster HA-cluster The resource group uses the failover policy n1_n2_ordered The resource group contains IP_address, volume, filesystem, and NFS resources The following script can be used to create this resource group:

define resource_group nfs-group in cluster HA-cluster
  set failover_policy to n1_n2_ordered
  add resource 192.0.2.34 of resource_type IP_address
  add resource havol1 of resource_type volume
  add resource /hafs1 of resource_type filesystem
  add resource /hafs1 of resource_type NFS
done

Run this script using the -f option of the cluster_mgr command. Linux FailSafe Configuration Example CLI Script The following Cluster Manager CLI script provides an example that shows how to configure a cluster in the cluster database. The script illustrates the CLI commands that you execute when you define a cluster. You will use the parameters of your own system when you configure your cluster. After you create a CLI script, you can set the execute permissions and execute the script directly. For general information on CLI scripts, see . For information on the CLI template files that you can use to create your own configuration script, see .

#!/usr/lib/failsafe/bin/cluster_mgr -f
##################################################################
#                                                                #
# Sample cmgr script to create a 2-node cluster in the cluster   #
# database (cdb).                                                #
# This script is created using cmgr template files under the     #
# /usr/lib/failsafe/cmgr-templates directory.                    #
# The cluster has 2 resource groups:                             #
# 1. nfs-group - Has 2 NFS, 2 filesystem, 2 volume, 1 statd and  #
#    1 IP_address resources.                                     #
# 2. web-group - Has 1 Netscape_web and 1 IP_address resources.  #
#                                                                #
# NOTE: After running this script to define the cluster in the   #
# cdb, the user has to enable the two resource groups using the  #
# cmgr admin online resource_group command.                      #
#                                                                #
##################################################################
#
# Create the first node.
# Information to create a node is obtained from template script:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-node
#
#
# Logical name of the node. It is recommended that the logical name of
# the node be the output of the hostname(1) command.
#
define node sleepy
#
# Hostname of the node. This is optional. If this field is not
# specified, the logical name of the node is assumed to be the hostname.
# This value has to be the output of the hostname(1) command.
#
  set hostname to sleepy
#
# Node identifier. The node identifier is a 16-bit integer that uniquely
# identifies the node. This field is optional. If a value is
# not provided, the cluster software generates a node identifier.
# Example value: 1
  set nodeid to 101
#
# Description of the system controller of this node.
# The system controller can be "chalL", "msc", or "mmsc". If the node is a
# Challenge DM/L/XL, then the system controller type is "chalL". If the
# node is an Origin 200 or a deskside Origin 2000, then the system
# controller type is "msc". If the node is a rackmount Origin 2000, the
# system controller type is "mmsc".
# Possible values: msc, mmsc, chalL
#
  set sysctrl_type to msc
#
# You can enable or disable the system controller definition. Users are
# expected to enable the system controller definition after verifying the
# serial reset cables connected to this node.
# Possible values: enabled, disabled
#
  set sysctrl_status to enabled
#
# The system controller password for doing privileged system controller
# commands.
# This field is optional.
#
  set sysctrl_password to none
#
# System controller owner. The node name of the machine that is
# connected using serial cables to the system controller of this node.
# The system controller owner node also has to be defined in the CDB.
#
  set sysctrl_owner to grumpy
#
# System controller device. The absolute device path name of the tty
# to which the serial cable is connected on this node.
# Example value: /dev/ttyd2
#
  set sysctrl_device to /dev/ttyd2
#
# Currently, the system controller owner can be connected to the system
# controller on this node using a "tty" device.
# Possible value: tty
#
  set sysctrl_owner_type to tty
#
# List of control networks. There can be multiple control networks
# specified for a node. The HA cluster software uses these control
# networks for communication between nodes. At least two control
# networks should be specified for heartbeat messages and one
# control network for failsafe control messages.
# For each control network for the node, please add one more
# control network section.
#
# Name of the control network IP address. This IP address must
# be configured on the network interface in the /etc/rc.config
# file on the node.
# It is recommended that the IP address be provided in internet
# dot notation.
# Example value: 192.26.50.3
#
  add nic 192.26.50.14
#
# Flag to indicate if the control network can be used for sending
# heartbeat messages.
# Possible values: true, false
#
  set heartbeat to true
#
# Flag to indicate if the control network can be used for sending
# failsafe control messages.
# Possible values: true, false
#
  set ctrl_msgs to true
#
# Priority of the control network. The higher the priority value, the
# lower the priority of the control network.
# Example value: 1
#
  set priority to 1
#
# Control network information complete
#
  done
#
# Add more control networks information here.
#
# Name of the control network IP address. This IP address must be
# configured on the network interface in the /etc/rc.config
# file on the node.
# It is recommended that the IP address be provided in internet
# dot notation.
# Example value: 192.26.50.3
#
  add nic 150.166.41.60
#
# Flag to indicate if the control network can be used for sending
# heartbeat messages.
# Possible values: true, false
#
  set heartbeat to true
#
# Flag to indicate if the control network can be used for sending
# failsafe control messages.
# Possible values: true, false
#
  set ctrl_msgs to false
#
# Priority of the control network. The higher the priority value, the
# lower the priority of the control network.
Higher the priority value, lower the # priority of the control network. # Example value: 1 #   set priority to 2 # # Control network information complete #   done # # Node definition complete # done # # Create the second node. # Information to create a node is obtained from template script: # /usr/lib/failsafe/cmgr-templates/cmgr-create-node # # # # logical name of the node. It is recommended that logical name of # the node be output of hostname(1) command. # define node grumpy # # Hostname of the node. This is optional. If this field is not # specified,logical name of the node is assumed to be hostname. # This value has to be # the output of hostname(1) command. #   set hostname to grumpy # # Node identifier. Node identifier is a 16 bit integer that uniquely # identifies the node. This field is optional. If value is # not provided,cluster software generates node identifier. # Example value: 1   set nodeid to 102 # # Description of the system controller of this node. # System controller can be “chalL” or “msc” or “mmsc”. If the node is a # Challenge DM/L/XL, then system controller type is “chalL”. If the # node is Origin 200 or deskside Origin 2000, then the system # controller type is “msc”. If the node is rackmount Origin 2000, # the system controller type is “mmsc”. # Possible values: msc, mmsc, chalL #   set sysctrl_type to msc # # You can enable or disable system controller definition. Users are # expected to enable system controller definition after verify the # serial reset cables connected to this node. # Possible values: enabled, disabled #   set sysctrl_status to enabled # # The system controller password for doing privileged system controller # commands. # This field is optional. #   set sysctrl_password to none # # System controller owner. The node name of the machine that is # connected using serial cables to system controller of this node. # System controller node also has to be defined in the CDB. #   set sysctrl_owner to sleepy # # System controller device. The absolute device path name of the tty # to which the serial cable is connected in this node. # Example value: /dev/ttyd2 #   set sysctrl_device to /dev/ttyd2 # # Currently, the system controller owner can be connected to the system # controller on this node using “tty” device. # Possible value: tty #   set sysctrl_owner_type to tty # # List of control networks. There can be multiple control networks # specified for a node. HA cluster software uses these control # networks for communication between nodes. At least two control # networks should be specified for heartbeat messages and one # control network for failsafe control messages. # For each control network for the node, please add one more # control network section. # # Name of control network IP address. This IP address must be # configured on the network interface in /etc/rc.config # file in the node. # It is recommended that the IP address in internet dot notation # is provided. # Example value: 192.26.50.3 #   add nic 192.26.50.15 # # Flag to indicate if the control network can be used for sending # heartbeat messages. # Possible values: true, false #   set heartbeat to true # # Flag to indicate if the control network can be used for sending # failsafe control messages. # Possible values: true, false #   set ctrl_msgs to true # # Priority of the control network. Higher the priority value, lower the # priority of the control network. 
# Example value: 1
#
  set priority to 1
#
# Control network information complete
#
  done
#
# Add more control networks information here.
#
# Name of the control network IP address. This IP address must be
# configured on the network interface in the /etc/rc.config
# file on the node.
# It is recommended that the IP address be provided in internet
# dot notation.
# Example value: 192.26.50.3
#
  add nic 150.166.41.61
#
# Flag to indicate if the control network can be used for sending
# heartbeat messages.
# Possible values: true, false
#
  set heartbeat to true
#
# Flag to indicate if the control network can be used for sending
# failsafe control messages.
# Possible values: true, false
#
  set ctrl_msgs to false
#
# Priority of the control network. The higher the priority value, the
# lower the priority of the control network.
# Example value: 1
#
  set priority to 2
#
# Control network information complete
#
  done
#
# Node definition complete
#
done
#
# Define (create) the cluster.
# Information to create the cluster is obtained from template script:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-cluster
#
#
# Name of the cluster.
#
define cluster failsafe-cluster
#
# Notification command for the cluster. This is optional. If this
# field is not specified, the /usr/bin/mail command is used for
# notification. Notification is sent when there is a change in the
# status of the cluster, a node, or a resource group.
#
  set notify_cmd to /usr/bin/mail
#
# Notification address for the cluster. This field value is passed as
# an argument to the notification command. Specifying the notification
# command is optional, and the user can specify only the notification
# address in order to receive notifications by mail. If an address is
# not specified, notification will not be sent.
# Example value: failsafe_alias@sysadm.company.com
  set notify_addr to robinhood@sgi.com princejohn@sgi.com
#
# List of nodes added to the cluster.
# Repeat the following line for each node to be added to the cluster.
# The node should already be defined in the CDB, and the logical name
# of the node has to be specified.
  add node sleepy
#
# Add more nodes to the cluster here.
#
  add node grumpy
#
# Cluster definition complete
#
done
#
# Create the failover policies.
# Information to create the failover policies is obtained from
# template script:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-failover_policy
#
#
# Create the first failover policy.
#
#
# Name of the failover policy.
#
define failover_policy sleepy-primary
#
# Failover policy attributes. This field is mandatory.
# Possible values: Auto_Failback, Controlled_Failback, Auto_Recovery,
# InPlace_Recovery
#
  set attribute to Auto_Failback
  set attribute to Auto_Recovery
#
# Failover policy script. The failover policy scripts have to
# be present in the /usr/lib/failsafe/policies directory.
# This field is mandatory.
# Example value: ordered (file name, not the full path name).
  set script to ordered
#
# Failover policy domain. Ordered list of nodes in the cluster,
# separated by spaces. This field is mandatory.
#
  set domain to sleepy grumpy
#
# Failover policy definition complete
#
done
#
# Create the second failover policy.
#
#
# Name of the failover policy.
#
define failover_policy grumpy-primary
#
# Failover policy attributes. This field is mandatory.
# Possible values: Auto_Failback, Controlled_Failback, Auto_Recovery,
# InPlace_Recovery
#
  set attribute to Auto_Failback
  set attribute to InPlace_Recovery
#
# Failover policy script. The failover policy scripts have to
# be present in the /usr/lib/failsafe/policies directory.
# This field is mandatory.
# Example value: ordered (file name, not the full path name).
  set script to ordered
#
# Failover policy domain. Ordered list of nodes in the cluster,
# separated by spaces. This field is mandatory.
#
  set domain to grumpy sleepy
#
# Failover policy definition complete
#
done
#
# Create the IP_address resources.
# Information to create an IP_address resource is obtained from:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-resource-IP_address
#
#
# If multiple resources of resource type IP_address have to be created,
# repeat the following IP_address definition template.
#
# Name of the IP_address resource. The name of the resource has to
# be an IP address in internet "." notation. This IP address is used
# by clients to access highly available resources.
# Example value: 192.26.50.140
#
define resource 150.166.41.179 of resource_type IP_address in cluster failsafe-cluster
#
# The network mask for the IP address. The network mask value is used
# to configure the IP address on the network interface.
# Example value: 0xffffff00
  set NetworkMask to 0xffffff00
#
# The ordered list of interfaces that can be used to configure the IP
# address. The list of interface names is separated by commas.
# Example value: eth0, eth1
  set interfaces to eth1
#
# The broadcast address for the IP address.
# Example value: 192.26.50.255
  set BroadcastAddress to 150.166.41.255
#
# IP_address resource definition for the cluster complete
#
done
#
# Name of the IP_address resource. The name of the resource has to be
# an IP address in internet "." notation. This IP address is used by
# clients to access highly available resources.
# Example value: 192.26.50.140
#
define resource 150.166.41.99 of resource_type IP_address in cluster failsafe-cluster
#
# The network mask for the IP address. The network mask value is used
# to configure the IP address on the network interface.
# Example value: 0xffffff00
  set NetworkMask to 0xffffff00
#
# The ordered list of interfaces that can be used to configure the IP
# address. The list of interface names is separated by commas.
# Example value: eth0, eth1
  set interfaces to eth1
#
# The broadcast address for the IP address.
# Example value: 192.26.50.255
  set BroadcastAddress to 150.166.41.255
#
# IP_address resource definition for the cluster complete
#
done
#
# Create the volume resources.
# Information to create a volume resource is obtained from:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-resource-volume
#
#
# If multiple resources of resource type volume have to be created,
# repeat the following volume definition template.
#
# Name of the volume. The name of the volume has to be the volume
# name only.
# Example value: HA_vol (not /dev/xlv/HA_vol)
#
define resource bagheera of resource_type volume in cluster failsafe-cluster
#
# The user name of the owner of the device file. This field is optional.
# If this field is not specified, the value "root" is used.
# Example value: oracle
  set devname-owner to root
#
# The group name of the device file. This field is optional.
# If this field is not specified, the value "sys" is used.
# Example value: oracle
  set devname-group to sys
#
# The device file permissions. This field is optional. If this
# field is not specified, the value "666" is used. The file permissions
# have to be specified in octal notation. See chmod(1) for more
# information.
# Example value: 666
  set devname-mode to 666
#
# Volume resource definition for the cluster complete
#
done
#
# Name of the volume. The name of the volume has to be the volume
# name only.
# Example value: HA_vol (not /dev/xlv/HA_vol)
#
define resource bhaloo of resource_type volume in cluster failsafe-cluster
#
# The user name of the owner of the device file. This field is optional.
# If this field is not specified, the value "root" is used.
# Example value: oracle
  set devname-owner to root
#
# The group name of the device file. This field is optional.
# If this field is not specified, the value "sys" is used.
# Example value: oracle
  set devname-group to sys
#
# The device file permissions. This field is optional. If this field is
# not specified, the value "666" is used. The file permissions
# have to be specified in octal notation. See chmod(1) for more
# information.
# Example value: 666
  set devname-mode to 666
#
# Volume resource definition for the cluster complete
#
done
#
# Create the filesystem resources.
# Information to create a filesystem resource is obtained from:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-resource-filesystem
#
#
# The filesystem resource type is for XFS filesystems only.
# If multiple resources of resource type filesystem have to be created,
# repeat the following filesystem definition template.
#
# Name of the filesystem. The name of the filesystem resource has
# to be the absolute path name of the filesystem mount point.
# Example value: /shared_vol
#
define resource /haathi of resource_type filesystem in cluster failsafe-cluster
#
# The name of the volume resource corresponding to the filesystem. This
# resource should be the same as the volume dependency, see below.
# This field is mandatory.
# Example value: HA_vol
  set volume-name to bagheera
#
# The options to be used when mounting the filesystem. This field is
# mandatory. For the list of mount options, see fstab(4).
# Example value: "rw"
  set mount-options to rw
#
# The monitoring level for the filesystem. This field is optional. If
# this field is not specified, the value "1" is used.
# The monitoring level can be:
# 1 - Checks if the filesystem exists in the mtab file (see mtab(4)).
#     This is a lightweight check compared to monitoring level 2.
# 2 - Checks if the filesystem is mounted using the stat(1m) command.
#
  set monitoring-level to 2
done
#
# Add the filesystem resource type dependency
#
modify resource /haathi of resource_type filesystem in cluster failsafe-cluster
#
# The filesystem resource type definition also contains a resource
# dependency on a volume resource.
# This field is mandatory.
# Example value: HA_vol
  add dependency bagheera of type volume
#
# filesystem resource definition for the cluster complete
#
done
#
# Name of the filesystem. The name of the filesystem resource has
# to be the absolute path name of the filesystem mount point.
# Example value: /shared_vol
#
define resource /sherkhan of resource_type filesystem in cluster failsafe-cluster
#
# The name of the volume resource corresponding to the filesystem. This
# resource should be the same as the volume dependency, see below.
# This field is mandatory.
# Example value: HA_vol
  set volume-name to bhaloo
#
# The options to be used when mounting the filesystem. This field is
# mandatory. For the list of mount options, see fstab(4).
# Example value: "rw"
  set mount-options to rw
#
# The monitoring level for the filesystem. This field is optional. If
# this field is not specified, the value "1" is used.
# The monitoring level can be:
# 1 - Checks if the filesystem exists in the mtab file (see mtab(4)).
#     This is a lightweight check compared to monitoring level 2.
# 2 - Checks if the filesystem is mounted using the stat(1m) command.
#
  set monitoring-level to 2
done
#
# Add the filesystem resource type dependency
#
modify resource /sherkhan of resource_type filesystem in cluster failsafe-cluster
#
# The filesystem resource type definition also contains a resource
# dependency on a volume resource.
# This field is mandatory.
# Example value: HA_vol
  add dependency bhaloo of type volume
#
# filesystem resource definition for the cluster complete
#
done
#
# Create the statd resource.
# Information to create a statd resource is obtained from:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-resource-statd
#
#
# If multiple resources of resource type statd have to be created,
# repeat the following statd definition template.
#
# Name of the statd resource. The name of the resource has to be the
# location of the NFS/lockd directory.
# Example value: /disk1/statmon
#
define resource /haathi/statmon of resource_type statd in cluster failsafe-cluster
#
# The IP address to which the NFS clients connect; this resource should
# be the same as the IP_address dependency, see below.
# This field is mandatory.
# Example value: 128.1.2.3
  set InterfaceAddress to 150.166.41.99
done
#
# Add the statd resource type dependencies
#
modify resource /haathi/statmon of resource_type statd in cluster failsafe-cluster
#
# The statd resource type definition also contains a resource
# dependency on an IP_address resource.
# This field is mandatory.
# Example value: 128.1.2.3
  add dependency 150.166.41.99 of type IP_address
#
# The statd resource type definition also contains a resource
# dependency on a filesystem resource. It defines the location of
# the NFS lock directory filesystem.
# This field is mandatory.
# Example value: /disk1
  add dependency /haathi of type filesystem
#
# statd resource definition for the cluster complete
#
done
#
# Create the NFS resources.
# Information to create an NFS resource is obtained from:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-resource-NFS
#
#
# If multiple resources of resource type NFS have to be created, repeat
# the following NFS definition template.
#
# Name of the NFS export point. The name of the NFS resource has to be
# the export path name of the filesystem mount point.
# Example value: /disk1
#
define resource /haathi of resource_type NFS in cluster failsafe-cluster
#
# The export options to be used when exporting the filesystem. For the
# list of export options, see exportfs(1M).
# This field is mandatory.
# Example value: "rw,wsync,anon=root"
  set export-info to rw
#
# The name of the filesystem resource corresponding to the export
# point. This resource should be the same as the filesystem dependency,
# see below.
# This field is mandatory.
# Example value: /disk1
  set filesystem to /haathi
done
#
# Add the resource type dependency
#
modify resource /haathi of resource_type NFS in cluster failsafe-cluster
#
# The NFS resource type definition also contains a resource dependency
# on a filesystem resource.
# This field is mandatory.
# Example value: /disk1
  add dependency /haathi of type filesystem
#
# The NFS resource type also contains a pseudo resource dependency
# on a statd resource. You really must have a statd resource associated
# with an NFS resource, so the NFS locks can be failed over.
# This field is mandatory.
# Example value: /disk1/statmon
  add dependency /haathi/statmon of type statd
#
# NFS resource definition for the cluster complete
#
done
#
# Name of the NFS export point. The name of the NFS resource has to be
# the export path name of the filesystem mount point.
# Example value: /disk1
#
define resource /sherkhan of resource_type NFS in cluster failsafe-cluster
#
# The export options to be used when exporting the filesystem. For the
# list of export options, see exportfs(1M).
# This field is mandatory.
# Example value: "rw,wsync,anon=root"
  set export-info to rw
#
# The name of the filesystem resource corresponding to the export
# point. This resource should be the same as the filesystem dependency,
# see below.
# This field is mandatory.
# Example value: /disk1
  set filesystem to /sherkhan
done
#
# Add the resource type dependency
#
modify resource /sherkhan of resource_type NFS in cluster failsafe-cluster
#
# The NFS resource type definition also contains a resource dependency
# on a filesystem resource.
# This field is mandatory.
# Example value: /disk1
  add dependency /sherkhan of type filesystem
#
# The NFS resource type also contains a pseudo resource dependency
# on a statd resource. You really must have a statd resource associated
# with an NFS resource, so the NFS locks can be failed over.
# This field is mandatory.
# Example value: /disk1/statmon
  add dependency /haathi/statmon of type statd
#
# NFS resource definition for the cluster complete
#
done
#
# Create the Netscape_web resource.
# Information to create a Netscape_web resource is obtained from:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-resource-Netscape_web
#
#
# If multiple resources of resource type Netscape_web have to be
# created, repeat the following Netscape_web definition template.
#
# Name of the Netscape Web server. The name of the resource has to be
# a unique identifier.
# Example value: ha80
#
define resource web-server of resource_type Netscape_web in cluster failsafe-cluster
#
# The location of the server's startup and stop scripts.
# This field is mandatory.
# Example value: /usr/ns-home/ha86
  set admin-scripts to /var/netscape/suitespot/https-control3
#
# The TCP port number on which the server listens.
# This field is mandatory.
# Example value: 80
  set port-number to 80
#
# The desired monitoring level; the user can specify either:
# 1 - checks for process existence
# 2 - issues an HTML query to the server.
# This field is mandatory.
# Example value: 2
  set monitor-level to 2
#
# The location of the Web server's initial HTML page.
# This field is mandatory.
# Example value: /var/www/htdocs
  set default-page-location to /var/www/htdocs
#
# The Web server's IP address; this must be a configured IP_address
# resource.
# This resource should be the same as the IP_address dependency, see
# below.
# This field is mandatory.
# Example value: 28.12.9.5
  set web-ipaddr to 150.166.41.179
done
#
# Add the resource dependency
#
modify resource web-server of resource_type Netscape_web in cluster failsafe-cluster
#
# The Netscape_web resource type definition also contains a resource
# dependency on an IP_address resource.
# This field is mandatory.
# Example value: 28.12.9.5
  add dependency 150.166.41.179 of type IP_address
#
# Netscape_web resource definition for the cluster complete
#
done
#
# Create the resource groups.
# Information to create a resource group is obtained from:
# /usr/lib/failsafe/cmgr-templates/cmgr-create-resource_group
#
#
# Name of the resource group. The name of the resource group must be
# unique in the cluster.
#
define resource_group nfs-group in cluster failsafe-cluster
#
# Failover policy for the resource group. This field is mandatory.
# The failover policy should already be defined in the CDB.
#
  set failover_policy to sleepy-primary
#
# List of resources in the resource group.
# Repeat the following line for each resource to be added to the
# resource group.
  add resource 150.166.41.99 of resource_type IP_address
#
# Add more resources to the resource group here.
#
  add resource bagheera of resource_type volume
  add resource bhaloo of resource_type volume
  add resource /haathi of resource_type filesystem
  add resource /sherkhan of resource_type filesystem
  add resource /haathi/statmon of resource_type statd
  add resource /haathi of resource_type NFS
  add resource /sherkhan of resource_type NFS
#
# Resource group definition complete
#
done
#
# Name of the resource group. The name of the resource group must be
# unique in the cluster.
#
define resource_group web-group in cluster failsafe-cluster
#
# Failover policy for the resource group. This field is mandatory.
# The failover policy should already be defined in the CDB.
#
  set failover_policy to grumpy-primary
#
# List of resources in the resource group.
# Repeat the following line for each resource to be added to the
# resource group.
  add resource 150.166.41.179 of resource_type IP_address
#
# Add more resources to the resource group here.
#
  add resource web-server of resource_type Netscape_web
#
# Resource group definition complete
#
done
#
# Script complete. This should be the last line of the script.
#
quit
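As noted earlier, a script like this can be executed directly once it has execute permissions (the #! line invokes cluster_mgr with the -f option), or it can be passed to cluster_mgr explicitly. A hypothetical invocation, assuming the script was saved as /tmp/failsafe-config:

chmod +x /tmp/failsafe-config
/tmp/failsafe-config
# or, equivalently:
/usr/lib/failsafe/bin/cluster_mgr -f /tmp/failsafe-config

After the script completes, remember to bring the two resource groups online with the cmgr admin online resource_group command, as noted in the script header.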