Testing Linux FailSafe Configuration

This chapter explains how to test the Linux FailSafe system configuration using the Cluster Manager GUI and the Cluster Manager CLI. For general information on using the Cluster Manager GUI and the Cluster Manager CLI, see .

The sections in this chapter are as follows:

    Overview of FailSafe Diagnostic Commands
    Performing Diagnostic Tasks with the Cluster Manager GUI
    Performing Diagnostic Tasks with the Cluster Manager CLI

Overview of FailSafe Diagnostic Commands

The following table shows the tests you can perform with Linux FailSafe diagnostic commands:

FailSafe Diagnostic Test Summary

    Diagnostic Test        Checks Performed
    ---------------        ----------------
    resource               Checks that the resource type parameters are set
                           Checks that the parameters are syntactically correct
                           Validates that the parameters exist
    resource group         Tests all resources defined in the resource group
    failover policy        Checks that the failover policy exists
                           Checks that the failover domain contains a valid list of hosts
    network connectivity   Checks that the control interfaces are on the same network
                           Checks that the nodes can communicate with each other
    serial connection      Checks that the nodes can reset each other

All transactions are logged to the diagnostics file diags_nodename in the log directory. You should test resource groups before starting FailSafe HA services or starting a resource group. These tests are designed to check for resource inconsistencies that could prevent the resource group from starting successfully.

Performing Diagnostic Tasks with the Cluster Manager GUI

To test the components of a FailSafe system using the Cluster Manager GUI, perform the following steps:

    1. Select Task Manager on the FailSafe Toolchest.
    2. On the left side of the display, click on the “Diagnostics” category.
    3. Select one of the diagnostics tasks that appear on the right side of the display: “Test Connectivity,” “Test Resources,” or “Test Failover Policy.”

Testing Connectivity with the Cluster Manager GUI

When you select the “Test Connectivity” task from the Diagnostics display, you can test the network and serial connections on the nodes in your cluster by entering the requested inputs. You can test all of the nodes in the cluster at one time, or you can specify an individual node to test.

Testing Resources with the Cluster Manager GUI

When you select the “Test Resources” task from the Diagnostics display, you can test the resources on the nodes in your cluster by entering the requested inputs. You can test resources by type and by group. You can test the resources of a resource type or in a resource group on all of the nodes in the cluster at one time, or you can specify an individual node to test. Resource tests are performed only on nodes in the resource group's application failover domain.

Testing Failover Policies with the Cluster Manager GUI

When you select the “Test Failover Policy” task from the Diagnostics display, you can test whether a failover policy is defined correctly. This test checks the failover policy by validating the policy script, the failover attributes, and whether the application failover domain consists of valid nodes from the cluster.

Performing Diagnostic Tasks with the Cluster Manager CLI

The following subsections describe how to perform diagnostic tasks on your system using the Cluster Manager CLI commands.

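The CLI test commands described in the subsections below share a common shape: the keyword test, the thing to test, the cluster, an optional node list, and (for resource tests only) an optional destructive keyword. The following Python sketch assembles such command strings for pasting into a cmgr session. It is a hypothetical helper and not part of FailSafe; the function name and arguments are assumptions, and it follows the "in cluster" and "on node" spelling used by most of the examples in this chapter.

```python
def build_test_command(kind, cluster, name=None, resource_type=None,
                       nodes=(), destructive=False):
    """Return a cmgr 'test' command string (hypothetical helper).

    kind is one of: serial, connectivity, resource, resource_type,
    resource_group, failover_policy.
    """
    if kind in ("serial", "connectivity"):
        parts = ["test", kind]
    elif kind == "resource":
        if name is None or resource_type is None:
            raise ValueError("resource tests need a name and a resource_type")
        parts = ["test", "resource", name, "of", "resource_type", resource_type]
    elif kind in ("resource_type", "resource_group", "failover_policy"):
        if name is None:
            raise ValueError(f"{kind} tests need a name")
        parts = ["test", kind, name]
    else:
        raise ValueError(f"unknown test kind: {kind}")

    parts += ["in", "cluster", cluster]

    # Assumption: each node in the optional list is introduced by 'node'.
    if nodes:
        parts.append("on")
        for node in nodes:
            parts += ["node", node]

    # The chapter documents destructive mode only for resource tests.
    if destructive:
        if kind != "resource":
            raise ValueError("destructive mode applies to resource tests only")
        parts.append("destructive")

    return " ".join(parts)


if __name__ == "__main__":
    print(build_test_command("serial", "eagan", nodes=["cm1"]))
    print(build_test_command("resource", "eagan", name="/disk4",
                             resource_type="filesystem", nodes=["cm1"],
                             destructive=True))
```

The helper only builds text; it does not run cmgr or check whether FailSafe is running, so the restrictions described for each test still apply.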
Testing the Serial Connections with the Cluster Manager CLI

You can use the Cluster Manager CLI to test the serial connections between the Linux FailSafe nodes. This test pings each specified node through the serial line and produces an error message if the ping is not successful. Do not execute this command while FailSafe is running.

When you are using the Cluster Manager CLI, use the following command to test the serial connections for the machines in a cluster:

    cmgr> test serial in cluster A [on node B node C ...]

This test yields an error message when it encounters its first error, indicating the node that did not respond. If you receive an error message after executing this test, verify the cable connections of the serial cable from the indicated node's serial port to the remote power control unit or the system controller port of the other nodes and run the test again.

The following shows an example of the test serial CLI command:

    # cluster_mgr
    Welcome to Linux FailSafe Cluster Manager Command-Line Interface

    cmgr> test serial in cluster eagan on node cm1
    Success: testing serial...
    Success: Ensuring Node Can Get IP Addresses For All Specified Hosts
    Success: Number of IP addresses obtained for <cm1> = 1
    Success: The first IP address for <cm1> = 128.162.19.34
    Success: Checking serial lines via crsd (crsd is running)
    Success: Successfully checked serial line
    Success: Serial Line OK
    Success: overall exit status:success, tests failed:0, total tests executed:1

The following shows an example of an attempt to run the test serial CLI command while FailSafe is running (causing the command to fail to execute):

    cmgr> test serial in cluster eagan on node cm1
    Error: Cannot run the serial tests, diagnostics has detected FailSafe (ha_cmsd) is running
    Failed to execute FailSafe tests/diagnostics ha test command failed
    cmgr>

Testing Network Connectivity with the Cluster Manager CLI

You can use the Cluster Manager CLI to test the network connectivity in a cluster. This test checks whether the specified nodes can communicate with each other through each configured interface in the nodes. This test will not run if FailSafe is running.

When you are using the Cluster Manager CLI, use the following command to test the network connectivity for the machines in a cluster:

    cmgr> test connectivity in cluster A [on node B node C ...]

The following shows an example of the test connectivity CLI command:

    cmgr> test connectivity in cluster eagan on node cm1
    Success: testing connectivity...
    Success: checking that the control IP_addresses are on the same networks
    Success: pinging address cm1-priv interface ef0 from host cm1
    Success: pinging address cm1 interface ef1 from host cm1
    Success: overall exit status:success, tests failed:0, total tests executed:1

This test yields an error message when it encounters its first error, indicating the node that did not respond.

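When diagnostic output is captured to a file or pasted from a terminal, the first line that does not report success is usually the one to act on. The following Python sketch scans captured output for the first line beginning with Error: or Warning:, the two prefixes that appear in the example transcripts in this chapter. The function name and the sample strings are illustrative assumptions; the sample lines are copied from the examples above.

```python
PROBLEM_PREFIXES = ("Error:", "Warning:")


def first_problem(output):
    """Return the first line that starts with Error: or Warning:, or None."""
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith(PROBLEM_PREFIXES):
            return stripped
    return None


# Sample lines taken from the connectivity and serial examples above.
SAMPLE_OK = """\
Success: testing connectivity...
Success: checking that the control IP_addresses are on the same networks
Success: pinging address cm1-priv interface ef0 from host cm1
Success: overall exit status:success, tests failed:0, total tests executed:1
"""

SAMPLE_REFUSED = """\
Error: Cannot run the serial tests, diagnostics has detected FailSafe (ha_cmsd) is running
"""

if __name__ == "__main__":
    print(first_problem(SAMPLE_OK))
    print(first_problem(SAMPLE_REFUSED))
```

Note that tests run in destructive mode print a Warning: line announcing the mode even when every test succeeds, so a match from this function is a prompt to read the transcript, not proof of a failure.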
If you receive an error message after executing this test, verify that the network interface has been configured up, using the ifconfig command, for example:

    # /usr/etc/ifconfig ec3
    ec3: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,FILTMULTI,MULTICAST>
      inet 190.0.3.1 netmask 0xffffff00 broadcast 190.0.3.255

The UP in the first line of output indicates that the interface is configured up. If the network interface is configured up, verify that the network cables are connected properly and run the test again.

Testing Resources with the Cluster Manager CLI

You can use the Cluster Manager CLI to test any configured resource by resource name or by resource type.

The Cluster Manager CLI uses the following syntax to test a resource by name:

    cmgr> test resource A of resource_type B in cluster C [on node D node E ...]

The following shows an example of testing a resource by name:

    cmgr> test resource /disk1 of resource_type filesystem in cluster eagan on machine cm1
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all filesystem resources on node cm1 ***
    Success: testing resource /disk1 of resource type filesystem on node cm1
    Success: overall exit status:success, tests failed:0, total tests executed:1

The Cluster Manager CLI uses the following syntax to test a resource by resource type:

    cmgr> test resource_type A in cluster B [on node C node D ...]

The following shows an example of testing resources by resource type:

    cmgr> test resource_type filesystem in cluster eagan on machine cm1
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all filesystem resources on node cm1 ***
    Success: testing resource /disk4 of resource type filesystem on node cm1
    Success: testing resource /disk5 of resource type filesystem on node cm1
    Success: testing resource /disk2 of resource type filesystem on node cm1
    Success: testing resource /disk3 of resource type filesystem on node cm1
    Success: testing resource /disk1 of resource type filesystem on node cm1
    Success: overall exit status:success, tests failed:0, total tests executed:5

You can use the CLI to test volume and filesystem resources in destructive mode. This provides a more thorough test of filesystems and volumes. CLI tests will not run in destructive mode if FailSafe is running.

The Cluster Manager CLI uses the following syntax for the commands that test resources in destructive mode:

    cmgr> test resource A of resource_type B in cluster C [on node D node E ...] destructive

The following sections describe the diagnostic tests available for resources.

Testing Logical Volumes

You can use the Cluster Manager CLI to test the logical volumes in a cluster. This test checks whether the specified volume is configured correctly.

When you are using the Cluster Manager CLI, use the following command to test a logical volume:

    cmgr> test resource A of resource_type volume on cluster B [on node C node D ...]

The following example tests a logical volume:

    cmgr> test resource alternate of resource_type volume on cluster eagan
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all volume resources on node cm1 ***
    Success: running resource type volume tests on node cm1
    Success: *** testing node resources on node cm2 ***
    Success: *** testing all volume resources on node cm2 ***
    Success: running resource type volume tests on node cm2
    Success: overall exit status:success, tests failed:0, total tests executed:2
    cmgr>

The following example tests a logical volume in destructive mode:

    cmgr> test resource alternate of resource_type volume on cluster eagan destructive
    Warning: executing the tests in destructive mode
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all volume resources on node cm1 ***
    Success: running resource type volume tests on node cm1
    Success: successfully assembled volume: alternate
    Success: *** testing node resources on node cm2 ***
    Success: *** testing all volume resources on node cm2 ***
    Success: running resource type volume tests on node cm2
    Success: successfully assembled volume: alternate
    Success: overall exit status:success, tests failed:0, total tests executed:2
    cmgr>

Testing Filesystems

You can use the Cluster Manager CLI to test the filesystems configured in a cluster. This test checks whether the specified filesystem is configured correctly and, in addition, checks whether the volume the filesystem will reside on is configured correctly.

When you are using the Cluster Manager CLI, use the following command to test a filesystem:

    cmgr> test resource A of resource_type filesystem on cluster B [on node C node D ...]

The following example tests a filesystem. This example first uses a CLI show command to display the filesystems that have been defined in a cluster.

    cmgr> show resources of resource_type filesystem in cluster eagan
    /disk4 type filesystem
    /disk5 type filesystem
    /disk2 type filesystem
    /disk3 type filesystem
    /disk1 type filesystem
    cmgr> test resource /disk4 of resource_type filesystem in cluster eagan on node cm1
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all filesystem resources on node cm1 ***
    Success: successfully mounted filesystem: /disk4
    Success: overall exit status:success, tests failed:0, total tests executed:1
    cmgr>

The following example tests a filesystem in destructive mode:

    cmgr> test resource /disk4 of resource_type filesystem in cluster eagan on node cm1 destructive
    Warning: executing the tests in destructive mode
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all filesystem resources on node cm1 ***
    Success: successfully mounted filesystem: /disk4
    Success: overall exit status:success, tests failed:0, total tests executed:1
    cmgr>

Testing NFS Filesystems

You can use the Cluster Manager CLI to test the NFS filesystems configured in a cluster. This test checks whether the specified NFS filesystem is configured correctly and, in addition, checks whether the volume the NFS filesystem will reside on is configured correctly.

When you are using the Cluster Manager CLI, use the following command to test an NFS filesystem:

    cmgr> test resource A of resource_type NFS on cluster B [on node C node D ...]

The following example tests an NFS filesystem:

    cmgr> test resource /disk4 of resource_type NFS in cluster eagan
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all NFS resources on node cm1 ***
    Success: *** testing node resources on node cm2 ***
    Success: *** testing all NFS resources on node cm2 ***
    Success: overall exit status:success, tests failed:0, total tests executed:2
    cmgr>

Testing statd Resources

You can use the Cluster Manager CLI to test the statd resources configured in a cluster.

When you are using the Cluster Manager CLI, use the following command to test a statd resource:

    cmgr> test resource A of resource_type statd on cluster B [on node C node D ...]

The following example tests a statd resource:

    cmgr> test resource /disk1/statmon of resource_type statd in cluster eagan
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all statd resources on node cm1 ***
    Success: *** testing node resources on node cm2 ***
    Success: *** testing all statd resources on node cm2 ***
    Success: overall exit status:success, tests failed:0, total tests executed:2
    cmgr>

Testing Netscape-web Resources

You can use the Cluster Manager CLI to test the Netscape Web resources configured in a cluster.

When you are using the Cluster Manager CLI, use the following command to test a Netscape-web resource:

    cmgr> test resource A of resource_type Netscape_web on cluster B [on node C node D ...]

The following example tests a Netscape-web resource. In this example, the Netscape-web resource on node cm2 failed the diagnostic test.

    cmgr> test resource nss-enterprise of resource_type Netscape_web in cluster eagan
    Success: *** testing node resources on node cm1 ***
    Success: *** testing all Netscape_web resources on node cm1 ***
    Success: *** testing node resources on node cm2 ***
    Success: *** testing all Netscape_web resources on node cm2 ***
    Warning: resource nss-enterprise has invaild script /var/netscape/suitespot/https-ha85 location
    Warning: /var/netscape/suitespot/https-ha85/config/magnus.conf must contain the "Port" parameter
    Warning: /var/netscape/suitespot/https-ha85/config/magnus.conf must contain the "Address" parameter
    Warning: resource nss-enterprise of type Netscape_web failed
    Success: overall exit status:failed, tests failed:1, total tests executed:2
    Failed to execute FailSafe tests/diagnostics ha test command failed
    cmgr>

Testing Resource Groups

You can use the Cluster Manager CLI to test a resource group. This test cycles through the resource tests for all of the resources defined for a resource group. Resource tests are performed only on nodes in the resource group's application failover domain.

The Cluster Manager CLI uses the following syntax for the commands that test resource groups:

    cmgr> test resource_group A in cluster B [on node C node D ...]

The following example tests a resource group. This example first uses a CLI show command to display the resource groups that have been defined in a cluster.

    cmgr> show resource_groups in cluster eagan
    Resource Groups:
      nfs2
      informix
    cmgr> test resource_group nfs2 in cluster eagan on machine cm1
    Success: *** testing node resources on node cm1 ***
    Success: testing resource /disk4 of resource type NFS on node cm1
    Success: testing resource /disk3 of resource type NFS on node cm1
    Success: testing resource /disk3/statmon of resource type statd on node cm1
    Success: testing resource 128.162.19.45 of resource type IP_address on node cm1
    Success: testing resource /disk4 of resource type filesystem on node cm1
    Success: testing resource /disk3 of resource type filesystem on node cm1
    Success: testing resource dmf1 of resource type volume on node cm1
    Success: testing resource dmfjournals of resource type volume on node cm1
    Success: overall exit status:success, tests failed:0, total tests executed:16
    cmgr>

Testing Failover Policies with the Cluster Manager CLI

You can use the Cluster Manager CLI to test whether a failover policy is defined correctly. This test checks the failover policy by validating the policy script, the failover attributes, and whether the application failover domain consists of valid nodes from the cluster.

The Cluster Manager CLI uses the following syntax for the commands that test a failover policy:

    cmgr> test failover_policy A in cluster B [on node C node D ...]

The following example tests a failover policy. This example first uses a CLI show command to display the failover policies that have been defined in a cluster.

    cmgr> show failover_policies
    Failover Policies:
      reverse
      ordered-in-order
    cmgr> test failover_policy reverse in cluster eagan
    Success: *** testing node resources on node cm1 ***
    Success: testing policy reverse on node cm1
    Success: *** testing node resources on node cm2 ***
    Success: testing policy reverse on node cm2
    Success: overall exit status:success, tests failed:0, total tests executed:2
    cmgr>
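
Every example transcript in this chapter ends with a summary line of the form "overall exit status:..., tests failed:N, total tests executed:M". In the Netscape-web example that line still carries the Success: prefix even though the status is failed, so a script that reads captured output may prefer to parse the fields and ignore the prefix. The following Python sketch does that. The names Summary and parse_summary are hypothetical, and the pattern is inferred only from the examples shown here, not from a documented output format.

```python
import re
from typing import NamedTuple, Optional


class Summary(NamedTuple):
    status: str    # for example "success" or "failed"
    failed: int    # value after "tests failed:"
    executed: int  # value after "total tests executed:"


# Pattern inferred from the example transcripts in this chapter.
_SUMMARY_RE = re.compile(
    r"overall exit status:\s*(\w+),\s*"
    r"tests failed:\s*(\d+),\s*"
    r"total tests executed:\s*(\d+)"
)


def parse_summary(output: str) -> Optional[Summary]:
    """Return the first summary line found in the output, or None."""
    for line in output.splitlines():
        match = _SUMMARY_RE.search(line)
        if match:
            return Summary(match.group(1),
                           int(match.group(2)),
                           int(match.group(3)))
    return None


if __name__ == "__main__":
    line = ("Success: overall exit status:failed, "
            "tests failed:1, total tests executed:2")
    summary = parse_summary(line)
    print(summary.status, summary.failed, summary.executed)
```

A caller would typically treat any status other than success, or a nonzero failed count, as a reason to inspect the full transcript and the diagnostics log before starting HA services.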