Testing Linux FailSafe Configuration
This chapter explains how to test the Linux FailSafe system configuration
using the Cluster Manager GUI and the Cluster Manager CLI. For general
information on using the Cluster Manager GUI and the Cluster Manager CLI,
see .
The sections in this chapter are as follows:
Overview of FailSafe Diagnostic Commands
The following table shows the tests you can perform with Linux FailSafe diagnostic commands:
FailSafe Diagnostic Test Summary
Diagnostic Test         Checks Performed
resource                Checks that the resource type parameters are set
                        Checks that the parameters are syntactically correct
                        Validates that the parameters exist
resource group          Tests all resources defined in the resource group
failover policy         Checks that the failover policy exists
                        Checks that the failover domain contains a valid list of hosts
network connectivity    Checks that the control interfaces are on the same network
                        Checks that the nodes can communicate with each other
serial connection       Checks that the nodes can reset each other
All transactions are logged to the diagnostics file diags_nodename
in the log directory.
You should test resource groups before starting FailSafe HA services
or starting a resource group. These tests are designed to check for resource
inconsistencies which could prevent the resource group from starting successfully.
Performing Diagnostic Tasks with the Cluster Manager GUI
To test the components of a FailSafe system using the Cluster Manager
GUI, perform the following steps:
1. Select Task Manager on the FailSafe Toolchest.
2. On the left side of the display, click on the “Diagnostics”
   category.
3. Select one of the diagnostics tasks that appear on the
   right side of the display: “Test Connectivity,” “Test
   Resources,” or “Test Failover Policy.”
Testing Connectivity with the Cluster Manager GUI
When you select the “Test Connectivity” task from
the Diagnostics display, you can test the network and serial connections
on the nodes in your cluster by entering the requested inputs. You can
test all of the nodes in the cluster at one time, or you can specify an
individual node to test.
Testing Resources with the Cluster Manager GUI
When you select the “Test Resources” task from the Diagnostics
display, you can test the resources on the nodes in your cluster by entering
the requested inputs. You can test resources by type and by group. You
can test the resources of a resource type or in a resource group on all
of the nodes in the cluster at one time, or you can specify an individual
node to test. Resource tests are performed only on nodes in the resource
group's application failover domain.
Testing Failover Policies with the Cluster Manager GUI
When you select the “Test
Failover Policy” task from the Diagnostics display, you can test
whether a failover policy is defined correctly. This test checks the failover
policy by validating the policy script, failover attributes, and whether
the application failover domain consists of valid nodes from the cluster.
Performing Diagnostic Tasks with the Cluster Manager CLI
The following subsections describe how to perform diagnostic tasks
on your system using the Cluster Manager CLI commands.
Testing the Serial Connections with the Cluster Manager CLI
You can use the Cluster Manager
CLI to test the serial connections between the Linux FailSafe nodes. This
test pings each specified node through the serial line and produces an
error message if the ping is not successful. Do not execute this command
while FailSafe is running.
When you are using the Cluster Manager CLI, use the following command
to test the serial connections for the machines in a cluster:
cmgr> test serial in cluster A [on node B node C ...]
This test yields an error message when it encounters its first error,
indicating the node that did not respond. If you receive an error message
after executing this test, verify that the serial cable is properly
connected from the indicated node's serial port to the remote power control
unit or the system controller port of the other nodes, and then run the
test again.
The following shows an example of the test serial
CLI command:
# cluster_mgr
Welcome to Linux FailSafe Cluster Manager Command-Line Interface
cmgr> test serial in cluster eagan on node cm1
Success: testing serial...
Success: Ensuring Node Can Get IP Addresses For All Specified Hosts
Success: Number of IP addresses obtained for <cm1> = 1
Success: The first IP address for <cm1> = 128.162.19.34
Success: Checking serial lines via crsd (crsd is running)
Success: Successfully checked serial line
Success: Serial Line OK
Success: overall exit status:success, tests failed:0, total tests executed:1
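Each diagnostic run shown in this chapter ends with a summary line that reports the overall exit status, the number of failed tests, and the total number of tests executed. If you save the output of a cmgr session to a file, a short script can extract those fields. The following Python sketch assumes only the summary format that appears in the examples in this chapter; the function name parse_summary is illustrative and is not part of Linux FailSafe.

```python
import re

# Pattern for the summary line shown in the examples. \s+ between
# "tests" and "executed" tolerates a summary that wraps across lines.
SUMMARY_RE = re.compile(
    r"overall exit status:\s*(\w+),\s*tests failed:\s*(\d+),"
    r"\s*total tests\s+executed:\s*(\d+)"
)


def parse_summary(output):
    """Return the summary fields from captured diagnostic output.

    Returns None when no summary line is present, for example when
    the test command refused to run.
    """
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    status, failed, total = match.groups()
    return {"status": status, "failed": int(failed), "total": int(total)}


if __name__ == "__main__":
    sample = ("Success: overall exit status:success, "
              "tests failed:0, total tests executed:1")
    print(parse_summary(sample))
```

A result of None is worth treating as a failure in its own right, because output with no summary line means the tests did not run to completion.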
The following shows an example of an attempt to run the
test serial CLI command while FailSafe is running (causing the
command to fail to execute):
cmgr> test serial in cluster eagan on node cm1
Error: Cannot run the serial tests, diagnostics has detected FailSafe (ha_cmsd) is running
Failed to execute FailSafe tests/diagnostics ha
test command failed
cmgr>
Testing Network Connectivity with the Cluster Manager CLI
You can use the Cluster
Manager CLI to test the network connectivity in a cluster. This test checks
if the specified nodes can communicate with each other through each configured
interface in the nodes. This test will not run if FailSafe is running.
When you are using the Cluster Manager CLI, use the following command
to test the network connectivity for the machines in a cluster:
cmgr> test connectivity in cluster A [on node B node C ...]
The following shows an example of the test connectivity
CLI command:
cmgr> test connectivity in cluster eagan on node cm1
Success: testing connectivity...
Success: checking that the control IP_addresses are on the same networks
Success: pinging address cm1-priv interface ef0 from host cm1
Success: pinging address cm1 interface ef1 from host cm1
Success: overall exit status:success, tests failed:0, total tests executed:1
This test yields an error message when it encounters its first error,
indicating the node that did not respond. If you receive an error message
after executing this test, verify that the network interface has been
configured up, using the ifconfig command, for example:
# /usr/etc/ifconfig ec3
ec3: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,FILTMULTI,MULTICAST>
inet 190.0.3.1 netmask 0xffffff00 broadcast 190.0.3.255
The UP in the first line of output indicates that the interface
is configured up.
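When you check several interfaces, reading the flags by eye becomes tedious. The following Python sketch looks for UP in the flags field of captured ifconfig output. It assumes the output format shown in the example above; other ifconfig versions may print flags differently. The function name interface_is_up is hypothetical.

```python
import re


def interface_is_up(ifconfig_output):
    """Return True if the flags field of ifconfig output lists UP.

    Expects a first line such as:
        ec3: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,...>
    """
    match = re.search(r"flags=\w+<([^>]*)>", ifconfig_output)
    if match is None:
        # No recognizable flags field; treat the interface as not up.
        return False
    return "UP" in match.group(1).split(",")


if __name__ == "__main__":
    sample = (
        "ec3: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,FILTMULTI,MULTICAST>\n"
        "        inet 190.0.3.1 netmask 0xffffff00 broadcast 190.0.3.255\n"
    )
    print(interface_is_up(sample))
```

Splitting the flag list on commas, rather than searching for the substring UP, avoids a false match on any flag name that merely contains those letters.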
If the network interface is configured up, verify that the network
cables are connected properly and run the test again.
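The first check in the connectivity test verifies that the control IP addresses are on the same networks. You can approximate that check by hand from the address and netmask that ifconfig reports. The following Python sketch uses the standard ipaddress module; it illustrates the idea and is not the logic FailSafe itself uses. The address 190.0.3.1 and the netmask 0xffffff00 come from the ifconfig example above; the other addresses are hypothetical.

```python
import ipaddress


def netmask_from_hex(hex_mask):
    """Convert a hexadecimal netmask such as '0xffffff00' to dotted form."""
    return str(ipaddress.IPv4Address(int(hex_mask, 16)))


def same_network(addr_a, addr_b, hex_mask):
    """Return True if both addresses fall in the same IPv4 network."""
    mask = netmask_from_hex(hex_mask)
    # strict=False lets us pass a host address and get its network back.
    net_a = ipaddress.ip_network(f"{addr_a}/{mask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{mask}", strict=False)
    return net_a == net_b


if __name__ == "__main__":
    # 190.0.3.2 and 190.0.4.1 are made-up peer addresses for illustration.
    print(same_network("190.0.3.1", "190.0.3.2", "0xffffff00"))
    print(same_network("190.0.3.1", "190.0.4.1", "0xffffff00"))
```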
Testing Resources with the Cluster Manager CLI
You can use the Cluster Manager CLI to test any configured resource
by resource name or by resource type.
The Cluster Manager CLI uses the following syntax to test a resource
by name:
cmgr> test resource A of resource_type B in cluster C [on node D node E ...]
The following shows an example of testing a resource by name:
cmgr> test resource /disk1 of resource_type filesystem in cluster eagan on machine cm1
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: testing resource /disk1 of resource type filesystem on node cm1
Success: overall exit status:success, tests failed:0, total tests executed:1
The Cluster Manager CLI uses the following syntax to test a resource
by resource type:
cmgr> test resource_type A in cluster B [on node C node D ...]
The following shows an example of testing resources by resource
type:
cmgr> test resource_type filesystem in cluster eagan on machine cm1
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: testing resource /disk4 of resource type filesystem on node cm1
Success: testing resource /disk5 of resource type filesystem on node cm1
Success: testing resource /disk2 of resource type filesystem on node cm1
Success: testing resource /disk3 of resource type filesystem on node cm1
Success: testing resource /disk1 of resource type filesystem on node cm1
Success: overall exit status:success, tests failed:0, total tests executed:5
You can use the CLI to test volume and filesystem resources in
destructive mode. This provides a more thorough test of filesystems
and volumes. CLI tests will not run in destructive mode if FailSafe is
running.
The Cluster Manager CLI uses the following syntax for the commands
that test resources in destructive mode:
cmgr> test resource A of resource_type B in cluster C [on node D node E ...] destructive
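If you generate test commands from a list of resources, it helps to build the command text in one place so that the optional node list and the destructive keyword always land in the right order. The following Python sketch assembles a command string that follows the syntax above. It only produces text for you to type or paste at the cmgr> prompt; it does not run anything. The function name build_resource_test is hypothetical.

```python
def build_resource_test(resource, resource_type, cluster,
                        nodes=(), destructive=False):
    """Build the text of a cmgr 'test resource' command.

    nodes is an optional sequence of node names; each one is emitted
    as 'node NAME' after a single 'on', as in the syntax above.
    """
    parts = ["test", "resource", resource,
             "of", "resource_type", resource_type,
             "in", "cluster", cluster]
    if nodes:
        parts.append("on")
        for node in nodes:
            parts.extend(["node", node])
    if destructive:
        # Destructive tests do not run while FailSafe is running.
        parts.append("destructive")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_resource_test("/disk4", "filesystem", "eagan",
                              nodes=["cm1"], destructive=True))
```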
The following sections describe the diagnostic tests available for
resources.
Testing Logical Volumes
You can use the Cluster Manager CLI to
test the logical volumes in a cluster. This test checks if the specified
volume is configured correctly.
When you are using the Cluster Manager CLI, use the following command
to test a logical volume:
cmgr> test resource A of resource_type volume on cluster B [on node C node D ...]
The following example tests a logical volume:
cmgr> test resource alternate of resource_type volume on cluster eagan
Success: *** testing node resources on node cm1 ***
Success: *** testing all volume resources on node cm1 ***
Success: running resource type volume tests on node cm1
Success: *** testing node resources on node cm2 ***
Success: *** testing all volume resources on node cm2 ***
Success: running resource type volume tests on node cm2
Success: overall exit status:success, tests failed:0, total tests executed:2
cmgr>
The following example tests a logical volume in destructive mode:
cmgr> test resource alternate of resource_type volume on cluster eagan destructive
Warning: executing the tests in destructive mode
Success: *** testing node resources on node cm1 ***
Success: *** testing all volume resources on node cm1 ***
Success: running resource type volume tests on node cm1
Success: successfully assembled volume: alternate
Success: *** testing node resources on node cm2 ***
Success: *** testing all volume resources on node cm2 ***
Success: running resource type volume tests on node cm2
Success: successfully assembled volume: alternate
Success: overall exit status:success, tests failed:0, total tests executed:2
cmgr>
Testing Filesystems
You can use the Cluster Manager CLI to test the filesystems configured
in a cluster. This test checks if the specified filesystem is configured
correctly and, in addition, checks whether the volume the filesystem will
reside on is configured correctly.
When you are using the Cluster Manager CLI, use the following command
to test a filesystem:
cmgr> test resource A of resource_type filesystem in cluster B [on node C node D ...]
The following example tests a filesystem. This example first uses
a CLI show command to display the filesystems that
have been defined in a cluster.
cmgr> show resources of resource_type filesystem in cluster eagan
/disk4 type filesystem
/disk5 type filesystem
/disk2 type filesystem
/disk3 type filesystem
/disk1 type filesystem
cmgr> test resource /disk4 of resource_type filesystem in cluster eagan on node cm1
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: successfully mounted filesystem: /disk4
Success: overall exit status:success, tests failed:0, total tests executed:1
cmgr>
The following example tests a filesystem in destructive mode:
cmgr> test resource /disk4 of resource_type filesystem in cluster eagan on node cm1 destructive
Warning: executing the tests in destructive mode
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: successfully mounted filesystem: /disk4
Success: overall exit status:success, tests failed:0, total tests executed:1
cmgr>
Testing NFS Filesystems
You can use the Cluster Manager CLI to test the NFS filesystems
configured in a cluster. This test checks if the specified NFS filesystem
is configured correctly and, in addition, checks whether the volume the
NFS filesystem will reside on is configured correctly.
When you are using the Cluster Manager CLI, use the following command
to test an NFS filesystem:
cmgr> test resource A of resource_type NFS in cluster B [on node C node D ...]
The following example tests an NFS filesystem:
cmgr> test resource /disk4 of resource_type NFS in cluster eagan
Success: *** testing node resources on node cm1 ***
Success: *** testing all NFS resources on node cm1 ***
Success: *** testing node resources on node cm2 ***
Success: *** testing all NFS resources on node cm2 ***
Success: overall exit status:success, tests failed:0, total tests executed:2
cmgr>
Testing statd Resources
You can use the Cluster Manager CLI to test the statd resources
configured in a cluster. When you are using the Cluster Manager CLI, use
the following command to test a statd resource:
cmgr> test resource A of resource_type statd in cluster B [on node C node D ...]
The following example tests a statd resource:
cmgr> test resource /disk1/statmon of resource_type statd in cluster eagan
Success: *** testing node resources on node cm1 ***
Success: *** testing all statd resources on node cm1 ***
Success: *** testing node resources on node cm2 ***
Success: *** testing all statd resources on node cm2 ***
Success: overall exit status:success, tests failed:0, total tests executed:2
cmgr>
Testing Netscape-web Resources
You can use the Cluster Manager CLI to test the Netscape Web resources
configured in a cluster.
When you are using the Cluster Manager CLI, use the following command
to test a Netscape-web resource:
cmgr> test resource A of resource_type Netscape_web in cluster B [on node C node D ...]
The following example tests a Netscape-web resource. In this example,
the Netscape-web resource on node cm2 failed the diagnostic test.
cmgr> test resource nss-enterprise of resource_type Netscape_web in cluster eagan
Success: *** testing node resources on node cm1 ***
Success: *** testing all Netscape_web resources on node cm1 ***
Success: *** testing node resources on node cm2 ***
Success: *** testing all Netscape_web resources on node cm2 ***
Warning: resource nss-enterprise has invaild script /var/netscape/suitespot/https-ha85 location
Warning: /var/netscape/suitespot/https-ha85/config/magnus.conf must contain the "Port" parameter
Warning: /var/netscape/suitespot/https-ha85/config/magnus.conf must contain the "Address" parameter
Warning: resource nss-enterprise of type Netscape_web failed
Success: overall exit status:failed, tests failed:1, total tests executed:2
Failed to execute FailSafe tests/diagnostics ha
test command failed
cmgr>
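In the failed run above, the summary line still begins with Success: even though it reports an exit status of failed, so the line prefix alone does not tell you whether a run passed. The following Python sketch counts output lines by prefix and collects the warnings, so that the diagnostic detail is easy to review next to the summary. It assumes the Success:, Warning:, and Error: prefixes seen in the examples in this chapter; the function names are hypothetical.

```python
from collections import Counter

# Prefixes that appear in the example output in this chapter.
KNOWN_PREFIXES = ("Success", "Warning", "Error")


def count_by_prefix(output):
    """Count diagnostic output lines by their leading prefix."""
    counts = Counter()
    for line in output.splitlines():
        prefix, sep, _rest = line.partition(":")
        if sep and prefix in KNOWN_PREFIXES:
            counts[prefix] += 1
    return counts


def warning_lines(output):
    """Return the Warning: lines, in the order they were printed."""
    return [line for line in output.splitlines()
            if line.startswith("Warning:")]


if __name__ == "__main__":
    sample = (
        "Success: *** testing all Netscape_web resources on node cm2 ***\n"
        "Warning: resource nss-enterprise of type Netscape_web failed\n"
        "Success: overall exit status:failed, tests failed:1, "
        "total tests executed:2\n"
    )
    print(count_by_prefix(sample))
    print(warning_lines(sample))
```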
Testing Resource Groups
You can use the Cluster Manager CLI to test a resource group. This
test cycles through the resource tests for all of the resources defined
for a resource group. Resource tests are performed only on nodes in the
resource group's application failover domain.
The Cluster Manager CLI uses the following syntax for the commands
that test resource groups:
cmgr> test resource_group A in cluster B [on node C node D ...]
The following example tests a resource group. This example first
uses a CLI show command to display the resource groups
that have been defined in a cluster.
cmgr> show resource_groups in cluster eagan
Resource Groups:
nfs2
informix
cmgr> test resource_group nfs2 in cluster eagan on machine cm1
Success: *** testing node resources on node cm1 ***
Success: testing resource /disk4 of resource type NFS on node cm1
Success: testing resource /disk3 of resource type NFS on node cm1
Success: testing resource /disk3/statmon of resource type statd on node cm1
Success: testing resource 128.162.19.45 of resource type IP_address on node cm1
Success: testing resource /disk4 of resource type filesystem on node cm1
Success: testing resource /disk3 of resource type filesystem on node cm1
Success: testing resource dmf1 of resource type volume on node cm1
Success: testing resource dmfjournals of resource type volume on node cm1
Success: overall exit status:success, tests failed:0, total tests executed:16
cmgr>
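A resource group test prints one line per resource, which can be long for a large group. To see at a glance which resource types were covered, you can group the tested resources by type. The following Python sketch parses the "testing resource ... of resource type ... on node ..." lines shown in the example above. It assumes that line format and is not part of Linux FailSafe; the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Success: testing resource /disk4 of resource type NFS on node cm1
RESOURCE_RE = re.compile(
    r"testing resource (\S+) of resource type (\S+) on node (\S+)"
)


def resources_by_type(output):
    """Map each resource type to the list of resources tested for it."""
    grouped = defaultdict(list)
    for line in output.splitlines():
        match = RESOURCE_RE.search(line)
        if match:
            resource, resource_type, _node = match.groups()
            grouped[resource_type].append(resource)
    return dict(grouped)


if __name__ == "__main__":
    sample = (
        "Success: testing resource /disk4 of resource type NFS on node cm1\n"
        "Success: testing resource /disk3/statmon of resource type statd on node cm1\n"
        "Success: testing resource dmf1 of resource type volume on node cm1\n"
    )
    for rtype, names in resources_by_type(sample).items():
        print(rtype, names)
```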
Testing Failover Policies with the Cluster Manager CLI
You can use the Cluster Manager CLI to test whether a failover policy
is defined correctly. This test checks the failover policy by validating
the policy script, failover attributes, and whether the application failover
domain consists of valid nodes from the cluster.
The Cluster Manager CLI uses the following syntax for the commands
that test a failover policy:
cmgr> test failover_policy A in cluster B [on node C node D ...]
The following example tests a failover policy. This example first
uses a CLI show command to display the failover policies
that have been defined in a cluster.
cmgr> show failover_policies
Failover Policies:
reverse
ordered-in-order
cmgr> test failover_policy reverse in cluster eagan
Success: *** testing node resources on node cm1 ***
Success: testing policy reverse on node cm1
Success: *** testing node resources on node cm2 ***
Success: testing policy reverse on node cm2
Success: overall exit status:success, tests failed:0, total tests executed:2
cmgr>
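One of the checks the failover policy test performs is that the application failover domain consists of valid nodes from the cluster. The following Python sketch shows that idea as a simple membership check. It is an illustration under stated assumptions, not the FailSafe implementation: the function name and the shape of the inputs (plain lists of node names) are hypothetical.

```python
def check_failover_domain(failover_domain, cluster_nodes):
    """Return a list of problems found in a failover domain.

    failover_domain: ordered list of node names in the domain.
    cluster_nodes:   node names defined in the cluster.
    An empty result means no problems were found by this check.
    """
    problems = []
    if not failover_domain:
        problems.append("failover domain is empty")
    known = set(cluster_nodes)
    for node in failover_domain:
        if node not in known:
            problems.append(f"node {node} is not defined in the cluster")
    return problems


if __name__ == "__main__":
    # cm1 and cm2 appear in the examples; cm3 is a made-up node name.
    print(check_failover_domain(["cm1", "cm2"], ["cm1", "cm2"]))
    print(check_failover_domain(["cm1", "cm3"], ["cm1", "cm2"]))
```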