5.3. Configuring Timeout Values and Monitoring Intervals

When you configure the components of a Linux FailSafe system, you set various timeout values and monitoring intervals that determine the application downtime of a highly available system when a failure occurs. To determine reasonable values for your system, consider the following equation:

application downtime = failure detection + time to handle failure + failure recovery
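As a rough illustration of the equation above, the following sketch adds the three components to estimate total downtime. The function name, parameter names, and sample figures are hypothetical and are not Linux FailSafe settings or measurements; substitute values that reflect your own configuration.

```python
def estimate_downtime(detection_s, handling_s, recovery_s):
    """Return estimated application downtime in seconds.

    Implements: downtime = failure detection
                         + time to handle failure
                         + failure recovery
    All three arguments are durations in seconds.
    """
    components = {
        "detection_s": detection_s,
        "handling_s": handling_s,
        "recovery_s": recovery_s,
    }
    # A negative duration indicates a data-entry mistake.
    for name, value in components.items():
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return detection_s + handling_s + recovery_s


if __name__ == "__main__":
    # Hypothetical figures chosen only to show the arithmetic:
    # 30 s to detect, 5 s to handle, 60 s to recover.
    total = estimate_downtime(detection_s=30, handling_s=5, recovery_s=60)
    print(f"Estimated downtime: {total} seconds")
```

Because the handling term is outside your control, only the detection and recovery inputs are worth varying when you compare candidate settings.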

Failure detection depends on the type of failure that is detected:

Reducing these values shortens the failover time, but it can also significantly increase the overhead that Linux FailSafe places on system performance, and it can lead to false failovers.

The time to handle a failure is something that the user cannot control. In general, this should take a few seconds.

The failure recovery time is determined by the total time it takes for Linux FailSafe to perform the following: