
[PATCH V4] xfs: Document error handlers behavior

To: linux-xfs@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
Subject: [PATCH V4] xfs: Document error handlers behavior
From: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
Date: Tue, 13 Sep 2016 05:03:05 -0400
Delivered-to: xfs@xxxxxxxxxxx
Document the sysfs interface for configuring the XFS error handlers.


Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
---
Changelog:

V2:
        - Add a description of the precedence order of each option, focusing on
          the behavior of "fail_at_unmount" which was not well explained in V1

V3:
        - Fix English spelling mistakes suggested by Eric

V4:
        - Fix typos, document the ENODEV default value for max_retries, fix
          the description of the directory hierarchy
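
Testing note: the hierarchy documented in the patch below can be mapped onto
concrete paths with a small helper. This is only a sketch in Python, not part
of the patch. The device name "sda1" and the use of "ENODEV" as an <error>
directory name are assumptions for illustration, and error_knob_path() and
read_knob() are hypothetical helper names.

```python
import posixpath

SYSFS_XFS_ROOT = "/sys/fs/xfs"  # root of the hierarchy described in the patch


def error_knob_path(dev, error_class=None, error=None, knob=None,
                    root=SYSFS_XFS_ROOT):
    """Build <root>/<dev>/error[/<class>[/<error>]][/<knob>]."""
    if error is not None and error_class is None:
        raise ValueError("an <error> directory needs its <class>")
    parts = [root, dev, "error"]
    # Append only the levels the caller asked for, in hierarchy order.
    for part in (error_class, error, knob):
        if part is not None:
            parts.append(part)
    return posixpath.join(*parts)


def read_knob(path):
    """Read one configuration file and return its integer value."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # Hypothetical device and error names; adjust to the mounted filesystem.
    print(error_knob_path("sda1", knob="fail_at_unmount"))
    print(error_knob_path("sda1", "metadata", "ENODEV", "max_retries"))
```

Only the path construction is exercised here; read_knob() needs a mounted XFS
filesystem that exposes these files.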

 Documentation/filesystems/xfs.txt | 75 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
index 8146e9f..374af3b 100644
--- a/Documentation/filesystems/xfs.txt
+++ b/Documentation/filesystems/xfs.txt
@@ -348,3 +348,78 @@ Removed Sysctls
   ----                         -------
   fs.xfs.xfsbufd_centisec      v4.0
   fs.xfs.age_buffer_centisecs  v4.0
+
+Error handling
+==============
+
+XFS can act differently according to the type of error found
+during its operation. The implementation introduces the following
+concepts to the error handler:
+
+ -failure speed:
+       Defines how fast XFS should shut down when a specific error is found
+       during the filesystem operation. It can shut down immediately, after a
+       defined number of retries, after a set time period, or simply retry
+       forever. The old "retry forever" behavior is still the default, except
+       during unmount, where any IOs retrying due to errors will be cancelled
+       and unmount will be allowed to proceed.
+
+ -error classes:
+       Specifies the subsystem/location of the error handlers, such as
+       metadata or memory allocation. Only metadata IO errors are handled
+       at this time.
+
+ -error handlers:
+       Defines the behavior for a specific error.
+
+The filesystem behavior during an error can be set via sysfs files. Each
+configuration option works independently; the first condition met for a
+specific configuration will cause the filesystem to shut down.
+
+The configuration files are organized into the following hierarchy:
+
+  /sys/fs/xfs/<dev>/error/<class>/<error>/
+
+Each directory contains:
+
+ /sys/fs/xfs/<dev>/error/
+
+       fail_at_unmount         (Min:  0  Default:  1  Max: 1)
+               Defines the global error behavior at unmount time. If set to the
+               default value of 1, XFS will cancel any pending IO retries, shut
+               down, and unmount. If set to 0, pending IO retries may prevent
+               the filesystem from unmounting.
+
+       <class> subdirectories
+               Contain the configuration of specific error handlers
+               (Ex: /sys/fs/xfs/<dev>/error/metadata, see below).
+
+ /sys/fs/xfs/<dev>/error/<class>/
+
+       Directory containing configuration for a specific error <class>;
+       currently only the "metadata" <class> is implemented.
+       The contents of this directory are <class> specific, since each <class>
+       might need to handle different types of errors.
+
+ /sys/fs/xfs/<dev>/error/<class>/<error>/
+
+       Contains the failure speed configuration files for specific errors in
+       this <class>, as well as a "default" behavior. Each <error> directory
+       contains the following configuration files:
+
+       max_retries                     (Min: -1  Default: -1  Max: INTMAX)
+               Defines the allowed number of retries of a specific error before
+               the filesystem will shut down.  The default value of "-1" will
+               cause XFS to retry forever for this specific error.  Setting it
+               to "0" will cause XFS to fail immediately when the specific
+               error is found, and setting it to "N", where N is greater
+               than 0, will make XFS retry "N" times before shutting down.
+               The default value for the ENODEV error is "0", since there is
+               no reason to keep retrying if the device is gone.
+
+       retry_timeout_seconds           (Min:  0  Default:  0  Max: 1 day)
+               Defines the amount of time (in seconds) that the filesystem is
+               allowed to retry its operations when the specific error is
+               found. The default value of "0" will cause XFS to retry forever.
+
+       
-- 
2.5.5
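
The way the documented options combine (each one checked independently, the
first limit reached wins) can be modeled with a short Python sketch. It is
derived only from the text in this patch, not from the kernel code.
ErrorConfig and should_shut_down() are hypothetical names, and treating a
limit as reached on equality (>=) is an assumption.

```python
from dataclasses import dataclass

RETRY_FOREVER = -1  # documented max_retries value meaning "no retry limit"


@dataclass
class ErrorConfig:
    max_retries: int = -1           # -1: no limit, 0: fail at once, N: N retries
    retry_timeout_seconds: int = 0  # 0: no time limit


def should_shut_down(cfg, retries_done, seconds_retrying,
                     unmounting=False, fail_at_unmount=True):
    """Return True as soon as any configured limit has been reached."""
    # fail_at_unmount=1 cancels pending retries once unmount is in progress.
    if unmounting and fail_at_unmount:
        return True
    # Retry-count limit: 0 fails on the first error, N allows N retries.
    if cfg.max_retries != RETRY_FOREVER and retries_done >= cfg.max_retries:
        return True
    # Time limit: a value of 0 disables it (assumed inclusive comparison).
    if (cfg.retry_timeout_seconds != 0
            and seconds_retrying >= cfg.retry_timeout_seconds):
        return True
    return False


if __name__ == "__main__":
    enodev = ErrorConfig(max_retries=0)  # documented default for ENODEV
    print(should_shut_down(enodev, retries_done=0, seconds_retrying=0))
    print(should_shut_down(ErrorConfig(), retries_done=1000, seconds_retrying=86400))
```

With the defaults (-1 and 0) the function never requests a shutdown outside
of unmount, which matches the "retry forever" behavior the patch describes.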
