[PATCH] xfs: Document error handling behavior

To: xfs@xxxxxxxxxxx
Subject: [PATCH] xfs: Document error handling behavior
From: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
Date: Tue, 19 Jul 2016 12:04:17 +0200
Delivered-to: xfs@xxxxxxxxxxx
This is a first attempt at documenting the error handlers implemented in
sysfs.

Reviews and comments are appreciated. Please also note that I'm not a native
English speaker, so spelling corrections are also appreciated :)

Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
---
 Documentation/filesystems/xfs.txt | 78 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)
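
A minimal sketch, not part of the patch, of how the tunables documented below
could be addressed from a script. It assumes only the
/sys/fs/xfs/<dev>/error/<class>/<error>/ layout described in the documentation
text. The helper names, the "root" parameter and the device name "sda1" are
hypothetical; "metadata" and "default" are the class and error names the text
itself mentions.

```python
from pathlib import Path

# Top-level sysfs directory used by the documented layout.
XFS_SYSFS_ROOT = "/sys/fs/xfs"


def error_knob_path(dev, error_class, error, knob, root=XFS_SYSFS_ROOT):
    """Build <root>/<dev>/error/<class>/<error>/<knob>.

    'root' is a hypothetical parameter so the helper can be pointed at a
    scratch directory instead of the real sysfs tree.
    """
    return Path(root) / dev / "error" / error_class / error / knob


def set_error_knob(dev, error_class, error, knob, value, root=XFS_SYSFS_ROOT):
    """Write 'value' to one configuration file and return its path."""
    path = error_knob_path(dev, error_class, error, knob, root=root)
    path.write_text(f"{value}\n")
    return path
```

For example, set_error_knob("sda1", "metadata", "default", "max_retries", 5)
would target /sys/fs/xfs/sda1/error/metadata/default/max_retries. Writing to
the real sysfs tree normally requires root privileges.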

diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
index 8146e9f..1df868a 100644
--- a/Documentation/filesystems/xfs.txt
+++ b/Documentation/filesystems/xfs.txt
@@ -348,3 +348,81 @@ Removed Sysctls
   ----                         -------
   fs.xfs.xfsbufd_centisec      v4.0
   fs.xfs.age_buffer_centisecs  v4.0
+
+Error handling
+==============
+
+XFS can act differently according to the type of error found
+during its operation. The implementation introduces the following
+concepts to the error handler:
+
+ -failure speed:
+       Defines how fast XFS should shut down when a specific
+       error is found during filesystem operation. It can
+       shut down immediately, after a defined number of retries, or
+       simply retry forever. Retrying forever was the old behavior
+       and is now the default, except at unmount time: if an
+       error is found while unmounting, the filesystem will
+       shut down.
+
+ -error classes:
+       Specifies the subsystem or location that the error handlers
+       configure the behavior for, such as metadata or memory allocation.
+
+ -error handlers:
+       Defines the behavior for a specific error.
+
+The filesystem behavior on error can be set via sysfs files, where the
+errors are organized in the following structure:
+
+  /sys/fs/xfs/<dev>/error/<class>/<error>/
+
+Each directory contains:
+
+ /sys/fs/xfs/<dev>/error/
+
+       fail_at_unmount         (Min:  0  Default:  1  Max: 1)
+               Defines the global error behavior at unmount time. If set to
+               "1", XFS will shut down if any error is found. If set to "0",
+               the filesystem will retry indefinitely to unmount
+               cleanly.
+
+       <class> subdirectories
+               Contain the configuration of the error handlers for that class
+               (Ex: /sys/fs/xfs/<dev>/error/metadata).
+
+ /sys/fs/xfs/<dev>/error/<class>/
+
+       The contents of this directory are <class> specific, since each <class>
+       might need to handle different types of errors. Every <class> directory,
+       though, contains the "default" directory, which holds the configuration
+       for errors not available for independent configuration.
+
+ /sys/fs/xfs/<dev>/error/<class>/<error>
+
+       Contains the failure speed configuration files for each specific error,
+       including the "default" behavior, which contains the same configuration
+       options as the specific errors.
+
+       The available configurations for each error type are:
+
+       max_retries                     (Min: -1  Default: -1  Max: INTMAX)
+               Defines how many times the filesystem is allowed to retry its
+               operations on the specific error before shutting down the
+               filesystem. Setting this file to "-1" makes XFS retry
+               forever on the specific error. Setting it to "0" makes
+               XFS fail immediately when the specific error is found.
+               Setting it to a value "N", where N is greater than 0,
+               makes XFS retry "N" times before shutting down.
+
+       retry_timeout_seconds           (Min:  0  Default:  0  Max: INTMAX)
+               Defines the amount of time (in seconds) that the filesystem is
+               allowed to retry its operations when the specific error is
+               found. "0" means no wait time.
+
+
+       "max_retries" takes precedence over "retry_timeout_seconds":
+       "retry_timeout_seconds" is only tested if the "max_retries" limit
+       has not been reached yet or is set to retry forever ("-1"). If the
+       "max_retries" limit is reached, the filesystem will shut down, whether
+       or not "retry_timeout_seconds" has been reached.
-- 
2.7.4
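
The precedence rule at the end of the documentation text can be modeled in a
few lines. This is only a sketch of the rule as documented, not the kernel
implementation. The function and parameter names are hypothetical, and it
assumes that a "retry_timeout_seconds" of "0" disables the time limit; the
text's "no wait time" could be read differently.

```python
RETRY_FOREVER = -1  # documented "max_retries" value meaning "retry forever"


def should_shut_down(retries_done, elapsed_seconds,
                     max_retries=RETRY_FOREVER, retry_timeout_seconds=0):
    """Return True when the modeled handler would stop retrying.

    retries_done:    retries already performed for this error
    elapsed_seconds: time spent retrying so far
    """
    # max_retries is checked first: once its limit is reached, the
    # timeout is never consulted.
    if max_retries != RETRY_FOREVER and retries_done >= max_retries:
        return True
    # Assumption: a timeout of 0 means "no time limit".
    if retry_timeout_seconds > 0 and elapsed_seconds > retry_timeout_seconds:
        return True
    return False
```

With the documented defaults ("-1" and "0") the function never returns True,
which matches the "retry forever" default behavior. With max_retries=0 it
returns True on the first error.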
