
Re: [PATCH] xfs: Document error handlers behavior [V2]

To: xfs@xxxxxxxxxxx
Subject: Re: [PATCH] xfs: Document error handlers behavior [V2]
From: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
Date: Wed, 31 Aug 2016 05:49:33 -0400
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1470734124-65204-1-git-send-email-cmaiolino@xxxxxxxxxx>
Mail-followup-to: xfs@xxxxxxxxxxx
References: <1470734124-65204-1-git-send-email-cmaiolino@xxxxxxxxxx>
User-agent: Mutt/1.5.24 (2015-08-30)
Hi folks, any comments on this?

Cheers

On Tue, Aug 09, 2016 at 05:15:24AM -0400, Carlos Maiolino wrote:
> Document the implementation of error handlers into sysfs.
> 
> Changelog:
> 
> V2:
>       - Add a description of the precedence order of each option, focusing on
>         the behavior of "fail_at_unmount" which was not well explained in V1
> 
> Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
> ---
>  Documentation/filesystems/xfs.txt | 94 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 94 insertions(+)
> 
> diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
> index 8146e9f..d483e0b 100644
> --- a/Documentation/filesystems/xfs.txt
> +++ b/Documentation/filesystems/xfs.txt
> @@ -348,3 +348,97 @@ Removed Sysctls
>    ----                               -------
>    fs.xfs.xfsbufd_centisec    v4.0
>    fs.xfs.age_buffer_centisecs        v4.0
> +
> +Error handling
> +==============
> +
> +XFS can act differently according to the type of error found
> +during its operation. The implementation introduces the following
> +concepts to the error handler:
> +
> + -failure speed:
> +     Defines how fast XFS should shut down when a specific
> +     error is found during filesystem operation. It can shut
> +     down immediately, after a defined number of retries, or
> +     simply retry forever. Retrying forever was the old
> +     behavior and is now the default, except at unmount time:
> +     if an error is found while unmounting, the filesystem
> +     will shut down.
> +
> + -error classes:
> +     Specifies the subsystem or location that the error handlers
> +     configure the behavior for, such as metadata or memory allocation.
> +
> + -error handlers:
> +     Defines the behavior for a specific error.
> +
> +The filesystem behavior during an error can be set via sysfs files, where
> +the errors are organized with the following structure:
> +
> +  /sys/fs/xfs/<dev>/error/<class>/<error>/
> +
> +Each directory contains:
> +
> + /sys/fs/xfs/<dev>/error/
> +
> +     fail_at_unmount         (Min:  0  Default:  1  Max: 1)
> +             Defines the global error behavior at unmount time. If set to
> +             "1", XFS will shut down if any error is found. If set to "0",
> +             the filesystem will retry indefinitely until it can unmount
> +             cleanly.
> +
> +     <class> subdirectories
> +             Contain the configuration for specific error handlers
> +             (Ex: /sys/fs/xfs/<dev>/error/metadata).
> +
> + /sys/fs/xfs/<dev>/error/<class>/
> +
> +     The contents of this directory are <class> specific, since each <class>
> +     might need to handle different types of errors. Every <class> directory,
> +     though, contains a "default" directory, which holds the global
> +     configuration for errors that cannot be configured independently.
> +
> + /sys/fs/xfs/<dev>/error/<class>/<error>
> +
> +     Contains the failure speed configuration files for each specific error,
> +     including the "default" behavior, which contains the same configuration
> +     options as the specific errors.
> +
> +     The available configurations for each error type are:
> +
> +     max_retries                     (Min: -1  Default: -1  Max: INTMAX)
> +             Defines how many times the filesystem is allowed to retry its
> +             operations after the specific error, before shutting down the
> +             filesystem. Setting this file to "-1" makes XFS retry forever
> +             on the specific error, setting it to "0" makes XFS fail
> +             immediately when the specific error is found, and setting it
> +             to a value "N", where N is greater than 0, makes XFS retry
> +             "N" times before shutting down.
> +
> +     retry_timeout_seconds           (Min:  0  Default:  0  Max: INTMAX)
> +             Defines the amount of time (in seconds) that the filesystem is
> +             allowed to retry its operations when the specific error is
> +             found. "0" means no wait time.
> +
> +
> +
> + Order of precedence:
> +             "max_retries" takes precedence over "retry_timeout_seconds":
> +             "retry_timeout_seconds" is only tested if the "max_retries"
> +             limit has not been reached yet or is set to retry forever
> +             ("-1"). If the "max_retries" limit is reached, the filesystem
> +             will shut down, whether or not "retry_timeout_seconds" has
> +             been reached.
> +
> +             "fail_at_unmount", on the other hand, works independently of
> +             the remaining options. It is only tested at unmount time, but
> +             it will shut down the filesystem regardless of the limits set
> +             in "max_retries" or "retry_timeout_seconds".
> +             It has been added because the sysfs configuration can't be
> +             changed after an unmount is triggered, since the sysfs
> +             directory of the filesystem being unmounted is detached from
> +             the sysfs tree. So, even if the sysadmin wants XFS to retry
> +             forever on any error during filesystem operation, the
> +             filesystem can still be properly unmounted if any error is
> +             detected and "fail_at_unmount" is set. Otherwise, the umount
> +             process gets stuck forever.
> -- 
> 2.5.5
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
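
The order of precedence described in the patch can be modeled roughly as
follows. This is only a sketch for reviewing the documented rules, not the
kernel implementation: the function name and its arguments (retries_done,
seconds_elapsed, unmounting) are hypothetical, and it assumes that a
"retry_timeout_seconds" of "0" means the time limit is not enforced, which
is one possible reading of "no wait time".

```python
def should_shutdown(max_retries, retry_timeout_seconds, retries_done,
                    seconds_elapsed, unmounting=False, fail_at_unmount=True):
    """Return True if, under this model, XFS would shut down on an error.

    Hypothetical helper that mirrors the documented precedence only.
    """
    # "fail_at_unmount" works independently of the other limits and is
    # only tested while an unmount is in progress.
    if unmounting and fail_at_unmount:
        return True

    # "max_retries" is tested first: -1 retries forever, 0 fails
    # immediately, N > 0 allows N retries before shutdown.
    if max_retries != -1 and retries_done >= max_retries:
        return True

    # "retry_timeout_seconds" is only consulted when "max_retries" did
    # not already trigger a shutdown.
    # ASSUMPTION: a value of 0 disables the time limit.
    if retry_timeout_seconds > 0 and seconds_elapsed >= retry_timeout_seconds:
        return True

    return False
```

With the defaults from the patch (max_retries = -1, retry_timeout_seconds = 0,
fail_at_unmount = 1), this model never shuts down during normal operation but
always does so when an error is seen during unmount.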

-- 
Carlos
