[PATCH 0/7] Configurable error behavior [V2]
Carlos Maiolino
cmaiolino at redhat.com
Wed May 4 04:53:28 CDT 2016
On Wed, May 04, 2016 at 08:59:48AM +1000, Dave Chinner wrote:
> On Tue, May 03, 2016 at 07:55:33PM +0200, Carlos Maiolino wrote:
> > Hi folks,
> >
> > I spoke with Dave and to offload him a bit, I took over his patchset for enable
> > configurable error handlers.
> >
> > Since he did most of the design, I kept his signed-off to the patches.
> >
> > Here are core the changes I did from his last patchset to this one:
> >
> > - Removed fail_speed configuration
> > According with what has been discussed, there is no real need to have a
> > fail_speed configuration, but instead, use the "max_retries" configuration to
> > get for how long the filesystem should retrying, with the only special case for
> > "-1", which will make the filesystem to retry forever.
> >
> > - Fail at unmount is no longer a config option
> > Having a filesystem stuck forever during unmount due errors is not a
> > good thing, so just enforce it if we are trying to unmount a failed
> > filesystem.
>
> I think this is wrong - the option should still remain to let the
> filesystem retry for a long while if desired. We have lazy unmounts
> to deal with this situation (i.e. it won't block an unmount command)
> and there are cases where leaving the unmount retrying in the
> background is useful.
>
> If you are particularly worried about not being able to shut down a
> filesystem that has been unmounted lazily and hence terminate the
> retry forever loop, then we should be looking at ensuring the fs is
> still visible in /sys/fs/xfs/<dev> when it is in this state and
> providing a new shutdown hook through that interface....
>
Right, I will keep this option, although, I don't think it's useful to have this
option configurable with a per-error granularity, a single fail_at_unmount might
suffice.
> > I reduced by now, the amount of patches into this patchset, once, our priority
> > here IMHO is to enable the possibility to shutdown the filesystem when we have
> > metadata errors, and I can work on disabling specific error configs and add
> > memory errors later, I hope that with a reduced amount of patches it can be
> > easier for people to review the core of the error handlers configuration and
> > speed up the inclusion of this patchset.
>
> We'll see - the issue here is that once we settle on a sysfs
> interface, we can't change it easily (it's part of the user facing
> ABI). Hence if we don't consider all the different types of errors
> we want to handle from the start, we may miss something that we
> can't easily fix in future. Hence we at least have to consider the
> different constraints for the different error types now to determine
> if the abstractions and presentation will handle everything we think
> we might need...
>
Agreed, the interface for now looks good IMO, and adding the possibility to hide
some specific error handlers from userspace can be done later without a real
impact in the ABI, I just believe that implementing the interface for errors
that will be visible is more important for now, specifically for EIO and ENOSPC
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david at fromorbit.com
--
Carlos
More information about the xfs
mailing list