On Thu, Feb 19, 2015 at 05:04:19PM -0200, Carlos Maiolino wrote:
> > Well, the switch is simple characterisation. What we do with that
> > error type can be much more complex, and that's why I haven't tried
> > to address those issues here. When we've sorted out what we need
> > and how we are going to configure the error handling, then we can
> > add it.
> > e.g. if we need configurable error handling, it needs to be
> > configurable for different error types, and it needs to be
> > configurable on a per-mount basis. And it needs to be configurable
> > at runtime, not just at mount time. That kind of leads to using
> > sysfs for this. e.g. for each error type we need to handle different
> > behaviour for:
> > $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/type
> > [transient] permanent
> > $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/perm_timeout_seconds
> > 300
> > $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/perm_max_retry_attempts
> > 50
> > $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/transient_fail_at_umount
> > 1
> > And then have generic infrastructure to set it up and handle the
> > buffer errors according to the config?
> > > (I think that's accurately summing up irc-and-side-channel discussions) ;)
> > Pretty much.
> Talking about possible configurable error handlers, what about leaving
> this kind of failure to the sysadmin? Instead of a time- or type-based
> configuration, what about something where the administrator could just
> say "next IO to device X should fail permanently"?
How is this different to just shutting down the filesystem
immediately via 'xfs_io -x -c shutdown /path/to/mnt/pt' ?
Regardless of this, leave failures as transient. Then, when an
error condition occurs (say thinp device ENOSPC), the following will
error out on the next IO that is retried:
# echo permanent > /sys/fs/xfs/vda/meta_write_errors/enospc/type
# echo 0 > /sys/fs/xfs/vda/meta_write_errors/enospc/perm_max_retry_attempts
i.e. it makes the next device ENOSPC IO error shut the filesystem down.
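
To make the knobs above a bit more concrete, here is a rough userspace
model in Python of how the four proposed files might drive the retry
decision. It is only a sketch and not kernel code. The names
(ErrorConfig, on_write_error, Action) are made up for illustration, and
the exact semantics are assumptions, since none of this has been decided
yet: transient errors are assumed to retry forever except at unmount, and
permanent errors are assumed to stop retrying once either the retry count
or the timeout is reached.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETRY = "retry"   # resubmit the failed metadata write
    FAIL = "fail"     # stop retrying (e.g. shut the filesystem down)


@dataclass
class ErrorConfig:
    """Hypothetical mirror of the four proposed sysfs files for one error type."""
    error_type: str = "transient"        # the "type" file: transient or permanent
    perm_timeout_seconds: int = 300
    perm_max_retry_attempts: int = 50
    transient_fail_at_umount: bool = True


def on_write_error(cfg, retries_so_far, seconds_since_first_error,
                   unmounting=False):
    """Decide what to do with a failed metadata write (assumed semantics)."""
    if cfg.error_type == "transient":
        # Assumption: transient errors are retried indefinitely, unless we
        # are unmounting and have been told to fail rather than hang.
        if unmounting and cfg.transient_fail_at_umount:
            return Action.FAIL
        return Action.RETRY

    # Assumption: permanent errors give up once either limit is reached.
    if retries_so_far >= cfg.perm_max_retry_attempts:
        return Action.FAIL
    if seconds_since_first_error >= cfg.perm_timeout_seconds:
        return Action.FAIL
    return Action.RETRY
```

Under these assumptions, setting the type to permanent with
perm_max_retry_attempts at 0 makes the very first call return
Action.FAIL, which matches the two echo commands above.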