
Re: [PATCH] xfs: Introduce permanent async buffer write IO failures

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs: Introduce permanent async buffer write IO failures
From: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
Date: Thu, 19 Feb 2015 17:04:19 -0200
Cc: Eric Sandeen <sandeen@xxxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150218235220.GQ4251@dastard>
Mail-followup-to: Dave Chinner <david@xxxxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>, xfs@xxxxxxxxxxx
References: <1424298740-25821-1-git-send-email-david@xxxxxxxxxxxxx> <54E51CC7.8040709@xxxxxxxxxxx> <20150218235220.GQ4251@dastard>
User-agent: Mutt/1.5.23 (2014-03-12)
> 
> Well, the switch is simple characterisation. What we do with that
> error type can be much more complex, and that's why I haven't tried
> to address those issues here. When we've sorted out what we need
> and how we are going to configure the error handling, then we can
> add it.
> 
> e.g. if we need configurable error handling, it needs to be
> configurable for different error types, and it needs to be
> configurable on a per-mount basis. And it needs to be configurable
> at runtime, not just at mount time. That kind of leads to using
> sysfs for this. e.g. for each error type we need to handle different
> behaviour for:
> 
> $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/type
> [transient] permanent
> $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/perm_timeout_seconds
> 300
> $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/perm_max_retry_attempts
> 50
> $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/transient_fail_at_umount
> 1
> 
> And then have generic infrastructure to set it up and handle the
> buffer errors according to the config?
> 
> > (I think that's accurately summing up irc-and-side-channel discussions) ;)
> 
> Pretty much.
> 
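
To make the quoted knobs concrete, here is a minimal sketch of how a per-mount,
per-error-type policy could drive the retry decision. Only the four knob names
come from the sysfs example above; the struct names, the should_give_up()
helper, and the exact meaning given to each knob are assumptions made for
illustration, not anything taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical error classes, mirroring the quoted "type" knob. */
enum err_class { ERR_TRANSIENT, ERR_PERMANENT };

/* Hypothetical policy: one instance per mount and per error type. */
struct err_policy {
	enum err_class type;
	unsigned int perm_timeout_seconds;
	unsigned int perm_max_retry_attempts;
	bool transient_fail_at_umount;
};

/* Hypothetical state tracked for one failing metadata buffer write. */
struct write_state {
	unsigned int retries;
	unsigned int elapsed_seconds;
	bool unmounting;
};

/*
 * Assumed semantics: a "permanent" error stops being retried once either
 * the retry budget or the timeout is used up; a "transient" error is
 * retried indefinitely unless the filesystem is unmounting and
 * transient_fail_at_umount is set.
 */
static bool should_give_up(const struct err_policy *p,
			   const struct write_state *s)
{
	if (p->type == ERR_PERMANENT)
		return s->retries >= p->perm_max_retry_attempts ||
		       s->elapsed_seconds >= p->perm_timeout_seconds;
	return s->unmounting && p->transient_fail_at_umount;
}
```

With the quoted values (timeout 300, max retries 50), a write classified as
permanent would be failed after the 50th retry or after 300 seconds, whichever
comes first, under these assumed semantics.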

Talking about possible configurable error handlers, what about leaving the
choice of failure to the sysadmin? Instead of a time- or type-based
configuration, what about something where the administrator could just say
"the next IO to device X should fail permanently"?
I know it would not be automatic, but it adds some flexibility to the
current behavior.
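
A rough sketch of what I mean, assuming a one-shot flag per device that the
administrator sets and the error path consumes. All names here
(dev_err_override, admin_request_permanent_failure, consume_override) are
hypothetical, and real kernel code would need an atomic test-and-clear instead
of the plain bool used below.

```c
#include <stdbool.h>

/* Hypothetical per-device override, set on request by the administrator. */
struct dev_err_override {
	bool fail_next_io;
};

/* Administrator side: arm the override for the next failing IO. */
static void admin_request_permanent_failure(struct dev_err_override *o)
{
	o->fail_next_io = true;
}

/*
 * Error-path side: returns true if this IO error should be treated as
 * permanent. The flag is cleared on use, so it affects exactly one IO.
 * Not safe against concurrent callers as written.
 */
static bool consume_override(struct dev_err_override *o)
{
	bool armed = o->fail_next_io;

	o->fail_next_io = false;
	return armed;
}
```

Since the flag clears itself, later errors on the same device fall back to
whatever the default handling is.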

Anyway, just my 2 cents.

-- 
Carlos
