xfs
[Top] [All Lists]

Re: [PATCH] xfs: lobotomise xfs_trans_read_buf_map()

To: Mark Tinguely <tinguely@xxxxxxx>
Subject: Re: [PATCH] xfs: lobotomise xfs_trans_read_buf_map()
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 4 Dec 2014 06:54:33 +1100
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <547F19AE.6050808@xxxxxxx>
References: <1417473290-17544-1-git-send-email-david@xxxxxxxxxxxxx> <20141202165930.GA28571@xxxxxxxxxxxxx> <20141202224518.GG18131@dastard> <20141203105122.GA3727@xxxxxxxxxxxxx> <547F19AE.6050808@xxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Dec 03, 2014 at 08:09:50AM -0600, Mark Tinguely wrote:
> On 12/03/14 04:51, Christoph Hellwig wrote:
> >On Wed, Dec 03, 2014 at 09:45:18AM +1100, Dave Chinner wrote:
> >>>Can you fix the inconsistent return for the trylock case in a follow on
> >>>patch?  This difference doesn't look intentional to me, and I would
> >>>be surprised if it's correctly handled in the callers.
> >>
> >>Ok, I'll do an audit and make this common in a follow up patch. Just
> >>to confirm:
> >>
> >>            if (!(flags & XBF_TRYLOCK))
> >>                    return -ENOMEM;
> >>            return -EAGAIN;
> >>
> >>is what you want to see, right?
> >
> >Yes.
> 
> Even ENOMEM / EAGAIN could be wrong if _xfs_buf_find() was given an
> illegal block number - then it would be EFSCORRUPT.

Well, in theory.  Yes, the comment I wrote in xfs_buf_find() at the
time of adding the first check says we should return EFSCORRUPTED,
but that's only indicative of the fact that we've been handed
something that is bad. The code won't do anything different if it
is handed ENOMEM or EFSCORRUPTED - the same error handling will
occur, and the only difference will be a warning in the log.

> I think we need to push the error message from _xfs_buf_find(). I
> played with it once and seemed to have lost it and can do it again
> if no one else has the time.

The check in _xfs_buf_find() is a loud warning because it
indicates we failed to validate a block number at a higher layer.
IOWs, if you see that warning fire in _xfs_buf_find(), we need to
track down where that bad block number came from and why it wasn't
correctly validated before we tried to actually use it. Those are
the bugs we need to fix; reworking the buffer cache API so that
_xfs_buf_find() can return an error won't fix those bugs, nor change
the error handling behaviour of the callers of xfs_trans*buf[_map].

Hence I don't think we need to change _Xfs_buf_find() or the way it
reports errors at all. Put the work into fixing the block number
validation bugs it exposes...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>