On Mon, Mar 03, 2014 at 11:34:35AM -0600, Eric Sandeen wrote:
> On 3/2/14, 11:39 PM, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > While the verifier reoutines may return EFSBADCRC when a buffer ahs
> > a bad CRC, we need to translate that to EFSCORRUPTED so that the
> > higher layers treat the error appropriately and so we return a
> > consistent error to userspace. This fixes a xfs/005 regression.
> Can you say a little more about the philosophy here?
> xfs/005 regresses because it expects "structure needs cleaning"
> So if we instead return our (icky) CRC error code, we get something else.
> But it is truly a different root cause.
> So the goal is to NEVER leak EFSBADCRC to userspace? Maybe a comment
> above that error definition would help document that.
Not permanently. At the moment, none of the code handles it
correctly, and the leak to userspace is just a symptom that tells us
we got somethign wrong. We have plenty of places where we check for
EFSCORRUPTED and do something special, but if we get EFSBADCRC
instead it will do the wrong thing....
> And I'm bit worried that we'll leak more in the future if things changed,
> or if things got missed here. Everything you have here looks fine, but
> it's not obvious that every path has been caught; it seems a bit random.
It's not random. It's buffer reads that matter, and I
checked all the calls to xfs_buf_read, xfs_buf_read_map,
xfs_trans_read_buf and xfs_trans_read_buf. There aren't any other
read interfaces that use verifiers, and so nothing else can return
EFSBADCRC. For the log recovery cases, the buffer reads don' use
verifiers, and those that do won't return EFSBADCRC (e.g. inode
> I know we _just_ merged my "differentiator" patches, but I wonder if
> it would be better to add XFS_BSTATE_BADCRC to b_state or some other
> field, and go back to always assigning EFSCORRUPTED. What do you think?
It's just the first layer of adding differentiating support. We've
just put the mechanism in place to do the differentiation because we
need it for *userspace functionality* before we need it for
in-kernel functionality. We put it in the kernel because it has
value to us developers to indicate what type of corruption error was
detected in the dmesg output. We can't however, do everything at
once, so for the moment the kernel code needs to translate it back
to something the higher layers understand and treat correctly.
> When I wrote those I wasn't thinking about keeping it all internal
> to the filesystem.
Only for the moment, until there's code in the kernel that makes it
a meaningfully different error.