On Mon, Mar 03, 2014 at 12:44:26PM -0500, Brian Foster wrote:
> On Mon, Mar 03, 2014 at 04:39:53PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > While the verifier routines may return EFSBADCRC when a buffer has
> > a bad CRC, we need to translate that to EFSCORRUPTED so that the
> > higher layers treat the error appropriately and so we return a
> > consistent error to userspace. This fixes an xfs/005 regression.
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> This change looks Ok to me, but when I start looking through the users
> of bp->b_error, I see examples like xfs_dir3_data_read() being called in
> xfs_dir2_leaf_addname() where it looks like an error could bubble all
> the way up to xfs_vn_mknod() and its callers.
Which means the patch prevents EFSBADCRC from leaking back out
through that path, because it is converted in xfs_trans_read_buf_map().
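As a rough userspace sketch of that kind of conversion (not the actual
patch): the struct, the function name and the SKETCH_* constants below are
hypothetical stand-ins, and the numeric values are assumptions chosen only
so the example compiles on its own.

```c
/*
 * Illustration only. The SKETCH_* values are assumed stand-ins for the
 * kernel's EFSCORRUPTED and EFSBADCRC definitions.
 */
#define SKETCH_EFSCORRUPTED 117
#define SKETCH_EFSBADCRC     74

/* Hypothetical, minimal buffer: only the error field matters here. */
struct sketch_buf {
    int b_error;    /* positive errno set by I/O completion or a verifier */
};

/*
 * Return the error a caller of the read path should see. A CRC failure is
 * folded into the generic corruption error so higher layers only ever
 * have to handle one "metadata is bad" code. All other errors (and
 * success, 0) pass through unchanged.
 */
static int sketch_read_buf_error(const struct sketch_buf *bp)
{
    int error = bp->b_error;

    if (error == SKETCH_EFSBADCRC)
        error = SKETCH_EFSCORRUPTED;
    return error;
}
```

With this shape, a caller far up the stack never sees the CRC-specific
code, because it has already been rewritten where the buffer is read.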
> If the intent is to use EFSBADCRC as an internal-only error to
> differentiate corruption from crc failure, why not push this more
> closely to the boundaries that we have already defined? For example, we
> already convert positive errnos to negative at the internal/external
> boundaries. Could we convert those to use some kind of
> XFS_USERSPACE_ERROR(error) macro/helper that converts errors
That doesn't solve the problem of needing an error conversion layer in
the first place. The long term goal is to remove the error
conversions in XFS by converting the core code to the same error
passing conventions as the rest of the kernel code. We manage to
screw the negation up fairly regularly because it is convoluted and
we call into generic code that returns negative errors from the core
that returns positive errors in lots of places. The conversion
surface is just too large to manage sanely.
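To make the sign problem concrete, here is a small userspace sketch of
the two conventions meeting. Every function name in it is hypothetical
and none of it is taken from the XFS code; it only illustrates why each
crossing point is a place where a negation can be forgotten.

```c
#include <errno.h>

/* Hypothetical core routine: positive errno on failure, 0 on success. */
static int sketch_core_op(int fail)
{
    return fail ? EIO : 0;
}

/* Hypothetical generic helper: negative errno on failure, 0 on success. */
static int sketch_generic_op(int fail)
{
    return fail ? -ENOMEM : 0;
}

/*
 * Core code that calls generic code has to flip the sign so that it
 * keeps returning positive errors like the rest of the core.
 */
static int sketch_core_calls_generic(int fail)
{
    int error = sketch_generic_op(fail);

    return -error;    /* negative -> positive; 0 stays 0 */
}

/*
 * Entry point at the boundary: everything below it is positive, so the
 * sign is flipped once more on the way out.
 */
static int sketch_entry_point(int core_fail, int generic_fail)
{
    int error = sketch_core_op(core_fail);

    if (!error)
        error = sketch_core_calls_generic(generic_fail);
    return -error;    /* positive -> negative for the caller */
}
```

Dropping either negation still compiles cleanly and only misbehaves on
the failure path, which is the kind of mistake that is easy to miss.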
> Another thought could be to reconsider whether we still need some of
> these extra warnings, as in the xfs_mount.c hunk below, now that we have
> the generic xfs_verifier_error() messaging. E.g., if we could remove
> those, perhaps we could snuff out EFSBADCRC in or around the verifier
> after it makes a distinction.
Redundant errors aren't a significant problem. It's the lack of
meaningful error messages that is much more of an issue. We get
more meaningful error messages as a result of the EFSBADCRC changes
that have been made, but for the moment that error simply means
EFSCORRUPTED to the higher layers. Hence the translation back to
EFSCORRUPTED at the (low) layers where the error no longer has
a distinct meaning.
As we add more functionality, EFSBADCRC will become more meaningful
and so get propagated higher into the kernel code. But for now, it
should remain an error that doesn't escape the lower layers...