On Tue, Mar 04, 2014 at 09:29:46AM +1100, Dave Chinner wrote:
> On Mon, Mar 03, 2014 at 12:44:26PM -0500, Brian Foster wrote:
> > On Mon, Mar 03, 2014 at 04:39:53PM +1100, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > >
> > > While the verifier reoutines may return EFSBADCRC when a buffer ahs
> > > a bad CRC, we need to translate that to EFSCORRUPTED so that the
> > > higher layers treat the error appropriately and so we return a
> > > consistent error to userspace. This fixes a xfs/005 regression.
> > >
> > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > > ---
> > This change looks Ok to me, but when I start looking through the users
> > of bp->b_error, I see examples like xfs_dir3_data_read() being called in
> > xfs_dir2_leaf_addname() where it looks like an error could bubble all
> > the way up to xfs_vn_mknod() and its callers.
> Sure, but:
> Which means the patch prevents the EFSBADCRC leaking back out
> through that path because it converts it in xfs_trans_read_buf_map.
Ok, I see. FWIW, I was also trolling through some of the log recovery
code and it appears that code uses b_ops for write verification purposes
only (i.e., crc generation I suspect), correct?
> > If the intent is to use EFSBADCRC as an internal-only error to
> > differentiate corruption from crc failure, why not push this more
> > closely to the boundaries that we have already defined? For example, we
> > already convert positive errnos to negative at the internal/external
> > boundaries. Could we convert those to use some kind of
> > XFS_USERSPACE_ERROR(error) macro/helper that converts errors
> > appropriately?
> That doesn't solve the problem needing an error conversion layer in
> the first place. The long term goal is to remove the error
> conversions in XFS by converting the core code to the same error
> passing conventions as the rest of the kernel code. We manage to
> screw the negation up fairly regularly because it is convoluted and
> we cal into generic code that returns negative errors from the core
> that returns positive errors in lots of places. The conversion
> surface is just too large to manage sanely.
Well it wasn't clear that the error conversion layer was a problem that
itself needed solving, but fair enough. ;) If the objective is to move
away from the positive/negative business to something more "kernel
native," then building more infrastructure around that is probably not
the way to go, unless that was actually part of the changeover strategy
(e.g., if we happened to use something that conditionally negated error
values to support incremental codepath conversions... as a semi-random
> > Another thought could be to reconsider whether we still need some of
> > these extra warnings, as in the xfs_mount.c hunk below, now that we have
> > the generic xfs_verifier_error() messaging. E.g., if we could remove
> > those, perhaps we could snub out EFSBADCRC in or around the verifier
> > after it makes a distinction.
> Redundant errors aren't s significant problem. It's the lack of
> meaningful error messages that are much more of an issue. We get
> more meaningful error messages as a result of the EFSBADCRC changes
> that have been made, but for the moment that error simply means
> EFSCORRUPTED to the higher layers. Hence the translation back to
> EFSCORRUPTED at the (low) layers where the error no longer has
> a distinct meaning.
Right, the purpose is not to solve the problem of redundant errors.
Rather, if the only thing preventing the consolidation of the EFSBADCRC
trap to a single hunk are redundant errors, then it wouldn't be a
problem to remove those errors and push the EFSBADCRC trap into the
generic code for the purpose of clarity.
> As we add more functionality, EFSBADCRC will become more meaningful
> and so get propagated higher into the kernel code. But for now, it
> should remain an error that doesn't escape the lower layers...
> Dave Chinner