[PATCH] xfs: limit superblock corruption errors to probable corruption
Brian Foster
bfoster at redhat.com
Thu Jan 30 14:26:21 CST 2014
On 01/29/2014 12:11 AM, Eric Sandeen wrote:
> Today, if
>
> xfs_sb_read_verify
>   xfs_sb_verify
>     xfs_mount_validate_sb
>
> detects superblock corruption, it'll be extremely noisy, dumping
> 2 stacks, 2 hexdumps, etc.
>
> This is because we call XFS_CORRUPTION_ERROR in xfs_mount_validate_sb
> as well as in xfs_sb_read_verify.
>
> Also, *any* error in xfs_mount_validate_sb which is not corruption
> per se (too-big blocksize, bad version, bad magic, v1 dirs,
> rw-incompat, etc.), i.e. anything which does not return EFSCORRUPTED,
> will still do the whole XFS_CORRUPTION_ERROR spew when
> xfs_sb_read_verify sees any error at all. And the spew suggests to
> the user that they should run xfs_repair, even if the root cause of
> the mount failure is a simple incompatibility.
>
> I'll submit that the probably-not-corrupted errors don't warrant
> this much noise, so this patch removes the high-level
> XFS_CORRUPTION_ERROR which was firing for every error return
> except EWRONGFS.
>
> It also adds an XFS_CORRUPTION_ERROR call to the path which detects
> a failed checksum.
>
> The idea is, if it's really _corruption_ we can call
> XFS_CORRUPTION_ERROR at the point of detection. More benign
> incompatibilities can do a little printk & fail the mount without
> so much drama.
>
> Signed-off-by: Eric Sandeen <sandeen at redhat.com>
> ---
>
> I could see an argument where we might still want the hexdump
> for things like bad magic - ok, just what *was* the magic? But
> I think we do need to reserve the oops-mimicking backtraces for
> the most severe problems. Discuss. ;)
>

This seems pretty reasonable to me, particularly if pretty much any
error via the xfs_sb_verify() path dumps corruption noise...

> diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
> index 511cce9..b575317 100644
> --- a/fs/xfs/xfs_sb.c
> +++ b/fs/xfs/xfs_sb.c
> @@ -617,6 +617,8 @@ xfs_sb_read_verify(
>  		/* Only fail bad secondaries on a known V5 filesystem */
>  		if (bp->b_bn != XFS_SB_DADDR &&
>  		    xfs_sb_version_hascrc(&mp->m_sb)) {
> +			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> +					     mp, bp->b_addr);
>  			error = EFSCORRUPTED;
>  			goto out_error;
>  		}
> @@ -625,12 +627,8 @@ xfs_sb_read_verify(
>  	error = xfs_sb_verify(bp, true);
>
>  out_error:
> -	if (error) {
> -		if (error != EWRONGFS)
> -			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> -					     mp, bp->b_addr);
> +	if (error)
>  		xfs_buf_ioerror(bp, error);
> -	}
>  }

... but why not leave the corruption output here in out_error, change
the check to (error == EFSCORRUPTED) and remove the now duplicate
corruption message in xfs_mount_validate_sb() (or replace it with a
warn/notice message)? This would catch the other EFSCORRUPTED returns in
a consistent manner, including another potential duplicate in the write
verifier. I guess we'd lose a little specificity between the crc failure
and sb validation, but we could add a warn/notice for the former too.
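
For illustration, the alternative described above could look roughly like the
user-space sketch below. It is not kernel code: every mock_* name and MOCK_*
error value is a hypothetical stand-in, the CRC check is reduced to a flag
(the real code only fails bad secondaries on a known V5 filesystem), and the
mock returns the error instead of calling xfs_buf_ioerror(). The only point is
that a single check at out_error fires the noisy report for EFSCORRUPTED and
for nothing else.

```c
#include <stdio.h>

/* Stand-in error codes; the kernel defines its own values for these. */
enum { MOCK_EFSCORRUPTED = 1, MOCK_EWRONGFS = 2, MOCK_EINVAL = 3 };

/* Counts how often the noisy corruption report fires. */
static int corruption_reports;

/* Hypothetical stand-in for XFS_CORRUPTION_ERROR (stack + hexdump). */
static void mock_corruption_error(const char *func)
{
	corruption_reports++;
	fprintf(stderr, "%s: corruption detected\n", func);
}

/*
 * Hypothetical stand-in for xfs_sb_verify(): returns the result it is
 * handed, printing only a short notice for non-corruption failures.
 */
static int mock_sb_verify(int result)
{
	if (result && result != MOCK_EFSCORRUPTED)
		fprintf(stderr, "superblock not mountable (error %d)\n", result);
	return result;
}

/*
 * Mock read verifier. crc_ok models the checksum test and verify_result
 * models what superblock validation would return.
 */
static int mock_sb_read_verify(int crc_ok, int verify_result)
{
	int error;

	if (!crc_ok) {
		error = MOCK_EFSCORRUPTED;
		goto out_error;
	}
	error = mock_sb_verify(verify_result);

out_error:
	/* Single reporting point: only real corruption gets the full spew. */
	if (error == MOCK_EFSCORRUPTED)
		mock_corruption_error(__func__);
	return error;
}
```

With this shape, mock_sb_read_verify(1, MOCK_EWRONGFS) and
mock_sb_read_verify(1, MOCK_EINVAL) fail quietly, while a bad CRC or an
EFSCORRUPTED return from validation each produce exactly one report.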

Brian

>
> /*
>