To: Eric Sandeen <sandeen@xxxxxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>
Subject: Re: [PATCH] xfs: limit superblock corruption errors to probable corruption
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Thu, 30 Jan 2014 15:26:21 -0500
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <52E88D8B.90208@xxxxxxxxxx>
References: <52E88D8B.90208@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 01/29/2014 12:11 AM, Eric Sandeen wrote:
> Today, if
> 
> xfs_sb_read_verify
>   xfs_sb_verify
>     xfs_mount_validate_sb
> 
> detects superblock corruption, it'll be extremely noisy, dumping
> 2 stacks, 2 hexdumps, etc.
> 
> This is because we call XFS_CORRUPTION_ERROR in xfs_mount_validate_sb
> as well as in xfs_sb_read_verify.
> 
> Also, *any* errors in xfs_mount_validate_sb which are not corruption
> per se (things like too-big-blocksize, bad version, bad magic, v1 dirs,
> rw-incompat, etc., which do not return EFSCORRUPTED) will
> still do the whole XFS_CORRUPTION_ERROR spew when xfs_sb_read_verify
> sees any error at all.  And it suggests to the user that they
> should run xfs_repair, even if the root cause of the mount failure
> is a simple incompatibility.
> 
> I'll submit that the probably-not-corrupted errors don't warrant
> this much noise, so this patch removes the high-level
> XFS_CORRUPTION_ERROR which was firing for every error return
> except EWRONGFS.
> 
> It also adds one to the path which detects a failed checksum.
> 
> The idea is, if it's really _corruption_ we can call
> XFS_CORRUPTION_ERROR at the point of detection.  More benign
> incompatibilities can do a little printk & fail the mount without
> so much drama.
> 
> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> ---
> 
> I could see an argument where we might still want the hexdump
> for things like bad magic - ok, just what *was* the magic?  But
> I think we do need to reserve the oops-mimicking-backtraces for
> the most severe problems.  Discuss.  ;)
> 

This seems pretty reasonable to me, particularly if pretty much any
error via the xfs_sb_verify() path dumps corruption noise...

> diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
> index 511cce9..b575317 100644
> --- a/fs/xfs/xfs_sb.c
> +++ b/fs/xfs/xfs_sb.c
> @@ -617,6 +617,8 @@ xfs_sb_read_verify(
>                       /* Only fail bad secondaries on a known V5 filesystem */
>                       if (bp->b_bn != XFS_SB_DADDR &&
>                           xfs_sb_version_hascrc(&mp->m_sb)) {
> +                             XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> +                                                  mp, bp->b_addr);
>                               error = EFSCORRUPTED;
>                               goto out_error;
>                       }
> @@ -625,12 +627,8 @@ xfs_sb_read_verify(
>       error = xfs_sb_verify(bp, true);
>  
>  out_error:
> -     if (error) {
> -             if (error != EWRONGFS)
> -                     XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> -                                          mp, bp->b_addr);
> +     if (error)
>               xfs_buf_ioerror(bp, error);
> -     }
>  }

... but why not leave the corruption output here in out_error, change
the check to (error == EFSCORRUPTED) and remove the now-duplicate
corruption message in xfs_mount_validate_sb() (or replace it with a
warn/notice message)? This would catch the other EFSCORRUPTED returns in
a consistent manner, including another potential duplicate in the write
verifier. I guess we'd lose a little specificity between the CRC failure
and sb validation, but we could add a warn/notice for the former too.
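
To make the suggestion concrete, here is a rough userspace sketch of the control flow, not kernel code. Every sk_* name, the error values and the struct fields are made up for illustration; they only stand in for what the quoted hunk checks (bp->b_bn, xfs_sb_version_hascrc(), the checksum result and the xfs_sb_verify() return). The point is just that a single report in out_error, gated on EFSCORRUPTED, fires once per real corruption and stays quiet for everything else:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel error codes; the values are arbitrary. */
enum { SK_OK = 0, SK_EWRONGFS = 1, SK_EFSCORRUPTED = 2 };

/* Hypothetical model of the inputs the read verifier looks at. */
struct sk_buf {
	bool is_primary;      /* models bp->b_bn == XFS_SB_DADDR */
	bool known_v5;        /* models xfs_sb_version_hascrc(&mp->m_sb) */
	bool crc_ok;          /* models the checksum verification result */
	int  validate_result; /* what the modelled xfs_sb_verify() returns */
	int  io_error;        /* models the error set by xfs_buf_ioerror() */
};

/* Counts how many times the loud corruption report fired. */
static int sk_corruption_reports;

/* Stand-in for XFS_CORRUPTION_ERROR(): count and print one line. */
static void sk_corruption_error(const char *func)
{
	sk_corruption_reports++;
	fprintf(stderr, "%s: superblock corruption detected\n", func);
}

/* Sketch of the read verifier with one report site in out_error. */
static void sk_sb_read_verify(struct sk_buf *bp)
{
	int error = SK_OK;

	assert(bp != NULL);

	if (!bp->crc_ok) {
		/* Same condition as the quoted hunk: bad secondary, known V5 fs */
		if (!bp->is_primary && bp->known_v5) {
			error = SK_EFSCORRUPTED;
			goto out_error;
		}
	}
	error = bp->validate_result;

out_error:
	if (error) {
		/* Only real corruption gets the noisy report */
		if (error == SK_EFSCORRUPTED)
			sk_corruption_error(__func__);
		bp->io_error = error;
	}
}
```

With this shape, incompatibility-style failures still fail the buffer but would only get whatever warn/notice message is printed at the point of detection.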

Brian

>  
>  /*
> 
