xfs
[Top] [All Lists]

Re: [PATCH, RFC] xfs: don't break from growfs ag update loop on error

To: Eric Sandeen <sandeen@xxxxxxxxxx>
Subject: Re: [PATCH, RFC] xfs: don't break from growfs ag update loop on error
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Mon, 09 Sep 2013 15:36:18 -0500
Cc: "'linux-xfs@xxxxxxxxxxx'" <linux-xfs@xxxxxxxxxxx>
Delivered-to: linux-xfs@xxxxxxxxxxx
In-reply-to: <520D1AAC.8090701@xxxxxxxxxx>
References: <520D1AAC.8090701@xxxxxxxxxx>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130801 Thunderbird/17.0.8
On 8/15/13 1:15 PM, Eric Sandeen wrote:
> When xfs_growfs_data_private() is updating backup superblocks,
> it bails out on the first error encountered, whether reading or
> writing:

Any thoughts on this one?  W/ the verifiers, we have a higher
chance of encountering an error, and leaving the rest of the
supers un-updated.  Repair will then possibly revert the fs to
it's pre-growfs state, and data loss will ensue...

Thanks,
-Eric

> * If we get an error writing out the alternate superblocks,
> * just issue a warning and continue.  The real work is
> * already done and committed.
> 
> This can cause a problem later during repair, because repair
> looks at all superblocks, and picks the most prevalent one
> as correct.  If we bail out early in the backup superblock
> loop, we can end up with more "bad" matching superblocks than
> good, and a post-growfs repair may revert the filesystem to
> the old geometry.
> 
> With the combination of superblock verifiers and old bugs,
> we're more likely to encounter read errors due to verification.
> 
> And perhaps even worse, we don't even properly write any of the
> newly-added superblocks in the new AGs.
> 
> Even with this change, growfs will still say:
> 
>   xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Structure needs cleaning
>   data blocks changed from 319815680 to 335216640
> 
> which might be confusing to the user, but it at least communicates
> that something has gone wrong, and dmesg will probably highlight
> the need for an xfs_repair.
> 
> And this is still best-effort; if verifiers fail on more than
> half the backup supers, they may still "win" - but that's probably
> best left to repair to more gracefully handle by doing its own
> strict verification as part of the backup super "voting."
> 
> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> ---
> 
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 614eb0c..70714bb 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -153,7 +153,7 @@ xfs_growfs_data_private(
>       xfs_buf_t               *bp;
>       int                     bucket;
>       int                     dpct;
> -     int                     error;
> +     int                     error, saved_error = 0;
>       xfs_agnumber_t          nagcount;
>       xfs_agnumber_t          nagimax = 0;
>       xfs_rfsblock_t          nb, nb_mod;
> @@ -495,29 +495,33 @@ xfs_growfs_data_private(
>                               error = ENOMEM;
>               }
>  
> +             /*
> +              * If we get an error reading or writing alternate superblocks,
> +              * continue.  xfs_repair chooses the "best" superblock based
> +              * on most matches; if we break early, we'll leave more
> +              * superblocks un-updated than updated, and xfs_repair may
> +              * pick them over the properly-updated primary.
> +              */
>               if (error) {
>                       xfs_warn(mp,
>               "error %d reading secondary superblock for ag %d",
>                               error, agno);
> -                     break;
> +                     saved_error = error;
> +                     continue;
>               }
>               xfs_sb_to_disk(XFS_BUF_TO_SBP(bp), &mp->m_sb, XFS_SB_ALL_BITS);
>  
> -             /*
> -              * If we get an error writing out the alternate superblocks,
> -              * just issue a warning and continue.  The real work is
> -              * already done and committed.
> -              */
>               error = xfs_bwrite(bp);
>               xfs_buf_relse(bp);
>               if (error) {
>                       xfs_warn(mp,
>               "write error %d updating secondary superblock for ag %d",
>                               error, agno);
> -                     break; /* no point in continuing */
> +                     saved_error = error;
> +                     continue;
>               }
>       }
> -     return error;
> +     return saved_error ? saved_error : error;
>  
>   error0:
>       xfs_trans_cancel(tp, XFS_TRANS_ABORT);
> 
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
> 

<Prev in Thread] Current Thread [Next in Thread>