xfs
[Top] [All Lists]

Re: [PATCH, RFC] xfs: don't break from growfs ag update loop on error

To: Mark Tinguely <tinguely@xxxxxxx>
Subject: Re: [PATCH, RFC] xfs: don't break from growfs ag update loop on error
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Tue, 10 Sep 2013 10:23:35 -0500
Cc: Eric Sandeen <sandeen@xxxxxxxxxx>, "'linux-xfs@xxxxxxxxxxx'" <linux-xfs@xxxxxxxxxxx>
Delivered-to: linux-xfs@xxxxxxxxxxx
In-reply-to: <522F390D.20809@xxxxxxx>
References: <520D1AAC.8090701@xxxxxxxxxx> <522E3142.7090501@xxxxxxxxxxx> <522F390D.20809@xxxxxxx>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130801 Thunderbird/17.0.8
On 9/10/13 10:21 AM, Mark Tinguely wrote:
> On 09/09/13 15:36, Eric Sandeen wrote:
>> On 8/15/13 1:15 PM, Eric Sandeen wrote:
>>> When xfs_growfs_data_private() is updating backup superblocks,
>>> it bails out on the first error encountered, whether reading or
>>> writing:
>>
>> Any thoughts on this one?  W/ the verifiers, we have a higher
>> chance of encountering an error, and leaving the rest of the
>> supers un-updated.  Repair will then possibly revert the fs to
>> it's pre-growfs state, and data loss will ensue...
>>
>> Thanks,
>> -Eric
>>
>>> * If we get an error writing out the alternate superblocks,
>>> * just issue a warning and continue.  The real work is
>>> * already done and committed.
>>>
>>> This can cause a problem later during repair, because repair
>>> looks at all superblocks, and picks the most prevalent one
>>> as correct.  If we bail out early in the backup superblock
>>> loop, we can end up with more "bad" matching superblocks than
>>> good, and a post-growfs repair may revert the filesystem to
>>> the old geometry.
>>>
>>> With the combination of superblock verifiers and old bugs,
>>> we're more likely to encounter read errors due to verification.
>>>
>>> And perhaps even worse, we don't even properly write any of the
>>> newly-added superblocks in the new AGs.
>>>
>>> Even with this change, growfs will still say:
>>>
>>>    xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Structure needs cleaning
>>>    data blocks changed from 319815680 to 335216640
>>>
>>> which might be confusing to the user, but it at least communicates
>>> that something has gone wrong, and dmesg will probably highlight
>>> the need for an xfs_repair.
>>>
>>> And this is still best-effort; if verifiers fail on more than
>>> half the backup supers, they may still "win" - but that's probably
>>> best left to repair to more gracefully handle by doing its own
>>> strict verification as part of the backup super "voting."
>>>
>>> Signed-off-by: Eric Sandeen<sandeen@xxxxxxxxxx>
>>> ---
> 
> Make sense to me - it could have been any kind of error including not being 
> able to get a xfs_buf for the new secondary (a temp ENOMEM).
> 
> I wonder if it could be possible to fix corrupt entries rather than just skip 
> them...

Well, that should be xfs_repair's job right - and I think it does?  (Esp. w/ my 
other patch,
[PATCH] xfs_repair: zero out unused parts of superblocks)

but we'd want growfs to alert the user of the problem one way or another...

-Eric
 
> Probably could test this patch by corrupting a v5 secondary superblock and 
> verifying with xfs_db.
> 
> Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
> 

<Prev in Thread] Current Thread [Next in Thread>