xfs
[Top] [All Lists]

Re: [PATCH 1/3] xfs: don't shutdown log recovery on validation errors

To: Ben Myers <bpm@xxxxxxx>
Subject: Re: [PATCH 1/3] xfs: don't shutdown log recovery on validation errors
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Fri, 14 Jun 2013 11:15:41 -0500
Cc: Dave Jones <davej@xxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130614160940.GA32736@xxxxxxx>
References: <1371003548-4026-1-git-send-email-david@xxxxxxxxxxxxx> <1371003548-4026-2-git-send-email-david@xxxxxxxxxxxxx> <20130613010441.GX20932@xxxxxxx> <20130613020827.GG29338@dastard> <20130613220903.GA20932@xxxxxxx> <20130614001306.GM29338@dastard> <20130614160940.GA32736@xxxxxxx>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130509 Thunderbird/17.0.6
On 6/14/13 11:09 AM, Ben Myers wrote:
> Hey,
> 
> On Fri, Jun 14, 2013 at 10:13:06AM +1000, Dave Chinner wrote:
>> On Thu, Jun 13, 2013 at 05:09:03PM -0500, Ben Myers wrote:
>>> Hi Dave,
>>>
>>> On Thu, Jun 13, 2013 at 12:08:27PM +1000, Dave Chinner wrote:
>>>> On Wed, Jun 12, 2013 at 08:04:41PM -0500, Ben Myers wrote:
>>>>> On Wed, Jun 12, 2013 at 12:19:06PM +1000, Dave Chinner wrote:
>>>>>> From: Dave Chinner <dchinner@xxxxxxxxxx>
>>>>>>
>>>>>> Unfortunately, we cannot guarantee that items logged multiple times
>>>>>> and replayed by log recovery do not take objects back in time. When
>>>>>> theya re taken back in time, the go into an intermediate state which
>>>>>> is corrupt, and hence verification that occurs on this intermediate
>>>>>> state causes log recovery to abort with a corruption shutdown.
>>>>>>
>>>>>> Instead of causing a shutdown and unmountable filesystem, don't
>>>>>> verify post-recovery items before they are written to disk. This is
>>>>>> less than optimal, but there is no way to detect this issue for
>>>>>> non-CRC filesystems If log recovery successfully completes, this
>>>>>> will be undone and the object will be consistent by subsequent
>>>>>> transactions that are replayed, so in most cases we don't need to
>>>>>> take drastic action.
>>>>>>
>>>>>> For CRC enabled filesystems, leave the verifiers in place - we need
>>>>>> to call them to recalculate the CRCs on the objects anyway. This
>>>>>> recovery problem canbe solved for such filesystems - we have a LSN
>>>>>> stamped in all metadata at writeback time that we can to determine
>>>>>> whether the item should be replayed or not. This is a separate piece
>>>>>> of work, so is not addressed by this patch.
>>>>>
>>>>> Is there a test case for this one?  How are you reproducing this?
>>>>
>>>> The test case was Dave Jones running sysrq-b on a hung test machine.
>>>> The machine would occasionally end up with a corrupt home directory.
>>>>
>>>> http://oss.sgi.com/pipermail/xfs/2013-May/026759.html
>>>>
>>>> Analysis from a metdadump provided by Dave:
>>>>
>>>> http://oss.sgi.com/pipermail/xfs/2013-June/026965.html
>>>>
>>>> And Cai also appeared to be hitting this after a crash on 3.10-rc4,
>>>> as it's giving exactly the same "verifier failed during log recovery"
>>>> stack trace:
>>>>
>>>> http://oss.sgi.com/pipermail/xfs/2013-June/026889.html
>>>
>>> Thanks.  It appears that the verifiers have found corruption due to a
>>> flaw in log recovery, and the fix you are proposing is to stop using
>>> them.  If we do that, we'll have no way of detecting the corruption and
>>> will end up hanging users of older kernels out to dry.
>>
>> We've never detected it before, and it's causing regressions for
>> multiple people. We *can't fix it* because we can't detect the
>> situation sanely, and we are not leaving people with old kernels
>> hanging out to dry. The opposite is true: we are fucking over
>> current users by preventing log recovery on filesystems that will
>> recovery perfectly OK and have almost always recovered just fine in
>> the past.
>>
>>> I think your suggestion that non-debug systems could warn instead of
>>> fail is a good one, but removing the verifier altogether is
>>> inappropriate.
>>
>> Changing every single verifier in a non-trivial way is not something
>> I'm about to do for a -rc6 kernel. Removing the verifiers from log
>> recovery just reverts to the pre-3.8 situation, so is perfectly
>> acceptable short term solution while we do the more invasive verify
>> changes.
>>
>>> Can you make the metadump available?  I need to understand this better
>>> before I can sign off.  Also:  Any idea how far back this one goes?
>>
>> No, I can't make the metadump available to you - it was provided
>> privately and not obfuscated and so you'd have to ask Dave for it.
> 
> Dave (Jones), could you make the metadump available to me?  I'd like to
> understand this a little bit better.  I'm a bit uncomfortable with the
> proposition that we should corrupt silently in this case...

Ben, isn't it the case that the corruption would only happen if
log replay failed for some reason (as has always been the case,
verifier or not), but with the verifier in place, it kills replay
even w/o other problems due to a logical problem with the
(recently added) verifiers?

IOW - this seems like an actual functional regression due to the
addition of the verifier, and dchinner's patch gets us back
to the almost-always-fine state we were in prior to the change.

As we're at -rc6, it seems quite reasonable to me as a quick
fix to just short-circuit it for now.

If you have time to analyze dave's metadump that's cool, but
this seems like something that really needs to be addressed
before 3.10 gets out the door.

Whenever you're ready, you can also add:

Reviewed-by: Eric Sandeen <sandeen@xxxxxxxxxx>

(And dchinner, should this cc: stable for 3.9?)

Thanks,
-Eric

> Thanks,
>       Ben
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
> 

<Prev in Thread] Current Thread [Next in Thread>