[PATCH 1/3] xfs: don't shutdown log recovery on validation errors
Mark Tinguely
tinguely at sgi.com
Fri Jun 14 07:55:01 CDT 2013
On 06/13/13 19:13, Dave Chinner wrote:
> On Thu, Jun 13, 2013 at 05:09:03PM -0500, Ben Myers wrote:
>> Hi Dave,
>>
>> On Thu, Jun 13, 2013 at 12:08:27PM +1000, Dave Chinner wrote:
>>> On Wed, Jun 12, 2013 at 08:04:41PM -0500, Ben Myers wrote:
>>>> On Wed, Jun 12, 2013 at 12:19:06PM +1000, Dave Chinner wrote:
>>>>> From: Dave Chinner <dchinner at redhat.com>
>>>>>
>>>>> Unfortunately, we cannot guarantee that items logged multiple times
>>>>> and replayed by log recovery do not take objects back in time. When
>>>>> they are taken back in time, they go into an intermediate state which
>>>>> is corrupt, and hence verification that occurs on this intermediate
>>>>> state causes log recovery to abort with a corruption shutdown.
>>>>>
>>>>> Instead of causing a shutdown and unmountable filesystem, don't
>>>>> verify post-recovery items before they are written to disk. This is
>>>>> less than optimal, but there is no way to detect this issue for
>>>>> non-CRC filesystems. If log recovery successfully completes, this
>>>>> will be undone and the object will be consistent by subsequent
>>>>> transactions that are replayed, so in most cases we don't need to
>>>>> take drastic action.
>>>>>
>>>>> For CRC enabled filesystems, leave the verifiers in place - we need
>>>>> to call them to recalculate the CRCs on the objects anyway. This
>>>>> recovery problem can be solved for such filesystems - we have a LSN
>>>>> stamped in all metadata at writeback time that we can use to determine
>>>>> whether the item should be replayed or not. This is a separate piece
>>>>> of work, so is not addressed by this patch.
>>>>
>>>> Is there a test case for this one? How are you reproducing this?
>>>
>>> The test case was Dave Jones running sysrq-b on a hung test machine.
>>> The machine would occasionally end up with a corrupt home directory.
>>>
>>> http://oss.sgi.com/pipermail/xfs/2013-May/026759.html
>>>
>>> Analysis from a metadump provided by Dave:
>>>
>>> http://oss.sgi.com/pipermail/xfs/2013-June/026965.html
>>>
>>> And Cai also appeared to be hitting this after a crash on 3.10-rc4,
>>> as it's giving exactly the same "verifier failed during log recovery"
>>> stack trace:
>>>
>>> http://oss.sgi.com/pipermail/xfs/2013-June/026889.html
>>
>> Thanks. It appears that the verifiers have found corruption due to a
>> flaw in log recovery, and the fix you are proposing is to stop using
>> them. If we do that, we'll have no way of detecting the corruption and
>> will end up hanging users of older kernels out to dry.
>
> We've never detected it before, and it's causing regressions for
> multiple people. We *can't fix it* because we can't detect the
> situation sanely, and we are not leaving people with old kernels
> hanging out to dry. The opposite is true: we are fucking over
> current users by preventing log recovery on filesystems that will
> recover perfectly OK and have almost always recovered just fine in
> the past.
>
>> I think your suggestion that non-debug systems could warn instead of
>> fail is a good one, but removing the verifier altogether is
>> inappropriate.
>
> Changing every single verifier in a non-trivial way is not something
> I'm about to do for a -rc6 kernel. Removing the verifiers from log
> recovery just reverts to the pre-3.8 situation, so is a perfectly
> acceptable short-term solution while we do the more invasive verify
> changes.
>
>> Can you make the metadump available? I need to understand this better
>> before I can sign off. Also: Any idea how far back this one goes?
>
> No, I can't make the metadump available to you - it was provided
> privately and not obfuscated and so you'd have to ask Dave for it.
>
> As to how long this problem has existed? It's a zero-day bug. Like I
> said, I've suspected for years that this can happen, and only now do
> we have proof of it...
>
> Cheers,
>
> Dave.
My gut feeling for the patch was the same as Ben's, but thinking this
over, I have to take back all the eloquent curse words. IMO, we have to
bring the patch in because the goal for Linux 3.10 is to have a stable
environment for the non-CRC case and the verifier is breaking XFS log
recovery for the common non-CRC case.
---
In the common case, the verifier is tripping over the fundamental
difference between how the AIL works (it consolidates buffer writes into
one and can write anything that made it to the log up to l_last_sync_lsn)
and log recovery, which replays each modification individually.
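
To make that difference concrete, here is a minimal toy sketch in Python.
It is not XFS code: the field names, the count/entries invariant and the
LSN values are all hypothetical. It assumes writeback has already put the
consolidated, newest state on disk while the log still holds both
modifications, and it also models the LSN comparison that the quoted patch
description mentions as a possible fix for CRC filesystems.

```python
# Toy model only: the field names, the invariant and the LSN values are
# hypothetical and do not correspond to real XFS structures.

def verify(obj):
    """Stand-in verifier: the header count must match the number of entries."""
    return obj["count"] == len(obj["entries"])


def replay(disk_obj, log, stop_on_failure, skip_by_lsn=False):
    """Replay logged modifications one at a time, oldest first.

    Returns (final object, verifier result after each replayed change).
    """
    # Work on a copy so the caller's on-disk object is left untouched.
    obj = {k: (list(v) if isinstance(v, list) else v)
           for k, v in disk_obj.items()}
    results = []
    for lsn, change in log:
        if skip_by_lsn and lsn <= obj["lsn"]:
            continue  # the object on disk is already at least this new
        obj.update(change)  # only the logged (dirty) fields are overwritten
        results.append(verify(obj))
        if stop_on_failure and not results[-1]:
            break  # models the corruption shutdown that aborts recovery
    return obj, results


# Writeback consolidated both modifications into a single on-disk state.
on_disk = {"lsn": 2, "count": 2, "entries": ["A", "b"]}

# The log still holds both modifications. The older one dirtied only the
# entries, so replaying it over the newer on-disk count is inconsistent.
log = [
    (1, {"entries": ["A"]}),
    (2, {"count": 2, "entries": ["A", "b"]}),
]

if __name__ == "__main__":
    print(replay(on_disk, log, stop_on_failure=True))
    print(replay(on_disk, log, stop_on_failure=False))
    print(replay(on_disk, log, stop_on_failure=True, skip_by_lsn=True))
```

In this toy, verifying after every replayed change aborts on the
intermediate state, replaying to the end without aborting converges back to
the consistent object, and the LSN comparison never replays the stale
changes at all.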
If there is another unknown kind of future write, then it would be nice
to know and having a warning message would help. Unfortunately, the
warning may make recovery noisy and falsely concern the users, but we
are in uncharted waters.
I will blame and put a reviewed-by on the patch.
--Mark.