
Re: Sudden File System Corruption

To: Mike Dacre <mike.dacre@xxxxxxxxx>
Subject: Re: Sudden File System Corruption
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 9 Dec 2013 09:20:14 +1100
Cc: Ben Myers <bpm@xxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CAPd9ww_Df6FwDc_Kv82LKAZR5ML8DLUHwLruBY+innemOtggtw@xxxxxxxxxxxxxx>
References: <CAPd9ww_qT9J_Rt04g7+OApoBeggNOyWNwD+57DiDTuUvz-O-0g@xxxxxxxxxxxxxx> <20131205174058.GF1935@xxxxxxx> <20131205175053.GG1935@xxxxxxx> <CAPd9ww9YFbMEe-dM96zHsbRJgQuBHfF=ipromch1Yw6SzPUftg@xxxxxxxxxxxxxx> <20131206002308.GS10553@xxxxxxx> <CAPd9ww8XDzGbSZsEEoCmSuJ+KBYUWqHeRON1sFr6bG1fZ6af7w@xxxxxxxxxxxxxx> <20131206225612.GU10553@xxxxxxx> <CAPd9ww_Df6FwDc_Kv82LKAZR5ML8DLUHwLruBY+innemOtggtw@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
[ For future reference - can people keep triage on the public list
so everyone can see that the problem is being worked on? ]

On Fri, Dec 06, 2013 at 03:15:33PM -0800, Mike Dacre wrote:
> On Fri, Dec 6, 2013 at 2:56 PM, Ben Myers <bpm@xxxxxxx> wrote:
> > It's great that you have this.  And an interesting repair log.
> > The good news is that it doesn't look like the corruption that
> > xfs_repair doesn't fix, the bad news is that I don't recognise
> > it.
> 
> Here is the repair log from right after the corruption happened.
> The repair was successful.

If xfs_repair didn't report any freespace corruption, then it's
because it didn't see any. And that's not actually surprising for
this sort of shutdown followed by log recovery failures.

What it means is that the corruption was detected pretty much
immediately after it occurred, and the shutdown confined it to the
log before it could be propagated to the in-place metadata. That
generally means the shutdown happened within 30s of the corruption
occurring.

In my experience, this sort of "corruption confined to the log"
shutdown is usually a result of some kind of memory corruption that
is captured accidentally in the log due to object relogging (i.e. in
a dirty region from a previous change that is not yet committed to
the log) prior to it being detected in a transaction.
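
As a rough illustration of the relogging mechanism described above, here is a toy Python model. It is not XFS code: the `LoggedBuffer` class, the 128-byte `CHUNK` size, and the `demo` function are all assumptions made up for this sketch. It only shows how a stray memory write that lands in an already-dirty region gets copied into the log the next time the object is relogged, even though no transaction made that change.

```python
CHUNK = 128  # bytes per dirty-tracking chunk (assumed value for this toy model)


class LoggedBuffer:
    """Hypothetical in-memory metadata buffer with chunk-granular dirty tracking."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.dirty = set()  # chunk indices dirtied but not yet committed to the log

    def modify(self, offset, payload):
        """A transactional change: write the bytes and mark the touched chunks dirty."""
        self.data[offset:offset + len(payload)] = payload
        first = offset // CHUNK
        last = (offset + len(payload) - 1) // CHUNK
        self.dirty.update(range(first, last + 1))

    def format_for_log(self):
        """Copy every dirty chunk as it looks in memory right now."""
        return {
            c: bytes(self.data[c * CHUNK:(c + 1) * CHUNK])
            for c in sorted(self.dirty)
        }


def demo():
    buf = LoggedBuffer(512)
    buf.modify(0, b"AAAA")    # transaction 1 dirties chunk 0
    buf.data[64] = 0xFF       # stray write outside any transaction, inside chunk 0
    buf.modify(130, b"BBBB")  # transaction 2 relogs the buffer, dirtying chunk 1
    return buf.format_for_log()
```

In this model the image formatted for the log contains chunk 0 with the stray 0xFF byte at offset 64, because chunk 0 was still dirty from the first change when the buffer was relogged. Nothing has touched the on-disk copy at that point, which matches the "corruption confined to the log" pattern.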

Without being able to see the before/after log recovery filesystem
images, there's nothing we can do to track this down further.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
