
Re: XFS filesystem corruption

To: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Subject: Re: XFS filesystem corruption
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 11 Mar 2013 09:45:36 +1100
Cc: Ric Wheeler <rwheeler@xxxxxxxxxx>, Julien FERRERO <jferrero06@xxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <513B84AD.2000603@xxxxxxxxxxxxxxxxx>
References: <CAPcwv6wZJSBtgF-L6KNSn6N6Y+wUZJFXdbcg+zYRwoaB2sDdjw@xxxxxxxxxxxxxx> <20130306161519.2c28d911@xxxxxxxxxxxxxx> <CAPcwv6wqv0b_CPqDpBfOwVDg23uBi=tpGQSy9XuH2uWS5oVMWQ@xxxxxxxxxxxxxx> <20130306232100.6286f640@xxxxxxxxxxxxxx> <5137CD46.6070909@xxxxxxxxxx> <5139A3B6.3040805@xxxxxxxxxxxxxxxxx> <5139D792.4090304@xxxxxxxxxx> <513A350A.508@xxxxxxxxxxxxxxxxx> <20130309091152.GH23616@dastard> <513B84AD.2000603@xxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Sat, Mar 09, 2013 at 12:51:25PM -0600, Stan Hoeppner wrote:
> On 3/9/2013 3:11 AM, Dave Chinner wrote:
> > On Fri, Mar 08, 2013 at 12:59:22PM -0600, Stan Hoeppner wrote:
> >> On 3/8/2013 6:20 AM, Ric Wheeler wrote:
> >>>> Something that none of us mentioned WRT write barriers is that while the
> >>>> filesystem structure may avoid corruption when the power is cut, files
> >>>> may still be corrupted, in conditions such as any/all of these:
> >>
> >> I made it very clear I was discussing file corruption here, not
> >> filesystem corruption.  You already covered that base.  I was
> >> specifically addressing the fact that XFS performs barriers on metadata
> >> writes but not file data writes.
> > 
> > Actually, you're not correct there, either, Stan. ;)
> With "either" you're implying I was incorrect twice, and I wasn't, not
> in whole anyway, maybe in part. ;)

The "either" was in reference to you correcting someone else...

> > XFS only issues cache flushes/FUA writes for log IO. Metadata IO is
> > done exactly the same way that data IO is done - without barriers.
> > It's because metadata lost in drive caches at the time of a crash is
> > rewritten by journal replay that filesystem corruption does not
> > occur.
> Technical semantics.  Geeze, give the non dev a break now and then.  ;)

It's the technical semantics that matter when it comes to behaviour
at power loss.  That's why I pick on "technical semantics" - it
makes your analysis and understanding of problems better, and that
means there's less for me to do in future ;)

>  Does everyone remember the transitive property of equality from math
> class decades ago?  It states "If A=B and B=C then A=C".  Thus if
> barrier writes to the journal protect the journal, and the journal
> protects metadata, then barrier writes to the journal protect metadata.

Yup, but the devil is in the detail - we don't protect individual
metadata writes at all and that difference is significant enough to
comment on.... :P
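
To make that distinction concrete, here is a toy model of the ordering
described above. It is only a sketch and not XFS code: ToyDisk, commit and
replay are hypothetical names, and the "log" and "meta" keys are made up
for illustration. Log records are written with a cache flush, metadata is
written back without one, and replay rewrites whatever metadata was lost in
the volatile cache.

```python
class ToyDisk:
    """Toy drive: a volatile write cache in front of stable media."""

    def __init__(self):
        self.stable = {}  # contents that survive power loss
        self.cache = {}   # contents that are lost on power loss

    def write(self, key, value, flush=False):
        self.cache[key] = value
        if flush:
            self.flush()

    def flush(self):
        # Models a cache flush: everything cached reaches stable media.
        self.stable.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        self.cache.clear()


def commit(disk, changes):
    """Log the change with a flush, then write metadata back unprotected."""
    # Log records are always flushed, so counting them on stable media
    # gives the next sequence number.
    seq = sum(1 for key in disk.stable if key[0] == "log")
    disk.write(("log", seq), dict(changes), flush=True)
    for name, value in changes.items():
        disk.write(("meta", name), value)  # no flush: may sit in the cache


def replay(disk):
    """Rewrite metadata from the log records found on stable media."""
    for key in sorted(k for k in disk.stable if k[0] == "log"):
        for name, value in disk.stable[key].items():
            disk.write(("meta", name), value)
    disk.flush()


if __name__ == "__main__":
    disk = ToyDisk()
    commit(disk, {"inode7": "size=4096"})
    disk.power_loss()
    print("metadata on disk before replay:", ("meta", "inode7") in disk.stable)
    replay(disk)
    print("metadata on disk after replay:", ("meta", "inode7") in disk.stable)
```

In this model the individual metadata write is never protected. Only the
log record is, and that is enough for replay to put the metadata back.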

>  I had a detail incorrect, but not the big picture.  And I'd bet the OP
> is more interested in the big picture.  So surely I'd get a B or a C
> here, but certainly not an F.

Certainly a B+ - like I said, I'm being picky because you seem to
understand the details once explained... :)

> > As it is, if the application uses direct IO (likely, as it
> > sounds like video capture/editing/playout here) then log IO
> > will also ensure that the data written by the app is on disk (i.e.
> > that's the mechanism by which fsync works).
> So this would be an interesting upside down case for XFS, as the file
> data may be intact, but the filesystem gets corrupted, the opposite of
> the design point.

Well, if barriers are working correctly, then there won't be any
filesystem corruption, either...
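
For the application side of this, a minimal sketch of asking for data to be
on stable storage before carrying on. It assumes ordinary buffered writes
rather than direct IO, and durable_write is a hypothetical helper name, not
something from this thread. The fsync call is the part that matters.

```python
import os


def durable_write(path, data):
    """Write data to path and return only after fsync has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Whether the data then survives a power cut still depends on the drive
honouring cache flushes, which is the "if barriers are working correctly"
condition above.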

> >>> Also, if there are active writers, this is inherently racy. A better
> >>> script would unmount the file systems :)
> >>
> >> Yes, a umount would be even better.
> > 
> > Change the bios so that the power button does not cause a power down
> > so the OS can capture the button event and trigger an orderly
> > shutdown.
> Dare I say "Dave you're incorrect". ;)

Heh.  Not so much incorrect as "unaware of the entire scope". I
browsed the thread and didn't pick up on this little detail...


Dave Chinner
