XFS filesystem corruption

Ric Wheeler rwheeler at redhat.com
Fri Mar 8 06:20:34 CST 2013


On 03/08/2013 03:39 AM, Stan Hoeppner wrote:
> On 3/6/2013 5:12 PM, Ric Wheeler wrote:
>
>> We actually test brutal "Power off" for xfs, ext4 and other file
>> systems. If your storage is configured properly and you have barriers
>> enabled, they all pass without corruption.
> Something that none of us mentioned WRT write barriers is that while the
> filesystem structure may avoid corruption when the power is cut, files
> may still be corrupted, in conditions such as any/all of these:
>
> 1.  unwritten data still in buffer cache

This is true only for user data, not the file system metadata. We should always 
be able to drop power without seeing corruption (like the garbled ls output).

> 2.  drive caches are enabled

Write barriers will take care of drives with write cache enabled, as long as the 
hardware RAID card is not in the middle and misleading us.
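
A minimal sketch of one way to check the "barriers enabled" half of that, 
assuming a kernel of this era where xfs and ext4 accept the nobarrier mount 
option (ext4 also accepts barrier=0). The function name barriers_disabled is 
hypothetical; it only reads a mount table in /proc/mounts format and prints 
the mount points where barriers have been turned off:

```shell
#!/bin/sh
# Sketch: list xfs/ext4 mount points whose options disable write barriers.
# barriers_disabled is a hypothetical helper name, not an existing tool.
barriers_disabled() {
    # $1: mount table to read, in /proc/mounts format (default: /proc/mounts)
    awk '($3 == "xfs" || $3 == "ext4") {
        # Field 4 is the comma-separated mount option list.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "nobarrier" || opts[i] == "barrier=0")
                print $2
    }' "${1:-/proc/mounts}"
}
```

Calling barriers_disabled with no argument checks the running system. Empty 
output only means nothing has switched barriers off at mount time; it cannot 
tell you whether a RAID card underneath actually honors the cache flushes.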

> 3.  BBWC not working properly

This should not be a worry. If the battery (or, in more modern cards, the 
flash-backed cache) is not working, a good card will flip into write-through 
caching. That should be slow, but safe.

Note that the write cache state on the drives themselves is still a question 
mark; normally it needs to be disabled.

>
> If the techs are determined to hard cut power because they don't have
> the time or the knowledge to do a clean shutdown, it may be well worth
> your time/effort to write a script and teach the field techs to execute
> it, before flipping the master switch.  Your simple script would run as
> root, or you'd need to do some sudo foo within, and would contain
> something like:
>
> #! /bin/sh
> sync
> echo 2 > /proc/sys/vm/drop_caches
> echo "Ready for power down."
>
> This will flush pending writes in buffer cache to disk, and assumes of
> course that drive caches are disabled, and/or that BBWC, if present, is
> functioning properly.  It also assumes no applications are still
> actively writing files, in which case you're screwed regardless.  It's
> not a perfect solution and there's no guarantee you won't suffer file
> corruption, but this greatly increases your odds against it.
>

For file system *metadata* consistency, you should never have to do this if 
the stack is properly configured.  Application data that has not yet been 
written out will still be lost.

Also, if there are active writers, this is inherently racy. A better script 
would unmount the file systems :)
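
A rough sketch of what such an unmounting script could look like, assuming 
the data file systems can be named on the command line and the script runs as 
root. The function name quiesce and the UMOUNT override variable are 
hypothetical, added only so the logic can be exercised without touching real 
mounts:

```shell
#!/bin/sh
# Sketch: unmount the given file systems before power is cut.
# quiesce and UMOUNT are hypothetical names used for illustration only.
quiesce() {
    rc=0
    for mp in "$@"; do
        # umount writes out dirty data for the file system and fails if
        # something still has files open on it.
        if "${UMOUNT:-umount}" "$mp"; then
            echo "unmounted $mp"
        else
            echo "FAILED to unmount $mp (still in use?)" >&2
            rc=1
        fi
    done
    return $rc
}

# Act only when mount points were given on the command line.
if [ "$#" -gt 0 ]; then
    quiesce "$@" && echo "Ready for power down."
fi
```

Unlike sync, a failed umount tells the tech that a writer is still active, 
so "Ready for power down." is printed only when every listed file system was 
actually unmounted. The root file system cannot be unmounted this way on a 
running system, so this covers data file systems only.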

Ric



