Xfs_repair and journalling

Michael L. Semon mlsemon35 at gmail.com
Sat Mar 16 22:01:08 CDT 2013


Disclaimer:  I'm an XFS user, not a developer...

On 03/16/2013 11:56 AM, Subranshu Patel wrote:
> This question is related to xfs_repair (recovery) and journalling.
>
> I powered off (improper shutdown) the system while I/O was in
> progress on a mounted XFS filesystem.
>
> Then I tried to recover the inconsistent filesystem using xfs_repair,
> after powering on the same machine.
>
> The XFS filesystem didn’t get recovered, which was not expected. The
> output displayed by xfs_repair is as follows:
>
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
>          - zero log...
> ERROR: The filesystem has valuable metadata changes in a log which needs to
> be replayed.  Mount the filesystem to replay the log, and unmount it before
> re-running xfs_repair.  If you are unable to mount the filesystem, then use
> the -L option to destroy the log and attempt a repair.
> Note that destroying the log may cause corruption -- please attempt a mount
> of the filesystem before doing this.

So do just that: mount the filesystem to replay the log, unmount it,
and then re-run xfs_repair.  It's worth it.

This happens to me when I have a partition marked read-only in 
/etc/fstab, remount it read-write to make changes to it, then something 
bad happens, and I forget about the read-only mark and run xfs_repair on 
the partition.
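
A minimal sketch of the sequence the error message asks for, assuming a
POSIX shell.  The function name, the RUN variable, and the device and
mount point in the usage line are placeholders for illustration, not
anything from the original report.  By default it only prints the
commands instead of running them:

```shell
# Hypothetical helper: mount (log replay), unmount, then xfs_repair.
# RUN defaults to "echo", so nothing is executed unless RUN is set.
replay_then_repair() {
    dev=$1
    mnt=$2
    run=${RUN:-echo}
    # Mounting lets the kernel replay the log.
    $run mount "$dev" "$mnt" &&
    # xfs_repair wants the filesystem unmounted again.
    $run umount "$mnt" &&
    # Now xfs_repair should no longer complain about the log.
    $run xfs_repair "$dev"
}

# Usage (placeholders): replay_then_repair /dev/sdXN /mnt/point
```

Running it with RUN=sudo would execute the three steps for real.  If
the mount step fails, the chain stops there, and that is the case where
the error message points at -L as the last resort.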

> The question that arises here is why xfs_repair should be re-run
> after mounting and unmounting the XFS filesystem. According to my
> understanding, when we perform a mount operation, recovery is
> done automatically if the filesystem is in an inconsistent state.
> Then what is the need to re-run xfs_repair after the mount has been
> performed? Does xfs_repair recover something different from what is
> recovered on mount? What exactly happens when we mount and unmount
> an XFS filesystem?

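The log replay is done in the kernel at mount time, and xfs_repair is a
separate userspace tool that, going by the error message above, wants
the log replayed before it runs.  The two could easily give divergent
results.  But from the xfs_repair man page: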

"Regardless, the filesystem to be repaired must be unmounted, otherwise, 
the resulting filesystem may be inconsistent or corrupt."

This means that it is frowned upon to fsck an XFS partition that is
currently mounted in any way, read-only or read-write.  This is the
part that is different.
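
One way to guard against running xfs_repair on a mounted device is to
look the device up in the mount table first.  A rough sketch, assuming
Linux's /proc/mounts layout (device in the first field).  is_mounted is
a made-up helper name, and it only matches the exact device path it is
given, so symlinks or device-mapper aliases would need extra handling:

```shell
# Hypothetical helper: succeed if DEV is listed as a mount source.
# The table defaults to /proc/mounts; a second argument overrides it.
is_mounted() {
    awk -v dev="$1" '
        $1 == dev { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Usage (placeholder device):
#   if is_mounted /dev/sdXN; then
#       echo "still mounted, unmount it first"
#   else
#       xfs_repair /dev/sdXN
#   fi
```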

> This is not observed in EXT4, fsck successfully recovers without
> mounting the filesystem.

No, fsck got your filesystem to a point where it thinks it is usable,
and that may or may not be true, depending on how much of your data and
journal were written to disk before the power-off.  Write caching on
drives makes matters worse.

There is a difference between the automatic boot-time fsck settings and 
using the fsck tool manually for almost all filesystems that have an 
fsck-like tool.  There is also a difference between fsck on a
filesystem mounted read-only and one that is not mounted at all.  Even
with NTFS on Windows XP, chkdsk will find a couple more bad files if it 
is run from different boot media.

If the file system states that all is OK, still umount the FS at the 
next opportunity and do a real fsck on it.  It will keep you from asking 
"how did this file go bad" six months later.

If you mount your partitions read-write, then XFS log replay takes care 
of most (but not all) situations.  That's why you might take a little 
extra care of your read-only partitions, to be sure that you remount 
them read-only after you modify them.
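
Checking whether a remount to read-only actually took could look like
this.  A sketch only, assuming the /proc/mounts layout (mount point in
field two, comma-separated options in field four).  mount_mode is a
hypothetical helper, not an existing tool, and the mount point in the
usage line is a placeholder:

```shell
# Hypothetical helper: print "ro" or "rw" for a mount point.
# Returns non-zero if the mount point is not in the table.
mount_mode() {
    awk -v mnt="$1" '
        $2 == mnt {
            mode = "rw"
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] == "ro") mode = "ro"
        }
        END { if (mode == "") exit 1; print mode }
    ' "${2:-/proc/mounts}"
}

# Usage (placeholder mount point):
#   mount -o remount,ro /mnt/point && mount_mode /mnt/point
```

If the same mount point is listed more than once, the last entry in
the table decides the answer.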

Good luck!

Michael


