Re: Corrupted filesystem: thoughts?

To: Emmanuel Florac <eflorac@xxxxxxxxxxxxxx>
Subject: Re: Corrupted filesystem: thoughts?
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 7 Oct 2014 21:56:57 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20141007093533.66ba8c7b@xxxxxxxxxxxxxx>
References: <20141007093533.66ba8c7b@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Oct 07, 2014 at 09:35:33AM +0200, Emmanuel Florac wrote:
> Here we go again: Adaptec RAID adapters don't play well with HGST 3TB
> drives for some reason. When a drive fails, the filesystem is almost
> always corrupted. This one looks pretty bad according to the output of
> the latest "xfs_repair -n" of the 15 TB filesystem. Here is a sample 
> of the 2MB log:

Is the log clean? (use xfs_logprint -t <dev> to determine that)

xfs_repair -n ignores dirty logs, so the on-disk state it reports on
may be inconsistent, and running log recovery is what "fixes" that
inconsistency.

> Phase 1 - find and verify superblock...
>         - reporting progress in intervals of 15 minutes
> Phase 2 - using internal log
>         - scan filesystem freespace and inode maps...
> out-of-order bno btree record 83 (332943 42) block 27/4681
> out-of-order bno btree record 90 (322762 76) block 27/4681
> out-of-order bno btree record 91 (331903 125) block 27/4681

That implies that the updates to the btree block that compact out
the old entries (which are now out of order) haven't occurred. That
can happen either because writes went missing or because log
recovery has not been run....
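
To illustrate what an "out-of-order" record means here, below is a
minimal sketch in Python. It assumes a simplified model in which the
by-block free space btree holds (start block, block count) records
sorted by start block, with no overlaps. The function name and the
sample records are hypothetical, and this is not xfs_repair's actual
code.

```python
def find_out_of_order(records):
    """Return indices of records that break ascending, non-overlapping order.

    Each record is a (start_block, block_count) pair, loosely modelled on
    the free space records reported above. Simplified illustration only.
    """
    bad = []
    last_end = -1  # last block covered by the most recent in-order record
    for i, (start, count) in enumerate(records):
        if start <= last_end:
            # Starts at or before the end of the previous good extent:
            # a stale entry that should have been compacted out.
            bad.append(i)
        else:
            last_end = start + count - 1
    return bad


# Hypothetical records: the third is a stale entry left behind.
records = [(100, 10), (200, 5), (150, 8), (300, 20)]
print(find_out_of_order(records))  # prints [2]
```

A stale record like the third one is the kind of leftover that a
missing write or an unreplayed log could plausibly leave in a block.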

Most of the issues could be caused by having a dirty log that needs
recovery, so that is the first thing to check. Note that a
non-dry-run xfs_repair will warn and abort if the log is dirty,
unlike "xfs_repair -n".


Dave Chinner