Re: help with xfs_repair on 10TB fs

To: Alberto Accomazzi <aaccomazzi@xxxxxxxxx>
Subject: Re: help with xfs_repair on 10TB fs
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Sat, 17 Jan 2009 12:50:29 -0600
Cc: xfs@xxxxxxxxxxx
In-reply-to: <adcf4ef70901171042p31054ae0rb56819fce7b6f47e@xxxxxxxxxxxxxx>
References: <adcf4ef70901170913l693376d7s6fd0395e2c88e10@xxxxxxxxxxxxxx> <4972166D.5000006@xxxxxxxxxxx> <adcf4ef70901171042p31054ae0rb56819fce7b6f47e@xxxxxxxxxxxxxx>
User-agent: Thunderbird (Macintosh/20081209)
Alberto Accomazzi wrote:
> On Sat, Jan 17, 2009 at 12:33 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
>> Alberto Accomazzi wrote:
>>> I need some help with figuring out how to repair a large XFS
>>> filesystem (10TB of data, 100+ million files).  xfs_repair seems to
>>> have crapped out before finishing the job and now I'm not sure how to
>>> proceed.
>> How did it "crap out"?
> Well, in the way I described below, namely it ran for several hours and then
> died without completing.  As you can see from the log (which captured both
> stdout and stderr) there's nothing that indicates what terminated the
> program.  And it's definitely not running now.
>> the src.rpm from
>> http://kojipkgs.fedoraproject.org/packages/xfsprogs/2.10.2/3.fc11/src/
> Ok, I guess it's worth giving it a shot.  I assume I don't need to worry
> about kernel modules because the xfsprogs don't depend on that, right?


>>> After bringing the system back, a mount of the fs reported problems:
>>> Starting XFS recovery on filesystem: sdb1 (logdev: internal)
>>> Filesystem "sdb1": XFS internal error xfs_btree_check_sblock at line 334
>>> of file /home/buildsvn/rpmbuild/BUILD/xfs-kmod-0.4/_kmod_build_/xfs_btree.c.
>>> Caller 0xffffffff882fa8d2
>> so log replay is failing now; but that indicates an unclean shutdown.
>> Something else must have happened between the xfs_repair and this mount
>> instance?
> Sorry, I wasn't clear: there was indeed an unclean shutdown (actually a
> couple), after which the mount would not succeed presumably because of the
> dirty log.  I was able to mount the system read-only and take enough of a
> look to see that there was significant corruption of the data.  Running
> xfs_repair -L at that point seemed the only option available.  But do let me
> know if this line of thinking is incorrect.

Yes, if you have a dirty log that won't replay, zapping the log via
repair (xfs_repair -L) is about the only option.  I wonder what the
first hint of trouble here was, though, and what led to all this
misery.... :)
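For reference, the sequence described in this thread looks roughly like the
sketch below.  The device name /dev/sdb1 comes from the log messages quoted
above; the mount point /mnt is an assumption.  These commands are destructive
in the last step, so treat this as an illustration, not a recipe:

```shell
# A normal mount fails because log replay hits the corrupted btree:
mount /dev/sdb1 /mnt

# Inspect the damage first: norecovery skips log replay, allowing a
# read-only mount even with a dirty log.
mount -o ro,norecovery /dev/sdb1 /mnt
# ...look around, copy off anything critical, then unmount:
umount /mnt

# Last resort: zero the log and repair.  -L discards any transactions
# still sitting in the log, so recent metadata updates may be lost.
# Capture stdout and stderr, since repair output can be the only clue
# to what went wrong.
xfs_repair -L /dev/sdb1 2>&1 | tee repair.log
```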

