
Re: Stalled xfs_repair on 100TB filesystem

To: xfs@xxxxxxxxxxx
Subject: Re: Stalled xfs_repair on 100TB filesystem
From: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Date: Tue, 02 Mar 2010 18:35:22 -0600
In-reply-to: <DD534F7C25BFA14FB18E6D603135D7EA0A11E82ECB@sbapexch05>
References: <DD534F7C25BFA14FB18E6D603135D7EA0A11E82ECB@sbapexch05>
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv: Gecko/20100227 Thunderbird/3.0.3
Jason Vagalatos put forth on 3/2/2010 11:22 AM:
> Hello,
> On Friday 2/26 I started an xfs_repair on a 100TB filesystem:
> #> nohup xfs_repair -v -l /dev/logfs-sessions/logdev 
> /dev/logfs-sessions/sessions > /root/xfs_repair.out.logfs1.sjc.02262010 &
> I've been monitoring the process with 'top' and tailing the output file from 
> the redirect above.  I believe the repair has "stalled".  When the process 
> was running 'top' showed almost all physical memory consumed and 12.6G of 
> virt memory consumed by xfs_repair.  It made it all the way to Phase 6 and 
> has been sitting at agno = 14 for almost 48 hours.  The memory consumption of 
> xfs_repair has ceased but the process is still "running" and consuming 100% 
> CPU:

Here's how another user solved this xfs_repair "hanging" problem.  I say
"hang" because "stall" didn't return the right Google results.



"In between I created a 360GB test filesystem with 120 million inodes on it.
xfs_repair without options is unable to complete. If I run xfs_repair -o
bhash=8192 the repair process terminates normally (the filesystem is
actually ok)."

Unfortunately, it appears you'll have to start the repair over again, this
time with the -o bhash=8192 option.
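
For what it's worth, a rerun might look something like the sketch below,
once the stalled process has been stopped.  The device paths are copied from
your original command, and the bhash value is simply the one from the report
quoted above, not a tuned recommendation.  The output file name, the
repair_cmd helper, and the RUN switch are placeholders of my own.

```shell
#!/bin/sh
# Device paths taken from the original command in this thread.
LOGDEV=/dev/logfs-sessions/logdev
DATADEV=/dev/logfs-sessions/sessions
# Hypothetical output file name; pick whatever suits you.
OUT=/root/xfs_repair.out.bhash8192

# Build the repair command line with the bhash option added.
# bhash=8192 is the value from the quoted report, not a tuned figure.
repair_cmd() {
    echo "xfs_repair -v -o bhash=8192 -l $LOGDEV $DATADEV"
}

# Dry run by default: only print the command.
# Set RUN=1 in the environment to actually start the repair in the background.
if [ "${RUN:-0}" = "1" ]; then
    nohup $(repair_cmd) > "$OUT" 2>&1 &
else
    repair_cmd
fi
```

Run it once without RUN=1 to check the command line it prints before
committing to another multi-day repair.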

