We are running xfs_repair v2.9.4. Will just the -P flag suffice for this
version? What is the best way to calculate the bhash size value needed if we
need to use that option too?
From: xfs-bounces@xxxxxxxxxxx [mailto:xfs-bounces@xxxxxxxxxxx] On Behalf Of
Sent: Tuesday, March 02, 2010 4:44 PM
To: Stan Hoeppner
Subject: Re: Stalled xfs_repair on 100TB filesystem
Stan Hoeppner wrote:
> Jason Vagalatos put forth on 3/2/2010 11:22 AM:
>> On Friday 2/26 I started an xfs_repair on a 100TB filesystem:
>> #> nohup xfs_repair -v -l /dev/logfs-sessions/logdev
>> /dev/logfs-sessions/sessions > /root/xfs_repair.out.logfs1.sjc.02262010 &
>> I've been monitoring the process with 'top' and tailing the output file from
>> the redirect above. I believe the repair has "stalled". When the process
>> was running 'top' showed almost all physical memory consumed and 12.6G of
>> virt memory consumed by xfs_repair. It made it all the way to Phase 6 and
>> has been sitting at agno = 14 for almost 48 hours. The memory consumption
>> of xfs_repair has ceased but the process is still "running" and consuming
>> 100% CPU:
> Here's how another user solved this xfs_repair "hanging" problem. I say
> "hang" because "stall" didn't return the right Google results.
> "In between I created a test filesystem 360GB with 120 million inodes on it.
> xfs_repair without options is unable to complete. If I run xfs_repair -o
> bhash=8192 the repair process terminates normally (the filesystem is
> actually ok)."
> Unfortunately it appears you'll have to start the repair over again.
FWIW, Jason - which xfsprogs version are you running? This patch went in a
> [PATCH] libxfs: increase hash chain depth when we run out of slots
> A couple people reported xfs_repair hangs after
> "Traversing filesystem ..." in xfs_repair. This happens
> when all slots in the cache are full and referenced, and the
> loop in cache_node_get() which tries to shake unused entries
> fails to find any - it just keeps upping the priority and goes
> This can be worked around by restarting xfs_repair with
> -P and/or "-o bhash=<largersize>" for older xfs_repair.
> I started down the path of increasing the number of hash buckets
> on the fly, but Barry suggested simply increasing the max allowed
> depth which is much simpler (thanks!)
> Resizing the hash lengths does mean that cache_report ends up with
> most things in the "greater-than" category:
> Hash buckets with  23 entries      3 (  3%)
> Hash buckets with  24 entries      3 (  3%)
> Hash buckets with >24 entries     50 ( 85%)
> but I think I'll save that fix for another patch unless there's
> real concern right now.
> I tested this on the metadump image provided by Tomek.
> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxxx>
> Reported-by: Tomek Kruszona <bloodyscarion@xxxxxxxxx>
> Reported-by: Riku Paananen <riku.paananen@xxxxxxxxxxx>
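For reference, the workaround named in the patch description (restarting with -P and/or "-o bhash=<largersize>") might look roughly like the sketch below. It reuses the device paths from the original invocation and borrows bhash=8192 from the quoted 360GB test report; that value is only a starting point, not a figure calculated for a 100TB filesystem, and whether -P is accepted by a given xfsprogs release should be checked against the xfs_repair(8) man page for that version. The script only prints the command so it can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: build and print the restart command; nothing is executed.
# Assumptions: device paths are taken from the original invocation in this
# thread, and BHASH is the value from the quoted test report, not a tuned
# value for this filesystem.
LOGDEV=/dev/logfs-sessions/logdev
DATADEV=/dev/logfs-sessions/sessions
BHASH=8192

# -v verbose, -P disable prefetching, -o bhash=N set the buffer cache
# hash size, -l external log device.
CMD="xfs_repair -v -P -o bhash=$BHASH -l $LOGDEV $DATADEV"

echo "$CMD"
```

If the printed command looks right, it can be run under nohup with output redirected to a file, as in the original invocation.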