Stalled xfs_repair on 100TB filesystem
Dave Chinner
david at fromorbit.com
Tue Mar 2 18:25:00 CST 2010
On Tue, Mar 02, 2010 at 09:22:34AM -0800, Jason Vagalatos wrote:
> Hello, On Friday 2/26 I started an xfs_repair on a 100TB
> filesystem:
>
> #> nohup xfs_repair -v -l /dev/logfs-sessions/logdev
> /dev/logfs-sessions/sessions >
> /root/xfs_repair.out.logfs1.sjc.02262010 &
>
> I've been monitoring the process with 'top' and tailing the output
> file from the redirect above. I believe the repair has
> "stalled". When the process was running 'top' showed almost all
> physical memory consumed and 12.6G of virt memory consumed by
> xfs_repair. It made it all the way to Phase 6 and has been
> sitting at agno = 14 for almost 48 hours. The memory consumption
> of xfs_repair has ceased but the process is still "running" and
> consuming 100% CPU:
I wish we could reproduce hangs like this easily. I'd kill the
repair and run with the -P option. From the xfs_repair man page:
    -P     Disable prefetching of inode and directory blocks. Use
           this option if you find xfs_repair gets stuck and stops
           proceeding. Interrupting a stuck xfs_repair is safe.
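
As a rough sketch of what that rerun might look like (not a tested
recipe): the device paths below are copied from the command quoted
above, the output file name is a made-up placeholder, and the stalled
xfs_repair process is assumed to have been stopped already. The
snippet only prints the command unless RUN=1 is set, so it can be
checked before anything touches the filesystem.

```shell
# Sketch only: device paths come from the original report; the output
# file name is a hypothetical placeholder.
LOGDEV=/dev/logfs-sessions/logdev
DATADEV=/dev/logfs-sessions/sessions
OUT=/root/xfs_repair.out.noprefetch   # hypothetical name

# Same flags as the original run, with -P added to disable prefetching.
REPAIR_CMD="xfs_repair -P -v -l $LOGDEV $DATADEV"

# Print the command for review; start it only when RUN=1 is set explicitly.
if [ "${RUN:-0}" = "1" ]; then
    nohup $REPAIR_CMD > "$OUT" 2>&1 &
else
    echo "$REPAIR_CMD"
fi
```

Writing to a new output file keeps the log of the stalled run
available for comparison.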
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com