xfs_repair on large fs: out of memory

To: linux-xfs@xxxxxxxxxxx
Subject: xfs_repair on large fs: out of memory
From: "Ed L. Cashin" <ecashin@xxxxxxxxxx>
Date: Wed, 17 May 2006 15:36:06 -0400
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.11+cvs20060126
Hi.  When I try to run xfs_repair on a 13 TB filesystem, the program
either runs out of memory ...

  host02:~# sysctl vm.overcommit_memory=2
  vm.overcommit_memory = 2
  host02:~# /opt/xfsprogs-2.7.11/sbin/xfs_repair -v -n /dev/mapper/vg-lv 
  Phase 1 - find and verify superblock...
  
  fatal error -- couldn't allocate block map, size = 106834464
  host02:~# 

... or just uses up all the system's memory and is killed if I use a
more liberal vm.overcommit_memory setting.  An strace shows that mmap
is being called repeatedly.

...
  mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2201d65000
  mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2205057000
  mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2208349000
  mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b220b63b000
  +++ killed by SIGKILL +++
  Process 3146 detached
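
For the vm.overcommit_memory=2 failure shown earlier, one rough way to
check whether a request of that size could be granted is to compare it
with the kernel's commit accounting in /proc/meminfo (the CommitLimit
and Committed_AS fields).  The sketch below is only an illustration:
the helper names are made up here, and it assumes that under strict
overcommit a request is refused once Committed_AS plus the request
would exceed CommitLimit.

```python
# Size from the "couldn't allocate block map" message, assumed to be bytes.
BLOCK_MAP_BYTES = 106834464


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Fields reported in kB are converted to bytes; unitless fields
    (such as HugePages_Total) are kept as plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def commit_headroom(fields):
    """Bytes that can still be committed before CommitLimit is reached."""
    return fields["CommitLimit"] - fields["Committed_AS"]


def request_fits(fields, request_bytes):
    """True if the request stays within the remaining commit headroom."""
    return request_bytes <= commit_headroom(fields)


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On the affected host, request_fits(read_meminfo(), BLOCK_MAP_BYTES)
would show whether the block map allocation alone is over the limit.
The limit can be raised by adding swap or by increasing
vm.overcommit_ratio.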

This machine has a gigabyte of memory.  It's running an x86_64
2.6.16.13 kernel.
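
To put the numbers above in perspective, here is a back-of-the-envelope
calculation of how many of the mappings seen in the strace would fit in
this machine's RAM.  It is only a sketch: it assumes "a gigabyte" means
1 GiB, that every mapping ends up fully touched, and it ignores swap
and everything else that is using memory.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

MAPPING_BYTES = 53420032     # length argument of each mmap call in the strace
BLOCK_MAP_BYTES = 106834464  # size from the fatal error, assumed to be bytes
RAM_BYTES = 1 * GIB          # assumption: "a gigabyte" means 1 GiB


def to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB


def mappings_that_fit(ram_bytes, mapping_bytes):
    """Number of whole mappings of the given size that fit in RAM."""
    return ram_bytes // mapping_bytes


if __name__ == "__main__":
    print("each mmap:  %.1f MiB" % to_mib(MAPPING_BYTES))
    print("block map:  %.1f MiB" % to_mib(BLOCK_MAP_BYTES))
    print("mappings that fit in RAM:",
          mappings_that_fit(RAM_BYTES, MAPPING_BYTES))
```

Each mapping is about 51 MiB, so only about 20 of them fit in 1 GiB
even with nothing else running.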

I think that the XFS log is not OK because on mount attempts the
messages below appear in the logs.

  kernel: XFS mounting filesystem dm-0
  kernel: Starting XFS recovery on filesystem: dm-0 (logdev: internal)
  kernel: XFS: xlog_recover_process_data: bad clientid
  kernel: XFS: log mount/recovery failed: error 5
  kernel: XFS: log mount failed

Based on the "bad clientid" message, I think that "xfs_repair -L"
would be the next logical step, but it looks like a bug in xfs_repair
if it runs out of memory instead of telling me so.

-- 
  Ed L Cashin <ecashin@xxxxxxxxxx>