Hi. When I try to run xfs_repair on a 13 TB filesystem, the program
either runs out of memory ...
host02:~# sysctl vm.overcommit_memory=2
vm.overcommit_memory = 2
host02:~# /opt/xfsprogs-2.7.11/sbin/xfs_repair -v -n /dev/mapper/vg-lv
Phase 1 - find and verify superblock...
fatal error -- couldn't allocate block map, size = 106834464
host02:~#
... or, with a more liberal vm.overcommit_memory setting, uses up all
the system's memory and is killed. An strace shows that mmap is being
called repeatedly.
...
mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2201d65000
mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2205057000
mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2208349000
mmap(NULL, 53420032, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b220b63b000
+++ killed by SIGKILL +++
Process 3146 detached
This machine has a gigabyte of memory. It's running an x86_64
2.6.16.13 kernel.
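As a rough sanity check on the numbers above, the arithmetic can be
sketched in a few lines of Python. This is only back-of-the-envelope
accounting, not anything xfs_repair does itself. The function names are
made up for illustration. It assumes "a gigabyte" means 1 GiB, that
there is no swap (the swap size is not stated here), and that
vm.overcommit_ratio is at the kernel default of 50. In strict mode the
kernel also counts memory already committed by other processes, which
this sketch ignores.

```python
# Figures taken from the output quoted in this message.
MMAP_SIZE = 53420032         # bytes requested by each mmap call in the strace
BLOCK_MAP_SIZE = 106834464   # bytes, from the xfs_repair fatal error
RAM_BYTES = 1 * 1024**3      # assumption: "a gigabyte" means 1 GiB


def mappings_that_fit(ram_bytes, mmap_size):
    """Whole mappings of mmap_size bytes that fit in ram_bytes."""
    return ram_bytes // mmap_size


def commit_limit(ram_bytes, swap_bytes=0, overcommit_ratio=50):
    """Approximate commit limit under vm.overcommit_memory=2.

    Assumptions: swap_bytes=0 because the swap size is unknown, and
    overcommit_ratio=50 is the kernel default.
    """
    return swap_bytes + ram_bytes * overcommit_ratio // 100


if __name__ == "__main__":
    print("mappings that fit in RAM:", mappings_that_fit(RAM_BYTES, MMAP_SIZE))
    print("commit limit in MiB:", commit_limit(RAM_BYTES) // 2**20)
    print("block map in MiB:", round(BLOCK_MAP_SIZE / 2**20, 1))
```

Under those assumptions, about 20 mappings of that size would fill
physical memory if every page were touched, and the strict-mode commit
limit would be roughly half of RAM plus swap.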
I think that the XFS log is not OK because on mount attempts the
messages below appear in the logs.
kernel: XFS mounting filesystem dm-0
kernel: Starting XFS recovery on filesystem: dm-0 (logdev: internal)
kernel: XFS: xlog_recover_process_data: bad clientid
kernel: XFS: log mount/recovery failed: error 5
kernel: XFS: log mount failed
Based on the "bad clientid" message, I think that "xfs_repair -L"
would be the next logical step, but it seems like there must be a bug
in xfs_repair if it runs out of memory instead of telling me that.
--
Ed L Cashin <ecashin@xxxxxxxxxx>