http://oss.sgi.com/bugzilla/show_bug.cgi?id=718
Summary: xfs_repair 2.8.10 + 2.7.11 + 2.6.36 crash while trying
to repair corrupted fs
Product: Linux XFS
Version: Current
Platform: PC
OS/Version: Linux
Status: NEW
Severity: critical
Priority: P1
Component: xfsprogs
AssignedTo: xfs-master@xxxxxxxxxxx
ReportedBy: frank.baumgart@xxxxxxx
Linux kernel 2.6.18-rc4; the system was running 2.6.17 when the corruption crept in.
md RAID level 5 with 4*300 GB (xfs_growfs'ed from 3*300 GB several weeks before
without any apparent problems). The fs has never been out of space.
The load on this fs is very light.
See "xfs-info.log" for xfs_info output.
The corruption first affected a few files in a subtree with 500 GB of data.
1)
ran xfs_repair 2.6.36 (comes with SUSE 10.0)
result: xfs_repair crashed, removing the directory entry for the 500 GB subtree.
"df" still reports the same amount of used space as before the corruption.
"lost+found" is empty.
Since I now have to fear the loss of 500 GB of data that has no backup, and the
repair tool itself is failing, my trust has gone to zero, so subsequent tests
have been done with "-n" (no-modify mode).
2)
ran xfs_repair 2.8.10
result:
a) "xfs_repair -n" crashes in Phase 3 without any error message at all
(see "xfs.log-2.8.10-no-modify")
b) xfs_repair crashes with Electric Fence in Phase 3 with:
"ElectricFence Exiting: mprotect() failed: Cannot allocate memory"
c) running xfs_repair with debug info in gdb 6.3 crashes with some
un-backtraceable thread violation
Apparently the thread optimization may be good for speed, but it is not healthy
for the data recovery process. My trust is now way below zero.
Therefore I looked for the latest 2.7.x version, hoping to find a last stable,
single-threaded tool, but could only get hold of 2.7.11 (not 2.7.18).
3)
ran xfs_repair 2.7.11
a) xfs_repair (without "-n") stops in phase 4 (at least...) with:
...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
- clearing existing "lost+found" inode
- deleting existing "lost+found" entry
- check for inodes claiming duplicate blocks...
- agno = 0
data fork in regular inode 6694201 claims used block 197330880
xfs_repair: dinode.c:2433: process_dinode_int: Assertion `err == 0' failed.
b) xfs_repair -n with "ef" (Electric Fence) produces the "mprotect()" message in
Phase 4, agno=19.