http://oss.sgi.com/bugzilla/show_bug.cgi?id=631
Summary: xfs_repair fails to recover filesystem (phase 6)
Product: Linux XFS
Version: Current
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: xfsprogs
AssignedTo: xfs-master@xxxxxxxxxxx
ReportedBy: duncan.sands@xxxxxxx
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ...
rebuilding directory inode 128

fatal error -- can't read block 16777216 for directory inode 151153880

At first I thought that the hard drive had failed, but further tests
showed that it was OK; strace also showed that xfs_repair successfully
read every block it asked the kernel for. I then noticed that the
block number is 0x1000000 in hex, which seemed awfully suspicious,
and began to suspect a bug in xfs_repair. Here's a backtrace at the
failure point:
#0  libxfs_da_do_buf (trans=0x0, dp=0x877dcd0, bno=16777216,
    mappedbnop=0xbfb1d49c, bpp=0xbfb1d4bc,
    whichfork=16777216, caller=2, ra=0x8072b7a) at xfs_da_btree.c:2033
#1  0x08084ed4 in libxfs_da_read_bufr (trans=0x0, dp=0x0, bno=0, mappedbno=-1,
    bpp=0x0, whichfork=0) at util.c:667
#2  0x08072b7a in longform_dir2_check_node (mp=0xbfb1d820, ip=0x877dcd0,
    hashtab=0xbfb1d4c0, freetab=0x877e408)
    at phase6.c:2218
#3  0x08074dc5 in longform_dir2_entry_check (mp=0xbfb1d820, ino=151153880,
    ip=0x877dcd0, num_illegal=0xbfb1d744,
    need_dot=0xbfb1d748, stack=0xbfb1d7d0, irec=0x8177344, ino_offset=56)
    at phase6.c:2739
#4  0x08078ab8 in process_dirstack (mp=0xbfb1d820, stack=0xbfb1d7d0)
    at phase6.c:3560
#5  0x08079333 in phase6 (mp=0xbfb1d820) at phase6.c:3968
#6  0x0808041c in main (argc=0, argv=0xbfb1d820) at xfs_repair.c:509

This call to xfs_da_map_covers_blocks in xfs_da_do_buf returns zero
because mapp->br_startblock == HOLESTARTBLOCK (a zero return means the
mapping does not cover the requested blocks, which is treated as an
error):
2032 if (!xfs_da_map_covers_blocks(nmap, mapp, bno, nfsb)) {
Here nmap==1, *mapp=={br_startoff = 16777216, br_startblock =
18446744073709551614, br_blockcount = 1, br_state = XFS_EXT_NORM},
bno==16777216, nfsb==1.
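
To make the failing check concrete, here is a minimal sketch of a
"map covers blocks" test of the kind described above. It is not the
libxfs source: the sketch_ types and SKETCH_ constants are hypothetical
stand-ins, and the sentinel values are assumed to be (uint64_t)-1 and
(uint64_t)-2, the latter matching the 18446744073709551614 seen in
*mapp.

```c
#include <stdint.h>

/* Stand-in for an extent mapping record; not the real libxfs type. */
typedef struct {
    uint64_t br_startoff;    /* file offset of the extent, in blocks */
    uint64_t br_startblock;  /* disk block number, or a sentinel value */
    uint64_t br_blockcount;  /* length of the extent, in blocks */
} sketch_irec_t;

/* Assumed sentinel values: "delayed allocation" and "hole". */
#define SKETCH_DELAYSTARTBLOCK ((uint64_t)-1LL)
#define SKETCH_HOLESTARTBLOCK  ((uint64_t)-2LL)  /* 18446744073709551614 */

/*
 * Return 1 if the nmap mappings cover [bno, bno + count) contiguously
 * with real disk blocks, and 0 otherwise.
 */
static int sketch_map_covers_blocks(int nmap, const sketch_irec_t *mapp,
                                    uint64_t bno, uint64_t count)
{
    uint64_t off = bno;

    for (int i = 0; i < nmap; i++) {
        /* A hole or delayed extent has no disk block behind it. */
        if (mapp[i].br_startblock == SKETCH_HOLESTARTBLOCK ||
            mapp[i].br_startblock == SKETCH_DELAYSTARTBLOCK)
            return 0;
        /* Each mapping must start exactly where the previous one ended. */
        if (mapp[i].br_startoff != off)
            return 0;
        off += mapp[i].br_blockcount;
    }
    return off == bno + count;
}
```

With the values from the debugger (one mapping at offset 16777216 whose
start block is the hole sentinel), this sketch returns 0, which is
consistent with the behaviour reported here.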
Because mappedbno==-1, xfs_da_do_buf then returns EFSCORRUPTED,
which causes longform_dir2_check_node to abort with a call to
do_error:
2216            if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno,
                                XFS_DATA_FORK))
2217                    break;
2218            if (libxfs_da_read_bufr(NULL, ip, da_bno, -1, &bp,
2219                            XFS_DATA_FORK)) {
2220                    do_error(_("can't read block %u for directory inode "
2221                            "%llu\n"),
2222                            da_bno, ip->i_ino);

At this point I got stuck. Given that the failure occurs for a block
number with such a special bit pattern (a single set bit followed by
24 zero bits), I guess it could be due to an incorrect bit
manipulation that only fails when the low bits are all zero, or
something like that.
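
The bit pattern can be checked directly. The sketch below also shows
one possible, unconfirmed reading of the number: assuming the version-2
directory layout in which the free-space index section starts at byte
offset 2 * 32 GiB, and assuming 4 KiB directory blocks (the report does
not state the block size), the first free-index block would be number
16777216. All SKETCH_ names are hypothetical stand-ins, not the real
XFS macros.

```c
#include <stdint.h>

/* Assumed layout constants for version-2 directories (stand-ins only). */
#define SKETCH_DATA_ALIGN_LOG 3
#define SKETCH_SPACE_SIZE  (1ULL << (32 + SKETCH_DATA_ALIGN_LOG)) /* 32 GiB */
#define SKETCH_LEAF_OFFSET (1 * SKETCH_SPACE_SIZE)  /* leaf section start */
#define SKETCH_FREE_OFFSET (2 * SKETCH_SPACE_SIZE)  /* free-index start */

/* Convert a byte offset inside the directory to a block number. */
static uint64_t sketch_byte_to_block(uint64_t byte_off, unsigned blocklog)
{
    return byte_off >> blocklog;
}

/* Return 1 if exactly one bit is set in v, else 0. */
static int sketch_is_power_of_two(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```

Under these assumptions, sketch_byte_to_block(SKETCH_FREE_OFFSET, 12)
equals 16777216 (0x1000000), so the "suspicious" value may simply be
the first block of the directory's free-space index rather than the
result of a bad bit operation.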
Any ideas? I would really like to recover this filesystem...
Best wishes,
Duncan.