| To: | xfs@xxxxxxxxxxx |
|---|---|
| Subject: | repaired xfs corruption causing invalid directories? |
| From: | Martin Murray <murrayma@xxxxxxxxxxxxxxxxxxx> |
| Date: | Wed, 10 Dec 2008 05:13:53 -0500 |
| User-agent: | Mozilla-Thunderbird 2.0.0.17 (X11/20081018) |
|
I've got a ~3TB xfs filesystem built on an md raid5 of four 1TB drives.
Ignoring a long, embarrassing story, I accidentally nuked the first 32 odd
megabytes of the partition with dd. xfs_repair was able to recover almost
everything. Unsurprisingly, I must have zeroed out a number of directory
entries, as, suddenly, everything appeared in lost+found. We've been
identifying, renaming, and moving entries in lost+found back to their
original places, but have encountered something strange. I'm hoping someone
can give me some advice. I do realize that my problem is due to operator
error, but I'm hoping someone has a suggestion.

Recently, after a reboot, doing an ls -li, as root, on the filesystem gives
the following:

ls: cannot access debian: Invalid argument
ls: cannot access Documentation: Invalid argument
 6988868050 ??????????? ? ? ? ? ? debian
13645710122 ??????????? ? ? ? ? ? Documentation
 1073742080 drwxr-xr-x+  2 nobody nogroup   35 2008-11-03 20:20 temp/
  536872760 drwxrwsrw-+ 51 nobody nogroup 4096 2008-11-01 06:20 TigerLineData/
 3221230395 drwsrwsrwx+ 23 nobody nogroup 8192 2008-12-09 22:52 Win32/

For brevity, I've omitted most of the working subdirectories. The strange
part is that some of these directories, for example the debian one, were
created after the xfs_repair. Also, I cannot remove files in the directories,
nor even the directories themselves.

I tried doing an xfsdump over the net to another machine and noticed
screenfuls of errors similar to:

xfsdump: WARNING: unable to open directory: ino 4296482141: Invalid argument
xfsdump: WARNING: unable to open directory: ino 4296482142: Invalid argument
xfsdump: WARNING: unable to open directory: ino 4296482150: Invalid argument
...
xfsdump: WARNING: unable to open directory: ino 16796385598: Invalid argument
xfsdump: WARNING: unable to open directory: ino 16797367298: Invalid argument
xfsdump: WARNING: unable to open directory: ino 16797367299: Invalid argument

The xfsdump is still running. I hope to back up the still-reachable data
before addressing these invalid directories.

Right before I did the xfsdump, however, I ran xfs_db and printed out one of
the invalid directories:

xfs_db> inode 6988868050
xfs_db> print
core.magic = 0x494e
core.mode = 040755
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 5
core.onlink = 0
core.projid = 0
core.uid = 65534
core.gid = 65534
core.flushiter = 7
core.atime.sec = Sat Nov 8 00:50:05 2008
core.atime.nsec = 694489000
core.mtime.sec = Thu Oct 30 07:38:40 2008
core.mtime.nsec = 000000000
core.ctime.sec = Sat Nov 8 05:05:42 2008
core.ctime.nsec = 589799654
core.size = 4096
core.nblocks = 1
core.extsize = 0
core.nextents = 1
core.naextents = 0
core.forkoff = 11
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 3924941202
next_unlinked = null
u.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,436805271,1,0]
a.sfattr.hdr.totsize = 62
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 15
a.sfattr.list[0].valuelen = 40
a.sfattr.list[0].root = 1
a.sfattr.list[0].secure = 0
a.sfattr.list[0].name = "SGI_ACL_DEFAULT"
a.sfattr.list[0].value = "\000\000\000\003\000\000\000\001\377\377\377\377\000\a\000\000\000\000\000\004\377\377\377\377\000\005\000\000\000\000\000 \377\377\377\377\000\a\000\000"

Lastly, here is a copy of the xfs_info for the filesystem:

meta-data=/dev/md0 isize=256 agcount=32, agsize=22892816 blks
= sectsz=4096 attr=2
data = bsize=4096 blocks=732569856, imaxpct=5
= sunit=16 swidth=48 blks
naming =version 2 bsize=4096
log =internal bsize=4096 blocks=32768, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=196608 blocks=0, rtextents=0

I've started reading the XFS filesystem structure PDF, but I cannot yet infer
which field in the block would cause the invalid argument error. The one
thing I've noticed so far is that all the invalid directories appear to use
the extents format.

Oh, I can provide an xfs_metadump as soon as the xfsdump finishes.

Thank you,
Martin Murray |
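
For anyone wanting to see where the inode numbers above land on disk, here is a minimal sketch that splits an XFS inode number and a filesystem block number into allocation-group pieces. It assumes the usual XFS layout (AG number in the high bits, then the block within the AG, then the inode slot within the block) and takes agsize, bsize and isize from the xfs_info output above. The function names are hypothetical and are not part of xfsprogs; this only decodes numbers and says nothing about why the directories fail to open.

```python
def _log2_ceil(n):
    """Smallest k such that 2**k >= n (n must be positive)."""
    return (n - 1).bit_length()


def decode_xfs_ino(ino, agsize_blocks, block_size, inode_size):
    """Split an inode number into (AG number, block within AG, slot in block).

    Assumes the standard layout: agno | agbno | inode offset, where the
    offset field is log2(inodes per block) bits wide and the agbno field
    is ceil(log2(agsize)) bits wide.
    """
    inopblog = _log2_ceil(block_size // inode_size)  # bits for slot in block
    agblklog = _log2_ceil(agsize_blocks)             # bits for block in AG
    agino_bits = agblklog + inopblog
    agno = ino >> agino_bits
    agino = ino & ((1 << agino_bits) - 1)
    return agno, agino >> inopblog, agino & ((1 << inopblog) - 1)


def decode_xfs_fsblock(fsbno, agsize_blocks):
    """Split a filesystem block number (as in u.bmx) into (AG, block in AG)."""
    agblklog = _log2_ceil(agsize_blocks)
    return fsbno >> agblklog, fsbno & ((1 << agblklog) - 1)


if __name__ == "__main__":
    # Geometry taken from the xfs_info output in this message.
    AGSIZE, BSIZE, ISIZE = 22892816, 4096, 256
    for name, ino in [("debian", 6988868050), ("Documentation", 13645710122),
                      ("temp", 1073742080)]:
        print(name, decode_xfs_ino(ino, AGSIZE, BSIZE, ISIZE))
    # The single extent listed for the debian inode in the xfs_db output.
    print("debian dir block", decode_xfs_fsblock(436805271, AGSIZE))
```

Under those assumptions the debian inode and its one directory block both decode to allocation group 13, while the readable temp directory decodes to allocation group 2.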