xfs

repaired xfs corruption causing invalid directories?

To: xfs@xxxxxxxxxxx
Subject: repaired xfs corruption causing invalid directories?
From: Martin Murray <murrayma@xxxxxxxxxxxxxxxxxxx>
Date: Wed, 10 Dec 2008 05:13:53 -0500
User-agent: Mozilla-Thunderbird 2.0.0.17 (X11/20081018)
I've got a ~3TB xfs filesystem built on an md raid5 of four 1TB drives. Skipping a long, embarrassing story, I accidentally nuked the first 32-odd megabytes of the partition with dd. xfs_repair was able to recover almost everything. Unsurprisingly, I must have zeroed out a number of directory entries, because everything suddenly appeared in lost+found. We've been identifying, renaming, and moving entries in lost+found back to their original places, but have encountered something strange. I'm hoping someone can give me some advice.

I do realize that my problem is due to operator error, but I'm hoping someone has a suggestion.

Since a recent reboot, running ls -li as root on the filesystem gives the following:

ls: cannot access debian: Invalid argument
ls: cannot access Documentation: Invalid argument
 6988868050 ???????????  ? ?      ?          ? ?                debian
13645710122 ???????????  ? ?      ?          ? ?                Documentation
 1073742080 drwxr-xr-x+  2 nobody nogroup   35 2008-11-03 20:20 temp/
  536872760 drwxrwsrw-+ 51 nobody nogroup 4096 2008-11-01 06:20 TigerLineData/
 3221230395 drwsrwsrwx+ 23 nobody nogroup 8192 2008-12-09 22:52 Win32/

For brevity, I've omitted most of the working subdirectories. The strange part is that some of the broken directories, for example the debian one, were created after the xfs_repair. Also, I cannot remove files in these directories, nor the directories themselves. I tried doing an xfsdump over the net to another machine and noticed screenfuls of errors similar to:

xfsdump: WARNING: unable to open directory: ino 4296482141: Invalid argument
xfsdump: WARNING: unable to open directory: ino 4296482142: Invalid argument
xfsdump: WARNING: unable to open directory: ino 4296482150: Invalid argument
...
xfsdump: WARNING: unable to open directory: ino 16796385598: Invalid argument
xfsdump: WARNING: unable to open directory: ino 16797367298: Invalid argument
xfsdump: WARNING: unable to open directory: ino 16797367299: Invalid argument

The xfsdump is still running. I hope to back up the still-reachable data before addressing these invalid directories. Right before I started the xfsdump, however, I ran xfs_db and printed out one of the invalid directories:

xfs_db> inode 6988868050
xfs_db> print
core.magic = 0x494e
core.mode = 040755
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 5
core.onlink = 0
core.projid = 0
core.uid = 65534
core.gid = 65534
core.flushiter = 7
core.atime.sec = Sat Nov  8 00:50:05 2008
core.atime.nsec = 694489000
core.mtime.sec = Thu Oct 30 07:38:40 2008
core.mtime.nsec = 000000000
core.ctime.sec = Sat Nov  8 05:05:42 2008
core.ctime.nsec = 589799654
core.size = 4096
core.nblocks = 1
core.extsize = 0
core.nextents = 1
core.naextents = 0
core.forkoff = 11
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 3924941202
next_unlinked = null
u.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,436805271,1,0]
a.sfattr.hdr.totsize = 62
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 15
a.sfattr.list[0].valuelen = 40
a.sfattr.list[0].root = 1
a.sfattr.list[0].secure = 0
a.sfattr.list[0].name = "SGI_ACL_DEFAULT"
a.sfattr.list[0].value = "\000\000\000\003\000\000\000\001\377\377\377\377\000\a\000\000\000\000\000\004\377\377\377\377\000\005\000\000\000\000\000 \377\377\377\377\000\a\000\000"

Lastly, here is a copy of the xfs_info for the filesystem:

meta-data=/dev/md0               isize=256    agcount=32, agsize=22892816 blks
         =                       sectsz=4096  attr=2
data     =                       bsize=4096   blocks=732569856, imaxpct=5
         =                       sunit=16     swidth=48 blks
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=196608 blocks=0, rtextents=0

I've started reading the XFS filesystem structure PDF, but I cannot yet tell which field in the inode or directory block would cause the "Invalid argument" error. The one thing I've noticed so far is that all of the invalid directories appear to be in extents format.
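
In case it helps anyone cross-check the numbers, here is a minimal Python sketch that splits the inode numbers from the listings above into allocation group, block within the AG, and slot within the block. It assumes the inode number layout as I understand it from the structure document (AG number in the high bits, then the AG-relative block, then the inode's index within that block), with the field widths derived from the agsize, bsize and isize reported by xfs_info. The function names are my own and are not part of any XFS tool.

```python
# Geometry copied from the xfs_info output in this post.
AG_SIZE_BLOCKS = 22892816   # agsize, in filesystem blocks
BLOCK_SIZE = 4096           # data bsize
INODE_SIZE = 256            # isize

# Assumed layout: [AG number | block within AG | inode slot within block].
AG_BLOCK_BITS = (AG_SIZE_BLOCKS - 1).bit_length()                  # ceil(log2(agsize))
INODE_SLOT_BITS = (BLOCK_SIZE // INODE_SIZE).bit_length() - 1      # log2(inodes per block)


def decode_ino(ino):
    """Split an absolute inode number into (AG, AG-relative block, slot)."""
    slot = ino & ((1 << INODE_SLOT_BITS) - 1)
    ag_block = (ino >> INODE_SLOT_BITS) & ((1 << AG_BLOCK_BITS) - 1)
    ag = ino >> (AG_BLOCK_BITS + INODE_SLOT_BITS)
    return ag, ag_block, slot


def fits_in_32_bits(ino):
    """True if the inode number can be represented in an unsigned 32-bit value."""
    return ino <= 0xFFFFFFFF


# Inode numbers taken from the ls -li and xfsdump output in this post.
inodes = {
    "debian": 6988868050,
    "Documentation": 13645710122,
    "temp": 1073742080,
    "TigerLineData": 536872760,
    "Win32": 3221230395,
    "xfsdump warning (first)": 4296482141,
    "xfsdump warning (last)": 16797367299,
}

for name, ino in inodes.items():
    ag, ag_block, slot = decode_ino(ino)
    print(f"{name}: ino={ino} ag={ag} agbno={ag_block} slot={slot} "
          f"fits_in_32_bits={fits_in_32_bits(ino)}")
```

With this geometry the debian inode lands in AG 13, which matches the AG I get by shifting the startblock in its u.bmx[0] extent right by the same AG block width, so the assumed layout looks at least self-consistent. Every broken inode I have listed here is larger than 2^32 - 1, while the three working directories shown are not. I don't know whether that is related to the error or just a coincidence of where things were allocated.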

Oh, and I can provide an xfs_metadump as soon as the xfsdump finishes.

Thank you, Martin Murray
