I've got a ~3TB xfs filesystem built on an md raid5 of four 1TB drives.
Skipping a long, embarrassing story, I accidentally nuked the first 32-odd
megabytes of the partition with dd. xfs_repair was able to recover almost
everything. Unsurprisingly, I must have zeroed out a number of directory
entries, because everything suddenly appeared in lost+found. We've been
identifying, renaming, and moving entries in lost+found back to their
original places, but have encountered something strange.
I do realize that my problem is due to operator error, but I'm hoping
someone can give me some advice.
Recently, after a reboot, doing an ls -li, as root, on the filesystem gives
the following:
ls: cannot access debian: Invalid argument
ls: cannot access Documentation: Invalid argument
6988868050 ??????????? ? ? ? ? ? debian
13645710122 ??????????? ? ? ? ? ? Documentation
1073742080 drwxr-xr-x+ 2 nobody nogroup 35 2008-11-03 20:20 temp/
536872760 drwxrwsrw-+ 51 nobody nogroup 4096 2008-11-01 06:20 TigerLineData/
3221230395 drwsrwsrwx+ 23 nobody nogroup 8192 2008-12-09 22:52 Win32/
For brevity, I've omitted most of the working subdirectories. The strange
part is that some of these directories, for example the debian one, were
created after the xfs_repair. Also, I cannot remove files in the directories,
nor even the directories themselves. I tried doing an xfsdump over the net
to another machine and noticed screenfuls of errors similar to:
xfsdump: WARNING: unable to open directory: ino 4296482141: Invalid argument
xfsdump: WARNING: unable to open directory: ino 4296482142: Invalid argument
xfsdump: WARNING: unable to open directory: ino 4296482150: Invalid argument
...
xfsdump: WARNING: unable to open directory: ino 16796385598: Invalid argument
xfsdump: WARNING: unable to open directory: ino 16797367298: Invalid argument
xfsdump: WARNING: unable to open directory: ino 16797367299: Invalid argument
The xfsdump is still running. I hope to back up the still-reachable data
before addressing these invalid directories. Right before I started the
xfsdump, however, I ran xfs_db and printed out one of the invalid directories:
xfs_db> inode 6988868050
xfs_db> print
core.magic = 0x494e
core.mode = 040755
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 5
core.onlink = 0
core.projid = 0
core.uid = 65534
core.gid = 65534
core.flushiter = 7
core.atime.sec = Sat Nov 8 00:50:05 2008
core.atime.nsec = 694489000
core.mtime.sec = Thu Oct 30 07:38:40 2008
core.mtime.nsec = 000000000
core.ctime.sec = Sat Nov 8 05:05:42 2008
core.ctime.nsec = 589799654
core.size = 4096
core.nblocks = 1
core.extsize = 0
core.nextents = 1
core.naextents = 0
core.forkoff = 11
core.aformat = 1 (local)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 3924941202
next_unlinked = null
u.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,436805271,1,0]
a.sfattr.hdr.totsize = 62
a.sfattr.hdr.count = 1
a.sfattr.list[0].namelen = 15
a.sfattr.list[0].valuelen = 40
a.sfattr.list[0].root = 1
a.sfattr.list[0].secure = 0
a.sfattr.list[0].name = "SGI_ACL_DEFAULT"
a.sfattr.list[0].value = "\000\000\000\003\000\000\000\001\377\377\377\377\000\a\000\000\000\000\000\004\377\377\377\377\000\005\000\000\000\000\000 \377\377\377\377\000\a\000\000"
Lastly, here is a copy of the xfs_info for the filesystem:
meta-data=/dev/md0 isize=256 agcount=32, agsize=22892816 blks
= sectsz=4096 attr=2
data = bsize=4096 blocks=732569856, imaxpct=5
= sunit=16 swidth=48 blks
naming =version 2 bsize=4096
log =internal bsize=4096 blocks=32768, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=196608 blocks=0, rtextents=0
I've started reading the XFS filesystem structure PDF, but I cannot yet
infer which field in the inode would cause the "Invalid argument" error. The
one thing I've noticed so far is that all of the invalid directories appear
to use the extents format.
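
As a rough aid for reading the numbers above, here is a minimal sketch that splits an inode number or filesystem block number into its allocation-group parts. It assumes the usual XFS packing (AG number in the high bits, then block within the AG, then inode within the block) and takes the bit widths from the xfs_info geometry shown earlier. The names split_inode and split_fsblock are made up for illustration and are not part of any xfs tool.

```python
# Geometry copied from the xfs_info output in this post.
BLOCK_SIZE = 4096       # data bsize
INODE_SIZE = 256        # isize
AG_BLOCKS = 22892816    # agsize, in filesystem blocks

# Assumed bit widths: enough bits to index every inode in a block
# and every block in an allocation group.
INOPB_LOG = (BLOCK_SIZE // INODE_SIZE - 1).bit_length()   # inodes per block
AGBLK_LOG = (AG_BLOCKS - 1).bit_length()                  # blocks per AG


def split_inode(ino):
    """Split an inode number into AG number, block in AG, and slot in block."""
    agino_bits = AGBLK_LOG + INOPB_LOG
    agino = ino & ((1 << agino_bits) - 1)
    return {
        "agno": ino >> agino_bits,
        "agbno": agino >> INOPB_LOG,
        "offset": agino & ((1 << INOPB_LOG) - 1),
        "fits_32bit": ino < 2**32,
    }


def split_fsblock(fsb):
    """Split a filesystem block number (as in u.bmx) into (AG number, block in AG)."""
    return fsb >> AGBLK_LOG, fsb & ((1 << AGBLK_LOG) - 1)


if __name__ == "__main__":
    # Inode numbers taken from the ls -li and xfsdump output in this post.
    for ino in (6988868050, 13645710122, 4296482141,
                1073742080, 536872760, 3221230395):
        print(ino, split_inode(ino))
    # Directory block from the xfs_db print of inode 6988868050.
    print(split_fsblock(436805271))
```

Under these assumptions, inode 6988868050 and its single directory block 436805271 both land in AG 13, so the extent record at least points into the same allocation group as the inode. The fits_32bit flag is included only because every unreadable inode number quoted in this post is larger than 2^32, while the three readable ones are not; the sketch does not establish whether that is related to the error.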
Oh, I can provide an xfs_metadump as soon as the xfsdump finishes.
Thank you, Martin Murray