
recovering corrupt filesystem after raid failure

To: xfs@xxxxxxxxxxx
Subject: recovering corrupt filesystem after raid failure
From: David Lechner <david@xxxxxxxxxxxxxx>
Date: Sun, 21 Feb 2016 19:29:54 -0600
Long story short, I had a dual-disk failure in a RAID 5 array. I've managed to
get the array back up and salvaged what I could. However, the XFS filesystem is
seriously damaged. I've tried running xfs_repair, but it is failing, and it
recommended sending a message to this mailing list. This is an Ubuntu
12.04 machine, so xfs_repair is version 3.1.7.


The filesystem won't mount; it fails with "mount: Structure needs
cleaning". So I tried xfs_repair. I had to resort to xfs_repair -L
because the first 500MB or so of the filesystem was wiped out. Now,
plain xfs_repair /dev/md127 gets stuck, so I am running xfs_repair -P
/dev/md127 instead. This gets much farther, but it fails too, with an
error message like this:


...
disconnected inode 2101958, moving to lost+found
corrupt dinode 2101958, extent total = 1, nblocks = 0.  This is a bug.
Please capture the filesystem metadata with xfs_metadump and
report it to xfs@xxxxxxxxxxxx
cache_node_purge: refcount was 1, not zero (node=0x7f2c57e1b120)

fatal error -- 117 - couldn't iget disconnected inode
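For completeness, the sequence of commands above boils down to the following (a sketch; /dev/md127 is my array, and everything is wrapped in an echo-only `run` helper so it can be reviewed rather than blindly executed against someone else's device):

```shell
# Recovery sequence described above (sketch).
# "run" only echoes the command; drop the wrapper to actually execute.
DEV=/dev/md127
run() { echo "+ $*"; }

run mount "$DEV" /mnt       # fails: "mount: Structure needs cleaning"
run xfs_repair -L "$DEV"    # zero the log (first ~500MB of the fs was wiped)
run xfs_repair -P "$DEV"    # disable prefetching; gets much farther, then dies
```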



However, nblocks = 0 does not seem to be true; xfs_db reports core.nblocks = 69 for that same inode...

xfs_db -x /dev/md127
cache_node_purge: refcount was 1, not zero (node=0x219c9e0)
xfs_db: cannot read root inode (117)
cache_node_purge: refcount was 1, not zero (node=0x21a0620)
xfs_db: cannot read realtime bitmap inode (117)
xfs_db> inode 2101958
xfs_db> print
core.magic = 0x494e
core.mode = 0100664
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 1
core.onlink = 0
core.projid_lo = 0
core.projid_hi = 0
core.uid = 119
core.gid = 133
core.flushiter = 5
core.atime.sec = Sun Apr 26 02:30:54 2015
core.atime.nsec = 000000000
core.mtime.sec = Fri Nov  7 14:54:27 2014
core.mtime.nsec = 000000000
core.ctime.sec = Sun Apr 26 02:30:54 2015
core.ctime.nsec = 941028318
core.size = 279864
core.nblocks = 69
core.extsize = 0
core.nextents = 1
core.naextents = 0
core.forkoff = 0
core.aformat = 2 (extents)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 3320313054
next_unlinked = null
u.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,147322885,69,0]


If I re-run xfs_repair -P /dev/md127, it fails on a different,
seemingly random inode with the same error message.
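In case it helps with cross-checking these inode numbers against the metadump, this is roughly how an absolute XFS inode number splits into AG / AG-block / offset. The two log2 values below are placeholder examples, not my filesystem's real geometry; the real ones are sb_inopblog and sb_agblklog from the superblock (xfs_db's sb print shows them), which I have not copied here:

```shell
# Decode an absolute XFS inode number into (agno, agbno, offset).
# INOPBLOG/AGBLKLOG are ASSUMED example values -- read the real
# sb_inopblog and sb_agblklog from the superblock before trusting this.
INO=2101958
INOPBLOG=5        # log2(inodes per block) -- assumed example
AGBLKLOG=23       # log2(blocks per AG), rounded up -- assumed example

OFFSET=$(( INO & ((1 << INOPBLOG) - 1) ))
AGBNO=$(( (INO >> INOPBLOG) & ((1 << AGBLKLOG) - 1) ))
AGNO=$(( INO >> (INOPBLOG + AGBLKLOG) ))
echo "ino $INO -> agno=$AGNO agbno=$AGBNO offset=$OFFSET"
```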

I've uploaded the output of xfs_metadump to Dropbox if anyone would like
to have a look. It is 22MB compressed, 2.2GB uncompressed.

https://www.dropbox.com/s/o18cxapu7o75sor/xfs_metadump.xz?dl=0
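For reference, the dump was produced roughly like this (a sketch; the output filename is just what I picked, and the commands are echoed rather than executed so nobody runs them against the wrong device by accident):

```shell
# How the uploaded metadump was generated (sketch; echo-only).
DEV=/dev/md127
echo xfs_metadump "$DEV" md127.metadump   # copies metadata only, no file data
echo xz md127.metadump                    # 2.2GB metadump -> ~22MB .xz
```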
