XFS filesystem recovery from secondary superblocks

Aaron Goulding aarongldng at gmail.com
Wed Oct 31 00:02:28 CDT 2012


Hello! So I have an XFS filesystem that isn't mounting, and it's quite a long
story as to why and what I've tried so far.

And before you start: yes, backups are the preferred method of restoration
at this point. Never trust your files to a single FS, etc.

So I have a 9-disk MD array (0.9 superblock format, 14TB total usable space)
configured as an LVM PV, with one VG and then one LV with not quite all of
the space allocated. That LV is formatted XFS and mounted as /mnt/storage.
This was set up on Ubuntu 10.04, which has since been release-upgraded to
12.04. The LV has been grown three times over the last two years. The
system's boot, root and swap partitions are on a separate drive.
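
For reference, that's the stack from bottom to top, and these are the
commands I'd normally use to inspect each layer (just a sketch of the
tooling, not output from the broken system):

cat /proc/mdstat              # MD array state and member disks
mdadm --detail /dev/md0       # superblock version, layout, device roles
pvs && vgs && lvs             # the LVM PV -> VG -> LV stack
xfs_info /mnt/storage         # XFS geometry, back when it still mounted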

So what happened? Well, one drive died spectacularly: a full bearing failure
caused a power drain, and the system instantly kicked out two more drives.
This put the array into an offline state, as expected. I replaced the failed
drive with a new one and carefully checked the disk order before attempting
to re-assemble the array. At the time, I didn't know about mdadm --re-add.
(Likely my first mistake.)
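
(For anyone following along, the gentler approach I should have reached for
first looks roughly like this; the device names are just the ones from my
box and the member list here is a sketch from memory:)

mdadm --stop /dev/md0
# reassemble from the existing 0.9 superblocks, forcing the kicked-out
# members back in rather than recreating the array
mdadm --assemble --force /dev/md0 /dev/sd[bcdefghj]1
# or, with the array running degraded, re-add a single kicked member
mdadm /dev/md0 --re-add /dev/sdh1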

mdadm --create --assume-clean --level=6 --raid-devices=9 /dev/md0 \
    /dev/sdg1 missing /dev/sdh1 /dev/sdj1 /dev/sdd1 /dev/sdb1 /dev/sde1 \
    /dev/sdf1 /dev/sdc1

The first problem with this is that the Ubuntu upgrade meant the newer mdadm
created the superblocks in 1.2 format instead of 0.9. Not catching this, I
then added in the replacement /dev/sdi1, which started the array rebuilding
incorrectly. I quickly realized my mistake and stopped the array, then
recreated it again, this time using the 0.9 superblock format, but the
damage had already been done to roughly the first 100GB of the array,
possibly more.
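
In hindsight I should have checked and pinned the metadata format
explicitly, something like:

# see which superblock version an existing member carries
mdadm --examine /dev/sdg1 | grep -i version
# and force the old format when recreating
mdadm --create --assume-clean --metadata=0.90 --level=6 --raid-devices=9 \
    /dev/md0 /dev/sdg1 missing /dev/sdh1 /dev/sdj1 /dev/sdd1 /dev/sdb1 \
    /dev/sde1 /dev/sdf1 /dev/sdc1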

I attempted to restore the LVM physical volume label and metadata from the
backup stored in /etc/lvm/backup/:

pvcreate -f -vv --uuid "hJrAn2-wTd8-vY11-steD-23Jh-AwKK-4VvnkH" \
    --restorefile /etc/lvm/backup/vg1 /dev/md0
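
(My understanding of the usual sequence for bringing a VG back from that
backup, sketched with the UUIDs and paths from my own config, is:)

# recreate the PV label using pv0's id from the backup file
pvcreate --uuid "VRHqH4-oIje-iQWV-iLUL-dLXX-eEf9-mLd9Z7" \
    --restorefile /etc/lvm/backup/vg1 /dev/md0
# then restore the VG metadata itself and activate the LVs
vgcfgrestore -f /etc/lvm/backup/vg1 vg1
vgchange -ay vg1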

When that failed, I decided to attach a second array so I could examine the
problem without further risk to the original. I built a second MD array with
seven 3TB disks in RAID6, giving me a 15TB /mnt/restore volume to work with,
and made a dd copy of /dev/md0 to a test file I could manipulate safely.
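
The copy, and a read-only way to poke at it afterwards, looked roughly like
this (flags are a sketch):

# raw image of the damaged array onto the new 15TB volume
dd if=/dev/md0 of=/mnt/restore/md0.dat bs=64M conv=noerror,sync
# attach the image read-only as a block device for experiments
losetup --find --show --read-only /mnt/restore/md0.dat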

Once I had the file created, I tried xfs_repair -f /mnt/restore/md0.dat,
with no luck. I then used a hex editor to put XFSB at the beginning, hoping
the repair would just clean up around the LVM data, but got similar results.
The output looks like the following:

Phase 1 - find and verify superblock...
bad primary superblock - bad or unsupported version !!!

attempting to find secondary superblock...
....................................................................................................
unable to verify superblock, continuing...
[the two lines above repeat ten times in total]
....................................................................................................
Exiting now.

Running xfs_db /mnt/restore/md0.dat appeared to run out of memory.
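
One thing I can still do by hand is hunt for secondary superblock copies
directly, since every allocation group should begin with the XFSB magic.
A crude scan of the start of the image (the 200G limit is arbitrary):

# print byte offsets of "XFSB" in the first 200GiB of the image; real
# secondary superblocks should show up at regular AG-sized intervals
head -c 200G /mnt/restore/md0.dat | \
    grep --byte-offset --only-matching --text XFSB | head -n 20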

So I realized I needed to pull the data out of LVM and re-assemble it
properly if I was going to make any progress, so I checked the backup
config again:

# Generated by LVM2 version 2.02.66(2) (2010-05-20): Sun Jul 29 13:40:58 2012

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing 'vgcfgbackup'"

creation_host = "jarvis"        # Linux jarvis 3.0.0-23-server #39-Ubuntu
SMP Thu Jul 19 19:37:41 UTC 2012 x86_64
creation_time = 1343594458      # Sun Jul 29 13:40:58 2012

vg1 {
        id = "hJrAn2-wTd8-vY11-steD-23Jh-AwKK-4VvnkH"
        seqno = 19
        status = ["RESIZEABLE", "READ", "WRITE"]
        flags = []
        extent_size = 8192              # 4 Megabytes
        max_lv = 0
        max_pv = 0

        physical_volumes {

                pv0 {
                        id = "VRHqH4-oIje-iQWV-iLUL-dLXX-eEf9-mLd9Z7"
                        device = "/dev/md0"     # Hint only

                        status = ["ALLOCATABLE"]
                        flags = []
                        dev_size = 27349166336  # 12.7354 Terabytes
                        pe_start = 768
                        pe_count = 3338521      # 12.7354 Terabytes
                }
        }

        logical_volumes {

                storage {
                        id = "H47IMn-ohEG-3W6l-NfCu-ePjJ-U255-FcIjdp"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        segment_count = 4

                        segment1 {
                                start_extent = 0
                                extent_count = 2145769  # 8.18546 Terabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 25794
                                ]
                        }
                        segment2 {
                                start_extent = 2145769
                                extent_count = 626688   # 2.39062 Terabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 2174063
                                ]
                        }
                        segment3 {
                                start_extent = 2772457
                                extent_count = 384170   # 1.46549 Terabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 2954351
                                ]
                        }
                        segment4 {
                                start_extent = 3156627
                                extent_count = 140118   # 547.336 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 2800751
                                ]
                        }
                }
        }
}

So I noticed that segment4's data physically precedes segment3's on the PV
(its stripe starts at extent 2800751, versus 2954351 for segment3), and the
standard extent size was 4MB, so I wrote the following:

echo "writing seg 1 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=0 skip=25794 count=2145769

echo "writing seg 2 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=2145769 skip=2174063 count=626688

echo "writing seg 3 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=2772457 skip=2954351 count=384170

echo "writing seg 4 .."
dd if=/dev/md0 of=/dev/md1 bs=4194304 seek=3156627 skip=2800751 count=140118
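
As a sanity check on those numbers, the byte offsets the skip values
correspond to can be printed first (a small sketch; one extent = 4MiB):

for ext in 25794 2174063 2954351 2800751; do
    echo "PV extent $ext -> byte offset $((ext * 4194304))"
done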

Then, just to make sure things were clean, I zeroed out the remainder of
/dev/md1.

I used the hex editor (shed) again to make sure the first four bytes on the
device were XFSB.
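
(A quicker way to check the magic without a hex editor, for what it's
worth:)

# the first four bytes of a valid XFS filesystem should read X F S B
dd if=/dev/md1 bs=4 count=1 2>/dev/null | od -c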

Once done, I tried xfs_repair again, this time on /dev/md1, with the same
results as above.

Next I tried xfs_db /dev/md1 to see if anything would load, and got the
following:

root at jarvis:/mnt# xfs_db /dev/md1
Floating point exception

With the following in dmesg:

[1568395.691767] xfs_db[30966] trap divide error ip:41e4b5 sp:7fff5db8ab90
error:0 in xfs_db[400000+6a000]
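
What I was hoping to do, if xfs_db would start at all, was inspect one of
the secondary superblocks read-only, along the lines of:

# dump the superblock copy from AG 1 without writing anything
xfs_db -r -c "sb 1" -c "print" /dev/md1
# (add -f when pointing it at the image file instead of a block device)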


So at this point I'm stumped. I'm hoping one of you clever folks out there
might have some next steps I can take. I'm okay with a partial recovery,
and I'm okay if the directory tree gets horked and I have to dig through
lost+found, but I'd really like to at least be able to recover something
from this. I'm happy to post any info needed on this.

Thanks!

-Aaron