badly destroyed XFS (on LVM) - how to repair?

To: xfs@xxxxxxxxxxx
Subject: badly destroyed XFS (on LVM) - how to repair?
From: Michael Monnerie <michael.monnerie@xxxxxxxxxxxxxxxxxxx>
Date: Sat, 12 May 2012 14:49:56 +0200
Organization: it-management http://it-management.at
User-agent: KMail/4.7.2 (Linux/3.3.3-zmi; KDE/4.7.2; x86_64; ; )
(This might be an LVM problem, but who knows?)

I have an XFS here that was on a server running within a XenServer
machine. In the end it consisted of 4x 2TB disks:
# pvscan
  PV /dev/xvdg   VG sharestore   lvm2 [1.95 TiB / 0    free]
  PV /dev/xvdf   VG sharestore   lvm2 [1.95 TiB / 0    free]
  PV /dev/xvdc   VG sharestore   lvm2 [1.95 TiB / 0    free]
  PV /dev/xvde   VG sharestore   lvm2 [1.95 TiB / 0    free]
  Total: 4 [7.81 TiB] / in use: 4 [7.81 TiB] / in no VG: 0 [0   ]

Then, by accident, the admin ran "fdisk /dev/xvdg" and created a partition
like this:
Disk /dev/xvdg: 1073 MB, 1073741824 bytes
139 heads, 8 sectors/track, 1885 cylinders
Units = cylinders of 1112 × 512 = 569344 bytes
Disk identifier: 0xb30cf4db

   Device  Boot       Start         End     Blocks   Id  System
/dev/xvdg1               1        1886     1048448   82  Linux swap

(partition starting at sector 63), and then ran "mkswap /dev/xvdg1". After a
reboot, LVM no longer recognized the whole disk.
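
To make the affected region concrete, here is a minimal sketch (Python, not part of the original session) that turns the partition figures above into byte offsets on /dev/xvdg. It assumes 512-byte sectors and that the block count in the fdisk listing (1048448) is in 1 KiB units, which is how older fdisk versions report it; the 64 MiB extent size is taken from the pvdisplay output further down. The resulting span is only an upper bound on what could have been touched: mkswap itself normally writes just a small header near the start of the partition.

```python
SECTOR = 512                 # assumed sector size in bytes
KIB = 1024
MIB = 1024 * KIB

start_sector = 63            # partition start, from the post
blocks_kib = 1048448         # block count from fdisk, assumed to be 1 KiB units
pe_size = 64 * MIB           # PE Size from pvdisplay

start_byte = start_sector * SECTOR
length = blocks_kib * KIB
end_byte = start_byte + length

# How many 64 MiB extents' worth of space the partition covers
extents_covered = length / pe_size

print(f"partition starts at byte {start_byte}")
print(f"partition ends before byte {end_byte}")
print(f"partition covers about {extents_covered:.2f} extents of 64 MiB")
```

Under these assumptions the swap partition sits entirely within roughly the first 1 GiB of the disk, so any damage from it would be confined to the first 16 or so extents of that PV.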

# pvscan
  Couldn't find device with uuid 396XfX-EbMZ-0J6q-C3bj-3n6d-vruJ-6Oiy7w.
  PV unknown device   VG sharestore   lvm2 [1.95 TiB / 0    free]
  PV /dev/xvdf        VG sharestore   lvm2 [1.95 TiB / 0    free]
  PV /dev/xvdc        VG sharestore   lvm2 [1.95 TiB / 0    free]
  PV /dev/xvde        VG sharestore   lvm2 [1.95 TiB / 0    free]
  Total: 4 [7.81 TiB] / in use: 4 [7.81 TiB] / in no VG: 0 [0   ]

This is where I jumped in. I must say that in the meantime the source VM
got deleted, and that I only got access to the data disks. I believe
/dev/xvdg was the very first of those LVM disks before, but I'm not
sure.

I tried "pvcreate --uuid 396XfX-EbMZ-0J6q-C3bj-3n6d-vruJ-6Oiy7w
--norestorefile /dev/xvdg", which did not succeed.
Then I made a backup of the first sectors of /dev/xvdg, ran
"dd if=/dev/xvdf of=/dev/xvdg bs=512 count=63", and tried again with
"pvcreate --uuid 396XfX-EbMZ-0J6q-C3bj-3n6d-vruJ-6Oiy7w --norestorefile
/dev/xvdg" - this time it worked. Strange thing: /dev/sharestore/public
is not created; the volume is only accessible via /dev/dm-0.
I can mount the XFS, but it is damaged; "ls" shows:
ls: cannot access /1/hope: Invalid argument
ls: cannot access /1/jog: Invalid argument
ls: cannot access /1/maza: Invalid argument
ls: cannot access /1/public: Invalid argument
ls: cannot access /1/upload: Invalid argument
ls: cannot access /1/du.old: Invalid argument
ls: cannot access /1/.fsr: Invalid argument
total 45
drwxrwx---  17 root       1000 4096 May  1 00:00 ./
drwxr-xr-x  25 root    root     632 May 12 12:42 ../
???????????  ? ?       ?          ?            ? .fsr
drwx------   7    1007 nogroup 4096 Oct 19  2010 anse/
-rw-r--r--   1 root    root     951 Jan  1 00:10 du.20120101
-rw-r--r--   1 root    root     456 Feb  1 00:10 du.20120201
-rw-r--r--   1 root    root     455 Mar  1 00:11 du.20120301
-rw-r--r--   1 root    root     464 Apr  1 00:06 du.20120401
-rw-r--r--   1 root    root     464 May  1 00:00 du.20120501
???????????  ? ?       ?          ?            ? du.old
-rwx------   1 root    root     253 Nov  7  2010 find-inode.sh*
???????????  ? ?       ?          ?            ? hope
drwxrwxr-x+  4    1007 nogroup   49 Nov 29  2009 itm/
???????????  ? ?       ?          ?            ? jog
drwx------   6 makedns nogroup 4096 Aug 24  2010 lama/
???????????  ? ?       ?          ?            ? maza
drwx------   2    1008 nogroup   68 Jan 12  2010 paan/
???????????  ? ?       ?          ?            ? public
drwxrwxr-t   5 root    www     4096 Mar 17 11:33 tmp/
drwxr-xr-x   2 nobody  root     144 Mar 17 11:41 torrent/
???????????  ? ?       ?          ?            ? upload
drwx------   2    1003 nogroup   88 Nov 23  2009 vop/

Then I ran xfs_metadump and xfs_repair, both from version 3.0.1,
which does not seem to work. xfs_repair said this:

# xfs_repair -n /dev/dm-0 2>&1|tee xfs.log
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

So it found no error. I then installed xfsprogs 3.1.8, and tried the
repair on the metadump:

# xfs_repair xfs.metadump
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................found
candidate secondary superblock...
superblock read failed, offset 1073976639488, size 131072, ag 1, rval 0

fatal error -- No such file or directory
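
As a quick sanity check on that error, here is a small sketch (Python, not part of the original session) that converts the reported offset and size into more readable units. Both numbers are copied from the xfs_repair message above; the rest is plain arithmetic. The read can only succeed on a file or device at least as large as the offset, so it would run past the end of the metadump file if that file is smaller (an assumption, since its size is not given here). The xfs_metadump man page describes xfs_mdrestore as the tool for turning a metadump back into a filesystem image.

```python
offset = 1073976639488   # bytes, from the xfs_repair message
size = 131072            # bytes, from the xfs_repair message

GIB = 2**30
TIB = 2**40

offset_gib = offset / GIB
offset_tib = offset / TIB
size_kib = size // 1024
sector_aligned = offset % 512 == 0

print(f"offset: {offset_gib:.2f} GiB ({offset_tib:.3f} TiB)")
print(f"read size: {size_kib} KiB")
print(f"sector aligned: {sector_aligned}")
```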

I also tried xfs_repair v3.1.8 on the volume itself - again it reported no
inconsistencies. I guess I still have a problem with the underlying LVM,
or it's an error that xfs_repair currently does not check for.

Could someone help me?
1) Maybe I need to do more on the /dev/xvdg volume in order to fix that?
2) I can give access to that machine for a developer if that helps.
3) I put the metadump on http://sonbae.zmi.at/xfs.metadump.bz2

Here are some LVM stats if that helps:

# pvdisplay
  --- Physical volume ---
  PV Name               /dev/xvdg
  VG Name               sharestore
  PV Size               1.95 TiB / not usable 64.00 MiB
  Allocatable           yes (but full)
  PE Size               64.00 MiB
  Total PE              32007
  Free PE               0
  Allocated PE          32007
  PV UUID               396XfX-EbMZ-0J6q-C3bj-3n6d-vruJ-6Oiy7w

  --- Physical volume ---
  PV Name               /dev/xvdf
  VG Name               sharestore
  PV Size               1.95 TiB / not usable 64.00 MiB
  Allocatable           yes (but full)
  PE Size               64.00 MiB
  Total PE              32007
  Free PE               0
  Allocated PE          32007
  PV UUID               fEDk3P-JLRj-cKVj-3AsG-hcbP-LQbi-QcdTl5

  --- Physical volume ---
  PV Name               /dev/xvdc
  VG Name               sharestore
  PV Size               1.95 TiB / not usable 64.00 MiB
  Allocatable           yes (but full)
  PE Size               64.00 MiB
  Total PE              32007
  Free PE               0
  Allocated PE          32007
  PV UUID               yQ2GQ4-EX38-Dlb4-RPOB-58Nn-0mg4-CN4w9a

  --- Physical volume ---
  PV Name               /dev/xvde
  VG Name               sharestore
  PV Size               1.95 TiB / not usable 64.00 MiB
  Allocatable           yes (but full)
  PE Size               64.00 MiB
  Total PE              32007
  Free PE               0
  Allocated PE          32007
  PV UUID               7exsxJ-F4Cm-eEiv-I1sS-qmIY-7Oxj-caAss5

# vgdisplay
  --- Volume group ---
  VG Name               sharestore
  System ID
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  10
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               7.81 TiB
  PE Size               64.00 MiB
  Total PE              128028
  Alloc PE / Size       128028 / 7.81 TiB
  Free  PE / Size       0 / 0
  VG UUID               Ieov6b-2qof-KjzF-ypN1-QwZK-YE3C-DcVPeP
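
The extent counts above can be cross-checked with a little arithmetic. The following is a minimal sketch (Python, not part of the original session) using only the figures from pvdisplay and vgdisplay: four PVs of 32007 extents each, 64 MiB per extent. The variable names are made up for illustration.

```python
MIB = 2**20
TIB = 2**40

pe_size = 64 * MIB      # PE Size
pe_per_pv = 32007       # Total PE per physical volume
pv_count = 4            # Cur PV
vg_total_pe = 128028    # Total PE of the volume group

# The per-PV extents should add up to the VG total
pe_sum = pe_per_pv * pv_count

pv_tib = pe_per_pv * pe_size / TIB
vg_tib = vg_total_pe * pe_size / TIB

print(f"sum of PV extents: {pe_sum} (VG reports {vg_total_pe})")
print(f"usable size per PV: {pv_tib:.2f} TiB")
print(f"VG size: {vg_tib:.2f} TiB")
```

If the sums agree, the restored PV at least has the same geometry as the other three; it says nothing about whether the data inside the extents is intact.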

# lvdisplay
  /dev/mapper/sharestore-public: open failed: No such file or directory
  --- Logical volume ---
  LV Name                /dev/sharestore/public
  VG Name                sharestore
  LV UUID                VsZljE-lUU2-oqvm-u4y6-xYz7-cpNA-8zBsRH
  LV Write Access        read/write
  LV Status              NOT available
  LV Size                7.81 TiB
  Current LE             128028
  Segments               4
  Allocation             inherit
  Read ahead sectors     1536

The volume is accessible only via /dev/dm-0; could this be my
problem?

--
Best regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531
