
RE: Unable to mount and repair filesystems

To: Eric Sandeen <sandeen@xxxxxxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
Subject: RE: Unable to mount and repair filesystems
From: Gerard Beekmans <GBeekmans@xxxxxxxx>
Date: Thu, 29 Jan 2015 21:27:32 +0000
Accept-language: en-CA, en-US
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <54CA9586.1010607@xxxxxxxxxxx>
References: <D90435AEFF34654AA1122988C66C8678023F0277C9@xxxxxxxxxxxxxxxxxxx> <54CA9586.1010607@xxxxxxxxxxx>
Thread-index: AdA76PstgYvQ3IMGTOi2marBUntUTgAUmbkAAA2PIoA=
Thread-topic: Unable to mount and repair filesystems
> -----Original Message-----
> Are you certain that the volume / storage behind dm-9 is in decent shape?
> (i.e. is it really even an xfs filesystem?)

Whether it is in decent shape is probably the million dollar question.

What I do know is this:

* It's all LVM based
* The first problem partition is /dev/data/srv which in turn is a symlink to 
/dev/dm-9
* The second problem partition is /dev/os/opt which in turn is a symlink to 
/dev/dm-7

Both were originally formatted as XFS and /etc/fstab lists them as such. Now I can't be 
sure if the symlinks always pointed to dm-7 and dm-9.

Comparing the block device major and minor numbers that "lvdisplay" reports against 
the dm-* symlinks, they all match up. So by all accounts the mapping ought to be 
correct.
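
For reference, this is roughly how that comparison looks as commands (just a sketch; 
blkid is an extra sanity check on top of what I described, not something I have quoted 
output from):

# ls -l /dev/data/srv /dev/os/opt
(shows which dm-* node each symlink currently resolves to)
# lvdisplay /dev/data/srv /dev/os/opt | grep 'Block device'
(LVM's idea of the major:minor numbers)
# dmsetup info -c
(device-mapper's view: name plus major:minor for every dm-* device)
# blkid /dev/dm-7 /dev/dm-9
(should still identify both as TYPE="xfs")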

Running xfs_db on those two partitions shows what I understand to be the "right 
stuff" aside from an error when it first runs:

# xfs_db /dev/os/opt
Metadata corruption detected at block 0x4e2001/0x200
xfs_db: cannot init perag data (117). Continuing anyway.
xfs_db> sb 0
xfs_db> p
magicnum = 0x58465342
blocksize = 4096
dblocks = 3133440
rblocks = 0
rextents = 0
uuid = b4ab7d1d-d383-4c49-af2c-be120ff967a7
logstart = 262148
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 128000
agcount = 25
rbmblocks = 0
logblocks = 2560
versionnum = 0xb4b4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "opt\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 17
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 576
ifree = 135
fdblocks = 3079156
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0x8a
bad_features2 = 0x8a
features_compat = 0
features_ro_compat = 0
features_incompat = 0
features_log_incompat = 0
crc = 0 (correct)
pquotino = 0
lsn = 0
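
If I am decoding that initial error correctly (back-of-the-envelope, so take it with a 
grain of salt): 0x4e2001 looks like a 512-byte sector address and 0x200 a 512-byte 
length. With a 4096-byte block size, each AG of 128000 blocks spans 1,024,000 sectors, 
and 0x4e2001 = 5,120,001 = 5 * 1,024,000 + 1, i.e. sector 1 of AG 5, which is where 
that AG's AGF header lives. A bad AGF would also explain the "cannot init perag data" 
message.

If it helps, something along these lines ought to show more detail without touching the 
disk (AG 5 is only my guess from the arithmetic above):

# xfs_db -r /dev/os/opt
xfs_db> agf 5
xfs_db> p
xfs_db> sb 5
xfs_db> p uuid magicnum

followed by an "xfs_repair -n /dev/os/opt" to see what it would want to change without 
actually writing anything.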

> A VM crashing definitely should not result in a badly corrupt/unmountable
> filesystem.
> 
> Is there any other interesting part of the story? :)

The full setup is as follows:

The VM in question is a VMware guest running on a VMware cluster. The actual files 
that make up the VM are stored on a SAN that VMware accesses via NFS.

The outage occurred at the SAN level, making the NFS storage unavailable, which in 
turn powered off all the VMs running on it (powered off in the virtual sense).

The ~50 VMs were then brought back online and none had any serious issues. Most needed 
some form of fsck to bring things back to consistency. This is the only VM that 
suffered the way it did. The other VMs are a mix of Linux, BSD, OpenSolaris and 
Windows with all their varieties of filesystems (ext3, ext4, xfs, ntfs and so on).

It is possible that the VMware VMDK file belonging to this VM is the issue, but it 
does not appear to be corrupt from a VMDK standpoint; just the data inside of it.

Gerard
