Comment # 2
on bug 1157
from Dave Chinner
Not an XFS problem. This will be a problem with the block device and/or LUN
being presented to XFS overflowing a 32 bit unsigned sector count or some
similar sort of problem. We've seen lots of problems identical to this in the
past at 2TB and 4TB boundaries because the block device overflows the address
and wraps offsets above 4TB back to offset zero, overwriting the primary
superblock and all the allocation group headers, btree root blocks or inodes
(e.g. the root inode) that are in the first few sectors of the device.
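
To make the wrap-around concrete, here is a minimal Python sketch of the arithmetic. It is only a model of the failure mode described above: the helper names (`wrapped_offset`, `wrap_boundary`) are made up for illustration, and which addressing unit the faulty driver or array actually truncates is an assumption, not something known from this report.

```python
TIB = 1 << 40  # one tebibyte in bytes


def wrap_boundary(unit_size=512, counter_bits=32):
    """First byte offset that no longer fits in the truncated unit counter."""
    return (1 << counter_bits) * unit_size


def wrapped_offset(byte_offset, unit_size=512, counter_bits=32):
    """Byte offset really addressed when the unit number loses its high bits.

    unit_size and counter_bits are assumptions about the broken layer.
    """
    unit = byte_offset // unit_size
    within_unit = byte_offset % unit_size
    truncated_unit = unit & ((1 << counter_bits) - 1)
    return truncated_unit * unit_size + within_unit


if __name__ == "__main__":
    for unit_size in (512, 1024):
        boundary = wrap_boundary(unit_size)
        landed = wrapped_offset(boundary + 4096, unit_size)
        print(f"{unit_size}-byte units: counter wraps at {boundary // TIB} TiB, "
              f"a write at boundary+4096 lands at byte {landed}")
```

With a 32-bit count of 512-byte sectors the wrap happens at 2 TiB; a 4 TiB boundary would correspond to a 32-bit count of 1024-byte units or an equivalent truncation. Either way, I/O just past the boundary ends up in the first few sectors of the device.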
You need to talk to the vendor of your storage hardware to get them to fix
whatever is broken in their driver/hardware that is causing this. XFS is not
the problem, it's just the messenger....
-Dave.
Problem: Unable to create an XFS filesystem on big (>= 4TB) volumes. By "volume"
I mean a logical unit (LUN) created on Huawei's OceanStor data store product.
Environment:
rpm --query centos-release:
centos-release-7-2.1511.el7.centos.2.10.x86_64
uname -a:
3.10.0-327.22.2.el7.x86_64 #1 SMP Thu Jun 23 17:05:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
xfsprogs-3.2.2 (default for centos7)
========
command: mkfs.xfs /dev/sdh
output:
meta-data="" isize=256 agcount=18, agsize=268435455 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0 finobt=0
data = "" blocks=4830380032, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
bad magic number
bad magic number
Metadata corruption detected at block 0x0/0x200
libxfs_writebufr: write verifer failed on bno 0x0/0x200
Metadata corruption always appears at the same block. The above command
(mkfs.xfs) ends with rc=0, and the syslog does not report any problems.
However, when I try to mount the filesystem, the problem becomes obvious:
mount output:
mount: wrong fs type, bad option, bad superblock on /dev/sdp,
missing codepage or helper program, or other error
dmesg:
kernel: XFS (sdp): Invalid superblock magic number
I managed to find the exact size at which the XFS filesystem stops working:
4TB. Every volume >= 4TB (4096 GB) fails. A volume of 4095 GB works
without any problems.
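
As a rough sanity check of the geometry, the following Python sketch uses only the numbers printed by mkfs.xfs above (bsize, agcount, agsize, data blocks). The 4 TiB wrap point (`ASSUMED_WRAP`) is an assumption taken from the observed failure threshold, and the helper names are hypothetical.

```python
TIB = 1 << 40

BLOCK_SIZE = 4096            # bsize from the mkfs.xfs output
AG_COUNT = 18                # agcount
AG_SIZE_BLOCKS = 268435455   # agsize, in filesystem blocks
DATA_BLOCKS = 4830380032     # data blocks
SECTOR_SIZE = 512            # sectsz

ASSUMED_WRAP = 4 * TIB       # assumption: addressing breaks at 4 TiB


def ag_start_bytes(agno):
    """Byte offset on the device where allocation group agno begins."""
    return agno * AG_SIZE_BLOCKS * BLOCK_SIZE


def ags_starting_beyond(limit):
    """Allocation groups whose first byte lies at or beyond limit."""
    return [agno for agno in range(AG_COUNT) if ag_start_bytes(agno) >= limit]


if __name__ == "__main__":
    device_bytes = DATA_BLOCKS * BLOCK_SIZE
    print(f"device size: {device_bytes / TIB:.2f} TiB")
    print(f"512-byte sectors: {device_bytes // SECTOR_SIZE} (2**32 = {2**32})")
    print("AGs starting at or beyond the assumed wrap point:",
          ags_starting_beyond(ASSUMED_WRAP))
```

Each allocation group is just under 1 TiB, so on this volume most groups lie beyond 4 TiB, and the sector count does not fit in 32 bits. That says nothing by itself about where the fault is; it only shows how much of the filesystem sits in the address range that fails.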
I decided to try different XFS versions (different xfsprogs packages, to be
more specific). The kernel version was unchanged (3.10.0-327.22.2.el7.x86_64).
Here are the results:
#############################################################
# xfsprogs-3.1.1 (the one which is installed with centos6): #
#############################################################
creating XFS: OK
mount: FAILED, system logs:
kernel: XFS (sdx): Invalid superblock magic number
#######################################
# xfsprogs-4.5.0 (compiled manually): #
#######################################
creating XFS: OK
mount: FAILED, system logs:
kernel: XFS (sdaf): Mounting V5 Filesystem
kernel: XFS (sdaf): Log inconsistent or not a log (last==0, first!=1)
kernel: XFS (sdaf): empty log check failed
kernel: XFS (sdaf): log mount/recovery failed: error -22
kernel: XFS (sdaf): log mount failed
Another scenario: create a volume group with one volume, then create one big
logical volume in this group. Everything works fine on volumes smaller than
4TB. When the volume is bigger:
#########################
# lvm + xfsprogs-3.1.1: #
# lvm + xfsprogs-3.2.2: #
#########################
creating XFS: OK
mount: OK
write test (fio): WARN, system logs:
kernel: XFS (dm-2): Metadata corruption detected at
xfs_agi_read_verify+0x5e/0x110 [xfs], block 0x2
kernel: XFS (dm-2): Unmount and run xfs_repair
kernel: XFS (dm-2): First 64 bytes of corrupted metadata buffer:
kernel: ffff880feafe1600: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00
................
kernel: ffff880feafe1610: 00 00 00 00 00 00 0f e2 00 00 00 01 00 00 00 00
................
kernel: ffff880feafe1620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: ffff880feafe1630: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: XFS (dm-2): metadata I/O error: block 0x2 ("xfs_trans_read_buf_map")
error 117 numblks 1
remount (umount + mount): FAILED, system logs:
kernel: XFS (dm-2): Mounting V4 Filesystem
kernel: XFS (dm-2): Metadata corruption detected at
xfs_inode_buf_verify+0x75/0xd0 [xfs],
# multiple occurrences of these lines:
###
block 0x40
kernel: XFS (dm-2): Unmount and run xfs_repair
kernel: XFS (dm-2): First 64 bytes of corrupted metadata buffer:
kernel: ffff8807641c8000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00
................
kernel: ffff8807641c8010: 00 00 00 00 00 00 10 20 00 00 00 01 00 00 00 25
....... .......%
kernel: ffff8807641c8020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: ffff8807641c8030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: XFS (dm-2): Metadata corruption detected at
xfs_inode_buf_verify+0x75/0xd0 [xfs],
###
# And this block at the end:
block 0x40
kernel: XFS (dm-2): Unmount and run xfs_repair
kernel: XFS (dm-2): First 64 bytes of corrupted metadata buffer:
kernel: ffff8807641c8000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00
................
kernel: ffff8807641c8010: 00 00 00 00 00 00 10 20 00 00 00 01 00 00 00 25
....... .......%
kernel: ffff8807641c8020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: ffff8807641c8030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: XFS (dm-2): metadata I/O error: block 0x40 ("xfs_trans_read_buf_map")
error 117 numblks 16
kernel: XFS (dm-2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
kernel: XFS (dm-2): failed to read root inode
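
For reference, the numeric error codes that show up in the kernel messages of this report (-22, 74, 117) can be decoded with a small lookup like the Python sketch below. The table is hard-coded with the Linux errno values rather than taken from Python's `errno` module, so it gives the same answer on any platform. The `describe` helper is a hypothetical name; the XFS aliases are the ones the kernel sources define (EFSCORRUPTED is EUCLEAN, EFSBADCRC is EBADMSG).

```python
# Linux errno values that appear in the kernel messages of this report.
LINUX_ERRNO = {
    22: ("EINVAL", "invalid argument"),
    74: ("EBADMSG", "XFS alias EFSBADCRC: metadata CRC mismatch"),
    117: ("EUCLEAN", "XFS alias EFSCORRUPTED: metadata failed verification"),
}


def describe(code):
    """Return 'NAME (meaning)' for an error code; the sign is ignored."""
    name, meaning = LINUX_ERRNO.get(abs(code), ("unknown", "not in this table"))
    return f"{name} ({meaning})"


if __name__ == "__main__":
    for code in (-22, 74, 117, -117):
        print(code, "->", describe(code))
```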
#########################
# lvm + xfsprogs 4.5.0: #
#########################
creating XFS: OK
mount: OK
write test (fio): WARN, system logs:
kernel: XFS (dm-2): Metadata CRC error detected at
xfs_agi_read_verify+0x5e/0x110 [xfs], block 0x2
kernel: XFS (dm-2): Unmount and run xfs_repair
kernel: XFS (dm-2): First 64 bytes of corrupted metadata buffer:
kernel: ffff8807d4d55c00: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00
................
kernel: ffff8807d4d55c10: 00 00 00 00 00 00 0f da 00 00 00 01 00 00 00 00
................
kernel: ffff8807d4d55c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: ffff8807d4d55c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: XFS (dm-2): metadata I/O error: block 0x2 ("xfs_trans_read_buf_map")
error 74 numblks 1
remount (umount + mount): FAILED, system logs:
kernel: XFS (dm-2): Mounting V5 Filesystem
kernel: XFS (dm-2): Metadata corruption detected at
xfs_inode_buf_verify+0x75/0xd0 [xfs],
# multiple occurrences of these lines:
###
block 0x60
kernel: XFS (dm-2): Unmount and run xfs_repair
kernel: XFS (dm-2): First 64 bytes of corrupted metadata buffer:
kernel: ffff8807de11c000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00
................
kernel: ffff8807de11c010: 00 00 00 00 00 00 10 38 00 00 00 01 00 00 00 a6
.......8........
kernel: ffff8807de11c020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: ffff8807de11c030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: XFS (dm-2): Metadata corruption detected at
xfs_inode_buf_verify+0x75/0xd0 [xfs],
###
# And this block at the end:
block 0x60
kernel: XFS (dm-2): Unmount and run xfs_repair
kernel: XFS (dm-2): First 64 bytes of corrupted metadata buffer:
kernel: ffff8807de11c000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00
................
kernel: ffff8807de11c010: 00 00 00 00 00 00 10 38 00 00 00 01 00 00 00 a6
.......8........
kernel: ffff8807de11c020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: ffff8807de11c030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
kernel: XFS (dm-2): metadata I/O error: block 0x60 ("xfs_trans_read_buf_map")
error 117 numblks 32
kernel: XFS (dm-2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
kernel: XFS (dm-2): failed to read root inode
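
One detail in the hex dumps may be worth pointing out: every corrupted buffer starts with fe ed ba be, which is the XFS log record header magic (XLOG_HEADER_MAGIC_NUM, 0xFEEDbabe), not the magic expected for an AGI block. The Python sketch below classifies the first four bytes of a dump line. The `classify` helper and the parsing of the "kernel: address: bytes" format are illustrative assumptions, and finding log data at those blocks is consistent with misdirected writes but does not prove where they came from.

```python
# On-disk magic numbers (big-endian) for a few XFS metadata types.
KNOWN_MAGICS = {
    b"XFSB": "superblock",
    b"XAGF": "AGF (allocation group free space header)",
    b"XAGI": "AGI (allocation group inode header)",
    bytes.fromhex("feedbabe"): "log record header (XLOG_HEADER_MAGIC_NUM)",
}


def classify(dump_line):
    """Name the metadata type suggested by the first 4 bytes of a dump line.

    Expects the kernel format 'kernel: <address>: <hex bytes>'.
    """
    hex_part = dump_line.rsplit(":", 1)[1]
    data = bytes.fromhex("".join(hex_part.split()))
    return KNOWN_MAGICS.get(data[:4], "unknown magic " + data[:4].hex())


if __name__ == "__main__":
    line = "kernel: ffff880feafe1600: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00"
    print(classify(line))
```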
Please let me know if you need anything else.
Thank you,
Bartłomiej Daca