Hi,
I'm seeing problems with XFS 1.3.1 when using the xfs_db tool. I've tried
several different combinations of kernels and hardware including:
XFS Patched RedHat Kernel RPM (available from SGI directly)
2.4.22 XFS Patched Kernel
HP 5i SCSI controllers
HP 6400 SCSI controllers
Dell Laptop (IDE)
The problem occurs with every combination of the above. It is very
easy to reproduce; the steps I use are (a scripted sketch follows the command list):
1) induce moderate to heavy I/O on the XFS partition
2) use the xfs_db tool with a command like any of the following:
xfs_db -ir /dev/hdc6 -c frag
xfs_db -r /dev/hdc6 -c freesp
xfs_db -ir /dev/cciss/c0d0p2
xfs_db> frag
xfs_db> freesp
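For concreteness, the whole exercise looks roughly like this; it is only a
sketch, and the mount point (/mnt/xfs) and device (/dev/hdc6) are placeholders
for whatever you have on hand:

# writer: keep moderate-to-heavy I/O going on the XFS partition
( while true; do
      dd if=/dev/zero of=/mnt/xfs/stress.tmp bs=1M count=256 2>/dev/null
      rm -f /mnt/xfs/stress.tmp
  done ) &
WRITER=$!

# reader: query the live filesystem with xfs_db a few times
for i in 1 2 3 4 5; do
    xfs_db -r /dev/hdc6 -c frag
    xfs_db -r /dev/hdc6 -c freesp
done

kill $WRITER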
I find that after running the xfs_db commands a few times there is a high
probability of filesystem errors and a subsequent forced shutdown. In
particular, interactive usage of xfs_db (the third example above) seems
especially prone to triggering the problem. Errors like the following show up
in the system logs:
kernel: xfs_force_shutdown(ide1(22,6),0x8) called from line 1070 of file
xfs_trans.c. Return address = 0xe0e6db2b
kernel: Filesystem "ide1(22,6)": Corruption of in-memory data detected.
Shutting down filesystem: ide1(22,6)
kernel: Please umount the filesystem, and rectify the problem(s)
My XFS partitions are otherwise generally very stable; however, we hit these
disks pretty hard and run xfs_fsr from a daily cron job to stay ahead of
fragmentation. I am using xfs_db to monitor fragmentation on the filesystems
in question, and this is obviously shaking my confidence in proceeding with XFS
on our production filesystems. Does anyone have suggestions on steps to take to
correct these problems and improve stability?
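For reference, the daily cron entries look roughly like the following; this is
a sketch, and the schedule, device, and log path are examples rather than our
exact configuration:

# /etc/crontab (sketch; times and paths are examples)
# run the online defragmenter once a day, capped at two hours
30 3 * * * root /usr/sbin/xfs_fsr -t 7200
# record the fragmentation factor so we can watch the trend over time
0 6 * * * root /usr/sbin/xfs_db -r /dev/hdc6 -c frag >> /var/log/xfs-frag.log 2>&1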
Thanks!
-Andy Smith
aps@xxxxxxx