
Re: another XFS+LVM+SoftwareRAID5 query

To: Jan-Frode Myklebust <janfrode@xxxxxxxxxxxxxxx>
Subject: Re: another XFS+LVM+SoftwareRAID5 query
From: Charles Steinkuehler <charles@xxxxxxxxxxxxxxxx>
Date: Wed, 21 Jul 2004 17:31:21 -0500
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <20040721212921.GB28273@ii.uib.no>
References: <20040721212921.GB28273@ii.uib.no>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) Gecko/20040113
Jan-Frode Myklebust wrote:

<snip>

> So I am a bit concerned I might get the same problem as Charles
> Steinkuehler if I ever need to run xfs_repair. Charles, did you find a
> solution for your problem?

I found a solution...I switched to JFS. ;-P

While my problems with XFS may have been greatly magnified by low-level hardware problems (fixed by switching to a Promise SATA controller), there seem to be enough folks reporting problems with XFS + LVM + software RAID5 that I would hesitate to use this combination in production without a *LOT* of testing I didn't have time for. I don't currently need either the extended attribute support or the larger filesystem sizes available in XFS, so JFS is working fine for me.

NOTE: I ran a quick check on my system (Debian testing, 2.6.6-1-k7 kernel, Promise TX4 with 4 now-working SATA drives), and the first of my stress tests passes (lvcreate, mkfs.xfs, mount, bonnie++, umount, xfs_repair). I don't have enough room for the second stress test I was using (rsync approximately 150G of data from an ext3 partition, umount, and xfs_repair). I suspect that if you can extract or compile the kernel tarball, umount, and get a clean xfs_repair, things are probably working normally.
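
For what it's worth, that first stress test is roughly the sequence below (the volume group, LV name, and mount point are just placeholders, not my actual layout):

   lvcreate -L 50G -n xfstest vg0       # scratch LV on the RAID5-backed VG
   mkfs.xfs /dev/vg0/xfstest            # plus whatever size= options you are testing
   mount /dev/vg0/xfstest /mnt/xfstest
   bonnie++ -d /mnt/xfstest -u nobody   # hammer the filesystem for a while
   umount /mnt/xfstest
   xfs_repair /dev/vg0/xfstest          # should come back clean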

I'd probably also try a hard power-off while extracting the kernel tarball (or compiling, or otherwise pounding on the FS), followed by an xfs_repair, to make sure you can return to normal from a real-world error condition (probably a good idea to mount everything but the volume under test read-only beforehand!). I was seeing the "switching cache buffer size" messages when running xfs_repair (and IIRC when mounting or unmounting), not in normal operation (once I formatted with size=4096), so I'd definitely try xfs_repair on a 'broken' FS before trusting it with production data.
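
Something along these lines is what I have in mind for the crash test (again, all names and the tarball path are placeholders, and obviously only do this on a scratch volume you can afford to lose):

   mount -o remount,ro /home            # protect everything except the volume under test
   mount /dev/vg0/xfstest /mnt/xfstest
   cd /mnt/xfstest
   tar xjf /usr/src/linux-2.6.6.tar.bz2 # lots of small-file traffic
   # ...pull the plug part-way through the extract, then reboot...
   mount /dev/vg0/xfstest /mnt/xfstest  # mounting replays the journal
   umount /mnt/xfstest
   xfs_repair /dev/vg0/xfstest          # this is the step that bit me before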

Finally, while unrelated to LVM/RAID5: based on information I've absorbed in my extensive googling, XFS seems to have a tendency to keep data in a write cache, which, combined with its journal-replay behavior on startup (i.e. zeroing files that weren't cleanly written out), can cause problems if your system is ever subject to unclean shutdowns. Hopefully someone on-list with more knowledge of XFS internals can comment on how accurate this is.
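
If anyone wants to see for themselves whether the zeroed-file behavior is real (I haven't verified it myself, this is strictly from googling), something like the following on a scratch FS should show it:

   echo "important data" > /mnt/xfstest/zerotest
   # don't sync -- pull the plug within the next few seconds
   # after reboot and journal replay, check the file contents:
   od -c /mnt/xfstest/zerotest          # if the reports are right, NULs instead of data
   # an explicit sync (or fsync from the application) before the crash should avoid this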

--
Charles Steinkuehler
charles@xxxxxxxxxxxxxxxx

