Re: Problems using xfs on RAID 5 volumes

To: "Horchler, Joerg" <joerg.horchler@xxxxxxxxxxxxx>
Subject: Re: Problems using xfs on RAID 5 volumes
From: David Chinner <dgc@xxxxxxx>
Date: Tue, 10 Jan 2006 07:57:14 +1100
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <37191A9D3396224A92F6F9306DFB119D0142A7B0@MARS.coremedia.com>
References: <37191A9D3396224A92F6F9306DFB119D0142A7B0@MARS.coremedia.com>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.2.1i
On Mon, Jan 09, 2006 at 11:34:56AM +0100, Horchler, Joerg wrote:
> Hi, 
> 
> we have a big problem using XFS on our fileserver. Our configuration is:
> 
> We are using a Dell PowerVault as external RAID Array which is configured
> with two logical volumes. Each logical volume is configured with 7 physical
> disks. Six disks are configured to form a RAID 5 and the last is configured
> as hot spare. Our server is a 'SuSE Linux Enterprise Server 9' running with
> kernel  2.6.5-7.151-smp. xfsprogs of version 2.6.25-0.2 are installed. I
> don't know which version of XFS is installed with the running kernel. 
> 
> Now our problem:
> 
> Every time a physical disk fails (and the RAID swaps from state OPTIMAL to
> DEGRADED) the RAID rebuilds onto the hot spare. During this rebuild we get a
> lot of XFS errors in our dmesg:
> 
> 0x0: 66 4e 1f 21 5d 98 0e d9 23 70 65 00 1f 02 00 7d
> Filesystem "dm-4": XFS internal error xfs_da_do_buf(2) at line 2273 of file
> fs/xfs/xfs_da_btree.c.  Caller 0xf918f522

That indicates that XFS has received corrupt data from the disk when
reading a directory entry. Given that you've had a disk failure and
the volume is rebuilding, I'd suspect a RAID problem....

> nfsd: non-standard errno: -990

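That's EFSCORRUPTED. XFS detected corruption and returned its own
error code, which nfsd has no standard errno to map to, hence the
message.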
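
For anyone decoding such a message from a log, here is a minimal sketch of
the lookup. The 990 to EFSCORRUPTED pairing is taken from the exchange above;
the helper name describe_errno and the table XFS_PRIVATE_ERRNOS are
hypothetical, made up for illustration, and are not part of any XFS or nfsd
tool.

```python
import errno

# Assumption: the only private code listed is the one seen in this thread
# (nfsd printed -990, identified above as EFSCORRUPTED). It is not part of
# the standard errno table, which is why nfsd calls it "non-standard".
XFS_PRIVATE_ERRNOS = {990: "EFSCORRUPTED"}


def describe_errno(value):
    """Return a symbolic name for an errno, accepting either sign."""
    code = abs(value)  # kernel messages print errnos as negative numbers
    name = errno.errorcode.get(code) or XFS_PRIVATE_ERRNOS.get(code)
    return name if name else "unknown errno %d" % code


if __name__ == "__main__":
    print(describe_errno(-990))
```

Standard codes are resolved through Python's own errno table first, so the
private table only needs the values the platform does not know about.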

> The more curious problem is that during such a rebuild we lose some files
> on the filesystem. The worst case was that XFS stops the filesystem which
> produces I/O errors. Then we have to remount and repair the filesystem which
> produces several GB of data lost. 

This sounds like your RAID controller is not rebuilding the
underlying volume correctly or has some problem with writing new
data to the volume while a reconstruction is in progress. This does
not sound like an XFS problem at all.
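
One way to test that suspicion is to write data with known checksums before
a rebuild and verify it afterwards. The following is only a sketch under
stated assumptions: the function names, file names and manifest layout are
hypothetical, and it exercises the whole stack (filesystem plus RAID), so a
mismatch shows that data changed underneath you but does not by itself say
which layer did it. Unmount and remount the filesystem (or otherwise drop
the page cache) before verifying, otherwise the reads may be served from
memory rather than from the array.

```python
import hashlib
import json
import os

MANIFEST = "manifest.json"  # hypothetical name for the checksum list


def write_test_files(directory, count=16, size=1 << 20):
    """Write `count` files of random data and record their SHA-256 sums."""
    os.makedirs(directory, exist_ok=True)
    manifest = {}
    for i in range(count):
        name = "testfile_%04d.bin" % i
        data = os.urandom(size)
        # Hash the in-memory buffer, not what is read back from disk.
        manifest[name] = hashlib.sha256(data).hexdigest()
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data out to the device
    with open(os.path.join(directory, MANIFEST), "w") as f:
        json.dump(manifest, f)
        f.flush()
        os.fsync(f.fileno())
    return manifest


def verify_test_files(directory):
    """Return the sorted names of files that are missing or have changed."""
    with open(os.path.join(directory, MANIFEST)) as f:
        manifest = json.load(f)
    bad = []
    for name, expected in manifest.items():
        path = os.path.join(directory, name)
        try:
            with open(path, "rb") as f:
                actual = hashlib.sha256(f.read()).hexdigest()
        except OSError:
            bad.append(name)  # missing or unreadable
            continue
        if actual != expected:
            bad.append(name)  # contents differ from what was written
    return sorted(bad)
```

Run write_test_files() on the volume while the array is OPTIMAL, trigger
the rebuild, then run verify_test_files() on the same directory. An empty
list means every file read back exactly as written.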

Cheers,

Dave.
-- 
Dave Chinner
R&D Software Engineer
SGI Australian Software Group
