Re: 2.4.18-xfs 1TB stability problems.

To: Clem Taylor <ctaylor@xxxxxxxxxxxxxxx>
Subject: Re: 2.4.18-xfs 1TB stability problems.
From: Paul Schutte <paul@xxxxxxxxxxx>
Date: Fri, 12 Apr 2002 02:49:36 +0200
Cc: linux-xfs@xxxxxxxxxxx
References: <3CB626D1.398F9598@chipwrights.com>
Sender: owner-linux-xfs@xxxxxxxxxxx
There are a few known issues with the Linux kernel itself (nothing to do with XFS).
See this URL for what Andrea Arcangeli has to say about them:

http://lwn.net/2002/0411/a/vm-33-reason.php3

Get a stock 2.4.18 from a kernel mirror (I use ftp.is.co.za):

ftp://ftp.is.co.za/linux/kernel/pub/linux/kernel/v2.4/linux-2.4.18.tar.gz

Apply this patch first:

ftp://ftp.is.co.za/linux/kernel/pub/linux/kernel/v2.4/testing/patch-2.4.19-pre6.bz2

and then this one on top of it:

ftp://ftp.is.co.za/linux/kernel/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre6aa1.bz2
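
The order matters: the aa1 patch goes on top of 2.4.19-pre6, which in turn goes on top of stock 2.4.18. As a rough sketch only, the small Python helper below prints the unpack and patch commands in that order for the three files above, once they have been downloaded into the current directory. The helper names, the source directory name "linux", and the use of "patch -p1" are my assumptions, not something taken from the patches themselves; check where your tarball actually unpacks before running anything.

```python
import shlex

# URLs as given in this message.
BASE = "ftp://ftp.is.co.za/linux/kernel/pub/linux/kernel"
TARBALL = f"{BASE}/v2.4/linux-2.4.18.tar.gz"
PATCHES = [
    f"{BASE}/v2.4/testing/patch-2.4.19-pre6.bz2",            # first
    f"{BASE}/people/andrea/kernels/v2.4/2.4.19pre6aa1.bz2",  # second, on top
]


def basename(url):
    """Return the file name part of a URL (hypothetical helper)."""
    return url.rsplit("/", 1)[-1]


def build_steps(tarball_url, patch_urls, src_dir="linux"):
    """Return shell command strings: unpack, then apply patches in order.

    src_dir is an assumption; adjust it to wherever the tarball unpacks.
    """
    steps = [f"tar xzf {shlex.quote(basename(tarball_url))}"]
    for url in patch_urls:
        name = shlex.quote(basename(url))
        # -d makes patch change into the source tree before applying.
        steps.append(f"bzcat {name} | patch -p1 -d {shlex.quote(src_dir)}")
    return steps


if __name__ == "__main__":
    for step in build_steps(TARBALL, PATCHES):
        print(step)
```

The script only prints the commands and does not download or modify anything, so the list can be reviewed before being run by hand.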

The Andrea Arcangeli kernels have XFS and his latest VM tweaks.
I do not know what version of XFS he uses.
It is worth a try.

Paul


Clem Taylor wrote:

> I'm using 2.4.18-xfs on a dual 1.2GHz Athlon with 1gig of RAM. The machine
> has a 3ware 7850 with 8 160G drives (RAID5) with an XFS file system
> (1.0TB). It also has a SCSI software RAID root volume and an IDE scratch
> disk, both with ext3. The XFS filesystem is ~40% full with an average file
> size of ~4-5G.
>
> The 1TB array is a recent addition to a previously stable system. The RAID
> volume seemed fine in my initial burn in and stress testing, but now that I
> have live data on the array I've been having stability problems. In the
> last <2 months I've had 3 crashes...
>
> The first crash was a strange OOM type problem after the box had been up
> for a few weeks. It started with a series of 'eth0: memory shortage' on a
> box that was mostly doing NFS, followed by several 'Unable to handle kernel
> NULL pointer dereference at virtual address 00000030' in kswapd, then
> klogd. Shortly after that it oopsed and wedged.
>
> In the last two weeks I've had it die twice, both times within a minute of
> starting to mv ~800M from an SGI O2 to the linux box via NFS3. The O2
> reported NFS timeout errors, the linux box would respond to pings and some
> things that didn't depend on diskio continued to work. dmesg and df would
> hang and couldn't be interrupted. In both cases I was forced to reset.
>
> I'm not sure if the problems I'm seeing have anything to do with XFS, but
> my 2 most recent crashes occurred shortly after starting to move data to
> the XFS volume on an otherwise idle system.
>
> Any ideas / debugging tips?
>
>                   Many thanks,
>                   Clem

