To: linux-xfs@xxxxxxxxxxx
Subject: weird XFS+RAID+big IDE problem?
From: Derek Glidden <dglidden@xxxxxxxxxxxxxxx>
Date: 23 Oct 2003 13:07:27 -0400
Sender: linux-xfs-bounce@xxxxxxxxxxx

I'm building a new server at home with 4x 160G Samsung IDE hard drives
and Linux software RAID5.  (I'd love to go SCSI with some nice hardware
RAID, but... well, someone send me some money.  I barely got away with
getting the drives.  You know how it is.  So don't go there.  :)

/     == /dev/md0
/var  == /dev/md1
/usr  == /dev/md2
/home == /dev/md3
/opt  == /dev/md4

All four drives are partitioned exactly the same, and each MD device is
a RAID5 array.

I'd love to use XFS on it like I do on everything else I build, but I'm
having problems.  Any time there is significant disk I/O on more than
one filesystem at once, e.g. rsync'ing from another box onto /opt while
building a kernel under /usr, I wind up with xfs_shutdown on one or more
filesystems, a kernel panic and really bad filesystem corruption.  (More
than once to the point that "xfs_repair" reported it could not find an
XFS filesystem on the volume.)  I've also had random xfs_shutdowns and
filesystem corruption while the machine is apparently just sitting idle
or with very minimal disk I/O to just one volume, but I can eventually
(pretty quickly more often than not) make it crash every time if I start
lots of I/O to multiple filesystems simultaneously.
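
For anyone who wants to throw a comparable load at their own box, here
is a minimal sketch of "lots of I/O to multiple filesystems
simultaneously".  It is only a stand-in for the rsync + kernel build
combination described above, not the actual workload; the function
names, file counts and sizes are made up for illustration, and the
directories in the usage comment are hypothetical scratch locations.

```python
import os
import tempfile
import threading

CHUNK = b"\0" * (1024 * 1024)  # 1 MiB of data per write call


def write_load(directory, files=8, mib_per_file=16):
    """Write, fsync and delete scratch files under directory.

    Returns the total number of bytes written.
    """
    written = 0
    for _ in range(files):
        fd, path = tempfile.mkstemp(dir=directory, prefix="ioload-")
        with os.fdopen(fd, "wb") as f:
            for _ in range(mib_per_file):
                written += f.write(CHUNK)
            f.flush()
            os.fsync(f.fileno())  # push the data to the device, not just the page cache
        os.remove(path)
    return written


def stress(directories, **kwargs):
    """Run write_load on every directory at the same time, one thread each."""
    results = {}

    def worker(directory):
        results[directory] = write_load(directory, **kwargs)

    threads = [threading.Thread(target=worker, args=(d,)) for d in directories]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical usage, one scratch directory per filesystem under test:
# stress(["/opt/scratch", "/usr/scratch"], files=64, mib_per_file=256)
```

Each directory should sit on a different filesystem, so that the
writers hit several MD devices at once.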

I've tried with all four drives on the motherboard's built-in IDE
controllers.  I've tried two attached to a PCI Promise card so there is
only one drive per IDE channel.  I've even built it with all four drives
on PCI controllers as a single RAID5 volume, plus a fifth boot/system
drive on the built-in IDE, so only /opt is XFS on RAID5 and everything
else is plain XFS partitions on that single drive.

It gets installed as a base Debian Woody system.  I've tried running
with 2.4.21 + XFS 1.3.1, 2.4.22 + the XFS 2.4.22 split patches,
2.4.21aa1 and 2.4.22aa1, and I've tried building the kernel with debug
stuff enabled.  But the panics usually mean that once the box goes down,
I'm totally boned, debug symbols/debugger or not, and have to hit the
big red button.

The same box, without changing any of the hardware configuration and
even using the exact same kernel binary, but using ext3 for all the
filesystems has been rock solid stable for several days now, no matter
how much I throw at it, including all the things that make it crash with
XFS.

The whole system has been burnt in with "ctcs" for five days straight
with absolutely no problems, so I'm hesitant to say it might be a
hardware problem, especially since the same box with ext3 in place of
XFS has been stable.

I've got at least two other servers, at home and at work, built exactly
the same way, solid as the proverbial rock, but with 120G drives or
smaller.  So my only semi-reasonable guess is there is some bizarre
interaction between XFS, software RAID5 and IDE drives larger than 128G
that is making it crashy.
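
To put numbers on that guess: 28-bit LBA addressing with 512-byte
sectors tops out at 2^28 sectors, which is exactly 128 GiB.  The sketch
below assumes the 128G boundary in question is that limit and that drive
sizes are the vendors' decimal gigabytes; the function name is just for
illustration.  It only shows which drives would need 48-bit LBA, and
says nothing about whether that is actually what triggers the crashes.

```python
SECTOR_BYTES = 512        # traditional ATA sector size
LBA28_SECTORS = 2 ** 28   # sectors addressable with a 28-bit LBA

# Largest capacity reachable with 28-bit addressing, in bytes.
limit_bytes = LBA28_SECTORS * SECTOR_BYTES


def exceeds_lba28(drive_gb):
    """True if a drive of drive_gb decimal gigabytes needs 48-bit LBA."""
    return drive_gb * 10**9 > limit_bytes


print(limit_bytes)              # 137438953472 bytes, i.e. 128 GiB
for size in (120, 160):
    print(size, exceeds_lba28(size))
```

By that arithmetic the 120G drives in the stable servers fit under the
limit, while the 160G drives in this box do not.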

Has anyone else tried building a similar machine with any success?  Has
anyone else had problems with XFS and Linux software RAID and big IDE
drives?

I realize this is a really paltry amount of mostly useless random
information, but I've really been flailing since I can't get any idea of
where the actual problem might be, and the crashes appear pretty
randomly, other than that I can make them happen eventually (once the
box lasted three days before seriously eating itself alive) if I start
up lots of disk I/O.

It's not going "into production" to replace my old server until I'm sure
I can keep it rock solid, so I have no problems blowing it away and
installing some completely experimental patch or software mod on it, so
if anyone has any suggestions, or any code changes they'd like to try to
see if they can find the problem, I am completely open to doing whatever
to this thing.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
"We all enter this world in the    | Support Electronic Freedom
same way: naked; screaming; soaked |        http://www.eff.org/
in blood. But if you live your     |  http://www.anti-dmca.org/
life right, that kind of thing     |---------------------------
doesn't have to stop there." -- Dana Gould



