
2.4.18 + XFS 1.1 on multiprocessor: bizarre hard hang on Samba writes.

To: linux-xfs@xxxxxxxxxxx
Subject: 2.4.18 + XFS 1.1 on multiprocessor: bizarre hard hang on Samba writes.
From: Thor Lancelot Simon <tls@xxxxxxxxxxxx>
Date: Sun, 21 Apr 2002 03:27:24 -0400
Sender: owner-linux-xfs@xxxxxxxxxxx
User-agent: Mutt/1.2.5i
I have a rather interesting old box, a 6 x PPro ALR beast with two primary
PCI buses and a truly immense number of disk bays.  At the moment it's got
Linux 2.4.18 with XFS 1.1 on it; the disks are four IDE disks in a software
RAID and five IBM SCSI disks on an Adaptec 3210S I2O RAID controller.  The
only other thing of note in the machine is a Netgear GA-620 (Tigon-II)
ethernet adapter.

I have, unfortunately, discovered that though I can't provoke the problem
any other way, if I copy many large files onto the box using Samba, I get
an almost instant hard hang.  No network I/O, no keyboard input; I can't
even drop to the debugger.

I figured the problem was with the software RAID (even though I'm using an
external log on a partition of the Adaptec's "disk") but copying onto the
Adaptec RAID volume, as it turns out, has the same issue.  So, I assume
the likely culprit is a locking botch somewhere in the acenic or dpti
drivers or in XFS.  I assume everyone in the world would know if acenic
or dpti were broken (they have many more users than XFS, I've got to guess)
so I tend to blame XFS...

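I note that Linux's interrupt-safe spinlocks (spin_lock_irqsave and friends)
use cli/sti to disable interrupts on the local CPU, so a locking botch does
seem like a likely cause of a total, irrecoverable hang.  That leaves me with
little or no idea how to debug this, but I'd be
glad to give it a shot if someone could make suggestions.  I work at a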
glad to give it a shot if someone could make suggestions.  I work at a
router vendor that ships a Linux-based product so I can handle the kernel
debugger fairly well; I just don't know where to start with this kind of
problem, since we ship only uniprocessor machines and locking issues aren't
exactly common. :-)
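
To make the suspected failure mode concrete, here is a minimal userspace
sketch in Python.  It is not kernel code and is not taken from XFS, acenic,
or dpti; it only illustrates one common kind of locking botch, an ABBA lock
order inversion, and whether that is what actually happens on this box is
purely an assumption.  The names abba_demo, cpu0, and cpu1 are made up for
illustration.  Two workers standing in for CPUs take two locks in opposite
order, and a non-blocking acquire is used so the sketch reports the conflict
instead of hanging.

```python
import threading


def abba_demo():
    """Return, for each worker, whether it managed to take its second lock."""
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    barrier = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Wait until both workers hold their first lock.
            barrier.wait()
            # A real spinlock would spin here forever; we only probe once.
            ok = second.acquire(blocking=False)
            got_second[name] = ok
            # Keep the first lock held until both workers have probed.
            barrier.wait()
            if ok:
                second.release()

    # The two workers take the same pair of locks in opposite order.
    t0 = threading.Thread(target=worker, args=("cpu0", lock_a, lock_b))
    t1 = threading.Thread(target=worker, args=("cpu1", lock_b, lock_a))
    t0.start()
    t1.start()
    t0.join()
    t1.join()
    return got_second


if __name__ == "__main__":
    print(abba_demo())
```

Neither worker can get its second lock.  In the kernel, with blocking
spinlocks and interrupts disabled on each spinning CPU, the equivalent
situation would leave those CPUs stuck with no way to service the keyboard,
the network, or the debugger.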

I'd be perfectly willing to arrange login or serial console access to the
box for anyone from SGI who cared to look at this; or just let me know
what you want me to look at and I'll be glad to report back.

Thor

