
Re: RHEL4/SL4 XFS stack problem?

To: Eric Sandeen <sandeen@xxxxxxx>
Subject: Re: RHEL4/SL4 XFS stack problem?
From: "Michael Mansour" <mic@xxxxxxxxxxx>
Date: Wed, 4 Jan 2006 14:59:16 +1000
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <Pine.LNX.4.44.0601032246260.8801-100000@xxxxxxxxxxxxxxxxxxxxxxxx>
References: <20060104014041.M15493@xxxxxxxxxxx> <Pine.LNX.4.44.0601032246260.8801-100000@xxxxxxxxxxxxxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
Hi Eric,

> > After building a couple of clusters using xfs on the shared storage device
> > (and using md and lvm on top of that), I'm getting this error now which hard
> > crashes my machines:
> > 
> >  do_IRQ: stack overflow: 284
> >   [<c01078a2>] do_IRQ+0x44/0x130
> 
> The rest of the message would be most interesting, to see what your 
> stack actually looks like.

What I've shown above is the only part I can see on the console. At that
point the keyboard is unresponsive and I have to physically power-cycle the
server.

> Recent xfs is reasonable on 4k stacks and there are a few things in 
> the works to make it better.  But depending on what you stack up in 
> your IO path you could probably still blow it.

Hmm... ok, my stack is:

md   for IDE disk mirrors
lvm  for LV support
drbd for the shared storage
xfs  for the filesystem

I run the linuxha.net HA software which uses drbd for network-linked shared
storage.

Do you think all that stacking is the problem? A previous email said that I
can build a kernel from kernel.org using the RH config file, but changed to
use 8k stacks. Would that make this work?
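
Before rebuilding, it may help to confirm which stack size a given kernel
config actually selects. The following is only a minimal sketch: it assumes
the option is named CONFIG_4KSTACKS, as in mainline 2.6 i386 kernels, and
that the config is available as a plain-text file. The function name
stack_setting and the path in the usage comment are made up for illustration.

```python
import sys


def stack_setting(config_text):
    """Report the kernel stack size implied by a kernel config.

    Assumption: the option is spelled CONFIG_4KSTACKS (mainline 2.6 i386).
    Returns "4k" if it is enabled, "8k" if it is explicitly not set,
    and "unknown" if the option does not appear at all.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == "CONFIG_4KSTACKS=y":
            return "4k"
        if line == "# CONFIG_4KSTACKS is not set":
            return "8k"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical usage: python check_stack.py /boot/config-<kernel version>
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as f:
            print(stack_setting(f.read()))
```

An "unknown" result would mean the option is absent from that config, in
which case this check says nothing either way.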

Thanks.

Michael.

> -Eric
> 
> > I'm using Scientific Linux 4.2 (RHEL4 Update 2) with a SL Contrib kernel of:
> > 
> > kernel-smp-2.6.9-11.EL.XFS
> > 
> > which has xfs support. I also use the xfsprogs rpm supplied by Dag Wieers. I
> > run on an x86 platform.
> > 
> > After googling quite a bit, it seems that RH have caused an issue with their
> > RHEL4 release by only enabling a 4k stack, where it seems that XFS requires
> > an 8k stack?
> > 
> > I'd really like to know how to fix this problem as I just finished months of
> > work building a couple of SL4 clustered environments using XFS, and now with
> > this problem am looking at the unpleasant alternative of getting rid of the
> > XFS filesystems and changing them to ext3, which will take me approximately
> > half a day of work per cluster for the added benefit of a slower filesystem.
> > 
> > I just visited the SGI site to see if there's any hints to fixes of this
> > problem there, which is where I got this email address from.
> > 
> > Any help is very much appreciated.
> > 
> > Michael.
> > 
> >
------- End of Original Message -------

