
Re: kernel-smp-2.4.3-SGI_XFS_1.0.1_PR3 bug report

To: linux-xfs list <linux-xfs@xxxxxxxxxxx>
Subject: Re: kernel-smp-2.4.3-SGI_XFS_1.0.1_PR3 bug report
From: Christopher McCrory <chrismcc@xxxxxxxxxxxxxxxx>
Date: Fri, 06 Jul 2001 11:15:39 -0700
Organization: Pricegrabber.com
Sender: owner-linux-xfs@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.2) Gecko/20010702
Hello...

        To make it easier, I'll summarize several message posts:

Steve Lord wrote:
> The 2.4.5 based kernel RPMs do not include any redhat patches, they are
> just a base kernel + XFS. Attempting to integrate redhat's patch set into
> a different kernel base is like juggling spaghetti!

D'oh. I should have actually looked at what was in them. I was guessing they were RH rawhide + XFS.


Eric Sandeen wrote:
> Hm, is aacraid even in 2.4.5?

From the other posts, apparently not. Oops.


Christian, Chip wrote:

> It was added by RedHat, so I guess the expectation is that it exist in all kernel rpms?

Actually, I thought they were in Linus's kernel already.



Nathan Straz wrote:

> Did you get a look at the process table before you took the machine
> down?  Theorically, if some of the Apache processes hit a common
> deadlock, it would keep spawning new servers which would hit the same
> deadlock and the load would skyrocket like you describe.
>
> I don't remember if kdb is in PR3, but if you could get a back trace of
> the deadlocked processes, that might be helpful.

<and>

Seth Mos wrote:
> More details please, what does it do beside webserving in what setup is
> the machine configured.
> Do the logfiles or dmesg have any messages, are there hung processes is
> this a highmem machine?

<and others similar>

This server is a web-only server, part of a server farm behind a Cisco LocalDirector. So if there were a PHP coding problem or an Apache problem, I would probably have seen it across all the servers. Also, going back to the XFS 1.0 kernel rpm made the problem go away.

I have a devel machine that I originally tested XFS on (including the 1.0.1PR3 rpms). It runs fine. The server I had the problems on is a production server now, so I really can't test anything on it. No web traffic == No $$$

When this was happening, my main concern was getting the server back in operation, not collecting debug/process info. Sorry. But I did think it warranted at least a report to the developers.

I'll try the next PR release and watch it. I'll try to be ready to get error messages and such.


--
Christopher McCrory
 "The guy that keeps the servers running"
chrismcc@xxxxxxxxxxxxxxxx
http://www.pricegrabber.com

I don't make jokes in base 13. Anyone who does should get help. --Douglas Adams

