On Fri, Nov 15, 2002 at 01:52:33PM +0100, Andi Kleen wrote:
> The 2.5 VM is still rather untuned, so you may just run into some
> generic VM problem.
It's something else (or was) I think.
Under memory pressure (xfsdump to another filesystem is a good way to
show this) various allocations fail but don't seem to be harmful. To
get the failure messages, I find you typically need to dump several GB.
Those messages are the result of read-ahead failing[1]. I mentioned
this to hch, who knows this code 10x better than me; he seemed to think
it wasn't the cause of the oopsen I was seeing, but rather that it was
read-ahead and that it would fail gracefully.
That said, shortly after this I would get oopsen: every night when
xfsdump was running I would get an oops. About five days ago hch
committed quite a few changes and my oopsen stopped, but xfsdump was
getting stuck in io_schedule.
I decided to try to track this down with kdb, only it went away
completely a couple of days ago. I've since repeated my xfsdump
stress-test, which *always* made this happen, something like 50 times
and it's worked every time. I can no longer cause the oops, and the
process no longer gets stuck.
I guess if nobody puts their hand up to claim to have fixed this, I
should cvs update -D "5 days ago" sort of thing and try to get it to
oops again, but so far, it's working so the temptation and desire to
do this is rather limited :)
--cw
[1] Putting a show_stack() or whatever it's called in to dump the
call-path when you get allocation failures shows this.