
Re: ADD 804570 - The elevator bug

To: Tony Gale <gale@xxxxxxxxxxxxxxxxxx>
Subject: Re: ADD 804570 - The elevator bug
From: utz lehmann <xfs@xxxxxxxxxx>
Date: Mon, 4 Dec 2000 19:22:52 +0100
Cc: Russell Cattelan <cattelan@xxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <XFMail.20001204140814.gale@syntax.dera.gov.uk>
References: <3A28A221.37A42167@thebarn.com> <XFMail.20001204140814.gale@syntax.dera.gov.uk>
Sender: owner-linux-xfs@xxxxxxxxxxx
User-agent: Mutt/1.2.5i
hello

sounds like the recovery bug i found last week.

try an older kernel version. the one from november 14th should work.
the beta kernel (test5 based) should work too.
or try the following patch.
the latest cvs version still has the bug, i just checked it.


utz

-----------------------------------------------------------------------

Date: Fri, 01 Dec 2000 19:12:03 -0800
From: Rajagopal Ananthanarayanan <ananth@xxxxxxx>
To: utz lehmann <xfs@xxxxxxxxxx>
CC: linux-xfs@xxxxxxxxxxx
Subject: Re: kernelcrash during root filesystem recovery

utz lehmann wrote:
> 
> ok, here is the backtrace (via serial console):
        [ ... ]
> 
> Starting XFS recovery on filesystem: ide0(3,6) (dev: 3/6)
> kernel BUG at slab.c:1542!
> 
        [ ... ]
> 
> what should i do next?

First, the immediate BUG() is due to a bogus sized
kmalloc being requested.

Second, I've been seeing problems here with recovery;
so far I thought it was a bug in code that I've been
working on. But looking at your backtrace, maybe something
else is broken.

Looking through some recent changes, I think a bcopy
was accidentally deleted. In file fs/xfs/xfs_log_recover.c,
AFTER the kmem_realloc( ... ) at line 1218, can you ADD:

         bcopy(dp , &ptr[old_len], len);                 /* s, d, l */

Can you please recompile & retry recovery?
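
To illustrate why the missing copy matters, here is a minimal userspace
sketch of the grow-then-append pattern. It is not the XFS code: the
helper name append_chunk is hypothetical, and realloc/memcpy stand in
for kmem_realloc/bcopy. Note that memcpy takes (dst, src, len) while
bcopy takes (src, dst, len). If the copy is left out, the newly grown
tail of the buffer holds indeterminate bytes, and anything later read
from it (such as a length) can be garbage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: grow buf from old_len
 * to old_len + len bytes and append chunk at the end.
 * Returns the (possibly moved) buffer, or NULL if realloc fails,
 * in which case buf is untouched and still owned by the caller.
 */
static char *append_chunk(char *buf, size_t old_len,
                          const char *chunk, size_t len)
{
    char *grown;

    assert(chunk != NULL || len == 0);

    grown = realloc(buf, old_len + len);
    if (grown == NULL)
        return NULL;

    /* This is the step the missing bcopy corresponds to: without it
     * the bytes at grown[old_len .. old_len + len) are never set. */
    if (len > 0)
        memcpy(grown + old_len, chunk, len);

    return grown;
}
```

A caller would pass the current buffer and its length together with the
continuation data, and use the returned pointer from then on.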

Thanks for your efforts in providing debug information!

ananth.

PS: Daniel, revision 1.195 is where the bcopy was taken out.
    It appears to be an error. Can you please check?

-- 
--------------------------------------------------------------------------
Rajagopal Ananthanarayanan ("ananth")
Member Technical Staff, SGI.
--------------------------------------------------------------------------



Tony Gale [gale@xxxxxxxxxxxxxxxxxx] wrote:
> 
> This may account for my test xfs news server not surviving for more
> than a week. But, the filesystem pretty much goes unrecoverable after
> I am forced to reset the box:
> 
> kmem_alloc doing a vmalloc 241488 size & PAGE_SIZE 0 rval=0xf8829000
> Start mounting filesystem: sd(8,17)
> Starting XFS recovery on filesystem: sd(8,17) (dev: 8/17)
> cmn_err level 1 Filesystem "sd(8,17)": xfs_inode_recover: Bad inode
> log record, rec ptr 0xf5165fc0, dino ptr 0xf5091d00, dino bp
> 0xe2cb73c0, ino 121654813, total extents = -4746, nblocks = 16
> XFS: log mount/recovery failed
> XFS: log mount failed
> Size 241488 doing a vfree 0xf8829000
> 
> Now xfs_check is spewing countless lines like this (with the block
> number increasing):
> 
> block 2/195770000 out of range
> 
> -tony
> 
> 
> 
> 
> On 02-Dec-2000 Russell Cattelan wrote:
> > 
> > Yes this is a known problem in the latest 2.4 kernels.
> > It has been observed on other file systems as well, not just XFS.
> > 
> > I do have a kernel with Jens' elevator patch, that does
> > appear to fix the starvation problem. Unfortunately it appears to
> > either have problems itself or is exposing problems in the XFS code.
> > 
> > Currently XFS kiobuf based io causes a lockup that eventually causes
> > the kernel to throw an NMI.
> > 
> > Non kiobuf io causes pagebuf to panic under heavy load.
> > 
> > I got this running late friday and haven't had much
> > of a chance to investigate.
> > 
> > Since this is a linux bug we are waiting for the official
> > fix to show up in the linux tree.
> > 
> > --
> > Russell Cattelan
> > cattelan@xxxxxxxxxxx
> 
> ---
> E-Mail: Tony Gale <gale@xxxxxxxxxxxxxxxxxx>
> I cannot draw a cart, nor eat dried oats; If it be man's work I will do it.
> 
> The views expressed above are entirely those of the writer
> and do not represent the views, policy or understanding of
> any other person or official body.
