Re: XFS kernel BUG at fs/buffer.c:470! with

To: Alessandro Bono <alessandro.bono@xxxxxxxxx>
Subject: Re: XFS kernel BUG at fs/buffer.c:470! with
From: Jan Kara <jack@xxxxxxx>
Date: Thu, 26 Feb 2009 17:58:39 +0100
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-xfs <linux-xfs@xxxxxxxxxxx>, linux-kernel <linux-kernel@xxxxxxxxxxxxxxx>
In-reply-to: <1234432077.9204.15.camel@xxxxxxxxxxxxxxxxx>
References: <1234011974.7435.11.camel@xxxxxxxxxxxxxxxxx> <20090208222859.GA2532@xxxxxxxxxxxxx> <1234132752.12370.0.camel@xxxxxxxxxxxxxxxxx> <20090208224249.GA11931@xxxxxxxxxxxxx> <1234133120.12370.7.camel@xxxxxxxxxxxxxxxxx> <20090209075308.GA7360@xxxxxxxxxxxxx> <20090210104304.GP8830@disturbed> <1234432077.9204.15.camel@xxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.13 (2006-08-11)
> On Tue, 2009-02-10 at 21:43 +1100, Dave Chinner wrote:
> > On Mon, Feb 09, 2009 at 02:53:08AM -0500, Christoph Hellwig wrote:
> > > On Sun, Feb 08, 2009 at 11:45:20PM +0100, Alessandro Bono wrote:
> > > > sure, attached
> > > 
> > > That would be a missing PagePrivate bit in page_buffers() called from
> > > end_buffer_async_write.  PG_private can only be cleared via drop_buffers
> > > which requires the page not having PG_writeback set which must be
> > > set until end_buffer_async_write is done.  Very strange, and all this
> > > is generic code without xfs involvement.  Did this happen once
> > > or can you reproduce it?
> > 
> > Hmmmm - i wonder if this has anything to do with the writeback fixes
> > that went into Alessandro, can you revert to (not
> > plain 2.6.28) and see if you can reproduce the problem?
> another test another bug
> kernel 2.6.29-rc4-git4 with DEBUG_PAGEALLOC and CONFIG_DEBUG_LIST
> enabled (idea taken from a totally unrelated mail from Ingo Molnar to
> catch a memory corruption), usual bug attached
> 2.6.27 from Ubuntu did not survive the rsync
> btw my first report of a similar problem was with a kernel but
> at the time I was using the binary driver for my radeon card and Christoph
> suggested that I reproduce the problem without any binary driver
> maybe it's not a recent regression, it's simply easier to hit with a
> newer kernel
> I haven't abandoned the idea of a hardware problem but I don't know how
> to be sure
> any suggestions?
  Hmm, are you still able to reproduce the problem? Looking at the
registers in your dump, none of them seems to contain sensible page
flags, so it could be some corruption of the page pointer. If you are
still able to reproduce it, could you please do so with the attached patch
applied? It will dump much more information for us... Thanks.
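
  For reference, the invariant Christoph describes in the quoted text
(PG_private can only be cleared via drop_buffers, which refuses to run
while PG_writeback is set) can be modelled in user space roughly as
below. This is only a simplified sketch, not kernel code and not the
attached patch: struct page_model, the MODEL_PG_* flags and all model_*
functions are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PG_private and PG_writeback page flags. */
#define MODEL_PG_PRIVATE   (1u << 0)  /* buffers are attached to the page */
#define MODEL_PG_WRITEBACK (1u << 1)  /* page is under writeback */

/* Toy model of a page; the real struct page is far more complex. */
struct page_model {
    unsigned int flags;
    void *buffers;
};

/* Attaching buffers sets the "private" flag. */
static void model_attach_buffers(struct page_model *p, void *bufs)
{
    assert(bufs != NULL);
    p->buffers = bufs;
    p->flags |= MODEL_PG_PRIVATE;
}

static void model_start_writeback(struct page_model *p)
{
    p->flags |= MODEL_PG_WRITEBACK;
}

/* Dropping buffers must fail while the page is under writeback. */
static bool model_drop_buffers(struct page_model *p)
{
    if (p->flags & MODEL_PG_WRITEBACK)
        return false;
    p->buffers = NULL;
    p->flags &= ~MODEL_PG_PRIVATE;
    return true;
}

/* Returns NULL in the situation where the kernel check would BUG. */
static void *model_page_buffers(const struct page_model *p)
{
    if (!(p->flags & MODEL_PG_PRIVATE))
        return NULL;
    return p->buffers;
}

/* Write completion looks up the buffers first, then ends writeback. */
static bool model_end_async_write(struct page_model *p)
{
    if (model_page_buffers(p) == NULL)
        return false;  /* corresponds to the reported BUG */
    p->flags &= ~MODEL_PG_WRITEBACK;
    return true;
}
```

  In this model the failing case can only be reached if the flags (or the
page pointer itself) are damaged from outside the normal attach, writeback,
drop sequence, which is why corruption looks like a plausible explanation.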


Jan Kara <jack@xxxxxxx>
SuSE CR Labs