
Re: Crashes in various ext2 functions while running xfstest/check

To: Chris Pascoe <c.pascoe@xxxxxxxxxxxxxx>
Subject: Re: Crashes in various ext2 functions while running xfstest/check
From: Steve Lord <lord@xxxxxxx>
Date: Mon, 04 Jun 2001 09:42:09 -0500
Cc: Steve Lord <lord@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: Message from Chris Pascoe <c.pascoe@csee.uq.edu.au> of "Mon, 04 Jun 2001 19:10:02 +1000." <Pine.GSO.4.33.0106041759020.8217-100000@mango.csee.uq.edu.au>
Sender: owner-linux-xfs@xxxxxxxxxxx

Hi Chris,

Thanks for the detailed analysis; it gives me a couple of ideas. The
changes you describe will work, since you have effectively disabled
highmem by doing so. I think the problem is more likely that XFS is
not cleaning up these pages correctly in some situation.

Steve


> Hi Steve,
> 
> Further to my last emails on this, I think I've tracked down why the
> crashes occur, but don't know how to fix it.  I eliminated the SCSI
> hardware, Ethernet card, etc. that Seth Mos suggested might be the problem
> (got loans of completely different hardware).  I can reliably crash my
> test machine in under an hour by running test 013 in a loop, and letting
> the "/etc/cron.hourly/sysstat" cron job run.  Doing some random other
> commands during the process helps speed the crash up.
> 
> The crashes I see are related to the machine having highmem support, and
> buffers allocated with pages in high memory making their way onto the
> (fs/buffer.c) free_list.  I added an extra field to struct buffer_head
> that records in the buffer head who created it (in create_empty_buffers),
> and what function called put_last_free.  In every instance, the
> buffer_head that causes the crash was created by
> hook_buffers_to_page_delay, and put onto the free list later by a call to
> __invalidate_buffers.  (I then added code to record in the bh who called
> that, and it crashed again; the caller was blkdev_put this time, but I'll
> run a few more tests.)
> 
> When one of these bh's with bh->b_page in high memory is given to ext2 by
> getblk, and a "bread" performed, bh->b_data gets set to values < PAGE_SIZE
> by a call to set_bh_page.  This is why it looked like the bh's were
> corrupted in my previous backtraces.  The actual disk I/O that was
> performed on these pages proceeds okay though, as ll_rw_blk() calls
> create_bounce for the real disk I/O (which is why the dereferences
> you saw came after a successful call to bread).
> 
> I can seemingly (no crashes after a weekend of repeats) make the crashes
> go away by replacing GFP_HIGHUSER with GFP_USER in clean_inode
> (fs/inode.c), and _pagebuf_lookup_pages (fs/pagebuf/page_buf.c).
> Changing one alone doesn't make any difference.
> 
> Hope that this makes some sense to you, and you can just say aha, and wave
> the magic wand :).  I hope you can replicate it locally with this
> information.
> 
> Regards,
> Chris
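
For readers following the thread, the b_data symptom Chris describes can be
sketched in user space. This is only a model, not the kernel source: the
struct layouts, MODEL_PAGE_SIZE, model_set_bh_page and bh_data_is_bogus are
hypothetical names, and it assumes that set_bh_page stores just the in-page
offset when the page is in high memory (which is what the "values <
PAGE_SIZE" observation suggests).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages, as on i386 */

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct page {
    int   highmem;  /* non-zero: page has no permanent kernel mapping */
    char *vaddr;    /* kernel virtual address; NULL for a highmem page */
};

struct buffer_head {
    struct page *b_page;
    char        *b_data;
};

/*
 * Model of the behaviour described in the message: for a lowmem page
 * b_data is a real address, for a highmem page only the offset within
 * the page is kept, so b_data ends up below the page size.
 */
static void model_set_bh_page(struct buffer_head *bh, struct page *page,
                              unsigned long offset)
{
    assert(offset < MODEL_PAGE_SIZE);  /* offset must lie inside the page */
    bh->b_page = page;
    if (page->highmem)
        bh->b_data = (char *)(uintptr_t)offset;  /* not dereferenceable */
    else
        bh->b_data = page->vaddr + offset;
}

/* The kind of check a debugging patch could use to flag such a bh. */
static int bh_data_is_bogus(const struct buffer_head *bh)
{
    return (uintptr_t)bh->b_data < MODEL_PAGE_SIZE;
}
```

Under those assumptions, a buffer head attached to a highmem page passes
through getblk and bread looking valid, and only fails when a filesystem
such as ext2 dereferences b_data directly. Allocating with GFP_USER instead
of GFP_HIGHUSER keeps the page in low memory, so the else branch is always
taken, which matches Steve's remark that the change hides the problem
rather than fixing it.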


