Just an update: I have replicated this here. It is not strictly a highmem
issue, but highmem with mixed filesystems is the only place it will cause
problems: ext2 cannot cope with the buffers it is getting, which XFS has
incorrectly returned to the buffer cache. The XFS feature that appears to be
behind this is the mixing of direct and buffered I/O on the same file. For
most people this should not be an issue, but some of the XFS test suite
running on a highmem box can trigger it. I am not sure whether fixing this
will clear up other people's XFS highmem problems.

Steve

> Hi Steve,
>
> Further to my last emails on this, I think I've tracked down why the
> crashes occur, but I don't know how to fix it. I eliminated the SCSI
> hardware, Ethernet card, etc. that Seth Mos suggested might be the problem
> (I got loans of completely different hardware). I can reliably crash my
> test machine in under an hour by running test 013 in a loop and letting
> the "/etc/cron.hourly/sysstat" cron job run. Running some other random
> commands during the process helps speed the crash up.
>
> The crashes I see are related to the machine having highmem support, and
> buffers allocated with pages in high memory making their way onto the
> (fs/buffer.c) free_list. I added an extra field to struct buffer_head
> that records in the buffer head who created it (in create_empty_buffers),
> and what function called put_last_free. In every instance, the
> buffer_head that causes the crash was created by
> hook_buffers_to_page_delay, and put onto the free list later by a call to
> __invalidate_buffers. (Adding code to record in the bh who called
> that... done... crashed. The caller was blkdev_put this time, but I'll
> run a few more tests.)
>
> When one of these bh's with bh->b_page in high memory is given to ext2 by
> getblk, and a "bread" is performed, bh->b_data gets set to values < PAGE_SIZE
> by a call to set_bh_page. This is why it looked like the bh's were
> corrupted in my previous backtraces. The actual disk IO that was
> performed on these pages proceeds okay though, as ll_rw_blk() does
> create_bounce's for the real disk I/O (which is why the dereferences
> you saw came after a successful call to bread).
>
> I can seemingly (no crashes after a weekend of repeats) make the crashes
> go away by replacing GFP_HIGHUSER with GFP_USER in both clean_inode
> (fs/inode.c) and _pagebuf_lookup_pages (fs/pagebuf/page_buf.c).
> Changing only one of them doesn't make any difference.
>
> Hope that this makes some sense to you, and you can just say "aha" and wave
> the magic wand :). I hope you can replicate it locally with this
> information.
>
> Regards,
> Chris
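
For anyone following the thread, here is a rough user-space sketch of the
b_data symptom Chris describes. It is not the kernel code. It assumes that
set_bh_page stores only the in-page offset in b_data when the page is in
high memory (which would explain the values < PAGE_SIZE), and the real
virtual address otherwise. All names here (model_page, model_bh,
model_set_bh_page, model_bh_data_usable, MODEL_PAGE_SIZE) are made up for
the illustration, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL  /* assumed page size, for illustration only */

/* Hypothetical stand-in for struct page. */
struct model_page {
    bool  highmem;  /* true: page has no permanent kernel mapping */
    char *virt;     /* kernel virtual address; meaningful only for lowmem */
};

/* Hypothetical stand-in for the two buffer_head fields discussed here. */
struct model_bh {
    struct model_page *b_page;
    char              *b_data;
};

/*
 * Model of the assumed set_bh_page behaviour: a lowmem page gets a real
 * pointer, a highmem page gets only the offset within the page, so any
 * later dereference of b_data hits an address below MODEL_PAGE_SIZE.
 */
static void model_set_bh_page(struct model_bh *bh, struct model_page *page,
                              unsigned long offset)
{
    assert(offset < MODEL_PAGE_SIZE);  /* offset must stay inside the page */
    bh->b_page = page;
    if (page->highmem)
        bh->b_data = (char *)(uintptr_t)offset;
    else
        bh->b_data = page->virt + offset;
}

/* A filesystem that dereferences b_data directly needs this to be true. */
static bool model_bh_data_usable(const struct model_bh *bh)
{
    return (uintptr_t)bh->b_data >= MODEL_PAGE_SIZE;
}
```

In this model a buffer head attached to a highmem page fails the
model_bh_data_usable check, which matches the "corrupted looking" b_data
values in the earlier backtraces: the buffer head itself is intact, but a
filesystem that expects a directly addressable b_data cannot use it.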