> Juha Saarinen wrote:
> >
> > fsstress filled up /usr 100% so I was going to remove the p* directories
> > and files. This provoked an oops:
> >
> > Kernel BUG at ll_rw_blk.c:700!
> >
>
> Ah. ll_rw_blk.c:700. I know it well.
>
> My bet: A buffer was unmapped in block_flushpage() but somehow
> somebody set it dirty again, and bdflush/kupdate tried to write
> the thing out.
We managed to get into the read path without mapping the buffer. The
information I really need to diagnose this is elsewhere in the kernel,
not in the buffer_head, so it is time to debug by induction.
>
> For ext3 I have implemented a "buffer tracing" mechanism. At
> any interesting point in the lifecycle of a buffer_head you
> call
>
> BUFFER_TRACE(bh, "some descriptive text")
>
> then, throughout the code I've added things like:
>
> J_ASSERT_BH(bh, some_expression);
>
> If the assertion fails you get a 64-slot backtrace of
> all the buffer's state transitions. There's an example
> output trace from when I was hitting exactly this same BUG()
> at http://www.zip.com.au/~akpm/buffer-trace.txt - it's great.
> See how journal_commit_transaction incorrectly sets the
> buffer dirty and then flush_dirty_buffers grabs it.
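
To make the idea concrete, here is a rough user-space sketch of that kind of
per-buffer trace ring. It is not the ext3 code: struct toy_bh, toy_trace(),
toy_dump_trace(), toy_demo() and the TOY_* state bits are all made up for
illustration, and only the BUFFER_TRACE/J_ASSERT_BH names and the 64-slot
history come from the description above. The real kernel macro would
presumably BUG() after dumping; this one just evaluates to 0.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TRACE_SLOTS 64          /* the 64-slot history mentioned above */
#define TOY_MAPPED  0x1UL       /* hypothetical state bits */
#define TOY_DIRTY   0x2UL

/* One recorded state transition. */
struct trace_slot {
    const char *file;
    int line;
    const char *msg;
    unsigned long state;        /* snapshot of the state at trace time */
};

/* Hypothetical stand-in for buffer_head; only what the sketch needs. */
struct toy_bh {
    unsigned long state;
    unsigned int ntraced;       /* total number of trace calls so far */
    struct trace_slot trace[TRACE_SLOTS];
};

/* Record one event in the ring, overwriting the oldest slot when full. */
static void toy_trace(struct toy_bh *bh, const char *file, int line,
                      const char *msg)
{
    struct trace_slot *s = &bh->trace[bh->ntraced % TRACE_SLOTS];

    s->file = file;
    s->line = line;
    s->msg = msg;
    s->state = bh->state;
    bh->ntraced++;
}

#define BUFFER_TRACE(bh, msg) toy_trace((bh), __FILE__, __LINE__, (msg))

/* Print the history oldest first; returns the number of entries printed. */
static unsigned int toy_dump_trace(const struct toy_bh *bh, FILE *out)
{
    unsigned int count = bh->ntraced < TRACE_SLOTS ? bh->ntraced : TRACE_SLOTS;
    unsigned int first = bh->ntraced - count;
    unsigned int i;

    for (i = 0; i < count; i++) {
        const struct trace_slot *s = &bh->trace[(first + i) % TRACE_SLOTS];

        fprintf(out, "%s:%d state=%#lx %s\n",
                s->file, s->line, s->state, s->msg);
    }
    return count;
}

/* Evaluates to 1 if expr holds; otherwise dumps the history and yields 0. */
#define J_ASSERT_BH(bh, expr)                                         \
    ((expr) ? 1                                                       \
            : (fprintf(stderr, "assertion failed: %s\n", #expr),      \
               toy_dump_trace((bh), stderr), 0))

/* Replays the suspected sequence: unmapped, then dirtied again. */
static int toy_demo(void)
{
    struct toy_bh bh;

    memset(&bh, 0, sizeof bh);
    bh.state = TOY_MAPPED;
    BUFFER_TRACE(&bh, "mapped");
    bh.state &= ~TOY_MAPPED;
    BUFFER_TRACE(&bh, "unmapped by flushpage");
    bh.state |= TOY_DIRTY;
    BUFFER_TRACE(&bh, "set dirty");

    /* A dirty buffer must also be mapped; this fails and dumps the trace. */
    return J_ASSERT_BH(&bh, !(bh.state & TOY_DIRTY) || (bh.state & TOY_MAPPED));
}
```

Calling toy_demo() trips the assertion and prints the three recorded
transitions to stderr, which is the sort of history that shows who dirtied
the buffer after it was unmapped.
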
We do have buffer tracing, but at the pagebuf buffer level, not the
buffer head level; all metadata is managed through a different structure.
Buffer heads only come into play when we actually do I/O. But it looks
useful for the I/O path end of things; I may take a look if I can find
the time.
Steve