
Re: Buffer Head Corruption Found

To: dcox@xxxxxxxxxx
Subject: Re: Buffer Head Corruption Found
From: Steve Lord <lord@xxxxxxx>
Date: Wed, 24 Jan 2001 15:50:00 -0600
Cc: Linux-XFS <linux-xfs@xxxxxxxxxxx>
In-reply-to: Message from Danny <danscox@xxxxxxxxxxxxxx> of "Tue, 23 Jan 2001 17:14:41 EST." <3A6E0251.65BD4FB7@xxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
> All,
> 
>       I just spent today performing a pseudo binary search for a buffer head
> corruption I have been experiencing with XFS and RAID5.  I have no idea
> why it only happens in this instance, as you'll see.
> 
>       In page_buf.c, around line 1424, a call is made to
> kmem_cache_alloc().  The short story is: at least one pointer is returned
> that is already in use!
> 
>       I wrote a function that steps through the buffer_head lists, and checks
> for b_next_free == NULL.  Since it's a circular list, that should never
> be true.
> 
>       However, after the call to kmem_cache_alloc, and the subsequent
> 'memset(bh, 0,...)', I have my NULL.  This also is the source of most of my
> Oopses from within buffer.c.  Those functions are not expecting a NULL in
> b_next_free at all ;-).
> 
>       So: I've found it, but I have no idea why kmem_cache_alloc would return
> a previously used bh, nor what to do about it.
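
For readers following along, the consistency check described above can be
sketched in user space roughly as follows. This is not Danny's function and
not kernel code: struct bh_sketch, check_circular() and the max_nodes bound
are hypothetical names introduced for illustration, and only the b_next_free
link is modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct buffer_head; only the
 * free-list link discussed in the message is modelled here. */
struct bh_sketch {
    struct bh_sketch *b_next_free;
};

/* Walk a circular list starting at head, following at most max_nodes links.
 * Returns 0 if the walk gets back to head, or -1 if a NULL link is found
 * or the list does not close within max_nodes steps. */
int check_circular(const struct bh_sketch *head, size_t max_nodes)
{
    const struct bh_sketch *p = head;
    size_t i;

    if (head == NULL)
        return 0;               /* an empty list is trivially consistent */

    for (i = 0; i < max_nodes; i++) {
        if (p->b_next_free == NULL)
            return -1;          /* a circular list must never contain NULL */
        p = p->b_next_free;
        if (p == head)
            return 0;           /* closed the circle */
    }
    return -1;                  /* never returned to head: treat as corrupt */
}
```

The bound on the walk is a design choice for the sketch: a corrupted list may
loop without ever returning to head, so an unbounded walk could spin forever.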

Hmm, I am not sure how kmem_cache_alloc can do that either. Is it not more
likely that a buffer is being freed but not removed from the list, i.e. the
needle is in that other haystack over there? Turning on memory poisoning may
make things fall over faster. In mm/slab.c there are three defines:

#define DEBUG           0
#define STATS           0
#define FORCED_DEBUG    0

I think you want to set the DEBUG flag to 1.
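
To illustrate why poisoning makes this kind of bug fall over sooner, here is
a minimal user-space sketch of the general idea. It is not the mm/slab.c
implementation: the one-slot cache, the sketch_* names and the pattern value
are all assumptions made up for the example.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_POISON   0x5a    /* arbitrary pattern chosen for this sketch */
#define SKETCH_OBJ_SIZE 32

/* Hypothetical one-slot "cache": just enough state to show the idea. */
struct sketch_cache {
    unsigned char obj[SKETCH_OBJ_SIZE];
    int in_use;
};

/* Free: overwrite the object with the poison pattern and mark it free.
 * Calling this once on a new cache also serves as initialisation. */
void sketch_free(struct sketch_cache *c)
{
    memset(c->obj, SKETCH_POISON, sizeof c->obj);
    c->in_use = 0;
}

/* Alloc: returns NULL if the slot is busy, or if the poison pattern was
 * disturbed while the object was free (someone wrote through a stale
 * pointer). Otherwise hands out the object. */
void *sketch_alloc(struct sketch_cache *c)
{
    size_t i;

    if (c->in_use)
        return NULL;
    for (i = 0; i < sizeof c->obj; i++) {
        if (c->obj[i] != SKETCH_POISON)
            return NULL;        /* use-after-free detected */
    }
    c->in_use = 1;
    return c->obj;
}
```

In this sketch a write to a freed object is caught at the next allocation
rather than much later, when some unrelated code trips over the damage.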

Steve


> 
>       Ideas?
> 
>       Thanks!
> 
> -- 
> "Men occasionally stumble over the truth, but most of them pick 
> themselves up and hurry off as if nothing had happened." 
>    -- Winston Churchill 
> 
> Danny


