[BUG report]xfs_btree_make_block_unfull generated an OOPS
Eric Sandeen
sandeen at sandeen.net
Mon Dec 14 21:15:09 CST 2009
hank peng wrote:
> 2009/12/15 Dave Chinner <david at fromorbit.com>:
>> On Tue, Dec 15, 2009 at 08:49:37AM +0800, hank peng wrote:
>>> Hi, Eric:
>>> I added some code like this:
>>> if (*stat) {
>>> printk("*stat = 0x%08x, oindex = %p, index = %p\n",
>>> *stat, oindex, index);
>>> if (oindex == NULL || index == NULL) {
>> This won't catch bad non-NULL pointers like you are seeing.
>>
>>> printk("BUG occurred!\n");
>>> printk("oindex = %p, index = %p\n", oindex, index);
>>> BUG();
>>> }
>>> *oindex = *index = cur->bc_ptrs[level];
>>> return 0;
>>> }
>>>
>>> The same OOPS happened again, but slightly differently. The kernel messages are:
>>>
>>> <snip>
>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>> *stat = 0x00000001, oindex = 00000501, index = 22008424
>>> Unable to handle kernel paging request for data at address 0x22008424
Are you using any of the xfs userspace prior to this error, or is it a
fresh boot and just normal IO?
I ask because libxfs calls sys_ustat(), which at one point was corrupting
userspace, at least with 32-bit userspace on a 64-bit kernel:
https://bugzilla.redhat.com/show_bug.cgi?id=472795
Even with that fixed there were still some reports of odd behavior
on ppc... I don't know if things might be going wrong in kernelspace
as well...
https://bugzilla.redhat.com/show_bug.cgi?id=517994
and I haven't gotten to the bottom of that yet ...
Very few things actually use sys_ustat, but xfs userspace does...
just a random thought.
-eric
>> Given that oindex and index are stack variables, this indicates
>> something is probably smashing the stack. Possibly a buffer overrun. To
>> narrow down the possible cause, can you add the debug:
>>
>> printk("%s:%d: oindex = %p, index = %p\n",
>> __func__, __LINE__, oindex, index);
>>
>> throughout the xfs_btree_make_block_unfull() function? i.e. at
>> first entry, before the xfs_btree_rshift() call, before the
>> xfs_btree_lshift() call, etc, to see if any of the parameters
>> are being modified during execution of the function?
>>
>> If the variables being passed into xfs_btree_make_block_unfull() are
>> already bad, then do the same thing for the caller
>> xfs_btree_insert(). This may help narrow down where the problem
>> is coming from....
>>
> Thanks for your reply!
> As you said, I added some code like this:
> /* First, try shifting an entry to the right neighbor. */
> printk("%s: before xfs_btree_rshift, oindex = %p, index = %p\n",
> __func__, oindex, index);
> error = xfs_btree_rshift(cur, level, stat);
> if (error || *stat)
> return error;
>
> /* Next, try shifting an entry to the left neighbor. */
> printk("%s: before xfs_btree_lshift, oindex = %p, index = %p\n",
> __func__, oindex, index);
> error = xfs_btree_lshift(cur, level, stat);
> if (error)
> return error;
>
> if (*stat) {
> printk("*stat = 0x%08x, oindex = %p, index = %p\n",
> *stat, oindex, index);
> if (oindex == NULL || index == NULL) {
> printk("BUG occurred!\n");
> printk("oindex = %p, index = %p\n", oindex, index);
> BUG();
> }
> *oindex = *index = cur->bc_ptrs[level];
> return 0;
> }
>
>
> xfs_btree_set_ptr_null(cur, &nptr);
> if (numrecs == cur->bc_ops->get_maxrecs(cur, level)) {
> printk("%s: before calling xfs_btree_make_block_unfull, &optr = %p, &ptr = %p\n",
> __func__, &optr, &ptr);
> error = xfs_btree_make_block_unfull(cur, level, numrecs,
> &optr, &ptr, &nptr, &ncur, &nrec, stat);
> if (error || *stat == 0)
> goto error0;
> }
>
>
> We are waiting for the OOPS to happen.
>
> I hope it is not a memory corruption problem, which would be a nightmare
> for me to debug.
>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david at fromorbit.com
>>
>
>
>
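
The checkpoint tracing discussed in the quoted exchange above can be
sketched in userspace roughly as follows. This is only an illustration,
not the XFS code: ptr_snapshot, check_ptrs(), CHECK_PTRS() and
demo_make_block_unfull() are hypothetical names, and fprintf() stands in
for printk(). The idea is to record the pointer arguments on entry and
compare them at each checkpoint, which also flags bad non-NULL values
that a plain NULL test would miss.

```c
#include <stdio.h>

/* Hypothetical helper: the pointer arguments as seen at function entry. */
struct ptr_snapshot {
    const void *oindex;
    const void *index;
};

/*
 * Print one checkpoint line and compare the current pointer values with
 * the entry snapshot. Returns 1 if either pointer changed, 0 otherwise.
 */
static int check_ptrs(const char *func, int line,
                      const struct ptr_snapshot *snap,
                      const void *oindex, const void *index)
{
    int changed = (oindex != snap->oindex) || (index != snap->index);

    /* __LINE__ is an integer, so it is printed with %d. */
    fprintf(stderr, "%s:%d: oindex = %p, index = %p%s\n",
            func, line, (void *)oindex, (void *)index,
            changed ? " (CHANGED)" : "");
    return changed;
}

#define CHECK_PTRS(snap, oindex, index) \
    check_ptrs(__func__, __LINE__, (snap), (oindex), (index))

/*
 * Hypothetical stand-in for the function being traced. When 'corrupt'
 * is non-zero, the local copy of 'index' is overwritten to simulate a
 * smashed stack slot; the pointers are never dereferenced.
 */
static int demo_make_block_unfull(int *oindex, int *index, int corrupt)
{
    struct ptr_snapshot snap = { oindex, index };
    int bad = 0;

    bad |= CHECK_PTRS(&snap, oindex, index);    /* on entry */

    if (corrupt)
        index = NULL;                           /* simulated corruption */

    bad |= CHECK_PTRS(&snap, oindex, index);    /* after the "shift" step */
    return bad;
}
```

Calling demo_make_block_unfull(&a, &b, 0) with two local ints should
report no change, while passing a non-zero third argument should mark the
second checkpoint as changed. In the kernel the same comparison could sit
next to each of the printk() calls shown in the quoted code.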