[BUG report] xfs_btree_make_block_unfull generated an OOPS
hank peng
pengxihan at gmail.com
Mon Dec 14 21:22:59 CST 2009
2009/12/15 Eric Sandeen <sandeen at sandeen.net>:
> hank peng wrote:
>> 2009/12/15 Dave Chinner <david at fromorbit.com>:
>>> On Tue, Dec 15, 2009 at 08:49:37AM +0800, hank peng wrote:
>>>> Hi, Eric:
>>>> I added some code like this:
>>>> if (*stat) {
>>>> printk("*stat = 0x%08x, oindex = %p, index = %p\n",
>>>> *stat, oindex, index);
>>>> if (oindex == NULL || index == NULL) {
>>> This won't catch bad non-NULL pointers like you are seeing.
>>>
>>>> printk("BUG occurred!\n");
>>>> printk("oindex = %p, index = %p\n", oindex, index);
>>>> BUG();
>>>> }
>>>> *oindex = *index = cur->bc_ptrs[level];
>>>> return 0;
>>>> }
>>>>
>>>> The same OOPS happened again, though slightly differently. The kernel messages are:
>>>>
>>>> <snip>
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = 00000501, index = 22008424
>>>> Unable to handle kernel paging request for data at address 0x22008424
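As a rough illustration of why the NULL test above misses this case: the last log line shows oindex = 00000501, which is non-NULL but clearly not a stack address. A minimal userspace sketch of a range check follows. The helper name ptr_in_range() and the idea of comparing against a known valid region are assumptions made for illustration, not existing XFS or kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true only if p lies
 * inside the region [base, base + size). A NULL test alone would accept
 * a garbage value such as 0x00000501; a range test rejects it.
 */
static bool ptr_in_range(const void *p, const void *base, size_t size)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)base;

	/* Written this way so that lo + size cannot overflow. */
	return addr >= lo && addr - lo < size;
}
```

In the function being debugged, the region would presumably be the current kernel stack, so a check like this could replace the "== NULL" comparison before BUG().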
>
> Are you using any of the xfs userspace prior to this error, or is it a
> fresh boot and just normal IO?
>
No xfs userspace tools were used prior to this error, just normal IO. Also,
it takes some time to reproduce the OOPS.
> I ask because libxfs calls sys_ustat() which at one point was corrupting
> userspace, at least, with 32-bit userspace on a 64-bit kernel:
> https://bugzilla.redhat.com/show_bug.cgi?id=472795
>
I forgot to mention that I mount with "-o inode64".
# uname -a
Linux Storage 2.6.31.6-svn40 #30 Tue Dec 15 09:50:02 CST 2009 ppc unknown
# mount
rootfs on / type rootfs (rw)
/dev/root on / type ext2 (rw,relatime,errors=continue)
/dev/mtdblock2 on /mnt/sys_data type jffs2 (rw,relatime)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
tmpfs on /opt/upgrade type tmpfs (rw,relatime)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620)
/dev/Pool_md2/ss1 on /mnt/Pool_md2/ss1 type xfs (rw,relatime,attr2,inode64,noquota)
> Even with that fixed there were still some reports of odd behavior
> on ppc... I don't know if things might be going wrong in kernelspace
> as well...
>
> https://bugzilla.redhat.com/show_bug.cgi?id=517994
> and I haven't gotten to the bottom of that yet ...
>
> Very few things actually use sys_ustat, but xfs userspace does...
> just a random thought.
>
> -eric
>
>>> Given that oindex and index are stack variables, this indicates
>>> something is probably smashing the stack. Possibly a buffer overrun. To
>>> narrow down the possible cause, can you add the debug:
>>>
>>> printk("%s:%d: oindex = %p, index = %p\n",
>>> __func__, __LINE__, oindex, index);
>>>
>>> throughout the xfs_btree_make_block_unfull() function? i.e. at
>>> first entry, before the xfs_btree_rshift() call, before the
>>> xfs_btree_lshift() call, etc, to see if any of the parameters
>>> are being modified during execution of the function?
>>>
>>> If the variables being passed into xfs_btree_make_block_unfull() are
>>> already bad, then do the same thing for the caller
>>> xfs_btree_insert(). This may help narrow down where the problem
>>> is coming from....
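The tracing approach described above can be sketched in userspace C roughly as follows. Everything here is an assumption for illustration: TRACE_PTRS(), ptrs_changed() and demo_make_unfull() are hypothetical names, and fprintf() stands in for printk().

```c
#include <stdio.h>

/*
 * Hypothetical tracing macro (not part of XFS): prints the function name,
 * the line number and two pointer values. __LINE__ is an int, so it is
 * printed with %d.
 */
#define TRACE_PTRS(a, b) \
	fprintf(stderr, "%s:%d: oindex = %p, index = %p\n", \
		__func__, __LINE__, (void *)(a), (void *)(b))

/* Hypothetical helper: returns 1 if either pointer differs from its snapshot. */
static int ptrs_changed(const void *a, const void *a_saved,
			const void *b, const void *b_saved)
{
	return a != a_saved || b != b_saved;
}

/* Stand-in for a function shaped like xfs_btree_make_block_unfull(). */
static int demo_make_unfull(int *oindex, int *index, int value)
{
	int *oindex_saved = oindex;	/* snapshot taken at entry */
	int *index_saved = index;

	TRACE_PTRS(oindex, index);	/* at first entry */
	/* ... the rshift/lshift style calls would go here ... */
	TRACE_PTRS(oindex, index);	/* after the calls */

	if (ptrs_changed(oindex, oindex_saved, index, index_saved))
		return -1;		/* a parameter was modified */

	*oindex = *index = value;
	return 0;
}
```

Two caveats on this sketch: the snapshots live on the same stack, so a large enough overrun could damage them as well, and an optimizing compiler may fold the comparison away because nothing legally changes the parameters. The printed values are therefore the more reliable signal.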
>>>
>> Thanks for your reply!
>> As you said, I added some code like this:
>> /* First, try shifting an entry to the right neighbor. */
>> printk("%s: before xfs_btree_rshift, oindex = %p, index = %p\n",
>> __func__, oindex, index);
>> error = xfs_btree_rshift(cur, level, stat);
>> if (error || *stat)
>> return error;
>>
>> /* Next, try shifting an entry to the left neighbor. */
>> printk("%s: before xfs_btree_lshift, oindex = %p, index = %p\n",
>> __func__, oindex, index);
>> error = xfs_btree_lshift(cur, level, stat);
>> if (error)
>> return error;
>>
>> if (*stat) {
>> printk("*stat = 0x%08x, oindex = %p, index = %p\n",
>> *stat, oindex, index);
>> if (oindex == NULL || index == NULL) {
>> printk("BUG occurred!\n");
>> printk("oindex = %p, index = %p\n", oindex, index);
>> BUG();
>> }
>> *oindex = *index = cur->bc_ptrs[level];
>> return 0;
>> }
>>
>>
>> xfs_btree_set_ptr_null(cur, &nptr);
>> if (numrecs == cur->bc_ops->get_maxrecs(cur, level)) {
>> printk("%s: before calling xfs_btree_make_block_unfull, "
>> "&optr = %p, &ptr = %p\n",
>> __func__, &optr, &ptr);
>> error = xfs_btree_make_block_unfull(cur, level, numrecs,
>> &optr, &ptr, &nptr, &ncur, &nrec, stat);
>> if (error || *stat == 0)
>> goto error0;
>> }
>>
>>
>> We are now waiting for the OOPS to happen.
>>
>> I hope it is not a memory corruption problem, which would be a
>> nightmare for me to debug.
>>
>>> Cheers,
>>>
>>> Dave.
>>> --
>>> Dave Chinner
>>> david at fromorbit.com
>>>
>>
>>
>>
>
>
--
The simplest is not all best but the best is surely the simplest!