Re: [BUG report]xfs_btree_make_block_unfull generated an OOPS

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: [BUG report]xfs_btree_make_block_unfull generated an OOPS
From: hank peng <pengxihan@xxxxxxxxx>
Date: Tue, 15 Dec 2009 11:22:59 +0800
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <4B26FF3D.2020500@xxxxxxxxxxx>
References: <4B1F1211.90607@xxxxxxxxxxx> <4B1F31FD.3020705@xxxxxxxxxxx> <389deec70912082220pcb3b5d1q516ac197d31502c5@xxxxxxxxxxxxxx> <389deec70912082230g38987576pc48d7699f23844c5@xxxxxxxxxxxxxx> <389deec70912140119q40ed91cao62fe9c9ebdf13601@xxxxxxxxxxxxxx> <4B26604B.3060901@xxxxxxxxxxx> <389deec70912141649g767a1540hdeae66707c4c68fd@xxxxxxxxxxxxxx> <20091215012640.GA4850@xxxxxxxxxxxxxxxx> <389deec70912141756k23776aajbc90c6d7e3fc8d4b@xxxxxxxxxxxxxx> <4B26FF3D.2020500@xxxxxxxxxxx>
2009/12/15 Eric Sandeen <sandeen@xxxxxxxxxxx>:
> hank peng wrote:
>> 2009/12/15 Dave Chinner <david@xxxxxxxxxxxxx>:
>>> On Tue, Dec 15, 2009 at 08:49:37AM +0800, hank peng wrote:
>>>> Hi, Eric:
>>>> I added some code like this:
>>>> if (*stat) {
>>>>                 printk("*stat = 0x%08x, oindex = %p, index = %p\n",
>>>>                                 *stat, oindex, index);
>>>>                 if (oindex == NULL || index == NULL) {
>>> This won't catch bad non-NULL pointers like you are seeing.
>>>
>>>>                         printk("BUG occurred!\n");
>>>>                         printk("oindex = %p, index = %p\n", oindex, index);
>>>>                         BUG();
>>>>                 }
>>>>                 *oindex = *index = cur->bc_ptrs[level];
>>>>                 return 0;
>>>>         }
>>>>
>>>> The same OOPS happened again, but slightly differently. The kernel
>>>> messages are:
>>>>
>>>> <snip>
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = e87d7bf8, index = e87d7bfc
>>>> *stat = 0x00000001, oindex = 00000501, index = 22008424
>>>> Unable to handle kernel paging request for data at address 0x22008424
>
> Are you using any of the xfs userspace prior to this error, or is it a
> fresh boot and just normal IO?
>
No xfs userspace tools were used prior to this error, just normal IO.
Besides, it takes some time to reproduce the OOPS.

> I ask because libxfs calls sys_ustat() which at one point was corrupting
> userspace, at least, with 32-bit userspace on a 64-bit kernel:
> https://bugzilla.redhat.com/show_bug.cgi?id=472795
>
I forgot to say: I use "-o inode64" when mounting.

# uname -a
Linux Storage 2.6.31.6-svn40 #30 Tue Dec 15 09:50:02 CST 2009 ppc unknown
# mount
rootfs on / type rootfs (rw)
/dev/root on / type ext2 (rw,relatime,errors=continue)
/dev/mtdblock2 on /mnt/sys_data type jffs2 (rw,relatime)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
tmpfs on /opt/upgrade type tmpfs (rw,relatime)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620)
/dev/Pool_md2/ss1 on /mnt/Pool_md2/ss1 type xfs (rw,relatime,attr2,inode64,noquota)



> Even with that fixed there were still some reports of odd behavior
> on ppc... I don't know if things might be going wrong in kernelspace
> as well...
>
> https://bugzilla.redhat.com/show_bug.cgi?id=517994
> and I haven't gotten to the bottom of that yet ...
>
> Very few things actually use sys_ustat, but xfs userspace does...
> just a random thought.
>
> -eric
>
>>> Given that oindex and index are stack variables, this indicates
>>> something is probably smashing the stack. Possibly a buffer overrun. To
>>> narrow down the possible cause, can you add the debug:
>>>
>>>        printk("%s:%d: oindex = %p, index = %p\n",
>>>                        __func__, __LINE__, oindex, index);
>>>
>>> throughout the xfs_btree_make_block_unfull() function? i.e. at
>>> first entry, before the xfs_btree_rshift() call, before the
>>> xfs_btree_lshift() call, etc, to see if any of the parameters
>>> are being modified during execution of the function?
>>>
>>> If the variables being passed into xfs_btree_make_block_unfull() are
>>> already bad, then do the same thing for the caller
>>> xfs_btree_insert(). This may help narrow down where the problem
>>> is coming from....
>>>
>> Thanks for your reply!
>> As you said, I added some code like this:
>> /* First, try shifting an entry to the right neighbor. */
>>         printk("%s: before xfs_btree_rshift, oindex = %p, index = %p\n",
>>                         __func__, oindex, index);
>>         error = xfs_btree_rshift(cur, level, stat);
>>         if (error || *stat)
>>                 return error;
>>
>>         /* Next, try shifting an entry to the left neighbor. */
>>         printk("%s: before xfs_btree_lshift, oindex = %p, index = %p\n",
>>                         __func__, oindex, index);
>>         error = xfs_btree_lshift(cur, level, stat);
>>         if (error)
>>                 return error;
>>
>>         if (*stat) {
>>                 printk("*stat = 0x%08x, oindex = %p, index = %p\n",
>>                                 *stat, oindex, index);
>>                 if (oindex == NULL || index == NULL) {
>>                         printk("BUG occurred!\n");
>>                         printk("oindex = %p, index = %p\n", oindex, index);
>>                         BUG();
>>                 }
>>                 *oindex = *index = cur->bc_ptrs[level];
>>                 return 0;
>>         }
>>
>>
>> xfs_btree_set_ptr_null(cur, &nptr);
>>         if (numrecs == cur->bc_ops->get_maxrecs(cur, level)) {
>>                 printk("%s: before calling xfs_btree_make_block_unfull, "
>>                                 "&optr = %p, &ptr = %p\n",
>>                                 __func__, &optr, &ptr);
>>                 error = xfs_btree_make_block_unfull(cur, level, numrecs,
>>                                         &optr, &ptr, &nptr, &ncur, &nrec,
>>                                         stat);
>>                 if (error || *stat == 0)
>>                         goto error0;
>>         }
>>
>>
>> We are now waiting for the OOPS to happen.
>>
>> I hope it does not turn out to be a memory corruption problem, which
>> would be a nightmare for me to debug.
>>
>>> Cheers,
>>>
>>> Dave.
>>> --
>>> Dave Chinner
>>> david@xxxxxxxxxxxxx
>>>
>>
>>
>>
>
>
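
To illustrate Dave's point that a NULL test cannot catch a bad non-NULL
pointer, here is a minimal userspace sketch of one alternative: save the
pointer arguments on entry and compare them at each checkpoint. This is
not kernel code and not part of the XFS sources; struct ptr_guard,
ptr_guard_ok(), PTR_GUARD_OK() and demo() are hypothetical names made up
for this example, and the bad value is simply copied from the log above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: snapshot of the two pointer arguments on entry. */
struct ptr_guard {
	const void *oindex;
	const void *index;
};

/* Return 1 if both pointers still match the snapshot, 0 otherwise. */
static int ptr_guard_ok(const struct ptr_guard *g, const void *oindex,
			const void *index, const char *func, int line)
{
	if (g->oindex == oindex && g->index == index)
		return 1;
	fprintf(stderr,
		"%s:%d: pointers changed: oindex %p -> %p, index %p -> %p\n",
		func, line, (void *)g->oindex, (void *)oindex,
		(void *)g->index, (void *)index);
	return 0;
}

#define PTR_GUARD_OK(g, o, i) \
	ptr_guard_ok((g), (o), (i), __func__, __LINE__)

/*
 * Simulate the failure: the pointer is overwritten with a non-NULL
 * garbage value. Returns 1 if the guard noticed the change.
 */
static int demo(void)
{
	int a = 0, b = 0;
	int *oindex = &a, *index = &b;
	struct ptr_guard g = { oindex, index };

	if (!PTR_GUARD_OK(&g, oindex, index))
		return 0;	/* nothing changed yet, so not taken */

	/* Stand-in for the corruption; value taken from the oops log. */
	index = (int *)(uintptr_t)0x22008424;

	/* A NULL test alone passes here, since the pointer is non-NULL. */
	assert(index != NULL);

	/* The snapshot comparison reports the change instead. */
	return !PTR_GUARD_OK(&g, oindex, index);
}
```

Calling demo() returns 1: the comparison flags the overwritten pointer
while the NULL test lets it through. In the kernel the same idea would
use printk() in place of fprintf(), and it only shows that the value
changed, not what changed it.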



-- 
The simplest is not all best but the best is surely the simplest!
