netdev
[Top] [All Lists]

Re: OOPS when force unloading sctp with CONFIG_DEBUG_SLAB enabled

To: rusty@xxxxxxxxxxxxxxx
Subject: Re: OOPS when force unloading sctp with CONFIG_DEBUG_SLAB enabled
From: Sridhar Samudrala <sri@xxxxxxxxxx>
Date: Thu, 18 Mar 2004 10:24:10 -0800 (PST)
Cc: linux-kernel@xxxxxxxxxxxxxxx, netdev@xxxxxxxxxxx
In-reply-to: <Pine.LNX.4.58.0403171008560.2014@xxxxxxxxxxxxxxxxxxxxx>
References: <Pine.LNX.4.58.0403171008560.2014@xxxxxxxxxxxxxxxxxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
I took a look at the changes to module unload code(sys_delete_module) after
2.6.3. One difference i noticed was that mod->waiter is being set to the
kthread that runs __try_stop_module() instead of the thread that calls
sys_delete_module().

The following patch fixed the oops.

diff -Nru a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c   Thu Mar 18 10:17:58 2004
+++ b/kernel/module.c   Thu Mar 18 10:17:58 2004
@@ -591,6 +591,10 @@
        /* Stop the machine so refcounts can't move and disable module. */
        ret = try_stop_module(mod, flags, &forced);

+       /* Mark it as dying. */
+       mod->waiter = current;
+       mod->state = MODULE_STATE_GOING;
+
        /* Never wait if forced. */
        if (!forced && module_refcount(mod) != 0)
                wait_for_zero_refcount(mod);

I am not sure if this is the right or the complete fix, but i think that
mod->waiter is definitely being set incorrectly in __try_stop_module().

Thanks
Sridhar

On Wed, 17 Mar 2004, Sridhar Samudrala wrote:

> I am getting the following oops when force unloading sctp (rmmod -f sctp) with
> 2.6.5-rc1 and 2.6.4.  This happens only when CONFIG_DEBUG_SLAB is enabled.
>
> This used to work fine until 2.6.3.
>
> looks like somehow a freed task struct is getting dereferenced in
> try_to_wake_up() and it causes oops when debug memory allocations is enabled.
> The call sequence seems to be
>       sys_delete_module
>       cleanup_module
>       sock_release
>       module_put
>       wake_up_process
>       try_to_wake_up
>
> Thanks
> Sridhar
>
> Mar 17 08:31:40 w-sridhar kernel: Unable to handle kernel paging request at 
> virtual address 6b6b6b7b
> Mar 17 08:31:40 w-sridhar kernel:  printing eip:
> Mar 17 08:31:40 w-sridhar kernel: c011ad39
> Mar 17 08:31:40 w-sridhar kernel: *pde = 00000000
> Mar 17 08:31:40 w-sridhar kernel: Oops: 0000 [#1]
> Mar 17 08:31:40 w-sridhar kernel: PREEMPT SMP
> Mar 17 08:31:40 w-sridhar kernel: CPU:    0
> Mar 17 08:31:40 w-sridhar kernel: EIP:    0060:[<c011ad39>]    Tainted: GF
> Mar 17 08:31:40 w-sridhar kernel: EFLAGS: 00010086   (2.6.5-rc1)
> Mar 17 08:31:40 w-sridhar kernel: EIP is at try_to_wake_up+0x29/0x310
> Mar 17 08:31:40 w-sridhar kernel: eax: 6b6b6b6b   ebx: c041cc80   ecx: 
> ccc66da4   edx: cdc258a0
> Mar 17 08:31:40 w-sridhar kernel: esi: cdc7c000   edi: c041cc80   ebp: 
> cdc7df18   esp: cdc7def4
> Mar 17 08:31:40 w-sridhar kernel: ds: 007b   es: 007b   ss: 0068
> Mar 17 08:31:40 w-sridhar kernel: Process rmmod (pid: 1636, 
> threadinfo=cdc7c000 task=cde2a140)
> Mar 17 08:31:40 w-sridhar kernel: Stack: d08e2300 cdc7df14 c02bde4c cfcd9304 
> 00000000 00000282 cdf1e5bc cdc7c000
> Mar 17 08:31:40 w-sridhar kernel:        d08e2300 cdc7df2c c011b03e cdc258a0 
> 00000007 00000000 cdc7df44 c02bac4a
> Mar 17 08:31:40 w-sridhar kernel:        cdf1e5bc c03843b8 d08e2300 00000a80 
> cdc7df54 d08d6f24 cdf1e5bc 00000a80
> Mar 17 08:31:40 w-sridhar kernel: Call Trace:
> Mar 17 08:31:40 w-sridhar kernel:  [<c02bde4c>] sk_free+0x6c/0xf0
> Mar 17 08:31:40 w-sridhar kernel:  [<c011b03e>] wake_up_process+0x1e/0x30
> Mar 17 08:31:40 w-sridhar kernel:  [<c02bac4a>] sock_release+0xea/0xf0
> Mar 17 08:31:40 w-sridhar kernel:  [<d08d6f24>] cleanup_module+0x24/0x1d5 
> [sctp]
> Mar 17 08:31:40 w-sridhar kernel:  [<c013dbd4>] sys_delete_module+0x174/0x1d0
> Mar 17 08:31:40 w-sridhar kernel:  [<c0157d18>] sys_munmap+0x58/0x80
> Mar 17 08:31:40 w-sridhar kernel:  [<c0107adf>] syscall_call+0x7/0xb
> Mar 17 08:31:40 w-sridhar kernel:
> Mar 17 08:31:40 w-sridhar kernel: Code: 8b 40 10 8b 14 85 20 f0 41 c0 ff 46 
> 14 01 d7 31 c0 86 07 84
> Mar 17 08:31:40 w-sridhar kernel:  <6>note: rmmod[1636] exited with 
> preempt_count 1
> Mar 17 08:31:40 w-sridhar kernel: Debug: sleeping function called from 
> invalid context at include/linux/rwsem.h:43
> Mar 17 08:31:40 w-sridhar kernel: in_atomic():1, irqs_disabled():0
> Mar 17 08:31:40 w-sridhar kernel: Call Trace:
> Mar 17 08:31:40 w-sridhar kernel:  [<c011f46b>] __might_sleep+0xab/0xd0
> Mar 17 08:31:40 w-sridhar kernel:  [<c0123b02>] profile_exit_task+0x22/0x60
> Mar 17 08:31:40 w-sridhar kernel:  [<c0125a9a>] do_exit+0x7a/0x610
> Mar 17 08:31:40 w-sridhar kernel:  [<c0108cd0>] do_divide_error+0x0/0xf0
> Mar 17 08:31:40 w-sridhar kernel:  [<c0119855>] do_page_fault+0x215/0x58a
> Mar 17 08:31:40 w-sridhar kernel:  [<c011a934>] recalc_task_prio+0xb4/0x1f0
> Mar 17 08:31:40 w-sridhar kernel:  [<c011c9f9>] schedule+0x3a9/0x7b0
> Mar 17 08:31:40 w-sridhar kernel:  [<c011bfe3>] scheduler_tick+0x43/0x6a0
> Mar 17 08:31:40 w-sridhar kernel:  [<c0119640>] do_page_fault+0x0/0x58a
> Mar 17 08:31:40 w-sridhar kernel:  [<c0108569>] error_code+0x2d/0x38
> Mar 17 08:31:40 w-sridhar kernel:  [<c02b007b>] atkbd_connect+0x19b/0x420
> Mar 17 08:31:40 w-sridhar kernel:  [<c011ad39>] try_to_wake_up+0x29/0x310
> Mar 17 08:31:40 w-sridhar kernel:  [<c02bde4c>] sk_free+0x6c/0xf0
> Mar 17 08:31:40 w-sridhar kernel:  [<c011b03e>] wake_up_process+0x1e/0x30
> Mar 17 08:31:40 w-sridhar kernel:  [<c02bac4a>] sock_release+0xea/0xf0
> Mar 17 08:31:40 w-sridhar kernel:  [<d08d6f24>] cleanup_module+0x24/0x1d5 
> [sctp]
> Mar 17 08:31:40 w-sridhar kernel:  [<c013dbd4>] sys_delete_module+0x174/0x1d0
> Mar 17 08:31:40 w-sridhar kernel:  [<c0157d18>] sys_munmap+0x58/0x80
> Mar 17 08:31:40 w-sridhar kernel:  [<c0107adf>] syscall_call+0x7/0xb
>
>
>

<Prev in Thread] Current Thread [Next in Thread>