Hi Dave, While running a test consisting of: insmod e100, ifup eth0, rmmod e100 on test9-bk9 bits, I got the following Oops: Nov 5 14:54:58 linux kernel: Unable to handle kernel paging request at virtu
About the patch below, I am still not clear about dev->reg_state (NETREG_UNREGISTERING) being used correctly, or how netdev_run_todo works correctly, so possibly the race may not be fixed. I am tryin
Hey, what if the dev refcount goes to zero before your dev_hold? Actually this repeated notifier looks like it wouldn't work anyway. Why would a protocol drop its reference when notified a second t
If the loop runs once or twice, that is not a bug; it is possible for processes to grab onto the device via rtnetlink queries and similar, and we have to pause and potentially schedule to deal with th
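To make the point about the wait loop concrete, here is a minimal userspace sketch, not kernel code. The names polled_dev, polled_holder_finishes and polled_wait_allrefs are hypothetical, and the "holder finishes" step is an assumption standing in for sleeping while other processes (rtnetlink queries and the like) drop their references.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a net device; NOT the kernel's struct net_device. */
struct polled_dev {
    int refcnt;   /* references still held by other code paths */
};

/* Assumption: models one transient holder finishing and dropping its reference. */
static void polled_holder_finishes(struct polled_dev *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/*
 * Poll until nobody else holds the device. Returns the number of passes
 * the loop needed; one or two passes is the benign case described above.
 */
static int polled_wait_allrefs(struct polled_dev *dev)
{
    int passes = 0;

    while (dev->refcnt != 0) {
        passes++;
        fprintf(stderr, "waiting for device to become free, refcnt = %d\n",
                dev->refcnt);
        /* In the real code this is where we would pause and reschedule. */
        polled_holder_finishes(dev);
    }
    return passes;
}
```

With two outstanding holders the loop runs twice and then falls through, which matches the "message, some delay, then rmmod works" behaviour reported in this thread.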
Actually, that is what I had written in my second mail. I have seen a message regarding ref count of 1, some delay and then the rmmod works fine. So it doesn't seem busted. - KK <shemminger@osdl.| l
Try this. Instead of dropping the last reference in unregister, it does it after all other references are gone (sort of like the old 2.4 code). diff -Nru a/net/core/dev.c b/net/core/dev.c -- a/net/co
Yes. I guess rtnetlink_rcv() calls netdev_run_todo() to handle that case. That was my original intention, but won't a driver that calls unregister_netdevice() followed by a free_netdev() still panic
So how will this guarantee that the dev is valid after the dev_put() long enough to do the BUG_ON() and dev->destructor code? Won't the same panic happen? Any idea how the dev gets freed up? I was
Because the code there should be able to depend on having the last reference. No other code should be able to find the dev to get a new reference to it, since it is no longer in the dev_list. Code th
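The ordering argued for here (unlink first so lookups fail, wait for other holders, then drop the last reference) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the dev.c implementation: every listed_* name is hypothetical, the fixed-size array stands in for dev_list, and the freed flag stands in for the destructor or free path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LISTED_MAX 4

/* Hypothetical stand-in for a net device; NOT the kernel's struct net_device. */
struct listed_dev {
    const char *name;
    int refcnt;
    int freed;    /* set where the real code would run the destructor */
};

/* Assumption: a tiny array plays the role of dev_list. */
static struct listed_dev *listed_devs[LISTED_MAX];

/* Publish the device; the list owns the initial reference. Returns 0 or -1. */
static int listed_register(struct listed_dev *dev)
{
    for (size_t i = 0; i < LISTED_MAX; i++) {
        if (listed_devs[i] == NULL) {
            dev->refcnt = 1;
            dev->freed = 0;
            listed_devs[i] = dev;
            return 0;
        }
    }
    return -1;
}

/* A lookup is the only way to gain a new reference, and it needs the list. */
static struct listed_dev *listed_get_by_name(const char *name)
{
    for (size_t i = 0; i < LISTED_MAX; i++) {
        struct listed_dev *dev = listed_devs[i];
        if (dev != NULL && strcmp(dev->name, name) == 0) {
            dev->refcnt++;
            return dev;
        }
    }
    return NULL;
}

static void listed_put(struct listed_dev *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/* Teardown step 1: unlink so no new references can appear; keep our own. */
static void listed_unlink(struct listed_dev *dev)
{
    for (size_t i = 0; i < LISTED_MAX; i++)
        if (listed_devs[i] == dev)
            listed_devs[i] = NULL;
}

/* Teardown step 2: only valid once every other holder has dropped out. */
static void listed_final_put(struct listed_dev *dev)
{
    assert(dev->refcnt == 1);   /* stand-in for the BUG_ON() discussed here */
    listed_put(dev);
    dev->freed = 1;
}
```

Once listed_unlink() has run, listed_get_by_name() returns NULL, so the count can only go down and the final put can rely on holding the last reference.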
Nope, I still get the panic with the change you suggested. We need to understand this better though we seem to be on the right track. I will try to get the stack now (couldn't get this time since I w
Actually, even the original code looks good, and if that panic'd, the following won't change that (verified it also). When unregister_netdev() is called by the driver, it first calls unregister_netde
That's correct. It could be some 'use after free' or similar issue. Just an idea of something else to look for. My earlier comments about "putting to zero multiple times" were misguided, I forgot th
I think I found the problem in the link event code. linkwatch_event() calls rtnl_unlock() when it gets an event (UNREGISTER) for the device going down. But this gets called before the device unregist