I think I found the problem in the link event code. linkwatch_event()
calls rtnl_unlock() when it gets an event (UNREGISTER) for the device
going down. But this happens before the device unregister is called
(via rmmod/pci_device_remove), and it frees up the dev. Later the device
unregister code panics the system.
I also noticed that this panic happens for e100 but not for the 3com
driver. 3com doesn't generate events for up/down using the linkwatch.
I tested with the following patch, and the panic disappeared (the
device shut down properly). Dave, any need for rtnl_unlock() in this
diff -ruN linux-2.6.0-test9-bk9/net/core/link_watch.c
--- linux-2.6.0-test9-bk9/net/core/link_watch.c 2003-11-06 12:26:30.000000000
+++ linux-2.6.0-test9-bk9.new/net/core/link_watch.c 2003-11-06
@@ -15,6 +15,7 @@
@@ -89,9 +90,11 @@
linkwatch_nextevent = jiffies + HZ;
On Thu, 6 Nov 2003, David S. Miller wrote:
> On Thu, 6 Nov 2003 11:58:24 -0800
> Krishna Kumar <kumarkr@xxxxxxxxxx> wrote:
> > When unregister_netdev() is called by the driver, it first calls
> > unregister_netdevice(), which drops its last ref to the dev, making it
> > zero. unregister_netdev() then calls rtnl_unlock(), which calls
> > netdev_run_todo(), which calls netdev_wait_allrefs(), and only after
> > that succeeds does the driver do a free_netdev(). So the dev should not
> > be freed while the wait_ref() is executing, and the original code looks
> > correct.
> That's correct.
> > I don't know if it is some corruption on my system, or some hardware
> > problem? I will look some more, and also try to get a different machine.
> It could be some 'use after free' or similar issue.
> Just an idea of something else to look for.
> My earlier comments about "putting to zero multiple times" were
> misguided, I forgot that these days dev_put() just decrements the
> count and does not do anything special when the count reaches zero.
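
For reference, the teardown order described in the quoted text can be
sketched as a small userspace model. This is only an illustration under
assumptions, not the kernel code: every name ending in _model is made up
here, the real netdev_wait_allrefs() sleeps and retries rather than
returning, and the model just shows that the free happens only after the
count has reached zero.

```c
/* Toy userspace model of the unregister/free ordering discussed above.
 * Every *_model name is hypothetical; none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>

struct netdev_model {
    int  refcnt;      /* outstanding references to the device */
    bool registered;  /* still visible to the rest of the stack */
    bool on_todo;     /* queued for the deferred teardown pass */
    bool freed;       /* stands in for free_netdev() having run */
};

/* As described in the thread, hold/put are plain counter updates:
 * nothing special happens when the count reaches zero. */
static void hold_model(struct netdev_model *d) { d->refcnt++; }
static void put_model(struct netdev_model *d)  { d->refcnt--; }

/* Stands in for unregister_netdevice(): unpublish the device, queue it
 * for the todo pass, and drop the registration reference. */
static void unregister_netdevice_model(struct netdev_model *d)
{
    d->registered = false;
    d->on_todo = true;
    put_model(d);
}

/* Stands in for the wait-for-all-refs step run from rtnl_unlock().
 * Returns true only once no references remain (the model reports
 * instead of sleeping). */
static bool run_todo_model(struct netdev_model *d)
{
    if (!d->on_todo || d->refcnt != 0)
        return false;
    d->on_todo = false;
    return true;
}

/* Stands in for unregister_netdev() followed by the driver's
 * free_netdev(): the free is reached only after the wait succeeds. */
static bool unregister_and_free_model(struct netdev_model *d)
{
    unregister_netdevice_model(d);
    if (!run_todo_model(d))
        return false;  /* a reference is still held, so do not free */
    d->freed = true;
    return true;
}
```

In this model a device with an extra hold_model() outstanding is never
marked freed until the matching put_model() has run, which is the
property the quoted explanation relies on.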