Patrick McHardy <kaber@xxxxxxxxx> wrote:
> There are several instances of request_module being called while
> holding the rtnl semaphore in net/sched. A practical problem with
> this is the teql scheduler which deadlocks when calling register_netdev
> from its init function. A more far-fetched problem would be some crazy
> person with their modules in a nfs-mounted directory on a server
> reachable over a dial-on-demand link. I couldn't come up with a
> solution except for refusing to autoload teql, maybe someone else has
> an idea.

There are a couple of causes for this problem:

1) Abuse of the rtnl. It's being used for too many things. It's
basically the networking system's BKL. If the locking were more
granular then this wouldn't occur.

2) Hooking random net/sched requests into rtnetlink. By being an
rtnetlink user you pay the price of taking the rtnl. Most of the
net/sched stuff has nothing to do with rtnetlink. You can tell
because they all live in AF_UNSPEC :)

Tackling either problem would lead to a solution to the deadlock.
However, neither is trivial to solve.
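
For anyone who wants to see the deadlock in isolation, here is a rough
user-space sketch of the call chain. Everything named fake_* is made up
for illustration and is not kernel code: a pthread mutex stands in for
the rtnl, and the module's init is called directly, whereas the real
request_module waits on a separate modprobe process. trylock is used so
that the sketch reports the problem instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Stand-in for the rtnl semaphore (assumption: plain, non-recursive). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Models register_netdev(), which needs the rtnl itself.
 * The real call would block here; trylock lets us report it instead. */
static int fake_register_netdev(void)
{
    if (pthread_mutex_trylock(&fake_rtnl) != 0)
        return -EDEADLK;        /* lock already held: would deadlock */
    /* ... device registration would happen here ... */
    pthread_mutex_unlock(&fake_rtnl);
    return 0;
}

/* Models a module init function that registers a device, like teql's. */
static int fake_teql_init(void)
{
    return fake_register_netdev();
}

/* Models request_module(): returns only once the module's init is done. */
static int fake_request_module(void)
{
    return fake_teql_init();
}

/* Models a net/sched rtnetlink handler autoloading with the rtnl held. */
static int fake_qdisc_create(void)
{
    int err;

    pthread_mutex_lock(&fake_rtnl);
    err = fake_request_module();    /* init wants the lock we hold */
    pthread_mutex_unlock(&fake_rtnl);
    return err;
}
```

Calling fake_qdisc_create() returns -EDEADLK, while calling
fake_register_netdev() on its own (lock not held) returns 0.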

On a related note, I'm working on making it easier to add new netlink
families, which could lead to a solution to (2).

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt