Re: request_module while holding rtnl semaphore

To: kaber@xxxxxxxxx (Patrick McHardy)
Subject: Re: request_module while holding rtnl semaphore
From: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Date: Sat, 06 Nov 2004 10:35:45 +1100
Cc: netdev@xxxxxxxxxxx
In-reply-to: <41899DCF.3050804@xxxxxxxxx>
Organization: Core
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686))
Patrick McHardy <kaber@xxxxxxxxx> wrote:
> There are several instances of request_module being called while
> holding the rtnl semaphore in net/sched. A practical problem with
> this is the teql scheduler which deadlocks when calling register_netdev
> from its init function. A more far-fetched problem would be some crazy
> person with their modules in an NFS-mounted directory on a server
> reachable over a dial-on-demand link. I couldn't come up with a
> solution except for refusing to autoload teql, maybe someone else has
> an idea.

There are a couple of causes for this problem:

1) Abuse of the rtnl.  It's being used for too many things.  It's
basically the networking system's BKL.  If the locking were more
granular, this wouldn't occur.

2) Hooking random net/sched requests into rtnetlink.  By being an
rtnetlink user you pay the price of taking the rtnl.  Most of the
net/sched stuff has nothing to do with rtnetlink.  You know it
because they all live in AF_UNSPEC :)

Tackling either problem would lead to a solution to the deadlock.
However, neither is trivial to solve.

On a related note, I'm working on making it easier to add new netlink
families, which could lead to a solution to 2).

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt