netdev

Re: modular net drivers

To: "netdev@xxxxxxxxxxx" <netdev@xxxxxxxxxxx>
Subject: Re: modular net drivers
From: Michael Richardson <mcr@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 26 Jun 2000 12:14:03 -0600
In-reply-to: Your message of "Wed, 21 Jun 2000 07:49:40 +1000." <4450.961537780@ocs3.ocs-net>
Sender: owner-netdev@xxxxxxxxxxx
>>>>> "Keith" == Keith Owens <kaos@xxxxxxxxxx> writes:
    Keith> On Tue, 20 Jun 2000 17:01:56 +1000, 
    Keith> Rusty Russell <rusty@xxxxxxxxxxxxxxxx> wrote:
    >> Keith Owens wrote:
    >>> It is also an important bug fix.  The module code has suffered from
    >>> unload races ever since the kernel locking became fine grained, users
    >>> can crash the kernel.
    >> 
    >> Races which can be largely solved at the moment by having the module
    >> page removal code sync all bh's and softirqs after calling cleanup().
    >> Hell, we could even poll all CPUs and check they're not executing in
    >> the about-to-be-freed pages.  Speed is completely unimportant here.

    Keith> This race is not obvious but IMHO it exists.  The original theory was

    Keith>   Kernel load and unload code runs under the big kernel lock.
    Keith>   open() and similar code runs under the big kernel lock.
    Keith>   If the code does MOD_INC_USE_COUNT before sleeping then we are safe.

    Keith> But consider this race, even on UP.

    Keith>   Module has been used, nothing is currently using it, use_count == 0.
    Keith>   rmmod runs, either manual or autoclean.
    Keith>   The module is marked as being deleted.
    Keith>   module_cleanup() is entered, does I/O, sleeps, loses big kernel
  
  The module_cleanup() is broken in that case.

  It should get all resources (i.e. locks) that it needs before doing
*anything*, and should release all resources as soon as it fails to get any
others. To do anything else is to cause a possible deadlock. This is textbook
multiprocessing.
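
  A minimal userspace sketch of that rule, not kernel code: try to take every
resource up front, and if any one is unavailable, drop whatever was already
taken and report failure so the caller can retry or bail out. The names
`struct resource`, `resource_trylock()`, `acquire_all_or_none()` and
`release_all()` are hypothetical, and a C11 `atomic_flag` stands in for
whatever lock primitive the real cleanup path would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource; `busy` acts as a try-lock bit. */
struct resource {
    atomic_flag busy;
};

/* Non-blocking acquire: returns true only if we took the resource. */
static bool resource_trylock(struct resource *r)
{
    /* test_and_set returns the previous value; false means it was free. */
    return !atomic_flag_test_and_set(&r->busy);
}

static void resource_unlock(struct resource *r)
{
    atomic_flag_clear(&r->busy);
}

/* Take every resource in res[0..n-1], or none of them. */
static bool acquire_all_or_none(struct resource **res, size_t n)
{
    assert(res != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!resource_trylock(res[i])) {
            /* Back off: release what we already hold, in reverse order. */
            while (i > 0)
                resource_unlock(res[--i]);
            return false;
        }
    }
    return true;
}

/* Release everything taken by a successful acquire_all_or_none(). */
static void release_all(struct resource **res, size_t n)
{
    while (n > 0)
        resource_unlock(res[--n]);
}
```

  The point of the back-off is that a failed attempt never leaves the caller
sleeping while it holds a partial set of resources.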

    Keith> AFAICT the only safe mechanism is one that checks the module state
    Keith> *before* entering the module.  Once you enter the module and sleep all
    Keith> bets are off.  And that means exporting the module information to the
    Keith> open() layer, which is what Al Viro has been doing.

  Is the "module is marked as being deleted" the info that is passed to
open()? "Module is deleted" is an atomic operation. Either it occurs because
use_count==0, and all further calls to the module don't find it, or it fails.
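
  One way to picture an atomic "delete only if use_count==0", together with a
state check made before entering the module, is the userspace sketch below. It
is an illustration under assumptions, not the kernel's module code and not Al
Viro's patch: `struct module_ref`, `module_try_get()`, `module_put()`,
`module_try_delete()` and the `MOD_DELETED` encoding are all hypothetical
names, and C11 atomics stand in for whatever locking the kernel would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical encoding: a negative count means "marked as deleted". */
#define MOD_DELETED (-1)

struct module_ref {
    atomic_int use_count;   /* >= 0: live, value is the number of users */
};

/* Called *before* entering the module, e.g. from the open() path.
 * Fails if the module has already been marked as deleted. */
static bool module_try_get(struct module_ref *m)
{
    int old = atomic_load(&m->use_count);

    while (old != MOD_DELETED) {
        /* On failure the CAS reloads `old`, so deletion is re-checked. */
        if (atomic_compare_exchange_weak(&m->use_count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference taken by a successful module_try_get(). */
static void module_put(struct module_ref *m)
{
    atomic_fetch_sub(&m->use_count, 1);
}

/* Atomic delete: succeeds only if use_count is exactly 0 at that instant. */
static bool module_try_delete(struct module_ref *m)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&m->use_count, &expected,
                                          MOD_DELETED);
}
```

  Because both the increment and the delete are compare-and-swap operations on
the same word, a caller either gets its reference before the delete (and the
delete fails) or sees the deleted state and never enters the module.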

]      Out and about in Ottawa.    hmmm... beer.                |  firewalls  [
]   Michael Richardson, Sandelman Software Works, Ottawa, ON    |net architect[
] mcr@xxxxxxxxxxxxxxxxxxxxxx http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another NetBSD/notebook using, kernel hacking, security guy");  [



