Re: modular net drivers, take 2

To: Keith Owens <kaos@xxxxxxxxxx>
Subject: Re: modular net drivers, take 2
From: Andrew Morton <andrewm@xxxxxxxxxx>
Date: Wed, 21 Jun 2000 02:37:58 +0000
Cc: "netdev@xxxxxxxxxxx" <netdev@xxxxxxxxxxx>
References: Your message of "Wed, 21 Jun 2000 01:18:43 GMT." <395017F3.516165AD@uow.edu.au> <6062.961552095@kao2.melbourne.sgi.com>
Sender: owner-netdev@xxxxxxxxxxx
Keith Owens wrote:
> 
> Change every module that registers anything to make sure that they
> replace the register data with stubs on exit?  And make sure that all
> of them do so before they sleep anywhere in module cleanup?  It would
> work but is it the best solution?

Not when you put it that way.

> The existing method of avoiding module races is beginning to look like
> a dead dog.  Look at the constraints we have to run under :-
> 
> * All code that can ever call any module functions must either have a
>   reference count on that module or must run under the same lock as the
>   module unload (big kernel lock).

Yes.

> * Every module must be checked to see that it never sleeps before doing
>   MOD_INC_USE_COUNT.
> * Every module must be checked to see that it never sleeps after doing
>   MOD_DEC_USE_COUNT.

These two can be avoided by hoisting the inc/dec up into the netdevice
layer.  But we need to wrap dev->get_stats(), dev->ioctl(), etc. with
inc/dec as well.
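
Roughly what I have in mind, as a userspace sketch only. The struct
and function names here (module_sketch, net_device_sketch,
netdev_get_stats, demo_get_stats) are made up for illustration and are
not the real kernel definitions. It is single-threaded, so it shows the
wrapping but says nothing about making the increment itself safe
against a concurrent unload:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a module; not the real struct module. */
struct module_sketch {
    int use_count;  /* what MOD_INC_USE_COUNT / MOD_DEC_USE_COUNT would adjust */
};

/* Hypothetical stand-in for a network device owned by a module. */
struct net_device_sketch {
    struct module_sketch *owner;
    int (*get_stats)(struct net_device_sketch *dev);
};

static void mod_get(struct module_sketch *m)
{
    if (m)
        m->use_count++;
}

static void mod_put(struct module_sketch *m)
{
    if (m) {
        assert(m->use_count > 0);
        m->use_count--;
    }
}

/*
 * Core-layer wrapper: the reference is taken and dropped here, so the
 * driver method never touches its own use count and cannot sleep on
 * the wrong side of the inc/dec.
 */
int netdev_get_stats(struct net_device_sketch *dev)
{
    int ret;

    mod_get(dev->owner);
    ret = dev->get_stats(dev);
    mod_put(dev->owner);
    return ret;
}

/* Example driver method: reports the use count it sees while running. */
int demo_get_stats(struct net_device_sketch *dev)
{
    return dev->owner->use_count;
}
```

The same wrapper shape would be needed around every other driver entry
point, which is the cost of this approach.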

> * Every module that registers anything must change the registered
>   functions in module cleanup and must do so before sleeping (new).
> 
> That is an awful lot of opportunities to make mistakes.  And forcing
> lots of code to run under the big kernel lock does not scale well.

I am now remembering Alexey's disparaging comments about "self-modifying
code". 

It sucks and I'm still seeking a single, centralised fix. How about plan
J (warning: inelegance approaching):

Module unload is a very rare occurrence, so let's penalise that and that
alone.

We grab the ENTIRE machine within sys_delete_module. 

Like, grab the big kernel lock, then wait until ALL other CPUs are
spinning on the kernel lock, and then allow sys_delete_module to
proceed.
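
The condition to wait for can be modelled like this. It is a
deterministic userspace sketch, not kernel code: the CPU state enum and
both function names are assumptions for illustration, and a real
implementation would need some way to observe that the other CPUs
really are spinning on the lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU state as seen by the unloading CPU. */
enum cpu_state_sketch {
    CPU_RUNNING,          /* could still be executing module code */
    CPU_SPINNING_ON_BKL   /* parked, waiting for the big kernel lock */
};

/* Returns 1 when every CPU except `self` is spinning on the kernel lock. */
int all_others_spinning(const enum cpu_state_sketch *cpus, size_t ncpus,
                        size_t self)
{
    size_t i;

    assert(self < ncpus);
    for (i = 0; i < ncpus; i++) {
        if (i != self && cpus[i] != CPU_SPINNING_ON_BKL)
            return 0;
    }
    return 1;
}

/*
 * Hypothetical unload step, called with the big lock already held by
 * `self`.  Returns 0 if the unload went ahead, -1 if the caller must
 * keep waiting because some other CPU is still running.
 */
int delete_module_sketch(const enum cpu_state_sketch *cpus, size_t ncpus,
                         size_t self, int *module_loaded)
{
    if (!all_others_spinning(cpus, ncpus, self))
        return -1;
    *module_loaded = 0;
    return 0;
}
```

All the cost lands on the unload path; nothing in the drivers or in the
normal call paths has to change.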
