netdev
[Top] [All Lists]

Re: tx_timeout and timer serialisation

To: Donald Becker <becker@xxxxxxxxx>
Subject: Re: tx_timeout and timer serialisation
From: Andrew Morton <andrewm@xxxxxxxxxx>
Date: Fri, 19 May 2000 01:39:21 +1000
Cc: netdev@xxxxxxxxxxx
References: <3923F8CD.AECBDA6D@xxxxxxxxxx> <Pine.LNX.4.10.10005181020000.1658-100000@xxxxxxxxxxxxx>
Sender: owner-netdev@xxxxxxxxxxx
Donald Becker wrote:
> 
> On Fri, 19 May 2000, Andrew Morton wrote:
> > kuznet@xxxxxxxxxxxxx wrote:
> 
> > > Timers are self-destructable as rule. See? Normal usage
> > > for timer is to have it allocated inside an object and
> > > timer event detroys the object together with timer.
> 
> I'm curious -- what code does this?

me too.

> > Still, my immediate concern is this:
> >
> > I'll be spending the next <however long it takes> working through the
> > old net drivers.  One very common theme/bug in these is the pattern:
> >
> > xxx_close()
> > {
> >       ...
> >       del_timer();
> >       release(some_resources);
> >       ...
> > }
> >
> > xxx_timer()
> > {
> >       use(some_resources);
> > }
> 
> I don't see the semantic problem here.

Well, the timer handler could be executing on another CPU _while_ the
timer is being deleted.  So when del_timer() returns, the handler is
still executing!  So in this example, CPU0 could execute
use(some_resources) AFTER CPU1 has done release(some_resources).

del_timer_sync() is a 2.3 function which spins until the handler tells
the world that it has finished (by clearing timer->running).

> This was the recommended way to use the timer routines.  If the semantics
> have changed, there should be new names for the changed semantics.

I agree.  I assume what happened was that when the timer code went SMP,
the synchronous semantics of del_timer() became asynchronous but the
name was unchanged.

I believe what _should_ have been done when converting this to SMP was
to

- manage the "handler is running" flag _outside_ the handler
- preserve del_timer()'s synchronous semantics (via the spin)
- introduce a new function del_timer_async() which has the
  handler-can-still-be-running semantics.

But this is a historical guess.

I'm not sure if 2.2 is clean in this regard.  It doesn't have
del_timer_sync().  Spent ten minutes peering at 2.2, and I can't see
why, if some random piece of kernel code does a del_timer() in IRQ or
process context, concurrent execution of the post-del_timer() code and
the handler will not occur.

> It would be useful to distinguish between "bugs" and "interfaces changes
> that have made the following no longer correct since version X.Y.Z".

If you come into the game late enough, these are synonymous :)

But you're right - it is the latter.



-- 
-akpm-

<Prev in Thread] Current Thread [Next in Thread>