netdev
[Top] [All Lists]

Re: 3c59x.c

To: Andrew Morton <andrewm@xxxxxxxxxx>
Subject: Re: 3c59x.c
From: Donald Becker <becker@xxxxxxxxx>
Date: Sun, 26 Mar 2000 14:20:02 -0500 (EST)
Cc: netdev <netdev@xxxxxxxxxxx>
In-reply-to: <38DE25E0.EAF22B3@uow.edu.au>
Sender: owner-netdev@xxxxxxxxxxx
On Sun, 26 Mar 2000, Andrew Morton wrote:

> I'd like to work this through a little more then summarise the outcome
> as a plan of what (if anything) needs to be done.  Once the intended
> changes are well reviewed someone (doesn't really matter who) can cut a
> patch.  Sound OK?

You should understand my stance on this: I wrote and maintained many drivers
for seven years.  In the past seven months significant, poorly considered
changes were made to the driver interface.  For most of those changes I
wasn't consulted, or even given warning about them.  The only notice was
mail to the linux-kernel (not even to linux-net) list that the interface had
already changed.

Some of the interface changes were obviously first passes, and had never
really been tested.  They were quickly fixed, but it was clear that they
were not thought through before being implemented.  Interface changes are
major undertakings, and they should have been treated as such.  Instead the
attitude was "Here are changes.  Fix your drivers to work with them."

> > You should bring this up on the linux-vortex mailing list.
> > If you do not read the linux-vortex mailing list, you should not even think
> > about modifying the driver until you understand more about it.
> 
> Thanks - I'll check it out.

It should have been the first place to check.

> > > - why do we independently kmalloc dev->priv?  Why not let
> > >   init_etherdev() do it?
> > 
> > Because of alignment requirements.  Some descriptors must be longword
..
> init_etherdev() aligns dev->priv on a 32 byte boundary (this was
> recently fixed up).  Is this not suitable?

The proper statement was "this was recently fixed up, after yet another
person broke it".  Counting on init_etherdev() to provide a correctly
aligned, properly allocated structure will continue to be a bad assumption.

> > > - Are there MOD_INC/DEC_USE_COUNT races?  What are the rules here?
> > 
> > The locking for this is handled outside the drivers, as it should be.
> 
> Sorry, this was a comment to myself which snuck through.  I am told that
> there are insmod/rmmod races in various drivers but I've never seen a
> description of the sorts of things to look out for.

Yes, there are races in some drivers.  But I don't believe there are any
here.

The driver interface should be designed so that locking is unnecessary in
most common cases.  Driver writer can always introduce their own races, of
course.

> > > - vortex_start_xmit() is racy.  It is not protected from h/w
> > >   interrupts.  It needs a spin_lock_irqsave(priv->lock) and
> > >   vortex_interrupt() needs a spin_lock(priv->lock);
> > 
> > The vortex_start_xmit() is for the 3c590 series only, not the later 
> > hardware.
> > The hardware only needs to be protected against simultaneous transmit
> > attempts, not a receive that happens in the middle of transferring a packet
> > to the card's (large!) FIFO.
> 
> Are you sure about this?  That a Tx complete interrupt (or a
> vortex_interrupt() call from vortex_tx_timeout()) cannot occur when
> vortex_start_xmit() is executing? 

The Tx-packet-load register activity is independent of the interrupt
handling.  Although that should be verified with the current SMP setup.

You should be able to ignore the timeout.  The semantics of watchdog
timeouts are that they are never called during a transmit attempt.

Hmmm, actually, that *should* be reviewed.  I can't find anyplace that the
new semantics in pre-2.4 have been defined, so there may well be a new race
condition here.  If there isn't one currently, it could easily grow one.
It would bad to add the overhead of a spin lock during normal operation just
to handle this rare error case.

> > Putting a spin lock there will waste a *lot* of spin time, since the old
> > 3c590 isn't especially fast, especially if we are using PIO mode.
> 
> I assume the wastage you are talking about it stalling the Rx interrupt
> while the PIO Tx is in progress?
..
> > These macros are only OK if they are trivial locking functions.  When they
> > are more complex functions every card without a large Tx queue will suffer
> > badly.
> 
> They're quite lightweight. Probably < 100 insns for the worst case.

Acckkk!!  100 instructions is *not* light weight!  This is the normal
run-time situation, not an error recovery path.

> > I've had pci-skeleton.c for a long time, but it apparently wasn't worthy.
> > It is now incorrect for 2.3.*.
> 
> That went over my head...
> 
> I find skeleton.c very useful.  I'd be interested in pci-skeleton.c.  Is
> it available somewhere?

The version in
  http://cesdis.gsfc.nasa.gov/linux/drivers/kern-2.3/index.html
  ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/kern-2.3/pci-skeleton.c
is the latest of mine, but keep in mind that it uses my PCI scan structure
and backwards compatibility code that will not be accepted into the kernel.

> > At least parts of the ancient skeleton.c have been updated with the new
> > interface.
> 
> Parts, yes.  It's missing the align-IP-header-on-longword fiddle as well
> as using the global cli().  There could be other things.  I'll have a
> closer look.

It was designed as an ISA skeleton.
Some patches did not update it when people made interface changes, so it
doesn't represent an accurate example.

The pci-skeleton.c code does compile, and my drivers attempt to closely
match the its structure and semantics.  Well, moderated by the desire to
minimize the code changes from previous versions.

> You didn't address some of my observations - I think they were pretty
> straightforward things like checking returns from resource allocation
> attempts.  I'll included them when I revisit this later in the week.

Previously GFP_KERNEL meant "wait until you have the memory", so you didn't
have to check all allocations.  I believe that has changed in pre-2.4 (and
other GPLed OSes that use the drivers, such L4 and OStk, don't make that
assertion) so checking the allocations is now needed.  Recovering from
a failed allocation *will* add noticable code to attach() and open().

The descriptor-based drivers should all already handle failed skbuff
allocations, although they have never been tested when you cannot allocate
any skbuffs at all during initialization.

Donald Becker
Scyld Computing Corporation, becker@xxxxxxxxx



<Prev in Thread] Current Thread [Next in Thread>