On Thu, 30 Mar 2000 kuznet@xxxxxxxxxxxxx wrote:
> > - Why do some drivers statically allocate their struct net_device
> > (eepro, cs89x0) whereas others call init_etherdev()? What's the right
> > thing to do?
> By historical reasons. This mess was supposed to be rectified
> some time ago in 2.3 (when file drivers/net/setup.c was created),
> but the cleanup was not completed, unfortunately.
No, this is unrelated. The specific problem was a crude coversion of
drivers to use modules, but the conversion only supported a single card via
a single static structure.
I didn't use that module conversion, and instead added module support
independently. Thus drivers maintained "in-kernel" kept the cruft, while my
versions never had it.
> I think right way is not to use static net_device structures at all.
> But it can (and do) result in problems with passing kernel boot options
> to driver, which are unsolved until now.
> To be honest, I do not know what way is correct. This three-fold way
> was just shit, new one is apparently cleaner, but it is incomplete,
> at least passing options via kernel command line does not work now,
> which is bad. I am not an expert here, alas.
The reason Space.c still exists was the big problem when I tried to remove
it. See net_init.c, which was supposed to take over the functionality of
Space.c. But using it exclusively broke the documentation, which advised
people to directly edit Space.c (LILO parameters and modules were not fully
supported back then). So Space.c had to remain.
I assumed that Space.c would be around only temporarily, but people then
people continue to add ever more elaborate initialization cruft to Space.c.
[[ It's like a shed wiht a flawed foundation that is still being used. The
plan is for it to be torn down later when everything is moved to a new
building, but you come back later to find a twenty story apartment
building with everyone blaming you for the foundation. ]]
We can now get rid of Space.c, since no current documentation advises people
to modify it. But it can and should be done in a way that does not break
LILO-passed parameters, which are still required for some configurations.
> > - The manipulation of the pci_root_buses and pci_devices lists in pci.c
> > has no SMP or IRQ protection. The net drivers call into pci.c to
> > add/remove things from these lists but provide no race avoidance. Is
> > there something higher up which guarantees that these list operations
> > are serialised wrt some random soundcard driver or is this a bug?
> I do not know, I am not an expert here too. Seems, all this code is supposed
> to be executed only under kernel lock and never executed from an
> interrupt context. Then it should be valid.
Do you mean under a global whole-kernel lock?
> > - As far as I can tell, many Ethernet chips separate the Tx and Rx
> > functions sufficiently well for the driver author to be able to let the
> > Tx and Rx threads operate independently. And the orginal architecture
> > allowed this to occur.
> > But the recommended practice of locking the whole driver within
> > hard_start_xmit() will penalise the Rx threads.
> This practice was not recommended, certainly!
Not recommended? But look at what the only guide, the driver conversions,
have as a de facto recommendation -- a horrible spinning waste of time, the
cost of which is hidden because it's difficult to measure.
> If you mean "controller" lock, it is TX lock as rule.
The queue logic should default to seeing that driver Tx queue routine is
serialized. This is a cleaner interface, results in less wasted time, and
should almost never be a SMP performance hit. There are few (no?) drivers
that do substantial-but-parallelizable work when queuing a packet for
The existing practice makes it appear that the driver must always protect
itself against simultaneous transmission attempts from multiple processors.
> RX locking is made automatically by IRQ serialization logic,
> so that the question occurs only when driver needs to make something
> serialized with RX interrupts itself. No help from top level
> is possible here, it is an internal driver problem and driver
> have to create its private locks, if it is necessary,
> or, alternatively, to disable device irqs, while doing dangerous job.
> > If this is correct, should we be using separate rx and tx device-private
> > locks?
> If driver needs them.
Arrgggg. It's hard to tell what is an SMP bug and what is expected behavior
that the driver must protect itself against.
For instance, there was a bug in (IIRC) early 2.2 kernels where an interrupt
handler would be re-entered on SMP systems. It took a long time to track
this down. The reaction I got (which was later corrected) was that it was
perfectly reasonable for two processors to be running the interrupt handler
simultaneously, and that driver should add locks to make certain that this
worked. Yes, that could be made to work, but it would add horrible locking
overhead for the very rare case where more than one processor could do work.
> > BTW: a while back I noticed that the driver's probe1() is being called a
> > huge number of times at bootup from net/dev.c. Is this known about?
> No... Could you give more information? For what device does this occur?
This is a buglet where the driver probes are called a zillion times, every
probe called for each entry in Space.c, rather than calling the PCI (and
other non-ISA) probes just once in an initial phase. This behavior is
always wrong, even for the ISA drivers which should only be called multiple
time if a card is found.
I took special note of this because the bug was blamed on my drivers, as
part of the general PCI scan flame, rather than the flawed change to
Space.c which enabled the eth1..ethN entries.
Scyld Computing Corporation, becker@xxxxxxxxx