

To: "netdev@xxxxxxxxxxx" <netdev@xxxxxxxxxxx>
Subject: Tx queueing
From: Andrew Morton <andrewm@xxxxxxxxxx>
Date: Fri, 19 May 2000 01:10:12 +1000
Sender: owner-netdev@xxxxxxxxxxx
A number of drivers do this:

start_xmit()
{
        netif_stop_queue()
        ...
        if (room for another packet)
                netif_wake_queue()
        ...
}

I suspect this is a simple port from the dev->tbusy days.

It would seem to be more sensible to do

start_xmit()
{
        ...
        if (!room for another packet)
                netif_stop_queue()
}

but the functional difference here is that we are no longer scheduling
another BH run, so if there are additional packets queued "up there"
then their presentation to the driver will be delayed until **this CPU**
makes another BH run.  
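
For concreteness, here's roughly what the second pattern looks like
fleshed out.  This is only a sketch: TX_RING_SIZE, struct my_priv,
my_start_xmit() and the ring bookkeeping are made-up names for an
imaginary driver, not taken from any real one.

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/spinlock.h>

#define TX_RING_SIZE 16

struct my_priv {
        unsigned int cur_tx;            /* next ring entry to use */
        unsigned int dirty_tx;          /* oldest entry not yet reclaimed */
        struct sk_buff *tx_skbuff[TX_RING_SIZE];
        spinlock_t lock;
};

static int my_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
        struct my_priv *np = (struct my_priv *)dev->priv;
        unsigned long flags;
        unsigned int entry;

        spin_lock_irqsave(&np->lock, flags);

        entry = np->cur_tx % TX_RING_SIZE;
        np->tx_skbuff[entry] = skb;
        /* ... hand the skb to the hardware here ... */
        np->cur_tx++;

        /* Stop the queue only when the ring really is full, rather
         * than doing a stop/wake pair for every packet. */
        if (np->cur_tx - np->dirty_tx >= TX_RING_SIZE)
                netif_stop_queue(dev);

        spin_unlock_irqrestore(&np->lock, flags);
        return 0;
}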

For devices which have a Tx packet ring or decent FIFO I don't expect
this to be a problem, because the Tx ISR will call netif_wake_queue()
and the subsequent BH run will keep stuffing packets into the Tx ring
until it's full.  But for devices which have very limited Tx buffering
there may be a lost opportunity to refill the Tx buffer earlier.  Seems
unlikely to me.
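
Continuing the same imaginary driver, the Tx-completion side is where
the wakeup (and hence the next BH run) comes from.  Again, my_tx_done()
is just a sketch:

static void my_tx_done(struct net_device *dev)
{
        struct my_priv *np = (struct my_priv *)dev->priv;

        spin_lock(&np->lock);
        while (np->dirty_tx != np->cur_tx) {
                unsigned int entry = np->dirty_tx % TX_RING_SIZE;

                /* ... break out if the hardware still owns this
                 * descriptor ... */

                dev_kfree_skb_irq(np->tx_skbuff[entry]);
                np->tx_skbuff[entry] = NULL;
                np->dirty_tx++;
        }

        /* Re-enable the queue and schedule another BH run, so the
         * subsequent softnet run keeps stuffing packets into the
         * ring until it's full again. */
        if (netif_queue_stopped(dev) &&
            np->cur_tx - np->dirty_tx < TX_RING_SIZE)
                netif_wake_queue(dev);
        spin_unlock(&np->lock);
}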

Do we see a problem with the above approach?  Or is the benefit so small
that it's not worth bothering about?


Incidentally, Alexey: you should change

        int cpu = smp_processor_id();
to

        const int cpu = smp_processor_id();

This allows GCC to generate _much_ better code on UP.  It only saves
50-60 insns in dev.c, but it's free...
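
For illustration only (queue_for_this_cpu() is a made-up function
modelled on netif_rx(), not the actual dev.c code; same headers as the
sketch above plus <linux/smp.h>): on UP, smp_processor_id() is just the
constant 0, so the const lets gcc fold the per-CPU indexing away.

static void queue_for_this_cpu(struct sk_buff *skb)
{
        const int cpu = smp_processor_id();     /* was: int cpu = ... */

        /* Caller is assumed to have interrupts disabled around this,
         * as netif_rx() does. */
        __skb_queue_tail(&softnet_data[cpu].input_pkt_queue, skb);
}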




Also, I'm still attracted to the idea of dequeueing packets within the
driver (the 'pull' model) rather than stuffing them in via
qdisc_restart() and the BH callback.  A while back Don said:

> The BSD stack uses the scheme of dequeuing packets in the ISR.  This was a
> good design in the VAX days, and with primitive hardware that handled only
> single packets.  But it has horrible cache behavior, needs an extra lock,
> and can result in the interrupt service routine running a very long time,
> blocking interrupts.

I never understood the point about cache behaviour.  Perhaps he was
referring to the benefit which a sequence of short loops has over a
single, long loop?  And nowadays we only block interrupts for this
device (or things on this device's IRQ?).
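
Very roughly, the 'pull' version of the Tx-done path would be something
like the following (same imaginary driver as above; struct Qdisc comes
from <net/pkt_sched.h>).  Note this is exactly where Don's "extra lock"
point bites: dev->queue_lock is currently taken with only BH disabled,
so it would have to be made IRQ-safe everywhere before an interrupt
handler could take it like this.

static void my_tx_done_pull(struct net_device *dev)
{
        struct my_priv *np = (struct my_priv *)dev->priv;
        struct sk_buff *skb;

        /* ... reclaim completed descriptors as in my_tx_done() ... */

        spin_lock(&dev->queue_lock);
        while (np->cur_tx - np->dirty_tx < TX_RING_SIZE &&
               (skb = dev->qdisc->dequeue(dev->qdisc)) != NULL) {
                unsigned int entry = np->cur_tx % TX_RING_SIZE;

                np->tx_skbuff[entry] = skb;
                /* ... hand the skb to the hardware ... */
                np->cur_tx++;
        }
        spin_unlock(&dev->queue_lock);
}

And the long-running-ISR concern is simply that loop: it keeps going
until the ring is full or the qdisc is empty, all from the interrupt
handler.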

One advantage of the 'pull' model is with CPU/NIC bonding.
AFAIK, the only way at present of bonding a NIC to a CPU is via the
IRQ.  This is fine for the ISR and the BH callback, but at present the
direct userland->socket->qdisc->driver path will be executed on a random
CPU. Moving some of this into the ISR will make bonding more effective.

Or teach qdisc_restart() to simply queue packets and rely on the
CPU-specific softnet callback to do the transmit.  Probably doesn't make
much diff.
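
i.e. something like this: queue and kick the TX softirq instead of
transmitting inline.  queue_only_xmit() is only a sketch of the idea,
not what dev_queue_xmit()/qdisc_restart() actually do today:

static int queue_only_xmit(struct sk_buff *skb, struct net_device *dev)
{
        struct Qdisc *q;
        int ret;

        spin_lock_bh(&dev->queue_lock);
        q = dev->qdisc;                 /* assumes a real qdisc is attached */
        ret = q->enqueue(skb, q);       /* just queue the packet... */
        netif_schedule(dev);            /* ...and let the softnet TX
                                         * softirq do the transmit */
        spin_unlock_bh(&dev->queue_lock);
        return ret;
}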

Of course, all this is simply noise without benchmarks...

Has anyone done any serious work with NIC/CPU bonding?


-- 
-akpm-
