Re: e100 "Ferguson" release

To: Jeff Garzik <jgarzik@xxxxxxxxx>
Subject: Re: e100 "Ferguson" release
From: Ben Greear <greearb@xxxxxxxxxxxxxxx>
Date: Sun, 03 Aug 2003 00:32:01 -0700
Cc: "Feldman, Scott" <scott.feldman@xxxxxxxxx>, netdev@xxxxxxxxxxx
In-reply-to: <3F2CA65F.8060105@xxxxxxxxx>
Organization: Candela Technologies
References: <C6F5CF431189FA4CBAEC9E7DD5441E010222927D@xxxxxxxxxxxxxxxxxxxxxx> <3F2CA65F.8060105@xxxxxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5a) Gecko/20030718
Jeff Garzik wrote:

* (API) Does the out-of-tx-resources condition in e100_xmit_frame ever really happen? I am under the impression that returning non-zero in ->hard_start_xmit results in the packet sometimes being requeued and sometimes dropped. I prefer to guarantee a more-steady state, by simply dropping the packet unconditionally, when this uncommon condition occurs. So, I would
a) mark the failure condition with unlikely(), and
b) if the condition occurs, simply drop the packet (tx_dropped++, kfree skb), and return zero.

Though, ultimately, I wish the net stack would support some way to _guarantee_ that the skb is requeued for transmit. Some packet schedulers in the kernel will drop the skb even if the ->hard_start_xmit return code indicates "requeue". This makes sense from the rule of "skbs are lossy, and can be dropped"... but it really sucks on hardware where unexpected -- but temporary -- loss of TX resources occurs. One can prevent 20-50% (or more) packet loss on certain classes of connections, simply by being able to tell the net stack "hey, if I could go back in time and issue a netif_stop_queue, before you called ->hard_start_xmit, I would" :)

Although I have not tried this latest patch, with the existing e100 and e1000
drivers in 2.4.21, netif_queue_stopped(odev) seldom seems to return true,
even when the next hard_start_xmit() call fails.  For instance, this is the
code I use in pktgen.c:

                        if (!netif_queue_stopped(odev)) {
                                if (odev->hard_start_xmit(next->skb, odev)) {
                                        if (net_ratelimit()) {
                                                printk(KERN_INFO "Hard xmit 
                                        next->last_ok = 0;
                                else {
                                        queue_stopped = 0;
                                        next->last_ok = 1;
                                        next->tx_bytes += (next->cur_pkt_size + 4); /* count csum */

With e100 and e1000, I see very large numbers of hard_start_xmit failures
when running at very high packets-per-second rates (small packets).
I see virtually no failures with tulip.  pktgen knows how to re-queue, but it's
curious it has to so often.  For code that does not requeue, this could be even
more of a bummer.
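
To make the pattern concrete, here is a minimal userspace sketch (not kernel
code) of a caller that checks the stopped flag and then calls xmit.  All of the
toy_* names, the ring depth, and the drain rate are hypothetical and only model
the behaviour described above; they are not taken from e100, e1000, tulip, or
pktgen.  The stops_queue flag switches between a driver that stops the queue
when its ring fills and one that never does.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RING_SIZE 4            /* hypothetical TX ring depth */

/* Toy stand-in for a net device; NOT the kernel's struct net_device. */
struct toy_dev {
    int  ring_used;                /* descriptors currently in flight */
    bool queue_stopped;            /* what netif_queue_stopped() would report */
    bool stops_queue;              /* does this driver stop the queue when full? */
};

struct toy_stats {
    int sent;                      /* packets accepted by the driver */
    int xmit_failures;             /* non-zero returns from xmit */
    int skipped_stopped;           /* passes skipped because queue was stopped */
};

/* Models ->hard_start_xmit: 0 = accepted, non-zero = out of TX resources. */
static int toy_hard_start_xmit(struct toy_dev *dev)
{
    if (dev->ring_used >= TOY_RING_SIZE)
        return 1;
    dev->ring_used++;
    if (dev->stops_queue && dev->ring_used == TOY_RING_SIZE)
        dev->queue_stopped = true; /* tell the caller before the next attempt */
    return 0;
}

/* Models a TX-complete event: one frame leaves the ring, queue is woken. */
static void toy_tx_complete(struct toy_dev *dev)
{
    if (dev->ring_used > 0) {
        dev->ring_used--;
        dev->queue_stopped = false;
    }
}

/*
 * Caller loop in the spirit of the excerpt above: check the stopped flag,
 * try to transmit, and keep the packet for a later pass if xmit fails.
 * The hardware is assumed to complete one frame every `drain_every` passes.
 */
static struct toy_stats toy_send_all(struct toy_dev *dev, int npkts,
                                     int drain_every)
{
    struct toy_stats st = {0, 0, 0};
    int attempts = 0;

    assert(drain_every > 0);
    while (st.sent < npkts) {
        attempts++;
        if (attempts % drain_every == 0)
            toy_tx_complete(dev);
        if (dev->queue_stopped) {
            st.skipped_stopped++;
            continue;
        }
        if (toy_hard_start_xmit(dev))
            st.xmit_failures++;    /* packet retained, retried next pass */
        else
            st.sent++;
    }
    return st;
}
```

In this model a driver with stops_queue set never fails an xmit call, because
the caller backs off first; with it clear, every pass against a full ring is a
failure that the caller has to absorb by requeueing.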

To point b), I think if the driver accepts the packet in hard_start_xmit, it
should be able to send the packet out; otherwise it should return the 'requeue'
value and let the calling code know.  It is very important to me, at least, to
know whether a packet has really been sent or not.
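
A rough userspace sketch of why this matters to the caller, comparing the two
policies for a full ring.  The sim_* names and fields are hypothetical and
purely illustrative; a real driver works on skbs and net_device statistics.
The caller here treats a zero return as "sent", which is what the pktgen
excerpt above does.

```c
/* Two ways a driver could react when it has no free TX descriptors. */
enum sim_full_policy {
    SIM_DROP_AND_RETURN_ZERO,      /* proposal b) in the quoted text */
    SIM_RETURN_REQUEUE             /* behaviour argued for in this reply */
};

/* Toy device state; NOT the kernel's struct net_device. */
struct sim_dev {
    int ring_free;                 /* free TX descriptors */
    int tx_packets;                /* frames actually handed to hardware */
    int tx_dropped;                /* frames discarded by the driver */
    enum sim_full_policy policy;
};

/* 0 = driver took ownership of the packet; non-zero = caller keeps it. */
static int sim_hard_start_xmit(struct sim_dev *dev)
{
    if (dev->ring_free == 0) {
        if (dev->policy == SIM_DROP_AND_RETURN_ZERO) {
            dev->tx_dropped++;     /* a real driver would also free the skb */
            return 0;
        }
        return 1;                  /* ask the caller to requeue */
    }
    dev->ring_free--;
    dev->tx_packets++;
    return 0;
}

/*
 * Offer npkts packets with no TX completions in between and return how many
 * the caller believes were sent, judging only by the return code.
 */
static int sim_offer(struct sim_dev *dev, int npkts)
{
    int believed_sent = 0;

    for (int i = 0; i < npkts; i++) {
        if (sim_hard_start_xmit(dev) == 0)
            believed_sent++;
    }
    return believed_sent;
}
```

With three free descriptors and five packets offered, the drop policy leaves
the caller believing five were sent while only three reached the hardware; the
requeue policy keeps the caller's count equal to tx_packets.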


Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc
