On Fri, 2005-03-04 at 03:47, Baruch Even wrote:
> jamal wrote:
> > Can you explain a little more? Why does the throttling cause any
> > bad behavior that's any different from the queue being full? In both
> > cases, packets arriving during that transient will be dropped.
>
> If you have 300 packets in the queue and the throttling kicks in, you now
> drop ALL packets until the queue is empty. This will normally take some
> time, and during all of it you are dropping all the ACKs that are coming
> in; you lose SACK information, and potentially you leave no packet in
> flight, so the next packet will be sent only when the retransmit timer
> wakes up, at which point your congestion control algorithm starts from
> cwnd=1.
>
> You can look at the report http://hamilton.ie/net/LinuxHighSpeed.pdf for
> some graphs of the effects.
>
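Just so we are talking about the same mechanism, below is a toy userspace
model of the behaviour you describe. It is NOT the netif_rx code - the
constants are made up and the service model is crude - the only point is
to contrast plain drop-tail (drops interleaved with accepted packets)
against throttle-until-empty (one long consecutive run of drops):

/* Toy model, NOT kernel code: contrasts plain drop-tail with
 * throttle-until-empty on a 300-packet backlog.  All numbers made up. */
#include <stdio.h>

#define BACKLOG_LIMIT 300   /* stand-in for netdev_max_backlog */
#define TICKS         10000
#define ARRIVALS      4     /* packets arriving per tick (mild overload) */
#define SERVICED      3     /* packets drained per tick */

struct queue {
	int len;
	int throttled;
	long dropped;
	long run;
	long max_run;
};

static void enqueue(struct queue *q, int throttle_mode)
{
	int drop = 0;

	if (throttle_mode && q->throttled) {
		if (q->len == 0)
			q->throttled = 0;       /* fully drained: resume */
		else
			drop = 1;               /* drop everything until then */
	}
	if (!drop && q->len >= BACKLOG_LIMIT) {
		drop = 1;
		if (throttle_mode)
			q->throttled = 1;
	}
	if (drop) {
		q->dropped++;
		if (++q->run > q->max_run)
			q->max_run = q->run;
	} else {
		q->len++;
		q->run = 0;
	}
}

static void drain(struct queue *q)
{
	q->len = q->len >= SERVICED ? q->len - SERVICED : 0;
}

int main(void)
{
	struct queue dt = {0}, th = {0};
	int t, i;

	for (t = 0; t < TICKS; t++) {
		for (i = 0; i < ARRIVALS; i++) {
			enqueue(&dt, 0);   /* plain drop-tail */
			enqueue(&th, 1);   /* throttle-until-empty */
		}
		drain(&dt);
		drain(&th);
	}
	printf("drop-tail: %ld drops, longest consecutive drop run %ld\n",
	       dt.dropped, dt.max_run);
	printf("throttle : %ld drops, longest consecutive drop run %ld\n",
	       th.dropped, th.max_run);
	return 0;
}

The totals come out about the same - they have to, the box only services
so much - but the throttle turns the loss into one long consecutive run,
and that run is what takes out your ACKs and SACK information.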
Always cool to see some test running across the pond.
Were the processors tied to NICs?
Your experiment is more than likely a single flow, correct?
In other words the whole queue was in fact dedicated to just your one
flow - that's why you can call this queue a transient burst queue.
Do you still have the experimental data showing how many packets were
dropped during this period? I am particularly interested in seeing the
softnet stats as well as the TCP netstats.
I think your main problem was the huge amount of SACK on the write queue
and the resultant processing, i.e. section 1.1, and how you resolved that.
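As a back-of-the-envelope illustration of why that processing hurts -
this is not the Linux SACK code, just a count of list steps assuming each
incoming ACK scans the write queue from the head to the segment it
covers:

/* Toy count, NOT the Linux SACK code: assume every incoming ACK has to
 * walk the (linked-list) write queue from the head to reach the segment
 * it refers to, and count the list steps. */
#include <stdio.h>

#define WRITE_QUEUE_LEN 4000   /* segments sitting on the write queue */

int main(void)
{
	long steps = 0;
	int acked;

	/* one ACK per outstanding segment, each scanning from the head */
	for (acked = 1; acked <= WRITE_QUEUE_LEN; acked++)
		steps += acked;

	printf("%d ACKs x %d-segment write queue: %ld list steps\n",
	       WRITE_QUEUE_LEN, WRITE_QUEUE_LEN, steps);
	return 0;
}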
I don't see any issue in dropping ACKs, even many of them, for windows as
large as yours - TCP ACKs are cumulative. It is true that if you drop a
"large" enough number of ACKs you will end up in timeouts - but "large
enough" in your case must be at minimum around 1000 packets. And to say
you dropped 1000 packets while processing 300 means you were taking too
long processing the 300. So it would be interesting to see a repeat of
the test after you've resolved 1.1 but without removing the congestion
code.
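To put rough numbers on the cumulative-ACK argument, here is another toy
(not the Linux TCP code; the segment counts and loss windows are invented
and SACK/timing are ignored). As long as one ACK arrives after the drop
run, snd_una jumps over the whole gap; only when every ACK covering the
tail is lost do you eat the RTO and fall back to cwnd=1:

/* Toy, NOT the Linux TCP code: cumulative ACKs vs a run of dropped ACKs. */
#include <stdio.h>

#define SEGMENTS 1000   /* segments in flight; receiver ACKs each one */

/* Deliver the ACK stream, dropping ACKs for segments [lost_from, lost_to].
 * Returns the highest segment cumulatively acknowledged at the sender. */
static int run(int lost_from, int lost_to)
{
	int snd_una = 0;
	int seg;

	for (seg = 1; seg <= SEGMENTS; seg++) {
		if (seg >= lost_from && seg <= lost_to)
			continue;          /* this ACK died in the backlog */
		if (seg > snd_una)
			snd_una = seg;     /* cumulative: covers all before it */
	}
	return snd_una;
}

int main(void)
{
	int una;

	/* 600 ACKs lost in the middle: the first survivor covers the gap */
	una = run(300, 899);
	printf("ACKs 300..899 dropped: snd_una=%d -> %s\n",
	       una, una == SEGMENTS ? "no timeout" : "RTO, cwnd=1");

	/* every ACK from 300 on lost: nothing ever covers the tail */
	una = run(300, SEGMENTS);
	printf("ACKs 300..%d dropped: snd_una=%d -> %s\n",
	       SEGMENTS, una, una == SEGMENTS ? "no timeout" : "RTO, cwnd=1");
	return 0;
}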
Then what would be really interesting is to see the performance you get
from multiple flows, with and without the congestion code.
I am not against the benchmark-like nature of the single flow and tuning
for that, but we should also look at the effect in a wider scope before
you handwave based on the result of one test case.
In fact I would agree with giving you a way to turn off the congestion
control - and I am not sure how long we should keep it around with NAPI
getting more popular. I will prepare a simple patch.
What you really need to do eventually is use NAPI, not these antiquated
schemes.
I am also worried that since you used a non-NAPI driver, the effect of
reordering, which necessitates the UNDO, is much, much higher.
So if I were you I would repeat 1.2 with the fix from 1.1 as well as
tying the NIC to one CPU. It would also be a good idea to present more
detailed results - not just the fluctuating TCP windows but other
parameters as well (you may not need them for the paper, but they would
be useful for debugging purposes).
> >>the smart schemes are not going to make it that much better if
> >>the hardware/software can't keep up.
> >
> > consider that this queue could be shared by as many as a few thousand
> > unrelated TCP flows - not just one. It is also used for packets being
> > forwarded. If you factor in that the system has to react to protect
> > itself, then these schemes may make sense. The best place to do it is
> > really in hardware, but as close to the hardware as possible is the
> > next best spot.
>
> Actually the problem we had was with TCP end-system performance;
> compared to that, the router problem is more limited since it only
> needs to do a lookup on a hash, tree or whatever and not a linked
> list of several thousand packets.
>
I am not sure I followed. If you mean routers don't use linked lists,
you are highly mistaken.
> I'd prefer avoiding an AFQ scheme in the incoming queue; if you do add
> one, please make it configurable so I can disable it. The drop-tail
> behaviour is good enough for me. Remember that an AFQ needs to drop
> packets long before the queue is full so there will likely be more
> losses involved.
What I was suggesting to Stephen would probably make more sense to kick
in when there's congestion. Weighted windowing lets us sense what is
coming, so the idea was rather to stop admitting new flows once we are
congested.
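Something in the spirit of the following rough sketch - the flow table,
hash and thresholds are invented, nothing here is an actual kernel
interface: below a congestion threshold every flow is admitted and
remembered; above it only flows already in the table get through, so
established flows keep their ACK clock while new ones are told to back
off:

/* Rough sketch, nothing here is a real kernel interface: admit packets
 * from already-seen flows while congested, drop packets from new flows. */
#include <stdio.h>

#define FLOW_BUCKETS 256
#define CONG_THRESH  200    /* backlog level at which we get picky */

static unsigned char flow_seen[FLOW_BUCKETS];

/* hypothetical flow key -> bucket; a real one would hash the 5-tuple */
static unsigned int flow_hash(unsigned int src, unsigned int dst,
			      unsigned int sport, unsigned int dport)
{
	return (src ^ dst ^ sport ^ dport) % FLOW_BUCKETS;
}

/* returns 1 if the packet would be admitted to the backlog, 0 if dropped */
static int admit(unsigned int hash, int backlog_len)
{
	if (backlog_len < CONG_THRESH) {
		flow_seen[hash] = 1;     /* uncongested: remember the flow */
		return 1;
	}
	return flow_seen[hash];          /* congested: existing flows only */
}

int main(void)
{
	/* flow A shows up before congestion, flow B only after it */
	unsigned int a = flow_hash(0x0a000001, 0x0a000002, 5001, 80);
	unsigned int b = flow_hash(0x0a000003, 0x0a000004, 5002, 80);

	printf("flow A, backlog  50: %s\n", admit(a,  50) ? "admit" : "drop");
	printf("flow A, backlog 280: %s\n", admit(a, 280) ? "admit" : "drop");
	printf("flow B, backlog 280: %s\n", admit(b, 280) ? "admit" : "drop");
	return 0;
}

A real version would need a proper 5-tuple hash plus some aging of the
table; this is just the shape of the idea.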
Just use a NAPI driver and you won't have to worry about this.
cheers,
jamal