
Re: [patch] tcp_tw in 2.4

To: kuznet@xxxxxxxxxxxxx
Subject: Re: [patch] tcp_tw in 2.4
From: Andrew Morton <andrewm@xxxxxxxxxx>
Date: Wed, 04 Oct 2000 00:20:06 +1100
Cc: netdev@xxxxxxxxxxx
References: <39D75F78.492E2CA5@uow.edu.au> from "Andrew Morton" at Oct 1, 0 08:15:01 pm <200010011645.UAA09976@ms2.inr.ac.ru>
Sender: owner-netdev@xxxxxxxxxxx
kuznet@xxxxxxxxxxxxx wrote:
> 
> Hello!
> 
> > * tcp_init() wants to set sysctl_tcp_max_tw_buckets to 180,000.  This
> > seems too high (Andi says 22 megs).  I think the patch here is more
> > consistent.
> 
> For a machine with 256MB it is not a very big number.
> 
> Failure to create a tw bucket is a hard bug. The default value should be
> as high as possible.

So with 180k connections and a 60 second TCP_TIMEWAIT_LEN, the machine
is limited to a maximum sustained rate of 3,000 connections per second?
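
(That is just sysctl_tcp_max_tw_buckets divided by the TIME-WAIT lifetime:

    180,000 buckets / 60 seconds in TIME-WAIT  ~=  3,000 buckets/sec

at steady state.  Beyond that rate we start failing to create buckets.)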

> > * tcp_twkill() can consume a huge amount of time if it has enough
> > connections to deal with.  When running lmbench I have observed it
> > killing 2,500 connections in a single pass, which means we spend 15
> > milliseconds in the timer handler.  This is crazy.
> 
> If you expect superb latency from a machine doing 700 conn/sec,
> you are expecting the impossible.

OK, this isn't a very important issue for scheduling latency.  Can be
lived with.  But...

> We do much more, particularly because the work is batched when possible.
> 
> If you want to do sound recording on your web server,
> it is better to increase the granularity of the tw calendar.
> 
> > So I just kill a hundred and then reschedule the timer to run in a
> > couple of jiffies time.  The downside: this limits the tw reaping to
> > 2,500 connections per second.
> 
> You have simply limited your capacity to 2,500 conn/sec, no more.
> We must destroy buckets at the rate at which they are created.

Tell me if this is wrong:

A uniprocessor server is handling 3,000 connections per second (probably
not achievable; what is the maximum 2.4 can sustain?).

That server will reap its time-wait connections once every 7.5 seconds.

So the timer handler will kill 22,500 tcp_tw_buckets in a single pass.

So that timer handler will not return for 0.14 seconds.
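
(For those numbers I am assuming the current eight TCP_TWKILL_SLOTS and the
per-bucket cost measured with lmbench above:

    reap interval    = TCP_TIMEWAIT_LEN / TCP_TWKILL_SLOTS = 60s / 8  = 7.5s
    buckets per pass = 3,000 conn/sec * 7.5s                          = 22,500
    per-bucket cost  = 15ms / 2,500 buckets                           = ~6us
    handler time     = 22,500 buckets * 6us                           = ~0.14s
)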

But the machine is under heavy interrupt load - the timer handler will
probably take 0.3 seconds or more.

So once per seven seconds:

 - the machine's backlog queue will fill up and it will drop
   around two thousand packets.  Subsequent client retransmits
   will add even more load.
 - No timer handlers or softirqs will run for 0.3 seconds.
 - Process accounting will stop.
 - Other scary things :)
 - The machine probably won't reach backlog equilibrium for
   half a second or more.

This all sounds pretty bad and suggests that either TCP_TWKILL_SLOTS is
far too small or I've missed something obvious :)

Increasing TCP_TWKILL_SLOTS to 256 would smooth things out, but I
suggest that something load-adaptive would be more efficient and just
as easy to code.
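
Roughly the sort of thing I have in mind is sketched below.  It isn't
written against the real kernel structures - the slot array, tw_kill()
and the TW_BATCH constant are stand-ins - it only shows the control flow:
kill a bounded batch per timer run, and if the slot isn't drained yet,
come back a couple of jiffies later rather than sitting in the handler
for 140ms.

    #include <linux/sched.h>            /* jiffies */
    #include <linux/timer.h>
    #include <linux/spinlock.h>
    #include <linux/param.h>            /* HZ */

    #define TW_SLOTS    8                       /* calendar slots, as now     */
    #define TW_PERIOD   (60*HZ/TW_SLOTS)        /* 7.5s with a 60s TIME-WAIT  */
    #define TW_BATCH    100                     /* max buckets killed per run */

    struct tw_bucket {                          /* stand-in for tcp_tw_bucket */
            struct tw_bucket *next;
    };

    static struct tw_bucket *tw_slot[TW_SLOTS]; /* stand-in for the death row */
    static int tw_cur_slot;
    static struct timer_list tw_timer;
    static spinlock_t tw_lock = SPIN_LOCK_UNLOCKED;

    static void tw_kill(struct tw_bucket *tw)
    {
            /* stand-in: unhash and free the bucket */
    }

    static void tw_reap(unsigned long data)
    {
            struct tw_bucket *tw;
            int killed = 0;

            spin_lock_bh(&tw_lock);
            while (killed < TW_BATCH &&
                   (tw = tw_slot[tw_cur_slot]) != NULL) {
                    tw_slot[tw_cur_slot] = tw->next;    /* unlink from slot */
                    tw_kill(tw);
                    killed++;
            }

            if (tw_slot[tw_cur_slot] != NULL) {
                    /* Not drained: don't advance the slot, retry very soon.
                     * This bounds the time spent in any one handler run
                     * while still letting the reap rate track the creation
                     * rate. */
                    mod_timer(&tw_timer, jiffies + 2);
            } else {
                    /* Drained: advance and sleep for a full period. */
                    tw_cur_slot = (tw_cur_slot + 1) & (TW_SLOTS - 1);
                    mod_timer(&tw_timer, jiffies + TW_PERIOD);
            }
            spin_unlock_bh(&tw_lock);
    }

A genuinely load-adaptive version could scale TW_BATCH with the depth of
the current slot instead of using a constant, but even the constant version
keeps each handler run to well under a millisecond.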
