> Ok, tested with 0608. Hmm. what can I say, the window never grows past 8kB
> with mss of 256. Both the ofo-queue pruning and the burstiness is masked
> by this behaviour (obviously). The latter of course only if the receiver
> is a linux with 0608 too.
> Tested with mss of 536 and 1024 too - results in max window of ~16kB and
> ~24kB respectively. I can't say I'm satisfied though. This penalises
> connections with smaller MTUs. Think MTU of 576 which I think is pretty
> common on the Internet as a whole. With larger RTTs you cannot use the
> whole bandwidth available because the window is just too
> small. Those tests were done with PPP which only allocates MRU bytes per
> skb but your average ethernet driver has to allocate 1500+ bytes per skb
> regardless of what the actual packet size is.
TCP calculates the _maximal_ window possible with the current mss and device.
If the window converged to 8K, it cannot be larger for this connection.
If it were >8K, it would prune. It is a law of nature rather than
something determined by our choice.
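To illustrate the idea (a sketch, not the kernel's actual code): assume each
mss-sized segment occupies `truesize` bytes of rcvbuf, where `truesize` is the
allocator-rounded data buffer plus per-skb bookkeeping. The `SKB_OVERHEAD`
constant and the kmalloc-style power-of-two rounding below are assumptions for
illustration; only the payload fraction of each skb counts as usable window, so
a smaller mss yields a smaller maximal window from the same rcvbuf.

```python
# Sketch of the receive-window clamp idea, NOT the kernel's exact code.
# Assumption: each mss-byte segment costs `truesize` bytes of rcvbuf.

SKB_OVERHEAD = 512  # hypothetical per-skb bookkeeping, for illustration

def round_up_pow2(n):
    """Round an allocation up to the next power of two (kmalloc-style)."""
    p = 1
    while p < n:
        p <<= 1
    return p

def max_window(rcvbuf, mss):
    truesize = round_up_pow2(mss + SKB_OVERHEAD)
    # Only the payload fraction of each skb is usable window.
    return rcvbuf * mss // truesize

for mss in (256, 536, 1024):
    print(mss, max_window(64 * 1024, mss))
```

With a 64K rcvbuf the computed clamp grows with mss, which is the behaviour
observed in the tests above; the exact numbers depend on the real per-skb
overhead of the device driver.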
If you want a larger window (you do not want this; in your case
8K is enough), you have to increase rcvbuf. But:
> You can enlarge the socket buffer size to get a bigger window but how many
> people will?
It does not matter. People should not increase rcvbuf per socket.
Essentially, this number is determined by the amount of available
RAM and by the number of active connections, rather than by network conditions.
Even the current value of 64K is too large for a common server configuration.
See? The user is not even permitted to increase rcvbuf significantly;
it is limited administratively.
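A back-of-the-envelope sketch of why rcvbuf is a memory question, not a
network one (both numbers below are illustrative assumptions, not
recommendations):

```python
# Rough per-connection buffer budget: illustrative arithmetic only.
ram_for_net = 128 * 1024 * 1024   # assume 128MB set aside for socket buffers
connections = 4096                # assume a busy server

per_socket = ram_for_net // connections  # rcvbuf and sndbuf both draw on this
print(per_socket)  # a 64K rcvbuf alone would already oversubscribe RAM here
```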
Certainly, one day we will have to do smarter "fair" memory management,
which will allow memory consumption to be correlated with network conditions.
For now it is impossible; existing algorithms (e.g. the work done at PSC)
are too coarse to be useful for a production OS, which Linux is.
> used - As in past tense?
In the past and in the present continuous. 8)
> Remember that truesize is not the whole story. The cloned skbs show up in
> wmem_alloc too which is why we got bitten by the burstiness. I see the
> heuristics are on the conservative side though.
Cloned skbs are not counted, because their number is limited by tx_queue_len.
For slow links it must be a small number, around 4.
I have no idea why it is larger in your case. If you have a tx_queue_len
of 3, the overhead is <= 3 packets.
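The bound is simple arithmetic (a sketch; the queue length and segment size
below are assumed values for illustration, not measurements):

```python
# Worst-case extra memory held by cloned skbs sitting in the device queue.
tx_queue_len = 4          # assumed small queue for a slow link
segment_bytes = 256 + 64  # assumed mss plus headers, illustration only

overhead = tx_queue_len * segment_bytes  # hard upper bound on clone memory
print(overhead)
```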
> Valid examples are wireless and satellite links. Congestion window can
> grow freely because the delay was constant in this test.
"thin" link is link with small power and small window.
"thick" link is link with large power and large window.
"long thin link" is semantic non-sense. Link is either thick,
or it is not long. 8)8)
I have never heard about links with large power and mtu of 256.
If wireless ones are of these kind, they are useless for IP applications.
Large power links must have large mtu (>= 1500, at least), no questions,
and you will have 32K default window then.
RATIONALE: mtu is selected small only for small-bandwidth links
with underdeveloped link-layer protocols, to decrease latency dominated
by packet serialization time. If that is the case, the power is 1 and the window is small.
As soon as latency is not dominated by packet size, a small mss
is not required. It is plain logic. 8)
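The rationale can be checked with serialization-delay arithmetic (the link
speeds here are assumptions chosen for illustration):

```python
# Serialization delay: time to clock one packet onto the wire.
def serialization_ms(packet_bytes, link_bps):
    return packet_bytes * 8 * 1000 / link_bps

# On a 9600bps modem a 1500-byte packet monopolizes the link for over
# a second, so a small mtu is justified there...
slow = serialization_ms(1500, 9600)
# ...while on a 2Mbps wireless link the same packet takes only 6ms,
# so a small mtu buys nothing and merely shrinks the window.
fast = serialization_ms(1500, 2_000_000)
print(round(slow), round(fast, 1))
```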