On Tue, Mar 13, 2001 at 04:52:06PM +0100, Michal Ostrowski wrote:
>
> When polling the state of a TCP socket, the criteria used to determine
> whether or not we can write to the socket are:
>
> if (tcp_wspace(sk) >= tcp_min_write_space(sk)) {
>
> This translates to:
>
> sk->sndbuf - sk->wmem_queued >= sk->wmem_queued/2
>
> or
>
> sk->sndbuf >= 3/2 * sk->wmem_queued
>
>
> In tcp_sendmsg a different set of criteria is used. Here the test is
> done with tcp_memory_free() and is equivalent to:
>
> sk->wmem_queued < sk->sndbuf
>
> The condition in tcp_sendmsg() is less demanding than the condition in
> tcp_poll() and so it appears possible for poll() or select() to return
> without flagging a socket as being writable when in fact a write
> operation to the socket could complete without blocking.
>
> Is this wrong or is this just my imagination?
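It is done intentionally for performance reasons. A poll -> context switch ->
send cycle is relatively expensive, and for good bandwidth it shouldn't happen
that often, so poll only reports the socket writable once a reasonable amount
of send buffer space is free. TCP sendmsg always has to work, though.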
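
For illustration, the two checks can be sketched in user-space C. This is only
a sketch, not the kernel code: struct sketch_sock and the sketch_* function
names are made up here, and the struct mirrors only the two fields quoted
above (sndbuf and wmem_queued).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct sock fields quoted above. */
struct sketch_sock {
    int sndbuf;       /* send buffer limit, in bytes */
    int wmem_queued;  /* bytes currently queued for sending */
};

/* Poll-side test: tcp_wspace(sk) >= tcp_min_write_space(sk),
 * i.e. sndbuf - wmem_queued >= wmem_queued / 2. */
static bool sketch_poll_writable(const struct sketch_sock *sk)
{
    assert(sk != NULL);
    return sk->sndbuf - sk->wmem_queued >= sk->wmem_queued / 2;
}

/* sendmsg-side test, as described for tcp_memory_free():
 * wmem_queued < sndbuf. */
static bool sketch_sendmsg_can_queue(const struct sketch_sock *sk)
{
    assert(sk != NULL);
    return sk->wmem_queued < sk->sndbuf;
}
```

With sndbuf = 16384 and wmem_queued = 12000, the poll-side check fails
(4384 < 6000) while the sendmsg-side check passes (12000 < 16384). That is
the window where poll does not report the socket writable but a send would
still be accepted.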
-Andi