In my experimental environment I've noticed that the socket lock on a
TCP socket can be held for a substantial amount of time when
tcp_v4_do_rcv is called from tcp_v4_rcv (up to 64 ms on a 375 MHz
PPC). This occurs when the call to tcp_v4_do_rcv results in subsequent
calls to ip_output(), which queue a large number of packets for
transmission.
When called from tcp_v4_rcv, tcp_v4_do_rcv is protected by
bh_lock_sock(), with the additional condition that no user currently
holds the socket lock (otherwise the skb is instead queued for backlog
processing).
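For reference, the relevant fragment of tcp_v4_rcv looks roughly like
this (paraphrased from the 2.4 source from memory, with the prequeue
path omitted, so details may be slightly off):

	bh_lock_sock(sk);	/* spinlock half of the socket lock */
	ret = 0;
	if (!sk->lock.users) {
		/* No process owns the socket: process the segment
		 * right here in softirq context, potentially all the
		 * way down through ip_output(). */
		ret = tcp_v4_do_rcv(sk, skb);
	} else
		/* A process owns the socket: defer to its backlog. */
		sk_add_backlog(sk, skb);
	bh_unlock_sock(sk);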
OTOH, tcp_v4_do_rcv is also called from __release_sock (via the
"backlog_rcv" function pointer). In this case the spinlock is not
held, but *local* bh processing is suppressed. This would seem to
imply that bh processing on another processor could grab the spinlock
and call tcp_v4_do_rcv at the same time (I don't see how softirq
processing on other processors is blocked here; maybe this is the part
I'm missing).
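Again for reference, __release_sock looks roughly like this
(paraphrased; it is entered from release_sock with the spinlock held
and local bh disabled via spin_lock_bh):

	static void __release_sock(struct sock *sk)
	{
		struct sk_buff *skb = sk->backlog.head;

		do {
			sk->backlog.head = sk->backlog.tail = NULL;
			bh_unlock_sock(sk);  /* spinlock dropped here... */

			do {
				struct sk_buff *next = skb->next;

				skb->next = NULL;
				/* ...so backlog_rcv (tcp_v4_do_rcv for
				 * TCP) runs without it. */
				sk->backlog_rcv(sk, skb);
				skb = next;
			} while (skb != NULL);

			bh_lock_sock(sk);
		} while ((skb = sk->backlog.head) != NULL);
	}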
Assuming this analysis is correct, either:
1. The locking around backlog_rcv in __release_sock must be
strengthened to prevent bh processing on another CPU from calling
tcp_v4_do_rcv, or
2. The locking around tcp_v4_do_rcv in tcp_v4_rcv can be relaxed.
In the case of (2), the scenario I mentioned initially is alleviated,
since the spinlock would presumably not be held through calls to
ip_output(). Anyone trying to grab the socket lock would sleep on it
(as opposed to spinning on it for such a long period of time); see the
sketch below.
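One hypothetical shape for (2), borrowing the pattern used by
__release_sock (untested; I am assuming it is safe to claim
sk->lock.users from softirq context and to call release_sock here,
which re-enables local bh, so treat this purely as a sketch):

	bh_lock_sock(sk);
	if (!sk->lock.users) {
		sk->lock.users = 1;	/* claim the socket... */
		bh_unlock_sock(sk);	/* ...but drop the spinlock */
		/* ip_output() now runs without the spinlock held;
		 * contenders sleep in lock_sock() instead of
		 * spinning. */
		ret = tcp_v4_do_rcv(sk, skb);
		release_sock(sk);	/* drain backlog, wake sleepers */
	} else {
		sk_add_backlog(sk, skb);
		bh_unlock_sock(sk);
	}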
Of course, the overall point here is: is it possible to do all of the
ip_output() processing without holding the socket lock (or at the very
least without holding its spinlock component)?
--
Michal Ostrowski
mostrows@xxxxxxxxxxxxx