On Sat, Mar 26, 2005 at 09:48:31PM -0800, Matt Mackall wrote:
> I believe the mempool can be shared among all sockets that represent
> the same storage device. Packets out any socket represent progress.
What's the point of having more than one socket connected to each storage
device anyway?
> Yes, done before it was even called iSCSI.
Ok, theoretical deadlock conditions aren't nice anyway, but knowing this
is a real life problem too makes it more interesting ;).
> The receive buffer is allocated at the time we DMA it from the card.
> We have no idea of its contents and we won't know what socket mempool
> to pull the receive skbuff from until much higher in the network
> stack, which could be quite a while later if we're under OOM load. And
> we can't have a mempool big enough to handle all the traffic that
> might potentially be deferred for softirq processing when we're OOM,
> especially at gigabit rates.
>
> I think this is actually the tricky piece of the problem and solving
> the socket and send buffer allocation doesn't help until this gets
> figured out.
>
> We could perhaps try to address this with another special receive-side
> alloc_skb that fails most of the time on OOM but sometimes pulls from
> a special reserve.
One algo to handle this: after we get the GFP_ATOMIC failure, we look at
all the mempools that are registered for a certain NIC, and we pick a
random mempool that isn't empty. We use that non-empty mempool to
receive the packet, and we let netif_rx process the packet. Then if,
going up the stack, we find that the packet doesn't belong to the socket
that owns the mempool, we discard the packet and release the ram back
into the mempool. This should make progress since eventually the right
packet will go in the right mempool.
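
To make the idea concrete, here is a rough userspace sketch of the
fallback path, not real kernel code. All the names (struct sock_pool,
struct nic, rx_alloc_fallback, rx_demux) are made up for illustration,
and a plain counter stands in for the mempool; a real version would hook
into the driver's skb allocation and the socket demux instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_POOLS 8

/* Hypothetical stand-in for a per-socket emergency reserve (mempool). */
struct sock_pool {
    int sock_id;    /* socket that owns this reserve */
    int free_bufs;  /* buffers currently available */
    int capacity;   /* total buffers in the reserve */
};

/* Hypothetical per-NIC registry of the reserves attached to it. */
struct nic {
    struct sock_pool *pools[MAX_POOLS];
    size_t npools;
};

/* A received packet whose buffer was borrowed from pool `from`. */
struct rx_packet {
    struct sock_pool *from;
    int dest_sock_id;  /* only known once we are higher in the stack */
};

/* Pick a random non-empty pool, or NULL if every reserve is exhausted. */
static struct sock_pool *pick_nonempty_pool(struct nic *nic)
{
    struct sock_pool *candidates[MAX_POOLS];
    size_t n = 0;

    for (size_t i = 0; i < nic->npools && i < MAX_POOLS; i++)
        if (nic->pools[i]->free_bufs > 0)
            candidates[n++] = nic->pools[i];

    return n ? candidates[(size_t)rand() % n] : NULL;
}

/*
 * Called only after the normal atomic allocation has failed.
 * Returns 0 if a reserve buffer was borrowed, -1 if the packet
 * has to be dropped in the driver.
 */
static int rx_alloc_fallback(struct nic *nic, struct rx_packet *pkt)
{
    struct sock_pool *p = pick_nonempty_pool(nic);

    if (!p)
        return -1;
    p->free_bufs--;
    pkt->from = p;
    return 0;
}

/*
 * Called once the destination socket is known.
 * Returns 1 if the packet is delivered, 0 if it was discarded and
 * its buffer handed back to the reserve it was borrowed from.
 */
static int rx_demux(struct rx_packet *pkt)
{
    if (pkt->from->sock_id == pkt->dest_sock_id)
        return 1;  /* right pool: buffer stays in use until consumed */

    /* Wrong pool: drop the packet and refill the reserve. */
    assert(pkt->from->free_bufs < pkt->from->capacity);
    pkt->from->free_bufs++;
    pkt->from = NULL;
    return 0;
}
```

The driver would call rx_alloc_fallback() only on the failure path, so
the common case pays nothing. A packet that lands in the wrong pool
costs one drop and one retransmit, but the reserve is never leaked.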
> > Perhaps the mempooling overhead will be too huge to pay for it even when
> > it's not necessary, in such case the iscsid will have to pass a new
> > bitflag to the socket syscall, when it creates the socket meant to talk
> > with the remote disk.
>
> I think we probably attach a mempool to a socket after the fact. And
I guess you meant before the fact (i.e. before the connection to the
server); anything attached after the fact (whatever the fact is ;) isn't
going to help.