On Wed, Mar 30, 2005 at 05:22:08PM +0200, Andi Kleen wrote:
> On Tue, Mar 29, 2005 at 04:08:32PM -0500, jamal wrote:
> > Sender is holding onto memory (retransmit queue, I assume) waiting
> > for ACKs. Said sender is under OOM and therefore drops ACKs coming in,
> > and as a result can't let go of these precious resources sitting on the
> > retransmit queue.
> > And iSCSI can't wait long enough for someone else to release memory so
> > the ACKs can be delivered.
> > Did I capture this correctly?
>
> Or worse, your swap device is on iSCSI and you need the ACK to free
> memory.
>
> But that is unrealistic because it could only happen if 100% of
> your memory is dirty pages or filled up by other non-VM users,
> which I think is pretty unlikely. Normally the dirty limits in the VM
> should prevent it anyway - the VM is supposed to block before all
> your memory is dirty. The CPU can still dirty pages in user space,
> but the cleaner should also clean them and, if necessary, block
> the process.

I seem to recall this being fairly easy to trigger by simply pulling
the network cable while there's heavy mmap + write load. The system
will quickly spiral down into OOM and will remain wedged when you plug
the network back in. With iSCSI, after some extended period all the
I/Os will hit their SCSI timeouts and you lose everything.

It's going to be fairly typical for iSCSI boxes to do all their I/O
over iSCSI, including swap and root: things like blades and cluster
nodes.
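
To make the wedge jamal describes above concrete, here's a toy model. It is
only a sketch, not kernel code: the function name, the page accounting and
the "one free page per incoming ACK" rule are all simplifying assumptions of
mine, picked to show why zero free pages is a fixed point.

```python
def deliver_acks(free_pages, unacked_pages, acks_arriving):
    """Toy model (hypothetical, not kernel behaviour in detail).

    Each incoming ACK needs one free page for its receive buffer.
    Once processed, that buffer is freed and one page pinned on the
    retransmit queue is released as well.
    Returns (free_pages, unacked_pages) after all ACKs have arrived.
    """
    for _ in range(acks_arriving):
        if unacked_pages == 0:
            break  # nothing left on the retransmit queue
        if free_pages == 0:
            # OOM: no buffer for the ACK, so it is dropped and the page
            # it would have released stays pinned.
            continue
        free_pages -= 1     # allocate the receive buffer for the ACK
        unacked_pages -= 1  # ACK processed: retransmit page released
        free_pages += 2     # buffer and released page return to the pool
    return free_pages, unacked_pages


if __name__ == "__main__":
    print(deliver_acks(0, 8, 100))  # no free pages: nothing ever drains
    print(deliver_acks(1, 8, 100))  # a single free page: queue drains
```

With zero free pages, no number of arriving ACKs changes anything, which is
the "remains wedged when you plug the network back in" case. With even one
free page, every ACK nets a page back and the queue drains.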
--
Mathematics is the supreme nostalgia of our time.