Re: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics

To: Andi Kleen <ak@xxxxxx>
Subject: Re: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics
From: Matt Mackall <mpm@xxxxxxxxxxx>
Date: Wed, 30 Mar 2005 09:24:13 -0800
Cc: jamal <hadi@xxxxxxxxxx>, Dmitry Yusupov <dmitry_yus@xxxxxxxxx>, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>, Rik van Riel <riel@xxxxxxxxxx>, andrea@xxxxxxx, michaelc@xxxxxxxxxxx, open-iscsi@xxxxxxxxxxxxxxxx, ksummit-2005-discuss@xxxxxxxxx, netdev <netdev@xxxxxxxxxxx>
In-reply-to: <20050330152208.GB12672@xxxxxx>
References: <20050327054831.GA15453@xxxxxxxxx> <1111905181.4753.15.camel@mylaptop> <20050326224621.61f6d917.davem@xxxxxxxxxxxxx> <Pine.LNX.4.61.0503272245350.30885@xxxxxxxxxxxxxxxxxxxxxxxxxxx> <m1zmwn21hk.fsf@xxxxxx> <1112027284.5531.27.camel@mulgrave> <20050329152008.GD63268@xxxxxx> <1112116762.5088.65.camel@beastie> <1112130512.1077.107.camel@xxxxxxxxxxxxxxxx> <20050330152208.GB12672@xxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.6+20040907i
On Wed, Mar 30, 2005 at 05:22:08PM +0200, Andi Kleen wrote:
> On Tue, Mar 29, 2005 at 04:08:32PM -0500, jamal wrote:
> > Sender is holding onto memory (retransmit queue, I assume) waiting
> > for ACKs. Said sender is under OOM and therefore drops incoming ACKs,
> > and as a result can't let go of the precious resources sitting on the
> > retransmit queue.
> > And iSCSI can't wait long enough for someone else to release memory so
> > the ACKs can be delivered.
> > Did I capture this correctly?
> 
> Or worse, your swap device is on iSCSI and you need the ACK to free
> memory.
> 
> But that is unrealistic because it could only happen if 100% of
> your memory is dirty pages or filled up by other non-VM users,
> which I think is pretty unlikely. Normally the dirty limits in the VM
> should prevent it anyway: the VM is supposed to block before all
> your memory is dirty. The CPU can still dirty pages in user space,
> but the cleaner should also clean them and, if necessary, block
> the process.

I seem to recall this being fairly easy to trigger by simply pulling
the network cable while there's heavy mmap + write load. The system
will quickly spiral down into OOM and will remain wedged when you plug
the network back in. With iSCSI, after some extended period all the
outstanding I/Os will hit their SCSI timeouts and you lose everything.
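
A minimal sketch of a load generator of that kind, assuming Python on a
Linux box with the target files on the iSCSI-backed filesystem. The
function names and sizes here are made up for illustration; this is not
the test originally used, just one way to dirty pages through both a
shared mapping and plain write() calls:

```python
import mmap
import os

PAGE = mmap.PAGESIZE  # system page size, typically 4096


def dirty_mapped_pages(path, n_pages):
    """Dirty every page of a MAP_SHARED file mapping; return pages touched.

    Stores through a shared mapping dirty page-cache pages without going
    through write(), so the process itself never blocks in write().
    """
    size = n_pages * PAGE
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)  # the file must be as large as the mapping
        with mmap.mmap(fd, size, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE) as m:
            for i in range(n_pages):
                m[i * PAGE] = i & 0xFF  # one store per page dirties it
        return n_pages
    finally:
        os.close(fd)


def write_load(path, n_pages):
    """Append n_pages page-sized buffers with write(); return bytes written."""
    buf = b"\xaa" * PAGE
    written = 0
    with open(path, "wb") as f:
        for _ in range(n_pages):
            written += f.write(buf)
    return written
```

Running both in a loop, with n_pages sized to a large fraction of RAM,
and then pulling the cable would be the hypothetical reproduction; the
sizes needed to wedge a given box are not something this sketch can
tell you.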

It's going to be fairly typical for iSCSI boxes to do all their I/O
over iSCSI, including swap and root: blades and cluster nodes, for
example.

-- 
Mathematics is the supreme nostalgia of our time.
