Re: How to handle TIF_MEMDIE stalls?

To: Michal Hocko <mhocko@xxxxxxx>
Subject: Re: How to handle TIF_MEMDIE stalls?
From: Theodore Ts'o <tytso@xxxxxxx>
Date: Mon, 2 Mar 2015 11:39:13 -0500
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Johannes Weiner <hannes@xxxxxxxxxxx>, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>, dchinner@xxxxxxxxxx, linux-mm@xxxxxxxxx, rientjes@xxxxxxxxxx, oleg@xxxxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx, mgorman@xxxxxxx, torvalds@xxxxxxxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
On Mon, Mar 02, 2015 at 04:18:32PM +0100, Michal Hocko wrote:
> The idea is sound. But I am pretty sure we will find many corner
> cases. E.g. what if the mere reservation attempt causes the system
> to go OOM and trigger the OOM killer?

Doctor, doctor, it hurts when I do that....

So don't trigger the OOM killer.  We can let the caller decide
whether the reservation request should block or return ENOMEM, but the
whole point of the reservation request idea is that this happens
*before* we've taken any mutexes, so blocking won't prevent forward
progress.

The file system could send down a different flag if we are doing
writebacks for page cleaning purposes, in which case the reservation
request would be a "just a heads up, we *will* be needing this much
memory, but this is not something where we can block or return ENOMEM,
so please give us the highest priority for using the free reserves".
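
To make the three cases above concrete, here is a minimal userspace
sketch of the reservation semantics. It is only a model under stated
assumptions: mem_reserve(), struct mem_pool, the RESV_* names and the
RESV_WOULD_BLOCK return value are all hypothetical and do not exist in
the kernel, and "blocking" is represented by a return code because
there is no scheduler here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reservation modes; none of these names exist in the kernel. */
enum resv_mode {
    RESV_MAY_FAIL,   /* caller prefers -ENOMEM over waiting */
    RESV_MAY_BLOCK,  /* caller holds no locks yet, so it can safely wait */
    RESV_WRITEBACK,  /* page cleaning: a heads up, may use the emergency reserve */
};

/* Hypothetical accounting for the model, counted in pages. */
struct mem_pool {
    size_t free_pages;      /* available to ordinary reservations */
    size_t emergency_pages; /* held back for writeback reservations */
};

/* Stand-in for "the caller would sleep here until memory is freed". */
#define RESV_WOULD_BLOCK 1

int mem_reserve(struct mem_pool *pool, size_t pages, enum resv_mode mode)
{
    /* Fast path: the ordinary pool can cover the whole request. */
    if (pages <= pool->free_pages) {
        pool->free_pages -= pages;
        return 0;
    }

    switch (mode) {
    case RESV_MAY_FAIL:
        return -ENOMEM;
    case RESV_MAY_BLOCK:
        /* No mutexes are held at this point, so waiting cannot deadlock. */
        return RESV_WOULD_BLOCK;
    case RESV_WRITEBACK: {
        size_t shortfall = pages - pool->free_pages;

        /* Model limitation: a real design would have to size the
         * reserve so that this failure cannot happen. */
        if (shortfall > pool->emergency_pages)
            return -ENOMEM;
        pool->emergency_pages -= shortfall;
        pool->free_pages = 0;
        return 0;
    }
    }
    return -EINVAL;
}
```

The point of the sketch is only that the policy decision (fail, wait,
or dip into the reserves) is made once, up front, by the caller's
choice of mode, rather than deep inside an allocation made under a
lock.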

> I think the idea is good! It will just be quite tricky to get there
> without causing more problems than those being solved. The biggest
> question mark so far seems to be the reservation size estimation. If
> it is hard for any caller to know the size beforehand (which would
> be really close to the actually used size) then the whole complexity
> in the code sounds like an overkill and asking administrator to tune
> min_free_kbytes seems a better fit (we would still have to teach the
> allocator to access reserves when really necessary) because the system
> would behave more predictably (although some memory would be wasted).

If we do need to teach the allocator to access reserves when really
necessary, don't we have that already via GFP_NOIO/GFP_NOFS and
GFP_NOFAIL?  If the goal is to do something more fine-grained,
unfortunately at least for the short term we'll need to preserve the
existing behaviour and issue warnings until the file system starts
adding GFP_NOFAIL to those memory allocations where previously,
GFP_NOFS was effectively guaranteeing that failures would almost
never happen.

I know of at least one place, discovered with the recent change (and
revert), where I'll be fixing ext4, but I suspect it won't be the only
one, especially in the block device drivers.

                                                - Ted