xfs

Re: How to handle TIF_MEMDIE stalls?

To: Johannes Weiner <hannes@xxxxxxxxxxx>
Subject: Re: How to handle TIF_MEMDIE stalls?
From: Theodore Ts'o <tytso@xxxxxxx>
Date: Sun, 1 Mar 2015 08:43:22 -0500
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>, mhocko@xxxxxxx, dchinner@xxxxxxxxxx, linux-mm@xxxxxxxxx, rientjes@xxxxxxxxxx, oleg@xxxxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx, mgorman@xxxxxxx, torvalds@xxxxxxxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150228221558.GA23028@xxxxxxxxxxxxxxxxxxxxxx>
References: <201502172123.JIE35470.QOLMVOFJSHOFFt@xxxxxxxxxxxxxxxxxxx> <20150217125315.GA14287@xxxxxxxxxxxxxxxxxxxxxx> <20150217225430.GJ4251@dastard> <20150219102431.GA15569@xxxxxxxxxxxxxxxxxxxxxx> <20150219225217.GY12722@dastard> <20150221235227.GA25079@xxxxxxxxxxxxxxxxxxxxxx> <20150223004521.GK12722@dastard> <20150228162943.GA17989@xxxxxxxxxxxxxxxxxxxxxx> <20150228164158.GE5404@xxxxxxxxx> <20150228221558.GA23028@xxxxxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.23 (2014-03-12)
On Sat, Feb 28, 2015 at 05:15:58PM -0500, Johannes Weiner wrote:
> Overestimating should be fine; the result would be a bit of false memory
> pressure.  But underestimating and looping can't be an option or the
> original lockups will still be there.  We need to guarantee forward
> progress or the problem is somewhat mitigated at best - only now with
> quite a bit more complexity in the allocator and the filesystems.

We've lived with looping as it is and in practice it's actually worked
well.  I can only speak for ext4, but I do a lot of testing under very
high memory pressure situations, and it is used in *production* under
very high stress situations --- and the only time we've run into
trouble is when the looping behaviour somehow got accidentally
*removed*.

There have been MM experts who have been worrying about this situation
for a very long time, but honestly, it seems to be much more of a
theoretical than an actual concern.  So if you don't want to get
hints/estimates about how much memory the file system is about to use,
or about when the file system is willing to wait or even potentially
return ENOMEM (although I suspect that starting to return ENOMEM where
most user space applications don't expect it will cause more
problems), I'm personally happy to just use GFP_NOFAIL everywhere ---
my own infinite loops if the MM developers want to take GFP_NOFAIL
away.  Because in my experience, looping simply hasn't been as awful
as some folks on this thread have made it out to be.

So if you don't like the complexity because the perfect is the enemy
of the good, we can just drop this and the file systems can simply
continue to loop around their memory allocation calls...  or if that
fails we can start adding subsystem specific mempools, which would be
even more wasteful of memory and probably at least as complicated.

                                                        - Ted
