[Top] [All Lists]

Re: How to handle TIF_MEMDIE stalls?

To: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: How to handle TIF_MEMDIE stalls?
From: Michal Hocko <mhocko@xxxxxxx>
Date: Fri, 20 Feb 2015 10:10:01 +0100
Cc: hannes@xxxxxxxxxxx, david@xxxxxxxxxxxxx, dchinner@xxxxxxxxxx, linux-mm@xxxxxxxxx, rientjes@xxxxxxxxxx, oleg@xxxxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx, mgorman@xxxxxxx, torvalds@xxxxxxxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, fernando_b1@xxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <201502192229.FCJ73987.MFQLOHSJFFtOOV@xxxxxxxxxxxxxxxxxxx>
References: <20150218082502.GA4478@xxxxxxxxxxxxxx> <20150218104859.GM12722@dastard> <20150218121602.GC4478@xxxxxxxxxxxxxx> <20150219110124.GC15569@xxxxxxxxxxxxxxxxxxxxxx> <20150219122914.GH28427@xxxxxxxxxxxxxx> <201502192229.FCJ73987.MFQLOHSJFFtOOV@xxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.23 (2014-03-12)
On Thu 19-02-15 22:29:37, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Thu 19-02-15 06:01:24, Johannes Weiner wrote:
> > [...]
> > > Preferrably, we'd get rid of all nofail allocations and replace them
> > > with preallocated reserves.  But this is not going to happen anytime
> > > soon, so what other option do we have than resolving this on the OOM
> > > killer side?
> > 
> > As I've mentioned in other email, we might give GFP_NOFAIL allocator
> > access to memory reserves (by giving it __GFP_HIGH). This is still not a
> > 100% solution because reserves could get depleted but this risk is there
> > even with multiple oom victims. I would still argue that this would be a
> > better approach because selecting more victims might hit pathological
> > case more easily (other victims might be blocked on the very same lock
> > e.g.).
> > 
> Does "multiple OOM victims" mean "select next if first does not die"?
> Then, I think my timeout patch 
> http://marc.info/?l=linux-mm&m=142002495532320&w=2
> does not deplete memory reserves. ;-)

It doesn't because
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2603,9 +2603,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
                        alloc_flags |= ALLOC_NO_WATERMARKS;
                else if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
                        alloc_flags |= ALLOC_NO_WATERMARKS;
-               else if (!in_interrupt() &&
-                               ((current->flags & PF_MEMALLOC) ||
-                                unlikely(test_thread_flag(TIF_MEMDIE))))
+               else if (!in_interrupt() && (current->flags & PF_MEMALLOC))
                        alloc_flags |= ALLOC_NO_WATERMARKS;

you disabled the TIF_MEMDIE heuristic and use it only for OOM exclusion
and break out from the allocator. Exiting task might need a memory to do
so and you make all those allocations fail basically. How do you know
this is not going to blow up?

> If we change to permit invocation of the OOM killer for GFP_NOFS / GFP_NOIO,
> those who do not want to fail (e.g. journal transaction) will start passing
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

Michal Hocko

<Prev in Thread] Current Thread [Next in Thread>