Re: memory reclaim problems on fs usage

To: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: memory reclaim problems on fs usage
From: Arkadiusz Miśkiewicz <arekm@xxxxxxxx>
Date: Thu, 12 Nov 2015 22:28:26 +0100
Cc: linux-mm@xxxxxxxxx, xfs@xxxxxxxxxxx
In-reply-to: <56449E44.7020407@xxxxxxxxxxxxxxxxxxx>
References: <201511102313.36685.arekm@xxxxxxxx> <201511120706.10739.arekm@xxxxxxxx> <56449E44.7020407@xxxxxxxxxxxxxxxxxxx>
User-agent: KMail/1.13.7 (Linux/4.3.0; KDE/4.14.13; x86_64; ; )
On Thursday 12 of November 2015, Tetsuo Handa wrote:
> On 2015/11/12 15:06, Arkadiusz Miśkiewicz wrote:
> > On Wednesday 11 of November 2015, Tetsuo Handa wrote:
> >> Arkadiusz Miśkiewicz wrote:
> >>> This patch is against which tree? (tried 4.1, 4.2 and 4.3)
> >> 
> >> Oops. Whitespace-damaged. This patch is for vanilla 4.1.2.
> >> Reposting with one condition corrected.
> > 
> > Here is log:
> > 
> > http://ixion.pld-linux.org/~arekm/log-mm-1.txt.gz
> > 
> > Uncompressed it is 1.4MB, so not posting it here.
> 
> Thank you for the log. The result is unexpected for me.

[...]

> 
> vmstat_update() and submit_flushes() remained pending for about 110
> seconds. If xlog_cil_push_work() were spinning inside GFP_NOFS allocation,
> it should be reported as MemAlloc: traces, but no such lines are recorded.
> I don't know why xlog_cil_push_work() did not call schedule() for so long.
> Anyway, applying
> http://lkml.kernel.org/r/20151111160336.GD1432@xxxxxxxxxxxxxx should solve
> vmstat_update() part.

To apply that patch on top of 4.1.13 I also had to apply the patches listed
below (the commits quoted further down).

So, in summary, I applied:
http://sprunge.us/GYBb
http://sprunge.us/XWUX
http://sprunge.us/jZjV

(I could try http://lkml.kernel.org/r/20151111160336.GD1432@xxxxxxxxxxxxxx alone
if there is a version for the 4.1 tree somewhere.)

commit 0aaa29a56e4fb0fc9e24edb649e2733a672ca099
Author: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Date:   Fri Nov 6 16:28:37 2015 -0800

    mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand

commit 974a786e63c96a2401a78ddba926f34c128474f1
Author: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Date:   Fri Nov 6 16:28:34 2015 -0800

    mm, page_alloc: remove MIGRATE_RESERVE

commit c2d42c16ad83006a706d83e51a7268db04af733a
Author: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Date:   Thu Nov 5 18:48:43 2015 -0800

    mm/vmstat.c: uninline node_page_state()

commit 176bed1de5bf977938cad26551969eca8f0883b1
Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date:   Thu Oct 15 13:01:50 2015 -0700

    vmstat: explicitly schedule per-cpu work on the CPU we need it to run on


[...]

> 
> Well, what steps should we try next for isolating the problem?
> 
> Swap is not used at all. Turning off swap might help.

Disabled swap.
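
For reference, the "Free swap" / "Total swap" lines from the earlier log (quoted just below) differ by only 20kB, which is what "not used at all" refers to. A minimal sketch for pulling that figure out of dmesg-style text follows; the function name `swap_used_kb` and the regular expression are illustrative assumptions, not part of any existing tool.

```python
import re

# Matches dmesg-style lines such as "[ 8633.753574] Free swap  = 117220800kB".
# The pattern is an assumption based on the two lines quoted in this thread.
SWAP_RE = re.compile(r"(Free|Total) swap\s*=\s*(\d+)kB")


def swap_used_kb(log_text):
    """Return Total swap minus Free swap (kB), using the last values seen."""
    values = {}
    for kind, kb in SWAP_RE.findall(log_text):
        values[kind] = int(kb)  # later reports overwrite earlier ones
    return values["Total"] - values["Free"]


sample = """\
[ 8633.753574] Free swap  = 117220800kB
[ 8633.753576] Total swap = 117220820kB
"""
print(swap_used_kb(sample))  # 20
```

The same function could be run over the uncompressed log file to check whether swap usage ever grew before swap was disabled.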

> 
> [ 8633.753574] Free swap  = 117220800kB
> [ 8633.753576] Total swap = 117220820kB
> 
> Turning off perf might also help.
> 
> [ 5001.394085] perf interrupt took too long (2505 > 2495), lowering
> kernel.perf_event_max_sample_rate to 50100

I didn't find a way to disable perf: the kernel .config option gets auto-enabled
by some dependency, so I left it untouched.


With the patches mentioned above applied, I have not been able to reproduce the
memory allocation problem (still trying, though).

Current debug log: http://ixion.pld-linux.org/~arekm/log-mm-2.txt.gz

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
