Re: [patch 3/5] mm: try to distribute dirty pages fairly across zones

To: Pekka Enberg <penberg@xxxxxxxxxxxxxx>
Subject: Re: [patch 3/5] mm: try to distribute dirty pages fairly across zones
From: Johannes Weiner <jweiner@xxxxxxxxxx>
Date: Fri, 30 Sep 2011 10:55:39 +0200
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Mel Gorman <mgorman@xxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, Wu Fengguang <fengguang.wu@xxxxxxxxx>, Jan Kara <jack@xxxxxxx>, Rik van Riel <riel@xxxxxxxxxx>, Minchan Kim <minchan.kim@xxxxxxxxx>, Chris Mason <chris.mason@xxxxxxxxxx>, "Theodore Ts'o" <tytso@xxxxxxx>, Andreas Dilger <adilger.kernel@xxxxxxxxx>, Shaohua Li <shaohua.li@xxxxxxxxx>, xfs@xxxxxxxxxxx, linux-btrfs@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, linux-mm@xxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
In-reply-to: <CAOJsxLFWfH5zDG8ui=yQyOcZY_nXhK6r+ziapLg9Zhmb3ibuWQ@xxxxxxxxxxxxxx>
References: <1317367044-475-1-git-send-email-jweiner@xxxxxxxxxx> <1317367044-475-4-git-send-email-jweiner@xxxxxxxxxx> <CAOJsxLFWfH5zDG8ui=yQyOcZY_nXhK6r+ziapLg9Zhmb3ibuWQ@xxxxxxxxxxxxxx>
On Fri, Sep 30, 2011 at 10:35:25AM +0300, Pekka Enberg wrote:
> Hi Johannes!
> On Fri, Sep 30, 2011 at 10:17 AM, Johannes Weiner <jweiner@xxxxxxxxxx> wrote:
> > But there is a flaw in that we have a zoned page allocator which does
> > not care about the global state but rather the state of individual
> > memory zones.  And right now there is nothing that prevents one zone
> > from filling up with dirty pages while other zones are spared, which
> > frequently leads to situations where kswapd, in order to restore the
> > watermark of free pages, does indeed have to write pages from that
> > zone's LRU list.  This can interfere so badly with IO from the flusher
> > threads that major filesystems (btrfs, xfs, ext4) mostly ignore write
> > requests from reclaim already, taking away the VM's only possibility
> > to keep such a zone balanced, aside from hoping the flushers will soon
> > clean pages from that zone.
> The obvious question is: how did you test this? Can you share the results?

Meh, sorry about that, they were in the series introduction the last
time and I forgot to copy them over.

I did single-threaded, linear writing to a USB stick, as the effect is
most pronounced with slow backing devices.

[ The write deferring on ext4 because of delalloc is so extreme that I
  could trigger it even with simple linear writers on a mediocre
  rotating disk, though.  I cannot access the logfiles right now, but
  nr_vmscan_write practically went away here as well and runtime
  was unaffected with the patched kernel. ]

                        Test results

15M DMA + 3246M DMA32 + 504M Normal = 3765M memory
40% dirty ratio, 10% background ratio
16G USB thumb drive
10 runs of dd if=/dev/zero of=disk/zeroes bs=32k count=$((10 << 15))
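
For orientation, the setup above works out as in the following rough
sketch.  The helper name pct_of is made up for illustration, and the two
limits are upper bounds only, on the assumption that the kernel applies
the ratios to dirtyable memory rather than to total memory.

```shell
#!/bin/sh
# Rough arithmetic for the test setup; pct_of is an illustrative helper.

# pct_of <percent> <size>: integer percentage of a size
pct_of() {
    echo $(( $2 * $1 / 100 ))
}

total_mb=$((15 + 3246 + 504))                  # DMA + DMA32 + Normal
count=$((10 << 15))                            # dd block count
written_gib=$(( (count * 32 * 1024) >> 30 ))   # bs=32k

echo "total memory:     ${total_mb}M"
echo "dd writes:        ${count} blocks, ${written_gib}G per run"
# Upper bounds: the ratios apply to dirtyable memory, which is
# smaller than total memory.
echo "dirty limit:      <= $(pct_of 40 "$total_mb")M"
echo "background limit: <= $(pct_of 10 "$total_mb")M"
```

Even allowing for that overestimate, the global dirty allowance is likely
well above the 504M of the Normal zone, so nothing in the global limit
alone keeps that zone from filling up with dirty pages.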

                seconds                 nr_vmscan_write
                        (stddev)               min|     median|        max
vanilla:         549.747( 3.492)             0.000|      0.000|      0.000
patched:         550.996( 3.802)             0.000|      0.000|      0.000

vanilla:        1183.094(53.178)         54349.000|  59341.000|  65163.000
patched:         558.049(17.914)             0.000|      0.000|     43.000

vanilla:         573.679(14.015)        156657.000| 460178.000| 606926.000
patched:         563.365(11.368)             0.000|      0.000|   1362.000

vanilla:         561.197(15.782)             0.000|2725438.000|4143837.000
patched:         568.806(17.496)             0.000|      0.000|      0.000
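
The tooling behind these numbers is not shown in this mail.  One way to
collect a per-run nr_vmscan_write figure is to sample /proc/vmstat before
and after each dd run, roughly as sketched below.  The function names are
hypothetical, and the optional file argument exists only so the parser
can be tried on a saved copy of /proc/vmstat.

```shell
#!/bin/sh
# Sketch: report how much a /proc/vmstat counter grew while a command ran.
# vmstat_counter and counter_delta are hypothetical helper names.

# vmstat_counter <name> [file]: print the value of one counter
vmstat_counter() {
    awk -v key="$1" '$1 == key { print $2 }' "${2:-/proc/vmstat}"
}

# counter_delta <name> <command...>: run the command, print counter growth
counter_delta() {
    name=$1; shift
    before=$(vmstat_counter "$name")
    "$@" >/dev/null 2>&1
    after=$(vmstat_counter "$name")
    echo $(( after - before ))
}
```

A single run would then look like
`counter_delta nr_vmscan_write dd if=/dev/zero of=disk/zeroes bs=32k count=$((10 << 15))`,
repeated ten times to get min, median and max.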