To: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: [regression v4.0-rc1] mm: IPIs from TLB flushes causing significant performance degradation.
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 3 Mar 2015 22:34:37 +1100
Cc: Mel Gorman <mgorman@xxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Matt B <jackdachef@xxxxxxxxx>, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, linux-mm <linux-mm@xxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CA+55aFyczb5asoTwhzaJr1JdRi1epg1A6cFJgnzMMZj6U0gFWA@xxxxxxxxxxxxxx>
References: <20150302010413.GP4251@dastard> <CA+55aFzGFvVGD_8Y=jTkYwgmYgZnW0p0Fjf7OHFPRcL6Mz4HOw@xxxxxxxxxxxxxx> <20150303014733.GL18360@dastard> <CA+55aFw+7V9DfxBA2_DhMNrEQOkvdwjFFga5Y67-a6yVeAz+NQ@xxxxxxxxxxxxxx> <CA+55aFw+fb=Fh4M2wA4dVskgqN7PhZRGZS6JTMx4Rb1Qn++oaA@xxxxxxxxxxxxxx> <20150303052004.GM18360@dastard> <CA+55aFyczb5asoTwhzaJr1JdRi1epg1A6cFJgnzMMZj6U0gFWA@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Mar 02, 2015 at 10:56:14PM -0800, Linus Torvalds wrote:
> On Mon, Mar 2, 2015 at 9:20 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >>
> >> But are those migrate-page calls really common enough to make these
> >> things happen often enough on the same pages for this all to matter?
> >
> > It's looking like that's a possibility.
> 
> Hmm. Looking closer, commit 10c1045f28e8 already should have
> re-introduced the "pte was already NUMA" case.
> 
> So that's not it either, afaik. Plus your numbers seem to say that
> it's really "migrate_pages()" that is done more. So it feels like the
> numa balancing isn't working right.

So that should show up in the vmstats, right? Oh, and there's a
tracepoint in migrate_pages, too. Same 6x10s samples in phase 3:

3.19:

        55,898      migrate:mm_migrate_pages

And a sample of the events shows 99.99% of these are:

mm_migrate_pages:     nr_succeeded=1 nr_failed=0 mode=MIGRATE_ASYNC reason=

4.0-rc1:

        364,442      migrate:mm_migrate_pages

These are also single-page MIGRATE_ASYNC events, as in 3.19.
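
For a quick sense of scale, the two tracepoint counts above can be compared directly. This is only a rough sketch: the helper name parse_event_count is made up for illustration, and the input lines are copied from the counts quoted above. It works out to roughly 6.5x as many mm_migrate_pages events on 4.0-rc1.

```python
def parse_event_count(line):
    """Parse a 'count  event' line whose count uses thousands separators.

    Hypothetical helper, not part of perf or any kernel tooling.
    """
    count, event = line.split()
    return event, int(count.replace(",", ""))


# Lines copied from the tracepoint counts quoted above.
samples = {
    "3.19": "        55,898      migrate:mm_migrate_pages",
    "4.0-rc1": "        364,442      migrate:mm_migrate_pages",
}

counts = {kernel: parse_event_count(line)[1] for kernel, line in samples.items()}

# How many times more migrate events 4.0-rc1 logged over the same sample window.
increase = counts["4.0-rc1"] / counts["3.19"]
print(f"mm_migrate_pages events: {increase:.1f}x more on 4.0-rc1")
```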

And 'grep "numa\|migrate" /proc/vmstat' output for the entire
xfs_repair run:

3.19:

numa_hit 5163221
numa_miss 121274
numa_foreign 121274
numa_interleave 12116
numa_local 5153127
numa_other 131368
numa_pte_updates 36482466
numa_huge_pte_updates 0
numa_hint_faults 34816515
numa_hint_faults_local 9197961
numa_pages_migrated 1228114
pgmigrate_success 1228114
pgmigrate_fail 0

4.0-rc1:

numa_hit 36952043
numa_miss 92471
numa_foreign 92471
numa_interleave 10964
numa_local 36927384
numa_other 117130
numa_pte_updates 84010995
numa_huge_pte_updates 0
numa_hint_faults 81697505
numa_hint_faults_local 21765799
numa_pages_migrated 32916316
pgmigrate_success 32916316
pgmigrate_fail 0
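
To make the two dumps easier to compare, here is a small sketch that parses "name value" lines as printed by /proc/vmstat and reports the 4.0-rc1 to 3.19 ratio for a few counters. The function and variable names are illustrative assumptions, and only a subset of the counters above is included, with values copied verbatim.

```python
def parse_vmstat(text):
    """Turn 'name value' lines (the /proc/vmstat format) into a dict of ints."""
    counters = {}
    for line in text.strip().splitlines():
        name, value = line.split()
        counters[name] = int(value)
    return counters


# Subset of the counters quoted above, copied verbatim.
v319 = parse_vmstat("""
numa_pte_updates 36482466
numa_hint_faults 34816515
numa_hint_faults_local 9197961
numa_pages_migrated 1228114
""")

v40rc1 = parse_vmstat("""
numa_pte_updates 84010995
numa_hint_faults 81697505
numa_hint_faults_local 21765799
numa_pages_migrated 32916316
""")

# Ratio of each counter, 4.0-rc1 relative to 3.19.
ratios = {name: v40rc1[name] / v319[name] for name in v319}

# Pages migrated per NUMA hinting fault on each kernel.
migrated_per_fault = {
    "3.19": v319["numa_pages_migrated"] / v319["numa_hint_faults"],
    "4.0-rc1": v40rc1["numa_pages_migrated"] / v40rc1["numa_hint_faults"],
}

for name, ratio in ratios.items():
    print(f"{name:24s} {ratio:6.2f}x")
for kernel, frac in migrated_per_fault.items():
    print(f"{kernel:8s} migrated pages per hint fault: {frac:.3f}")
```

On these numbers, hint faults and pte updates grow by a bit over 2x, while migrated pages grow by well over 20x, so the share of hinting faults that end in a migration is what changes most between the two kernels.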

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
