| To: | Dave Chinner <david@xxxxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Matt B <jackdachef@xxxxxxxxx> |
|---|---|
| Subject: | Re: [regression v4.0-rc1] mm: IPIs from TLB flushes causing significant performance degradation. |
| From: | Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> |
| Date: | Mon, 2 Mar 2015 11:47:52 -0800 |
| Cc: | Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, linux-mm <linux-mm@xxxxxxxxx>, xfs@xxxxxxxxxxx |
| Delivered-to: | xfs@xxxxxxxxxxx |
| In-reply-to: | <20150302010413.GP4251@dastard> |
| References: | <20150302010413.GP4251@dastard> |
| Sender: | linus971@xxxxxxxxx |
On Sun, Mar 1, 2015 at 5:04 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> Across the board the 4.0-rc1 numbers are much slower, and the
> degradation is far worse when using the large memory footprint
> configs. Perf points straight at the cause - this is from 4.0-rc1
> on the "-o bhash=101073" config:
>
> -   56.07%  56.07%  [kernel]  [k] default_send_IPI_mask_sequence_phys
>    - 99.99% physflat_send_IPI_mask
>       - 99.37% native_send_call_func_ipi
> ..
>
> And the same profile output from 3.19 shows:
>
> -    9.61%   9.61%  [kernel]  [k] default_send_IPI_mask_sequence_phys
>    - 99.98% physflat_send_IPI_mask
>       - 96.26% native_send_call_func_ipi
> ...
>
> So either there's been a massive increase in the number of IPIs
> being sent, or the cost per IPI has greatly increased. Either way,
> the result is a pretty significant performance degradation.
And on Mon, Mar 2, 2015 at 11:17 AM, Matt <jackdachef@xxxxxxxxx> wrote:
>
> Linus already posted a fix to the problem, however I can't seem to
> find the matching commit in his tree (searching for "TLC regression"
> or "TLB cache").
That was commit f045bbb9fa1b, which was then refined by commit
721c21c17ab9, because it turned out that ARM64 had a very subtle
relationship with tlb->end and fullmm.
But both of those hit 3.19, so none of this should affect 4.0-rc1.
There's something else going on.
I assume it's the mm queue from Andrew, so adding him to the cc. There
are changes to the page migration etc, which could explain it.
There are also a fair number of APIC changes in 4.0-rc1, so I guess it
really could be just that the IPI sending itself has gotten much
slower. Adding Ingo for that, although I don't think
default_send_IPI_mask_sequence_phys() itself has actually changed,
only other things around the apic. So I'd be inclined to blame the mm
changes.
Obviously bisection would find it..
Linus