[regression v4.0-rc1] mm: IPIs from TLB flushes causing significant performance degradation.

Linus Torvalds torvalds at linux-foundation.org
Mon Mar 2 13:47:52 CST 2015


On Sun, Mar 1, 2015 at 5:04 PM, Dave Chinner <david at fromorbit.com> wrote:
>
> Across the board the 4.0-rc1 numbers are much slower, and the
> degradation is far worse when using the large memory footprint
> configs. Perf points straight at the cause - this is from 4.0-rc1
> on the "-o bhash=101073" config:
>
> -   56.07%    56.07%  [kernel]            [k] default_send_IPI_mask_sequence_phys
>       - 99.99% physflat_send_IPI_mask
>          - 99.37% native_send_call_func_ipi
..
>
> And the same profile output from 3.19 shows:
>
> -    9.61%     9.61%  [kernel]            [k] default_send_IPI_mask_sequence_phys
>      - 99.98% physflat_send_IPI_mask
>          - 96.26% native_send_call_func_ipi
...
>
> So either there's been a massive increase in the number of IPIs
> being sent, or the cost per IPI has greatly increased. Either way,
> the result is a pretty significant performance degradation.

And on Mon, Mar 2, 2015 at 11:17 AM, Matt <jackdachef at gmail.com> wrote:
>
> Linus already posted a fix to the problem, however I can't seem to
> find the matching commit in his tree (searching for "TLC regression"
> or "TLB cache").

That was commit f045bbb9fa1b, which was then refined by commit
721c21c17ab9, because it turned out that ARM64 had a very subtle
relationship with tlb->end and fullmm.

But both of those hit 3.19, so none of this should affect 4.0-rc1.
There's something else going on.

I assume it's the mm queue from Andrew, so adding him to the cc. There
are changes to the page migration etc, which could explain it.

There are also a fair number of APIC changes in 4.0-rc1, so I guess it
really could be just that the IPI sending itself has gotten much
slower. Adding Ingo for that, although I don't think
default_send_IPI_mask_sequence_phys() itself has actually changed,
only other things around the apic. So I'd be inclined to blame the mm
changes.

Obviously bisection would find it..

                          Linus


