Re: [linus:master] [x86/mm/tlb] 7e33001b8b: will-it-scale.per_thread_ops 20.7% improvement

From: Rik van Riel
Date: Sat Nov 30 2024 - 12:29:27 EST


On Sat, 2024-11-30 at 16:07 +0800, kernel test robot wrote:
>
>
> Hello,
>
> in this test, we don't have CONFIG_DEBUG_VM.
> # CONFIG_DEBUG_VM is not set
>
> below report is just FYI.
>
>
> kernel test robot noticed a 20.7% improvement of
> will-it-scale.per_thread_ops on:
>
>
> commit: 7e33001b8b9a78062679e0fdf5b0842a49063135 ("x86/mm/tlb: Put
> cpumask_test_cpu() check in switch_mm_irqs_off() under
> CONFIG_DEBUG_VM")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

It's good to get this confirmation that the mm_cpumask
really is that expensive.

I guess we could experiment with something like the following:

1) Stop using the mm_cpumask altogether on x86
2) Instead, at context switch time just update
   per_cpu variables like cpu_tlbstate.loaded_mm
   and friends
3) At (much rarer) TLB flush time:
   - Iterate over all CPUs
   - Use cpu_tlbstate.loaded_mm and .is_lazy to build a
     (per-CPU?) cpumask.
   - Pass that cpumask to functions like flush_tlb_multi
     and on_each_cpu_mask
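
To make the three steps above concrete, here is a rough userspace
sketch of the idea, not kernel code. Everything in it is a
stand-in: struct mm, struct tlb_state, cpu_state[], NR_CPUS and the
sketch_* helpers are hypothetical names, and a plain 64-bit word
takes the place of a real cpumask. Skipping lazy CPUs when building
the mask is also just an assumption about how .is_lazy might be
used, and the sketch ignores the memory ordering and races a real
implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical; small enough for a one-word mask */

struct mm { int id; };	/* stand-in for struct mm_struct */

/* Stand-in for the per-CPU TLB state (loaded_mm, is_lazy). */
struct tlb_state {
	const struct mm *loaded_mm;
	bool is_lazy;
};

static struct tlb_state cpu_state[NR_CPUS];

/*
 * Step 2: at context switch time, only this CPU's own state is
 * written. No shared per-mm cpumask is touched.
 */
static void sketch_switch_mm(unsigned int cpu, const struct mm *next,
			     bool lazy)
{
	assert(cpu < NR_CPUS);
	cpu_state[cpu].loaded_mm = next;
	cpu_state[cpu].is_lazy = lazy;
}

/*
 * Step 3: at flush time, scan every CPU and collect the ones that
 * currently have this mm loaded and are not lazy (assumption).
 * Bit N set in the result means CPU N would be sent the flush.
 */
static uint64_t sketch_build_flush_mask(const struct mm *mm)
{
	uint64_t mask = 0;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu_state[cpu].loaded_mm == mm &&
		    !cpu_state[cpu].is_lazy)
			mask |= UINT64_C(1) << cpu;
	}
	return mask;
}
```

The trade-off this is meant to show: the context switch path
writes only CPU-local data, while the flush path pays for a walk
over all CPUs each time it runs.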

Does that make sense as something we could try to
further reduce context switch overhead, and the
TLB flush thundering herd on the mm_cpumask triggered
by the main loop in will-it-scale's tlb_flush2 test?

https://github.com/antonblanchard/will-it-scale/blob/master/tests/tlb_flush2.c


--
All Rights Reversed.