Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

From: Andy Lutomirski
Date: Fri Jun 01 2018 - 11:11:25 EST


On Fri, Jun 1, 2018 at 5:28 AM Rik van Riel <riel@xxxxxxxxxxx> wrote:
>
> Song noticed switch_mm_irqs_off taking a lot of CPU time in recent
> kernels, using 2.4% of a 48 CPU system during a netperf to localhost run.
> Digging into the profile, we noticed that cpumask_clear_cpu and
> cpumask_set_cpu together take about half of the CPU time taken by
> switch_mm_irqs_off.
>
> However, the CPUs running netperf end up switching back and forth
> between netperf and the idle task, which does not require changes
> to the mm_cpumask. Furthermore, the init_mm cpumask ends up being
> the most heavily contended one in the system.
>
> Skipping cpumask_clear_cpu and cpumask_set_cpu for init_mm
> (mostly the idle task) reduced CPU use of switch_mm_irqs_off
> from 2.4% of the CPU to 1.9% of the CPU, with the following
> netperf commandline:

I'm conceptually fine with this change. Does mm_cpumask(&init_mm) end
up in a deterministic state?
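
To make the idea concrete, here is a minimal userspace sketch of the
skip described in the quoted text. It is not the patch itself. The
struct layout, the helper names (mask_set_cpu, mask_clear_cpu,
switch_mm_model, round_trips) and the op counter are all hypothetical,
and a single unsigned long stands in for the real cpumask.

```c
#include <stdatomic.h>

/* Hypothetical userspace model; names echo the kernel's, but this is not kernel code. */
struct mm {
    atomic_ulong cpumask;   /* one bit per CPU, stand-in for mm_cpumask() */
};

static struct mm init_mm;           /* stand-in for the kernel's init_mm */
static unsigned long atomic_ops;    /* counts atomic read-modify-write operations */

static void mask_set_cpu(struct mm *mm, int cpu)
{
    atomic_fetch_or(&mm->cpumask, 1UL << cpu);
    atomic_ops++;
}

static void mask_clear_cpu(struct mm *mm, int cpu)
{
    atomic_fetch_and(&mm->cpumask, ~(1UL << cpu));
    atomic_ops++;
}

/* Model of a context switch that never touches init_mm's mask. */
static void switch_mm_model(struct mm *prev, struct mm *next, int cpu)
{
    if (prev != &init_mm)
        mask_clear_cpu(prev, cpu);
    if (next != &init_mm)
        mask_set_cpu(next, cpu);
}

/* Bounce between a task mm and the idle mm n times; return atomic ops used. */
static unsigned long round_trips(struct mm *task, int cpu, int n)
{
    unsigned long before = atomic_ops;

    for (int i = 0; i < n; i++) {
        switch_mm_model(task, &init_mm, cpu);   /* task -> idle */
        switch_mm_model(&init_mm, task, cpu);   /* idle -> task */
    }
    return atomic_ops - before;
}
```

In this model each task/idle round trip costs two atomic operations
instead of four, and init_mm's mask is never written, so it keeps
whatever value it started with. Whether the real mm_cpumask(&init_mm)
behaves the same way depends on the actual patch and on any other
writers, which is the question above.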

Mike, depending on exactly what's going on with your benchmark, this
might help recover a bit of your performance, too.

--Andy