Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier
From: Andy Lutomirski
Date: Tue Jul 17 2018 - 16:04:32 EST
On Mon, Jul 16, 2018 at 12:03 PM, Rik van Riel <riel@xxxxxxxxxxx> wrote:
> Lazy TLB mode can result in an idle CPU being woken up by a TLB flush,
> when all it really needs to do is reload %CR3 at the next context switch,
> assuming no page table pages got freed.
>
> Memory ordering is used to prevent race conditions between switch_mm_irqs_off,
> which checks whether .tlb_gen changed, and the TLB invalidation code, which
> increments .tlb_gen whenever page table entries get invalidated.
>
> The atomic increment in inc_mm_tlb_gen is its own barrier; the context
> switch code adds an explicit barrier between reading tlbstate.is_lazy and
> next->context.tlb_gen.
>
> Unlike the 2016 version of this patch, CPUs with cpu_tlbstate.is_lazy set
> are not removed from the mm_cpumask(mm), since that would prevent the TLB
> flush IPIs at page table free time from being sent to all the CPUs
> that need them.
>
> This patch reduces total CPU use in the system by about 1-2% for a
> memcache workload on two-socket systems, and by about 1% for a heavily
> multi-process netperf run between two systems.
>
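The ordering relied on above, in rough sketch form (simplified from
the changelog, not the literal patch code):

	/* Invalidation side: inc_mm_tlb_gen() is an atomic RMW, which
	 * is a full barrier on x86, so the tlb_gen bump is ordered
	 * before any later reads of lazy state. */
	atomic64_inc_return(&mm->context.tlb_gen);

	/* Context switch side: an explicit barrier between reading
	 * tlbstate.is_lazy and next->context.tlb_gen. */
	was_lazy = this_cpu_read(cpu_tlbstate.is_lazy);
	smp_mb();
	next_tlb_gen = atomic64_read(&next->context.tlb_gen);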
I'm not 100% certain I'm replying to the right email, and I haven't
gotten the tip-bot notification at all, but:
I think you've introduced a minor-ish performance regression due to
changing the old (admittedly terribly documented) control flow a bit.
Before, if real_prev == next, we would skip:

	load_mm_cr4(next);
	switch_ldt(real_prev, next);
Now we don't any more. I think you should reinstate that
optimization. It's probably as simple as wrapping them in an
if (real_prev != next) with a comment like /* Remote changes that
would require a cr4 or ldt reload will unconditionally send an IPI
even to lazy CPUs.  So, if we aren't changing our mm, we don't need
to refresh cr4 or the ldt. */
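Concretely, something like this (untested sketch; I may be
misremembering the exact names in the current tree):

	if (real_prev != next) {
		/*
		 * Remote changes that would require a cr4 or ldt
		 * reload will unconditionally send an IPI even to
		 * lazy CPUs.  So, if we aren't changing our mm, we
		 * don't need to refresh cr4 or the ldt here.
		 */
		load_mm_cr4(next);
		switch_ldt(real_prev, next);
	}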
Hmm. load_mm_cr4() should bypass itself when mm == &init_mm. Want to
fix that part or should I?
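Something along these lines is what I have in mind (untested, from
memory of what load_mm_cr4() looks like today; the check could also
live in the caller instead):

	static inline void load_mm_cr4(struct mm_struct *mm)
	{
		/* The kernel's init_mm has no rdpmc state to propagate. */
		if (mm == &init_mm)
			return;

		if (static_key_false(&rdpmc_always_available) ||
		    atomic_read(&mm->context.perf_rdpmc_allowed))
			cr4_set_bits(X86_CR4_PCE);
		else
			cr4_clear_bits(X86_CR4_PCE);
	}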
--Andy