Re: [PATCH 6/9] KVM: x86: Provide paravirtualized flush_tlb_multi()

From: Andy Lutomirski
Date: Tue Jun 25 2019 - 23:56:50 EST


On Tue, Jun 25, 2019 at 8:41 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> > On Jun 25, 2019, at 8:35 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> > On Tue, Jun 25, 2019 at 7:39 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
> >>> On Jun 25, 2019, at 2:40 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >>>
> >>> On 6/12/19 11:48 PM, Nadav Amit wrote:
> >>>> Support the new interface of flush_tlb_multi, which also flushes the
> >>>> local CPU's TLB, instead of flush_tlb_others that does not. This
> >>>> interface is more performant since it parallelizes remote and local TLB
> >>>> flushes.
> >>>>
> >>>> The actual implementation of flush_tlb_multi() is almost identical to
> >>>> that of flush_tlb_others().
> >>>
> >>> This confused me a bit. I thought we didn't support paravirtualized
> >>> flush_tlb_multi() from reading earlier in the series.
> >>>
> >>> But, it seems like that might be Xen-only and doesn't apply to KVM and
> >>> paravirtualized KVM has no problem supporting flush_tlb_multi(). Is
> >>> that right? It might be good to include some of that background in the
> >>> changelog to set the context.
> >>
> >> I'll try to improve the change-logs a bit. There is no inherent reason for
> >> PV TLB-flushers not to implement their own flush_tlb_multi(). It is left
> >> for future work, and here are some reasons:
> >>
> >> 1. Hyper-V/Xen TLB-flushing code is not very simple
> >> 2. I don't have a proper setup
> >> 3. I am lazy
> >
> > In the long run, I think that we're going to want a way for one CPU to
> > do a remote flush and then, with appropriate locking, update the
> > tlb_gen fields for the remote CPU. Getting this right may be a bit
> > nontrivial.
>
> What do you mean by "do a remote flush"?
>

I mean a PV-assisted flush on a CPU other than the CPU that started
it. If you look at flush_tlb_func_common(), it's doing some work that
is rather fancier than just flushing the TLB. By replacing it with
just a pure flush on Xen or Hyper-V, we're losing the potential CR3
switch and this bit:

    /* Both paths above update our state to mm_tlb_gen. */
    this_cpu_write(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen, mm_tlb_gen);

Skipping the former can hurt idle performance, although we should
consider just disabling all the lazy optimizations on systems with PV
flush. (And I've asked Intel to help us out here in future hardware.
I have no idea what the result of asking will be.) Skipping the
cpu_tlbstate write means that we will do unnecessary flushes in the
future, and that's not doing us any favors.

In principle, we should be able to do something like:

    flush_tlb_multi(...);
    for (each CPU that got flushed) {
        spin_lock(something appropriate?);
        per_cpu_write(cpu, cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen, f->new_tlb_gen);
        spin_unlock(...);
    }

with the caveat that it's more complicated than this if the flush is a
partial flush, and that we'll want to check that the ctx_id still
matches, etc.
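
To make the bookkeeping concrete, here is a minimal userspace sketch of
that loop. It is a model, not kernel code: struct cpu_tlb_model, the
cpu_model[] array, the model_*() helpers and the per-CPU spinlock are all
hypothetical stand-ins, the PV flush itself is reduced to a comment, and
only the full-flush case is handled. It shows the two checks mentioned
above: the ctx_id must still match, and tlb_gen only ever moves forward.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical userspace stand-in for one CPU's slice of cpu_tlbstate. */
struct cpu_tlb_model {
    atomic_flag lock;   /* stand-in for "something appropriate" */
    uint64_t ctx_id;    /* context currently loaded on this CPU */
    uint64_t tlb_gen;   /* generation this CPU is known to have flushed to */
};

static struct cpu_tlb_model cpu_model[NR_CPUS];

/* Reset every modelled CPU to an unlocked, zeroed state. */
static void model_init(void)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        atomic_flag_clear(&cpu_model[cpu].lock);
        cpu_model[cpu].ctx_id = 0;
        cpu_model[cpu].tlb_gen = 0;
    }
}

/*
 * Record that a full flush brought @cpu up to @new_tlb_gen.
 * Returns true only if the stored generation was advanced.
 */
static bool model_note_remote_flush(unsigned int cpu, uint64_t ctx_id,
                                    uint64_t new_tlb_gen)
{
    struct cpu_tlb_model *m;
    bool updated = false;

    assert(cpu < NR_CPUS);
    m = &cpu_model[cpu];

    /* Spin until the per-CPU lock is ours. */
    while (atomic_flag_test_and_set_explicit(&m->lock, memory_order_acquire))
        ;

    /* Skip CPUs that switched context or are already up to date. */
    if (m->ctx_id == ctx_id && m->tlb_gen < new_tlb_gen) {
        m->tlb_gen = new_tlb_gen;
        updated = true;
    }

    atomic_flag_clear_explicit(&m->lock, memory_order_release);
    return updated;
}

/* Model of "flush, then update tlb_gen for each CPU that got flushed". */
static unsigned int model_flush_multi(uint32_t cpumask, uint64_t ctx_id,
                                      uint64_t new_tlb_gen)
{
    unsigned int updated = 0;

    /* The PV-assisted flush of every CPU in @cpumask would happen here. */

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        if ((cpumask & (UINT32_C(1) << cpu)) &&
            model_note_remote_flush(cpu, ctx_id, new_tlb_gen))
            updated++;
    }
    return updated;
}
```

Calling model_flush_multi(0x7, 7, 5) on CPUs 0-2 would advance only those
CPUs that still have ctx_id 7 loaded and a tlb_gen below 5; the rest are
left untouched. A partial flush would need more care than this.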

Does this make sense?