Re: [PATCH v3 2/6] KVM: X86: Implement PV IPIs in linux guest

From: Radim Krcmar
Date: Fri Jul 20 2018 - 09:12:12 EST


2018-07-20 11:45+0800, Wanpeng Li:
> On Fri, 20 Jul 2018 at 07:05, David Matlack <dmatlack@xxxxxxxxxx> wrote:
> >
> > On Mon, Jul 2, 2018 at 11:23 PM Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
> > >
> > > From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> > >
> > > Implement paravirtual apic hooks to enable PV IPIs.
> >
> > Very cool. Thanks for working on this!
>
> Thanks David!
>
> >
> > >
> > > apic->send_IPI_mask
> > > apic->send_IPI_mask_allbutself
> > > apic->send_IPI_allbutself
> > > apic->send_IPI_all
> > >
> > > PV IPIs support VMs with at most 128 vCPUs, which is big enough for
> > > cloud environments currently,
> >
> > From the Cloud perspective, 128 vCPUs is already obsolete. GCE's
> > n1-ultramem-160 VMs have 160 vCPUs where the maximum APIC ID is 231.
> > I'd definitely prefer an approach that scales to higher APIC IDs, like
> > Paolo's offset idea.
>
> Ok, I will try the offset method in the next version.
>
> >
> > To Radim's point of real world performance testing, do you know what
> > is the primary source of multi-target IPIs? If it's TLB shootdowns we
> > might get a bigger bang for our buck with a PV TLB Shootdown.

I assume it is TLB shootdowns by a large margin, but I have never
profiled it.

We could add a more complex PV TLB shootdown that does the whole TLB
invalidation inside KVM, but I don't think it would be significantly
faster, because we need to force a VM exit if the target VCPU is
running. With PV IPI, we get exit-less IPI injection and a VM exit for
TLB invalidation.

> The "Function Call interrupts": besides TLB shootdowns, there are a lot
> of callers of smp_call_function_many() in the Linux kernel that try to
> run a function on a set of other CPUs. TLB shootdowns can still benefit
> from PV IPIs even if PV TLB Shootdown is enabled, since IPIs still have
> to be sent to the vCPUs which are active, and those incur vmexits. PV
> IPIs will benefit both vCPU overcommit and non-overcommit (where PV TLB
> Shootdown can't help) scenarios. Btw, Hyper-V also implements PV IPIs
> even though PV TLB Shootdown is present.
> https://lkml.org/lkml/2018/7/3/537

Hyper-V is in a better spot, as guests can use the hypervisor's VPIDX
for cpu bitmaps and pass it to the IPI hypercall, so there is no APIC ID
translation on either side -- the IPI hypercall seems worth it just for
the simpler logic.

Hyper-V can also address up to 4096 VCPUs in one hypercall (optimized as
a sparse 2-level 64-ary tree), so we might want to do something like
that (with a higher ceiling), as the probability of all targets being
within 128 APIC IDs should rapidly diminish with a growing number of
VCPUs.
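
To make the "sparse 2 level 64-ary" idea concrete, here is a rough
sketch of such a set: one 64-bit mask says which 64-wide banks are
present, and only those banks are passed along, packed in ascending
order. The struct and helper names below are made up for illustration;
this is neither the Hyper-V ABI nor something KVM implements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BANK_BITS 64	/* CPUs per bank (second level) */
#define MAX_BANKS 64	/* banks per set (first level): 64 * 64 = 4096 */

/* Hypothetical layout, for illustration only. */
struct sparse_cpu_set {
	uint64_t valid_banks;		/* bit b set => bank b is present */
	uint64_t banks[MAX_BANKS];	/* present banks only, packed, ascending */
	unsigned int nr_banks;		/* number of packed banks in use */
};

/* Portable population count: clears the lowest set bit each iteration. */
static unsigned int popcount64(uint64_t x)
{
	unsigned int n = 0;

	while (x) {
		x &= x - 1;
		n++;
	}
	return n;
}

/* Packed slot of a bank = number of present banks with a lower index. */
static unsigned int bank_slot(const struct sparse_cpu_set *s, unsigned int bank)
{
	return popcount64(s->valid_banks & ((UINT64_C(1) << bank) - 1));
}

/* Add CPU index idx to the set; returns -1 if idx is out of range. */
static int sparse_set_add(struct sparse_cpu_set *s, unsigned int idx)
{
	unsigned int bank = idx / BANK_BITS;
	unsigned int pos;

	if (bank >= MAX_BANKS)
		return -1;

	pos = bank_slot(s, bank);
	if (!(s->valid_banks & (UINT64_C(1) << bank))) {
		/* New bank: shift the higher banks up to keep the array packed. */
		assert(s->nr_banks < MAX_BANKS);
		memmove(&s->banks[pos + 1], &s->banks[pos],
			(s->nr_banks - pos) * sizeof(s->banks[0]));
		s->banks[pos] = 0;
		s->valid_banks |= UINT64_C(1) << bank;
		s->nr_banks++;
	}
	s->banks[pos] |= UINT64_C(1) << (idx % BANK_BITS);
	return 0;
}

/* Returns nonzero if CPU index idx is in the set. */
static int sparse_set_test(const struct sparse_cpu_set *s, unsigned int idx)
{
	unsigned int bank = idx / BANK_BITS;

	if (bank >= MAX_BANKS || !(s->valid_banks & (UINT64_C(1) << bank)))
		return 0;
	return (s->banks[bank_slot(s, bank)] >> (idx % BANK_BITS)) & 1;
}

/* Bytes that would need to be passed: the bank mask plus the present banks. */
static size_t sparse_set_payload_bytes(const struct sparse_cpu_set *s)
{
	return sizeof(s->valid_banks) + s->nr_banks * sizeof(s->banks[0]);
}
```

With targets at indices 0 and 231 (the maximum APIC ID David mentions),
only banks 0 and 3 are present, so the payload in this sketch is 24
bytes, while a flat 128-bit bitmap could not express index 231 at all.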