RE: [PATCH] x86/hyper-v: use cheaper HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} hypercalls when possible
From: KY Srinivasan
Date: Wed Jun 20 2018 - 13:55:48 EST
> -----Original Message-----
> From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Sent: Wednesday, June 20, 2018 1:24 AM
> To: Michael Kelley (EOSG) <Michael.H.Kelley@xxxxxxxxxxxxx>
> Cc: x86@xxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>;
> Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>;
> Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>;
> H. Peter Anvin <hpa@xxxxxxxxx>; Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>
> Subject: Re: [PATCH] x86/hyper-v: use cheaper
> HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} hypercalls when possible
>
> "Michael Kelley (EOSG)" <Michael.H.Kelley@xxxxxxxxxxxxx> writes:
>
> >> -----Original Message-----
> >> From: linux-kernel-owner@xxxxxxxxxxxxxxx
> >> <linux-kernel-owner@xxxxxxxxxxxxxxx> On Behalf Of Vitaly Kuznetsov
> >> Sent: Friday, June 15, 2018 9:30 AM
> >> To: x86@xxxxxxxxxx
> >> Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >> KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>;
> >> Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>;
> >> Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>;
> >> H. Peter Anvin <hpa@xxxxxxxxx>; Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>
> >> Subject: [PATCH] x86/hyper-v: use cheaper
> >> HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} hypercalls when possible
> >>
> >> While working on Hyper-V style PV TLB flush support in KVM I noticed that
> >> real Windows guests use the TLB flush hypercall in a somewhat smarter way:
> >> when the flush needs to be performed on a subset of the first 64 vCPUs or
> >> on all present vCPUs, Windows avoids the more expensive hypercalls which
> >> support sparse CPU sets and uses their 'cheap' counterparts. This means
> >> that the HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED name is actually a
> >> misnomer: EX hypercalls (which support sparse CPU sets) are "available",
> >> not "recommended". This makes sense as they are actually harder to parse.
> >>
> >> Nothing stops us from being equally 'smart' in Linux too. Switch to
> >> doing cheaper hypercalls whenever possible.
> >>
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> >> ---
> >
> > This is a good idea. We should probably do the same with the hypercalls for
> > sending IPIs -- try the simpler version first and move to the more complex
> > _EX version only if necessary.
> >
> > A complication: We've recently found a problem with the code for doing IPI
> > hypercalls, and the bug affects the TLB flush code as well. As secondary
> > CPUs are started, there's a window of time where the hv_vp_index entry for
> > a secondary CPU is uninitialized. We are seeing IPIs happening in that
> > window, and the IPI hypercall code uses the uninitialized hv_vp_index
> > entry. Same thing could happen with the TLB flush hypercall code. I didn't
> > actually see any occurrences of the TLB case in my tracing, but we should
> > fix it anyway in case a TLB flush gets added at some point in the future.
> >
> > KY has a patch coming. In the patch, hv_cpu_number_to_vp_number()
> > and cpumask_to_vpset() can both return U32_MAX if they encounter an
> > uninitialized hv_vp_index entry, and the code needs to be able to bail out
> > to the native functions for that particular IPI or TLB flush operation.
> > Once the initialization of secondary CPUs is complete, the uninitialized
> > situation won't happen again, and the hypercall path will always be used.
>
> Sure, with TLB flush we can always fall back to doing it natively (by sending
> IPIs).

I am surprised that we have not seen this issue in the TLB flush enlightenments.

K. Y

> >
> > We'll need to coordinate on these patches. Be aware that the IPI flavor of
> > the bug is currently causing random failures when booting 4.18 RC1 on
> > Hyper-V VMs with large vCPU counts.
>
> Thanks for the heads up! This particular patch is just an optimization,
> so there's no rush; the IPI fix is definitely more important.
>
> >
> > Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
>
> Thanks!
>
> --
> Vitaly
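
For reference, the selection logic discussed in this thread can be sketched
roughly as below. This is only an illustration under stated assumptions, not
the patch itself: choose_flush_path(), enum flush_path, VP_INDEX_UNSET and
SIMPLE_MASK_BITS are hypothetical names made up for the example, and the
caller is assumed to have already translated Linux CPU numbers to VP indices.
The only details taken from the thread are the 64-vCPU limit of the cheap
hypercalls, the "all present vCPUs" case, and U32_MAX as the marker for an
uninitialized hv_vp_index entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; these are not kernel identifiers. */
#define VP_INDEX_UNSET   UINT32_MAX /* uninitialized entry (U32_MAX in the thread) */
#define SIMPLE_MASK_BITS 64u        /* cheap hypercalls cover only the first 64 vCPUs */

enum flush_path {
	FLUSH_NATIVE,      /* bail out to the native, IPI-based flush */
	FLUSH_SIMPLE_ALL,  /* cheap hypercall with an "all processors" flag */
	FLUSH_SIMPLE_MASK, /* cheap hypercall with a 64-bit processor mask */
	FLUSH_EX_SPARSE,   /* _EX hypercall with a sparse VP set */
};

/*
 * Pick the cheapest usable path for a flush targeting the VP indices in
 * vp[0..n-1]. On FLUSH_SIMPLE_MASK, *mask_out holds the 64-bit mask;
 * otherwise it is set to 0.
 */
static enum flush_path choose_flush_path(const uint32_t *vp, size_t n,
					 int all_present, int ex_available,
					 uint64_t *mask_out)
{
	uint64_t mask = 0;
	int needs_ex = 0;
	size_t i;

	*mask_out = 0;

	/* Flushing every present vCPU needs no per-CPU translation at all. */
	if (all_present)
		return FLUSH_SIMPLE_ALL;

	for (i = 0; i < n; i++) {
		/* An uninitialized entry means this operation must go native. */
		if (vp[i] == VP_INDEX_UNSET)
			return FLUSH_NATIVE;

		if (vp[i] >= SIMPLE_MASK_BITS)
			needs_ex = 1;
		else
			mask |= 1ULL << vp[i];
	}

	if (!needs_ex) {
		*mask_out = mask;
		return FLUSH_SIMPLE_MASK;
	}

	/* Some target is beyond the first 64 vCPUs: sparse set or nothing. */
	return ex_available ? FLUSH_EX_SPARSE : FLUSH_NATIVE;
}
```

With targets {0, 3, 63} the sketch picks the cheap mask variant; adding VP 64
to the set pushes it to the _EX path (or to the native path if _EX is not
available), and any U32_MAX entry makes that one operation go native.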