Re: [PATCH] x86/hyper-v: use cheaper HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} hypercalls when possible

From: Vitaly Kuznetsov
Date: Wed Jun 20 2018 - 04:34:50 EST


"Michael Kelley (EOSG)" <Michael.H.Kelley@xxxxxxxxxxxxx> writes:

>> -----Original Message-----
>> From: linux-kernel-owner@xxxxxxxxxxxxxxx <linux-kernel-owner@xxxxxxxxxxxxxxx> On Behalf
>> Of Vitaly Kuznetsov
>> Sent: Friday, June 15, 2018 9:30 AM
>> To: x86@xxxxxxxxxx
>> Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; KY Srinivasan
>> <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger
>> <sthemmin@xxxxxxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar
>> <mingo@xxxxxxxxxx>; H. Peter Anvin <hpa@xxxxxxxxx>; Tianyu Lan
>> <Tianyu.Lan@xxxxxxxxxxxxx>
>> Subject: [PATCH] x86/hyper-v: use cheaper HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}
>> hypercalls when possible
>>
>> While working on Hyper-V style PV TLB flush support in KVM I noticed that
>> real Windows guests use the TLB flush hypercall in a somewhat smarter way:
>> when the flush needs to be performed on a subset of the first 64 vCPUs or
>> on all present vCPUs, Windows avoids the more expensive hypercalls which
>> support sparse CPU sets and uses their 'cheap' counterparts. This means that
>> HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED name is actually a misnomer: EX
>> hypercalls (which support sparse CPU sets) are "available", not
>> "recommended". This makes sense as they are actually harder to parse.
>>
>> Nothing stops us from being equally 'smart' in Linux too. Switch to
>> doing cheaper hypercalls whenever possible.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> ---
>
> This is a good idea. We should probably do the same with the hypercalls for sending
> IPIs -- try the simpler version first and move to the more complex _EX version only
> if necessary.
>
> A complication: We've recently found a problem with the code for doing IPI
> hypercalls, and the bug affects the TLB flush code as well. As secondary CPUs
> are started, there's a window of time where the hv_vp_index entry for a
> secondary CPU is uninitialized. We are seeing IPIs happening in that window, and
> the IPI hypercall code uses the uninitialized hv_vp_index entry. The same thing
> could happen with the TLB flush hypercall code. I didn't actually see any occurrences of
> the TLB case in my tracing, but we should fix it anyway in case a TLB flush gets
> added at some point in the future.
>
> KY has a patch coming. In the patch, hv_cpu_number_to_vp_number()
> and cpumask_to_vpset() can both return U32_MAX if they encounter an
> uninitialized hv_vp_index entry, and the code needs to be able to bail out to
> the native functions for that particular IPI or TLB flush operation. Once the
> initialization of secondary CPUs is complete, the uninitialized situation won't
> happen again, and the hypercall path will always be used.

Sure, with TLB flush we can always fall back to doing it natively (by
sending IPIs).
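
As an illustration only, the decision discussed above could be sketched
roughly as follows. This is standalone userspace C, not the kernel code or
the actual patch: vp_index_sketch, VP_INVAL, enum flush_path and
pick_flush_path() are hypothetical names made up for the example, and the
"all present vCPUs" case is left out for brevity. The one detail taken from
the thread is that an uninitialized hv_vp_index entry is reported as U32_MAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 128          /* hypothetical CPU limit for the sketch */
#define VP_INVAL UINT32_MAX         /* "VP index not initialized yet" marker */

enum flush_path {
	FLUSH_NATIVE,   /* fall back to the native, IPI-based flush */
	FLUSH_CHEAP,    /* non-EX hypercall taking a plain 64-bit VP mask */
	FLUSH_EX,       /* EX hypercall taking a sparse VP set */
};

/* Stand-in for the kernel's per-CPU hv_vp_index table (assumption). */
static uint32_t vp_index_sketch[NR_CPUS_SKETCH];

/*
 * Decide how to flush the CPUs marked true in cpus[0..nr_cpus).
 * *mask_out is written only when FLUSH_CHEAP is returned.
 */
static enum flush_path pick_flush_path(const bool *cpus, int nr_cpus,
				       uint64_t *mask_out)
{
	uint64_t mask = 0;
	bool need_ex = false;

	assert(nr_cpus <= NR_CPUS_SKETCH);

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		uint32_t vp;

		if (!cpus[cpu])
			continue;

		vp = vp_index_sketch[cpu];
		if (vp == VP_INVAL)
			return FLUSH_NATIVE;    /* secondary CPU still coming up */

		if (vp >= 64)
			need_ex = true;         /* does not fit a 64-bit mask */
		else
			mask |= 1ULL << vp;
	}

	if (need_ex)
		return FLUSH_EX;

	*mask_out = mask;
	return FLUSH_CHEAP;
}
```

The uninitialized check comes first on purpose: a single bad entry sends
the whole operation down the native path, whichever hypercall flavor would
otherwise have been chosen.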

>
> We'll need to coordinate on these patches. Be aware that the IPI flavor of the
> bug is currently causing random failures when booting 4.18 RC1 on Hyper-V VMs
> with large vCPU counts.

Thanks for the heads up! This particular patch is just an optimization,
so there's no rush; the IPI fix is definitely more important.

>
> Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>

Thanks!

--
Vitaly