Re: [PATCH] x86/hyper-v: use cheaper HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST, SPACE} hypercalls when possible

From: Vitaly Kuznetsov
Date: Tue Jun 19 2018 - 08:58:26 EST


Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:

> On Fri, 15 Jun 2018, Vitaly Kuznetsov wrote:
>> * Fills in gva_list starting from offset. Returns the number of items added.
>> @@ -93,10 +95,19 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus,
>> if (cpumask_equal(cpus, cpu_present_mask)) {
>> flush->flags |= HV_FLUSH_ALL_PROCESSORS;
>> } else {
>> + /*
>> + * It is highly likely that VP ids are in ascending order
>> + * matching Linux CPU ids; Check VP index for the highest CPU
>> + * in the supplied set to see if EX hypercall is required.
>> + * This is just a best guess but should work most of the time.
>
> TLB flushing based on 'best guess' and 'should work most of the time' is
> not a brilliant approach.
>

Oh no no no, that's not what I meant :-)

We have the following problem: from the supplied CPU set we need to
figure out if we can get away with HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,
SPACE} hypercalls which are cheaper or if we need to use more expensing
HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST, SPACE}_EX ones. The dividing line is
the highest VP_INDEX of the supplied CPU set: in case it is < 64 cheaper
hypercalls are OK. Now how do we check that? In the patch I have the
following approach:
1) Check VP number for the highest CPU in the supplied set. In case it
is > 64 we for sure need more expensive hypercalls. This is the "guess"
which works most of the time because Linux CPU ids usually match
VP_INDEXes.

2) In case the answer to the previous question was negative we start
preparing input for the cheaper hypercall. However, if while walking the
CPU set we meet a CPU with VP_INDEX higher than 64 we'll discard the
prepared input and switch to the more expensive hypercall.

Said that the 'guess' here is just an optimization to avoid walking the
whole CPU set when we find the required answer quickly by looking at the
highest bit. This will help big systems with hundreds of CPUs.

--
Vitaly