Re: [PATCH v11 16/46] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi()

From: Vitaly Kuznetsov
Date: Fri Oct 21 2022 - 08:42:01 EST


Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> writes:

> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>
>> On Tue, Oct 04, 2022, Vitaly Kuznetsov wrote:
>
> ...
>
>>>
>>> -	if (all_cpus) {
>>> -		kvm_send_ipi_to_many(kvm, vector, NULL);
>>> -	} else {
>>> -		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask, vcpu_mask);
>>> -
>>> -		kvm_send_ipi_to_many(kvm, vector, vcpu_mask);
>>> -	}
>>> +	kvm_hv_send_ipi_to_many(kvm, vector, all_cpus ? NULL : sparse_banks, valid_bank_mask);
>>
>> Any objection to not using a ternary operator?
>>
>> 	if (all_cpus)
>> 		kvm_hv_send_ipi_to_many(kvm, vector, NULL, 0);
>> 	else
>> 		kvm_hv_send_ipi_to_many(kvm, vector, sparse_banks, valid_bank_mask);
>>
>
> Not at all,
>
>> Mostly because it's somewhat arbitrary that earlier code ensures valid_bank_mask
>> is set in the all_cpus=true case, e.g. arguably KVM doesn't need to do the var_cnt
>> sanity check in the all_cpus case:
>>
>> 		all_cpus = send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL;
>> 		if (all_cpus)
>> 			goto check_and_send_ipi;
>>
>> 		valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
>> 		if (hc->var_cnt != hweight64(valid_bank_mask))
>> 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
>>
>> 		if (!hc->var_cnt)
>> 			goto ret_success;
>>
>
> I think 'var_cnt' (== hweight64(valid_bank_mask)) has to be checked in
> the 'all_cpus' case, especially in kvm_hv_flush_tlb(): otherwise the code
> which reads TLB flush entries will read them from the wrong offset
> (data_offset/consumed_xmm_halves). The problem is less severe in
> kvm_hv_send_ipi() as there's no data after the CPU banks.
>
> At the bare minimum, the "KVM: x86: hyper-v: Handle
> HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently" patch from this
> series will have to be adjusted. I *think* mandating var_cnt==0 in the
> 'all_cpus' case is OK but I don't recall such a requirement from the TLFS,
> maybe it's safer to just adjust 'data_offset'/'consumed_xmm_halves' even
> in the 'all_cpus' case.
>
> Let me do some tests...

"We can neither confirm nor deny the existence of the problem". Windows
guests seem to be smart enough to avoid using *_EX hypercalls altogether
for the "all cpus" case (as the non-EX versions are good enough). Let's
keep allowing non-zero var_cnt for the 'all cpus' case for now and think
about hardening it later...
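
To illustrate the offset concern, here is a rough userspace sketch, not
the actual KVM code. The struct layout and all demo_* names are made up
for this example; the only thing taken from the discussion above is that
var_cnt 64-bit sparse banks sit between the fixed part of the request and
the TLB flush entries, and that var_cnt is expected to equal
hweight64(valid_bank_mask).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fixed part of an *_EX flush request. */
struct demo_flush_ex_hdr {
	uint64_t address_space;
	uint64_t flags;
	uint64_t format;          /* sparse set vs. "all vCPUs" */
	uint64_t valid_bank_mask; /* one bit per sparse bank that follows */
};

/* Portable population count, standing in for the kernel's hweight64(). */
static unsigned int demo_hweight64(uint64_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Byte offset of the first TLB flush entry: the fixed header plus var_cnt
 * 64-bit sparse banks. The banks are skipped even when the request targets
 * all vCPUs, because the guest may still have placed them in the buffer.
 */
static size_t demo_flush_entries_offset(unsigned int var_cnt)
{
	return sizeof(struct demo_flush_ex_hdr) + (size_t)var_cnt * sizeof(uint64_t);
}
```

With var_cnt == 2, the entries start 16 bytes past the header; a reader
that ignored var_cnt in the 'all cpus' case would interpret the two
sparse banks as flush entries.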

--
Vitaly