Re: [PATCH 2/2] KVM: x86: Simplify APICv update request logic

From: Maxim Levitsky
Date: Sun Oct 10 2021 - 08:49:20 EST


On Fri, 2021-10-08 at 18:01 -0700, Sean Christopherson wrote:
> Drop confusing and flawed code that intentionally sets the per-VM APICv
> inhibit mask after sending KVM_REQ_APICV_UPDATE to all vCPUs. The code
> is confusing because it's not obvious that there's no race between a CPU
> seeing the request and consuming the new mask. The code works only
> because the request handling path takes the same lock, i.e. responding
> vCPUs will be blocked until the full update completes.

Actually this code is here on purpose:

While it is true that the main reader of apicv_inhibit_reasons (the
KVM_REQ_APICV_UPDATE handler) takes kvm->arch.apicv_update_lock, and will
therefore see the correct value regardless of this patch, the reason this code
first raises KVM_REQ_APICV_UPDATE and only then updates
arch.apicv_inhibit_reasons is that I put a warning into svm_vcpu_run which
checks that the per-vCPU AVIC inhibit state matches the global AVIC inhibit state.

That warning proved to be very useful to ensure that AVIC inhibit works correctly.
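
For illustration, such a check might look like the following sketch (kernel-context pseudocode, not the actual patch; the exact form and placement inside svm_vcpu_run are assumptions on my part):

```c
/* Pseudocode sketch: at VM-entry, warn if this vCPU's APICv/AVIC state
 * disagrees with the VM-wide inhibit mask. kvm_apicv_activated() reads
 * kvm->arch.apicv_inhibit_reasons without taking apicv_update_lock. */
WARN_ON_ONCE(kvm_vcpu_apicv_active(vcpu) != kvm_apicv_activated(vcpu->kvm));
```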

If this patch is applied, the warning can no longer work reliably unless
it takes the apicv_update_lock, which would have a performance hit.

The reason is that if we just update apicv_inhibit_reasons first, we can race
with a vCPU that is about to re-enter guest mode and spuriously trigger the warning.

Best regards,
Maxim Levitsky

>
> The concept is flawed because ordering the mask update after the request
> can't be relied upon for correct behavior. The only guarantee provided
> by kvm_make_all_cpus_request() is that all vCPUs exited the guest. It
> does not guarantee all vCPUs are waiting on the lock. E.g. a vCPU could
> be in the process of handling an emulated MMIO APIC access page fault
> that occurred before the APICv update was initiated, and thus toggling
> and reading the per-VM field would be racy. If correctness matters, KVM
> either needs to use the per-vCPU status (if appropriate), take the lock,
> or have some other mechanism that guarantees the per-VM status is correct.
>
> Cc: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/x86.c | 16 +++++++---------
> 1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4a52a08707de..960c2d196843 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9431,29 +9431,27 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_update_apicv);
>
> void __kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
> {
> - unsigned long old, new;
> + unsigned long old;
>
> if (!kvm_x86_ops.check_apicv_inhibit_reasons ||
> !static_call(kvm_x86_check_apicv_inhibit_reasons)(bit))
> return;
>
> - old = new = kvm->arch.apicv_inhibit_reasons;
> + old = kvm->arch.apicv_inhibit_reasons;
>
> if (activate)
> - __clear_bit(bit, &new);
> + __clear_bit(bit, &kvm->arch.apicv_inhibit_reasons);
> else
> - __set_bit(bit, &new);
> + __set_bit(bit, &kvm->arch.apicv_inhibit_reasons);
>
> - if (!!old != !!new) {
> + if (!!old != !!kvm->arch.apicv_inhibit_reasons) {
> trace_kvm_apicv_update_request(activate, bit);
> kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
> - kvm->arch.apicv_inhibit_reasons = new;
> - if (new) {
> + if (kvm->arch.apicv_inhibit_reasons) {
> unsigned long gfn = gpa_to_gfn(APIC_DEFAULT_PHYS_BASE);
> kvm_zap_gfn_range(kvm, gfn, gfn+1);
> }
> - } else
> - kvm->arch.apicv_inhibit_reasons = new;
> + }
> }
> EXPORT_SYMBOL_GPL(__kvm_request_apicv_update);
>