Re: [PATCH] KVM: x86: Unify cross-vCPU IBPB

From: Sean Christopherson
Date: Wed Mar 26 2025 - 16:46:17 EST


On Thu, Mar 20, 2025, Yosry Ahmed wrote:
> arch/x86/kvm/svm/svm.c | 24 ------------------------
> arch/x86/kvm/svm/svm.h | 2 --
> arch/x86/kvm/vmx/nested.c | 6 +++---
> arch/x86/kvm/vmx/vmx.c | 15 ++-------------
> arch/x86/kvm/vmx/vmx.h | 3 +--
> arch/x86/kvm/x86.c | 19 ++++++++++++++++++-
> 6 files changed, 24 insertions(+), 45 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8abeab91d329d..89bda9494183e 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1484,25 +1484,10 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
> return err;
> }
>
> -static void svm_clear_current_vmcb(struct vmcb *vmcb)
> -{
> - int i;
> -
> - for_each_online_cpu(i)
> - cmpxchg(per_cpu_ptr(&svm_data.current_vmcb, i), vmcb, NULL);

Ha! I was going to say that processing only online CPUs is likely wrong, but
you made that change on the fly. I'll probably split that to a separate commit
since it's technically a bug fix.

A few other nits, but I'll take care of them when applying.

Overall, nice cleanup!

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 69c20a68a3f01..4034190309a61 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4961,6 +4961,8 @@ static bool need_emulate_wbinvd(struct kvm_vcpu *vcpu)
> return kvm_arch_has_noncoherent_dma(vcpu->kvm);
> }
>
> +static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu);
> +
> void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> {
> struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> @@ -4983,6 +4985,18 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>
> kvm_x86_call(vcpu_load)(vcpu, cpu);
>
> + if (vcpu != per_cpu(last_vcpu, cpu)) {

I have a slight preference for using this_cpu_read() (and this_cpu_write()) so
that it's more obvious this is operating on the current CPU.

> + /*
> + * Flush the branch predictor when switching vCPUs on the same physical
> + * CPU, as each vCPU should have its own branch prediction domain. No
> + * IBPB is needed when switching between L1 and L2 on the same vCPU
> + * unless IBRS is advertised to the vCPU. This is handled on the nested
> + * VM-Exit path.
> + */
> + indirect_branch_prediction_barrier();
> + per_cpu(last_vcpu, cpu) = vcpu;
> + }
> +
> /* Save host pkru register if supported */
> vcpu->arch.host_pkru = read_pkru();
>
> @@ -12367,10 +12381,13 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
>
> void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> {
> - int idx;
> + int idx, cpu;
>
> kvmclock_reset(vcpu);
>
> + for_each_possible_cpu(cpu)
> + cmpxchg(per_cpu_ptr(&last_vcpu, cpu), vcpu, NULL);

It's definitely worth keeping a version of SVM's comment explaining the cross-CPU
nullification.
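
For anyone reading along, here's a rough userspace model of why the
nullification matters. It is not kernel code: NR_CPUS, struct vcpu, the model_*
helpers and nr_barriers are all made up for illustration, and C11 atomics stand
in for per-CPU variables and cmpxchg(). The concern, as I understand the SVM
comment, is that the memory of a freed vCPU can be recycled for a new vCPU, in
which case a stale last_vcpu pointer would match and the IBPB would be skipped.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 4		/* hypothetical CPU count for the model */

struct vcpu { int id; };	/* stand-in for struct kvm_vcpu */

/* Models the per-CPU last_vcpu pointer from the patch. */
static _Atomic(struct vcpu *) last_vcpu[NR_CPUS];
static int nr_barriers;		/* counts modeled IBPBs */

/* Models the check added to kvm_arch_vcpu_load(). */
static void model_vcpu_load(struct vcpu *vcpu, int cpu)
{
	if (vcpu != atomic_load(&last_vcpu[cpu])) {
		/* Stands in for indirect_branch_prediction_barrier(). */
		nr_barriers++;
		atomic_store(&last_vcpu[cpu], vcpu);
	}
}

/* Models the cross-CPU nullification in kvm_arch_vcpu_destroy(). */
static void model_vcpu_destroy(struct vcpu *vcpu)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct vcpu *expected = vcpu;

		/* Clear the slot only if it still points at the dying vCPU. */
		atomic_compare_exchange_strong(&last_vcpu[cpu], &expected, NULL);
	}
}
```

In this model, loading a vCPU, destroying it, and then loading a "new" vCPU
that happens to live at the same address bumps nr_barriers a second time. If
the model_vcpu_destroy() call is dropped, the second load sees a matching
pointer and the barrier is skipped.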

> +
> kvm_x86_call(vcpu_free)(vcpu);
>
> kmem_cache_free(x86_emulator_cache, vcpu->arch.emulate_ctxt);
> --
> 2.49.0.395.g12beb8f557-goog
>