Re: [PATCH v2 2/7] KVM: x86: Extract VMXON and EFER.SVME enablement to kernel

From: Xu Yilun

Date: Thu Dec 18 2025 - 21:30:20 EST


On Wed, Dec 17, 2025 at 11:01:59AM -0800, Sean Christopherson wrote:
> On Wed, Dec 17, 2025, Xu Yilun wrote:
> > > >+#define x86_virt_call(fn) \
> > > >+({ \
> > > >+ int __r; \
> > > >+ \
> > > >+ if (IS_ENABLED(CONFIG_KVM_INTEL) && \
> > > >+ cpu_feature_enabled(X86_FEATURE_VMX)) \
> > > >+ __r = x86_vmx_##fn(); \
> > > >+ else if (IS_ENABLED(CONFIG_KVM_AMD) && \
> > > >+ cpu_feature_enabled(X86_FEATURE_SVM)) \
> > > >+ __r = x86_svm_##fn(); \
> > > >+ else \
> > > >+ __r = -EOPNOTSUPP; \
> > > >+ \
> > > >+ __r; \
> > > >+})
> > > >+
> > > >+int x86_virt_get_cpu(int feat)
> > > >+{
> > > >+ int r;
> > > >+
> > > >+ if (!x86_virt_feature || x86_virt_feature != feat)
> > > >+ return -EOPNOTSUPP;
> > > >+
> > > >+ if (this_cpu_inc_return(virtualization_nr_users) > 1)
> > > >+ return 0;
> > >
> > > Should we assert that preemption is disabled? Calling this API when preemption
> > > is enabled is wrong.
> > >
> > > Maybe use __this_cpu_inc_return(), which already verifies preemption status.
>
> I always forget that the double-underscores have the checks.
>
> > Is it better we explicitly assert the preemption for x86_virt_get_cpu()
> > rather than embed the check in __this_cpu_inc_return()? We are not just
> > protecting the racing for the reference counter. We should ensure the
> > "counter increase + x86_virt_call(get_cpu)" can't be preempted.
>
> I don't have a strong preference. Using __this_cpu_inc_return() without any
> nearby preemption_{enable,disable}() calls makes it quite clears that preemption
> is expected to be disabled by the caller. But I'm also ok being explicit.

Looking into __this_cpu_inc_return(), it ends up calling
check_preemption_disabled(), which doesn't strictly require preemption
to be disabled. It only ensures the context can't migrate to another
CPU. If the caller is in cpuhp context, preemption is still possible.

But in x86_virt_get_cpu() we need preemption to be disabled. Otherwise
caller A can increment the counter and get preempted before it does the
actual VMXON. Caller B then comes in, wrongly concludes that VMX is
already on, and fails on its subsequent VMX operations.

On second thought, maybe we should disable preemption inside
x86_virt_get_cpu() to close the counter vs. VMXON race. That is purely
an internal detail of this kAPI.
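
Something along these lines, as a rough userspace mock-up that only
shows the ordering (not tested against the actual series).
preempt_depth, vmx_is_on and mock_virt_enable() are made-up stand-ins,
the last one for x86_virt_call(get_cpu). The per-CPU counter is modeled
as a plain int, the feature check from the patch is left out, and the
unwind on failure is my assumption:

```c
#include <errno.h>

/* Userspace stand-ins for illustration only, not the kernel helpers. */
static int preempt_depth;
static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)  { preempt_depth--; }

static int virtualization_nr_users;	/* models the per-CPU counter */
static int vmx_is_on;			/* models this CPU's VMX state */

/* Hypothetical stand-in for x86_virt_call(get_cpu), i.e. the VMXON. */
static int mock_virt_enable(void)
{
	/* The race is only closed if we cannot be preempted here. */
	if (!preempt_depth)
		return -EINVAL;

	vmx_is_on = 1;
	return 0;
}

int x86_virt_get_cpu(void)
{
	int r = 0;

	preempt_disable();

	/* Counter bump and VMXON form one non-preemptible unit. */
	if (++virtualization_nr_users == 1) {
		r = mock_virt_enable();
		if (r)
			virtualization_nr_users--;	/* assumed unwind */
	}

	preempt_enable();
	return r;
}
```

With this, a second caller can only observe a non-zero counter after the
first caller's enable step has completed, and callers don't need to care
about the preemption requirement themselves.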

Thanks,
Yilun