Re: [PATCH 0/7] KVM: x86: APX reg prep work
From: Paolo Bonzini
Date: Mon Apr 06 2026 - 17:42:10 EST
On Mon, Apr 6, 2026 at 17:28 Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > You're right about fast paths...
>
> Ya, potential fastpath usage is why I wanted to just context switch around
> entry/exit.
>
> > so something like the attached patch.
> > It is not too bad to translate into assembly, where it could use
> > alternatives (in the same way as
> > RESTORE_GUEST_SPEC_CTRL/RESTORE_GUEST_SPEC_CTRL_BODY) in place of
> > static_cpu_has(). Maybe it's best to bite the bullet and do it
> > already...
>
> My strong vote is to context switch in assembly, but _conditionally_ context
> switch R16-R31.
>
> But that second paragraph isn't quite correct, at least not for KVM. Specifically,
> "need a branch prior to regaining speculative safety" isn't correct, as that holds
> true if and only if "regaining speculative safety" requires executing code that
> might access R16-R31. If we massage __vmx_vcpu_run() to restore SPEC_CTRL in
> assembly, same as __svm_vcpu_run(), then __{svm,vmx}_vcpu_run() can simply context
> switch R16-R31 if and only if APX is enabled in XCR0.
I might even have patches for that lying around (the SPEC_CTRL part).
> KVM always intercepts XCR0 writes (when XCR0 isn't context switched by "hardware",
> i.e. ignoring SEV-ES+ and TDX guests), and IIUC all access to R16-R31 is gated on
> XCR0.APX=1
Right, fortunately.
> . So unless I'm missing something (or hardware is flawed and lets the
> guest speculative consume R16-R31, which would be sad), it's perfectly safe to
> run the guest with host state in R16-R31.
>
> That would avoid pointlessly context switching 16 registers when APX is not being
> used by the guest, and would avoid having to write XCR0 in the fastpath.
For now yes, but once/if the kernel starts using the registers there is
no way around writing XCR0 for APX-disabled guests in the fast path.
If we ignore that, we can keep the guest XCR0 loaded all the time for
now, and that would mean:
- move SPEC_CTRL handling to assembly
- leave XCR0 handling unchanged
- check XCR0 in addition to static_cpu_has(X86_FEATURE_APX) to make
the r16-r31 swap conditional
> > - if (vcpu->arch.xcr0 != kvm_host.xcr0)
> > + /*
> > + * Do not load the definitive XCR0 yet; vcpu->arch.early_xcr0 keeps
> > + * APX enabled so that the kernel can move to and from r16...r31.
> > + */
> > + if (vcpu->arch.early_xcr0 != kvm_host.xcr0)
> > xsetbv(XCR_XFEATURE_ENABLED_MASK,
> > - load_guest ? vcpu->arch.xcr0 : kvm_host.xcr0);
> > + load_guest ? vcpu->arch.early_xcr0 : kvm_host.xcr0);
>
> Even _if_ we want to play XCR0 games,
(which depends on whether we want to be ready for kernel usage of APX, right?)
> tracking early_xcr0 is unnecessary. This can be:
>
> /*
> * XCR0 is context switched around VM-Enter/VM-Exit if APX is enabled
> * in the host but not in the guest.
> */
> if (vcpu->arch.xcr0 != kvm_host.xcr0 &&
> (!cpu_feature_enabled(X86_FEATURE_APX) ||
> vcpu->arch.xcr0 & XFEATURE_MASK_APX))
> xsetbv(XCR_XFEATURE_ENABLED_MASK,
> load_guest ? vcpu->arch.xcr0 : kvm_host.xcr0);
This is a bit more complex, however, because in the end early_xcr0
precomputes exactly the same conditions and optimizations. For example...
> > +void __kvm_load_guest_apx(struct kvm_vcpu *vcpu)
> > +{
> > + if (vcpu->arch.early_xcr0 != vcpu->arch.xcr0)
> > + xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
>
> This is wrong. The "real" xcr0 needs to be loaded *after* accessing R16+.
... this is actually the same optimization you mention above: the real
xcr0 only needs to be loaded if APX is off in the guest, and in that
case there is no need to load r16-r31. So you can load xcr0 first, and
then load whichever components (for now only APX) still need to be
swapped.
> > + if (!(vcpu->arch.xcr0 & XFEATURE_MASK_APX))
> > + return;
... because the loads are conditional on APX being enabled in the real xcr0.
Paolo
> > +
> > + WARN_ON_ONCE(!irqs_disabled());
> > +
> > + asm("mov %[r16], %%r16\n"
> > + "mov %[r17], %%r17\n" // ...
> > + : : [r16] "m" (vcpu->arch.regs[16]),
> > + [r17] "m" (vcpu->arch.regs[17]));
> > +}
>