Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX

From: Sean Christopherson
Date: Wed Jan 08 2025 - 12:18:27 EST


On Wed, Jan 08, 2025, Borislav Petkov wrote:
> > And do you know what 0xd23f corresponds to?
>
> How's that:
>
> $ objdump -D arch/x86/kvm/kvm.ko
> ...
> 000000000000d1a0 <kvm_vcpu_halt>:
> d1a0: e8 00 00 00 00 call d1a5 <kvm_vcpu_halt+0x5>
> d1a5: 55 push %rbp
> ...
>
> d232: e8 09 93 ff ff call 6540 <kvm_vcpu_check_block>
> d237: 85 c0 test %eax,%eax
> d239: 0f 88 f6 01 00 00 js d435 <kvm_vcpu_halt+0x295>
> d23f: f3 90 pause
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> d241: e8 00 00 00 00 call d246 <kvm_vcpu_halt+0xa6>
> d246: 48 89 c3 mov %rax,%rbx
> d249: e8 00 00 00 00 call d24e <kvm_vcpu_halt+0xae>
> d24e: 84 c0 test %al,%al
>
>
> Which makes sense :-)

Ooh, it's just the MSR writes that increased. I misinterpreted the profile
statement and thought that something in KVM was jumping from ~0% to 4.31%. If
the cost really is just this:

1.66% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
1.50% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe

vs

1.01% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr
0.81% qemu-system-x86 [kernel.kallsyms] [k] native_write_msr_safe

then my vote is to go with the user_return approach. It's unfortunate that
restoring full speculation may be delayed until a CPU exits to userspace or KVM
is unloaded, but given that enable_virt_at_load is enabled by default, in practice
it's likely still far better than effectively always running the host with reduced
speculation.
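
To make the tradeoff concrete, here is a rough userspace model of the deferred
restore, not the actual patch. Every name below (cpu_model, model_guest_enter(),
model_exit_to_user(), REDUCED_SPEC_BIT) is made up for illustration, and the bit
position is a placeholder. The point is only that the MSR is written once on the
first guest entry and once more when the CPU finally returns to userspace, no
matter how many VM-Enter/VM-Exit round trips happen in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; none of these names are real kernel APIs. */
struct cpu_model {
	uint64_t msr_val;         /* current value of the modeled MSR */
	bool     restore_pending; /* a restore is armed for return to userspace */
	unsigned msr_writes;      /* number of modeled WRMSRs */
};

#define REDUCED_SPEC_BIT (1ULL << 0) /* placeholder bit, not the real one */

static void model_wrmsr(struct cpu_model *cpu, uint64_t val)
{
	cpu->msr_val = val;
	cpu->msr_writes++;
}

/* Before entering a guest: write the MSR only if the bit isn't already set. */
static void model_guest_enter(struct cpu_model *cpu)
{
	if (!(cpu->msr_val & REDUCED_SPEC_BIT))
		model_wrmsr(cpu, cpu->msr_val | REDUCED_SPEC_BIT);
	cpu->restore_pending = true;
}

/* On return to userspace: restore full speculation, at most once. */
static void model_exit_to_user(struct cpu_model *cpu)
{
	if (!cpu->restore_pending)
		return;
	assert(cpu->msr_val & REDUCED_SPEC_BIT);
	model_wrmsr(cpu, cpu->msr_val & ~REDUCED_SPEC_BIT);
	cpu->restore_pending = false;
}
```

In this model, a thousand back-to-back guest entries followed by one exit to
userspace cost two MSR writes in total. The flip side is also visible: the bit
stays set, i.e. the host keeps running with reduced speculation, until
model_exit_to_user() actually runs.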

> > Yeah, especially if this is all an improvement over the existing mitigation.
> > Though since it can impact non-virtualization workloads, maybe it should be a
> > separately selectable mitigation? I.e. not piggybacked on top of ibpb-vmexit?
>
> Well, ibpb-on-vmexit is your typical cloud provider scenario where you address
> the VM/VM attack vector by doing an IBPB on VMEXIT.

No? svm_vcpu_load() emits IBPB when switching VMCBs, i.e. when switching between
vCPUs that may live in separate security contexts. That IBPB is skipped when
X86_FEATURE_IBPB_ON_VMEXIT is enabled, because the host is trusted to not attack
its guests.
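
As a sketch of the behavior I'm describing (a simplified model, not the real
svm_vcpu_load(); pcpu_model, vmcb_model and model_vcpu_load() are invented names,
and the model treats the very first load on a CPU as a switch):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; these are not the kernel's data structures. */
struct vmcb_model {
	int id;
};

struct pcpu_model {
	const struct vmcb_model *current_vmcb; /* last VMCB loaded on this CPU */
	bool     ibpb_on_vmexit;               /* models X86_FEATURE_IBPB_ON_VMEXIT */
	unsigned ibpb_count;                   /* barriers issued at vCPU load */
};

/* Load a vCPU's VMCB on a physical CPU, issuing a barrier on VMCB switches. */
static void model_vcpu_load(struct pcpu_model *cpu,
			    const struct vmcb_model *vmcb)
{
	if (cpu->current_vmcb == vmcb)
		return; /* same VMCB as last time, nothing to do */

	cpu->current_vmcb = vmcb;

	/* The load-time barrier is skipped when IBPB is done on every VM-Exit. */
	if (!cpu->ibpb_on_vmexit)
		cpu->ibpb_count++;
}
```

With ibpb_on_vmexit clear, loading VMCBs A, A, B, A issues three barriers in this
model; with it set, none are issued at load time.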

> This SRSO_MSR_FIX thing protects the *host* from a malicious guest so you
> need both enabled for full protection on the guest/host vector.

If reducing speculation protects the host, why wouldn't that also protect other
guests? The CPU needs to bounce through the host before entering a different
guest.

And if for some reason reducing speculation doesn't suffice, wouldn't it be
better to fall back to doing IBPB only when switching VMCBs?