Re: [PATCH v3 6/6] KVM: VMX: Move VERW closer to VMentry for MDS mitigation

From: Sean Christopherson
Date: Thu Oct 26 2023 - 17:23:04 EST


On Thu, Oct 26, 2023, Pawan Gupta wrote:
> On Thu, Oct 26, 2023 at 12:30:55PM -0700, Sean Christopherson wrote:
> > > if (static_branch_unlikely(&vmx_l1d_should_flush))
> > > vmx_l1d_flush(vcpu);
> >
> > There's an existing bug here. vmx_l1d_flush() is not guaranteed to do a flush in
> > "conditional mode", and is not guaranteed to do a ucode-based flush
>
> AFAICT, it is based on the condition whether after a VMexit any
> sensitive data could have been touched or not. If L1TF mitigation
> doesn't consider certain data sensitive and skips L1D flush, executing
> VERW isn't giving any protection, since that data can anyways be leaked
> from L1D using L1TF.

That assumes vcpu->arch.l1tf_flush_l1d is 100% precise and accurate, which is most
definitely not the case. You're also preventing the admin from choosing between
being super paranoid (always flush L1D) and mostly paranoid (conditionally flush
L1D, always flush CPU buffers).

AIUI, flushing the L1D is crazy expensive compared to flushing the CPU buffers,
so it's entirely plausible for someone to want to choose the mostly paranoid
option.
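
To make the two policies concrete, here is a rough userspace sketch of the
decision, not actual KVM code. Everything in it is hypothetical and exists only
for illustration: the enum, the struct, plan_vmenter_flush(), and the
"ucode_flush" parameter, which models the assumption that only a ucode-based
L1D flush also clears the CPU buffers while a software fallback flush does not.

```c
#include <stdbool.h>

/* Hypothetical model of the admin's L1D flush policy (not the kernel's enum). */
enum l1d_mode { L1D_FLUSH_NEVER, L1D_FLUSH_COND, L1D_FLUSH_ALWAYS };

/* What the model decides to do before VM-Enter. */
struct flush_plan {
	bool flush_l1d;	/* expensive: flush the whole L1D */
	bool verw;	/* cheaper: clear CPU buffers via VERW */
};

/*
 * l1d_hint:      stand-in for vcpu->arch.l1tf_flush_l1d, which may be imprecise.
 * mds_enabled:   whether the CPU buffer clearing mitigation is on.
 * ucode_flush:   ASSUMPTION: true if the L1D flush is ucode-based and therefore
 *                also clears the CPU buffers.
 */
static struct flush_plan plan_vmenter_flush(enum l1d_mode mode, bool l1d_hint,
					    bool mds_enabled, bool ucode_flush)
{
	struct flush_plan plan = { false, false };
	bool cpu_buffers_flushed = false;

	/* "Super paranoid" always flushes; conditional mode trusts the hint. */
	if (mode == L1D_FLUSH_ALWAYS || (mode == L1D_FLUSH_COND && l1d_hint)) {
		plan.flush_l1d = true;
		cpu_buffers_flushed = ucode_flush;
	}

	/* "Mostly paranoid": clear CPU buffers even when the L1D flush is skipped. */
	if (mds_enabled && !cpu_buffers_flushed)
		plan.verw = true;

	return plan;
}
```

With L1D_FLUSH_COND and a clear hint, the model skips the L1D flush but still
issues VERW, which is the "mostly paranoid" behavior. Tying VERW to the L1D
flush decision would remove that option.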

Side topic, isn't the NMI path missing a call to kvm_set_cpu_l1tf_flush_l1d()?

> > /*
> > * The MMIO stale data vulnerability is a subset of the general MDS
> > * vulnerability, i.e. this is mutually exclusive with the VERW that's
> > * done just before VM-Enter. The vulnerability requires the attacker,
> > * i.e. the guest, to do MMIO, so this "clear" can be done earlier.
> > */
> > if (static_branch_unlikely(&mmio_stale_data_clear) &&
> > !cpu_buffers_flushed && kvm_arch_has_assigned_device(vcpu->kvm))
> > mds_clear_cpu_buffers();
>
> This is certainly better, but I don't know what scenario is this helping with.

Heh, that's how I feel about moving VERW to just before VM-Enter. I have a hard
time believing there's meaningful sensitive data that's accessed in __vmx_vcpu_run().
The closest thing is probably CR2, but that's a very dubious vector since CR2 will
hold a guest value for most VM-Enters.

I'm not against moving VERW close to VM-Enter because it's relatively straightforward,
but if we're going to be super paranoid, why not go all the way and not have to
worry about what ifs?