Re: [PATCH] KVM: x86: Track supported ARCH_CAPABILITIES in kvm_caps

From: Pawan Gupta
Date: Thu May 25 2023 - 16:34:17 EST

On Thu, May 25, 2023 at 11:42:24PM +0800, Xiaoyao Li wrote:
> On 5/23/2023 11:34 AM, Pawan Gupta wrote:
> > > If a guest is exposed without ARCH_CAP_TAA_NO, ARCH_CAP_MDS_NO,
> > > ARCH_CAP_FB_CLEAR, vmx_update_fb_clear_dis() will leave
> > > vmx->disable_fb_clear as true. So VERW doesn't clear Fill Buffer for guest.
> > > But from the guest's point of view, VERW is expected to clear Fill Buffer.
> >
> > That is correct, but whether VERW clears the CPU buffers also depends on
> > whether the hardware is affected. Enumerating MD_CLEAR alone does not
> > guarantee that VERW will flush CPU buffers. This was true even before
> > MMIO Stale Data was discovered.
> >
> > If host(hardware) enumerates:
> >
> >   MD_CLEAR | MDS_NO | VERW behavior
> >   ---------|--------|-------------------
> >      1     |   0    | Clears CPU buffers
> >
> > But on MDS-mitigated hardware (MDS_NO=1), if the guest enumerates:
> >
> >   MD_CLEAR | MDS_NO | VERW behavior
> >   ---------|--------|-------------------------------------
> >      1     |   0    | Not guaranteed to clear CPU buffers
> >
> > After MMIO Stale Data, FB_CLEAR_DIS was introduced to keep this behavior
> > intact (for hardware that is not affected by MDS/TAA).
> Sorry, I don't understand it. What is the behavior?

That on mitigated hardware VERW may not clear the micro-architectural
buffers.

There are many micro-architectural buffers; VERW only clears the
affected ones. This is indicated in section "Fill Buffer Clearing
Operations" of [1].

Some processors may enumerate MD_CLEAR because they overwrite all
buffers affected by MDS/TAA, but they do not overwrite fill buffer
values. This is because fill buffers are not susceptible to MDS or TAA
on those processors.

For processors affected by FBSDP where MD_CLEAR may not overwrite fill
buffer values, Intel has released microcode updates that enumerate
FB_CLEAR so that VERW does overwrite fill buffer values.
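
To make the enumeration rules above concrete, here is a minimal sketch of
how software could reason about what VERW is guaranteed to do. It is not
kernel code: struct host_caps and both helper names are hypothetical and
exist only for illustration. The booleans stand for the CPUID MD_CLEAR bit
and the MDS_NO and FB_CLEAR bits of IA32_ARCH_CAPABILITIES as enumerated by
the hardware, and the logic encodes only the statements made in this mail.

```c
#include <stdbool.h>

/* Hypothetical container for the bits the hardware enumerates. */
struct host_caps {
	bool md_clear;	/* CPUID MD_CLEAR */
	bool mds_no;	/* IA32_ARCH_CAPABILITIES MDS_NO */
	bool fb_clear;	/* IA32_ARCH_CAPABILITIES FB_CLEAR */
};

/*
 * VERW is only guaranteed to overwrite the MDS-affected buffers when
 * MD_CLEAR is enumerated and the part is not marked MDS_NO. With
 * MDS_NO=1, MD_CLEAR may still be set, but no overwriting is promised.
 */
static bool verw_guarantees_mds_buffer_clear(const struct host_caps *c)
{
	return c->md_clear && !c->mds_no;
}

/*
 * MD_CLEAR alone says nothing about fill buffers in this sketch; only
 * an explicit FB_CLEAR is treated as a promise that VERW overwrites
 * fill buffer values.
 */
static bool verw_guarantees_fill_buffer_clear(const struct host_caps *c)
{
	return c->fb_clear;
}
```

With md_clear=1, mds_no=0 the first helper returns true, which matches the
first table quoted above. With md_clear=1, mds_no=1 it returns false, which
is the "not guaranteed" case a guest runs into when it is shown MDS_NO=0 on
mitigated hardware.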

> > If the userspace
> > truly wants the guest to have VERW flush behavior, it can export
> >
> > I see your point that from a guest's perspective it is being lied about
> > VERW behavior. OTOH, I am not sure if it is a good enough reason for
> > mitigated hardware to keep the overhead of clearing micro-architectural
> > buffers for generations of CPUs.
> The user takes the responsibility because it requests the specific feature
> combination for its guest.

As I understand, the MD_CLEAR enumeration on mitigated hardware is done
purely for VM migration compatibility. Software is not expected to use
VERW on mitigated hardware. Below is from the MDS documentation [2]:

Future processors set the MDS_NO bit in IA32_ARCH_CAPABILITIES to
indicate they are not affected by microarchitectural data sampling.
Such processors will continue to enumerate the MD_CLEAR bit in CPUID.
As none of these data buffers are vulnerable to exposure on such
parts, no data buffer overwriting is required or expected for such
parts, despite the MD_CLEAR indication. Software should look to the
MDS_NO bit to determine whether buffer overwriting mitigations are