RE: [PATCH v2] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode

From: Michael Kelley
Date: Thu Apr 04 2024 - 13:49:04 EST


From: Sean Christopherson <seanjc@xxxxxxxxxx> Sent: Thursday, April 4, 2024 10:28 AM
>
> On Thu, Apr 04, 2024, Andrew Cooper wrote:
> > On 04/04/2024 5:18 pm, Sean Christopherson wrote:
> > > On Mon, Mar 25, 2024, Michael Kelley wrote:
> > >>> static void setup_pcid(void)
> > >>> {
> > >>> + const struct x86_cpu_id *invlpg_miss_match;
> > >>> +
> > >>> if (!IS_ENABLED(CONFIG_X86_64))
> > >>> return;
> > >>>
> > >>> if (!boot_cpu_has(X86_FEATURE_PCID))
> > >>> return;
> > >>>
> > >>> - if (x86_match_cpu(invlpg_miss_ids)) {
> > >>> + invlpg_miss_match = x86_match_cpu(invlpg_miss_ids);
> > >>> + if (invlpg_miss_match &&
> > >>> + invlpg_miss_match->driver_data > boot_cpu_data.microcode) {
> > >>> pr_info("Incomplete global flushes, disabling PCID");
> > >>> setup_clear_cpu_cap(X86_FEATURE_PCID);
> > >>> return;
> > >> As noted in similar places where microcode versions are
> > >> checked, hypervisors often lie to guests about microcode versions.
> > >> For example, see comments in bad_spectre_microcode(). I
> > >> know Hyper-V guests always see the microcode version as
> > >> 0xFFFFFFFF (max u32 value). So in a Hyper-V guest the above
> > >> code will always leave PCID enabled.
> > > Enumerating broken PCID support to a guest is very arguably a hypervisor bug.
> > > Hypervisors also lie to guest about FMS. As KVM *user* with affected hardware
> > > (home box), I would want the kernel to assume PCID works if X86_FEATURE_HYPERVISOR
> > > is present.
> >
> > I have very mixed feelings about all of this.
> >
> > The Gracemont INVLPG vs PCID bug was found in the field, so PCID will
> > have been enumerated to guests at that point.  You can't blindly drop
> > PCID until the VM next reboots.
> >
> > A related example.  I wrote the patch to hide XSAVES to work around an
> > AMD erratum where XSAVEC sufficed, and the consequences were so dire for
> > some versions of Windows that there was a suggestion to simply revert
> > the workaround to make VMs run again.  Windows intentionally asserts
> > sanity (== expectations) in what it can see; I have no idea whether it
> > would object in this case but hiding PCID is definitely playing with fire.
>
> Yeah, KVM users got burned by that too. d52734d00b8e ("KVM: x86: Give a
> hint when Win2016 might fail to boot due to XSAVES erratum").
>
> But practically speaking, that ship has sailed for KVM, as KVM advertises PCID
> support if and only if the host kernel supports it, i.e. clearing X86_FEATURE_PCID
> in setup_pcid() means KVM stops advertising to userspace, which in turn means
> QEMU stops enumerating it VMs by default.

FWIW, my home laptop has a RaptorLake CPU with the flaw. It's running an
up-to-date Windows 11/Hyper-V and a Linux guest VM. The microcode has
not been updated to a version with the fix. But the Linux guest still
sees CPUID 0x00000001 ECX with the "PCID" flag (bit 17) as set. So evidently
Hyper-V is not filtering this flaw.

I agree one could argue that it is a hypervisor bug to present PCID to the guest
in this situation. It's a lot cleaner to not have a guest be checking FMS and
microcode versions. But whether that's practical in the real world, at least
for Hyper-V, I don't know. What's the real impact of running with PCID while
the flaw is still present? I don’t know the history here ...

Michael