Re: [PATCH v2] KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID

From: Jim Mattson
Date: Wed Jan 25 2023 - 19:58:56 EST


On Wed, Jan 25, 2023 at 2:44 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 1/25/23 23:09, Jim Mattson wrote:
> > The topology leaves returned by KVM_GET_SUPPORTED_CPUID *for over a
> > decade* have been passed through unmodified from the host. They have
> > never made sense for KVM_SET_CPUID2, with the unlikely exception of a
> > whole-host VM.
>
> True, unfortunately people have not read the nonexistent documentation
> and they are:
>
> 1) rarely adjusting correctly all of 0xB, 0x1F and 0x8000001E;
>
> 2) never bounding CPUID[EAX=0].EAX to a known CPUID leaf, resulting for
> example in inconsistencies between 0xB and 0x1F.
>
> *But* (2) should not be needed unless you care about maintaining
> homogeneous CPUID within a VM migration pool. For something like
> kvmtool, having to do (2) would be a workaround for the bug that this
> patch fixes.

Maybe we should just populate up to leaf 3. :-)

> > Our VMM populates the topology of the guest CPUID table on its own, as
> > any VMM must. However, it uses the host topology (which
> > KVM_GET_SUPPORTED_CPUID has been providing pass-through *for over a
> > decade*) to see if the requested guest topology is possible.
>
> Ok, thanks; this is useful to know.
>
> > Changing a long-established ABI in a way that breaks userspace
> > applications is a bad practice. I didn't think we, as a community, did
> > that. I didn't realize that we were only catering to open source
> > implementations here.
>
> We aren't. But the open source implementations provide some guidance as
> to how the API is being used in the wild, and what the pitfalls are.
>
> You wrote it yourself: any VMM must either populate the topology on its
> own, or possibly fill it with zeros. Returning a value that is
> extremely unlikely to be used is worse in pretty much every way (apart
> from not breaking your VMM, of course).

I've complained about this particular ioctl more than I can remember.
This is just one of its many problems.

> With a total of six known users (QEMU, crosvm, kvmtool, firecracker,
> rust-vmm, and the Google VMM), KVM is damned if it reverts the patch and
> damned if it doesn't. There is a tension between fixing the one VMM
> that was using KVM_GET_SUPPORTED_CPUID correctly and now breaks loudly,
> and fixing 3-4 that were silently broken and are now fixed. I will
> probably send a patch to crosvm, though.
>
> The VMM being _proprietary_ doesn't really matter, however it does
> matter to me that it is not _public_: it is only used within Google, and
> the breakage is neither hard to fix in the VMM nor hard to temporarily
> avoid by reverting the patch in the Google kernel.

Sadly, there isn't a single kernel involved. People running our VMM on
their desktops are going to be impacted as soon as this patch hits
that distro. (I don't know if I can say which distro that is.) So, now
we have to get the VMM folks to urgently accommodate this change and
get a new distribution out.