Re: [PATCH] KVM: x86: VMX: Make smaller physical guest address space support user-configurable

From: Paolo Bonzini
Date: Thu Sep 03 2020 - 16:02:49 EST


On 03/09/20 20:32, Jim Mattson wrote:
>> [Checking writes to CR3] would be way too slow. Even the current
>> trapping of present #PF can introduce some slowdown depending on the
>> workload.
>
> Yes, I was concerned about that...which is why I would not want to
> enable pedantic mode. But if you're going to be pedantic, why go
> halfway?

Because I am not sure about any guest, even KVM, caring about setting
bits 51:46 in CR3.

>>> Does the typical guest care about whether or not setting any of the
>>> bits 51:46 in a PFN results in a fault?
>>
>> At least KVM with shadow pages does, which is a bit niche but it shows
>> that you cannot really rely on no one doing it. As you guessed, the
>> main usage of the feature is for machines with 5-level page tables where
>> there are no reserved bits; emulating smaller MAXPHYADDR allows
>> migrating VMs from 4-level page-table hosts.
>>
>> Enabling per-VM would not be particularly useful IMO because if you want
>> to disable this code you can just set host MAXPHYADDR = guest
>> MAXPHYADDR, which should be the common case unless you want to do that
>> kind of Skylake to Icelake (or similar) migration.
>
> I expect that it will be quite common to run 46-bit wide legacy VMs on
> Ice Lake hardware, as Ice Lake machines start showing up in
> heterogeneous data centers.

If you'll be okay with running _all_ 46-bit wide legacy VMs without
MAXPHYADDR emulation, that's what this patch is for. If you'll be okay
with running _only_ 46-bit wide VMs without emulation, you still don't
need special enabling per-VM beyond the automatic one based on
CPUID[0x8000_0008]. Do you think you'll need to enable it for some
special 46-bit VMs?

Paolo