Re: [PATCH v2 00/11] KVM: Support guest MAXPHYADDR < host MAXPHYADDR

From: Tom Lendacky
Date: Fri Jun 19 2020 - 17:52:24 EST


On 6/19/20 10:39 AM, Mohammed Gamal wrote:
When EPT/NPT is enabled, KVM does not really look at guest physical
address size. Address bits above maximum physical memory size are reserved.
Because KVM does not look at these guest physical addresses, it currently
effectively supports guest physical address sizes equal to the host.

This can be a problem in a mixed setup of machines with 5-level page
tables and machines with 4-level page tables, as live migration can change
MAXPHYADDR while the guest runs, which can theoretically introduce bugs.

In this patch series we add checks on guest physical addresses in EPT
violation/misconfig and NPF vmexits and if needed inject the proper
page faults in the guest.
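
The guest physical address check described above can be sketched roughly
as follows. This is only an illustration and not the code from the series:
the helper names gpa_rsvd_mask() and gpa_is_illegal() are made up here, and
the guest MAXPHYADDR is assumed to be passed in as a plain bit count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: mask of all address bits at or above the guest's
 * MAXPHYADDR. The guest treats these bits as reserved.
 */
static inline uint64_t gpa_rsvd_mask(unsigned int guest_maxphyaddr)
{
	/* x86-64 physical addresses are at most 52 bits wide. */
	assert(guest_maxphyaddr >= 1 && guest_maxphyaddr <= 52);
	return ~((UINT64_C(1) << guest_maxphyaddr) - 1);
}

/* True if the GPA sets any bit the guest considers out of range. */
static inline bool gpa_is_illegal(uint64_t gpa, unsigned int guest_maxphyaddr)
{
	return (gpa & gpa_rsvd_mask(guest_maxphyaddr)) != 0;
}
```

With a 40-bit guest, for example, an address with bit 40 set would be
flagged, while the same address on a 46-bit guest would not.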

A more subtle issue is when the host MAXPHYADDR is larger than that of the
guest. Page faults caused by reserved bits on the guest won't cause an EPT
violation/NPF and hence we also check guest MAXPHYADDR and add PFERR_RSVD_MASK
error code to the page fault if needed.

I'm probably missing something here, but I'm confused by this statement. Is this for a case where a page has been marked not present and the guest has also set what it believes are reserved bits? Then when the page is accessed, the guest sees a page fault without the error code for reserved bits? If so, my understanding is that is architecturally correct. P=0 is considered higher priority than other page faults, at least on AMD. So if you have a P=0 and other issues exist within the PTE, AMD will report the P=0 fault and that's it.

The priority of other page fault conditions when P=1 is not defined and I don't think we guarantee that you would get all error codes on fault. Software is always expected to address the page fault and retry, and it may get another page fault when it does, with a different error code. Assuming the other errors are addressed, eventually the reserved bits would cause an NPF and that could be detected by the HV and handled appropriately.
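
To make the priority point concrete, here is a small sketch of how the
P and RSVD error code bits would come out for a single PTE under the
ordering described above. It models only those two bits (not W, U or I),
the function name pte_fault_error_code() is made up for illustration, and
the reserved-bit mask is assumed to be supplied by the caller. The two
error code defines use the architectural bit positions (bit 0 and bit 3).

```c
#include <stdint.h>

#define PFERR_PRESENT_MASK (UINT32_C(1) << 0)	/* fault on a present page */
#define PFERR_RSVD_MASK    (UINT32_C(1) << 3)	/* reserved bit set in PTE */
#define PTE_PRESENT        (UINT64_C(1) << 0)

/*
 * Hypothetical model: return the P/RSVD error code bits for a fault on
 * this PTE, or -1 if neither condition causes a fault.
 */
static inline int pte_fault_error_code(uint64_t pte, uint64_t rsvd_mask)
{
	/* P=0 wins: reserved bits are not examined, so RSVD is not reported. */
	if (!(pte & PTE_PRESENT))
		return 0;

	/* Present entry with reserved bits set: P=1 and RSVD=1. */
	if (pte & rsvd_mask)
		return (int)(PFERR_PRESENT_MASK | PFERR_RSVD_MASK);

	return -1;
}
```

So a not-present PTE that also has a reserved bit set yields an error code
without RSVD, and RSVD shows up only once the entry is marked present.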


The last 3 patches (i.e. SVM bits and patch 11) are not intended for
immediate inclusion and probably need more discussion.
We've been noticing some unexpected behavior in handling NPF vmexits
on AMD CPUs (see individual patches for details), and thus we are
proposing a workaround (see last patch) that adds a capability that
userspace can use to decide how to deal with hosts that might have
issues supporting guest MAXPHYADDR < host MAXPHYADDR.

Also, something to consider. On AMD, when memory encryption is enabled (via the SYS_CFG MSR), a guest can actually have a larger MAXPHYADDR than the host. How do these patches all play into that?
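
As a rough illustration of how the host can end up with fewer usable
physical address bits: CPUID Fn8000_001F reports a physical address bit
reduction in EBX[11:6] that applies when memory encryption is enabled. The
sketch below takes the raw EBX value as a parameter instead of executing
CPUID, and the function name host_effective_phys_bits() is made up for
this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: usable host physical address bits, given the raw
 * width from CPUID Fn8000_0008 and the EBX value of CPUID Fn8000_001F.
 * EBX[11:6] is the physical address bit reduction.
 */
static inline unsigned int host_effective_phys_bits(unsigned int raw_phys_bits,
						     uint32_t cpuid_8000001f_ebx,
						     bool mem_encrypt_enabled)
{
	unsigned int reduction = (cpuid_8000001f_ebx >> 6) & 0x3f;

	/* The reduction only applies once encryption is turned on. */
	if (!mem_encrypt_enabled || reduction > raw_phys_bits)
		return raw_phys_bits;

	return raw_phys_bits - reduction;
}
```

With made-up inputs of a 48-bit raw width and a reduction of 5, the host is
left with 43 bits, so a guest that still reports 48 would have the larger
MAXPHYADDR.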

Thanks,
Tom



Mohammed Gamal (7):
KVM: x86: Add helper functions for illegal GPA checking and page fault
injection
KVM: x86: mmu: Move translate_gpa() to mmu.c
KVM: x86: mmu: Add guest physical address check in translate_gpa()
KVM: VMX: Add guest physical address check in EPT violation and
misconfig
KVM: SVM: introduce svm_need_pf_intercept
KVM: SVM: Add guest physical address check in NPF/PF interception
KVM: x86: SVM: VMX: Make GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
configurable

Paolo Bonzini (4):
KVM: x86: rename update_bp_intercept to update_exception_bitmap
KVM: x86: update exception bitmap on CPUID changes
KVM: VMX: introduce vmx_need_pf_intercept
KVM: VMX: optimize #PF injection when MAXPHYADDR does not match

arch/x86/include/asm/kvm_host.h | 10 ++------
arch/x86/kvm/cpuid.c | 2 ++
arch/x86/kvm/mmu.h | 6 +++++
arch/x86/kvm/mmu/mmu.c | 12 +++++++++
arch/x86/kvm/svm/svm.c | 41 +++++++++++++++++++++++++++---
arch/x86/kvm/svm/svm.h | 6 +++++
arch/x86/kvm/vmx/nested.c | 28 ++++++++++++--------
arch/x86/kvm/vmx/vmx.c | 45 +++++++++++++++++++++++++++++----
arch/x86/kvm/vmx/vmx.h | 6 +++++
arch/x86/kvm/x86.c | 29 ++++++++++++++++++++-
arch/x86/kvm/x86.h | 1 +
include/uapi/linux/kvm.h | 1 +
12 files changed, 158 insertions(+), 29 deletions(-)