Re: [PATCH] KVM: x86: Ignore MSR_AMD64_BU_CFG access

From: Sean Christopherson
Date: Mon Sep 25 2023 - 14:30:16 EST


On Mon, Sep 25, 2023, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@xxxxxxxxxx>
>
> Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen
> since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED +
> STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP
> in the guest kernel).
>
> This is because Windows tries to set bit 8 in MSR_AMD64_BU_CFG and can't
> handle receiving a #GP when doing so.

Any idea why?

> Give this MSR the same treatment that commit 2e32b7190641
> ("x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs") gave
> MSR_AMD64_BU_CFG2 under justification that this MSR is baremetal-relevant
> only.

Ugh, that commit set a terrible example. The kernel change should have been
conditioned on !X86_FEATURE_HYPERVISOR if the MSR only has meaning for bare metal.

> Although apparently it was then needed for Linux guests, not Windows as in
> this case.
>
> With this change, the aforementioned guest setup is able to finish booting
> successfully.
>
> This issue can be reproduced either on a Summit Ridge Ryzen (with
> just "-cpu host") or on a Naples EPYC (with "-cpu host,stepping=1" since
> EPYC is ordinarily stepping 2).

This seems like it needs to be tagged for stable?

> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> ---
> arch/x86/include/asm/msr-index.h | 1 +
> arch/x86/kvm/x86.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 1d111350197f..c80a5cea80c4 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -553,6 +553,7 @@
> #define MSR_AMD64_CPUID_FN_1 0xc0011004
> #define MSR_AMD64_LS_CFG 0xc0011020
> #define MSR_AMD64_DC_CFG 0xc0011022
> +#define MSR_AMD64_BU_CFG 0xc0011023

What document actually defines this MSR? All of the PPRs I can find for Family 17h
list it as:

MSRC001_1023 [Table Walker Configuration] (Core::X86::Msr::TW_CFG)

> #define MSR_AMD64_DE_CFG 0xc0011029
> #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT 1
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9f18b06bbda6..2f3cdd798185 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3639,6 +3639,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_UCODE_WRITE:
> case MSR_VM_HSAVE_PA:
> case MSR_AMD64_PATCH_LOADER:
> + case MSR_AMD64_BU_CFG:

I am sorely tempted to say that this should be solved in userspace via MSR
filtering. IIUC, the MSR truly is model specific, and I don't love the idea of
effectively ignoring accesses to unknown MSRs. And I really, really don't want
KVM to pivot on FMS.

Paolo, is punting to userspace reasonable, or should we just bite the bullet in
KVM and commit to ignoring MSRs like this?

> case MSR_AMD64_BU_CFG2:
> case MSR_AMD64_DC_CFG:
> case MSR_F15H_EX_CFG:
> @@ -4062,6 +4063,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_K8_INT_PENDING_MSG:
> case MSR_AMD64_NB_CFG:
> case MSR_FAM10H_MMIO_CONF_BASE:
> + case MSR_AMD64_BU_CFG:
> case MSR_AMD64_BU_CFG2:
> case MSR_IA32_PERF_CTL:
> case MSR_AMD64_DC_CFG: