Re: [PATCH] KVM: SVM: Do not virtualize MSR accesses for APIC LVTT register

From: Suravee Suthikulpanit
Date: Thu Jul 28 2022 - 04:55:41 EST


Maxim,

On 7/28/22 2:38 PM, Maxim Levitsky wrote:
On Sun, 2022-07-24 at 22:34 -0500, Suravee Suthikulpanit wrote:
AMD does not support APIC TSC-deadline timer mode. AVIC hardware
will generate GP fault when guest kernel writes 1 to bits [18]
of the APIC LVTT register (offset 0x32) to set the timer mode.
(Note: bit 18 is reserved on AMD system).

Therefore, always intercept and let KVM emulate the MSR accesses.

Fixes: f3d7c8aa6882 ("KVM: SVM: Fix x2APIC MSRs interception")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
---
arch/x86/kvm/svm/svm.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index aef63aae922d..3e0639a68385 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -118,7 +118,14 @@ static const struct svm_direct_access_msrs {
{ .index = X2APIC_MSR(APIC_ESR), .always = false },
{ .index = X2APIC_MSR(APIC_ICR), .always = false },
{ .index = X2APIC_MSR(APIC_ICR2), .always = false },
- { .index = X2APIC_MSR(APIC_LVTT), .always = false },
+
+ /*
+ * Note:
+ * AMD does not virtualize APIC TSC-deadline timer mode, but it is
+ * emulated by KVM. When setting APIC LVTT (0x832) register bit 18,
+ * the AVIC hardware would generate GP fault. Therefore, always
+ * intercept the MSR 0x832, and do not setup direct_access_msr.
+ */
{ .index = X2APIC_MSR(APIC_LVTTHMR), .always = false },
{ .index = X2APIC_MSR(APIC_LVTPC), .always = false },
{ .index = X2APIC_MSR(APIC_LVT0), .always = false },


LVT is not something I would expect x2avic to even try to emulate, I would expect
it to dumbly forward the write to apic backing page (garbage in, garbage out) and then
signal trap vmexit?

I also think that regular AVIC works like that (just forwards the write to the page).

The main difference b/w AVIC and x2AVIC is the MSR interception control, which needs to
not-intercept x2APIC MSRs for x2AVIC (allowing HW to virtualize MSR accesses).
However, the hypervisor can decide which x2APIC MSR to intercept and emulate.

I am asking because there is a remote possibility that due to some bug the guest got
direct access to x2apic registers of the host, and this is how you got that #GP.
Could you double check it?

I have verified this behavior with the HW designer and requested them to document
this in the next AMD programmers manual that will include x2AVIC details.

We really need x2avic (and vNMI) spec to be published to know exactly how all of this
is supposed to work.

I have raised the concern to the team responsible for publishing the doc.

Best Regards,
Suravee