Re: [PATCH v5 5/8] KVM: x86: Support interrupt dispatch in x2APIC mode with APIC-write VM exit
From: Sean Christopherson
Date: Thu Jan 13 2022 - 16:29:18 EST
On Fri, Dec 31, 2021, Zeng Guang wrote:
> In VMX non-root operation, new behavior applies to
"new behavior" is ambiguous, it's not clear if it refers to new hardware behavior,
new KVM behavior, etc...
> virtualize WRMSR to vICR in x2APIC mode. Depending
Please wrap at ~75 chars, this is too narrow.
> on settings of the VM-execution controls, CPU would
> produce APIC-write VM-exit following the 64-bit value
> written to offset 300H on the virtual-APIC page(vICR).
> KVM needs to retrieve the value written by CPU and
> emulate the vICR write to deliver an interrupt.
>
> Current KVM doesn't consider to handle the 64-bit setting
> on vICR in trap-like APIC-write VM-exit. Because using
> kvm_lapic_reg_write() to emulate writes to APIC_ICR requires
> the APIC_ICR2 is already programmed correctly. But in the
> above APIC-write VM-exit, CPU writes the whole 64 bits to
> APIC_ICR rather than program higher 32 bits and lower 32
> bits to APIC_ICR2 and APIC_ICR respectively. So, KVM needs
> to retrieve the whole 64-bit value and program higher 32 bits
> to APIC_ICR2 first.
I think this is simply saying:
Upcoming Intel CPUs will support virtual x2APIC MSR writes to the vICR,
i.e. will trap and generate an APIC-write VM-Exit instead of intercepting
the WRMSR. Add support for handling "nodecode" x2APIC writes, which were
previously impossible.
Note, x2APIC MSR writes are 64 bits wide.
and then the shortlog can be:
KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode
The "interrupt dispatch" part is quite confusing because it's not really germane
to the change; yes, the vICR write does (eventually) dispatch an IRQ, but that
has nothing to do with the code being modified.
> Signed-off-by: Zeng Guang <guang.zeng@xxxxxxxxx>
> ---
> arch/x86/kvm/lapic.c | 12 +++++++++---
> arch/x86/kvm/lapic.h | 5 +++++
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index f206fc35deff..3ce7142ba00e 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2186,15 +2186,21 @@ EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi);
> /* emulate APIC access in a trap manner */
> void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
> {
> - u32 val = 0;
> + struct kvm_lapic *apic = vcpu->arch.apic;
> + u64 val = 0;
>
> /* hw has done the conditional check and inst decode */
> offset &= 0xff0;
>
> - kvm_lapic_reg_read(vcpu->arch.apic, offset, 4, &val);
> + /* exception dealing with 64bit data on vICR in x2apic mode */
> + if ((offset == APIC_ICR) && apic_x2apic_mode(apic)) {
Sorry, I failed to reply to your response in the previous version. I suggested
a WARN_ON(offset != APIC_ICR), but you were concerned that apic_x2apic_mode()
would be expensive to check before @offset. I don't think that's a valid concern
as apic_x2apic_mode() is simply:
apic->vcpu->arch.apic_base & X2APIC_ENABLE
And is likely well-predicted by the CPU, especially in single tenant or pinned
scenarios where the pCPU is running a single VM/vCPU, i.e. will amost never see
X2APIC_ENABLE toggling.
So I stand behind my previous feedback[*] that we should split on x2APIC.
> + val = kvm_lapic_get_reg64(apic, offset);
> + kvm_lapic_reg_write(apic, APIC_ICR2, (u32)(val>>32));
> + } else
> + kvm_lapic_reg_read(apic, offset, 4, &val);
Needs curly braces. But again, I stand behind my previous feedback that this
would be better written as:
if (apic_x2apic_mode(apic)) {
if (WARN_ON_ONCE(offset != APIC_ICR))
return 1;
kvm_lapic_reg_read(apic, offset, 8, &val);
kvm_lapic_reg_write64(apic, offset, val);
} else {
kvm_lapic_reg_read(apic, offset, 4, &val);
kvm_lapic_reg_write(apic, offset, val);
}
after a patch (provided in earlier feedback) to introduce kvm_lapic_reg_write64().
[*] https://lore.kernel.org/all/YTvcJZSd1KQvNmaz@xxxxxxxxxx
> /* TODO: optimize to just emulate side effect w/o one more write */
> - kvm_lapic_reg_write(vcpu->arch.apic, offset, val);
> + kvm_lapic_reg_write(apic, offset, (u32)val);
> }
> EXPORT_SYMBOL_GPL(kvm_apic_write_nodecode);
>