On Fri, Dec 31, 2021, Zeng Guang wrote:
> In VMX non-root operation, new behavior applies to
> virtualize WRMSR to vICR in x2APIC mode. Depending
> on settings of the VM-execution controls, CPU would
> produce APIC-write VM-exit following the 64-bit value
> written to offset 300H on the virtual-APIC page(vICR).
> KVM needs to retrieve the value written by CPU and
> emulate the vICR write to deliver an interrupt.
>
> Current KVM doesn't consider to handle the 64-bit setting
> on vICR in trap-like APIC-write VM-exit. Because using
> kvm_lapic_reg_write() to emulate writes to APIC_ICR requires
> the APIC_ICR2 is already programmed correctly. But in the
> above APIC-write VM-exit, CPU writes the whole 64 bits to
> APIC_ICR rather than program higher 32 bits and lower 32
> bits to APIC_ICR2 and APIC_ICR respectively. So, KVM needs
> to retrieve the whole 64-bit value and program higher 32 bits
> to APIC_ICR2 first.

"new behavior" is ambiguous, it's not clear if it refers to new hardware
behavior, new KVM behavior, etc...

Please wrap at ~75 chars, this is too narrow.

I think this is simply saying:
  Upcoming Intel CPUs will support virtual x2APIC MSR writes to the vICR,
  i.e. will trap and generate an APIC-write VM-Exit instead of intercepting
  the WRMSR. Add support for handling "nodecode" x2APIC writes, which were
  previously impossible.

  Note, x2APIC MSR writes are 64 bits wide.
and then the shortlog can be:

  KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode
The "interrupt dispatch" part is quite confusing because it's not really germane
to the change; yes, the vICR write does (eventually) dispatch an IRQ, but that
has nothing to do with the code being modified.
> Signed-off-by: Zeng Guang <guang.zeng@xxxxxxxxx>
> ---
>  arch/x86/kvm/lapic.c | 12 +++++++++---
>  arch/x86/kvm/lapic.h | 5 +++++
>  2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index f206fc35deff..3ce7142ba00e 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2186,15 +2186,21 @@ EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi);
>  /* emulate APIC access in a trap manner */
>  void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
>  {
> -	u32 val = 0;
> +	struct kvm_lapic *apic = vcpu->arch.apic;
> +	u64 val = 0;
>
>  	/* hw has done the conditional check and inst decode */
>  	offset &= 0xff0;
>
> -	kvm_lapic_reg_read(vcpu->arch.apic, offset, 4, &val);
> +	/* exception dealing with 64bit data on vICR in x2apic mode */
> +	if ((offset == APIC_ICR) && apic_x2apic_mode(apic)) {

Sorry, I failed to reply to your response in the previous version. I suggested
a WARN_ON(offset != APIC_ICR), but you were concerned that apic_x2apic_mode()
would be expensive to check before @offset. I don't think that's a valid
concern, as apic_x2apic_mode() is simply:

	apic->vcpu->arch.apic_base & X2APIC_ENABLE

and is likely well-predicted by the CPU, especially in single tenant or pinned
scenarios where the pCPU is running a single VM/vCPU, i.e. will almost never
see X2APIC_ENABLE toggling.

So I stand behind my previous feedback[*] that we should split on x2APIC.
> +		val = kvm_lapic_get_reg64(apic, offset);
> +		kvm_lapic_reg_write(apic, APIC_ICR2, (u32)(val>>32));
> +	} else
> +		kvm_lapic_reg_read(apic, offset, 4, &val);

Needs curly braces. But again, I stand behind my previous feedback that this
would be better written as:

	if (apic_x2apic_mode(apic)) {
		if (WARN_ON_ONCE(offset != APIC_ICR))
			return 1;

		kvm_lapic_reg_read(apic, offset, 8, &val);
		kvm_lapic_reg_write64(apic, offset, val);
	} else {
		kvm_lapic_reg_read(apic, offset, 4, &val);
		kvm_lapic_reg_write(apic, offset, val);
	}

after a patch (provided in earlier feedback) to introduce kvm_lapic_reg_write64().

>
>  	/* TODO: optimize to just emulate side effect w/o one more write */
> -	kvm_lapic_reg_write(vcpu->arch.apic, offset, val);
> +	kvm_lapic_reg_write(apic, offset, (u32)val);
>  }
>  EXPORT_SYMBOL_GPL(kvm_apic_write_nodecode);

[*] https://lore.kernel.org/all/YTvcJZSd1KQvNmaz@xxxxxxxxxx