Re: [PATCH] x86/kvm/lapic: always disable MMIO interface in x2APIC mode

From: Paolo Bonzini
Date: Fri Sep 14 2018 - 13:21:28 EST


On 02/08/2018 17:08, Vitaly Kuznetsov wrote:
> When VMX is used with flexpriority disabled (because of no support or
> if disabled with module parameter) MMIO interface to lAPIC is still
> available in x2APIC mode while it shouldn't be (kvm-unit-tests):
>
> PASS: apic_disable: Local apic enabled in x2APIC mode
> PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set
> FAIL: apic_disable: *0xfee00030: 50014
>
> The issue appears because we basically do nothing while switching to
> x2APIC mode when APIC access page is not used. apic_mmio_{read,write}
> only check if lAPIC is disabled before proceeding to actual write.
>
> When APIC access is virtualized we correctly manipulate with VMX controls
> in vmx_set_virtual_apic_mode() and we don't get vmexits from memory writes
> in x2APIC mode so there's no issue.
>
> Disabling MMIO interface seems to be easy. The question is: what do we
> do with these reads and writes? If we add apic_x2apic_mode() check to
> apic_mmio_in_range() and return -EOPNOTSUPP these reads and writes will
> go to userspace. When lAPIC is in kernel, Qemu uses this interface to
> inject MSIs only (see kvm_apic_mem_write() in hw/i386/kvm/apic.c). This
> somehow works with disabled lAPIC but when we're in xAPIC mode we will
> get a real injected MSI from every write to lAPIC. Not good.
>
> The simplest solution seems to be to just ignore writes to the region
> and return ~0 for all reads when we're in x2APIC mode. This is what this
> patch does. However, this approach is inconsistent with what currently
> happens when flexpriority is enabled: we allocate APIC access page and
> create KVM memory region so in x2APIC modes all reads and writes go to
> this pre-allocated page which is, btw, the same for all vCPUs.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> ---
> Changes since RFC:
> - add KVM_X86_QUIRK_LAPIC_MMIO_HOLE disabling the newly introduced
> 'MMIO hole' behavior [Paolo Bonzini]
> ---
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/kvm/lapic.c | 22 +++++++++++++++++++---
> 2 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 86299efa804a..fd23d5778ea1 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -377,6 +377,7 @@ struct kvm_sync_regs {
>
> #define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0)
> #define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1)
> +#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2)
>
> #define KVM_STATE_NESTED_GUEST_MODE 0x00000001
> #define KVM_STATE_NESTED_RUN_PENDING 0x00000002
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index b5cd8465d44f..83c4e8cc7eb9 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1291,9 +1291,8 @@ EXPORT_SYMBOL_GPL(kvm_lapic_reg_read);
>
> static int apic_mmio_in_range(struct kvm_lapic *apic, gpa_t addr)
> {
> - return kvm_apic_hw_enabled(apic) &&
> - addr >= apic->base_address &&
> - addr < apic->base_address + LAPIC_MMIO_LENGTH;
> + return addr >= apic->base_address &&
> + addr < apic->base_address + LAPIC_MMIO_LENGTH;
> }
>
> static int apic_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device *this,
> @@ -1305,6 +1304,15 @@ static int apic_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device *this,
> if (!apic_mmio_in_range(apic, address))
> return -EOPNOTSUPP;
>
> + if (!kvm_apic_hw_enabled(apic) || apic_x2apic_mode(apic)) {
> + if (!kvm_check_has_quirk(vcpu->kvm,
> + KVM_X86_QUIRK_LAPIC_MMIO_HOLE))
> + return -EOPNOTSUPP;
> +
> + memset(data, 0xff, len);
> + return 0;
> + }
> +
> kvm_lapic_reg_read(apic, offset, len, data);
>
> return 0;
> @@ -1864,6 +1872,14 @@ static int apic_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this,
> if (!apic_mmio_in_range(apic, address))
> return -EOPNOTSUPP;
>
> + if (!kvm_apic_hw_enabled(apic) || apic_x2apic_mode(apic)) {
> + if (!kvm_check_has_quirk(vcpu->kvm,
> + KVM_X86_QUIRK_LAPIC_MMIO_HOLE))
> + return -EOPNOTSUPP;
> +
> + return 0;
> + }
> +
> /*
> * APIC register must be aligned on 128-bits boundary.
> * 32/64/128 bits registers must be accessed thru 32 bits.
>

Queued, thanks.

Paolo