Re: [PATCH v6 6/9] KVM: x86: lapic: don't allow to change APIC ID unconditionally

From: Maxim Levitsky
Date: Sun Mar 13 2022 - 05:24:29 EST


On Fri, 2022-03-11 at 21:28 +0800, Zeng Guang wrote:
>
> On 3/11/2022 12:26 PM, Sean Christopherson wrote:
> > On Wed, Mar 09, 2022, Maxim Levitsky wrote:
> > > On Wed, 2022-03-09 at 06:01 +0000, Sean Christopherson wrote:
> > > > > Could you share the links?
> > > >
> > > > Doh, sorry (they're both in this one).
> > > >
> > > > https://lore.kernel.org/all/20220301135526.136554-5-mlevitsk@xxxxxxxxxx
> > > >
> > > >
> > >
> > > My opinion on this subject is very simple: we need to draw the line somewhere.
> >
> > ...
> >
> >
> > Since the goal is to simplify KVM, can we try the inhibit route and see what the
> > code looks like before making a decision? I think it might actually yield a less
> > awful KVM than the readonly approach, especially if the inhibit is "sticky", i.e.
> > we don't try to remove the inhibit on subsequent changes.
> >
> > Killing the VM, as proposed, is very user unfriendly as the user will have no idea
> > why the VM was killed. WARN is out of the question because this is user triggerable.
> > Returning an emulation error would be ideal, but getting that result up through
> > apic_mmio_write() could be annoying and end up being more complex.
> >
> > The touchpoints will all be the same, unless I'm missing something the difference
> > should only be a call to set an inhibit instead killing the VM.
>
> Introduce an inhibition - APICV_INHIBIT_REASON_APICID_CHG to deactivate
> APICv once KVM guest would try to change APIC ID in xapic mode, and same
> sanity check in KVM_{SET,GET}_LAPIC for live migration. KVM will keep
> alive but obviously lose benefit from hardware acceleration in this way.
>
> So how do you think the proposal like this ?
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6dcccb304775..30d825c069be 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1046,6 +1046,7 @@ struct kvm_x86_msr_filter {
> #define APICV_INHIBIT_REASON_X2APIC 5
> #define APICV_INHIBIT_REASON_BLOCKIRQ 6
> #define APICV_INHIBIT_REASON_ABSENT 7
> +#define APICV_INHIBIT_REASON_APICID_CHG 8
>
> struct kvm_arch {
> unsigned long n_used_mmu_pages;
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 22929b5b3f9b..66cd54fa4515 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2044,10 +2044,19 @@ static int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>
> switch (reg) {
> case APIC_ID: /* Local APIC ID */
> - if (!apic_x2apic_mode(apic))
> - kvm_apic_set_xapic_id(apic, val >> 24);
> - else
> + if (apic_x2apic_mode(apic)) {
> ret = 1;
> + break;
> + }
> + /*
> + * If changing APIC ID with any APIC acceleration enabled,
> + * deactivate APICv to avoid unexpected issues.
> + */
> + if (enable_apicv && (val >> 24) != apic->vcpu->vcpu_id)
> + kvm_request_apicv_update(apic->vcpu->kvm,
> + false, APICV_INHIBIT_REASON_APICID_CHG);
> +
> + kvm_apic_set_xapic_id(apic, val >> 24);
> break;
>
> case APIC_TASKPRI:
> @@ -2628,11 +2637,19 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
> static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
> struct kvm_lapic_state *s, bool set)
> {
> - if (apic_x2apic_mode(vcpu->arch.apic)) {
> - u32 *id = (u32 *)(s->regs + APIC_ID);
> - u32 *ldr = (u32 *)(s->regs + APIC_LDR);
> - u64 icr;
> + u32 *id = (u32 *)(s->regs + APIC_ID);
> + u32 *ldr = (u32 *)(s->regs + APIC_LDR);
> + u64 icr;
> + if (!apic_x2apic_mode(vcpu->arch.apic)) {
> + /*
> + * If APIC ID changed with any APIC acceleration enabled,
> + * deactivate APICv to avoid unexpected issues.
> + */
> + if (enable_apicv && (*id >> 24) != vcpu->vcpu_id)
> + kvm_request_apicv_update(vcpu->kvm,
> + false, APICV_INHIBIT_REASON_APICID_CHG);
> + } else {
> if (vcpu->kvm->arch.x2apic_format) {
> if (*id != vcpu->vcpu_id)
> return -EINVAL;
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 82d56f8055de..f78754bdc1d0 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -931,7 +931,8 @@ bool svm_check_apicv_inhibit_reasons(ulong bit)
> BIT(APICV_INHIBIT_REASON_IRQWIN) |
> BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
> BIT(APICV_INHIBIT_REASON_X2APIC) |
> - BIT(APICV_INHIBIT_REASON_BLOCKIRQ);
> + BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
> + BIT(APICV_INHIBIT_REASON_APICID_CHG);
>
> return supported & BIT(bit);
> }
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7beba7a9f247..91265f0784bd 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7751,7 +7751,8 @@ static bool vmx_check_apicv_inhibit_reasons(ulong bit)
> ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) |
> BIT(APICV_INHIBIT_REASON_ABSENT) |
> BIT(APICV_INHIBIT_REASON_HYPERV) |
> - BIT(APICV_INHIBIT_REASON_BLOCKIRQ);
> + BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
> + BIT(APICV_INHIBIT_REASON_APICID_CHG);
>
> return supported & BIT(bit);
> }
>
>
>

This won't work with nested AVIC: we can't just inhibit a nested guest that is using its own AVIC,
because migration happens.
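
For reference, the "sticky" inhibit discussed above boils down to a per-VM bit that is set on the first
mismatching xAPIC ID write and never cleared afterwards. Below is a minimal standalone sketch of that idea,
not KVM code: struct vm_model, struct vcpu_model, apicv_active() and xapic_id_write() are made-up names for
illustration, and only the reason number is taken from the quoted patch. It deliberately does not model
the nested case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the quoted patch; everything else here is hypothetical. */
#define APICV_INHIBIT_REASON_APICID_CHG 8

/* Hypothetical stand-in for the per-VM state: one bit per inhibit reason. */
struct vm_model {
	unsigned long inhibit_reasons;
};

/* Hypothetical stand-in for a vCPU and its local APIC ID. */
struct vcpu_model {
	struct vm_model *vm;
	uint32_t vcpu_id;
	uint32_t xapic_id;
};

/* Acceleration is usable only while no inhibit reason is set. */
static bool apicv_active(const struct vm_model *vm)
{
	return vm->inhibit_reasons == 0;
}

/*
 * Model of an xAPIC-mode write to the APIC ID register: the ID lives in
 * bits 31:24 of the written value.  A mismatch with vcpu_id sets the
 * inhibit bit; nothing ever clears it, which is what makes it sticky.
 */
static void xapic_id_write(struct vcpu_model *vcpu, uint32_t val)
{
	uint32_t id = val >> 24;

	if (id != vcpu->vcpu_id)
		vcpu->vm->inhibit_reasons |= 1UL << APICV_INHIBIT_REASON_APICID_CHG;

	vcpu->xapic_id = id;
}
```

In this model, writing the original ID back does not re-enable acceleration, so there is no un-inhibit path
to get wrong.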

Best regards,
Maxim Levitsky