Re: [PATCH v6 6/9] KVM: x86: lapic: don't allow to change APIC ID unconditionally

From: Sean Christopherson
Date: Wed Mar 09 2022 - 01:01:37 EST


TL;DR: Maxim, any objection to yet another inhibit? Any potential issues you can think of?

On Wed, Mar 09, 2022, Chao Gao wrote:
> On Tue, Mar 08, 2022 at 11:04:01PM +0000, Sean Christopherson wrote:
> >On Fri, Feb 25, 2022, Zeng Guang wrote:
> >> From: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> >>
> >> No normal guest has any reason to change physical APIC IDs,
> >
> >I don't think we can reasonably assume this, my analysis in the link (that I just
> >realized I deleted from context here) shows it's at least plausible that an existing
> >guest could rely on the APIC ID being writable. And that's just one kernel, who
> >know what else is out there, especially given that people use KVM to emulate really
> >old stuff, often on really old hardware.
>
> Making xAPIC ID readonly is not only based on your analysis, but also Intel SDM
> clearly saying writable xAPIC ID is processor model specific and ***software should
> avoid writing to xAPIC ID***.

Intel isn't the only vendor KVM supports, and xAPIC ID is fully writable according
to AMD's docs and AMD's hardware. x2APIC is even (indirectly) writable, but luckily
KVM has never modeled that...

Don't get me wrong, I would love to make xAPIC ID read-only, and I fully agree
that the probability of breaking someone's setup is very low, I just don't think
the benefits of forcing it are worth the risk of breaking userspace.

> If writable xAPIC ID support should be retained and is tied to a module param,
> live migration would depend on KVM's module params: e.g., migrate a VM with
> modified xAPIC ID (apic_id_readonly off on this system) to one with
> xapic_id_readonly on would fail, right? Is this failure desired?

Hrm, I was originally thinking it's not a terrible outcome, but I was assuming
that userspace would gracefully handle migration failure. That's a bad assumption.

> if not, we need to have a VM-scope control. e.g., add an inhibitor of APICv
> (XAPIC_ID_MODIFIED) and disable APICv forever for this VM if its vCPUs or
> QEMU modifies xAPIC ID.

Inhibiting APICv if IPIv is enabled (implied for AMD's AVIC) is probably a better
option than a module param. I was worried about ending up with silently degraded
VM performance, but that's easily solved by adding a stat to track APICv inhibitions,
which would be useful for other cases too (getting AMD's AVIC enabled is comically
difficult).

That would also let us drop the code buggy avic_handle_apic_id_update().

And it wouldn't necessarily have to be forever, though I agree that's a perfectly
fine approach until we have data that shows anything fancier is necessary.

> >Practically speaking, anyone that wants to deploy IPIv is going to have to make
> >the switch at some point, but that doesn't help people running legacy crud that
> >don't care about IPIv.
> >
> >I was thinking a module param would be trivial, and it is (see below) if the
> >param is off by default. A module param will also provide a convenient opportunity
> >to resolve the loophole reported by Maxim[1][2], though it's a bit funky.
>
> Could you share the links?

Doh, sorry (they're both in this one).

https://lore.kernel.org/all/20220301135526.136554-5-mlevitsk@xxxxxxxxxx