Re: [PATCH v6 6/9] KVM: x86: lapic: don't allow to change APIC ID unconditionally

From: Sean Christopherson
Date: Tue Mar 08 2022 - 18:04:13 EST


On Fri, Feb 25, 2022, Zeng Guang wrote:
> From: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
>
> No normal guest has any reason to change physical APIC IDs,

I don't think we can reasonably assume this, my analysis in the link (that I just
realized I deleted from context here) shows it's at least plausible that an existing
guest could rely on the APIC ID being writable. And that's just one kernel, who
know what else is out there, especially given that people use KVM to emulate really
old stuff, often on really old hardware.

Practically speaking, anyone that wants to deploy IPIv is going to have to make
the switch at some point, but that doesn't help people running legacy crud that
don't care about IPIv.

I was thinking a module param would be trivial, and it is (see below) if the
param is off by default. A module param will also provide a convenient opportunity
to resolve the loophole reported by Maxim[1][2], though it's a bit funky.

Anyways, with an off-by-default module param, we can just do:

if (!enable_apicv || !cpu_has_vmx_ipiv() || !xapic_id_readonly)
enable_ipiv = false;

Forcing userspace to take advantage of IPIv is rather annoying, but it's not the
end of world.

Having the param on by default is a mess. Either we break userspace (above), or
we only kinda break userspace by having it on iff IPIv is on, but then we end up
with cyclical dependency hell. E.g. userspace makes xAPIC ID writable and forces
on IPIv, which one "wins"? And if it's on by default, we can't fix the loophole
in KVM_SET_LAPIC.

If we really wanted to have it on by default, we could have a Kconfig and make
_that_ off by default, e.g.

static bool __read_mostly xapic_id_readonly = IS_ENABLED(CONFING_KVM_XAPIC_ID_RO);

but that seems like overkill. If a kernel owner knows they want the param on,
it should be easy enough to force it without a Kconfig.

So I think my vote would be for something like this? Compile tested only...

---
arch/x86/kvm/lapic.c | 14 +++++++++-----
arch/x86/kvm/svm/avic.c | 5 +++++
arch/x86/kvm/x86.c | 4 ++++
arch/x86/kvm/x86.h | 1 +
4 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c4c3155d98db..2c01cd45fb18 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2043,7 +2043,7 @@ static int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)

switch (reg) {
case APIC_ID: /* Local APIC ID */
- if (!apic_x2apic_mode(apic))
+ if (!apic_x2apic_mode(apic) && !xapic_id_readonly)
kvm_apic_set_xapic_id(apic, val >> 24);
else
ret = 1;
@@ -2634,10 +2634,7 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
u32 *ldr = (u32 *)(s->regs + APIC_LDR);
u64 icr;

- if (vcpu->kvm->arch.x2apic_format) {
- if (*id != vcpu->vcpu_id)
- return -EINVAL;
- } else {
+ if (!vcpu->kvm->arch.x2apic_format) {
if (set)
*id >>= 24;
else
@@ -2650,6 +2647,10 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
* split to ICR+ICR2 in userspace for backwards compatibility.
*/
if (set) {
+ if ((vcpu->kvm->arch.x2apic_format || xapic_id_readonly) &&
+ (*id != vcpu->vcpu_id))
+ return -EINVAL;
+
*ldr = kvm_apic_calc_x2apic_ldr(*id);

icr = __kvm_lapic_get_reg(s->regs, APIC_ICR) |
@@ -2659,6 +2660,9 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
icr = __kvm_lapic_get_reg64(s->regs, APIC_ICR);
__kvm_lapic_set_reg(s->regs, APIC_ICR2, icr >> 32);
}
+ } else if (set && xapic_id_readonly &&
+ (__kvm_lapic_get_reg(s->regs, APIC_ID) >> 24) != vcpu->vcpu_id) {
+ return -EINVAL;
}

return 0;
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index b37b353ec086..4a031d9686c2 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -442,6 +442,11 @@ static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
u32 id = kvm_xapic_id(vcpu->arch.apic);

+ if (xapic_id_readonly && id != vcpu->vcpu_id) {
+ kvm_prepare_emulation_failure_exit(vcpu);
+ return 0;
+ }
+
if (vcpu->vcpu_id == id)
return 0;

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4fa4d8269e5b..67706d468ed3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -177,6 +177,10 @@ module_param(tsc_tolerance_ppm, uint, S_IRUGO | S_IWUSR);
static int __read_mostly lapic_timer_advance_ns = -1;
module_param(lapic_timer_advance_ns, int, S_IRUGO | S_IWUSR);

+bool __read_mostly xapic_id_readonly;
+module_param(xapic_id_readonly, bool, 0444);
+EXPORT_SYMBOL_GPL(xapic_id_readonly);
+
static bool __read_mostly vector_hashing = true;
module_param(vector_hashing, bool, S_IRUGO);

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index aa86abad914d..89f40c921c08 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -302,6 +302,7 @@ static inline bool kvm_mpx_supported(void)
extern unsigned int min_timer_period_us;

extern bool enable_vmware_backdoor;
+extern bool xapic_id_readonly;

extern int pi_inject_timer;


base-commit: 1e147f6f90668f2c2b57406d451f0cfcd2ba19d0
--