Re: [PATCH v2 13/23] KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode

From: Sean Christopherson
Date: Fri Sep 16 2022 - 14:52:39 EST


On Wed, Sep 14, 2022, Suthikulpanit, Suravee wrote:
> > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > index 6b2f538b8fd0..75748c380ceb 100644
> > > > --- a/arch/x86/kvm/lapic.c
> > > > +++ b/arch/x86/kvm/lapic.c
> > > > @@ -303,12 +303,13 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
> > > > if (!mask)
> > > > continue;
> > > > - if (!is_power_of_2(mask)) {
> > > > + ldr = ffs(mask) - 1;
> > > > + if (!is_power_of_2(mask) || cluster[ldr]) {
> > >
> > > Should this be checking if the cluster[ldr] is pointing to the same struct
> > > apic instead? For example:
> > >
> > > if (!is_power_of_2(mask) || cluster[ldr] != apic)
> > >
> > > From my observation, the kvm_recalculate_apic_map() can be called many
> > > times, and the cluster[ldr] could have already been assigned from the
> > > previous invocation. So, as long as it is the same, it should be okay.
> >
> > No, because cluster[ldr] can never match "apic". kvm_recalculate_apic_map()
> > creates and populates a _new_ kvm_apic_map every time, it doesn't do an in-place
> > update of the current map.
>
> Yes, the _new_ is getting created and initialized every time we call
> kvm_recalculate_apic_map(), and then passed into
> kvm_apic_map_get_logical_dest() along with the reference of cluster and mask
> to get populated based on the provided ldr. Please note that the
> new->phys_map[x2apic_id] is already assigned before the calling of

Ooh, this is what I was missing. LDR is read-only for x2APIC, and so KVM simply
uses the phys_map and computes the phys_map indices by reversing the x2APIC=>LDR
calculation.

So it's so much not that _can_ "apic" can match "cluster[ldr]", it's actually that
"apic" _must_ match "cluster[ldr]" for this flow. Overwriting the cluster entry
is all kinds of pointless. It's either unnecessary (no bugs) or it breaks things
(bugs in either LDR calculation or logical dest math).

Rather than add an exception to the cluster[] check, which is very confusing for
xAPIC, the whole flow can be skipped for x2APIC, with a sanity check that the LDR
does indeed align with the x2APIC ID.

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 76a19bf1eb55..e9d7c647e8a7 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -347,6 +347,12 @@ void kvm_recalculate_apic_map(struct kvm *kvm)
}
new->logical_mode = logical_mode;

+ /* I'll add a comment here. */
+ if (apic_x2apic_mode(apic)) {
+ WARN_ON_ONCE(ldr != kvm_apic_calc_x2apic_ldr(x2apic_id));
+ continue;
+ }
+
if (WARN_ON_ONCE(!kvm_apic_map_get_logical_dest(new, ldr,
&cluster, &mask))) {
new->logical_mode = KVM_APIC_MODE_MAP_DISABLED;

Alternatively, the x2APIC handling could be done after kvm_apic_map_get_logical_dest(),
e.g. so that KVM can sanity check mask+cluster[ldr], but that's annoying to implement
and IMO it's overkill since the we can just as easily verify the math via tests on top
of the LDR sanity check.

I'll do a better job of verifying that APICv + x2APIC yields the correct inhibits.
I verified x2APIC functional correctness, and that APICv + xAPIC yielded the correct
inhibits, but I obviously didn't verify APICv + x2APIC yielded the correct inhibits.

Thanks much!