Re: [PART1 RFC v4 08/11] svm: Add VMEXIT handlers for AVIC

From: Suravee Suthikulanit
Date: Thu Apr 28 2016 - 18:08:33 EST


Hi Radim / Paolo,

Sorry for the delay in responding.

On 4/12/2016 11:22 AM, Radim Krčmář wrote:
2016-04-07 03:20-0500, Suravee Suthikulpanit:
From: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>

This patch introduces VMEXIT handlers, avic_incomplete_ipi_interception()
and avic_unaccelerated_access_interception() along with two trace points
(trace_kvm_avic_incomplete_ipi and trace_kvm_avic_unaccelerated_access).

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
---
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
@@ -3515,6 +3515,250 @@ static int mwait_interception(struct vcpu_svm *svm)
+static u32 *avic_get_logical_id_entry(struct kvm_vcpu *vcpu, u8 mda, bool flat)
+{
+ struct kvm_arch *vm_data = &vcpu->kvm->arch;
+ int index;
+ u32 *logical_apic_id_table;
+
+ if (flat) { /* flat */
+ if (mda > 7)

Don't you want to check that just one bit is set?

+ return NULL;
+ index = mda;
+ } else { /* cluster */
+ int apic_id = mda & 0xf;
+ int cluster_id = (mda & 0xf0) >> 8;

">> 4".

+
+ if (apic_id > 4 || cluster_id >= 0xf)
+ return NULL;
+ index = (cluster_id << 2) + apic_id;

ffs(apic_id), because 'apic_id' must be compacted into 2 bits.

+ }
+ logical_apic_id_table = (u32 *) page_address(vm_data->avic_logical_id_table_page);
+
+ return &logical_apic_id_table[index];
+}

I have quite messed up the logic here for handling the logical cluster ID. Sorry for not catching this earlier. I'm rewriting this function altogether to simplify it in V5.
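
For illustration only, the three review comments above (check that exactly one bit is set, shift by 4, compact the one-hot APIC ID with ffs) could be folded into the index computation roughly as follows. This is a standalone userspace sketch, not the V5 code: avic_logical_id_index() and single_bit_index() are hypothetical names, and the "cluster_id >= 0xf" rejection is simply carried over from the posted patch.

```c
#include <stdint.h>

/*
 * Return the 0-based position of the only set bit in v,
 * or -1 if v does not have exactly one bit set.
 */
static int single_bit_index(uint8_t v)
{
	int idx = 0;

	if (v == 0 || (v & (v - 1)) != 0)
		return -1;
	while (!(v & 1)) {
		v >>= 1;
		idx++;
	}
	return idx;
}

/*
 * Hypothetical helper: map a logical destination (mda) to an index into
 * the logical APIC ID table, or -1 if mda does not name exactly one CPU.
 */
static int avic_logical_id_index(uint8_t mda, int flat)
{
	int apic, cluster;

	if (flat)
		return single_bit_index(mda);	/* one-hot in 8 bits -> 0..7 */

	cluster = (mda & 0xf0) >> 4;		/* high nibble is the cluster */
	apic = single_bit_index(mda & 0x0f);	/* one-hot low nibble -> 0..3 */
	if (apic < 0 || cluster >= 0xf)
		return -1;

	return (cluster << 2) + apic;		/* four entries per cluster */
}
```

With this layout, mda 0x28 in cluster mode (cluster 2, APIC bit 3) maps to index 11, while 0x23 is rejected because two APIC bits are set.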

[...]
+ lid = ffs(dlid) - 1;
+ ret = avic_handle_ldr_write(&svm->vcpu, svm->vcpu.vcpu_id, lid);
+ if (ret)
+ return 0;

OS can actually change LDR, so the old one should be invalidated.

(No OS does, but that is not an important factor for the hypervisor.)


By invalidating the old one, are you suggesting that we should disable the logical APIC table entry previously used by this vcpu? If so, that means we would need to cache the previous LDR value, since the one in the vAPIC backing page would already be updated.
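
To make the caching idea concrete, here is a minimal standalone sketch. It caches the table index derived from the previous LDR rather than the raw LDR value, which is one possible choice and not something settled in this thread. The struct, the function names, the table size, and the entry layout (valid bit in bit 31, physical ID in the low byte) are all assumptions made for the example.

```c
#include <stdint.h>

#define LOGICAL_ID_ENTRIES	32		/* illustrative table size */
#define ENTRY_VALID		(1u << 31)	/* assumed valid-bit position */

/* Hypothetical per-vcpu state; stands in for the real AVIC structures. */
struct vcpu_ldr_state {
	uint32_t table[LOGICAL_ID_ENTRIES];	/* logical APIC ID table stand-in */
	int cached_index;			/* entry last programmed, -1 if none */
};

static void ldr_state_init(struct vcpu_ldr_state *s)
{
	int i;

	for (i = 0; i < LOGICAL_ID_ENTRIES; i++)
		s->table[i] = 0;
	s->cached_index = -1;
}

/*
 * Handle an LDR write: invalidate the entry programmed for the old LDR,
 * then program the entry for the new one. Returns 0 on success, or -1 if
 * new_index is out of range (the old entry is still invalidated).
 */
static int ldr_update(struct vcpu_ldr_state *s, int new_index, uint32_t phys_id)
{
	/* Drop the entry that belonged to the previous LDR value, if any. */
	if (s->cached_index >= 0)
		s->table[s->cached_index] &= ~ENTRY_VALID;
	s->cached_index = -1;

	if (new_index < 0 || new_index >= LOGICAL_ID_ENTRIES)
		return -1;

	s->table[new_index] = ENTRY_VALID | (phys_id & 0xff);
	s->cached_index = new_index;
	return 0;
}
```

The point of the cache is that the old entry can be found without consulting the backing page, which already holds the new LDR by the time the handler runs.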

[...]

+ if (vm_data->ldr_mode != mod) {
+ clear_page(page_address(vm_data->avic_logical_id_table_page));
+ vm_data->ldr_mode = mod;
+ }
+ break;
+ }

All these cases need to be called on KVM_SET_LAPIC -- userspace can
provide a completely new set of APIC registers and AVIC should build its
maps with them. Test with save/restore or migration.

Hm.. This means we might need to introduce a new hook which is called from kvm_apic_post_state_restore() in arch/x86/kvm/lapic.c. Probably something like kvm_x86_ops->apic_post_state_restore(), which sets up the new physical and logical APIC ID tables for AVIC.

If this works, I'll add support to handle this and test with the migration stuff in the V5.
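
The shape of such a hook can be sketched in plain C as an optional callback that the generic restore path invokes once the new register state is in place. Everything below is a userspace mock: struct vcpu, struct x86_ops_sketch, post_state_restore() and avic_rebuild_tables() are hypothetical stand-ins, not kernel definitions.

```c
#include <stddef.h>

/* Stand-in for struct kvm_vcpu; only counts table rebuilds. */
struct vcpu {
	int tables_rebuilt;
};

/* Stand-in for the vendor ops table with one optional hook. */
struct x86_ops_sketch {
	/* Called after userspace has replaced the APIC registers. */
	void (*apic_post_state_restore)(struct vcpu *vcpu);
};

/* Vendor side: rebuild the physical and logical APIC ID tables. */
static void avic_rebuild_tables(struct vcpu *vcpu)
{
	vcpu->tables_rebuilt++;
}

/* Generic side: finish the restore, then give the vendor code a chance. */
static void post_state_restore(const struct x86_ops_sketch *ops,
			       struct vcpu *vcpu)
{
	/* ... generic LAPIC restore work would go here ... */
	if (ops->apic_post_state_restore)
		ops->apic_post_state_restore(vcpu);
}
```

Leaving the pointer NULL keeps the generic path unchanged for vendor code that does not need the hook.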

+ if (offset >= 0x400) {
+ WARN(1, "Unsupported APIC offset %#x\n", offset);

"printk_ratelimited(KERN_INFO " is the most severe message you could
give. I think that not printing anything is best,

+ return ret;

because we should not return, but continue to emulate the access.

Then this would continue as if we were handling a normal fault access.


+ }
+
+ if (trap) {
+ /* Handling Trap */
+ if (!write) /* Trap read should never happens */
+ BUG();

(BUG_ON(!write) is shorter, though I would avoid BUG -- only guests are
going to fail, so we don't need to kill the host.)

Ok. What about just WARN_ONCE(!write, "svm: Handling trap read.\n")?

Thanks,
Suravee