Re: [PART1 RFC v4 08/11] svm: Add VMEXIT handlers for AVIC

From: Radim KrÄmÃÅ
Date: Fri Apr 29 2016 - 10:56:37 EST


2016-04-28 17:08-0500, Suravee Suthikulanit:
> On 4/12/2016 11:22 AM, Radim KrÄmÃÅ wrote:
>> 2016-04-07 03:20-0500, Suravee Suthikulpanit:
>> > From: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
>> >
>> > This patch introduces VMEXIT handlers, avic_incomplete_ipi_interception()
>> > and avic_unaccelerated_access_interception() along with two trace points
>> > (trace_kvm_avic_incomplete_ipi and trace_kvm_avic_unaccelerated_access).
>> >
>> > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
>> > ---
>> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> > @@ -3515,6 +3515,250 @@ static int mwait_interception(struct vcpu_svm *svm)
>> > [...]
>> > + lid = ffs(dlid) - 1;
>> > + ret = avic_handle_ldr_write(&svm->vcpu, svm->vcpu.vcpu_id, lid);
>> > + if (ret)
>> > + return 0;
>>
>> OS can actually change LDR, so the old one should be invalidated.
>>
>> (No OS does, but that is not an important factor for the hypervisor.)
>>
>
> By validating the old one, are you suggesting that we should disable the
> logical APIC table entry previously used by this vcpu? If so, that means we
> would need to cached the previous LDR value since the one in vAPIC backing
> page would already be updated.

Yes, the cache could be used to speed up recalculate_apic_map() too, so
it might not be a total waste.

Which reminds me that physical APIC_ID doesn't use correct cache.
svm->vcpu.vcpu_id is only the initial ID, but APIC_ID could be changed
more than once.
It would be great to make APIC_ID read-only in all cases, because x86
spec allows us to do so, but KVM_SET_LAPIC can set APIC ID too and I
think we don't retroactively modify userspace API ... Paolo?

>> > [...]
>>
>> > + if (vm_data->ldr_mode != mod) {
>> > + clear_page(page_address(vm_data->avic_logical_id_table_page));
>> > + vm_data->ldr_mode = mod;
>> > + }
>> > + break;
>> > + }
>>
>> All these cases need to be called on KVM_SET_LAPIC -- the userspace can
>> provide completely new set of APIC registers and AVIC should build its
>> maps with them. Test with save/restore or migration.
>
> Hm.. This means we might need to introduce a new hook which is called from
> the arch/x86/kvm/lapic.c: kvm_apic_post_state_restore(). Probably something
> like kvm_x86_ops->apic_post_state_restore(), which sets up the new physical
> and logical APIC id tables for AVIC.

Sounds good. I imagine the callback would just call
avic_unaccel_trap_write() for relevant registers.

>> > + return ret;
>>
>> because we should not return, but continue to emulate the access.
>
> Then, this would continue as if we are handling the normal fault access.

Exactly, it is a normal access to an undefined register.

>> > + }
>> > +
>> > + if (trap) {
>> > + /* Handling Trap */
>> > + if (!write) /* Trap read should never happens */
>> > + BUG();
>>
>> (BUG_ON(!write) is shorter, though I would avoid BUG -- only guests are
>> going to fail, so we don't need to kill the host.)
>
> Ok. What about just WARN_ONCE(!write, "svm: Handling trap read.\n");

Sure, it's a hardware bug and calling avic_unaccel_trap_write() on a
read access shouldn't result in a bug. I am slightly inclined towards
'if (trap && write)' and optional 'WARN_ONCE(trap,' in the else branch.

Thanks.