Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC
From: Radim KrÄmÃÅ
Date: Fri Mar 18 2016 - 17:45:00 EST
2016-03-18 01:09-0500, Suravee Suthikulpanit:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
>
> When a vcpu is loaded/unloaded to a physical core, we need to update
> host physical APIC ID information in the Physical APIC-ID table
> accordingly.
>
> Also, when vCPU is blocking/un-blocking (due to halt instruction),
> we need to make sure that the is-running bit in set accordingly in the
> physical APIC-ID table.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
> ---
> arch/x86/kvm/svm.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 121 insertions(+)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> +static int avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu, bool is_load)
> +{
> + int h_phy_apic_id;
(Paolo said a lot about those names.)
> + u64 *entry, new_entry;
> + struct vcpu_svm *svm = to_svm(vcpu);
> + int ret = 0;
> + if (!svm_vcpu_avic_enabled(svm))
> + return 0;
(The check for NULL below feels weird when it has already been used.)
> +
> + if (!svm)
> + return -EINVAL;
!svm means that KVM completely blew up ... don't check for it.
(See implementation of to_svm.)
> +
> + /* Note: APIC ID = 0xff is used for broadcast.
> + * APIC ID > 0xff is reserved.
> + */
> + h_phy_apic_id = __default_cpu_present_to_apicid(cpu);
> +
> + if (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX)
> + return -EINVAL;
> +
> + entry = svm->avic_phy_apic_id_cache;
The naming is confusing ... can avic_phy_apic_id_cache change during
execution of this function?
If yes, then add READ_ONCE and distinguish the pointer name.
If not, then use svm->avic_phy_apic_id_cache directly.
entry would be ok name for current new_entry.
> + if (!entry)
> + return -EINVAL;
> +
> + if (is_load) {
> + new_entry = READ_ONCE(*entry);
Please move this before the if.
> + BUG_ON(new_entry & AVIC_PHY_APIC_ID__IS_RUN_MSK);
> +
> + new_entry &= ~AVIC_PHY_APIC_ID__HOST_PHY_APIC_ID_MSK;
> + new_entry |= (h_phy_apic_id & AVIC_PHY_APIC_ID__HOST_PHY_APIC_ID_MSK);
> +
> + /**
> + * Restore AVIC running flag if it was set during
> + * vcpu unload.
> + */
> + if (svm->avic_was_running)
> + new_entry |= AVIC_PHY_APIC_ID__IS_RUN_MSK;
> + else
> + new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;
You even BUG_ON when AVIC_PHY_APIC_ID__IS_RUN_MSK is set, so there is no
reason to clear it.
(Also, don't BUG.)
> +
> + WRITE_ONCE(*entry, new_entry);
This will translate to two writes in 32 bit mode and we need to write
physical ID first to avoid spurious doorbells ...
is the order guaranteed?
> + } else {
> + new_entry = READ_ONCE(*entry);
> + /**
> + * This handles the case when vcpu is scheduled out
> + * and has not yet not called blocking. We save the
> + * AVIC running flag so that we can restore later.
> + */
is_running must be disabled in between ...blocking and ...unblocking,
because we don't want to miss interrupts and block forever.
I somehow don't get it from the comment. :)
> + if (new_entry & AVIC_PHY_APIC_ID__IS_RUN_MSK) {
> + svm->avic_was_running = true;
> + new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;
> + WRITE_ONCE(*entry, new_entry);
> + } else {
> + svm->avic_was_running = false;
> + }
(This can be shorter by setting avic_was_running first.)
> + }
> +
> + return ret;
(return 0;)
> +}
> +
> +/**
> + * This function is called during VCPU halt/unhalt.
> + */
> +static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
> +{
> + int ret = 0;
> + int h_phy_apic_id;
> + u64 *entry, new_entry;
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (!svm_vcpu_avic_enabled(svm))
> + return 0;
> +
> + /* Note: APIC ID = 0xff is used for broadcast.
> + * APIC ID > 0xff is reserved.
> + */
> + h_phy_apic_id = __default_cpu_present_to_apicid(vcpu->cpu);
> +
> + if (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX)
> + return -EINVAL;
The cache should be valid only if this condition is true.
We can get rid of it in both function.
> +
> + entry = svm->avic_phy_apic_id_cache;
> + if (!entry)
> + return -EINVAL;
> +
> + if (is_run) {
Both READ_ONCE and WRITE_ONCE belong outside of the if.
> + /* Handle vcpu unblocking after HLT */
> + new_entry = READ_ONCE(*entry);
> + new_entry |= AVIC_PHY_APIC_ID__IS_RUN_MSK;
> + WRITE_ONCE(*entry, new_entry);
> + } else {
> + /* Handle vcpu blocking due to HLT */
> + new_entry = READ_ONCE(*entry);
> + new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;
> + WRITE_ONCE(*entry, new_entry);
> + }
> +
> + return ret;
> +}
> +
> static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> {
> struct vcpu_svm *svm = to_svm(vcpu);