Re: [PATCH] xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU

From: Boris Ostrovsky
Date: Mon Jun 05 2017 - 10:11:18 EST


On 06/05/2017 06:14 AM, Anoob Soman wrote:
> On 02/06/17 17:24, Boris Ostrovsky wrote:
>>> static int set_affinity_irq(struct irq_data *data, const struct
>>> cpumask *dest,
>>> bool force)
>>> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
>>> index 10f1ef5..1192f24 100644
>>> --- a/drivers/xen/evtchn.c
>>> +++ b/drivers/xen/evtchn.c
>>> @@ -58,6 +58,8 @@
>>> #include <xen/xen-ops.h>
>>> #include <asm/xen/hypervisor.h>
>>> +static DEFINE_PER_CPU(int, bind_last_selected_cpu);
>> This should be moved into evtchn_bind_interdom_next_vcpu() since that's
>> the only place referencing it.
>
> Sure, I will do it.
>
>>
>> Why is it a percpu variable BTW? Wouldn't making it global result in
>> better interrupt distribution?
>
> The reason for percpu instead of global was to avoid locking. We can
> have a global variable (last_cpu) without locking, but the value of
> last_cpu won't be consistent without locks. Moreover, since
> irq_affinity is also used in the calculation of the cpu to bind, having
> a percpu or global wouldn't really matter, as the result (selected_cpu)
> is more likely to be random (because different irqs can have different
> affinity). What do you guys suggest?

Doesn't initial affinity (which is what we expect here since irqbalance
has not run yet) typically cover all guest VCPUs?
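
To illustrate the distribution question, here is a minimal userspace
sketch, not kernel code. It assumes the affinity covers every VCPU, that
each bind happens to run on a different CPU, and that all cursors start
at 0. Every dist_* name is hypothetical, and the model is
single-threaded, so it says nothing about the locking a real global
would need.

```c
/* Userspace model only; all dist_* names are hypothetical. */
#define DIST_NR_CPUS 4u

/* Round-robin step, valid when affinity and online mask cover all CPUs. */
static unsigned int dist_next_cpu(unsigned int prev)
{
	return (prev + 1u) % DIST_NR_CPUS;
}

/* One shared cursor: the CPU issuing the bind does not matter. */
static void dist_bind_global(const unsigned int *callers, unsigned int n,
			     unsigned int *hits)
{
	unsigned int last = 0, i;

	for (i = 0; i < n; i++) {
		(void)callers[i];	/* caller is irrelevant here */
		last = dist_next_cpu(last);
		hits[last]++;		/* count binds landing on this CPU */
	}
}

/* One cursor per calling CPU, as bind_last_selected_cpu is in the patch. */
static void dist_bind_percpu(const unsigned int *callers, unsigned int n,
			     unsigned int *hits)
{
	unsigned int last[DIST_NR_CPUS] = { 0 };
	unsigned int i, c;

	for (i = 0; i < n; i++) {
		c = callers[i] % DIST_NR_CPUS;
		last[c] = dist_next_cpu(last[c]);
		hits[last[c]]++;
	}
}
```

With binds issued from CPUs 0, 1, 2, 3 in turn, the shared cursor
spreads the four channels over four CPUs, while the per-CPU cursors each
advance from 0 to 1 and put all four on the same CPU. That is the worst
case for the percpu variant under these assumptions, not a measurement.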

>
>>
>>> +
>>> struct per_user_data {
>>> struct mutex bind_mutex; /* serialize bind/unbind operations */
>>> struct rb_root evtchns;
>>> @@ -421,6 +423,36 @@ static void evtchn_unbind_from_user(struct
>>> per_user_data *u,
>>> del_evtchn(u, evtchn);
>>> }
>>> +static void evtchn_bind_interdom_next_vcpu(int evtchn)
>>> +{
>>> + unsigned int selected_cpu, irq;
>>> + struct irq_desc *desc = NULL;
>>> + unsigned long flags;
>>> +
>>> + irq = irq_from_evtchn(evtchn);
>>> + desc = irq_to_desc(irq);
>>> +
>>> + if (!desc)
>>> + return;
>>> +
>>> + raw_spin_lock_irqsave(&desc->lock, flags);
>>> + selected_cpu = this_cpu_read(bind_last_selected_cpu);
>>> + selected_cpu = cpumask_next_and(selected_cpu,
>>> + desc->irq_common_data.affinity, cpu_online_mask);
>>> +
>>> + if (unlikely(selected_cpu >= nr_cpu_ids))
>>> + selected_cpu =
>>> cpumask_first_and(desc->irq_common_data.affinity,
>>> + cpu_online_mask);
>>> +
>>> + raw_spin_unlock_irqrestore(&desc->lock, flags);
>> I think if you follow Juergen's suggestion of wrapping everything into
>> irq_enable/disable you can drop the lock altogether (assuming you keep
>> bind_last_selected_cpu percpu).
>>
>> -boris
>>
>
> I think we would still require spin_lock(). spin_lock is for irq_desc.

If you are trying to protect affinity then it may well change after you
drop the lock.

In fact, don't you have a race here? If we offline a VCPU we will (by
way of cpu_disable_common()->fixup_irqs()) update affinity to reflect
that a CPU is gone and there is a chance that xen_rebind_evtchn_to_cpu()
will happen after that.

So, contrary to what I said earlier ;-) not only do you need the lock,
but you should hold it across the xen_rebind_evtchn_to_cpu() call. Does
this make sense?
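
A rough userspace model of that ordering, as a sketch only: a bitmask
stands in for struct cpumask, a C11 atomic_flag spinlock for desc->lock,
and a plain store for the xen_rebind_evtchn_to_cpu() call. Every model_*
name is hypothetical and nothing here is the actual kernel change.

```c
/* Userspace model only; all model_* names are hypothetical. */
#include <limits.h>
#include <stdatomic.h>

#define MODEL_NR_CPUS 8u

/* Stands in for cpu_online_mask: bit n set means CPU n is online. */
static unsigned int model_online_mask = 0xffu;

struct model_irq {
	atomic_flag lock;	/* stands in for desc->lock */
	unsigned int affinity;	/* bit n set: CPU n is in the affinity mask */
	unsigned int bound_cpu;	/* result of the modelled rebind */
};

static void model_lock(struct model_irq *irq)
{
	while (atomic_flag_test_and_set_explicit(&irq->lock,
						 memory_order_acquire))
		;	/* spin */
}

static void model_unlock(struct model_irq *irq)
{
	atomic_flag_clear_explicit(&irq->lock, memory_order_release);
}

/* Next CPU after prev set in both masks, or MODEL_NR_CPUS if none. */
static unsigned int model_next_and(unsigned int prev, unsigned int a,
				   unsigned int b)
{
	unsigned int cpu;

	/* prev == UINT_MAX wraps to 0, i.e. "first CPU in both masks" */
	for (cpu = prev + 1u; cpu < MODEL_NR_CPUS; cpu++)
		if (a & b & (1u << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/* Select and bind under one lock so affinity cannot change in between. */
static unsigned int model_bind_next(struct model_irq *irq, unsigned int *last)
{
	unsigned int cpu;

	model_lock(irq);
	cpu = model_next_and(*last, irq->affinity, model_online_mask);
	if (cpu >= MODEL_NR_CPUS)	/* ran off the end: wrap around */
		cpu = model_next_and(UINT_MAX, irq->affinity,
				     model_online_mask);
	if (cpu < MODEL_NR_CPUS) {
		*last = cpu;
		irq->bound_cpu = cpu;	/* models the rebind call */
	}
	model_unlock(irq);
	return cpu;
}

/* Models the offline path updating affinity under the same lock. */
static void model_cpu_offline(struct model_irq *irq, unsigned int cpu)
{
	model_lock(irq);
	model_online_mask &= ~(1u << cpu);
	irq->affinity &= ~(1u << cpu);
	model_unlock(irq);
}
```

Because the store to bound_cpu sits inside the same critical section as
the mask lookup, the offline path in this model either runs entirely
before the selection or entirely after the bind, never in between.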

-boris


>
>>> + this_cpu_write(bind_last_selected_cpu, selected_cpu);
>>> +
>>> + local_irq_disable();
>>> + /* unmask expects irqs to be disabled */
>>> + xen_rebind_evtchn_to_cpu(evtchn, selected_cpu);
>>> + local_irq_enable();
>>> +}
>>> +
>>>
>