Re: [PATCH] irqchip/gic: Enable gic_set_affinity set more than one cpu

From: Cheng Chao
Date: Tue Oct 25 2016 - 22:04:45 EST




On 10/25/2016 06:09 PM, Marc Zyngier wrote:
> On 15/10/16 08:23, Cheng Chao wrote:
>> On 10/15/2016 01:33 AM, Marc Zyngier wrote:
>>>> On 10/13/2016 11:31 PM, Marc Zyngier wrote:
>>>>> On Thu, 13 Oct 2016 18:57:14 +0800
>>>>> Cheng Chao <cs.os.kernel@xxxxxxxxx> wrote:
>>>>>
>>>>>> The GIC can distribute an interrupt to more than one CPU,
>>>>>> but gic_set_affinity() currently selects only one CPU to handle it.
>>>>>
>>>>> What makes you think this is a good idea? What purpose does it serve?
>>>>> I can only see drawbacks to this: You're waking up more than one CPU,
>>>>> wasting power, adding jitter and clobbering the cache.
>>>>>
>>>>> I assume you see a benefit to that approach, so can you please spell it
>>>>> out?
>>>>>
>>>>
>>>> OK, you are right, but performance is another point that we should consider.
>>>>
>>>> We use an E1 device to transmit/receive a video stream. We find that the E1
>>>> interrupt is handled on only one CPU, which pushes that CPU's usage to almost
>>>> 100% while the other CPUs carry a much lower load, so performance is poor.
>>>> The system has a 4-core CPU.
>>>
>>> It looks to me like you're barking up the wrong tree. We have
>>> NAPI-enabled network drivers for this exact reason, and adding more
>>> interrupts to an already overloaded system doesn't strike me as going in
>>> the right direction. May I suggest that you look at integrating NAPI
>>> into your E1 driver?
>>>
>>
>> Great, NAPI may be a good option. I can try using NAPI, thank you.
>>
>> On the other hand, gic_set_affinity() selects only one CPU to handle an
>> interrupt, which confuses me a little. Why does the GIC driver not let
>> several CPUs handle an interrupt, as the others (MPIC, APIC etc.) do?
>>
>> It seems that the GIC driver is too restrictive.
>
> There are several drawbacks to this:
> - Cache impacts and power efficiency, as already mentioned
> - Not virtualizable (you cannot efficiently implement this in a
> hypervisor that emulates a GICv2 distributor)
> - Doesn't scale (you cannot go beyond 8 CPUs)
>
> I strongly suggest you give NAPI a go, and only then consider
> delivering interrupts to multiple CPUs, because multiple CPU
> delivery is not future proof.
>

Thanks again, the E1 driver with NAPI is on the right track.

>> I think it is more reasonable to let the user decide what to do.
>>
>> If I care about power etc., I echo only a single CPU to
>> /proc/irq/xx/smp_affinity, but if I want more than one CPU to handle
>> one particular interrupt, I can echo the CPUs I want to
>> /proc/irq/xx/smp_affinity.
>
> If that's what you really want, a better patch may be something like this:
>

I would like the GIC driver to be more flexible, so that gic_set_affinity() is not
restricted to a single CPU. After all, the GIC supports distributing an interrupt
to more than one CPU.
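
As a rough illustration of the multi-CPU targeting discussed here, the sketch
below mirrors the offset/shift/mask arithmetic of gic_set_affinity() in plain
userspace C. It is not driver code: demo_cpu_map[] is a hypothetical stand-in
for the driver's gic_cpu_map[] and simply assumes that CPU n maps to target
bit n, and the helper names (target_reg_offset, target_shift, set_targets)
are made up for this example. It also shows why the 8-bit target field cannot
describe more than 8 CPUs.

```c
#include <stdint.h>

/* Assumption for illustration: 8 CPU interfaces, CPU n -> target bit n.
 * The real driver fills gic_cpu_map[] at probe time. */
#define NR_GIC_CPU_IF 8
static const uint8_t demo_cpu_map[NR_GIC_CPU_IF] = {
	0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
};

/* Byte offset (from GIC_DIST_TARGET) of the 32-bit register holding irq. */
unsigned int target_reg_offset(unsigned int irq)
{
	return irq & ~3u;
}

/* Bit position of the 8-bit target field for irq inside that register. */
unsigned int target_shift(unsigned int irq)
{
	return (irq % 4) * 8;
}

/*
 * Return the register value that results from routing irq to every CPU
 * whose bit is set in aff, leaving the other three fields untouched.
 */
uint32_t set_targets(uint32_t old, unsigned int irq, unsigned long aff)
{
	unsigned int shift = target_shift(irq);
	uint32_t mask = 0xffu << shift;
	uint32_t bits = 0;
	unsigned int cpu;

	/* One target bit per CPU interface: nothing beyond CPU 7 fits. */
	for (cpu = 0; cpu < NR_GIC_CPU_IF; cpu++)
		if (aff & (1ul << cpu))
			bits |= (uint32_t)demo_cpu_map[cpu] << shift;

	return (old & ~mask) | bits;
}
```

For example, with interrupt 34 and aff = 0x5 (CPUs 0 and 2), the field sits at
byte offset 32 with a shift of 16, and an old register value of 0x01010101
becomes 0x01050101.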


> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index d6c404b..b301d72 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -326,20 +326,25 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
>  {
>  	void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
>  	unsigned int cpu, shift = (gic_irq(d) % 4) * 8;
> -	u32 val, mask, bit;
> -	unsigned long flags;
> +	u32 val, mask, bit = 0;
> +	unsigned long flags, aff = 0;
>
> -	if (!force)
> -		cpu = cpumask_any_and(mask_val, cpu_online_mask);
> -	else
> -		cpu = cpumask_first(mask_val);
> +	for_each_cpu(cpu, mask_val) {
> +		if (force) {
> +			aff = 1 << cpu;
> +			break;
> +		}
> +
> +		aff |= cpu_online(cpu) << cpu;
> +	}
>
> -	if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
> +	if (!aff)
>  		return -EINVAL;
>
>  	gic_lock_irqsave(flags);
>  	mask = 0xff << shift;
> -	bit = gic_cpu_map[cpu] << shift;
> +	for_each_set_bit(cpu, &aff, nr_cpu_ids)
> +		bit |= gic_cpu_map[cpu] << shift;
>  	val = readl_relaxed(reg) & ~mask;
>  	writel_relaxed(val | bit, reg);
>  	gic_unlock_irqrestore(flags);
>

This patch is better than the previous one.
I have added a small bounds check on cpu inside the loop:

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 58e5b4e..b3d0f07 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -326,20 +326,28 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
 {
 	void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
 	unsigned int cpu, shift = (gic_irq(d) % 4) * 8;
-	u32 val, mask, bit;
-	unsigned long flags;
+	u32 val, mask, bit = 0;
+	unsigned long flags, aff = 0;

-	if (!force)
-		cpu = cpumask_any_and(mask_val, cpu_online_mask);
-	else
-		cpu = cpumask_first(mask_val);
+	for_each_cpu(cpu, mask_val) {
+		if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
+			break;
+
+		if (force) {
+			aff = 1 << cpu;
+			break;
+		}
+
+		aff |= cpu_online(cpu) << cpu;
+	}

-	if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
+	if (!aff)
 		return -EINVAL;

 	gic_lock_irqsave(flags);
 	mask = 0xff << shift;
-	bit = gic_cpu_map[cpu] << shift;
+	for_each_set_bit(cpu, &aff, nr_cpu_ids)
+		bit |= gic_cpu_map[cpu] << shift;
 	val = readl_relaxed(reg) & ~mask;
 	writel_relaxed(val | bit, reg);
 	gic_unlock_irqrestore(flags);
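
For completeness, here is a minimal userspace sketch of how a multi-CPU mask
could be written to /proc/irq/xx/smp_affinity from C instead of with echo. The
function names format_affinity_mask() and write_irq_affinity() are
hypothetical, and the sketch assumes the mask fits in one unsigned long.
Whether the interrupt is then actually delivered to more than one CPU depends
on what the irqchip driver does with the mask, which is the point of the patch
above.

```c
#include <stdio.h>

/*
 * Build the hexadecimal mask string (no "0x" prefix) from a list of CPU
 * numbers. Returns the string length, or -1 if a CPU number does not fit
 * in an unsigned long.
 */
int format_affinity_mask(char *buf, size_t len,
			 const unsigned int *cpus, size_t ncpus)
{
	unsigned long mask = 0;
	size_t i;

	for (i = 0; i < ncpus; i++) {
		if (cpus[i] >= 8 * sizeof(mask))
			return -1;
		mask |= 1ul << cpus[i];
	}

	return snprintf(buf, len, "%lx", mask);
}

/*
 * Write the mask to the given smp_affinity file, e.g. a path of the form
 * /proc/irq/xx/smp_affinity. Returns 0 on success, -1 on any error.
 */
int write_irq_affinity(const char *path,
		       const unsigned int *cpus, size_t ncpus)
{
	char buf[32];
	FILE *f;
	int ok;

	if (format_affinity_mask(buf, sizeof(buf), cpus, ncpus) < 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs(buf, f) >= 0;
	if (fclose(f) != 0)	/* the kernel may reject the mask at flush time */
		ok = 0;

	return ok ? 0 : -1;
}
```

Passing the CPU list {0, 2} produces the string "5", the same value one would
echo by hand.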

> Thanks,
>
> M.
>

Thanks,
Cheng