Re: [PATCH 3.18-rc3 v8 1/4] irqchip: gic: Make gic_raise_softirq() FIQ-safe

From: Daniel Thompson
Date: Mon Nov 24 2014 - 16:01:41 EST


On 24/11/14 20:38, Thomas Gleixner wrote:
> On Mon, 24 Nov 2014, Thomas Gleixner wrote:
>> On Mon, 24 Nov 2014, Thomas Gleixner wrote:
>>> On Fri, 14 Nov 2014, Daniel Thompson wrote:
>>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>>> index 38493ff28fa5..0db62a6f1ee3 100644
>>>> --- a/drivers/irqchip/irq-gic.c
>>>> +++ b/drivers/irqchip/irq-gic.c
>>>> @@ -73,6 +73,13 @@ struct gic_chip_data {
>>>> static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>>>>
>>>> /*
>>>> + * This lock may be locked for reading by FIQ handlers. Thus although
>>>> + * read locking may be used liberally, write locking must only take
>>>> + * place only when local FIQ handling is disabled.
>>>> + */
>>>> +static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
>>>> +
>>>> +/*
>>>> * The GIC mapping of CPU interfaces does not necessarily match
>>>> * the logical CPU numbering. Let's use a mapping as returned
>>>> * by the GIC itself.
>>>> @@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>>>> int cpu;
>>>> unsigned long flags, map = 0;
>>>>
>>>> - raw_spin_lock_irqsave(&irq_controller_lock, flags);
>>>> + read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
>>>
>>> Just for the record:
>>>
>>> You might have noticed that you replace a raw lock with a non raw
>>> one. That's not an issue on mainline, but that pretty much renders
>>> that code broken for RT.
>>>
>>> Surely nothing I worry too much about given the current state of RT.
>>
>> And having a second thought here. Looking at the protection scope
>> independent of the spin vs. rw lock
>>
>> gic_raise_softirq()
>>
>> lock();
>>
>> /* Does not need any protection */
>> for_each_cpu(cpu, mask)
>> map |= gic_cpu_map[cpu];
>>
>> /*
>> * Can be outside the lock region as well as it makes sure
>> * that previous writes (usually the IPI data) are visible
>> * before the write to the SOFTINT register.
>> */
>> dmb(ishst);
>>
>> /* Why needs this protection? */
>> write(map, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT));
>>
>> unlock();
>>
>> gic_migrate_target()
>>
>> ....
>> lock();
>>
>> /* Migrate all peripheral interrupts */
>>
>> unlock();
>>
>> So what's the point of that protection?
>>
>> gic_raise_softirq() is used to send IPIs, which are PPIs on the target
>> CPUs so they are not affected from the migration of the peripheral
>> interrupts at all.
>>
>> The write to the SOFTINT register in gic_migrate_target() is not
>> inside the lock region. So what's serialized by the lock in
>> gic_raise_softirq() at all?
>>
>> Either I'm missing something really important here or this locking
>> exercise in gic_raise_softirq() and therefor the rwlock conversion is
>> completely pointless.
>
> Thanks to Marc I figured it out now what I'm missing. That stuff is
> part of the bl switcher horror. Well documented as all of that ...
>
> So the lock protects against an IPI being sent to the current cpu
> while the target map is redirected and the pending state of the
> current cpu is migrated to another cpu.
>
> It's not your fault, that the initial authors of that just abused
> irq_controller_lock for that purpose instead of introducing a seperate
> lock with a clear description of the protection scope in the first
> place.
>
> Now you came up with the rw lock to handle the following FIQ related
> case:
> gic_raise_softirq()
> lock(x);
> ---> FIQ
> handle_fiq()
> gic_raise_softirq()
> lock(x); <-- Live lock
>
> Now the rwlock lets you avoid that, and it only lets you avoid that
> because rwlocks are not fair.
>
> So while I cannot come up with a brilliant replacement, it would be
> really helpful documentation wise if you could do the following:
>
> 1) Create a patch which introduces irq_migration_lock as a raw
> spinlock and replaces the usage of irq_controller_lock in
> gic_raise_softirq() and gic_migrate_target() along with a proper
> explanation in the code and the changelog of course.

Replace irq_controller_lock or augment it with a new one?

irq_raise_softirq() can share a single r/w lock with irq_set_affinity()
because irq_set_affinity() would have to lock it for writing and that
would bring the deadlock back for a badly timed FIQ.

Thus if we want calls to gic_raise_softirq() to be FIQ-safe there there
must be two locks taken in gic_migrate_target().

We can eliminate irq_controller_lock but we cannot replace it with one
r/w lock.


> 2) Make the rwlock conversion on top of that with a proper
> documentation in the code of the only relevant reason (See above).
>
> The protection scope which prevents IPIs being sent while switching
> over is still the same and not affected.
>
> That's not the first time that I stumble over this bl switcher mess
> which got boltet into the kernel mindlessly.
>
> If the scope of the issue would have been clear up front, I wouldn't
> have complained about the RT relevance for this as it is simple to
> either disable FIQs for RT or just handle the above case differently.
>
> Thanks,
>
> tglx
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/