Re: [PATCH 04/15] irqchip/gic: WARN if setting the interrupt type fails

From: Geert Uytterhoeven
Date: Fri Mar 18 2016 - 06:23:30 EST


Hi Jon,

On Fri, Mar 18, 2016 at 10:54 AM, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
> On 18/03/16 09:20, Geert Uytterhoeven wrote:
>> On Thu, Mar 17, 2016 at 5:20 PM, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
>>> On 17/03/16 15:18, Jason Cooper wrote:
>>>> On Thu, Mar 17, 2016 at 03:04:01PM +0000, Jon Hunter wrote:
>>>>> On 17/03/16 14:51, Thomas Gleixner wrote:
>>>>>> On Thu, 17 Mar 2016, Jon Hunter wrote:
>>>>>>> Setting the interrupt type for private peripheral interrupts (PPIs) may
>>>>>>> not be supported by a given GIC because it is IMPLEMENTATION DEFINED
>>>>>>> whether this is allowed. There is no way to know if setting the type is
>>>>>>> supported for a given GIC and so the value written is read back to
>>>>>>> verify it matches the desired configuration. If it does not match then
>>>>>>> an error is return.
>>>>>>>
>>>>>>> There are cases where the interrupt configuration read from firmware
>>>>>>> (such as a device-tree blob), has been incorrect and hence
>>>>>>> gic_configure_irq() has returned an error. This error has gone
>>>>>>> undetected because the error code returned was ignored but the interrupt
>>>>>>> still worked fine because the configuration for the interrupt could not
>>>>>>> be overwritten.
>>>>>>>
>>>>>>> Given that this has done undetected and we should only fail to set the
>>>>>>> type for PPIs whose configuration cannot be changed anyway, don't return
>>>>>>> an error and simply WARN if this fails. This will allows us to fix up any
>>>>>>> places in the kernel where we should be checking the return status and
>>>>>>> maintain back compatibility with firmware images that may have incorrect
>>>>>>> interrupt configurations.
>>>>>>
>>>>>> Though silently returning 0 is really the wrong thing to do. You can add the
>>>>>> warn, but why do you want to return success?
>>>>>
>>>>> Yes that would be the correct thing to do I agree. However, the problem
>>>>> is that if we do this, then after the patch "irqdomain: Don't set type
>>>>> when mapping an IRQ" is applied, we may break interrupts for some
>>>>> existing device-tree binaries that have bad configuration (such as omap4
>>>>> and tegra20/30 ... see patches 1 and 2) that have gone unnoticed. So it
>>>>> is a back compatibility issue.
>>
>> Indeed (also for sh73a0 and r8a7779).
>
> Thanks. I was wondering if there are others. Do you know what the
> correct setting should be? Ie. should it be IRQ_TYPE_EDGE_RISING as
> well? I can then include this with OMAP and Tegra.

The warnings went away when using IRQ_TYPE_EDGE_RISING.
Patches sent.

>>>> This sounds like a textbook case for adding a boolean dt property. If
>>>> "can-set-ppi-type" is absent (old DT blobs and new blobs without the
>>>> ability), warn and return zero. If it's present, the driver can set the
>>>> type, returning errors as encountered.
>>>
>>> True. However, if we did have this "can-set-ppi-type" property set for a
>>> device, it really should never fail (unless someone specified it
>>> incorrectly). So I am trying to understand the value in adding a new DT
>>> property.
>>
>> Do we really want to add properties that basically indicate that a description
>> in DT is correct?
>>
>> Alternatively, it can be fixed in the kernel in a DT quirk (if SoC == xxx then
>> fix TWD).
>
> I am not sure I fully understand your proposal, but please note that it

The kernel can modify the DT in early startup code if it detects that the
TWD's interrupt type is wrong.

> may not be just limited to the TWD (although this does appear to be the
> one client that is wrong for a lot of SoCs). PPIs are also used for the
> armv7/8 timers as well.

True.

> The problem is that we have a lot of SoCs with twd-timers and I have no
> way to test all of these to know which could be a problem. So I thought
> that warning would be a good first step to fixing them.

Definitely.

> However, I am still trying to see the real value in returning an error
> in this case. May be I am the only one with that perspective?

Given the above, we cannot start returning an error until all problems are
fixed.

Even after that, we have to care about stable DT ABIs. In this case it's not
really an ABI change, but a real buggy description in DTS.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds