Re: [PATCH 04/15] irqchip/gic: WARN if setting the interrupt type fails

From: Marc Zyngier
Date: Sat Apr 09 2016 - 06:59:11 EST


On Thu, 17 Mar 2016 15:04:01 +0000
Jon Hunter <jonathanh@xxxxxxxxxx> wrote:

>
> On 17/03/16 14:51, Thomas Gleixner wrote:
> > On Thu, 17 Mar 2016, Jon Hunter wrote:
> >
> >> Setting the interrupt type for private peripheral interrupts (PPIs) may
> >> not be supported by a given GIC because it is IMPLEMENTATION DEFINED
> >> whether this is allowed. There is no way to know if setting the type is
> >> supported for a given GIC and so the value written is read back to
> >> verify it matches the desired configuration. If it does not match then
> >> an error is return.
> >>
> >> There are cases where the interrupt configuration read from firmware
> >> (such as a device-tree blob), has been incorrect and hence
> >> gic_configure_irq() has returned an error. This error has gone
> >> undetected because the error code returned was ignored but the interrupt
> >> still worked fine because the configuration for the interrupt could not
> >> be overwritten.
> >>
> >> Given that this has done undetected and we should only fail to set the
> >> type for PPIs whose configuration cannot be changed anyway, don't return
> >> an error and simply WARN if this fails. This will allows us to fix up any
> >> places in the kernel where we should be checking the return status and
> >> maintain back compatibility with firmware images that may have incorrect
> >> interrupt configurations.
> >
> > Though silently returning 0 is really the wrong thing to do. You can add the
> > warn, but why do you want to return success?
>
> Yes that would be the correct thing to do I agree. However, the problem
> is that if we do this, then after the patch "irqdomain: Don't set type
> when mapping an IRQ" is applied, we may break interrupts for some
> existing device-tree binaries that have bad configuration (such as omap4
> and tegra20/30 ... see patches 1 and 2) that have gone unnoticed. So it
> is a back compatibility issue.
>
> If you are wondering why these interrupts break after "irqdomain: Don't
> set type when mapping an IRQ", it is because today
> irq_create_fwspec_mapping() does not check the return code from setting
> the type, but if we defer setting the type until __setup_irq() which
> does check the return code, then all of a sudden interrupts that were
> working (even with bad configurations) start to fail.
>
> The reason why I opted not to return an error code from
> gic_configure_irq() is it really can't fail. The failure being reported
> does not prevent the interrupt from working, but tells you your
> configuration does not match the hardware setting which you cannot
> overwrite.
>
> So to maintain back compatibility and avoid any silent errors, I opted
> to make it a WARN and not return an error.
>
> If people are ok with potentially breaking interrupts for device-tree
> binaries with bad settings, then I am ok to return an error here.

I think we need to phase things. Let's start with warning people for a
few kernel releases. Actively maintained platforms will quickly address
the issue (fixing their DT). As I see it, this issue seems rather
widespread (even kvmtool outputs a DT with the wrong triggering
information).

Once we've fixed the bulk of the platforms and virtual environments, we
can start thinking about making it fail harder.

Thanks,

M.
--
Jazz is not dead. It just smells funny.