Re: [PATCH v2 1/2] driver core: platform: Clarify that IRQ 0 is invalid

From: Bjorn Helgaas
Date: Mon May 04 2020 - 18:27:03 EST


On Mon, May 04, 2020 at 09:07:21PM +0200, Greg Kroah-Hartman wrote:
> On Mon, May 04, 2020 at 01:08:22PM -0500, Bjorn Helgaas wrote:
> > On Sat, May 02, 2020 at 08:15:37AM +0200, Greg Kroah-Hartman wrote:
> > > On Fri, May 01, 2020 at 05:40:41PM -0500, Bjorn Helgaas wrote:
> > > > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > > >
> > > > These interfaces return a negative error number or an IRQ:
> > > >
> > > > platform_get_irq()
> > > > platform_get_irq_optional()
> > > > platform_get_irq_byname()
> > > > platform_get_irq_byname_optional()
> > > >
> > > > The function comments suggest checking for error like this:
> > > >
> > > >   irq = platform_get_irq(...);
> > > >   if (irq < 0)
> > > >           return irq;
> > > >
> > > > which is what most callers (~900 of 1400) do, so it's implicit
> > > > that IRQ 0 is invalid. But some callers check for "irq <= 0",
> > > > and it's not obvious from the source that we never return an
> > > > IRQ 0.
> > > >
> > > > Make this more explicit by updating the comments to say that
> > > > an IRQ number is always non-zero and adding a WARN() if we
> > > > ever do return zero. If we do return IRQ 0, it likely
> > > > indicates a bug in the arch-specific parts of
> > > > platform_get_irq().
> > >
> > > I worry about adding WARN() as there are systems that do
> > > panic_on_warn() and syzbot trips over this as well. I don't
> > > think that for this issue it would be a problem, but what really
> > > is this warning about that someone could do anything with?
> > >
> > > Other than that minor thing, this looks good to me, thanks for
> > > finally clearing this up.
> >
> > What I'm concerned about is an arch that returns 0. Most drivers
> > don't check for 0 so they'll just try to use it, and things will
> > fail in some obscure way. My assumption is that if there really
> > is no IRQ, we should return -ENOENT or similar instead of 0.
> >
> > I could be convinced that it's not worth warning about at all, or
> > we could do something like the following:
> >
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 084cf1d23d3f..4afa5875e14d 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -220,7 +220,11 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
> >  	ret = -ENXIO;
> >  #endif
> >  out:
> > -	WARN(ret == 0, "0 is an invalid IRQ number\n");
> > +	/* Returning zero here is likely a bug in the arch IRQ code */
> > +	if (ret == 0) {
> > +		pr_warn("0 is an invalid IRQ number\n");
> > +		dump_stack();
> > +	}
> >  	return ret;
> >  }
> > ...

> I like that, but you said this is something that the platform people
> should only see when bringing up a new system, so maybe the WARN() is
> fine. It's not user-triggerable, so your original is ok.

Is that an ack? Thomas, any thoughts?

I suspect we could see this given a broken DT, too, so I'm not sure
it's strictly a bringup problem.

I would probably argue that even this case would be an arch defect:
the kernel should validate data from a DT at least enough to avoid
giving a bogus, useless IRQ to a driver.
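
To make the idea concrete, here is a rough userspace sketch, not kernel
code. Everything prefixed mock_ is hypothetical and stands in for the
real lookup; the table contents and the choice of -ENXIO are assumptions
for illustration only. Unlike the patch under discussion, which only
warns and still returns 0, this sketch turns a bogus 0 into a negative
errno so that a caller checking "irq < 0" fails cleanly.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the arch/DT lookup. Index 1 models broken
 * firmware data that describes "IRQ 0".
 */
static int mock_lookup_irq(unsigned int num)
{
	static const int table[] = { 42, 0 };

	if (num >= sizeof(table) / sizeof(table[0]))
		return -ENXIO;	/* no such resource */
	return table[num];
}

/*
 * Sketch of a getter that never hands IRQ 0 to a caller: validate the
 * looked-up value, complain, and return an error instead.
 */
static int mock_get_irq(unsigned int num)
{
	int ret = mock_lookup_irq(num);

	if (ret == 0) {
		fprintf(stderr, "0 is an invalid IRQ number\n");
		return -ENXIO;	/* assumed error code for this sketch */
	}
	return ret;
}

/* Caller using the error check the function comments suggest. */
static int mock_probe(unsigned int num)
{
	int irq = mock_get_irq(num);

	if (irq < 0)
		return irq;

	printf("using IRQ %d\n", irq);
	return 0;
}
```

With this, mock_probe(0) succeeds with IRQ 42, while mock_probe(1) and
mock_probe(2) both fail with -ENXIO instead of trying to use IRQ 0.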

Bjorn