Re: [PATCH] of: property: Fix fw_devlink handling of interrupt-map

From: Anup Patel
Date: Wed May 29 2024 - 06:17:31 EST


On Wed, May 29, 2024 at 12:03 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Wed, 29 May 2024 06:15:52 +0100,
> Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, May 28, 2024 at 10:11 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > >
> > > Commit d976c6f4b32c ("of: property: Add fw_devlink support for
> > > interrupt-map property") tried to do what it says on the tin,
> > > but failed on a couple of points:
> > >
> > > - it confuses bytes and cells. Not a huge deal, except when it
> > > comes to pointer arithmetic
> > >
> > > - it doesn't really handle anything but interrupt-maps that have
> > > their parent #address-cells set to 0
> > >
> > > The combinations of the two leads to some serious fun on my M1
> > > box, with plenty of WARN-ON() firing all over the shop, and
> > > amusing values being generated for interrupt specifiers.
> > >
> > > Address both issues so that I can boot my machines again.
> > >
> > > Fixes: d976c6f4b32c ("of: property: Add fw_devlink support for interrupt-map property")
> > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > > Cc: Anup Patel <apatel@xxxxxxxxxxxxxxxx>
> > > Cc: Saravana Kannan <saravanak@xxxxxxxxxx>
> > > Cc: Rob Herring (Arm) <robh@xxxxxxxxxx>
> >
> > Thanks for the fix patch but unfortunately it breaks for RISC-V.
> >
> > > ---
> > > drivers/of/property.c | 16 ++++++++++++++--
> > > 1 file changed, 14 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/of/property.c b/drivers/of/property.c
> > > index 1c83e68f805b..9adebc63bea9 100644
> > > --- a/drivers/of/property.c
> > > +++ b/drivers/of/property.c
> > > @@ -1322,7 +1322,13 @@ static struct device_node *parse_interrupt_map(struct device_node *np,
> > > addrcells = of_bus_n_addr_cells(np);
> > >
> > > imap = of_get_property(np, "interrupt-map", &imaplen);
> > > - if (!imap || imaplen <= (addrcells + intcells))
> > > + imaplen /= sizeof(*imap);
> > > +
> > > + /*
> > > + * Check that we have enough runway for the child unit interrupt
> > > + * specifier and a phandle. That's the bare minimum we can expect.
> > > + */
> > > + if (!imap || imaplen <= (addrcells + intcells + 1))
> > > return NULL;
> > > imap_end = imap + imaplen;
> > >
> > > @@ -1346,8 +1352,14 @@ static struct device_node *parse_interrupt_map(struct device_node *np,
> > > if (!index)
> > > return sup_args.np;
> > >
> > > - of_node_put(sup_args.np);
> > > + /*
> > > + * Account for the full parent unit interrupt specifier
> > > + * (address cells, interrupt cells, and phandle).
> > > + */
> > > + imap += of_bus_n_addr_cells(sup_args.np);
> >
> > This breaks for RISC-V because we don't have "#address-cells"
> > property in interrupt controller DT node and of_bus_n_addr_cells()
> > retrieves "#address-cells" from the parent of interrupt controller.
>
> That's a feature, not a bug. #address-cells, AFAICT, applies to all
> child nodes until you set it otherwise.
>
> >
> > The of_irq_parse_raw() looks for "#address-cells" property
> > in the interrupt controller DT node only so we should do a
> > similar thing here as well.
>
> This looks more like a of_irq_parse_raw() bug than anything else.

If we change of_irq_parse_raw() to use of_bus_n_addr_cells()
then it would still break for RISC-V.

Using of_bus_n_addr_cells() over here forces interrupt controller
DT nodes to have a "#address-cells" DT property. There are many
interrupt controller (e.g. RISC-V PLIC or RISC-V APLIC) where the
DT bindings don't require "#address-cells" DT property and existing
DTS files with such interrupt controllers don't have it either.

In the RISC-V world, there have been quite a few QEMU releases
where the generated DT node of the interrupt controller does not
have the "#address-cells" property. This patch breaks the kernel
for all such QEMU releases.

I think we should align the implementation in parse_interrupt_map()
with of_irq_parse_raw() and use of_property_read_u32() instead of
of_bus_n_addr_cells().

Regards,
Anup

>
> >
> > The below change on top of this patch worked for me.
> >
> > diff --git a/drivers/of/property.c b/drivers/of/property.c
> > index 9adebc63bea9..f54da2989ea9 100644
> > --- a/drivers/of/property.c
> > +++ b/drivers/of/property.c
> > @@ -1308,7 +1308,7 @@ static struct device_node
> > *parse_interrupt_map(struct device_node *np,
> > {
> > const __be32 *imap, *imap_end, *addr;
> > struct of_phandle_args sup_args;
> > - u32 addrcells, intcells;
> > + u32 addrcells, intcells, paddrcells;
> > int i, imaplen;
> >
> > if (!IS_ENABLED(CONFIG_OF_IRQ))
> > @@ -1356,7 +1356,8 @@ static struct device_node
> > *parse_interrupt_map(struct device_node *np,
> > * Account for the full parent unit interrupt specifier
> > * (address cells, interrupt cells, and phandle).
> > */
> > - imap += of_bus_n_addr_cells(sup_args.np);
> > + if (!of_property_read_u32(sup_args.np, "#address-cells", &paddrcells))
> > + imap += paddrcells;
>
> This looks wrong to me for the reason I outlined above: you need to
> look for a valid #address-cells all along the parent chain, not just
> in the interrupt-controller node.
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.