Re: [PATCH] of: property: Fix fw_devlink handling of interrupt-map

From: Marc Zyngier
Date: Wed May 29 2024 - 08:00:17 EST


On Wed, 29 May 2024 12:28:34 +0100,
Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote:
>
> On Wed, May 29, 2024 at 4:15 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > On Wed, 29 May 2024 11:16:30 +0100,
> > Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, May 29, 2024 at 12:03 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, 29 May 2024 06:15:52 +0100,
> > > > Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, May 28, 2024 at 10:11 PM Marc Zyngier <maz@kernelorg> wrote:
> > > > > >
> > > > > > Commit d976c6f4b32c ("of: property: Add fw_devlink support for
> > > > > > interrupt-map property") tried to do what it says on the tin,
> > > > > > but failed on a couple of points:
> > > > > >
> > > > > > - it confuses bytes and cells. Not a huge deal, except when it
> > > > > > comes to pointer arithmetic
> > > > > >
> > > > > > - it doesn't really handle anything but interrupt-maps that have
> > > > > > their parent #address-cells set to 0
> > > > > >
> > > > > > The combinations of the two leads to some serious fun on my M1
> > > > > > box, with plenty of WARN-ON() firing all over the shop, and
> > > > > > amusing values being generated for interrupt specifiers.
> > > > > >
> > > > > > Address both issues so that I can boot my machines again.
> > > > > >
> > > > > > Fixes: d976c6f4b32c ("of: property: Add fw_devlink support for interrupt-map property")
> > > > > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > > > > > Cc: Anup Patel <apatel@xxxxxxxxxxxxxxxx>
> > > > > > Cc: Saravana Kannan <saravanak@xxxxxxxxxx>
> > > > > > Cc: Rob Herring (Arm) <robh@xxxxxxxxxx>
> > > > >
> > > > > Thanks for the fix patch but unfortunately it breaks for RISC-V.
> > > > >
> > > > > > ---
> > > > > > drivers/of/property.c | 16 ++++++++++++++--
> > > > > > 1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/of/property.c b/drivers/of/property.c
> > > > > > index 1c83e68f805b..9adebc63bea9 100644
> > > > > > --- a/drivers/of/property.c
> > > > > > +++ b/drivers/of/property.c
> > > > > > @@ -1322,7 +1322,13 @@ static struct device_node *parse_interrupt_map(struct device_node *np,
> > > > > > addrcells = of_bus_n_addr_cells(np);
> > > > > >
> > > > > > imap = of_get_property(np, "interrupt-map", &imaplen);
> > > > > > - if (!imap || imaplen <= (addrcells + intcells))
> > > > > > + imaplen /= sizeof(*imap);
> > > > > > +
> > > > > > + /*
> > > > > > + * Check that we have enough runway for the child unit interrupt
> > > > > > + * specifier and a phandle. That's the bare minimum we can expect.
> > > > > > + */
> > > > > > + if (!imap || imaplen <= (addrcells + intcells + 1))
> > > > > > return NULL;
> > > > > > imap_end = imap + imaplen;
> > > > > >
> > > > > > @@ -1346,8 +1352,14 @@ static struct device_node *parse_interrupt_map(struct device_node *np,
> > > > > > if (!index)
> > > > > > return sup_args.np;
> > > > > >
> > > > > > - of_node_put(sup_args.np);
> > > > > > + /*
> > > > > > + * Account for the full parent unit interrupt specifier
> > > > > > + * (address cells, interrupt cells, and phandle).
> > > > > > + */
> > > > > > + imap += of_bus_n_addr_cells(sup_args.np);
> > > > >
> > > > > This breaks for RISC-V because we don't have "#address-cells"
> > > > > property in interrupt controller DT node and of_bus_n_addr_cells()
> > > > > retrieves "#address-cells" from the parent of interrupt controller.
> > > >
> > > > That's a feature, not a bug. #address-cells, AFAICT, applies to all
> > > > child nodes until you set it otherwise.
> > > >
> > > > >
> > > > > The of_irq_parse_raw() looks for "#address-cells" property
> > > > > in the interrupt controller DT node only so we should do a
> > > > > similar thing here as well.
> > > >
> > > > This looks more like a of_irq_parse_raw() bug than anything else.
> > >
> > > If we change of_irq_parse_raw() to use of_bus_n_addr_cells()
> > > then it would still break for RISC-V.
> >
> > I'm not trying to "fix" riscv. I'm merely outlining that you are
> > relying on both broken DTs and a buggy OS.
> >
> > >
> > > Using of_bus_n_addr_cells() over here forces interrupt controller
> > > DT nodes to have a "#address-cells" DT property. There are many
> > > interrupt controller (e.g. RISC-V PLIC or RISC-V APLIC) where the
> > > DT bindings don't require "#address-cells" DT property and existing
> > > DTS files with such interrupt controllers don't have it either.
> >
> > It forces you to set #address-cells *if you rely on a different
> > value in a child node*. It's not like the semantics are new.
>
> We don't have child nodes under the interrupt controller DT node
> (for both RISC-V PLIC and APLIC) so we certainly don't require the
> "#address-cells" property in the interrupt controller DT node.

You keep missing the point.

You *do* require it if the parent node has an #address-cells value
that doesn't apply to its children nodes. Basic property inheritance.
Interrupt controller nodes are not special in this regard (and please
allow me to think that I know a thing or two about those).

So it's not that "you don't need it". It's that "you're relying on
something that is broken".

But in your defence, the DT spec is amusingly self-contradictory:

<quote>
2.3.5. #address-cells and #size-cells

The #address-cells and #size-cells properties may be used in any
device node that has children in the devicetree hierarchy and
describes how child device nodes should be addressed. The
#address-cells property defines the number of <u32> cells used to
encode the address field in a child node’s reg property. The
#size-cells property defines the number of <u32> cells used to encode
the size field in a child node’s reg property.

The #address-cells and #size-cells properties are not inherited from
ancestors in the devicetree. They shall be explicitly defined.
</quote>

Followed by:

<quote>
2.4.3.1. interrupt-map

Note

Both the child node and the interrupt parent node are required to have
#address-cells and #interrupt-cells properties defined. If a unit
address component is not required, #address-cells shall be explicitly
defined to be zero.
</quote>

which says one thing and then the other about property inheritance,
but then asserts that #address-cells isn't optional.

> >
> > >
> > > In the RISC-V world, there have been quite a few QEMU releases
> > > where the generated DT node of the interrupt controller does not
> > > have the "#address-cells" property. This patch breaks the kernel
> > > for all such QEMU releases.
> >
> > Congratulations, you've forked DT. News at 11.
>
> Can you elaborate how ?

You've stated it yourself. You are relying on a behaviour that
deviates from the standard by having DTs with missing properties

And since we can't travel back it time to fix this, the only solution
I can see is to support both behaviours by quirking it.

M.

--
Without deviation from the norm, progress is not possible.