Re: [PATCH v4] of: property: Add fw_devlink support for interrupt-map property

From: Marc Zyngier
Date: Tue May 28 2024 - 12:36:24 EST


On Thu, 09 May 2024 13:08:20 +0100,
Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote:
>
> Some of the PCI host controllers (such as generic PCI host controller)
> use "interrupt-map" DT property to describe the mapping between PCI
> endpoints and PCI interrupt pins. This is the only case where the
> interrupts are not described in DT.
>
> Currently, there is no fw_devlink created based on "interrupt-map"
> DT property so interrupt controller is not guaranteed to be probed
> before the PCI host controller. This affects every platform where
> both PCI host controller and interrupt controllers are probed as
> regular platform devices.
>
> This creates fw_devlink between consumers (PCI host controller) and
> supplier (interrupt controller) based on "interrupt-map" DT property.
>
> Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx>
> Reviewed-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> ---
> Changes since v3:
> - Added a comment about of_irq_parse_raw()
> - Removed redundant NULL assignments to sup_args.np
> Changes since v2:
> - No need for a loop to find #interrupt-cells property value
> - Fix node de-reference leak when index is greater than number
> of entries in interrupt-map property
> Changes since v1:
> - Updated commit description based on Rob's suggestion
> - Use of_irq_parse_raw() for parsing interrupt-map DT property

This patch breaks badly on my M1 Mini, with a continuous stream of
boot time warnings lasting about 100 seconds:

[ 97.832335] ------------[ cut here ]------------
[ 97.836955] /soc/pcie@690000000/pci@2,0 interrupt-map failed, using interrupt-controller
[ 97.845072] WARNING: CPU: 0 PID: 1 at drivers/of/irq.c:277 of_irq_parse_raw+0x620/0x730
[ 97.853087] Modules linked in:
[ 97.856139] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.10.0-rc1 #2915
[ 97.864163] Hardware name: Apple Mac mini (M1, 2020) (DT)
[ 97.869570] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 97.876546] pc : of_irq_parse_raw+0x620/0x730
[ 97.880907] lr : of_irq_parse_raw+0x620/0x730
[ 97.885267] sp : ffffc000800538a0
[ 97.888581] x29: ffffc000800538a0 x28: 0000000000000000 x27: ffffffffff5ffc68
[ 97.895732] x26: ffffc000800539a4 x25: ffffffffff5ffbbc x24: ffffc00080053a48
[ 97.902883] x23: ffffe3ff73284e68 x22: ffffb7889aede6d0 x21: ffffe3ff735f9320
[ 97.910034] x20: ffffc00080053964 x19: ffffb7889aede6d0 x18: ffffffffffffffff
[ 97.917185] x17: 7075727265746e69 x16: 20676e697375202c x15: 64656c6961662070
[ 97.924336] x14: 616d2d7470757272 x13: 72656c6c6f72746e x12: 6f632d7470757272
[ 97.931487] x11: 65746e6920676e69 x10: ffffe3ff740bcb68 x9 : ffffe3ff72335b38
[ 97.938639] x8 : 00000001000028a4 x7 : ffffe3ff740afc08 x6 : 00000000000038a4
[ 97.945789] x5 : 00000000000067b0 x4 : c0000001000028a4 x3 : 0000000000000000
[ 97.952940] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffb784c16ade00
[ 97.960092] Call trace:
[ 97.962534] of_irq_parse_raw+0x620/0x730
[ 97.966545] parse_interrupt_map+0xfc/0x188
[ 97.970731] of_fwnode_add_links+0x170/0x1e0
[ 97.975004] fw_devlink_parse_fwtree+0x44/0x98
[ 97.979452] fw_devlink_parse_fwtree+0x6c/0x98
[ 97.983899] fw_devlink_parse_fwtree+0x6c/0x98
[ 97.988347] device_add+0x610/0x6a8
[ 97.991836] of_device_add+0x4c/0x70
[ 97.995411] of_platform_device_create_pdata+0xa0/0x160
[ 98.000644] of_platform_bus_create+0x184/0x370
[ 98.005178] of_platform_populate+0x68/0x160
[ 98.009451] of_platform_default_populate_init+0xf4/0x118
[ 98.014859] do_one_initcall+0x4c/0x320
[ 98.018695] do_initcalls+0xf4/0x1d8
[ 98.022271] kernel_init_freeable+0x12c/0x280
[ 98.026632] kernel_init+0x2c/0x1f8
[ 98.030120] ret_from_fork+0x10/0x20
[ 98.033696] ---[ end trace 0000000000000000 ]---

which comes from 10a20b34d735f ("of/irq: Don't ignore
interrupt-controller when interrupt-map failed").

Each of the 3 PCIe ports are described as such:

port02: pci@2,0 {
device_type = "pci";
reg = <0x1000 0x0 0x0 0x0 0x0>;
reset-gpios = <&pinctrl_ap 33 GPIO_ACTIVE_LOW>;

#address-cells = <3>;
#size-cells = <2>;
ranges;

interrupt-controller;
#interrupt-cells = <1>;

interrupt-map-mask = <0 0 0 7>;
interrupt-map = <0 0 0 1 &port02 0 0 0 0>,
<0 0 0 2 &port02 0 0 0 1>,
<0 0 0 3 &port02 0 0 0 2>,
<0 0 0 4 &port02 0 0 0 3>;
status = "disabled";
};

and get probed *972 times*, which seem... excessive, given that there
are only 4 entries per port.

> ---
> drivers/of/property.c | 52 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 52 insertions(+)
>
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index a6358ee99b74..2d749a18b037 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1311,6 +1311,57 @@ static struct device_node *parse_interrupts(struct device_node *np,
> return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> }
>
> +static struct device_node *parse_interrupt_map(struct device_node *np,
> + const char *prop_name, int index)
> +{
> + const __be32 *imap, *imap_end, *addr;
> + struct of_phandle_args sup_args;
> + u32 addrcells, intcells;
> + int i, imaplen;
> +
> + if (!IS_ENABLED(CONFIG_OF_IRQ))
> + return NULL;
> +
> + if (strcmp(prop_name, "interrupt-map"))
> + return NULL;
> +
> + if (of_property_read_u32(np, "#interrupt-cells", &intcells))
> + return NULL;
> + addrcells = of_bus_n_addr_cells(np);
> +
> + imap = of_get_property(np, "interrupt-map", &imaplen);
> + if (!imap || imaplen <= (addrcells + intcells))

This is "interesting". You compare a number of *bytes* with a number
of cells. Only off by a factor of 4...

Also, you need a minimum of one extra cell to hold the phandle, and a
yet unknown number of cells for whatever follows the phandle.

> + return NULL;
> + imap_end = imap + imaplen;

Same problem, with pointer arithmetic this time.

> +
> + while (imap < imap_end) {
> + addr = imap;
> + imap += addrcells;
> +
> + sup_args.np = np;
> + sup_args.args_count = intcells;
> + for (i = 0; i < intcells; i++)
> + sup_args.args[i] = be32_to_cpu(imap[i]);
> + imap += intcells;
> +
> + /*
> + * Upon success, the function of_irq_parse_raw() returns
> + * interrupt controller DT node pointer in sup_args.np.
> + */
> + if (of_irq_parse_raw(addr, &sup_args))
> + return NULL;
> +
> + if (!index)
> + return sup_args.np;
> +
> + of_node_put(sup_args.np);
> + imap += sup_args.args_count + 1;

This really doesn't map (pun intended) to the way the interrupt-map
entries are built. You need to account for the parent address size as
well.

I'll post a patch fixing both issues.

M.

--
Without deviation from the norm, progress is not possible.