Re: [PATCH V13 2/5] PCI: Create device tree node for bridge

From: Jonathan Cameron
Date: Mon Sep 11 2023 - 17:27:33 EST


On Tue, 15 Aug 2023 10:19:57 -0700
Lizhi Hou <lizhi.hou@xxxxxxx> wrote:

> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
> spaces from multiple hardware peripherals to its PCI BAR. Normally,
> the PCI core discovers devices and BARs using the PCI enumeration process.
> There is no infrastructure to discover the hardware peripherals that are
> present in a PCI device, and which can be accessed through the PCI BARs.
>
> Apparently, the device tree framework requires a device tree node for the
> PCI device. Thus, it can generate the device tree nodes for hardware
> peripherals underneath. Because PCI is self discoverable bus, there might
> not be a device tree node created for PCI devices. Furthermore, if the PCI
> device is hot pluggable, when it is plugged in, the device tree nodes for
> its parent bridges are required. Add support to generate device tree node
> for PCI bridges.
>
> Add an of_pci_make_dev_node() interface that can be used to create device
> tree node for PCI devices.
>
> Add a PCI_DYNAMIC_OF_NODES config option. When the option is turned on,
> the kernel will generate device tree nodes for PCI bridges unconditionally.
>
> Initially, add the basic properties for the dynamically generated device
> tree nodes which include #address-cells, #size-cells, device_type,
> compatible, ranges, reg.
>
> Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Signed-off-by: Lizhi Hou <lizhi.hou@xxxxxxx>

I tried to bring this up for a custom PCIe card emulated in QEMU on an ARM ACPI
machine.

There are some missing parts that were present in Clements series, but not this
one, particularly creation of the root pci object.

Anyhow, hit an intermittent crash...


> ---
> +static int of_pci_prop_intr_map(struct pci_dev *pdev, struct of_changeset *ocs,
> + struct device_node *np)
> +{
> + struct of_phandle_args out_irq[OF_PCI_MAX_INT_PIN];
> + u32 i, addr_sz[OF_PCI_MAX_INT_PIN], map_sz = 0;
> + __be32 laddr[OF_PCI_ADDRESS_CELLS] = { 0 };
> + u32 int_map_mask[] = { 0xffff00, 0, 0, 7 };
> + struct device_node *pnode;
> + struct pci_dev *child;
> + u32 *int_map, *mapp;
> + int ret;
> + u8 pin;
> +
> + pnode = pci_device_to_OF_node(pdev->bus->self);
> + if (!pnode)
> + pnode = pci_bus_to_OF_node(pdev->bus);
> +
> + if (!pnode) {
> + pci_err(pdev, "failed to get parent device node");
> + return -EINVAL;
> + }
> +
> + laddr[0] = cpu_to_be32((pdev->bus->number << 16) | (pdev->devfn << 8));
> + for (pin = 1; pin <= OF_PCI_MAX_INT_PIN; pin++) {
> + i = pin - 1;
> + out_irq[i].np = pnode;
> + out_irq[i].args_count = 1;
> + out_irq[i].args[0] = pin;
> + ret = of_irq_parse_raw(laddr, &out_irq[i]);
> + if (ret) {
> + pci_err(pdev, "parse irq %d failed, ret %d", pin, ret);
> + continue;

If all the interrupt parsing fails we continue ever time...

> + }
> + ret = of_property_read_u32(out_irq[i].np, "#address-cells",
> + &addr_sz[i]);
> + if (ret)
> + addr_sz[i] = 0;

This never happens.

> + }
> +
> + list_for_each_entry(child, &pdev->subordinate->devices, bus_list) {
> + for (pin = 1; pin <= OF_PCI_MAX_INT_PIN; pin++) {
> + i = pci_swizzle_interrupt_pin(child, pin) - 1;
> + map_sz += 5 + addr_sz[i] + out_irq[i].args_count;

and here we end up derefencing random memory which happens in my case to cause
a massive allocation sometimes and that fails one of the assertions in the
allocator.

I'd suggest just setting addr_sz[xxx] = {}; above
to ensure it's initialized. Then the if(ret) handling should not be needed
as well as of_property_read_u32 should be side effect free I hope!

> + }
> + }
> +
> + int_map = kcalloc(map_sz, sizeof(u32), GFP_KERNEL);
> + mapp = int_map;