Re: [PATCH 6/6] PCI: of: Create device-tree root bus node

From: Herve Codina
Date: Wed Nov 06 2024 - 09:54:35 EST


On Tue, 5 Nov 2024 12:59:01 -0600
Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:

> On Mon, Nov 04, 2024 at 06:20:00PM +0100, Herve Codina wrote:
> > PCI devices device-tree nodes can be already created. This was
> > introduced by commit 407d1a51921e ("PCI: Create device tree node for
> > bridge").
>
> I guess 407d1a51921e creates device tree nodes for bridges, including
> Root Ports, which are enumerated as PCI-to-PCI bridges, right?

It creates DT nodes for bridges and devices.

>
> > In order to have device-tree nodes related to PCI devices attached on
> > their PCI root bus, a root bus device-tree node is needed. This root bus
> > node will be used as the parent node of the first level devices scanned
> > on the bus.
> >
> > On non device-tree based system (such as ACPI), a device-tree node for
> > the PCI root bus does not exist. ...
>
> I'm wondering if "root bus" is the right description for this patch
> and whether "PCI host bridge" might be more accurate. The bus itself
> doesn't really have a physical counterpart other than being the
> secondary side of a PCI host bridge where the primary side is some
> kind of CPU bus.

Indeed, you're right. I will rename "root bus" to "host bridge" everywhere
in the patch (description but also the code itself).

>
> An ACPI namespace doesn't include a "root bus" object, but it *does*
> include a PCI host bridge (PNP0A03) object, which is where any address
> translation between the CPU bus and the PCI hierarchy is described.
>
> I suspect this patch is adding a DT node that corresponds to the
> PNP0A03 host bridge object, and the "ranges" property of the new node
> will describe the mapping from the CPU address space to the PCI
> address space.

Exactly.

>
> > Indeed, this component is not described
> > in a device-tree used at boot.
>
> But maybe I'm on the wrong track, because obviously PCI host
> controllers *are* described in DTs used at boot.

They are described in a device-tree used at boot on device-tree based
systems.
On x86, we are on ACPI based system -> No DT used at boot -> PCI host
controller not described in DT.

I could replace with "Indeed, this component is not described in a
device-tree used at boot because, in that case, device-tree is not
used to describe the hardware"

>
> > The device-tree PCI root bus node creation needs to be done at runtime.
> > This is done in the same way as for the creation of the PCI device
> > nodes. I.e. node and properties are created based on computed
> > information done by the PCI core.
>
> See address translation question below.
>
> > +void of_pci_make_root_bus_node(struct pci_bus *bus)
> > +{
> > + struct device_node *np = NULL;
> > + struct of_changeset *cset;
> > + const char *name;
> > + int ret;
> > +
> > + /*
> > + * If there is already a device tree node linked to this device,
> > + * return immediately.
> > + */
> > + if (pci_bus_to_OF_node(bus))
> > + return;
> > +
> > + /* Check if there is a DT root node to attach this created node */
> > + if (!of_root) {
> > + pr_err("of_root node is NULL, cannot create PCI root bus node");
> > + return;
> > + }
> > +
> > + name = kasprintf(GFP_KERNEL, "pci-root@%x,%x", pci_domain_nr(bus),
> > + bus->number);
>
> Should this be "pci%d@%x,%x" to match the typical descriptions of PCI
> host bridges in DT?

What do you think I should use for the %d you proposed.
Also I supposed your "@%x,%x" is still pci_domain_nr(bus), bus->number.

>
> > +static int of_pci_root_bus_prop_ranges(struct pci_bus *bus,
> > + struct of_changeset *ocs,
> > + struct device_node *np)
> > +{
> > + struct pci_host_bridge *bridge = to_pci_host_bridge(bus->bridge);
> > + struct resource_entry *window;
> > + unsigned int ranges_sz = 0;
> > + unsigned int n_range = 0;
> > + struct resource *res;
> > + int n_addr_cells;
> > + u32 *ranges;
> > + u64 val64;
> > + u32 flags;
> > + int ret;
> > +
> > + n_addr_cells = of_n_addr_cells(np);
> > + if (n_addr_cells <= 0 || n_addr_cells > 2)
> > + return -EINVAL;
> > +
> > + resource_list_for_each_entry(window, &bridge->windows) {
> > + res = window->res;
> > + if (!of_pci_is_range_resource(res, &flags))
> > + continue;
> > + n_range++;
> > + }
> > +
> > + if (!n_range)
> > + return 0;
> > +
> > + ranges = kcalloc(n_range,
> > + (OF_PCI_ADDRESS_CELLS + OF_PCI_SIZE_CELLS +
> > + n_addr_cells) * sizeof(*ranges),
> > + GFP_KERNEL);
> > + if (!ranges)
> > + return -ENOMEM;
> > +
> > + resource_list_for_each_entry(window, &bridge->windows) {
> > + res = window->res;
> > + if (!of_pci_is_range_resource(res, &flags))
> > + continue;
> > +
> > + /* PCI bus address */
> > + val64 = res->start;
> > + of_pci_set_address(NULL, &ranges[ranges_sz], val64, 0, flags, false);
> > + ranges_sz += OF_PCI_ADDRESS_CELLS;
> > +
> > + /* Host bus address */
> > + if (n_addr_cells == 2)
> > + ranges[ranges_sz++] = upper_32_bits(val64);
> > + ranges[ranges_sz++] = lower_32_bits(val64);
>
> IIUC this sets both the parent address (the host bus (CPU) physical
> address) and the child address (PCI bus address) to the same value.
>
> I think that's wrong because these addresses need not be identical.
>
> I think the parent address should be the res->start value, and the
> child address should be "res->start - window->offset", similar to
> what's done by pcibios_resource_to_bus().

I see.
I will update in the next iteration.

Thanks for your feedback.
Best regards,
Hervé