Re: [PATCH v11 06/11] PCI: dwc: Use devicetree 'ranges' property to get rid of cpu_addr_fixup() callback

From: Bjorn Helgaas
Date: Fri Mar 14 2025 - 18:11:12 EST


On Fri, Mar 14, 2025 at 11:21:19AM -0400, Frank Li wrote:
> On Thu, Mar 13, 2025 at 06:56:17PM -0400, Frank Li wrote:
> > On Thu, Mar 13, 2025 at 05:04:50PM -0500, Bjorn Helgaas wrote:
> > > On Thu, Mar 13, 2025 at 11:38:42AM -0400, Frank Li wrote:
> > > > The 'ranges' property of the PCI controller's parent bus can describe address
> > > > translation. Most systems' bus fabrics use a 1:1 map between input and
> > > > output addresses, but some hardware, like i.MX8QXP, doesn't.
> > >
> > > I think you've used reg["addr_space"] to get the offset for Endpoints
> > > forever.
> >
> > Yes, it still needs the 'ranges' information at the parent bus.
> >
> > bus@000
> > {
> >     ranges = <...>; [1] /* still need this */
> >     pcie {
> >         ranges = <...>;[2]
> >     };
> >     pcie-ep {};
> > }

Yes, of course. I'm just making the point that the subject/commit log
says this patch uses 'ranges' but in fact it uses 'reg'.

> > > I just noticed that through v9, you used 'ranges' to get the offset
> > > for the Root Complex (with "Add parent_bus_offset to resource_entry"),
> > > and I think v10 switched to use reg["config"] instead.
> > >
> > > I think I originally proposed the idea of "Add parent_bus_offset to
> > > resource_entry" patch, but I think it turned out to be kind of an ugly
> > > approach.
> > >
> > > Anyway, IIUC this v11 patch actually uses reg["config"] to compute the
> > > offset, not 'ranges', so we should probably update the subject and
> > > commit log to reflect that, and maybe remove the now-unused bits of
> > > the devicetree example.
> >
> > We use reg["config"] to detect the offset, but we still need the 'ranges'
> > of the parent bus in the DTS. There are two 'ranges': one at the PCI
> > controller's parent bus [1], the other under the 'pci bus' node [2].
>
> Besides, luckily dwc uses reg["config"] to indicate config space, but DT
> also defines ranges [2] under the pcie node, which can also include the
> config space.
>
> Cadence also uses reg["cfg"] to do that:
> res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "cfg");
>
> I am not sure why both chose to use reg[] instead of ranges [2] under the
> pcie node, but the result makes our situation simpler.
>
> > Although we use reg["config"], we still need ranges [1], and the
> > information in ranges [2] also needs to be correct.
> >
> > The whole devicetree example is also still valid and helps when writing
> > the address translation information.
> >
> > > I do worry a little bit about the assumption that the offset of
> > > reg["config"] is the same as the offset of the other pieces. The main
> > > place we use the offset on RCs is for the ATU, and isn't the ATU in
> > > the MemSpace area at 0x8000_0000 below?
> >
> > No, the "Bus fabric" only decodes input addresses in
> > "0x7000_0000..UPLIMIT" and outputs addresses in 0x8000_0000..UPLIMIT, so
> > addresses below 0x8000_0000 never happen.

Minor miscommunication, I think. I didn't mean there were addresses
smaller than 0x8000_0000; I meant that in the picture, MemSpace at
0x8000_0000 is below CfgSpace at 0x8ff0_0000. The important point is
that CfgSpace is a separate region from MemSpace, and we're applying
the CfgSpace offset to the ATU in MemSpace.

I think it's OK to assume that for now. AFAICS there is nothing in
devicetree that explicitly mentions the ATU input address space; it's
just implicitly part of the intermediate address space described by
the bus@5f000000 'ranges'.
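
For what it's worth, here is the arithmetic as I understand it, as a
standalone userspace C sketch (not kernel code). The struct and function
names are made up for illustration, and the sign convention (offset =
CPU address - intermediate address) is my assumption. It models only the
single bus@5f000000 'ranges' entry from the example below:

```c
#include <stdbool.h>
#include <stdint.h>

/* One parent-bus 'ranges' entry (hypothetical type, illustration only). */
struct bus_range {
	uint64_t child_addr;	/* intermediate address (IA), below the fabric */
	uint64_t parent_addr;	/* CPU address, above the fabric */
	uint64_t size;
};

/* bus@5f000000: ranges = <0x80000000 0x0 0x70000000 0x10000000>; */
static const struct bus_range fabric = {
	.child_addr  = 0x80000000,
	.parent_addr = 0x70000000,
	.size        = 0x10000000,
};

/* Translate an IA to a CPU address; false if the entry doesn't cover it. */
static bool ia_to_cpu(const struct bus_range *r, uint64_t ia, uint64_t *cpu)
{
	if (ia < r->child_addr || ia - r->child_addr >= r->size)
		return false;

	*cpu = r->parent_addr + (ia - r->child_addr);
	return true;
}

/* Assumed convention: offset = CPU address - IA, so IA = CPU - offset. */
static bool parent_bus_offset(const struct bus_range *r, uint64_t ia,
			      int64_t *offset)
{
	uint64_t cpu;

	if (!ia_to_cpu(r, ia, &cpu))
		return false;

	*offset = (int64_t)cpu - (int64_t)ia;
	return true;
}
```

With a single linear 'ranges' entry, the reg["config"] address
(0x8ff0_0000) and the MemSpace base (0x8000_0000) necessarily yield the
same offset, -0x1000_0000 here. A fabric described by several entries
with different deltas would break that assumption.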

> > > It's great that in this case the 0x7ff0_0000 to 0x8ff0_0000 "config"
> > > offset is the same as the 0x7000_0000 to 0x8000_0000 MemSpace offset,
> > > but I don't know that this is guaranteed for all designs.
> >
> > So far, it is the same for all chips that use dwc. If we meet a
> > difference, we can add support later.
> >
> > reg["config"] only simplifies our implementation, based on the offset
> > being the same. The whole concept is unchanged.

> > > > See the diagram below:
> > > >
> > > >               ┌─────────┐                    ┌────────────┐
> > > >    ┌─────┐    │         │ IA: 0x8ff8_0000    │            │
> > > >    │ CPU ├───►│   ┌────►├─────────────────┐  │ PCI        │
> > > >    └─────┘    │   │     │ IA: 0x8ff0_0000 │  │            │
> > > >    CPU Addr   │   │  ┌─►├─────────────┐   │  │ Controller │
> > > >   0x7ff8_0000─┼───┘  │  │             │   │  │            │
> > > >               │      │  │             │   │  │            │   PCI Addr
> > > >   0x7ff0_0000─┼──────┘  │             │   └──► IOSpace   ─┼────────────►
> > > >               │         │             │      │            │     0
> > > >   0x7000_0000─┼────────►├─────────┐   │      │            │
> > > >               └─────────┘         │   └──────► CfgSpace  ─┼────────────►
> > > >               BUS Fabric          │          │            │     0
> > > >                                   │          │            │
> > > >                                   └──────────► MemSpace  ─┼────────────►
> > > >                             IA: 0x8000_0000  │            │  0x8000_0000
> > > >                                              └────────────┘
> > > >
> > > > bus@5f000000 {
> > > >     compatible = "simple-bus";
> > > >     #address-cells = <1>;
> > > >     #size-cells = <1>;
> > > >     ranges = <0x80000000 0x0 0x70000000 0x10000000>;
> > > >
> > > >     pcie@5f010000 {
> > > >         compatible = "fsl,imx8q-pcie";
> > > >         reg = <0x5f010000 0x10000>, <0x8ff00000 0x80000>;
> > > >         reg-names = "dbi", "config";

> > > >         #address-cells = <3>;
> > > >         #size-cells = <2>;
> > > >         device_type = "pci";
> > > >         bus-range = <0x00 0xff>;
> > > >         ranges = <0x81000000 0 0x00000000 0x8ff80000 0 0x00010000>,
> > > >                  <0x82000000 0 0x80000000 0x80000000 0 0x0ff00000>;

Of course we need this 'ranges' to describe the translation between
intermediate addresses and PCI bus addresses. My point is that this
is not relevant to the parent bus offset we're computing in this
patch.

So I think for purposes of this patch, we can omit pcie@5f010000
#address-cells and everything after it.

Bjorn